Merge lp:~brad-marshall/charms/trusty/nagios/add-extra-config-options into lp:charms/trusty/nagios

Proposed by Brad Marshall
Status: Merged
Merged at revision: 33
Proposed branch: lp:~brad-marshall/charms/trusty/nagios/add-extra-config-options
Merge into: lp:charms/trusty/nagios
Diff against target: 2187 lines (+2014/-28)
11 files modified
README.md (+40/-1)
config.yaml (+93/-0)
files/nagios-pagerduty-flush-cron (+7/-0)
files/pagerduty_nagios.pl (+289/-0)
hooks/install (+0/-9)
hooks/templates/contacts-cfg.tmpl (+51/-0)
hooks/templates/nagios-cfg.tmpl (+1360/-0)
hooks/templates/pagerduty_nagios_cfg.tmpl (+26/-0)
hooks/upgrade-charm (+91/-17)
tests/00-setup (+1/-1)
tests/24-pagerduty-test (+56/-0)
To merge this branch: bzr merge lp:~brad-marshall/charms/trusty/nagios/add-extra-config-options
Reviewer: Cory Johns (community)
Review status: Approve
Review via email: mp+265480@code.launchpad.net

Description of the change

Adding extra nagios config options, as well as Pagerduty notification support.

Cory Johns (johnsca) wrote :

Brad,

Thank you for your addition to this charm.

Overall, these changes look good. I particularly think that the switch to a template for the config files makes it easier to follow. However, I did run into some issues when running the tests.
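
To illustrate why rendering the config from a template is easier to follow than appending lines with `echo`, here is a minimal sketch of the substitution step. It uses a small regex-based stand-in for the `{{ name }}` placeholders rather than the charm's actual Jinja2 rendering, and the `render` helper is hypothetical. Only the option names and default values are taken from the nagios-cfg.tmpl and config.yaml files in this branch.

```python
import re

# Fragment modelled on hooks/templates/nagios-cfg.tmpl from this branch.
TEMPLATE = """\
nagios_user={{ nagios_user }}
nagios_group={{ nagios_group }}
check_external_commands={{ check_external_commands }}
command_file={{ command_file }}
"""

# Default values as listed in config.yaml.
CONFIG = {
    "nagios_user": "nagios",
    "nagios_group": "nagios",
    "check_external_commands": 1,
    "command_file": "/var/lib/nagios3/rw/nagios.cmd",
}

# Matches "{{ name }}" with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, values):
    """Replace each {{ name }} with values[name] (hypothetical helper).

    A missing key raises KeyError, so a typo in the template fails loudly
    instead of silently writing a broken config file.
    """
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)


rendered = render(TEMPLATE, CONFIG)
print(rendered)
```

With this approach every setting lives in one file, so the grep-then-append blocks in hooks/install are no longer needed.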

I had a few tests fail that were unrelated to this change, and the mysql charm seems to error when removing the nagios relation. However, both of those seem to be outside the scope of this review, and there is an existing bug (https://bugs.launchpad.net/charms/+source/nagios/+bug/1403574) related to test failures in this charm, so I'm just going to note it here and move on.

Unfortunately, I also got an error on the newly added test (http://pastebin.ubuntu.com/12014504/). When I re-ran the test to attempt to triage it, it passed without error, so I think this is a race condition. You already have a sentry.wait() in the test, so I'm really not sure why I ran into that error. Perhaps, though, a for-range loop + sleep could be wrapped around the directory check to give it a bit of leeway to account for the race?

Cory Johns (johnsca) wrote :

Oh, I forgot to mention that the "apt-get update" line in 00-setup is missing a "sudo". That's also not related to this change, but if you make a change to account for the test race, perhaps you would be willing to make that simple change as well?

48. By Brad Marshall

[bradm] Added missing sudo on apt-get

Brad Marshall (brad-marshall) wrote :

I've added the missing sudo to the apt-get update. I'm not sure how to handle the race condition; all the try block does is check for a file's existence. Maybe I should add extra time to the timeout on the sentry wait?

Cory Johns (johnsca) wrote :

I'm still hitting the race condition in the test consistently on local provider. I believe the issue is that the sentry.wait() after setting the config doesn't actually block because the unit is not executing the hook yet when it gets there, so the file check happens before the hook is done.

I submitted an MP against this branch (https://code.launchpad.net/~johnsca/charms/trusty/nagios/pagerduty-test-race/+merge/269401) with the suggested work-around. That will retry three times at 5-second intervals. Depending on the cloud, you may want to increase that, but on local provider that duration (15s) was sufficient and the test passed.
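
The retry-with-sleep idea described above can be sketched roughly as follows. This is not the code from the linked MP: the `wait_for_path` helper and its parameters are hypothetical, and it checks a local path, whereas the real test inspects the deployed unit through the Amulet sentry. The defaults mirror the three attempts at 5-second intervals mentioned in the comment.

```python
import os
import tempfile
import time


def wait_for_path(path, attempts=3, delay=5.0,
                  exists=os.path.exists, sleep=time.sleep):
    """Poll for `path`, sleeping `delay` seconds after each failed check.

    Returns True as soon as the path exists, or False once `attempts`
    checks have failed (worst case roughly attempts * delay seconds).
    `exists` and `sleep` are injectable so the loop can be tested
    without touching the filesystem or actually waiting.
    """
    for _ in range(attempts):
        if exists(path):
            return True
        sleep(delay)
    return False


# Small local demonstration with no real waiting (delay=0).
with tempfile.TemporaryDirectory() as queue_dir:
    found = wait_for_path(queue_dir, delay=0)
    missing = wait_for_path(os.path.join(queue_dir, "absent"),
                            attempts=2, delay=0)
print(found, missing)
```

Compared with a single fixed sleep, a loop like this returns as soon as the hook has created the directory, so a longer worst-case timeout costs nothing on fast environments.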

Brad Marshall (brad-marshall) wrote :

Oddly, the test is passing perfectly fine for me on a local provider. What version of juju are you using? Maybe that'll narrow down what's going on here a bit.

49. By Brad Marshall

[bradm] Add a sleep to allow things to settle before running the test

Brad Marshall (brad-marshall) wrote :

Thanks for your suggested MP, but I've taken that idea and just changed it to be a simple sleep before doing the tests. I've deployed it about a dozen times on a test openstack, and it worked every single time, and I've done a handful of tests locally which were fine too.

I went with the slightly larger sleep to accommodate any cloud that might be running a bit slower - we can always tweak it if we find it's not long enough.

Cory Johns (johnsca) wrote :

Brad,

The problem with race conditions is that they are inherently difficult to reproduce, and this one would depend heavily on the environment in which the test is run, so I'm not at all surprised that you weren't seeing the same failure I was.

The 30s wait, along with some recent improvements to Amulet, seems to be sufficient and the test is passing for me. I'd really like to see the other test failures resolved, but as I said before, we can call that out of scope for this change. So this gets my +1 and I'll get it merged today.

Apologies for the delay on the review.

review: Approve
Cory Johns (johnsca) wrote :

Brad,

This has been merged and should update on jujucharms.com within an hour. Once again, thank you for your contribution.

Preview Diff

=== modified file 'README.md'
--- README.md 2015-05-07 07:07:09 +0000
+++ README.md 2015-09-08 04:46:07 +0000
@@ -26,12 +26,51 @@
 
 Will get you the public IP of the web interface.
 
-# Configuration
+# Livestatus Configuration
 
 - `enable_livestatus` - Setting to enable the [livestatus module](https://mathias-kettner.de/checkmk_livestatus.html). This is an easy interface to get data out of Nagios.
 
 - `livestatus_path` - Configuration of where the livestatus module is stored - defaults to /var/lib/nagios3/livestatus/socket.
 
+- `livestatus_args` - Arguments to be passed to the livestatus module, defaults to empty.
+
+# Pagerduty Configuration
+
+- `enable_pagerduty` - Config variable to enable pagerduty notifications or not.
+
+- `pagerduty_key` - Pagerduty API key to use for notifications
+
+- `pagerduty_path` - Path for Pagerduty notifications to be queued, default is /var/lib/nagios3/pagerduty.
+
+# Configuration
+
+- `nagios_user` - The effective user that nagios will run as.
+
+- `nagios_group` - The effective group that nagios will run as.
+
+- `check_external_commands` - Config variable to enable checking external commands.
+
+- `command_check_interval` - How often to check for external commands.
+
+- `command_file` - File that Nagios checks for external command requests.
+
+- `debug_level` - Specify the debug level for nagios. See the docs for more details.
+
+- `debug_verbosity` - How verbose will the debug logs be - 0 is brief, 1 is more detailed and 2 is very detailed.
+
+- `debug_file` - Path for the debug file - defaults to /var/log/nagios3/nagios.debug.
+
+- `daemon_dumps_core` - Option to determine if Nagios is allowed to create a core dump.
+
+- `admin_email` - Email address used for the admin, used by $ADMINEMAIL$ in notification commands - defaults to root@localhost.
+
+- `admin_pager` - Email address used for the admin pager, used by $ADMINPAGER$ in notification commands - defaults to pageroot@localhost.
+
+- `log_rotation_method` - Log rotation method that Nagios should use to rotate the main logfile, defaults to "d".
+
+- `log_archive_path` - Path for archived log files, defaults to /var/log/nagios3/archives
+- `use_syslog` - Log messages to syslog as well as main file.
+
 ### SSL Configuration
 
 - `ssl` - Determinant configuration for enabling SSL. Valid options are "on", "off", "only". The "only" option disables HTTP traffic on Apache in favor of HTTPS. This setting may cause unexpected behavior with existing nagios charm deployments.
=== modified file 'config.yaml'
--- config.yaml 2015-05-01 03:06:29 +0000
+++ config.yaml 2015-09-08 04:46:07 +0000
@@ -44,3 +44,96 @@
     default: "/var/lib/nagios3/livestatus/socket"
     description: |
       Default path to livestatus socket, if enabled via enable_livestatus
+  livestatus_args:
+    type: string
+    default: ""
+    description: |
+      Arguments to be passed to the livestatus module, defaults to empty.
+  nagios_user:
+    type: string
+    default: nagios
+    description: |
+      The effective user that nagios will run as.
+  nagios_group:
+    type: string
+    default: nagios
+    description: |
+      The effective group that nagios will run as.
+  check_external_commands:
+    type: int
+    default: 1
+    description: |
+      Config variable to enable checking external commands - 0 is disable, 1 is enable.
+  command_check_interval:
+    type: string
+    default: "-1"
+    description: |
+      How often to check for external commands.
+  command_file:
+    type: string
+    default: /var/lib/nagios3/rw/nagios.cmd
+    description: |
+      File that Nagios checks for external command requests.
+  debug_level:
+    type: int
+    default: 0
+    description: |
+      Specify the debug level for nagios. See the docs for more details.
+  debug_verbosity:
+    type: int
+    default: 1
+    description: |
+      How verbose will the debug logs be - 0 is brief, 1 is more detailed
+      and 2 is very detailed.
+  debug_file:
+    type: string
+    default: "/var/log/nagios3/nagios.debug"
+    description: |
+      Path for the debug file.
+  daemon_dumps_core:
+    type: int
+    default: 0
+    description:
+      Option to determine if Nagios is allowed to create a core dump.
+  admin_email:
+    type: string
+    default: root@localhost
+    description: |
+      Email address used for the admin, used by $ADMINEMAIL$ in notification
+      commands.
+  admin_pager:
+    type: string
+    default: pageroot@localhost
+    description: |
+      Email address used for the admin pager, used by $ADMINPAGER$ in
+      notification commands.
+  enable_pagerduty:
+    type: boolean
+    default: false
+    description: |
+      Config variable to enable pagerduty notifications or not.
+  pagerduty_key:
+    type: string
+    default: ""
+    description: |
+      Pagerduty API key to use for notifications
+  pagerduty_path:
+    type: string
+    default: "/var/lib/nagios3/pagerduty"
+    description: |
+      Path for Pagerduty notifications to be queued.
+  log_rotation_method:
+    type: string
+    default: "d"
+    description: |
+      Log rotation method that Nagios should use to rotate the main logfile.
+  log_archive_path:
+    type: string
+    default: "/var/log/nagios3/archives"
+    description: |
+      Path for archived log files
+  use_syslog:
+    type: int
+    default: 1
+    description: |
+      Log messages to syslog as well as main file.
=== added file 'files/nagios-pagerduty-flush-cron'
--- files/nagios-pagerduty-flush-cron 1970-01-01 00:00:00 +0000
+++ files/nagios-pagerduty-flush-cron 2015-09-08 04:46:07 +0000
@@ -0,0 +1,7 @@
+#------------------------------------------------
+# This file is juju managed
+#------------------------------------------------
+
+# Flush the nagios pagerduty alerts every minute as per
+# http://www.pagerduty.com/docs/guides/nagios-perl-integration-guide/
+* * * * * nagios /usr/local/bin/pagerduty_nagios.pl flush
=== added file 'files/pagerduty_nagios.pl'
--- files/pagerduty_nagios.pl 1970-01-01 00:00:00 +0000
+++ files/pagerduty_nagios.pl 2015-09-08 04:46:07 +0000
@@ -0,0 +1,289 @@
1#!/usr/bin/env perl
2
3
4# Nagios plugin that sends Nagios events to PagerDuty.
5#
6# Copyright (c) 2011, PagerDuty, Inc. <info@pagerduty.com>
7# All rights reserved.
8#
9# Redistribution and use in source and binary forms, with or without
10# modification, are permitted provided that the following conditions are met:
11# * Redistributions of source code must retain the above copyright
12# notice, this list of conditions and the following disclaimer.
13# * Redistributions in binary form must reproduce the above copyright
14# notice, this list of conditions and the following disclaimer in the
15# documentation and/or other materials provided with the distribution.
16# * Neither the name of PagerDuty Inc nor the
17# names of its contributors may be used to endorse or promote products
18# derived from this software without specific prior written permission.
19#
20# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
21# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
22# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
23# DISCLAIMED. IN NO EVENT SHALL PAGERDUTY INC BE LIABLE FOR ANY
24# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
25# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
26# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
27# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
28# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
29# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30
31
32use Pod::Usage;
33use Getopt::Long;
34use Sys::Syslog;
35use HTTP::Request::Common qw(POST);
36use HTTP::Status qw(is_client_error);
37use LWP::UserAgent;
38use File::Path;
39use Fcntl qw(:flock);
40
41
42=head1 NAME
43
44pagerduty_nagios -- Send Nagios events to the PagerDuty alert system
45
46=head1 SYNOPSIS
47
48pagerduty_nagios enqueue [options]
49
50pagerduty_nagios flush [options]
51
52=head1 DESCRIPTION
53
54 This script passes events from Nagios to the PagerDuty alert system. It's
55 meant to be run as a Nagios notification plugin. For more details, please see
56 the PagerDuty Nagios integration docs at:
57 http://www.pagerduty.com/docs/nagios-integration.
58
59 When called in the "enqueue" mode, the script loads a Nagios notification out
60 of the environment and into the event queue. It then tries to flush the
61 queue by sending any enqueued events to the PagerDuty server. The script is
62 typically invoked in this mode from a Nagios notification handler.
63
64 When called in the "flush" mode, the script simply tries to send any enqueued
65 events to the PagerDuty server. This mode is typically invoked by cron. The
66 purpose of this mode is to retry any events that couldn't be sent to the
67 PagerDuty server for whatever reason when they were initially enqueued.
68
69=head1 OPTIONS
70
71 --api-base URL
72 The base URL used to communicate with PagerDuty. The default option here
73 should be fine, but adjusting it may make sense if your firewall doesn't
74 pass HTTPS traffic for some reason. See the PagerDuty Nagios integration
75 docs for details.
76
77 --field KEY=VALUE
78 Add this key-value pair to the event being passed to PagerDuty. The script
79 automatically gathers Nagios macros out of the environment, so there's no
80 need to specify these explicitly. This option can be repeated as many
81 times as necessary to pass multiple key-value pairs. This option is only
82 useful when an event is being enqueued.0
83
84 --help
85 Display documentation for the script.
86
87 --queue-dir DIR
88 Path to the directory to use to store the event queue. By default, we use
89 /tmp/pagerduty_nagios.
90
91 --verbose
92 Turn on extra debugging information. Useful for debugging.
93
94 --proxy
95 Use a proxy for the connections like "--proxy http://127.0.0.1:8888/"
96
97=cut
98
99# This release tested on:
100# Debian Sarge (Perl 5.8.4)
101# Ubuntu 9.04 (Perl 5.10.0)
102
103my $opt_api_base = "https://events.pagerduty.com/nagios/2010-04-15";
104my %opt_fields;
105my $opt_help;
106my $opt_queue_dir = "/tmp/pagerduty_nagios";
107my $opt_verbose;
108my $opt_proxy;
109
110
111sub get_queue_from_dir {
112 my $dh;
113
114 unless (opendir($dh, $opt_queue_dir)) {
115 syslog(LOG_ERR, "opendir %s failed: %s", $opt_queue_dir, $!);
116 die $!;
117 }
118
119 my @files;
120 while (my $f = readdir($dh)) {
121 next unless $f =~ /^pd_(\d+)_\d+\.txt$/;
122 push @files, [int($1), $f];
123 }
124
125 closedir($dh);
126
127 @files = sort { @{$a}[0] <=> @{$b}[0] } @files;
128 return map { @{$_}[1] } @files;
129}
130
131
132sub flush_queue {
133 my @files = get_queue_from_dir();
134 my $ua = LWP::UserAgent->new;
135
136 # It's not a big deal if we don't get the message through the first time.
137 # It will get sent the next time cron fires.
138 $ua->timeout(15);
139
140 if ($opt_proxy) {
141 $ua->proxy (['http', 'https'], $opt_proxy);
142 }
143
144 foreach (@files) {
145 my $filename = "$opt_queue_dir/$_";
146 my $fd;
147 my %event;
148
149 print STDERR "==== Now processing: $filename\n" if $opt_verbose;
150
151 unless (open($fd, "<", $filename)) {
152 syslog(LOG_ERR, "open %s for read failed: %s", $filename, $!);
153 die $!;
154 }
155
156 while (<$fd>) {
157 chomp;
158 my @fields = split("=", $_, 2);
159 $event{$fields[0]} = $fields[1];
160 }
161
162 close($fd);
163
164 my $req = POST("$opt_api_base/create_event", \%event);
165
166 if ($opt_verbose) {
167 my $s = $req->as_string;
168 print STDERR "Request:\n$s\n";
169 }
170
171 my $resp = $ua->request($req);
172
173 if ($opt_verbose) {
174 my $s = $resp->as_string;
175 print STDERR "Response:\n$s\n";
176 }
177
178 if ($resp->is_success) {
179 syslog(LOG_INFO, "Nagios event in file %s ACCEPTED by the PagerDuty server.", $filename);
180 unlink($filename);
181 }
182 elsif (is_client_error($resp->code)) {
183 syslog(LOG_WARNING, "Nagios event in file %s REJECTED by the PagerDuty server. Server says: %s", $filename, $resp->content);
184 unlink($filename) if ($resp->content !~ /retry later/);
185 }
186 else {
187 # Something else went wrong.
188 syslog(LOG_WARNING, "Nagios event in file %s DEFERRED due to network/server problems.", $filename);
189 return 0;
190 }
191 }
192
193 # Everything that needed to be sent was sent.
194 return 1;
195}
196
197
198sub lock_and_flush_queue {
199 # Serialize access to the queue directory while we flush.
200 # (We don't want more than one flush at once.)
201
202 my $lock_filename = "$opt_queue_dir/lockfile";
203 my $lock_fd;
204
205 unless (open($lock_fd, ">", $lock_filename)) {
206 syslog(LOG_ERR, "open %s for write failed: %s", $lock_filename, $!);
207 die $!;
208 }
209
210 unless (flock($lock_fd, LOCK_EX)) {
211 syslog(LOG_ERR, "flock %s failed: %s", $lock_filename, $!);
212 die $!;
213 }
214
215 my $ret = flush_queue();
216
217 close($lock_fd);
218
219 return $ret;
220}
221
222
223sub enqueue_event {
224 my %event;
225
226 # Scoop all the Nagios related stuff out of the environment.
227 while ((my $k, my $v) = each %ENV) {
228 next unless $k =~ /^(ICINGA|NAGIOS)_(.*)$/;
229 $event{$2} = $v;
230 }
231
232 # Apply any other variables that were passed in.
233 %event = (%event, %opt_fields);
234
235 $event{"pd_version"} = "1.0";
236
237 # Right off the bat, enqueue the event. Nothing tiem consuming should come
238 # before here (i.e. no locks or remote connections), because we want to
239 # make sure we get the event written out within the Nagios notification
240 # timeout. If we get killed off after that, it isn't a big deal.
241
242 my $filename = sprintf("$opt_queue_dir/pd_%u_%u.txt", time(), $$);
243 my $fd;
244
245 unless (open($fd, ">", $filename)) {
246 syslog(LOG_ERR, "open %s for write failed: %s", $filename, $!);
247 die $!;
248 }
249
250 while ((my $k, my $v) = each %event) {
251 # "=" can't occur in the keyname, and "\n" can't occur anywhere.
252 # (Nagios follows this already, so I think we're safe)
253 print $fd "$k=$v\n";
254 }
255
256 close($fd);
257}
258
259###########
260
261GetOptions("api-base=s" => \$opt_api_base,
262 "field=s%" => \%opt_fields,
263 "help" => \$opt_help,
264 "queue-dir=s" => \$opt_queue_dir,
265 "verbose" => \$opt_verbose,
266 "proxy=s" => \$opt_proxy
267 ) || pod2usage(2);
268
269pod2usage(2) if @ARGV < 1 ||
270 (($ARGV[0] ne "enqueue") && ($ARGV[0] ne "flush"));
271
272pod2usage(-verbose => 3) if $opt_help;
273
274my @log_mode = ("nofatal", "pid");
275push(@log_mode, "perror") if $opt_verbose;
276
277openlog("pagerduty_nagios", join(",", @log_mode), LOG_LOCAL0);
278
279# This function automatically terminates the program on things like permission
280# errors.
281mkpath($opt_queue_dir);
282
283if ($ARGV[0] eq "enqueue") {
284 enqueue_event();
285 lock_and_flush_queue();
286}
287elsif ($ARGV[0] eq "flush") {
288 lock_and_flush_queue();
289}
0290
=== modified file 'hooks/install'
--- hooks/install 2015-05-01 05:06:20 +0000
+++ hooks/install 2015-09-08 04:46:07 +0000
@@ -23,11 +23,6 @@
 DEBIAN_FRONTEND=noninteractive apt-get -qy \
     install nagios3 nagios-plugins python-cheetah python-jinja2 dnsutils debconf-utils nagios-nrpe-plugin pynag
 
-# enable external commands per README.Debian file
-if ! grep '^check_external_commands=1$' /etc/nagios3/nagios.cfg ; then
-    echo check_external_commands=1 >> /etc/nagios3/nagios.cfg
-fi
-
 if [ -f $CHARM_DIR/files/hostgroups_nagios2.cfg ]; then
     # Write the new hostgroups_nagios2.cfg file to prevent servers being classified as Debian.
     cp -v $CHARM_DIR/files/hostgroups_nagios2.cfg /etc/nagios3/conf.d/hostgroups_nagios2.cfg
@@ -44,10 +39,6 @@
 if [ "$enable_livestatus" ]; then
     # install check-mk-livestatus
     DEBIAN_FRONTEND=noninteractive apt-get -qy install check-mk-livestatus
-    # enable livestatus broker
-    if ! grep '^broker_module=' /etc/nagios3/nagios.cfg ; then
-        echo "broker_module=/usr/lib/check_mk/livestatus.o $livestatus_path" >> /etc/nagios3/nagios.cfg
-    fi
     # fix permissions on the livestatus directory
     mkdir -p $livestatus_dir
     chown nagios:www-data $livestatus_dir
=== added file 'hooks/templates/contacts-cfg.tmpl'
--- hooks/templates/contacts-cfg.tmpl 1970-01-01 00:00:00 +0000
+++ hooks/templates/contacts-cfg.tmpl 2015-09-08 04:46:07 +0000
@@ -0,0 +1,51 @@
1#------------------------------------------------
2# This file is juju managed
3#------------------------------------------------
4
5###############################################################################
6# contacts.cfg
7###############################################################################
8
9
10
11###############################################################################
12###############################################################################
13#
14# CONTACTS
15#
16###############################################################################
17###############################################################################
18
19# In this simple config file, a single contact will receive all alerts.
20
21define contact{
22 contact_name root
23 alias Root
24 service_notification_period 24x7
25 host_notification_period 24x7
26 service_notification_options w,u,c,r
27 host_notification_options d,r
28 service_notification_commands notify-service-by-email
29 host_notification_commands notify-host-by-email
30 email {{ admin_email }}
31 }
32
33
34
35###############################################################################
36###############################################################################
37#
38# CONTACT GROUPS
39#
40###############################################################################
41###############################################################################
42
43# We only have one contact in this simple configuration file, so there is
44# no need to create more than one contact group.
45
46define contactgroup{
47 contactgroup_name admins
48 alias Nagios Administrators
49 members root{% if enable_pagerduty -%}, pagerduty{% endif %}
50 }
51
052
=== added file 'hooks/templates/nagios-cfg.tmpl'
--- hooks/templates/nagios-cfg.tmpl 1970-01-01 00:00:00 +0000
+++ hooks/templates/nagios-cfg.tmpl 2015-09-08 04:46:07 +0000
@@ -0,0 +1,1360 @@
1#------------------------------------------------
2# This file is juju managed
3#------------------------------------------------
4
5##############################################################################
6#
7# NAGIOS.CFG - Sample Main Config File for Nagios
8#
9#
10##############################################################################
11
12
13# LOG FILE
14# This is the main log file where service and host events are logged
15# for historical purposes. This should be the first option specified
16# in the config file!!!
17
18log_file=/var/log/nagios3/nagios.log
19
20# Commands definitions
21cfg_file=/etc/nagios3/commands.cfg
22
23# Debian also defaults to using the check commands defined by the debian
24# nagios-plugins package
25cfg_dir=/etc/nagios-plugins/config
26
27# Debian uses by default a configuration directory where nagios3-common,
28# other packages and the local admin can dump or link configuration
29# files into.
30cfg_dir=/etc/nagios3/conf.d
31
32# OBJECT CONFIGURATION FILE(S)
33# These are the object configuration files in which you define hosts,
34# host groups, contacts, contact groups, services, etc.
35# You can split your object definitions across several config files
36# if you wish (as shown below), or keep them all in a single config file.
37
38# You can specify individual object config files as shown below:
39#cfg_file=/etc/nagios3/objects/commands.cfg
40#cfg_file=/etc/nagios3/objects/contacts.cfg
41#cfg_file=/etc/nagios3/objects/timeperiods.cfg
42#cfg_file=/etc/nagios3/objects/templates.cfg
43
44# Definitions for monitoring a Windows machine
45#cfg_file=/etc/nagios3/objects/windows.cfg
46
47# Definitions for monitoring a router/switch
48#cfg_file=/etc/nagios3/objects/switch.cfg
49
50# Definitions for monitoring a network printer
51#cfg_file=/etc/nagios3/objects/printer.cfg
52
53
54# You can also tell Nagios to process all config files (with a .cfg
55# extension) in a particular directory by using the cfg_dir
56# directive as shown below:
57
58#cfg_dir=/etc/nagios3/servers
59#cfg_dir=/etc/nagios3/printers
60#cfg_dir=/etc/nagios3/switches
61#cfg_dir=/etc/nagios3/routers
62
63
64
65
66# OBJECT CACHE FILE
67# This option determines where object definitions are cached when
68# Nagios starts/restarts. The CGIs read object definitions from
69# this cache file (rather than looking at the object config files
70# directly) in order to prevent inconsistencies that can occur
71# when the config files are modified after Nagios starts.
72
73object_cache_file=/var/cache/nagios3/objects.cache
74
75
76
77# PRE-CACHED OBJECT FILE
78# This options determines the location of the precached object file.
79# If you run Nagios with the -p command line option, it will preprocess
80# your object configuration file(s) and write the cached config to this
81# file. You can then start Nagios with the -u option to have it read
82# object definitions from this precached file, rather than the standard
83# object configuration files (see the cfg_file and cfg_dir options above).
84# Using a precached object file can speed up the time needed to (re)start
85# the Nagios process if you've got a large and/or complex configuration.
86# Read the documentation section on optimizing Nagios to find our more
87# about how this feature works.
88
89precached_object_file=/var/lib/nagios3/objects.precache
90
91
92
93# RESOURCE FILE
94# This is an optional resource file that contains $USERx$ macro
95# definitions. Multiple resource files can be specified by using
96# multiple resource_file definitions. The CGIs will not attempt to
97# read the contents of resource files, so information that is
98# considered to be sensitive (usernames, passwords, etc) can be
99# defined as macros in this file and restrictive permissions (600)
100# can be placed on this file.
101
102resource_file=/etc/nagios3/resource.cfg
103
104
105
106# STATUS FILE
107# This is where the current status of all monitored services and
108# hosts is stored. Its contents are read and processed by the CGIs.
109# The contents of the status file are deleted every time Nagios
110# restarts.
111
112status_file=/var/cache/nagios3/status.dat
113
114
115
116# STATUS FILE UPDATE INTERVAL
117# This option determines the frequency (in seconds) that
118# Nagios will periodically dump program, host, and
119# service status data.
120
121status_update_interval=10
122
123
124
125# NAGIOS USER
126# This determines the effective user that Nagios should run as.
127# You can either supply a username or a UID.
128
129nagios_user={{ nagios_user }}
130
131
132
133# NAGIOS GROUP
134# This determines the effective group that Nagios should run as.
135# You can either supply a group name or a GID.
136
137nagios_group={{ nagios_group }}
138
139
140
141# EXTERNAL COMMAND OPTION
142# This option allows you to specify whether or not Nagios should check
143# for external commands (in the command file defined below). By default
144# Nagios will *not* check for external commands, just to be on the
145# cautious side. If you want to be able to use the CGI command interface
146# you will have to enable this.
147# Values: 0 = disable commands, 1 = enable commands
148
149check_external_commands={{ check_external_commands }}
150
151
152
153# EXTERNAL COMMAND CHECK INTERVAL
154# This is the interval at which Nagios should check for external commands.
155# This value works of the interval_length you specify later. If you leave
156# that at its default value of 60 (seconds), a value of 1 here will cause
157# Nagios to check for external commands every minute. If you specify a
158# number followed by an "s" (i.e. 15s), this will be interpreted to mean
159# actual seconds rather than a multiple of the interval_length variable.
160# Note: In addition to reading the external command file at regularly
161# scheduled intervals, Nagios will also check for external commands after
162# event handlers are executed.
163# NOTE: Setting this value to -1 causes Nagios to check the external
164# command file as often as possible.
165
166#command_check_interval=15s
167command_check_interval={{ command_check_interval }}
168
169
170
171# EXTERNAL COMMAND FILE
172# This is the file that Nagios checks for external command requests.
173# It is also where the command CGI will write commands that are submitted
174# by users, so it must be writeable by the user that the web server
175# is running as (usually 'nobody'). Permissions should be set at the
176# directory level instead of on the file, as the file is deleted every
177# time its contents are processed.
178# Debian Users: In case you didn't read README.Debian yet, _NOW_ is the
179# time to do it.
180
181command_file={{ command_file }}
182
183
184
185# EXTERNAL COMMAND BUFFER SLOTS
186# This settings is used to tweak the number of items or "slots" that
187# the Nagios daemon should allocate to the buffer that holds incoming
188# external commands before they are processed. As external commands
189# are processed by the daemon, they are removed from the buffer.
190
191external_command_buffer_slots=4096
192
193
194
195# LOCK FILE
196# This is the lockfile that Nagios will use to store its PID number
197# in when it is running in daemon mode.
198
199lock_file=/var/run/nagios3/nagios3.pid
200
201
202
203# TEMP FILE
204# This is a temporary file that is used as scratch space when Nagios
205# updates the status log, cleans the comment file, etc. This file
206# is created, used, and deleted throughout the time that Nagios is
207# running.
208
209temp_file=/var/cache/nagios3/nagios.tmp
210
211
212
213# TEMP PATH
214# This is the path where Nagios can create temp files for service and
215# host check results, etc.
216
217temp_path=/tmp
218
219
220
221# EVENT BROKER OPTIONS
222# Controls what (if any) data gets sent to the event broker.
223# Values: 0 = Broker nothing
224# -1 = Broker everything
225# <other> = See documentation
226
227event_broker_options=-1
228
229
230
231# EVENT BROKER MODULE(S)
232# This directive is used to specify an event broker module that should
233# be loaded by Nagios at startup. Use multiple directives if you want
234# to load more than one module. Arguments that should be passed to
235# the module at startup are separated from the module path by a space.
236#
237#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
238# WARNING !!! WARNING !!! WARNING !!! WARNING !!! WARNING !!! WARNING
239#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
240#
241# Do NOT overwrite modules while they are being used by Nagios or Nagios
242# will crash in a fiery display of SEGFAULT glory. This is a bug/limitation
243# either in dlopen(), the kernel, and/or the filesystem. And maybe Nagios...
244#
245# The correct/safe way of updating a module is by using one of these methods:
246# 1. Shutdown Nagios, replace the module file, restart Nagios
247# 2. Delete the original module file, move the new module file into place, restart Nagios
248#
249# Example:
250#
251# broker_module=<modulepath> [moduleargs]
252
253#broker_module=/somewhere/module1.o
254#broker_module=/somewhere/module2.o arg1 arg2=3 debug=0
255{% if enable_livestatus -%}
256broker_module=/usr/lib/check_mk/livestatus.o {{ livestatus_path }} {{ livestatus_args }}
257{% endif %}
258
259
260# LOG ROTATION METHOD
261# This is the log rotation method that Nagios should use to rotate
262# the main log file. Values are as follows:
263# n = None - don't rotate the log
264# h = Hourly rotation (top of the hour)
265# d = Daily rotation (midnight every day)
266# w = Weekly rotation (midnight on Saturday evening)
267# m = Monthly rotation (midnight last day of month)
268
269log_rotation_method={{ log_rotation_method }}
270
271
272
273# LOG ARCHIVE PATH
274# This is the directory where archived (rotated) log files should be
275# placed (assuming you've chosen to do log rotation).
276
277log_archive_path={{ log_archive_path }}
278
279
280
281# LOGGING OPTIONS
282# If you want messages logged to the syslog facility, as well as the
283# Nagios log file set this option to 1. If not, set it to 0.
284
285use_syslog={{ use_syslog }}
286
287
288
289# NOTIFICATION LOGGING OPTION
290# If you don't want notifications to be logged, set this value to 0.
291# If notifications should be logged, set the value to 1.
292
293log_notifications=1
294
295
296
297# SERVICE RETRY LOGGING OPTION
298# If you don't want service check retries to be logged, set this value
299# to 0. If retries should be logged, set the value to 1.
300
301log_service_retries=1
302
303
304
305# HOST RETRY LOGGING OPTION
306# If you don't want host check retries to be logged, set this value to
307# 0. If retries should be logged, set the value to 1.
308
309log_host_retries=1
310
311
312
313# EVENT HANDLER LOGGING OPTION
314# If you don't want host and service event handlers to be logged, set
315# this value to 0. If event handlers should be logged, set the value
316# to 1.
317
318log_event_handlers=1
319
320
321
322# INITIAL STATES LOGGING OPTION
323# If you want Nagios to log all initial host and service states to
324# the main log file (the first time the service or host is checked)
325# you can enable this option by setting this value to 1. If you
326# are not using an external application that does long term state
327# statistics reporting, you do not need to enable this option. In
328# this case, set the value to 0.
329
330log_initial_states=0
331
332
333
334# EXTERNAL COMMANDS LOGGING OPTION
335# If you don't want Nagios to log external commands, set this value
336# to 0. If external commands should be logged, set this value to 1.
337# Note: This option does not include logging of passive service
338# checks - see the option below for controlling whether or not
339# passive checks are logged.
340
341log_external_commands=1
342
343
344
345# PASSIVE CHECKS LOGGING OPTION
346# If you don't want Nagios to log passive host and service checks, set
347# this value to 0. If passive checks should be logged, set
348# this value to 1.
349
350log_passive_checks=1
351
352
353
354# GLOBAL HOST AND SERVICE EVENT HANDLERS
355# These options allow you to specify a host and service event handler
356# command that is to be run for every host or service state change.
357# The global event handler is executed immediately prior to the event
358# handler that you have optionally specified in each host or
359# service definition. The command argument is the short name of a
360# command definition that you define in your host configuration file.
361# Read the HTML docs for more information.
362
363#global_host_event_handler=somecommand
364#global_service_event_handler=somecommand
365
366
367
368# SERVICE INTER-CHECK DELAY METHOD
369# This is the method that Nagios should use when initially
370# "spreading out" service checks when it starts monitoring. The
371# default is to use smart delay calculation, which will try to
372# space all service checks out evenly to minimize CPU load.
373# Using no delay (n) will cause all checks to be scheduled
374# at the same time (with no delay between them)! This is not a
375# good thing for production, but is useful when testing the
376# parallelization functionality.
377# n = None - don't use any delay between checks
378# d = Use a "dumb" delay of 1 second between checks
379# s = Use "smart" inter-check delay calculation
380# x.xx = Use an inter-check delay of x.xx seconds
381
382service_inter_check_delay_method=s
383
384
385
386# MAXIMUM SERVICE CHECK SPREAD
387# This variable determines the timeframe (in minutes) from the
388# program start time that an initial check of all services should
389# be completed. Default is 30 minutes.
390
391max_service_check_spread=30
392
393
394
395# SERVICE CHECK INTERLEAVE FACTOR
396# This variable determines how service checks are interleaved.
397# Interleaving the service checks allows for a more even
398# distribution of service checks and reduced load on remote
399# hosts. Setting this value to 1 is equivalent to how versions
400# of Nagios previous to 0.0.5 did service checks. Set this
401# value to s (smart) for automatic calculation of the interleave
402# factor unless you have a specific reason to change it.
403# s = Use "smart" interleave factor calculation
404# x = Use an interleave factor of x, where x is a
405# number greater than or equal to 1.
406
407service_interleave_factor=s
408
409
410
411# HOST INTER-CHECK DELAY METHOD
412# This is the method that Nagios should use when initially
413# "spreading out" host checks when it starts monitoring. The
414# default is to use smart delay calculation, which will try to
415# space all host checks out evenly to minimize CPU load.
416# Using no delay (n) will cause all checks to be scheduled
417# at the same time (with no delay between them)!
418# n = None - don't use any delay between checks
419# d = Use a "dumb" delay of 1 second between checks
420# s = Use "smart" inter-check delay calculation
421# x.xx = Use an inter-check delay of x.xx seconds
422
423host_inter_check_delay_method=s
424
425
426
427# MAXIMUM HOST CHECK SPREAD
428# This variable determines the timeframe (in minutes) from the
429# program start time that an initial check of all hosts should
430# be completed. Default is 30 minutes.
431
432max_host_check_spread=30
433
434
435
436# MAXIMUM CONCURRENT SERVICE CHECKS
437# This option allows you to specify the maximum number of
438# service checks that can be run in parallel at any given time.
439# Specifying a value of 1 for this variable essentially prevents
440# any service checks from being parallelized. A value of 0
441# will not restrict the number of concurrent checks that are
442# being executed.
443
444max_concurrent_checks=0
445
446
447
448# HOST AND SERVICE CHECK REAPER FREQUENCY
449# This is the frequency (in seconds!) that Nagios will process
450# the results of host and service checks.
451
452check_result_reaper_frequency=10
453
454
455
456
457# MAX CHECK RESULT REAPER TIME
458# This is the max amount of time (in seconds) that a single
459# check result reaper event will be allowed to run before
460# returning control back to Nagios so it can perform other
461# duties.
462
463max_check_result_reaper_time=30
464
465
466
467
468# CHECK RESULT PATH
469# This is the directory where Nagios stores the results of host and
470# service checks that have not yet been processed.
471#
472# Note: Make sure that only one instance of Nagios has access
473# to this directory!
474
475check_result_path=/var/lib/nagios3/spool/checkresults
476
477
478
479
480# MAX CHECK RESULT FILE AGE
481# This option determines the maximum age (in seconds) which check
482# result files are considered to be valid. Files older than this
483# threshold will be mercilessly deleted without further processing.
484
485max_check_result_file_age=3600
486
487
488
489
490# CACHED HOST CHECK HORIZON
491# This option determines the maximum amount of time (in seconds)
492# that the state of a previous host check is considered current.
493# Cached host states (from host checks that were performed more
494# recently than the timeframe specified by this value) can immensely
495# improve performance in regards to the host check logic.
496# Too high of a value for this option may result in inaccurate host
497# states being used by Nagios, while a lower value may result in a
498# performance hit for host checks. Use a value of 0 to disable host
499# check caching.
500
501cached_host_check_horizon=15
502
503
504
505# CACHED SERVICE CHECK HORIZON
506# This option determines the maximum amount of time (in seconds)
507# that the state of a previous service check is considered current.
508# Cached service states (from service checks that were performed more
509# recently than the timeframe specified by this value) can immensely
510# improve performance in regards to predictive dependency checks.
511# Use a value of 0 to disable service check caching.
512
513cached_service_check_horizon=15
514
515
516
517# ENABLE PREDICTIVE HOST DEPENDENCY CHECKS
518# This option determines whether or not Nagios will attempt to execute
519# checks of hosts when it predicts that future dependency logic tests
520# may be needed. These predictive checks can help ensure that your
521# host dependency logic works well.
522# Values:
523# 0 = Disable predictive checks
524# 1 = Enable predictive checks (default)
525
526enable_predictive_host_dependency_checks=1
527
528
529
530# ENABLE PREDICTIVE SERVICE DEPENDENCY CHECKS
531# This option determines whether or not Nagios will attempt to execute
532# checks of services when it predicts that future dependency logic tests
533# may be needed. These predictive checks can help ensure that your
534# service dependency logic works well.
535# Values:
536# 0 = Disable predictive checks
537# 1 = Enable predictive checks (default)
538
539enable_predictive_service_dependency_checks=1
540
541
542
543# SOFT STATE DEPENDENCIES
544# This option determines whether or not Nagios will use soft state
545# information when checking host and service dependencies. Normally
546# Nagios will only use the latest hard host or service state when
547# checking dependencies. If you want it to use the latest state (regardless
548# of whether it's a soft or hard state type), enable this option.
549# Values:
550# 0 = Don't use soft state dependencies (default)
551# 1 = Use soft state dependencies
552
553soft_state_dependencies=0
554
555
556
557# TIME CHANGE ADJUSTMENT THRESHOLDS
558# These options determine when Nagios will react to detected changes
559# in system time (either forward or backwards).
560
561#time_change_threshold=900
562
563
564
565# AUTO-RESCHEDULING OPTION
566# This option determines whether or not Nagios will attempt to
567# automatically reschedule active host and service checks to
568# "smooth" them out over time. This can help balance the load on
569# the monitoring server.
570# WARNING: THIS IS AN EXPERIMENTAL FEATURE - IT CAN DEGRADE
571# PERFORMANCE, RATHER THAN INCREASE IT, IF USED IMPROPERLY
572
573auto_reschedule_checks=0
574
575
576
577# AUTO-RESCHEDULING INTERVAL
578# This option determines how often (in seconds) Nagios will
579# attempt to automatically reschedule checks. This option only
580# has an effect if the auto_reschedule_checks option is enabled.
581# Default is 30 seconds.
582# WARNING: THIS IS AN EXPERIMENTAL FEATURE - IT CAN DEGRADE
583# PERFORMANCE, RATHER THAN INCREASE IT, IF USED IMPROPERLY
584
585auto_rescheduling_interval=30
586
587
588
589# AUTO-RESCHEDULING WINDOW
590# This option determines the "window" of time (in seconds) that
591# Nagios will look at when automatically rescheduling checks.
592# Only host and service checks that occur in the next X seconds
593# (determined by this variable) will be rescheduled. This option
594# only has an effect if the auto_reschedule_checks option is
595# enabled. Default is 180 seconds (3 minutes).
596# WARNING: THIS IS AN EXPERIMENTAL FEATURE - IT CAN DEGRADE
597# PERFORMANCE, RATHER THAN INCREASE IT, IF USED IMPROPERLY
598
599auto_rescheduling_window=180
600
601
602
603# SLEEP TIME
604# This is the number of seconds to sleep between checking for system
605# events and service checks that need to be run.
606
607sleep_time=0.25
608
609
610
611# TIMEOUT VALUES
612# These options control how much time Nagios will allow various
613# types of commands to execute before killing them off. Options
614# are available for controlling maximum time allotted for
615# service checks, host checks, event handlers, notifications, the
616# ocsp command, and performance data commands. All values are in
617# seconds.
618
619service_check_timeout=60
620host_check_timeout=30
621event_handler_timeout=30
622notification_timeout=30
623ocsp_timeout=5
624perfdata_timeout=5
625
626
627
628# RETAIN STATE INFORMATION
629# This setting determines whether or not Nagios will save state
630# information for services and hosts before it shuts down. Upon
631# startup Nagios will reload all saved service and host state
632# information before starting to monitor. This is useful for
633# maintaining long-term data on state statistics, etc, but will
634# slow Nagios down a bit when it (re)starts. Since it's only
635# a one-time penalty, I think it's well worth the additional
636# startup delay.
637
638retain_state_information=1
639
640
641
642# STATE RETENTION FILE
643# This is the file that Nagios should use to store host and
644# service state information before it shuts down. The state
645# information in this file is also read immediately prior to
646# starting to monitor the network when Nagios is restarted.
647# This file is used only if the retain_state_information
648# variable is set to 1.
649
650state_retention_file=/var/lib/nagios3/retention.dat
651
652
653
654# RETENTION DATA UPDATE INTERVAL
655# This setting determines how often (in minutes) Nagios
656# will automatically save retention data during normal operation.
657# If you set this value to 0, Nagios will not save retention
658# data at regular intervals, but it will still save retention
659# data before shutting down or restarting. If you have disabled
660# state retention, this option has no effect.
661
662retention_update_interval=60
663
664
665
666# USE RETAINED PROGRAM STATE
667# This setting determines whether or not Nagios will set
668# program status variables based on the values saved in the
669# retention file. If you want to use retained program status
670# information, set this value to 1. If not, set this value
671# to 0.
672
673use_retained_program_state=1
674
675
676
677# USE RETAINED SCHEDULING INFO
678# This setting determines whether or not Nagios will retain
679# the scheduling info (next check time) for hosts and services
680# based on the values saved in the retention file. If you
681# want to use retained scheduling info, set this
682# value to 1. If not, set this value to 0.
683
684use_retained_scheduling_info=1
685
686
687
688# RETAINED ATTRIBUTE MASKS (ADVANCED FEATURE)
689# The following variables are used to specify specific host and
690# service attributes that should *not* be retained by Nagios during
691# program restarts.
692#
693# The values of the masks are bitwise ORs of values specified
694# by the "MODATTR_" definitions found in include/common.h.
695# For example, if you do not want the current enabled/disabled state
696# of flap detection and event handlers for hosts to be retained, you
697# would use a value of 24 for the host attribute mask...
698# MODATTR_EVENT_HANDLER_ENABLED (8) + MODATTR_FLAP_DETECTION_ENABLED (16) = 24
699
700# This mask determines what host attributes are not retained
701retained_host_attribute_mask=0
702
703# This mask determines what service attributes are not retained
704retained_service_attribute_mask=0
705
706# These two masks determine what process attributes are not retained.
707# There are two masks, because some process attributes have host and service
708# options. For example, you can disable active host checks, but leave active
709# service checks enabled.
710retained_process_host_attribute_mask=0
711retained_process_service_attribute_mask=0
712
713# These two masks determine what contact attributes are not retained.
714# There are two masks, because some contact attributes have host and
715# service options. For example, you can disable host notifications for
716# a contact, but leave service notifications enabled for them.
717retained_contact_host_attribute_mask=0
718retained_contact_service_attribute_mask=0
719
720
721
722# INTERVAL LENGTH
723# This is the seconds per unit interval as used in the
724# host/contact/service configuration files. Setting this to 60 means
725# that each interval is one minute long (60 seconds). Other settings
726# have not been tested much, so your mileage is likely to vary...
727
728interval_length=60
729
730
731
732# CHECK FOR UPDATES
733# This option determines whether Nagios will automatically check to
734# see if new updates (releases) are available. It is recommended that you
735# enable this option to ensure that you stay on top of the latest critical
736# patches to Nagios. Nagios is critical to you - make sure you keep it in
737# good shape. Nagios will check once a day for new updates. Data collected
738# by Nagios Enterprises from the update check is processed in accordance
739# with our privacy policy - see http://api.nagios.org for details.
740
741check_for_updates=1
742
743
744
745# BARE UPDATE CHECK
746# This option determines what data Nagios will send to api.nagios.org when
747# it checks for updates. By default, Nagios will send information on the
748# current version of Nagios you have installed, as well as an indicator as
749# to whether this was a new installation or not. Nagios Enterprises uses
750# this data to determine the number of users running specific versions of
751# Nagios. Enable this option if you do not want this information to be sent.
752
753bare_update_check=0
754
755
756
757# AGGRESSIVE HOST CHECKING OPTION
758# If you don't want to turn on aggressive host checking features, set
759# this value to 0 (the default). Otherwise set this value to 1 to
760# enable the aggressive check option. Read the docs for more info
761# on what aggressive host checking is, or check out the source code in
762# base/checks.c
763
764use_aggressive_host_checking=0
765
766
767
768# SERVICE CHECK EXECUTION OPTION
769# This determines whether or not Nagios will actively execute
770# service checks when it initially starts. If this option is
771# disabled, checks are not actively made, but Nagios can still
772# receive and process passive check results that come in. Unless
773# you're implementing redundant hosts or have a special need for
774# disabling the execution of service checks, leave this enabled!
775# Values: 1 = enable checks, 0 = disable checks
776
777execute_service_checks=1
778
779
780
781# PASSIVE SERVICE CHECK ACCEPTANCE OPTION
782# This determines whether or not Nagios will accept passive
783# service check results when it initially (re)starts.
784# Values: 1 = accept passive checks, 0 = reject passive checks
785
786accept_passive_service_checks=1
787
788
789
790# HOST CHECK EXECUTION OPTION
791# This determines whether or not Nagios will actively execute
792# host checks when it initially starts. If this option is
793# disabled, checks are not actively made, but Nagios can still
794# receive and process passive check results that come in. Unless
795# you're implementing redundant hosts or have a special need for
796# disabling the execution of host checks, leave this enabled!
797# Values: 1 = enable checks, 0 = disable checks
798
799execute_host_checks=1
800
801
802
803# PASSIVE HOST CHECK ACCEPTANCE OPTION
804# This determines whether or not Nagios will accept passive
805# host check results when it initially (re)starts.
806# Values: 1 = accept passive checks, 0 = reject passive checks
807
808accept_passive_host_checks=1
809
810
811
812# NOTIFICATIONS OPTION
813# This determines whether or not Nagios will send out any host or
814# service notifications when it is initially (re)started.
815# Values: 1 = enable notifications, 0 = disable notifications
816
817enable_notifications=1
818
819
820
821# EVENT HANDLER USE OPTION
822# This determines whether or not Nagios will run any host or
823# service event handlers when it is initially (re)started. Unless
824# you're implementing redundant hosts, leave this option enabled.
825# Values: 1 = enable event handlers, 0 = disable event handlers
826
827enable_event_handlers=1
828
829
830
831# PROCESS PERFORMANCE DATA OPTION
832# This determines whether or not Nagios will process performance
833# data returned from service and host checks. If this option is
834# enabled, host performance data will be processed using the
835# host_perfdata_command (defined below) and service performance
836# data will be processed using the service_perfdata_command (also
837# defined below). Read the HTML docs for more information on
838# performance data.
839# Values: 1 = process performance data, 0 = do not process performance data
840
841process_performance_data=0
842
843
844
845# HOST AND SERVICE PERFORMANCE DATA PROCESSING COMMANDS
846# These commands are run after every host and service check is
847# performed. These commands are executed only if the
848# process_performance_data option (above) is set to 1. The command
849# argument is the short name of a command definition that you
850# define in your host configuration file. Read the HTML docs for
851# more information on performance data.
852
853#host_perfdata_command=process-host-perfdata
854#service_perfdata_command=process-service-perfdata
855
856
857
858# HOST AND SERVICE PERFORMANCE DATA FILES
859# These files are used to store host and service performance data.
860# Performance data is only written to these files if the
861# process_performance_data option (above) is set to 1.
862
863#host_perfdata_file=/tmp/host-perfdata
864#service_perfdata_file=/tmp/service-perfdata
865
866
867
868# HOST AND SERVICE PERFORMANCE DATA FILE TEMPLATES
869# These options determine what data is written (and how) to the
870# performance data files. The templates may contain macros, special
871# characters (\t for tab, \r for carriage return, \n for newline)
872# and plain text. A newline is automatically added after each write
873# to the performance data file. Some examples of what you can do are
874# shown below.
875
876#host_perfdata_file_template=[HOSTPERFDATA]\t$TIMET$\t$HOSTNAME$\t$HOSTEXECUTIONTIME$\t$HOSTOUTPUT$\t$HOSTPERFDATA$
877#service_perfdata_file_template=[SERVICEPERFDATA]\t$TIMET$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$
878
879
880
881# HOST AND SERVICE PERFORMANCE DATA FILE MODES
882# This option determines whether or not the host and service
883# performance data files are opened in write ("w") or append ("a")
884# mode. If you want to use named pipes, you should use the special
885# pipe ("p") mode, which avoids blocking at startup; otherwise you will
886# likely want the default append ("a") mode.
887
888#host_perfdata_file_mode=a
889#service_perfdata_file_mode=a
890
891
892
893# HOST AND SERVICE PERFORMANCE DATA FILE PROCESSING INTERVAL
894# These options determine how often (in seconds) the host and service
895# performance data files are processed using the commands defined
896# below. A value of 0 indicates the files should not be periodically
897# processed.
898
899#host_perfdata_file_processing_interval=0
900#service_perfdata_file_processing_interval=0
901
902
903
904# HOST AND SERVICE PERFORMANCE DATA FILE PROCESSING COMMANDS
905# These commands are used to periodically process the host and
906# service performance data files. The interval at which the
907# processing occurs is determined by the options above.
908
909#host_perfdata_file_processing_command=process-host-perfdata-file
910#service_perfdata_file_processing_command=process-service-perfdata-file
911
912
913
914# HOST AND SERVICE PERFORMANCE DATA PROCESS EMPTY RESULTS
915# These options determine whether the core will process empty perfdata
916# results or not. This is needed for distributed monitoring, and intentionally
917# turned on by default.
918# If you don't require empty perfdata - saving some cpu cycles
919# on unwanted macro calculation - you can turn that off. Be careful!
920# Values: 1 = enable, 0 = disable
921
922#host_perfdata_process_empty_results=1
923#service_perfdata_process_empty_results=1
924
925
926# OBSESS OVER SERVICE CHECKS OPTION
927# This determines whether or not Nagios will obsess over service
928# checks and run the ocsp_command defined below. Unless you're
929# planning on implementing distributed monitoring, do not enable
930# this option. Read the HTML docs for more information on
931# implementing distributed monitoring.
932# Values: 1 = obsess over services, 0 = do not obsess (default)
933
934obsess_over_services=0
935
936
937
938# OBSESSIVE COMPULSIVE SERVICE PROCESSOR COMMAND
939# This is the command that is run for every service check that is
940# processed by Nagios. This command is executed only if the
941# obsess_over_services option (above) is set to 1. The command
942# argument is the short name of a command definition that you
943# define in your host configuration file. Read the HTML docs for
944# more information on implementing distributed monitoring.
945
946#ocsp_command=somecommand
947
948
949
950# OBSESS OVER HOST CHECKS OPTION
951# This determines whether or not Nagios will obsess over host
952# checks and run the ochp_command defined below. Unless you're
953# planning on implementing distributed monitoring, do not enable
954# this option. Read the HTML docs for more information on
955# implementing distributed monitoring.
956# Values: 1 = obsess over hosts, 0 = do not obsess (default)
957
958obsess_over_hosts=0
959
960
961
962# OBSESSIVE COMPULSIVE HOST PROCESSOR COMMAND
963# This is the command that is run for every host check that is
964# processed by Nagios. This command is executed only if the
965# obsess_over_hosts option (above) is set to 1. The command
966# argument is the short name of a command definition that you
967# define in your host configuration file. Read the HTML docs for
968# more information on implementing distributed monitoring.
969
970#ochp_command=somecommand
971
972
973
974# TRANSLATE PASSIVE HOST CHECKS OPTION
975# This determines whether or not Nagios will translate
976# DOWN/UNREACHABLE passive host check results into their proper
977# state for this instance of Nagios. This option is useful
978# if you have distributed or failover monitoring setup. In
979# these cases your other Nagios servers probably have a different
980# "view" of the network, with regards to the parent/child relationship
981# of hosts. If a distributed monitoring server thinks a host
982# is DOWN, it may actually be UNREACHABLE from the point of
983# this Nagios instance. Enabling this option will tell Nagios
984# to translate any DOWN or UNREACHABLE host states it receives
985# passively into the correct state from the view of this server.
986# Values: 1 = perform translation, 0 = do not translate (default)
987
988translate_passive_host_checks=0
989
990
991
992# PASSIVE HOST CHECKS ARE SOFT OPTION
993# This determines whether or not Nagios will treat passive host
994# checks as being HARD or SOFT. By default, a passive host check
995# result will put a host into a HARD state type. This can be changed
996# by enabling this option.
997# Values: 0 = passive checks are HARD, 1 = passive checks are SOFT
998
999passive_host_checks_are_soft=0
1000
1001
1002
1003# ORPHANED HOST/SERVICE CHECK OPTIONS
1004# These options determine whether or not Nagios will periodically
1005# check for orphaned host and service checks. Since service checks are
1006# not rescheduled until the results of their previous execution
1007# instance are processed, there exists a possibility that some
1008# checks may never get rescheduled. A similar situation exists for
1009# host checks, although the exact scheduling details differ a bit
1010# from service checks. Orphaned checks seem to be a rare
1011# problem and should not happen under normal circumstances.
1012# If you have problems with service checks never getting
1013# rescheduled, make sure you have orphaned service checks enabled.
1014# Values: 1 = enable checks, 0 = disable checks
1015
1016check_for_orphaned_services=1
1017check_for_orphaned_hosts=1
1018
1019
1020
1021# SERVICE FRESHNESS CHECK OPTION
1022# This option determines whether or not Nagios will periodically
1023# check the "freshness" of service results. Enabling this option
1024# is useful for ensuring passive checks are received in a timely
1025# manner.
1026# Values: 1 = enable freshness checking, 0 = disable freshness checking
1027
1028check_service_freshness=1
1029
1030
1031
1032# SERVICE FRESHNESS CHECK INTERVAL
1033# This setting determines how often (in seconds) Nagios will
1034# check the "freshness" of service check results. If you have
1035# disabled service freshness checking, this option has no effect.
1036
1037service_freshness_check_interval=60
1038
1039
1040
1041# SERVICE CHECK TIMEOUT STATE
1042# This setting determines the state Nagios will report when a
1043# service check times out - that is, does not respond within
1044# service_check_timeout seconds. This can be useful if a
1045# machine is running at too high a load and you do not want
1046# to consider a failed service check to be critical (the default).
1047# Valid settings are:
1048# c - Critical (default)
1049# u - Unknown
1050# w - Warning
1051# o - OK
1052
1053service_check_timeout_state=c
1054
1055
1056
1057# HOST FRESHNESS CHECK OPTION
1058# This option determines whether or not Nagios will periodically
1059# check the "freshness" of host results. Enabling this option
1060# is useful for ensuring passive checks are received in a timely
1061# manner.
1062# Values: 1 = enabled freshness checking, 0 = disable freshness checking
1063
1064check_host_freshness=0
1065
1066
1067
1068# HOST FRESHNESS CHECK INTERVAL
1069# This setting determines how often (in seconds) Nagios will
1070# check the "freshness" of host check results. If you have
1071# disabled host freshness checking, this option has no effect.
1072
1073host_freshness_check_interval=60
1074
1075
1076
1077
1078# ADDITIONAL FRESHNESS THRESHOLD LATENCY
1079# This setting determines the number of seconds that Nagios
1080# will add to any host and service freshness thresholds that
1081# it calculates (those not explicitly specified by the user).
1082
1083additional_freshness_latency=15
1084
1085
1086
1087
1088# FLAP DETECTION OPTION
1089# This option determines whether or not Nagios will try
1090# and detect hosts and services that are "flapping".
1091# Flapping occurs when a host or service changes between
1092# states too frequently. When Nagios detects that a
1093# host or service is flapping, it will temporarily suppress
1094# notifications for that host/service until it stops
1095# flapping. Flap detection is very experimental, so read
1096# the HTML documentation before enabling this feature!
1097# Values: 1 = enable flap detection
1098# 0 = disable flap detection (default)
1099
1100enable_flap_detection=1
1101
1102
1103
1104# FLAP DETECTION THRESHOLDS FOR HOSTS AND SERVICES
1105# Read the HTML documentation on flap detection for
1106# an explanation of what this option does. This option
1107# has no effect if flap detection is disabled.
1108
1109low_service_flap_threshold=5.0
1110high_service_flap_threshold=20.0
1111low_host_flap_threshold=5.0
1112high_host_flap_threshold=20.0
1113
1114
1115
1116# DATE FORMAT OPTION
1117# This option determines how short dates are displayed. Valid options
1118# include:
1119# us (MM-DD-YYYY HH:MM:SS)
1120# euro (DD-MM-YYYY HH:MM:SS)
1121# iso8601 (YYYY-MM-DD HH:MM:SS)
1122# strict-iso8601 (YYYY-MM-DDTHH:MM:SS)
1123#
1124
1125date_format=iso8601
1126
1127
1128
1129
1130# TIMEZONE OFFSET
1131# This option is used to override the default timezone that this
1132# instance of Nagios runs in. If not specified, Nagios will use
1133# the system configured timezone.
1134#
1135# NOTE: In order to display the correct timezone in the CGIs, you
1136# will also need to alter the Apache directives for the CGI path
1137# to include your timezone. Example:
1138#
1139# <Directory "/usr/local/nagios/sbin/">
1140# SetEnv TZ "Australia/Brisbane"
1141# ...
1142# </Directory>
1143
1144#use_timezone=US/Mountain
1145#use_timezone=Australia/Brisbane
1146
1147
1148
1149
1150# P1.PL FILE LOCATION
1151# This value determines where the p1.pl perl script (used by the
1152# embedded Perl interpreter) is located. If you didn't compile
1153# Nagios with embedded Perl support, this option has no effect.
1154
1155p1_file=/usr/lib/nagios3/p1.pl
1156
1157
1158
1159# EMBEDDED PERL INTERPRETER OPTION
1160# This option determines whether or not the embedded Perl interpreter
1161# will be enabled during runtime. This option has no effect if Nagios
1162# has not been compiled with support for embedded Perl.
1163# Values: 0 = disable interpreter, 1 = enable interpreter
1164
1165enable_embedded_perl=1
1166
1167
1168
1169# EMBEDDED PERL USAGE OPTION
1170# This option determines whether or not Nagios will process Perl plugins
1171# and scripts with the embedded Perl interpreter if the plugins/scripts
1172# do not explicitly indicate whether or not it is okay to do so. Read
1173# the HTML documentation on the embedded Perl interpreter for more
1174# information on how this option works.
1175
1176use_embedded_perl_implicitly=1
1177
1178
1179
1180# ILLEGAL OBJECT NAME CHARACTERS
1181# This option allows you to specify illegal characters that cannot
1182# be used in host names, service descriptions, or names of other
1183# object types.
1184
1185illegal_object_name_chars=`~!$%^&*|'"<>?,()=
1186
1187
1188
1189# ILLEGAL MACRO OUTPUT CHARACTERS
1190# This option allows you to specify illegal characters that are
1191# stripped from macros before being used in notifications, event
1192# handlers, etc. This DOES NOT affect macros used in service or
1193# host check commands.
1194# The following macros are stripped of the characters you specify:
1195# $HOSTOUTPUT$
1196# $HOSTPERFDATA$
1197# $HOSTACKAUTHOR$
1198# $HOSTACKCOMMENT$
1199# $SERVICEOUTPUT$
1200# $SERVICEPERFDATA$
1201# $SERVICEACKAUTHOR$
1202# $SERVICEACKCOMMENT$
1203
1204illegal_macro_output_chars=`~$&|'"<>
1205
1206
1207
1208# REGULAR EXPRESSION MATCHING
1209# This option controls whether or not regular expression matching
1210# takes place in the object config files. Regular expression
1211# matching is used to match host, hostgroup, service, and service
1212# group names/descriptions in some fields of various object types.
1213# Values: 1 = enable regexp matching, 0 = disable regexp matching
1214
1215use_regexp_matching=0
1216
1217
1218
1219# "TRUE" REGULAR EXPRESSION MATCHING
1220# This option controls whether or not "true" regular expression
1221# matching takes place in the object config files. This option
1222# only has an effect if regular expression matching is enabled
1223# (see above). If this option is DISABLED, regular expression
1224# matching only occurs if a string contains wildcard characters
1225# (* and ?). If the option is ENABLED, regexp matching occurs
1226# all the time (which can be annoying).
1227# Values: 1 = enable true matching, 0 = disable true matching
1228
1229use_true_regexp_matching=0
1230
1231
1232
1233# ADMINISTRATOR EMAIL/PAGER ADDRESSES
1234# The email and pager address of a global administrator (likely you).
1235# Nagios never uses these values itself, but you can access them by
1236# using the $ADMINEMAIL$ and $ADMINPAGER$ macros in your notification
1237# commands.
1238
1239admin_email={{ admin_email }}
1240admin_pager={{ admin_pager }}
1241
1242
1243
1244# DAEMON CORE DUMP OPTION
1245# This option determines whether or not Nagios is allowed to create
1246# a core dump when it runs as a daemon. Note that it is generally
1247# considered bad form to allow this, but it may be useful for
1248# debugging purposes. Enabling this option doesn't guarantee that
1249# a core file will be produced, but that's just life...
1250# Values: 1 - Allow core dumps
1251# 0 - Do not allow core dumps (default)
1252
1253daemon_dumps_core={{ daemon_dumps_core }}
1254
1255
1256
1257# LARGE INSTALLATION TWEAKS OPTION
1258# This option determines whether or not Nagios will take some shortcuts
1259# which can save on memory and CPU usage in large Nagios installations.
1260# Read the documentation for more information on the benefits/tradeoffs
1261# of enabling this option.
1262# Values: 1 - Enabled tweaks
1263# 0 - Disable tweaks (default)
1264
1265use_large_installation_tweaks=0
1266
1267
1268
1269# ENABLE ENVIRONMENT MACROS
1270# This option determines whether or not Nagios will make all standard
1271# macros available as environment variables when host/service checks
1272# and system commands (event handlers, notifications, etc.) are
1273# executed. Enabling this option can cause performance issues in
1274# large installations, as it will consume a bit more memory and (more
1275# importantly) consume more CPU.
1276# Values: 1 - Enable environment variable macros (default)
1277# 0 - Disable environment variable macros
1278
1279enable_environment_macros=1
1280
1281
1282
1283# CHILD PROCESS MEMORY OPTION
1284# This option determines whether or not Nagios will free memory in
1285# child processes (processes used to execute system commands and host/
1286# service checks). If you specify a value here, it will override
1287# program defaults.
1288# Value: 1 - Free memory in child processes
1289# 0 - Do not free memory in child processes
1290
1291#free_child_process_memory=1
1292
1293
1294
1295# CHILD PROCESS FORKING BEHAVIOR
1296# This option determines how Nagios will fork child processes
1297# (used to execute system commands and host/service checks). Normally
1298# child processes are fork()ed twice, which provides a very high level
1299# of isolation from problems. Fork()ing once is probably enough and will
1300# save a great deal on CPU usage (in large installs), so you might
1301# want to consider using this. If you specify a value here, it will override
1302# program defaults.
1303# Value: 1 - Child processes fork() twice
1304# 0 - Child processes fork() just once
1305
1306#child_processes_fork_twice=1
1307
1308
1309
1310# DEBUG LEVEL
1311# This option determines how much (if any) debugging information will
1312# be written to the debug file. OR values together to log multiple
1313# types of information.
1314# Values:
1315# -1 = Everything
1316# 0 = Nothing
1317# 1 = Functions
1318# 2 = Configuration
1319# 4 = Process information
1320# 8 = Scheduled events
1321# 16 = Host/service checks
1322# 32 = Notifications
1323# 64 = Event broker
1324# 128 = External commands
1325# 256 = Commands
1326# 512 = Scheduled downtime
1327# 1024 = Comments
1328# 2048 = Macros
1329
1330debug_level={{ debug_level }}
1331
1332
1333
1334# DEBUG VERBOSITY
1335# This option determines how verbose the debug log output will be.
1336# Values: 0 = Brief output
1337# 1 = More detailed
1338# 2 = Very detailed
1339
1340debug_verbosity={{ debug_verbosity }}
1341
1342
1343
1344# DEBUG FILE
1345# This option determines where Nagios should write debugging information.
1346
1347debug_file={{ debug_file }}
1348
1349
1350
1351# MAX DEBUG FILE SIZE
1352# This option determines the maximum size (in bytes) of the debug file. If
1353# the file grows larger than this size, it will be renamed with a .old
1354# extension. If a file already exists with a .old extension it will
1355# automatically be deleted. This helps ensure your disk space usage doesn't
1356# get out of control when debugging Nagios.
1357
1358max_debug_file_size=1000000
1359
1360
=== added file 'hooks/templates/pagerduty_nagios_cfg.tmpl'
--- hooks/templates/pagerduty_nagios_cfg.tmpl 1970-01-01 00:00:00 +0000
+++ hooks/templates/pagerduty_nagios_cfg.tmpl 2015-09-08 04:46:07 +0000
@@ -0,0 +1,26 @@
+#------------------------------------------------
+# This file is juju managed
+#------------------------------------------------
+
+define contact {
+    contact_name                    pagerduty
+    alias                           PagerDuty Pseudo-Contact
+    service_notification_period     24x7
+    host_notification_period        24x7
+    service_notification_options    w,u,c,r
+    host_notification_options       d,r
+    service_notification_commands   notify-service-by-pagerduty
+    host_notification_commands      notify-host-by-pagerduty
+    pager                           {{ pagerduty_key }}
+}
+
+define command {
+    command_name    notify-service-by-pagerduty
+    command_line    /usr/local/bin/pagerduty_nagios.pl enqueue -f pd_nagios_object=service -q {{ pagerduty_path }}
+}
+
+define command {
+    command_name    notify-host-by-pagerduty
+    command_line    /usr/local/bin/pagerduty_nagios.pl enqueue -f pd_nagios_object=host -q {{ pagerduty_path }}
+}
+
=== modified file 'hooks/upgrade-charm'
--- hooks/upgrade-charm 2015-05-07 07:07:09 +0000
+++ hooks/upgrade-charm 2015-09-08 04:46:07 +0000
@@ -5,12 +5,12 @@
 import base64
 from jinja2 import Template
 import os
-import re
+# import re
 import pwd
 import grp
 import stat
 import errno
-# import shutil
+import shutil
 import subprocess
 from charmhelpers.contrib import ssl
 from charmhelpers.core import hookenv, host
@@ -23,10 +23,17 @@
 extra_config = hookenv.config('extraconfig')
 enable_livestatus = hookenv.config('enable_livestatus')
 livestatus_path = hookenv.config('livestatus_path')
+enable_pagerduty = hookenv.config('enable_pagerduty')
+pagerduty_key = hookenv.config('pagerduty_key')
+pagerduty_path = hookenv.config('pagerduty_path')
+nagios_user = hookenv.config('nagios_user')
+nagios_group = hookenv.config('nagios_group')
 ssl_config = hookenv.config('ssl')
 charm_dir = os.environ['CHARM_DIR']
 cert_domain = hookenv.unit_get('public-address')
 nagios_cfg = "/etc/nagios3/nagios.cfg"
+pagerduty_cfg = "/etc/nagios3/conf.d/pagerduty_nagios.cfg"
+pagerduty_cron = "/etc/cron.d/nagios-pagerduty-flush"
 
 
 # Checks the charm relations for legacy relations
@@ -79,23 +86,9 @@
         hookenv.log("Livestatus is enabled")
         fetch.apt_update()
         fetch.apt_install('check-mk-livestatus')
-        broker = re.compile("^broker_module=")
-        broker_found = False
-        for line in open(nagios_cfg):
-            if broker.match(line):
-                broker_found = True
-                hookenv.log("broker_module line exists, not adding..")
-                break
-
-        if not broker_found:
-            with open(nagios_cfg, "a") as nagiosfile:
-                broker_str = "broker_module=/usr/lib/check_mk/livestatus.o " . livestatus_path
-                nagiosfile.write(broker_str)
-                nagiosfile.close()
 
         # Make the directory and fix perms on it
         hookenv.log("Fixing perms on livestatus_path")
-        livestatus_path = hookenv.config('livestatus_path')
         livestatus_dir = os.path.dirname(livestatus_path)
         if not os.path.isdir(livestatus_dir):
             hookenv.log("Making path for livestatus_dir")
@@ -104,7 +97,7 @@
 
         # Fix the perms on the socket
         hookenv.log("Fixing perms on the socket")
-        uid = pwd.getpwnam("nagios").pw_uid
+        uid = pwd.getpwnam(nagios_user).pw_uid
         gid = grp.getgrnam("www-data").gr_gid
         os.chown(livestatus_path, uid, gid)
         os.chown(livestatus_dir, uid, gid)
@@ -113,6 +106,57 @@
         os.chmod(livestatus_dir, st.st_mode | stat.S_IRGRP | stat.S_ISGID)
 
 
+def enable_pagerduty_config():
+    if enable_pagerduty:
+        hookenv.log("Pagerduty is enabled")
+
+        # Ship the pagerduty_nagios.cfg file
+        template_values = {'enable_pagerduty': enable_pagerduty,
+                           'pagerduty_key': pagerduty_key,
+                           'pagerduty_path': pagerduty_path}
+
+        with open('hooks/templates/pagerduty_nagios_cfg.tmpl', 'r') as f:
+            templateDef = f.read()
+
+        t = Template(templateDef)
+        with open(pagerduty_cfg, 'w') as f:
+            f.write(t.render(template_values))
+
+        # Ship the cron file
+        shutil.copyfile('files/nagios-pagerduty-flush-cron', pagerduty_cron)
+
+        # Ship the pagerduty_nagios.pl script
+        shutil.copyfile('files/pagerduty_nagios.pl', '/usr/local/bin/pagerduty_nagios.pl')
+
+        # Create the pagerduty queue dir
+        if not os.path.isdir(pagerduty_path):
+            hookenv.log("Making path for pagerduty_path")
+            mkdir_p(pagerduty_path)
+        # Fix the perms on it
+        uid = pwd.getpwnam(nagios_user).pw_uid
+        gid = grp.getgrnam(nagios_group).gr_gid
+        os.chown(pagerduty_path, uid, gid)
+    else:
+        # Clean up the files if we don't want pagerduty
+        if os.path.isfile(pagerduty_cfg):
+            os.remove(pagerduty_cfg)
+        if os.path.isfile(pagerduty_cron):
+            os.remove(pagerduty_cron)
+
+    # Update contacts for admin
+    template_values = {'enable_pagerduty': enable_pagerduty,
+                       'admin_email': hookenv.config('admin_email')}
+
+    with open('hooks/templates/contacts-cfg.tmpl', 'r') as f:
+        templateDef = f.read()
+
+    t = Template(templateDef)
+    with open('/etc/nagios3/conf.d/contacts_nagios2.cfg', 'w') as f:
+        f.write(t.render(template_values))
+
+    host.service_reload('nagios3')
+
+
 def ssl_configured():
     allowed_options = ["on", "only"]
     if str(ssl_config).lower() in allowed_options:
@@ -172,6 +216,34 @@
         hookenv.log("Decoded SSL files", "INFO")
 
 
+def update_config():
+    template_values = {'nagios_user': nagios_user,
+                       'nagios_group': nagios_group,
+                       'enable_livestatus': enable_livestatus,
+                       'livestatus_path': livestatus_path,
+                       'livestatus_args': hookenv.config('livestatus_args'),
+                       'check_external_commands': hookenv.config('check_external_commands'),
+                       'command_check_interval': hookenv.config('command_check_interval'),
+                       'command_file': hookenv.config('command_file'),
+                       'debug_file': hookenv.config('debug_file'),
+                       'debug_verbosity': hookenv.config('debug_verbosity'),
+                       'debug_level': hookenv.config('debug_level'),
+                       'daemon_dumps_core': hookenv.config('daemon_dumps_core'),
+                       'admin_email': hookenv.config('admin_email'),
+                       'admin_pager': hookenv.config('admin_pager'),
+                       'log_rotation_method': hookenv.config('log_rotation_method'),
+                       'log_archive_path': hookenv.config('log_archive_path'),
+                       'use_syslog': hookenv.config('use_syslog')}
+
+    with open('hooks/templates/nagios-cfg.tmpl', 'r') as f:
+        templateDef = f.read()
+
+    t = Template(templateDef)
+    with open(nagios_cfg, 'w') as f:
+        f.write(t.render(template_values))
+
+    host.service_reload('nagios3')
+
 # Nagios3 is deployed as a global apache application from the archive.
 # We'll get a little funky and add the SSL keys to the default-ssl config
 # which sets our keys, including the self-signed ones, as the host keyfiles.
@@ -212,7 +284,9 @@
 
 warn_legacy_relations()
 write_extra_config()
+update_config()
 enable_livestatus_config()
+enable_pagerduty_config()
 if ssl_configured():
     enable_ssl()
 update_apache()
=== modified file 'tests/00-setup'
--- tests/00-setup 2014-03-19 22:03:03 +0000
+++ tests/00-setup 2015-09-08 04:46:07 +0000
@@ -1,5 +1,5 @@
 #!/bin/bash
 
 sudo add-apt-repository -y ppa:juju/stable
-apt-get update
+sudo apt-get update
 sudo apt-get install -y amulet juju-local python3-requests
=== added file 'tests/24-pagerduty-test'
--- tests/24-pagerduty-test 1970-01-01 00:00:00 +0000
+++ tests/24-pagerduty-test 2015-09-08 04:46:07 +0000
@@ -0,0 +1,56 @@
+#!/usr/bin/python3
+
+from time import sleep
+import amulet
+# import requests
+
+seconds = 20000
+
+d = amulet.Deployment(series='trusty')
+
+d.add('nagios')
+
+d.expose('nagios')
+
+try:
+    d.setup(timeout=seconds)
+    d.sentry.wait()
+except amulet.helpers.TimeoutError:
+    amulet.raise_status(amulet.SKIP, msg="Environment wasn't stood up in time")
+except:
+    raise
+
+
+##
+# Set relationship aliases
+##
+nagios_unit = d.sentry.unit['nagios/0']
+
+d.configure('nagios', {
+    'enable_pagerduty': True
+})
+
+d.sentry.wait()
+
+# Give it a while to settle
+sleep(30)
+
+def test_pagerduty_path_exists():
+    pagerduty_path = nagios_unit.run('config-get pagerduty_path')
+    try:
+        pagerduty_file = nagios_unit.file(pagerduty_path[0])
+    except OSError:
+        message = "Can't find pagerduty directory"
+        amulet.raise_status(amulet.FAIL, msg=message)
+
+
+def test_pagerduty_config():
+    pagerduty_cfg = '/etc/nagios3/conf.d/pagerduty_nagios.cfg'
+    try:
+        pagerduty_cfg_file = nagios_unit.file(pagerduty_cfg)
+    except OSError:
+        message = "Can't find pagerduty config file"
+        amulet.raise_status(amulet.FAIL, msg=message)
+
+test_pagerduty_path_exists()
+test_pagerduty_config()
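
Note on tests/24-pagerduty-test: the test relies on a fixed sleep(30) after d.sentry.wait() before checking for the queue directory and config file. A bounded retry loop around each check, as suggested in the review, is one way to give the unit more leeway. The following is only a minimal sketch: wait_for is a hypothetical helper (it is not part of amulet), and it assumes the wrapped check signals "not there yet" by raising OSError, which is what the test's own except clauses expect.

```python
from time import sleep


def wait_for(check, attempts=10, delay=3):
    """Call check() until it stops raising OSError or attempts run out.

    Hypothetical helper for illustration; not part of amulet.
    Returns True as soon as check() completes, False if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            check()
            return True
        except OSError:
            # Not there yet: pause before the next attempt,
            # but do not sleep after the final one.
            if attempt < attempts - 1:
                sleep(delay)
    return False
```

In the test, each nagios_unit.file(...) call could be passed in as a lambda, with amulet.raise_status(amulet.FAIL, ...) raised only when wait_for returns False. The attempt count and delay shown here are placeholders, not tuned values.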
