Merge lp:~brad-marshall/charms/trusty/nagios/add-extra-config-options into lp:charms/trusty/nagios

Proposed by Brad Marshall
Status: Merged
Merged at revision: 33
Proposed branch: lp:~brad-marshall/charms/trusty/nagios/add-extra-config-options
Merge into: lp:charms/trusty/nagios
Diff against target: 2187 lines (+2014/-28)
11 files modified
README.md (+40/-1)
config.yaml (+93/-0)
files/nagios-pagerduty-flush-cron (+7/-0)
files/pagerduty_nagios.pl (+289/-0)
hooks/install (+0/-9)
hooks/templates/contacts-cfg.tmpl (+51/-0)
hooks/templates/nagios-cfg.tmpl (+1360/-0)
hooks/templates/pagerduty_nagios_cfg.tmpl (+26/-0)
hooks/upgrade-charm (+91/-17)
tests/00-setup (+1/-1)
tests/24-pagerduty-test (+56/-0)
To merge this branch: bzr merge lp:~brad-marshall/charms/trusty/nagios/add-extra-config-options
Reviewer Review Type Date Requested Status
Cory Johns (community) Approve
Review via email: mp+265480@code.launchpad.net

Description of the change

Adding extra nagios config options, as well as Pagerduty notification support.

Cory Johns (johnsca) wrote :

Brad,

Thank you for your addition to this charm.

Overall, these changes look good. I particularly think that the switch to a template for the config files makes it easier to follow. However, I did run into some issues when running the tests.

I had a few tests fail that were unrelated to this change, and the mysql charm seems to error when removing the nagios relation. However, both of those seem to be outside the scope of this review, and there is an existing bug (https://bugs.launchpad.net/charms/+source/nagios/+bug/1403574) related to test failures in this charm, so I'm just going to note it here and move on.

Unfortunately, I also got an error on the newly added test: http://pastebin.ubuntu.com/12014504/ When I re-ran the test to attempt to triage it, it passed without error, so I think this is a race condition. You already have a sentry.wait() in the test, so I'm really not sure why I ran into that error. Perhaps, though, a for-range loop + sleep could be wrapped around the directory check to give it a bit of leeway to account for the race?

Cory Johns (johnsca) wrote :

Oh, I forgot to mention that the "apt-get update" line in 00-setup is missing a "sudo". That's also not related to this change, but if you make a change to account for the test race, perhaps you would be willing to make that simple change as well?

48. By Brad Marshall

[bradm] Added missing sudo on apt-get

Brad Marshall (brad-marshall) wrote :

I've added the missing sudo to the apt-get update. I'm not sure how to handle the race condition; all the try block does is check for a file's existence. Maybe I should add extra time to the timeout on the sentry wait?

Cory Johns (johnsca) wrote :

I'm still hitting the race condition in the test consistently on local provider. I believe the issue is that the sentry.wait() after setting the config doesn't actually block because the unit is not executing the hook yet when it gets there, so the file check happens before the hook is done.

I submitted an MP against this branch (https://code.launchpad.net/~johnsca/charms/trusty/nagios/pagerduty-test-race/+merge/269401) with the suggested work-around. That will retry three times at 5 second intervals. Depending on the cloud, you may want to increase that, but on local provider that duration (15s) was sufficient and the test passed.
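
For readers following the thread, a minimal sketch of the kind of retry loop described above (three attempts, 5 seconds apart). This is not the code from the linked MP; the helper name `wait_for` and its parameters are hypothetical, and the `check` callable stands in for whatever directory check the test performs.

```python
import time


def wait_for(check, attempts=3, delay=5.0, sleep=time.sleep):
    """Poll check() up to `attempts` times, pausing `delay` seconds after each miss.

    Returns True as soon as check() is truthy, or False once all attempts
    have failed. `sleep` is injectable so the helper can be exercised
    without actually waiting.
    """
    for _ in range(attempts):
        if check():
            return True
        sleep(delay)
    return False
```

In a test, the existing existence check would be passed in as `check`, and the test would fail only if `wait_for` returns False. On slower clouds, `attempts` or `delay` could be raised.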

Brad Marshall (brad-marshall) wrote :

Oddly, the test is passing perfectly fine for me on a local provider. What version of juju are you using? Maybe that'll narrow down what's going on here a bit.

49. By Brad Marshall

[bradm] Add a sleep to allow things to settle before running the test

Brad Marshall (brad-marshall) wrote :

Thanks for your suggested MP, but I've taken that idea and just changed it to a simple sleep before doing the tests. I've deployed it about a dozen times on a test openstack, and it worked every single time, and I've done a handful of tests locally, which were fine too.

I went with the slightly larger sleep to accommodate any cloud that might be running a bit slower - we can always tweak it if we find it's not long enough.

Cory Johns (johnsca) wrote :

Brad,

The problem with race conditions is that they are inherently difficult to reproduce, and this one would depend heavily on the environment in which the tests are run, so I'm not at all surprised that you weren't seeing the same failure I was.

The 30s wait, along with some recent improvements to Amulet, seems to be sufficient and the test is passing for me. I'd really like to see the other test failures resolved, but as I said before, we can call that out of scope for this change. So this gets my +1 and I'll get it merged today.

Apologies for the delay on the review.

review: Approve

Cory Johns (johnsca) wrote :

Brad,

This has been merged and should update on jujucharms.com within an hour. Once again, thank you for your contribution.

Preview Diff

1=== modified file 'README.md'
2--- README.md 2015-05-07 07:07:09 +0000
3+++ README.md 2015-09-08 04:46:07 +0000
4@@ -26,12 +26,51 @@
5
6 Will get you the public IP of the web interface.
7
8-# Configuration
9+# Livestatus Configuration
10
11 - `enable_livestatus` - Setting to enable the [livestatus module](https://mathias-kettner.de/checkmk_livestatus.html). This is an easy interface to get data out of Nagios.
12
13 - `livestatus_path` - Configuration of where the livestatus module is stored - defaults to /var/lib/nagios3/livestatus/socket.
14
15+- `livestatus_args` - Arguments to be passed to the livestatus module, defaults to empty.
16+
17+# Pagerduty Configuration
18+
19+- `enable_pagerduty` - Config variable to enable pagerduty notifications or not.
20+
21+- `pagerduty_key` - Pagerduty API key to use for notifications
22+
23+- `pagerduty_path` - Path for Pagerduty notifications to be queued, default is /var/lib/nagios3/pagerduty.
24+
25+# Configuration
26+
27+- `nagios_user` - The effective user that nagios will run as.
28+
29+- `nagios_group` - The effective group that nagios will run as.
30+
31+- `check_external_commands` - Config variable to enable checking external commands.
32+
33+- `command_check_interval` - How often to check for external commands.
34+
35+- `command_file` - File that Nagios checks for external command requests.
36+
37+- `debug_level` - Specify the debug level for nagios. See the docs for more details.
38+
39+- `debug_verbosity` - How verbose will the debug logs be - 0 is brief, 1 is more detailed and 2 is very detailed.
40+
41+- `debug_file` - Path for the debug file - defaults to /var/log/nagios3/nagios.debug.
42+
43+- `daemon_dumps_core` - Option to determine if Nagios is allowed to create a core dump.
44+
45+- `admin_email` - Email address used for the admin, used by $ADMINEMAIL$ in notification commands - defaults to root@localhost.
46+
47+- `admin_pager` - Email address used for the admin pager, used by $ADMINPAGER$ in notification commands - defaults to pageroot@localhost.
48+
49+- `log_rotation_method` - Log rotation method that Nagios should use to rotate the main logfile, defaults to "d".
50+
51+- `log_archive_path` - Path for archived log files, defaults to /var/log/nagios3/archives
52+- `use_syslog` - Log messages to syslog as well as main file.
53+
54 ### SSL Configuration
55
56 - `ssl` - Determinant configuration for enabling SSL. Valid options are "on", "off", "only". The "only" option disables HTTP traffic on Apache in favor of HTTPS. This setting may cause unexpected behavior with existing nagios charm deployments.
57
58=== modified file 'config.yaml'
59--- config.yaml 2015-05-01 03:06:29 +0000
60+++ config.yaml 2015-09-08 04:46:07 +0000
61@@ -44,3 +44,96 @@
62 default: "/var/lib/nagios3/livestatus/socket"
63 description: |
64 Default path to livestatus socket, if enabled via enable_livestatus
65+ livestatus_args:
66+ type: string
67+ default: ""
68+ description: |
69+ Arguments to be passed to the livestatus module, defaults to empty.
70+ nagios_user:
71+ type: string
72+ default: nagios
73+ description: |
74+ The effective user that nagios will run as.
75+ nagios_group:
76+ type: string
77+ default: nagios
78+ description: |
79+ The effective group that nagios will run as.
80+ check_external_commands:
81+ type: int
82+ default: 1
83+ description: |
84+ Config variable to enable checking external commands - 0 is disable, 1 is enable.
85+ command_check_interval:
86+ type: string
87+ default: "-1"
88+ description: |
89+ How often to check for external commands.
90+ command_file:
91+ type: string
92+ default: /var/lib/nagios3/rw/nagios.cmd
93+ description: |
94+ File that Nagios checks for external command requests.
95+ debug_level:
96+ type: int
97+ default: 0
98+ description: |
99+ Specify the debug level for nagios. See the docs for more details.
100+ debug_verbosity:
101+ type: int
102+ default: 1
103+ description: |
104+ How verbose will the debug logs be - 0 is brief, 1 is more detailed
105+ and 2 is very detailed.
106+ debug_file:
107+ type: string
108+ default: "/var/log/nagios3/nagios.debug"
109+ description: |
110+ Path for the debug file.
111+ daemon_dumps_core:
112+ type: int
113+ default: 0
114+ description:
115+ Option to determine if Nagios is allowed to create a core dump.
116+ admin_email:
117+ type: string
118+ default: root@localhost
119+ description: |
120+ Email address used for the admin, used by $ADMINEMAIL$ in notification
121+ commands.
122+ admin_pager:
123+ type: string
124+ default: pageroot@localhost
125+ description: |
126+ Email address used for the admin pager, used by $ADMINPAGER$ in
127+ notification commands.
128+ enable_pagerduty:
129+ type: boolean
130+ default: false
131+ description: |
132+ Config variable to enable pagerduty notifications or not.
133+ pagerduty_key:
134+ type: string
135+ default: ""
136+ description: |
137+ Pagerduty API key to use for notifications
138+ pagerduty_path:
139+ type: string
140+ default: "/var/lib/nagios3/pagerduty"
141+ description: |
142+ Path for Pagerduty notifications to be queued.
143+ log_rotation_method:
144+ type: string
145+ default: "d"
146+ description: |
147+ Log rotation method that Nagios should use to rotate the main logfile.
148+ log_archive_path:
149+ type: string
150+ default: "/var/log/nagios3/archives"
151+ description: |
152+ Path for archived log files
153+ use_syslog:
154+ type: int
155+ default: 1
156+ description: |
157+ Log messages to syslog as well as main file.
158
159=== added file 'files/nagios-pagerduty-flush-cron'
160--- files/nagios-pagerduty-flush-cron 1970-01-01 00:00:00 +0000
161+++ files/nagios-pagerduty-flush-cron 2015-09-08 04:46:07 +0000
162@@ -0,0 +1,7 @@
163+#------------------------------------------------
164+# This file is juju managed
165+#------------------------------------------------
166+
167+# Flush the nagios pagerduty alerts every minute as per
168+# http://www.pagerduty.com/docs/guides/nagios-perl-integration-guide/
169+* * * * * nagios /usr/local/bin/pagerduty_nagios.pl flush
170
171=== added file 'files/pagerduty_nagios.pl'
172--- files/pagerduty_nagios.pl 1970-01-01 00:00:00 +0000
173+++ files/pagerduty_nagios.pl 2015-09-08 04:46:07 +0000
174@@ -0,0 +1,289 @@
175+#!/usr/bin/env perl
176+
177+
178+# Nagios plugin that sends Nagios events to PagerDuty.
179+#
180+# Copyright (c) 2011, PagerDuty, Inc. <info@pagerduty.com>
181+# All rights reserved.
182+#
183+# Redistribution and use in source and binary forms, with or without
184+# modification, are permitted provided that the following conditions are met:
185+# * Redistributions of source code must retain the above copyright
186+# notice, this list of conditions and the following disclaimer.
187+# * Redistributions in binary form must reproduce the above copyright
188+# notice, this list of conditions and the following disclaimer in the
189+# documentation and/or other materials provided with the distribution.
190+# * Neither the name of PagerDuty Inc nor the
191+# names of its contributors may be used to endorse or promote products
192+# derived from this software without specific prior written permission.
193+#
194+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
195+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
196+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
197+# DISCLAIMED. IN NO EVENT SHALL PAGERDUTY INC BE LIABLE FOR ANY
198+# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
199+# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
200+# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
201+# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
202+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
203+# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
204+
205+
206+use Pod::Usage;
207+use Getopt::Long;
208+use Sys::Syslog;
209+use HTTP::Request::Common qw(POST);
210+use HTTP::Status qw(is_client_error);
211+use LWP::UserAgent;
212+use File::Path;
213+use Fcntl qw(:flock);
214+
215+
216+=head1 NAME
217+
218+pagerduty_nagios -- Send Nagios events to the PagerDuty alert system
219+
220+=head1 SYNOPSIS
221+
222+pagerduty_nagios enqueue [options]
223+
224+pagerduty_nagios flush [options]
225+
226+=head1 DESCRIPTION
227+
228+ This script passes events from Nagios to the PagerDuty alert system. It's
229+ meant to be run as a Nagios notification plugin. For more details, please see
230+ the PagerDuty Nagios integration docs at:
231+ http://www.pagerduty.com/docs/nagios-integration.
232+
233+ When called in the "enqueue" mode, the script loads a Nagios notification out
234+ of the environment and into the event queue. It then tries to flush the
235+ queue by sending any enqueued events to the PagerDuty server. The script is
236+ typically invoked in this mode from a Nagios notification handler.
237+
238+ When called in the "flush" mode, the script simply tries to send any enqueued
239+ events to the PagerDuty server. This mode is typically invoked by cron. The
240+ purpose of this mode is to retry any events that couldn't be sent to the
241+ PagerDuty server for whatever reason when they were initially enqueued.
242+
243+=head1 OPTIONS
244+
245+ --api-base URL
246+ The base URL used to communicate with PagerDuty. The default option here
247+ should be fine, but adjusting it may make sense if your firewall doesn't
248+ pass HTTPS traffic for some reason. See the PagerDuty Nagios integration
249+ docs for details.
250+
251+ --field KEY=VALUE
252+ Add this key-value pair to the event being passed to PagerDuty. The script
253+ automatically gathers Nagios macros out of the environment, so there's no
254+ need to specify these explicitly. This option can be repeated as many
255+ times as necessary to pass multiple key-value pairs. This option is only
256+ useful when an event is being enqueued.0
257+
258+ --help
259+ Display documentation for the script.
260+
261+ --queue-dir DIR
262+ Path to the directory to use to store the event queue. By default, we use
263+ /tmp/pagerduty_nagios.
264+
265+ --verbose
266+ Turn on extra debugging information. Useful for debugging.
267+
268+ --proxy
269+ Use a proxy for the connections like "--proxy http://127.0.0.1:8888/"
270+
271+=cut
272+
273+# This release tested on:
274+# Debian Sarge (Perl 5.8.4)
275+# Ubuntu 9.04 (Perl 5.10.0)
276+
277+my $opt_api_base = "https://events.pagerduty.com/nagios/2010-04-15";
278+my %opt_fields;
279+my $opt_help;
280+my $opt_queue_dir = "/tmp/pagerduty_nagios";
281+my $opt_verbose;
282+my $opt_proxy;
283+
284+
285+sub get_queue_from_dir {
286+ my $dh;
287+
288+ unless (opendir($dh, $opt_queue_dir)) {
289+ syslog(LOG_ERR, "opendir %s failed: %s", $opt_queue_dir, $!);
290+ die $!;
291+ }
292+
293+ my @files;
294+ while (my $f = readdir($dh)) {
295+ next unless $f =~ /^pd_(\d+)_\d+\.txt$/;
296+ push @files, [int($1), $f];
297+ }
298+
299+ closedir($dh);
300+
301+ @files = sort { @{$a}[0] <=> @{$b}[0] } @files;
302+ return map { @{$_}[1] } @files;
303+}
304+
305+
306+sub flush_queue {
307+ my @files = get_queue_from_dir();
308+ my $ua = LWP::UserAgent->new;
309+
310+ # It's not a big deal if we don't get the message through the first time.
311+ # It will get sent the next time cron fires.
312+ $ua->timeout(15);
313+
314+ if ($opt_proxy) {
315+ $ua->proxy (['http', 'https'], $opt_proxy);
316+ }
317+
318+ foreach (@files) {
319+ my $filename = "$opt_queue_dir/$_";
320+ my $fd;
321+ my %event;
322+
323+ print STDERR "==== Now processing: $filename\n" if $opt_verbose;
324+
325+ unless (open($fd, "<", $filename)) {
326+ syslog(LOG_ERR, "open %s for read failed: %s", $filename, $!);
327+ die $!;
328+ }
329+
330+ while (<$fd>) {
331+ chomp;
332+ my @fields = split("=", $_, 2);
333+ $event{$fields[0]} = $fields[1];
334+ }
335+
336+ close($fd);
337+
338+ my $req = POST("$opt_api_base/create_event", \%event);
339+
340+ if ($opt_verbose) {
341+ my $s = $req->as_string;
342+ print STDERR "Request:\n$s\n";
343+ }
344+
345+ my $resp = $ua->request($req);
346+
347+ if ($opt_verbose) {
348+ my $s = $resp->as_string;
349+ print STDERR "Response:\n$s\n";
350+ }
351+
352+ if ($resp->is_success) {
353+ syslog(LOG_INFO, "Nagios event in file %s ACCEPTED by the PagerDuty server.", $filename);
354+ unlink($filename);
355+ }
356+ elsif (is_client_error($resp->code)) {
357+ syslog(LOG_WARNING, "Nagios event in file %s REJECTED by the PagerDuty server. Server says: %s", $filename, $resp->content);
358+ unlink($filename) if ($resp->content !~ /retry later/);
359+ }
360+ else {
361+ # Something else went wrong.
362+ syslog(LOG_WARNING, "Nagios event in file %s DEFERRED due to network/server problems.", $filename);
363+ return 0;
364+ }
365+ }
366+
367+ # Everything that needed to be sent was sent.
368+ return 1;
369+}
370+
371+
372+sub lock_and_flush_queue {
373+ # Serialize access to the queue directory while we flush.
374+ # (We don't want more than one flush at once.)
375+
376+ my $lock_filename = "$opt_queue_dir/lockfile";
377+ my $lock_fd;
378+
379+ unless (open($lock_fd, ">", $lock_filename)) {
380+ syslog(LOG_ERR, "open %s for write failed: %s", $lock_filename, $!);
381+ die $!;
382+ }
383+
384+ unless (flock($lock_fd, LOCK_EX)) {
385+ syslog(LOG_ERR, "flock %s failed: %s", $lock_filename, $!);
386+ die $!;
387+ }
388+
389+ my $ret = flush_queue();
390+
391+ close($lock_fd);
392+
393+ return $ret;
394+}
395+
396+
397+sub enqueue_event {
398+ my %event;
399+
400+ # Scoop all the Nagios related stuff out of the environment.
401+ while ((my $k, my $v) = each %ENV) {
402+ next unless $k =~ /^(ICINGA|NAGIOS)_(.*)$/;
403+ $event{$2} = $v;
404+ }
405+
406+ # Apply any other variables that were passed in.
407+ %event = (%event, %opt_fields);
408+
409+ $event{"pd_version"} = "1.0";
410+
411+ # Right off the bat, enqueue the event. Nothing tiem consuming should come
412+ # before here (i.e. no locks or remote connections), because we want to
413+ # make sure we get the event written out within the Nagios notification
414+ # timeout. If we get killed off after that, it isn't a big deal.
415+
416+ my $filename = sprintf("$opt_queue_dir/pd_%u_%u.txt", time(), $$);
417+ my $fd;
418+
419+ unless (open($fd, ">", $filename)) {
420+ syslog(LOG_ERR, "open %s for write failed: %s", $filename, $!);
421+ die $!;
422+ }
423+
424+ while ((my $k, my $v) = each %event) {
425+ # "=" can't occur in the keyname, and "\n" can't occur anywhere.
426+ # (Nagios follows this already, so I think we're safe)
427+ print $fd "$k=$v\n";
428+ }
429+
430+ close($fd);
431+}
432+
433+###########
434+
435+GetOptions("api-base=s" => \$opt_api_base,
436+ "field=s%" => \%opt_fields,
437+ "help" => \$opt_help,
438+ "queue-dir=s" => \$opt_queue_dir,
439+ "verbose" => \$opt_verbose,
440+ "proxy=s" => \$opt_proxy
441+ ) || pod2usage(2);
442+
443+pod2usage(2) if @ARGV < 1 ||
444+ (($ARGV[0] ne "enqueue") && ($ARGV[0] ne "flush"));
445+
446+pod2usage(-verbose => 3) if $opt_help;
447+
448+my @log_mode = ("nofatal", "pid");
449+push(@log_mode, "perror") if $opt_verbose;
450+
451+openlog("pagerduty_nagios", join(",", @log_mode), LOG_LOCAL0);
452+
453+# This function automatically terminates the program on things like permission
454+# errors.
455+mkpath($opt_queue_dir);
456+
457+if ($ARGV[0] eq "enqueue") {
458+ enqueue_event();
459+ lock_and_flush_queue();
460+}
461+elsif ($ARGV[0] eq "flush") {
462+ lock_and_flush_queue();
463+}
464
465=== modified file 'hooks/install'
466--- hooks/install 2015-05-01 05:06:20 +0000
467+++ hooks/install 2015-09-08 04:46:07 +0000
468@@ -23,11 +23,6 @@
469 DEBIAN_FRONTEND=noninteractive apt-get -qy \
470 install nagios3 nagios-plugins python-cheetah python-jinja2 dnsutils debconf-utils nagios-nrpe-plugin pynag
471
472-# enable external commands per README.Debian file
473-if ! grep '^check_external_commands=1$' /etc/nagios3/nagios.cfg ; then
474- echo check_external_commands=1 >> /etc/nagios3/nagios.cfg
475-fi
476-
477 if [ -f $CHARM_DIR/files/hostgroups_nagios2.cfg ]; then
478 # Write the new hostgroups_nagios2.cfg file to prevent servers being classified as Debian.
479 cp -v $CHARM_DIR/files/hostgroups_nagios2.cfg /etc/nagios3/conf.d/hostgroups_nagios2.cfg
480@@ -44,10 +39,6 @@
481 if [ "$enable_livestatus" ]; then
482 # install check-mk-livestatus
483 DEBIAN_FRONTEND=noninteractive apt-get -qy install check-mk-livestatus
484- # enable livestatus broker
485- if ! grep '^broker_module=' /etc/nagios3/nagios.cfg ; then
486- echo "broker_module=/usr/lib/check_mk/livestatus.o $livestatus_path" >> /etc/nagios3/nagios.cfg
487- fi
488 # fix permissions on the livestatus directory
489 mkdir -p $livestatus_dir
490 chown nagios:www-data $livestatus_dir
491
492=== added file 'hooks/templates/contacts-cfg.tmpl'
493--- hooks/templates/contacts-cfg.tmpl 1970-01-01 00:00:00 +0000
494+++ hooks/templates/contacts-cfg.tmpl 2015-09-08 04:46:07 +0000
495@@ -0,0 +1,51 @@
496+#------------------------------------------------
497+# This file is juju managed
498+#------------------------------------------------
499+
500+###############################################################################
501+# contacts.cfg
502+###############################################################################
503+
504+
505+
506+###############################################################################
507+###############################################################################
508+#
509+# CONTACTS
510+#
511+###############################################################################
512+###############################################################################
513+
514+# In this simple config file, a single contact will receive all alerts.
515+
516+define contact{
517+ contact_name root
518+ alias Root
519+ service_notification_period 24x7
520+ host_notification_period 24x7
521+ service_notification_options w,u,c,r
522+ host_notification_options d,r
523+ service_notification_commands notify-service-by-email
524+ host_notification_commands notify-host-by-email
525+ email {{ admin_email }}
526+ }
527+
528+
529+
530+###############################################################################
531+###############################################################################
532+#
533+# CONTACT GROUPS
534+#
535+###############################################################################
536+###############################################################################
537+
538+# We only have one contact in this simple configuration file, so there is
539+# no need to create more than one contact group.
540+
541+define contactgroup{
542+ contactgroup_name admins
543+ alias Nagios Administrators
544+ members root{% if enable_pagerduty -%}, pagerduty{% endif %}
545+ }
546+
547
548=== added file 'hooks/templates/nagios-cfg.tmpl'
549--- hooks/templates/nagios-cfg.tmpl 1970-01-01 00:00:00 +0000
550+++ hooks/templates/nagios-cfg.tmpl 2015-09-08 04:46:07 +0000
551@@ -0,0 +1,1360 @@
552+#------------------------------------------------
553+# This file is juju managed
554+#------------------------------------------------
555+
556+##############################################################################
557+#
558+# NAGIOS.CFG - Sample Main Config File for Nagios
559+#
560+#
561+##############################################################################
562+
563+
564+# LOG FILE
565+# This is the main log file where service and host events are logged
566+# for historical purposes. This should be the first option specified
567+# in the config file!!!
568+
569+log_file=/var/log/nagios3/nagios.log
570+
571+# Commands definitions
572+cfg_file=/etc/nagios3/commands.cfg
573+
574+# Debian also defaults to using the check commands defined by the debian
575+# nagios-plugins package
576+cfg_dir=/etc/nagios-plugins/config
577+
578+# Debian uses by default a configuration directory where nagios3-common,
579+# other packages and the local admin can dump or link configuration
580+# files into.
581+cfg_dir=/etc/nagios3/conf.d
582+
583+# OBJECT CONFIGURATION FILE(S)
584+# These are the object configuration files in which you define hosts,
585+# host groups, contacts, contact groups, services, etc.
586+# You can split your object definitions across several config files
587+# if you wish (as shown below), or keep them all in a single config file.
588+
589+# You can specify individual object config files as shown below:
590+#cfg_file=/etc/nagios3/objects/commands.cfg
591+#cfg_file=/etc/nagios3/objects/contacts.cfg
592+#cfg_file=/etc/nagios3/objects/timeperiods.cfg
593+#cfg_file=/etc/nagios3/objects/templates.cfg
594+
595+# Definitions for monitoring a Windows machine
596+#cfg_file=/etc/nagios3/objects/windows.cfg
597+
598+# Definitions for monitoring a router/switch
599+#cfg_file=/etc/nagios3/objects/switch.cfg
600+
601+# Definitions for monitoring a network printer
602+#cfg_file=/etc/nagios3/objects/printer.cfg
603+
604+
605+# You can also tell Nagios to process all config files (with a .cfg
606+# extension) in a particular directory by using the cfg_dir
607+# directive as shown below:
608+
609+#cfg_dir=/etc/nagios3/servers
610+#cfg_dir=/etc/nagios3/printers
611+#cfg_dir=/etc/nagios3/switches
612+#cfg_dir=/etc/nagios3/routers
613+
614+
615+
616+
617+# OBJECT CACHE FILE
618+# This option determines where object definitions are cached when
619+# Nagios starts/restarts. The CGIs read object definitions from
620+# this cache file (rather than looking at the object config files
621+# directly) in order to prevent inconsistencies that can occur
622+# when the config files are modified after Nagios starts.
623+
624+object_cache_file=/var/cache/nagios3/objects.cache
625+
626+
627+
628+# PRE-CACHED OBJECT FILE
629+# This options determines the location of the precached object file.
630+# If you run Nagios with the -p command line option, it will preprocess
631+# your object configuration file(s) and write the cached config to this
632+# file. You can then start Nagios with the -u option to have it read
633+# object definitions from this precached file, rather than the standard
634+# object configuration files (see the cfg_file and cfg_dir options above).
635+# Using a precached object file can speed up the time needed to (re)start
636+# the Nagios process if you've got a large and/or complex configuration.
637+# Read the documentation section on optimizing Nagios to find our more
638+# about how this feature works.
639+
640+precached_object_file=/var/lib/nagios3/objects.precache
641+
642+
643+
644+# RESOURCE FILE
645+# This is an optional resource file that contains $USERx$ macro
646+# definitions. Multiple resource files can be specified by using
647+# multiple resource_file definitions. The CGIs will not attempt to
648+# read the contents of resource files, so information that is
649+# considered to be sensitive (usernames, passwords, etc) can be
650+# defined as macros in this file and restrictive permissions (600)
651+# can be placed on this file.
652+
653+resource_file=/etc/nagios3/resource.cfg
654+
655+
656+
657+# STATUS FILE
658+# This is where the current status of all monitored services and
659+# hosts is stored. Its contents are read and processed by the CGIs.
660+# The contents of the status file are deleted every time Nagios
661+# restarts.
662+
663+status_file=/var/cache/nagios3/status.dat
664+
665+
666+
667+# STATUS FILE UPDATE INTERVAL
668+# This option determines the frequency (in seconds) that
669+# Nagios will periodically dump program, host, and
670+# service status data.
671+
672+status_update_interval=10
673+
674+
675+
676+# NAGIOS USER
677+# This determines the effective user that Nagios should run as.
678+# You can either supply a username or a UID.
679+
680+nagios_user={{ nagios_user }}
681+
682+
683+
684+# NAGIOS GROUP
685+# This determines the effective group that Nagios should run as.
686+# You can either supply a group name or a GID.
687+
688+nagios_group={{ nagios_group }}
689+
690+
691+
692+# EXTERNAL COMMAND OPTION
693+# This option allows you to specify whether or not Nagios should check
694+# for external commands (in the command file defined below). By default
695+# Nagios will *not* check for external commands, just to be on the
696+# cautious side. If you want to be able to use the CGI command interface
697+# you will have to enable this.
698+# Values: 0 = disable commands, 1 = enable commands
699+
700+check_external_commands={{ check_external_commands }}
701+
702+
703+
704+# EXTERNAL COMMAND CHECK INTERVAL
705+# This is the interval at which Nagios should check for external commands.
706+# This value works of the interval_length you specify later. If you leave
707+# that at its default value of 60 (seconds), a value of 1 here will cause
708+# Nagios to check for external commands every minute. If you specify a
709+# number followed by an "s" (i.e. 15s), this will be interpreted to mean
710+# actual seconds rather than a multiple of the interval_length variable.
711+# Note: In addition to reading the external command file at regularly
712+# scheduled intervals, Nagios will also check for external commands after
713+# event handlers are executed.
714+# NOTE: Setting this value to -1 causes Nagios to check the external
715+# command file as often as possible.
716+
717+#command_check_interval=15s
718+command_check_interval={{ command_check_interval }}
719+
720+
721+
722+# EXTERNAL COMMAND FILE
723+# This is the file that Nagios checks for external command requests.
724+# It is also where the command CGI will write commands that are submitted
725+# by users, so it must be writeable by the user that the web server
726+# is running as (usually 'nobody'). Permissions should be set at the
727+# directory level instead of on the file, as the file is deleted every
728+# time its contents are processed.
729+# Debian Users: In case you didn't read README.Debian yet, _NOW_ is the
730+# time to do it.
731+
732+command_file={{ command_file }}
733+
734+
735+
736+# EXTERNAL COMMAND BUFFER SLOTS
737+# This settings is used to tweak the number of items or "slots" that
738+# the Nagios daemon should allocate to the buffer that holds incoming
739+# external commands before they are processed. As external commands
740+# are processed by the daemon, they are removed from the buffer.
741+
742+external_command_buffer_slots=4096
743+
744+
745+
746+# LOCK FILE
747+# This is the lockfile that Nagios will use to store its PID number
748+# in when it is running in daemon mode.
749+
750+lock_file=/var/run/nagios3/nagios3.pid
751+
752+
753+
754+# TEMP FILE
755+# This is a temporary file that is used as scratch space when Nagios
756+# updates the status log, cleans the comment file, etc. This file
757+# is created, used, and deleted throughout the time that Nagios is
758+# running.
759+
760+temp_file=/var/cache/nagios3/nagios.tmp
761+
762+
763+
764+# TEMP PATH
765+# This is the path where Nagios can create temp files for service and
766+# host check results, etc.
767+
768+temp_path=/tmp
769+
770+
771+
772+# EVENT BROKER OPTIONS
773+# Controls what (if any) data gets sent to the event broker.
774+# Values: 0 = Broker nothing
775+# -1 = Broker everything
776+# <other> = See documentation
777+
778+event_broker_options=-1
779+
780+
781+
782+# EVENT BROKER MODULE(S)
783+# This directive is used to specify an event broker module that should
784+# be loaded by Nagios at startup. Use multiple directives if you want
785+# to load more than one module. Arguments that should be passed to
786+# the module at startup are separated from the module path by a space.
787+#
788+#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
789+# WARNING !!! WARNING !!! WARNING !!! WARNING !!! WARNING !!! WARNING
790+#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
791+#
792+# Do NOT overwrite modules while they are being used by Nagios or Nagios
793+# will crash in a fiery display of SEGFAULT glory. This is a bug/limitation
794+# either in dlopen(), the kernel, and/or the filesystem. And maybe Nagios...
795+#
796+# The correct/safe way of updating a module is by using one of these methods:
797+# 1. Shutdown Nagios, replace the module file, restart Nagios
798+# 2. Delete the original module file, move the new module file into place, restart Nagios
799+#
800+# Example:
801+#
802+# broker_module=<modulepath> [moduleargs]
803+
804+#broker_module=/somewhere/module1.o
805+#broker_module=/somewhere/module2.o arg1 arg2=3 debug=0
806+{% if enable_livestatus -%}
807+broker_module=/usr/lib/check_mk/livestatus.o {{ livestatus_path }} {{ livestatus_args }}
808+{% endif %}
809+
810+
811+# LOG ROTATION METHOD
812+# This is the log rotation method that Nagios should use to rotate
813+# the main log file. Values are as follows..
814+# n = None - don't rotate the log
815+# h = Hourly rotation (top of the hour)
816+# d = Daily rotation (midnight every day)
817+# w = Weekly rotation (midnight on Saturday evening)
818+# m = Monthly rotation (midnight last day of month)
819+
820+log_rotation_method={{ log_rotation_method }}
821+
822+
823+
824+# LOG ARCHIVE PATH
825+# This is the directory where archived (rotated) log files should be
826+# placed (assuming you've chosen to do log rotation).
827+
828+log_archive_path={{ log_archive_path }}
829+
830+
831+
832+# LOGGING OPTIONS
833+# If you want messages logged to the syslog facility, as well as the
834+# Nagios log file set this option to 1. If not, set it to 0.
835+
836+use_syslog={{ use_syslog }}
837+
838+
839+
840+# NOTIFICATION LOGGING OPTION
841+# If you don't want notifications to be logged, set this value to 0.
842+# If notifications should be logged, set the value to 1.
843+
844+log_notifications=1
845+
846+
847+
848+# SERVICE RETRY LOGGING OPTION
849+# If you don't want service check retries to be logged, set this value
850+# to 0. If retries should be logged, set the value to 1.
851+
852+log_service_retries=1
853+
854+
855+
856+# HOST RETRY LOGGING OPTION
857+# If you don't want host check retries to be logged, set this value to
858+# 0. If retries should be logged, set the value to 1.
859+
860+log_host_retries=1
861+
862+
863+
864+# EVENT HANDLER LOGGING OPTION
865+# If you don't want host and service event handlers to be logged, set
866+# this value to 0. If event handlers should be logged, set the value
867+# to 1.
868+
869+log_event_handlers=1
870+
871+
872+
873+# INITIAL STATES LOGGING OPTION
874+# If you want Nagios to log all initial host and service states to
875+# the main log file (the first time the service or host is checked)
876+# you can enable this option by setting this value to 1. If you
877+# are not using an external application that does long term state
878+# statistics reporting, you do not need to enable this option. In
879+# this case, set the value to 0.
880+
881+log_initial_states=0
882+
883+
884+
885+# EXTERNAL COMMANDS LOGGING OPTION
886+# If you don't want Nagios to log external commands, set this value
887+# to 0. If external commands should be logged, set this value to 1.
888+# Note: This option does not include logging of passive service
889+# checks - see the option below for controlling whether or not
890+# passive checks are logged.
891+
892+log_external_commands=1
893+
894+
895+
896+# PASSIVE CHECKS LOGGING OPTION
897+# If you don't want Nagios to log passive host and service checks, set
898+# this value to 0. If passive checks should be logged, set
899+# this value to 1.
900+
901+log_passive_checks=1
902+
903+
904+
905+# GLOBAL HOST AND SERVICE EVENT HANDLERS
906+# These options allow you to specify a host and service event handler
907+# command that is to be run for every host or service state change.
908+# The global event handler is executed immediately prior to the event
909+# handler that you have optionally specified in each host or
910+# service definition. The command argument is the short name of a
911+# command definition that you define in your host configuration file.
912+# Read the HTML docs for more information.
913+
914+#global_host_event_handler=somecommand
915+#global_service_event_handler=somecommand
916+
917+
918+
919+# SERVICE INTER-CHECK DELAY METHOD
920+# This is the method that Nagios should use when initially
921+# "spreading out" service checks when it starts monitoring. The
922+# default is to use smart delay calculation, which will try to
923+# space all service checks out evenly to minimize CPU load.
924+# Using the dumb setting will cause all checks to be scheduled
925+# at the same time (with no delay between them)! This is not a
926+# good thing for production, but is useful when testing the
927+# parallelization functionality.
928+# n = None - don't use any delay between checks
929+# d = Use a "dumb" delay of 1 second between checks
930+# s = Use "smart" inter-check delay calculation
931+# x.xx = Use an inter-check delay of x.xx seconds
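+# Example (illustrative value only): to force a fixed delay of a quarter
+# of a second between checks instead of the smart calculation, use
+#   service_inter_check_delay_method=0.25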
932+
933+service_inter_check_delay_method=s
934+
935+
936+
937+# MAXIMUM SERVICE CHECK SPREAD
938+# This variable determines the timeframe (in minutes) from the
939+# program start time that an initial check of all services should
940+# be completed. Default is 30 minutes.
941+
942+max_service_check_spread=30
943+
944+
945+
946+# SERVICE CHECK INTERLEAVE FACTOR
947+# This variable determines how service checks are interleaved.
948+# Interleaving the service checks allows for a more even
949+# distribution of service checks and reduced load on remote
950+# hosts. Setting this value to 1 is equivalent to how versions
951+# of Nagios previous to 0.0.5 did service checks. Set this
952+# value to s (smart) for automatic calculation of the interleave
953+# factor unless you have a specific reason to change it.
954+# s = Use "smart" interleave factor calculation
955+# x = Use an interleave factor of x, where x is a
956+# number greater than or equal to 1.
957+
958+service_interleave_factor=s
959+
960+
961+
962+# HOST INTER-CHECK DELAY METHOD
963+# This is the method that Nagios should use when initially
964+# "spreading out" host checks when it starts monitoring. The
965+# default is to use smart delay calculation, which will try to
966+# space all host checks out evenly to minimize CPU load.
967+# Using the dumb setting will cause all checks to be scheduled
968+# at the same time (with no delay between them)!
969+# n = None - don't use any delay between checks
970+# d = Use a "dumb" delay of 1 second between checks
971+# s = Use "smart" inter-check delay calculation
972+# x.xx = Use an inter-check delay of x.xx seconds
973+
974+host_inter_check_delay_method=s
975+
976+
977+
978+# MAXIMUM HOST CHECK SPREAD
979+# This variable determines the timeframe (in minutes) from the
980+# program start time that an initial check of all hosts should
981+# be completed. Default is 30 minutes.
982+
983+max_host_check_spread=30
984+
985+
986+
987+# MAXIMUM CONCURRENT SERVICE CHECKS
988+# This option allows you to specify the maximum number of
989+# service checks that can be run in parallel at any given time.
990+# Specifying a value of 1 for this variable essentially prevents
991+# any service checks from being parallelized. A value of 0
992+# will not restrict the number of concurrent checks that are
993+# being executed.
994+
995+max_concurrent_checks=0
996+
997+
998+
999+# HOST AND SERVICE CHECK REAPER FREQUENCY
1000+# This is the frequency (in seconds!) that Nagios will process
1001+# the results of host and service checks.
1002+
1003+check_result_reaper_frequency=10
1004+
1005+
1006+
1007+
1008+# MAX CHECK RESULT REAPER TIME
1009+# This is the max amount of time (in seconds) that a single
1010+# check result reaper event will be allowed to run before
1011+# returning control back to Nagios so it can perform other
1012+# duties.
1013+
1014+max_check_result_reaper_time=30
1015+
1016+
1017+
1018+
1019+# CHECK RESULT PATH
1020+# This is the directory where Nagios stores the results of host and
1021+# service checks that have not yet been processed.
1022+#
1023+# Note: Make sure that only one instance of Nagios has access
1024+# to this directory!
1025+
1026+check_result_path=/var/lib/nagios3/spool/checkresults
1027+
1028+
1029+
1030+
1031+# MAX CHECK RESULT FILE AGE
1032+# This option determines the maximum age (in seconds) which check
1033+# result files are considered to be valid. Files older than this
1034+# threshold will be mercilessly deleted without further processing.
1035+
1036+max_check_result_file_age=3600
1037+
1038+
1039+
1040+
1041+# CACHED HOST CHECK HORIZON
1042+# This option determines the maximum amount of time (in seconds)
1043+# that the state of a previous host check is considered current.
1044+# Cached host states (from host checks that were performed more
1045+# recently than the timeframe specified by this value) can immensely
1046+# improve performance in regards to the host check logic.
1047+# Too high of a value for this option may result in inaccurate host
1048+# states being used by Nagios, while a lower value may result in a
1049+# performance hit for host checks. Use a value of 0 to disable host
1050+# check caching.
1051+
1052+cached_host_check_horizon=15
1053+
1054+
1055+
1056+# CACHED SERVICE CHECK HORIZON
1057+# This option determines the maximum amount of time (in seconds)
1058+# that the state of a previous service check is considered current.
1059+# Cached service states (from service checks that were performed more
1060+# recently than the timeframe specified by this value) can immensely
1061+# improve performance in regards to predictive dependency checks.
1062+# Use a value of 0 to disable service check caching.
1063+
1064+cached_service_check_horizon=15
1065+
1066+
1067+
1068+# ENABLE PREDICTIVE HOST DEPENDENCY CHECKS
1069+# This option determines whether or not Nagios will attempt to execute
1070+# checks of hosts when it predicts that future dependency logic tests
1071+# may be needed. These predictive checks can help ensure that your
1072+# host dependency logic works well.
1073+# Values:
1074+# 0 = Disable predictive checks
1075+# 1 = Enable predictive checks (default)
1076+
1077+enable_predictive_host_dependency_checks=1
1078+
1079+
1080+
1081+# ENABLE PREDICTIVE SERVICE DEPENDENCY CHECKS
1082+# This option determines whether or not Nagios will attempt to execute
1083+# checks of services when it predicts that future dependency logic tests
1084+# may be needed. These predictive checks can help ensure that your
1085+# service dependency logic works well.
1086+# Values:
1087+# 0 = Disable predictive checks
1088+# 1 = Enable predictive checks (default)
1089+
1090+enable_predictive_service_dependency_checks=1
1091+
1092+
1093+
1094+# SOFT STATE DEPENDENCIES
1095+# This option determines whether or not Nagios will use soft state
1096+# information when checking host and service dependencies. Normally
1097+# Nagios will only use the latest hard host or service state when
1098+# checking dependencies. If you want it to use the latest state (regardless
1099+# of whether it's a soft or hard state type), enable this option.
1100+# Values:
1101+# 0 = Don't use soft state dependencies (default)
1102+# 1 = Use soft state dependencies
1103+
1104+soft_state_dependencies=0
1105+
1106+
1107+
1108+# TIME CHANGE ADJUSTMENT THRESHOLDS
1109+# These options determine when Nagios will react to detected changes
1110+# in system time (either forward or backwards).
1111+
1112+#time_change_threshold=900
1113+
1114+
1115+
1116+# AUTO-RESCHEDULING OPTION
1117+# This option determines whether or not Nagios will attempt to
1118+# automatically reschedule active host and service checks to
1119+# "smooth" them out over time. This can help balance the load on
1120+# the monitoring server.
1121+# WARNING: THIS IS AN EXPERIMENTAL FEATURE - IT CAN DEGRADE
1122+# PERFORMANCE, RATHER THAN INCREASE IT, IF USED IMPROPERLY
1123+
1124+auto_reschedule_checks=0
1125+
1126+
1127+
1128+# AUTO-RESCHEDULING INTERVAL
1129+# This option determines how often (in seconds) Nagios will
1130+# attempt to automatically reschedule checks. This option only
1131+# has an effect if the auto_reschedule_checks option is enabled.
1132+# Default is 30 seconds.
1133+# WARNING: THIS IS AN EXPERIMENTAL FEATURE - IT CAN DEGRADE
1134+# PERFORMANCE, RATHER THAN INCREASE IT, IF USED IMPROPERLY
1135+
1136+auto_rescheduling_interval=30
1137+
1138+
1139+
1140+# AUTO-RESCHEDULING WINDOW
1141+# This option determines the "window" of time (in seconds) that
1142+# Nagios will look at when automatically rescheduling checks.
1143+# Only host and service checks that occur in the next X seconds
1144+# (determined by this variable) will be rescheduled. This option
1145+# only has an effect if the auto_reschedule_checks option is
1146+# enabled. Default is 180 seconds (3 minutes).
1147+# WARNING: THIS IS AN EXPERIMENTAL FEATURE - IT CAN DEGRADE
1148+# PERFORMANCE, RATHER THAN INCREASE IT, IF USED IMPROPERLY
1149+
1150+auto_rescheduling_window=180
1151+
1152+
1153+
1154+# SLEEP TIME
1155+# This is the number of seconds to sleep between checking for system
1156+# events and service checks that need to be run.
1157+
1158+sleep_time=0.25
1159+
1160+
1161+
1162+# TIMEOUT VALUES
1163+# These options control how much time Nagios will allow various
1164+# types of commands to execute before killing them off. Options
1165+# are available for controlling maximum time allotted for
1166+# service checks, host checks, event handlers, notifications, the
1167+# ocsp command, and performance data commands. All values are in
1168+# seconds.
1169+
1170+service_check_timeout=60
1171+host_check_timeout=30
1172+event_handler_timeout=30
1173+notification_timeout=30
1174+ocsp_timeout=5
1175+perfdata_timeout=5
1176+
1177+
1178+
1179+# RETAIN STATE INFORMATION
1180+# This setting determines whether or not Nagios will save state
1181+# information for services and hosts before it shuts down. Upon
1182+# startup Nagios will reload all saved service and host state
1183+# information before starting to monitor. This is useful for
1184+# maintaining long-term data on state statistics, etc, but will
1185+# slow Nagios down a bit when it (re)starts. Since it's only
1186+# a one-time penalty, I think it's well worth the additional
1187+# startup delay.
1188+
1189+retain_state_information=1
1190+
1191+
1192+
1193+# STATE RETENTION FILE
1194+# This is the file that Nagios should use to store host and
1195+# service state information before it shuts down. The state
1196+# information in this file is also read immediately prior to
1197+# starting to monitor the network when Nagios is restarted.
1198+# This file is used only if the retain_state_information
1199+# variable is set to 1.
1200+
1201+state_retention_file=/var/lib/nagios3/retention.dat
1202+
1203+
1204+
1205+# RETENTION DATA UPDATE INTERVAL
1206+# This setting determines how often (in minutes) that Nagios
1207+# will automatically save retention data during normal operation.
1208+# If you set this value to 0, Nagios will not save retention
1209+# data at regular intervals, but it will still save retention
1210+# data before shutting down or restarting. If you have disabled
1211+# state retention, this option has no effect.
1212+
1213+retention_update_interval=60
1214+
1215+
1216+
1217+# USE RETAINED PROGRAM STATE
1218+# This setting determines whether or not Nagios will set
1219+# program status variables based on the values saved in the
1220+# retention file. If you want to use retained program status
1221+# information, set this value to 1. If not, set this value
1222+# to 0.
1223+
1224+use_retained_program_state=1
1225+
1226+
1227+
1228+# USE RETAINED SCHEDULING INFO
1229+# This setting determines whether or not Nagios will retain
1230+# the scheduling info (next check time) for hosts and services
1231+# based on the values saved in the retention file. If you
1232+# want to use retained scheduling info, set this
1233+# value to 1. If not, set this value to 0.
1234+
1235+use_retained_scheduling_info=1
1236+
1237+
1238+
1239+# RETAINED ATTRIBUTE MASKS (ADVANCED FEATURE)
1240+# The following variables are used to specify specific host and
1241+# service attributes that should *not* be retained by Nagios during
1242+# program restarts.
1243+#
1244+# The values of the masks are bitwise ORs of values specified
1245+# by the "MODATTR_" definitions found in include/common.h.
1246+# For example, if you do not want the current enabled/disabled state
1247+# of flap detection and event handlers for hosts to be retained, you
1248+# would use a value of 24 for the host attribute mask...
1249+# MODATTR_EVENT_HANDLER_ENABLED (8) + MODATTR_FLAP_DETECTION_ENABLED (16) = 24
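+# Worked example, using only the two values quoted above:
+#   8 (binary 01000) | 16 (binary 10000) = 24 (binary 11000)
+#   retained_host_attribute_mask=24
+# Each value is a distinct bit, so adding the two values and combining
+# them bitwise give the same result here.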
1250+
1251+# This mask determines what host attributes are not retained
1252+retained_host_attribute_mask=0
1253+
1254+# This mask determines what service attributes are not retained
1255+retained_service_attribute_mask=0
1256+
1257+# These two masks determine what process attributes are not retained.
1258+# There are two masks, because some process attributes have host and service
1259+# options. For example, you can disable active host checks, but leave active
1260+# service checks enabled.
1261+retained_process_host_attribute_mask=0
1262+retained_process_service_attribute_mask=0
1263+
1264+# These two masks determine what contact attributes are not retained.
1265+# There are two masks, because some contact attributes have host and
1266+# service options. For example, you can disable host notifications for
1267+# a contact, but leave service notifications enabled for them.
1268+retained_contact_host_attribute_mask=0
1269+retained_contact_service_attribute_mask=0
1270+
1271+
1272+
1273+# INTERVAL LENGTH
1274+# This is the seconds per unit interval as used in the
1275+# host/contact/service configuration files. Setting this to 60 means
1276+# that each interval is one minute long (60 seconds). Other settings
1277+# have not been tested much, so your mileage is likely to vary...
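+# Example (a sketch; check_interval is the per-service directive from the
+# object configuration files, shown here with an illustrative value):
+#   interval_length=60, check_interval 5  -> 5 * 60 = 300 seconds between checks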
1278+
1279+interval_length=60
1280+
1281+
1282+
1283+# CHECK FOR UPDATES
1284+# This option determines whether Nagios will automatically check to
1285+# see if new updates (releases) are available. It is recommended that you
1286+# enable this option to ensure that you stay on top of the latest critical
1287+# patches to Nagios. Nagios is critical to you - make sure you keep it in
1288+# good shape. Nagios will check once a day for new updates. Data collected
1289+# by Nagios Enterprises from the update check is processed in accordance
1290+# with our privacy policy - see http://api.nagios.org for details.
1291+
1292+check_for_updates=1
1293+
1294+
1295+
1296+# BARE UPDATE CHECK
1297+# This option determines what data Nagios will send to api.nagios.org when
1298+# it checks for updates. By default, Nagios will send information on the
1299+# current version of Nagios you have installed, as well as an indicator as
1300+# to whether this was a new installation or not. Nagios Enterprises uses
1301+# this data to determine the number of users running specific versions of
1302+# Nagios. Enable this option if you do not want this information to be sent.
1303+
1304+bare_update_check=0
1305+
1306+
1307+
1308+# AGGRESSIVE HOST CHECKING OPTION
1309+# If you don't want to turn on aggressive host checking features, set
1310+# this value to 0 (the default). Otherwise set this value to 1 to
1311+# enable the aggressive check option. Read the docs for more info
1312+# on what aggressive host check is or check out the source code in
1313+# base/checks.c
1314+
1315+use_aggressive_host_checking=0
1316+
1317+
1318+
1319+# SERVICE CHECK EXECUTION OPTION
1320+# This determines whether or not Nagios will actively execute
1321+# service checks when it initially starts. If this option is
1322+# disabled, checks are not actively made, but Nagios can still
1323+# receive and process passive check results that come in. Unless
1324+# you're implementing redundant hosts or have a special need for
1325+# disabling the execution of service checks, leave this enabled!
1326+# Values: 1 = enable checks, 0 = disable checks
1327+
1328+execute_service_checks=1
1329+
1330+
1331+
1332+# PASSIVE SERVICE CHECK ACCEPTANCE OPTION
1333+# This determines whether or not Nagios will accept passive
1334+# service checks results when it initially (re)starts.
1335+# Values: 1 = accept passive checks, 0 = reject passive checks
1336+
1337+accept_passive_service_checks=1
1338+
1339+
1340+
1341+# HOST CHECK EXECUTION OPTION
1342+# This determines whether or not Nagios will actively execute
1343+# host checks when it initially starts. If this option is
1344+# disabled, checks are not actively made, but Nagios can still
1345+# receive and process passive check results that come in. Unless
1346+# you're implementing redundant hosts or have a special need for
1347+# disabling the execution of host checks, leave this enabled!
1348+# Values: 1 = enable checks, 0 = disable checks
1349+
1350+execute_host_checks=1
1351+
1352+
1353+
1354+# PASSIVE HOST CHECK ACCEPTANCE OPTION
1355+# This determines whether or not Nagios will accept passive
1356+# host checks results when it initially (re)starts.
1357+# Values: 1 = accept passive checks, 0 = reject passive checks
1358+
1359+accept_passive_host_checks=1
1360+
1361+
1362+
1363+# NOTIFICATIONS OPTION
1364+# This determines whether or not Nagios will send out any host or
1365+# service notifications when it is initially (re)started.
1366+# Values: 1 = enable notifications, 0 = disable notifications
1367+
1368+enable_notifications=1
1369+
1370+
1371+
1372+# EVENT HANDLER USE OPTION
1373+# This determines whether or not Nagios will run any host or
1374+# service event handlers when it is initially (re)started. Unless
1375+# you're implementing redundant hosts, leave this option enabled.
1376+# Values: 1 = enable event handlers, 0 = disable event handlers
1377+
1378+enable_event_handlers=1
1379+
1380+
1381+
1382+# PROCESS PERFORMANCE DATA OPTION
1383+# This determines whether or not Nagios will process performance
1384+# data returned from service and host checks. If this option is
1385+# enabled, host performance data will be processed using the
1386+# host_perfdata_command (defined below) and service performance
1387+# data will be processed using the service_perfdata_command (also
1388+# defined below). Read the HTML docs for more information on
1389+# performance data.
1390+# Values: 1 = process performance data, 0 = do not process performance data
1391+
1392+process_performance_data=0
1393+
1394+
1395+
1396+# HOST AND SERVICE PERFORMANCE DATA PROCESSING COMMANDS
1397+# These commands are run after every host and service check is
1398+# performed. These commands are executed only if the
1399+# process_performance_data option (above) is set to 1. The command
1400+# argument is the short name of a command definition that you
1401+# define in your host configuration file. Read the HTML docs for
1402+# more information on performance data.
1403+
1404+#host_perfdata_command=process-host-perfdata
1405+#service_perfdata_command=process-service-perfdata
1406+
1407+
1408+
1409+# HOST AND SERVICE PERFORMANCE DATA FILES
1410+# These files are used to store host and service performance data.
1411+# Performance data is only written to these files if the
1412+# process_performance_data option (above) is set to 1.
1413+
1414+#host_perfdata_file=/tmp/host-perfdata
1415+#service_perfdata_file=/tmp/service-perfdata
1416+
1417+
1418+
1419+# HOST AND SERVICE PERFORMANCE DATA FILE TEMPLATES
1420+# These options determine what data is written (and how) to the
1421+# performance data files. The templates may contain macros, special
1422+# characters (\t for tab, \r for carriage return, \n for newline)
1423+# and plain text. A newline is automatically added after each write
1424+# to the performance data file. Some examples of what you can do are
1425+# shown below.
1426+
1427+#host_perfdata_file_template=[HOSTPERFDATA]\t$TIMET$\t$HOSTNAME$\t$HOSTEXECUTIONTIME$\t$HOSTOUTPUT$\t$HOSTPERFDATA$
1428+#service_perfdata_file_template=[SERVICEPERFDATA]\t$TIMET$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$
1429+
1430+
1431+
1432+# HOST AND SERVICE PERFORMANCE DATA FILE MODES
1433+# This option determines whether or not the host and service
1434+# performance data files are opened in write ("w") or append ("a")
1435+# mode. If you want to use named pipes, you should use the special
1436+# pipe ("p") mode which avoids blocking at startup, otherwise you will
1437+# likely want the default append ("a") mode.
1438+
1439+#host_perfdata_file_mode=a
1440+#service_perfdata_file_mode=a
1441+
1442+
1443+
1444+# HOST AND SERVICE PERFORMANCE DATA FILE PROCESSING INTERVAL
1445+# These options determine how often (in seconds) the host and service
1446+# performance data files are processed using the commands defined
1447+# below. A value of 0 indicates the files should not be periodically
1448+# processed.
1449+
1450+#host_perfdata_file_processing_interval=0
1451+#service_perfdata_file_processing_interval=0
1452+
1453+
1454+
1455+# HOST AND SERVICE PERFORMANCE DATA FILE PROCESSING COMMANDS
1456+# These commands are used to periodically process the host and
1457+# service performance data files. The interval at which the
1458+# processing occurs is determined by the options above.
1459+
1460+#host_perfdata_file_processing_command=process-host-perfdata-file
1461+#service_perfdata_file_processing_command=process-service-perfdata-file
1462+
1463+
1464+
1465+# HOST AND SERVICE PERFORMANCE DATA PROCESS EMPTY RESULTS
1466+# These options determine whether the core will process empty perfdata
1467+# results or not. This is needed for distributed monitoring, and intentionally
1468+# turned on by default.
1469+# If you don't require empty perfdata - saving some cpu cycles
1470+# on unwanted macro calculation - you can turn that off. Be careful!
1471+# Values: 1 = enable, 0 = disable
1472+
1473+#host_perfdata_process_empty_results=1
1474+#service_perfdata_process_empty_results=1
1475+
1476+
1477+# OBSESS OVER SERVICE CHECKS OPTION
1478+# This determines whether or not Nagios will obsess over service
1479+# checks and run the ocsp_command defined below. Unless you're
1480+# planning on implementing distributed monitoring, do not enable
1481+# this option. Read the HTML docs for more information on
1482+# implementing distributed monitoring.
1483+# Values: 1 = obsess over services, 0 = do not obsess (default)
1484+
1485+obsess_over_services=0
1486+
1487+
1488+
1489+# OBSESSIVE COMPULSIVE SERVICE PROCESSOR COMMAND
1490+# This is the command that is run for every service check that is
1491+# processed by Nagios. This command is executed only if the
1492+# obsess_over_services option (above) is set to 1. The command
1493+# argument is the short name of a command definition that you
1494+# define in your host configuration file. Read the HTML docs for
1495+# more information on implementing distributed monitoring.
1496+
1497+#ocsp_command=somecommand
1498+
1499+
1500+
1501+# OBSESS OVER HOST CHECKS OPTION
1502+# This determines whether or not Nagios will obsess over host
1503+# checks and run the ochp_command defined below. Unless you're
1504+# planning on implementing distributed monitoring, do not enable
1505+# this option. Read the HTML docs for more information on
1506+# implementing distributed monitoring.
1507+# Values: 1 = obsess over hosts, 0 = do not obsess (default)
1508+
1509+obsess_over_hosts=0
1510+
1511+
1512+
1513+# OBSESSIVE COMPULSIVE HOST PROCESSOR COMMAND
1514+# This is the command that is run for every host check that is
1515+# processed by Nagios. This command is executed only if the
1516+# obsess_over_hosts option (above) is set to 1. The command
1517+# argument is the short name of a command definition that you
1518+# define in your host configuration file. Read the HTML docs for
1519+# more information on implementing distributed monitoring.
1520+
1521+#ochp_command=somecommand
1522+
1523+
1524+
1525+# TRANSLATE PASSIVE HOST CHECKS OPTION
1526+# This determines whether or not Nagios will translate
1527+# DOWN/UNREACHABLE passive host check results into their proper
1528+# state for this instance of Nagios. This option is useful
1529+# if you have distributed or failover monitoring setup. In
1530+# these cases your other Nagios servers probably have a different
1531+# "view" of the network, with regards to the parent/child relationship
1532+# of hosts. If a distributed monitoring server thinks a host
1533+# is DOWN, it may actually be UNREACHABLE from the point of
1534+# this Nagios instance. Enabling this option will tell Nagios
1535+# to translate any DOWN or UNREACHABLE host states it receives
1536+# passively into the correct state from the view of this server.
1537+# Values: 1 = perform translation, 0 = do not translate (default)
1538+
1539+translate_passive_host_checks=0
1540+
1541+
1542+
1543+# PASSIVE HOST CHECKS ARE SOFT OPTION
1544+# This determines whether or not Nagios will treat passive host
1545+# checks as being HARD or SOFT. By default, a passive host check
1546+# result will put a host into a HARD state type. This can be changed
1547+# by enabling this option.
1548+# Values: 0 = passive checks are HARD, 1 = passive checks are SOFT
1549+
1550+passive_host_checks_are_soft=0
1551+
1552+
1553+
1554+# ORPHANED HOST/SERVICE CHECK OPTIONS
1555+# These options determine whether or not Nagios will periodically
1556+# check for orphaned host and service checks. Since service checks are
1557+# not rescheduled until the results of their previous execution
1558+# instance are processed, there exists a possibility that some
1559+# checks may never get rescheduled. A similar situation exists for
1560+# host checks, although the exact scheduling details differ a bit
1561+# from service checks. Orphaned checks seem to be a rare
1562+# problem and should not happen under normal circumstances.
1563+# If you have problems with service checks never getting
1564+# rescheduled, make sure you have orphaned service checks enabled.
1565+# Values: 1 = enable checks, 0 = disable checks
1566+
1567+check_for_orphaned_services=1
1568+check_for_orphaned_hosts=1
1569+
1570+
1571+
1572+# SERVICE FRESHNESS CHECK OPTION
1573+# This option determines whether or not Nagios will periodically
1574+# check the "freshness" of service results. Enabling this option
1575+# is useful for ensuring passive checks are received in a timely
1576+# manner.
1577+# Values: 1 = enable freshness checking, 0 = disable freshness checking
1578+
1579+check_service_freshness=1
1580+
1581+
1582+
1583+# SERVICE FRESHNESS CHECK INTERVAL
1584+# This setting determines how often (in seconds) Nagios will
1585+# check the "freshness" of service check results. If you have
1586+# disabled service freshness checking, this option has no effect.
1587+
1588+service_freshness_check_interval=60
1589+
1590+
1591+
1592+# SERVICE CHECK TIMEOUT STATE
1593+# This setting determines the state Nagios will report when a
1594+# service check times out - that is, it does not respond within
1595+# service_check_timeout seconds. This can be useful if a
1596+# machine is running at too high a load and you do not want
1597+# to consider a failed service check to be critical (the default).
1598+# Valid settings are:
1599+# c - Critical (default)
1600+# u - Unknown
1601+# w - Warning
1602+# o - OK
1603+
1604+service_check_timeout_state=c
1605+
1606+
1607+
1608+# HOST FRESHNESS CHECK OPTION
1609+# This option determines whether or not Nagios will periodically
1610+# check the "freshness" of host results. Enabling this option
1611+# is useful for ensuring passive checks are received in a timely
1612+# manner.
1613+# Values: 1 = enable freshness checking, 0 = disable freshness checking
1614+
1615+check_host_freshness=0
1616+
1617+
1618+
1619+# HOST FRESHNESS CHECK INTERVAL
1620+# This setting determines how often (in seconds) Nagios will
1621+# check the "freshness" of host check results. If you have
1622+# disabled host freshness checking, this option has no effect.
1623+
1624+host_freshness_check_interval=60
1625+
1626+
1627+
1628+
1629+# ADDITIONAL FRESHNESS THRESHOLD LATENCY
1630+# This setting determines the number of seconds that Nagios
1631+# will add to any host and service freshness thresholds that
1632+# it calculates (those not explicitly specified by the user).
1633+
1634+additional_freshness_latency=15
1635+
1636+
1637+
1638+
1639+# FLAP DETECTION OPTION
1640+# This option determines whether or not Nagios will try
1641+# and detect hosts and services that are "flapping".
1642+# Flapping occurs when a host or service changes between
1643+# states too frequently. When Nagios detects that a
1644+# host or service is flapping, it will temporarily suppress
1645+# notifications for that host/service until it stops
1646+# flapping. Flap detection is very experimental, so read
1647+# the HTML documentation before enabling this feature!
1648+# Values: 1 = enable flap detection
1649+# 0 = disable flap detection (default)
1650+
1651+enable_flap_detection=1
1652+
1653+
1654+
1655+# FLAP DETECTION THRESHOLDS FOR HOSTS AND SERVICES
1656+# Read the HTML documentation on flap detection for
1657+# an explanation of what this option does. This option
1658+# has no effect if flap detection is disabled.
1659+
1660+low_service_flap_threshold=5.0
1661+high_service_flap_threshold=20.0
1662+low_host_flap_threshold=5.0
1663+high_host_flap_threshold=20.0
1664+
1665+
1666+
1667+# DATE FORMAT OPTION
1668+# This option determines how short dates are displayed. Valid options
1669+# include:
1670+# us (MM-DD-YYYY HH:MM:SS)
1671+# euro (DD-MM-YYYY HH:MM:SS)
1672+# iso8601 (YYYY-MM-DD HH:MM:SS)
1673+# strict-iso8601 (YYYY-MM-DDTHH:MM:SS)
1674+#
1675+
1676+date_format=iso8601
1677+
1678+
1679+
1680+
1681+# TIMEZONE OFFSET
1682+# This option is used to override the default timezone that this
1683+# instance of Nagios runs in. If not specified, Nagios will use
1684+# the system configured timezone.
1685+#
1686+# NOTE: In order to display the correct timezone in the CGIs, you
1687+# will also need to alter the Apache directives for the CGI path
1688+# to include your timezone. Example:
1689+#
1690+# <Directory "/usr/local/nagios/sbin/">
1691+# SetEnv TZ "Australia/Brisbane"
1692+# ...
1693+# </Directory>
1694+
1695+#use_timezone=US/Mountain
1696+#use_timezone=Australia/Brisbane
1697+
1698+
1699+
1700+
1701+# P1.PL FILE LOCATION
1702+# This value determines where the p1.pl perl script (used by the
1703+# embedded Perl interpreter) is located. If you didn't compile
1704+# Nagios with embedded Perl support, this option has no effect.
1705+
1706+p1_file=/usr/lib/nagios3/p1.pl
1707+
1708+
1709+
1710+# EMBEDDED PERL INTERPRETER OPTION
1711+# This option determines whether or not the embedded Perl interpreter
1712+# will be enabled during runtime. This option has no effect if Nagios
1713+# has not been compiled with support for embedded Perl.
1714+# Values: 0 = disable interpreter, 1 = enable interpreter
1715+
1716+enable_embedded_perl=1
1717+
1718+
1719+
1720+# EMBEDDED PERL USAGE OPTION
1721+# This option determines whether or not Nagios will process Perl plugins
1722+# and scripts with the embedded Perl interpreter if the plugins/scripts
1723+# do not explicitly indicate whether or not it is okay to do so. Read
1724+# the HTML documentation on the embedded Perl interpreter for more
1725+# information on how this option works.
1726+
1727+use_embedded_perl_implicitly=1
1728+
1729+
1730+
1731+# ILLEGAL OBJECT NAME CHARACTERS
1732+# This option allows you to specify illegal characters that cannot
1733+# be used in host names, service descriptions, or names of other
1734+# object types.
1735+
1736+illegal_object_name_chars=`~!$%^&*|'"<>?,()=
1737+
1738+
1739+
1740+# ILLEGAL MACRO OUTPUT CHARACTERS
1741+# This option allows you to specify illegal characters that are
1742+# stripped from macros before being used in notifications, event
1743+# handlers, etc. This DOES NOT affect macros used in service or
1744+# host check commands.
1745+# The following macros are stripped of the characters you specify:
1746+# $HOSTOUTPUT$
1747+# $HOSTPERFDATA$
1748+# $HOSTACKAUTHOR$
1749+# $HOSTACKCOMMENT$
1750+# $SERVICEOUTPUT$
1751+# $SERVICEPERFDATA$
1752+# $SERVICEACKAUTHOR$
1753+# $SERVICEACKCOMMENT$
1754+
1755+illegal_macro_output_chars=`~$&|'"<>
1756+
1757+
1758+
1759+# REGULAR EXPRESSION MATCHING
1760+# This option controls whether or not regular expression matching
1761+# takes place in the object config files. Regular expression
1762+# matching is used to match host, hostgroup, service, and service
1763+# group names/descriptions in some fields of various object types.
1764+# Values: 1 = enable regexp matching, 0 = disable regexp matching
1765+
1766+use_regexp_matching=0
1767+
1768+
1769+
1770+# "TRUE" REGULAR EXPRESSION MATCHING
1771+# This option controls whether or not "true" regular expression
1772+# matching takes place in the object config files. This option
1773+# only has an effect if regular expression matching is enabled
1774+# (see above). If this option is DISABLED, regular expression
1775+# matching only occurs if a string contains wildcard characters
1776+# (* and ?). If the option is ENABLED, regexp matching occurs
1777+# all the time (which can be annoying).
1778+# Values: 1 = enable true matching, 0 = disable true matching
1779+
1780+use_true_regexp_matching=0
1781+
1782+
1783+
1784+# ADMINISTRATOR EMAIL/PAGER ADDRESSES
1785+# The email and pager address of a global administrator (likely you).
1786+# Nagios never uses these values itself, but you can access them by
1787+# using the $ADMINEMAIL$ and $ADMINPAGER$ macros in your notification
1788+# commands.
1789+
1790+admin_email={{ admin_email }}
1791+admin_pager={{ admin_pager }}
1792+
1793+
1794+
1795+# DAEMON CORE DUMP OPTION
1796+# This option determines whether or not Nagios is allowed to create
1797+# a core dump when it runs as a daemon. Note that it is generally
1798+# considered bad form to allow this, but it may be useful for
1799+# debugging purposes. Enabling this option doesn't guarantee that
1800+# a core file will be produced, but that's just life...
1801+# Values: 1 - Allow core dumps
1802+# 0 - Do not allow core dumps (default)
1803+
1804+daemon_dumps_core={{ daemon_dumps_core }}
1805+
1806+
1807+
1808+# LARGE INSTALLATION TWEAKS OPTION
1809+# This option determines whether or not Nagios will take some shortcuts
1810+# which can save on memory and CPU usage in large Nagios installations.
1811+# Read the documentation for more information on the benefits/tradeoffs
1812+# of enabling this option.
1813+# Values: 1 - Enable tweaks
1814+# 0 - Disable tweaks (default)
1815+
1816+use_large_installation_tweaks=0
1817+
1818+
1819+
1820+# ENABLE ENVIRONMENT MACROS
1821+# This option determines whether or not Nagios will make all standard
1822+# macros available as environment variables when host/service checks
1823+# and system commands (event handlers, notifications, etc.) are
1824+# executed. Enabling this option can cause performance issues in
1825+# large installations, as it will consume a bit more memory and (more
1826+# importantly) consume more CPU.
1827+# Values: 1 - Enable environment variable macros (default)
1828+# 0 - Disable environment variable macros
1829+
1830+enable_environment_macros=1
1831+
1832+
1833+
1834+# CHILD PROCESS MEMORY OPTION
1835+# This option determines whether or not Nagios will free memory in
1836+# child processes (processes used to execute system commands and host/
1837+# service checks). If you specify a value here, it will override
1838+# program defaults.
1839+# Value: 1 - Free memory in child processes
1840+# 0 - Do not free memory in child processes
1841+
1842+#free_child_process_memory=1
1843+
1844+
1845+
1846+# CHILD PROCESS FORKING BEHAVIOR
1847+# This option determines how Nagios will fork child processes
1848+# (used to execute system commands and host/service checks). Normally
1849+# child processes are fork()ed twice, which provides a very high level
1850+# of isolation from problems. Fork()ing once is probably enough and will
1851+# save a great deal on CPU usage (in large installs), so you might
1852+# want to consider using this. If you specify a value here, it will
1853+# override program defaults.
1854+# Value: 1 - Child processes fork() twice
1855+# 0 - Child processes fork() just once
1856+
1857+#child_processes_fork_twice=1
1858+
1859+
1860+
1861+# DEBUG LEVEL
1862+# This option determines how much (if any) debugging information will
1863+# be written to the debug file. OR values together to log multiple
1864+# types of information.
1865+# Values:
1866+# -1 = Everything
1867+# 0 = Nothing
1868+# 1 = Functions
1869+# 2 = Configuration
1870+# 4 = Process information
1871+# 8 = Scheduled events
1872+# 16 = Host/service checks
1873+# 32 = Notifications
1874+# 64 = Event broker
1875+# 128 = External commands
1876+# 256 = Commands
1877+# 512 = Scheduled downtime
1878+# 1024 = Comments
1879+# 2048 = Macros
1880+
1881+debug_level={{ debug_level }}
1882+
1883+
1884+
1885+# DEBUG VERBOSITY
1886+# This option determines how verbose the debug log output will be.
1887+# Values: 0 = Brief output
1888+# 1 = More detailed
1889+# 2 = Very detailed
1890+
1891+debug_verbosity={{ debug_verbosity }}
1892+
1893+
1894+
1895+# DEBUG FILE
1896+# This option determines where Nagios should write debugging information.
1897+
1898+debug_file={{ debug_file }}
1899+
1900+
1901+
1902+# MAX DEBUG FILE SIZE
1903+# This option determines the maximum size (in bytes) of the debug file. If
1904+# the file grows larger than this size, it will be renamed with a .old
1905+# extension. If a file already exists with a .old extension it will
1906+# automatically be deleted. This helps ensure your disk space usage doesn't
1907+# get out of control when debugging Nagios.
1908+
1909+max_debug_file_size=1000000
1910+
1911+
1912
1913=== added file 'hooks/templates/pagerduty_nagios_cfg.tmpl'
1914--- hooks/templates/pagerduty_nagios_cfg.tmpl 1970-01-01 00:00:00 +0000
1915+++ hooks/templates/pagerduty_nagios_cfg.tmpl 2015-09-08 04:46:07 +0000
1916@@ -0,0 +1,26 @@
1917+#------------------------------------------------
1918+# This file is juju managed
1919+#------------------------------------------------
1920+
1921+define contact {
1922+ contact_name pagerduty
1923+ alias PagerDuty Pseudo-Contact
1924+ service_notification_period 24x7
1925+ host_notification_period 24x7
1926+ service_notification_options w,u,c,r
1927+ host_notification_options d,r
1928+ service_notification_commands notify-service-by-pagerduty
1929+ host_notification_commands notify-host-by-pagerduty
1930+ pager {{ pagerduty_key }}
1931+}
1932+
1933+define command {
1934+ command_name notify-service-by-pagerduty
1935+ command_line /usr/local/bin/pagerduty_nagios.pl enqueue -f pd_nagios_object=service -q {{ pagerduty_path }}
1936+}
1937+
1938+define command {
1939+ command_name notify-host-by-pagerduty
1940+ command_line /usr/local/bin/pagerduty_nagios.pl enqueue -f pd_nagios_object=host -q {{ pagerduty_path }}
1941+}
1942+
1943
1944=== modified file 'hooks/upgrade-charm'
1945--- hooks/upgrade-charm 2015-05-07 07:07:09 +0000
1946+++ hooks/upgrade-charm 2015-09-08 04:46:07 +0000
1947@@ -5,12 +5,12 @@
1948 import base64
1949 from jinja2 import Template
1950 import os
1951-import re
1952+# import re
1953 import pwd
1954 import grp
1955 import stat
1956 import errno
1957-# import shutil
1958+import shutil
1959 import subprocess
1960 from charmhelpers.contrib import ssl
1961 from charmhelpers.core import hookenv, host
1962@@ -23,10 +23,17 @@
1963 extra_config = hookenv.config('extraconfig')
1964 enable_livestatus = hookenv.config('enable_livestatus')
1965 livestatus_path = hookenv.config('livestatus_path')
1966+enable_pagerduty = hookenv.config('enable_pagerduty')
1967+pagerduty_key = hookenv.config('pagerduty_key')
1968+pagerduty_path = hookenv.config('pagerduty_path')
1969+nagios_user = hookenv.config('nagios_user')
1970+nagios_group = hookenv.config('nagios_group')
1971 ssl_config = hookenv.config('ssl')
1972 charm_dir = os.environ['CHARM_DIR']
1973 cert_domain = hookenv.unit_get('public-address')
1974 nagios_cfg = "/etc/nagios3/nagios.cfg"
1975+pagerduty_cfg = "/etc/nagios3/conf.d/pagerduty_nagios.cfg"
1976+pagerduty_cron = "/etc/cron.d/nagios-pagerduty-flush"
1977
1978
1979 # Checks the charm relations for legacy relations
1980@@ -79,23 +86,9 @@
1981 hookenv.log("Livestatus is enabled")
1982 fetch.apt_update()
1983 fetch.apt_install('check-mk-livestatus')
1984- broker = re.compile("^broker_module=")
1985- broker_found = False
1986- for line in open(nagios_cfg):
1987- if broker.match(line):
1988- broker_found = True
1989- hookenv.log("broker_module line exists, not adding..")
1990- break
1991-
1992- if not broker_found:
1993- with open(nagios_cfg, "a") as nagiosfile:
1994- broker_str = "broker_module=/usr/lib/check_mk/livestatus.o " . livestatus_path
1995- nagiosfile.write(broker_str)
1996- nagiosfile.close()
1997
1998 # Make the directory and fix perms on it
1999 hookenv.log("Fixing perms on livestatus_path")
2000- livestatus_path = hookenv.config('livestatus_path')
2001 livestatus_dir = os.path.dirname(livestatus_path)
2002 if not os.path.isdir(livestatus_dir):
2003 hookenv.log("Making path for livestatus_dir")
2004@@ -104,7 +97,7 @@
2005
2006 # Fix the perms on the socket
2007 hookenv.log("Fixing perms on the socket")
2008- uid = pwd.getpwnam("nagios").pw_uid
2009+ uid = pwd.getpwnam(nagios_user).pw_uid
2010 gid = grp.getgrnam("www-data").gr_gid
2011 os.chown(livestatus_path, uid, gid)
2012 os.chown(livestatus_dir, uid, gid)
2013@@ -113,6 +106,57 @@
2014 os.chmod(livestatus_dir, st.st_mode | stat.S_IRGRP | stat.S_ISGID)
2015
2016
2017+def enable_pagerduty_config():
2018+ if enable_pagerduty:
2019+ hookenv.log("Pagerduty is enabled")
2020+
2021+ # Ship the pagerduty_nagios.cfg file
2022+ template_values = {'enable_pagerduty': enable_pagerduty,
2023+ 'pagerduty_key': pagerduty_key,
2024+ 'pagerduty_path': pagerduty_path}
2025+
2026+ with open('hooks/templates/pagerduty_nagios_cfg.tmpl', 'r') as f:
2027+ templateDef = f.read()
2028+
2029+ t = Template(templateDef)
2030+ with open(pagerduty_cfg, 'w') as f:
2031+ f.write(t.render(template_values))
2032+
2033+ # Ship the cron file
2034+ shutil.copyfile('files/nagios-pagerduty-flush-cron', pagerduty_cron)
2035+
2036+ # Ship the pagerduty_nagios.pl script
2037+ shutil.copyfile('files/pagerduty_nagios.pl', '/usr/local/bin/pagerduty_nagios.pl')
2038+
2039+ # Create the pagerduty queue dir
2040+ if not os.path.isdir(pagerduty_path):
2041+ hookenv.log("Making path for pagerduty_path")
2042+ mkdir_p(pagerduty_path)
2043+ # Fix the perms on it
2044+ uid = pwd.getpwnam(nagios_user).pw_uid
2045+ gid = grp.getgrnam(nagios_group).gr_gid
2046+ os.chown(pagerduty_path, uid, gid)
2047+ else:
2048+ # Clean up the files if we don't want pagerduty
2049+ if os.path.isfile(pagerduty_cfg):
2050+ os.remove(pagerduty_cfg)
2051+ if os.path.isfile(pagerduty_cron):
2052+ os.remove(pagerduty_cron)
2053+
2054+ # Update contacts for admin
2055+ template_values = {'enable_pagerduty': enable_pagerduty,
2056+ 'admin_email': hookenv.config('admin_email')}
2057+
2058+ with open('hooks/templates/contacts-cfg.tmpl', 'r') as f:
2059+ templateDef = f.read()
2060+
2061+ t = Template(templateDef)
2062+ with open('/etc/nagios3/conf.d/contacts_nagios2.cfg', 'w') as f:
2063+ f.write(t.render(template_values))
2064+
2065+ host.service_reload('nagios3')
2066+
2067+
2068 def ssl_configured():
2069 allowed_options = ["on", "only"]
2070 if str(ssl_config).lower() in allowed_options:
2071@@ -172,6 +216,34 @@
2072 hookenv.log("Decoded SSL files", "INFO")
2073
2074
2075+def update_config():
2076+ template_values = {'nagios_user': nagios_user,
2077+ 'nagios_group': nagios_group,
2078+ 'enable_livestatus': enable_livestatus,
2079+ 'livestatus_path': livestatus_path,
2080+ 'livestatus_args': hookenv.config('livestatus_args'),
2081+ 'check_external_commands': hookenv.config('check_external_commands'),
2082+ 'command_check_interval': hookenv.config('command_check_interval'),
2083+ 'command_file': hookenv.config('command_file'),
2084+ 'debug_file': hookenv.config('debug_file'),
2085+ 'debug_verbosity': hookenv.config('debug_verbosity'),
2086+ 'debug_level': hookenv.config('debug_level'),
2087+ 'daemon_dumps_core': hookenv.config('daemon_dumps_core'),
2088+ 'admin_email': hookenv.config('admin_email'),
2089+ 'admin_pager': hookenv.config('admin_pager'),
2090+ 'log_rotation_method': hookenv.config('log_rotation_method'),
2091+ 'log_archive_path': hookenv.config('log_archive_path'),
2092+ 'use_syslog': hookenv.config('use_syslog')}
2093+
2094+ with open('hooks/templates/nagios-cfg.tmpl', 'r') as f:
2095+ templateDef = f.read()
2096+
2097+ t = Template(templateDef)
2098+ with open(nagios_cfg, 'w') as f:
2099+ f.write(t.render(template_values))
2100+
2101+ host.service_reload('nagios3')
2102+
2103 # Nagios3 is deployed as a global apache application from the archive.
2104 # We'll get a little funky and add the SSL keys to the default-ssl config
2105 # which sets our keys, including the self-signed ones, as the host keyfiles.
2106@@ -212,7 +284,9 @@
2107
2108 warn_legacy_relations()
2109 write_extra_config()
2110+update_config()
2111 enable_livestatus_config()
2112+enable_pagerduty_config()
2113 if ssl_configured():
2114 enable_ssl()
2115 update_apache()
2116
2117=== modified file 'tests/00-setup'
2118--- tests/00-setup 2014-03-19 22:03:03 +0000
2119+++ tests/00-setup 2015-09-08 04:46:07 +0000
2120@@ -1,5 +1,5 @@
2121 #!/bin/bash
2122
2123 sudo add-apt-repository -y ppa:juju/stable
2124-apt-get update
2125+sudo apt-get update
2126 sudo apt-get install -y amulet juju-local python3-requests
2127
2128=== added file 'tests/24-pagerduty-test'
2129--- tests/24-pagerduty-test 1970-01-01 00:00:00 +0000
2130+++ tests/24-pagerduty-test 2015-09-08 04:46:07 +0000
2131@@ -0,0 +1,56 @@
2132+#!/usr/bin/python3
2133+
2134+from time import sleep
2135+import amulet
2136+# import requests
2137+
2138+seconds = 20000
2139+
2140+d = amulet.Deployment(series='trusty')
2141+
2142+d.add('nagios')
2143+
2144+d.expose('nagios')
2145+
2146+try:
2147+ d.setup(timeout=seconds)
2148+ d.sentry.wait()
2149+except amulet.helpers.TimeoutError:
2150+ amulet.raise_status(amulet.SKIP, msg="Environment wasn't stood up in time")
2151+except:
2152+ raise
2153+
2154+
2155+##
2156+# Set relationship aliases
2157+##
2158+nagios_unit = d.sentry.unit['nagios/0']
2159+
2160+d.configure('nagios', {
2161+ 'enable_pagerduty': True
2162+})
2163+
2164+d.sentry.wait()
2165+
2166+# Give it a while to settle
2167+sleep(30)
2168+
2169+def test_pagerduty_path_exists():
2170+ pagerduty_path = nagios_unit.run('config-get pagerduty_path')
2171+ try:
2172+ pagerduty_file = nagios_unit.file(pagerduty_path[0])
2173+ except OSError:
2174+ message = "Can't find pagerduty directory"
2175+ amulet.raise_status(amulet.FAIL, msg=message)
2176+
2177+
2178+def test_pagerduty_config():
2179+ pagerduty_cfg = '/etc/nagios3/conf.d/pagerduty_nagios.cfg'
2180+ try:
2181+ pagerduty_cfg_file = nagios_unit.file(pagerduty_cfg)
2182+ except OSError:
2183+ message = "Can't find pagerduty config file"
2184+ amulet.raise_status(amulet.FAIL, msg=message)
2185+
2186+test_pagerduty_path_exists()
2187+test_pagerduty_config()
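
The review above suggests wrapping the directory check in a loop with a sleep to ride out the apparent race. A minimal sketch of that idea follows, assuming the check signals a missing path by raising OSError, as the except clauses in tests/24-pagerduty-test expect. The names `wait_for` and `flaky_check` are hypothetical and are not part of amulet or the charm; `flaky_check` only stands in for a call such as `nagios_unit.file(path)`.

```python
import time


def wait_for(check, attempts=10, delay=3.0, sleep=time.sleep):
    """Call check() until it returns without raising OSError.

    Returns the value of the first successful call. If every attempt
    fails, the OSError from the last attempt is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return check()
        except OSError:
            if attempt == attempts - 1:
                raise
            sleep(delay)


# Stand-in for the real check: it fails twice and then succeeds,
# imitating a directory that shows up slightly after the hook returns.
calls = {"count": 0}


def flaky_check():
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError("not there yet")
    return "found"


result = wait_for(flaky_check, attempts=5, delay=0.01)
print(result, calls["count"])
```

In the test itself, the body of each `try` could become something like `wait_for(lambda: nagios_unit.file(pagerduty_cfg))`, leaving the existing `except OSError` handler to report the failure once the attempts run out. This bounds the wait instead of relying only on the fixed `sleep(30)`.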
