Merge lp:~brad-marshall/charms/trusty/glance/add-nrpe-checks into lp:~openstack-charmers-archive/charms/trusty/glance/trunk

Proposed by Brad Marshall
Status: Superseded
Proposed branch: lp:~brad-marshall/charms/trusty/glance/add-nrpe-checks
Merge into: lp:~openstack-charmers-archive/charms/trusty/glance/trunk
Diff against target: 969 lines (+853/-0)
10 files modified
charm-helpers-hooks.yaml (+1/-0)
config.yaml (+11/-0)
files/nrpe-external-master/check_exit_status.pl (+189/-0)
files/nrpe-external-master/check_status_file.py (+60/-0)
files/nrpe-external-master/check_upstart_job (+72/-0)
files/nrpe-external-master/nagios_plugin.py (+78/-0)
hooks/charmhelpers/contrib/charmsupport/nrpe.py (+222/-0)
hooks/charmhelpers/contrib/charmsupport/volumes.py (+156/-0)
hooks/glance_relations.py (+61/-0)
metadata.yaml (+3/-0)
To merge this branch: bzr merge lp:~brad-marshall/charms/trusty/glance/add-nrpe-checks
Reviewer                  Review Type    Date Requested    Status
Liam Young (community)                                     Needs Fixing
Review via email: mp+241494@code.launchpad.net

This proposal has been superseded by a proposal from 2014-11-17.

Description of the change

Adds nrpe-external-master interface and adds basic nrpe checks.
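
For context, the nagios_context option added by this branch feeds the host name that the charm-helpers NRPE class registers with Nagios. A minimal sketch of that naming rule, assuming the unit name has the usual service/N form; nagios_hostname is a hypothetical helper name used only for this illustration, not a function in the branch:

```python
def nagios_hostname(nagios_context, unit_name):
    """Build the host name the way NRPE.__init__ in the diff does:
    the context, a hyphen, then the unit name with '/' replaced by '-'."""
    return "{}-{}".format(nagios_context, unit_name.replace("/", "-"))


# With the default context "juju" and the unit "glance/0":
print(nagios_hostname("juju", "glance/0"))  # juju-glance-0
```

Setting a different nagios_context per environment keeps host names distinct when the same service is deployed more than once.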

Ryan Beisner (1chb1n) wrote :

UOSCI bot says:
charm_lint_check #989 trusty-glance for brad-marshall mp241494
    LINT FAIL: lint-test failed

LINT Results (max last 5 lines):
  hooks/glance_relations.py:476:18: E251 unexpected spaces around keyword / parameter equals
  hooks/glance_relations.py:476:20: E251 unexpected spaces around keyword / parameter equals
  hooks/glance_relations.py:481:18: E251 unexpected spaces around keyword / parameter equals
  hooks/glance_relations.py:481:20: E251 unexpected spaces around keyword / parameter equals
  make: *** [lint] Error 1

Full lint test output: http://paste.ubuntu.com/8955744/
Build: http://10.98.191.181:8080/job/charm_lint_check/989/
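
E251 is the pep8 check for spaces around '=' in a keyword argument or default value. The flagged lines of hooks/glance_relations.py are not visible in this excerpt, so the sketch below only illustrates the pattern; add_check here is a stand-in function written for the example, not the charm-helpers method, and the argument values are made up:

```python
def add_check(shortname, description, check_cmd):
    """Stand-in that just records its arguments."""
    return {"shortname": shortname,
            "description": description,
            "check_cmd": check_cmd}


# Flagged by E251 (spaces around '=' in a keyword argument):
#     add_check(shortname = "glance-api", ...)
# Accepted form: no spaces around '='.
check = add_check(shortname="glance-api",
                  description="process check for glance-api",
                  check_cmd="check_upstart_job glance-api")
```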

Ryan Beisner (1chb1n) wrote :

UOSCI bot says:
charm_unit_test #824 trusty-glance for brad-marshall mp241494
    UNIT FAIL: unit-test failed

UNIT Results (max last 5 lines):
  hooks/glance_utils 91 8 91% 156, 249-261
  TOTAL 369 39 89%
  Ran 65 tests in 3.551s
  FAILED (errors=3)
  make: *** [unit_test] Error 1

Full unit test output: http://paste.ubuntu.com/8955748/
Build: http://10.98.191.181:8080/job/charm_unit_test/824/

Ryan Beisner (1chb1n) wrote :

UOSCI bot says:
charm_amulet_test #369 trusty-glance for brad-marshall mp241494
    AMULET FAIL: amulet-test failed

AMULET Results (max last 5 lines):
  juju-test.conductor DEBUG : Calling "juju destroy-environment -y osci-sv04"
  WARNING cannot delete security group "juju-osci-sv04-0". Used by another environment?
  juju-test INFO : Results: 1 passed, 2 failed, 0 errored
  ERROR subprocess encountered error code 2
  make: *** [test] Error 2

Full amulet test output: http://paste.ubuntu.com/8955868/
Build: http://10.98.191.181:8080/job/charm_amulet_test/369/

Liam Young (gnuoy) wrote :

Thanks again for adding the nagios checks.

Please could check_upstart_job go into charm helpers. Also it'd be nice (but I wouldn't block on it) if glance_utils.services was used to get a list of services to add to the nagios check

review: Needs Fixing
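
The second suggestion could look roughly like the sketch below. It assumes services() returns a plain list of service names, and FakeNRPE is a hypothetical stand-in for the charm-helpers NRPE class so that the example runs outside a hook environment. The shortname, description and check command strings are illustrative only.

```python
class FakeNRPE(object):
    """Hypothetical stand-in for charmhelpers' NRPE; it only records checks."""

    def __init__(self):
        self.checks = []

    def add_check(self, shortname, description, check_cmd):
        self.checks.append((shortname, description, check_cmd))


def services():
    """Assumed shape of glance_utils.services(): a list of service names."""
    return ["glance-api", "glance-registry"]


def add_service_checks(nrpe, unit_name):
    """Register one process check per service instead of hard-coding names."""
    for svc in services():
        nrpe.add_check(
            shortname=svc,
            description="process check {%s}" % unit_name,
            check_cmd="check_upstart_job %s" % svc,
        )
    return nrpe


nrpe = add_service_checks(FakeNRPE(), "juju-glance-0")
```

Driving the checks from the same list the charm already uses for restarts means a newly added service gets monitored without a second edit.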
84. By Brad Marshall

[bradm] Fixes from pep8 run, added sysvinit daemon services monitoring, use services() to get services to monitor
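
One way to cover both init systems is sketched here under stated assumptions: /etc/init/NAME.conf and /etc/init.d/NAME are taken to be the conventional Ubuntu locations for upstart jobs and sysvinit scripts, and check_command_for is a hypothetical helper, not the code from revision 84. The exists parameter is injectable so the logic can be exercised without touching the filesystem.

```python
import os


def check_command_for(service, exists=os.path.exists):
    """Pick a check command depending on how the service is managed.

    Returns None when neither an upstart job nor an init script is found.
    """
    upstart_conf = "/etc/init/%s.conf" % service
    sysv_script = "/etc/init.d/%s" % service
    if exists(upstart_conf):
        return "check_upstart_job %s" % service
    if exists(sysv_script):
        # -e checks the exit status of '<script> status'; -s names the script
        # (both flags are described in the check_exit_status.pl help text).
        return "check_exit_status.pl -e -s %s" % sysv_script
    return None
```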

uosci-testing-bot (uosci-testing-bot) wrote :

UOSCI bot says:
charm_lint_check #1081 trusty-glance for brad-marshall mp241494
    LINT OK: passed

LINT Results (max last 5 lines):
  I: config.yaml: option ssl_ca has no default value
  I: config.yaml: option ssl_cert has no default value
  I: config.yaml: option os-internal-network has no default value
  I: config.yaml: option os-public-network has no default value
  OK

Full lint test output: http://paste.ubuntu.com/9051422/
Build: http://10.98.191.181:8080/job/charm_lint_check/1081/

uosci-testing-bot (uosci-testing-bot) wrote :

UOSCI bot says:
charm_unit_test #915 trusty-glance for brad-marshall mp241494
    UNIT FAIL: unit-test failed

UNIT Results (max last 5 lines):
  hooks/glance_utils 91 8 91% 156, 249-261
  TOTAL 382 51 87%
  Ran 65 tests in 3.421s
  FAILED (errors=3)
  make: *** [unit_test] Error 1

Full unit test output: http://paste.ubuntu.com/9051423/
Build: http://10.98.191.181:8080/job/charm_unit_test/915/

uosci-testing-bot (uosci-testing-bot) wrote :

UOSCI bot says:
charm_amulet_test #423 trusty-glance for brad-marshall mp241494
    AMULET FAIL: amulet-test failed

AMULET Results (max last 5 lines):
  juju-test.conductor DEBUG : Calling "juju destroy-environment -y osci-sv07"
  WARNING cannot delete security group "juju-osci-sv07-0". Used by another environment?
  juju-test INFO : Results: 1 passed, 2 failed, 0 errored
  ERROR subprocess encountered error code 2
  make: *** [test] Error 2

Full amulet test output: http://paste.ubuntu.com/9051445/
Build: http://10.98.191.181:8080/job/charm_amulet_test/423/

85. By Brad Marshall

[bradm] Removed puppet header from nagios_plugin module

86. By Brad Marshall

[bradm] Removed nagios check files that were moved to nrpe-external-master charm

Unmerged revisions

Preview Diff

=== modified file 'charm-helpers-hooks.yaml'
--- charm-helpers-hooks.yaml 2014-10-01 22:14:32 +0000
+++ charm-helpers-hooks.yaml 2014-11-17 02:32:28 +0000
@@ -8,3 +8,4 @@
     - contrib.storage.linux.ceph
     - payload.execd
     - contrib.network.ip
+    - contrib.charmsupport
=== modified file 'config.yaml'
--- config.yaml 2014-10-08 10:40:04 +0000
+++ config.yaml 2014-11-17 02:32:28 +0000
@@ -153,3 +153,14 @@
       The CPU core multiplier to use when configuring worker processes for
       Glance. By default, the number of workers for each daemon is set to
       twice the number of CPU cores a service unit has.
+  nagios_context:
+    default: "juju"
+    type: string
+    description: |
+      Used by the nrpe-external-master subordinate charm.
+      A string that will be prepended to instance name to set the host name
+      in nagios. So for instance the hostname would be something like:
+          juju-myservice-0
+      If you're running multiple environments with the same services in them
+      this allows you to differentiate between them.
+
=== added directory 'files'
=== added directory 'files/nrpe-external-master'
=== added file 'files/nrpe-external-master/check_exit_status.pl'
--- files/nrpe-external-master/check_exit_status.pl 1970-01-01 00:00:00 +0000
+++ files/nrpe-external-master/check_exit_status.pl 2014-11-17 02:32:28 +0000
@@ -0,0 +1,189 @@
+#!/usr/bin/perl
+################################################################################
+# #
+# Copyright (C) 2011 Chad Columbus <ccolumbu@hotmail.com> #
+# #
+# This program is free software; you can redistribute it and/or modify #
+# it under the terms of the GNU General Public License as published by #
+# the Free Software Foundation; either version 2 of the License, or #
+# (at your option) any later version. #
+# #
+# This program is distributed in the hope that it will be useful, #
+# but WITHOUT ANY WARRANTY; without even the implied warranty of #
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
+# GNU General Public License for more details. #
+# #
+# You should have received a copy of the GNU General Public License #
+# along with this program; if not, write to the Free Software #
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA #
+# #
+################################################################################
+
+use strict;
+use Getopt::Std;
+$| = 1;
+
+my %opts;
+getopts('heronp:s:', \%opts);
+
+my $VERSION = "Version 1.0";
+my $AUTHOR = '(c) 2011 Chad Columbus <ccolumbu@hotmail.com>';
+
+# Default values:
+my $script_to_check;
+my $pattern = 'is running';
+my $cmd;
+my $message;
+my $error;
+
+# Exit codes
+my $STATE_OK = 0;
+my $STATE_WARNING = 1;
+my $STATE_CRITICAL = 2;
+my $STATE_UNKNOWN = 3;
+
+# Parse command line options
+if ($opts{'h'} || scalar(%opts) == 0) {
+    &print_help();
+    exit($STATE_OK);
+}
+
+# Make sure scipt is provided:
+if ($opts{'s'} eq '') {
+    # Script to run not provided
+    print "\nYou must provide a script to run. Example: -s /etc/init.d/httpd\n";
+    exit($STATE_UNKNOWN);
+} else {
+    $script_to_check = $opts{'s'};
+}
+
+# Make sure only a-z, 0-9, /, _, and - are used in the script.
+if ($script_to_check =~ /[^a-z0-9\_\-\/\.]/) {
+    # Script contains illegal characters exit.
+    print "\nScript to check can only contain Letters, Numbers, Periods, Underscores, Hyphens, and/or Slashes\n";
+    exit($STATE_UNKNOWN);
+}
+
+# See if script is executable
+if (! -x "$script_to_check") {
+    print "\nIt appears you can't execute $script_to_check, $!\n";
+    exit($STATE_UNKNOWN);
+}
+
+# If a pattern is provided use it:
+if ($opts{'p'} ne '') {
+    $pattern = $opts{'p'};
+}
+
+# If -r run command via sudo as root:
+if ($opts{'r'}) {
+    $cmd = "sudo -n $script_to_check status" . ' 2>&1';
+} else {
+    $cmd = "$script_to_check status" . ' 2>&1';
+}
+
+my $cmd_result = `$cmd`;
+chomp($cmd_result);
+if ($cmd_result =~ /sudo/i) {
+    # This means it could not run the sudo command
+    $message = "$script_to_check CRITICAL - Could not run: 'sudo -n $script_to_check status'. Result is $cmd_result";
+    $error = $STATE_UNKNOWN;
+} else {
+    # Check exitstatus instead of output:
+    if ($opts{'e'} == 1) {
+        if ($? != 0) {
+            # error
+            $message = "$script_to_check CRITICAL - Exit code: $?\.";
+            if ($opts{'o'} == 0) {
+                $message .= " $cmd_result";
+            }
+            $error = $STATE_CRITICAL;
+        } else {
+            # success
+            $message = "$script_to_check OK - Exit code: $?\.";
+            if ($opts{'o'} == 0) {
+                $message .= " $cmd_result";
+            }
+            $error = $STATE_OK;
+        }
+    } else {
+        my $not_check = 1;
+        if ($opts{'n'} == 1) {
+            $not_check = 0;
+        }
+        if (($cmd_result =~ /$pattern/i) == $not_check) {
+            $message = "$script_to_check OK";
+            if ($opts{'o'} == 0) {
+                $message .= " - $cmd_result";
+            }
+            $error = $STATE_OK;
+        } else {
+            $message = "$script_to_check CRITICAL";
+            if ($opts{'o'} == 0) {
+                $message .= " - $cmd_result";
+            }
+            $error = $STATE_CRITICAL;
+        }
+    }
+}
+
+if ($message eq '') {
+    print "Error: program failed in an unknown way\n";
+    exit($STATE_UNKNOWN);
+}
+
+if ($error) {
+    print "$message\n";
+    exit($error);
+} else {
+    # If we get here we are OK
+    print "$message\n";
+    exit($STATE_OK);
+}
+
+####################################
+# Start Subs:
+####################################
+sub print_help() {
+    print << "EOF";
+Check the output or exit status of a script.
+$VERSION
+$AUTHOR
+
+Options:
+-h
+    Print detailed help screen
+
+-s
+    'FULL PATH TO SCRIPT' (required)
+    This is the script to run, the script is designed to run scripts in the
+    /etc/init.d dir (but can run any script) and will call the script with
+    a 'status' argument. So if you use another script make sure it will
+    work with /path/script status, example: /etc/init.d/httpd status
+
+-e
+    This is the "exitstaus" flag, it means check the exit status
+    code instead of looking for a pattern in the output of the script.
+
+-p 'REGEX'
+    This is a pattern to look for in the output of the script to confirm it
+    is running, default is 'is running', but not all init.d scripts output
+    (iptables), so you can specify an arbitrary pattern.
+    All patterns are case insensitive.
+
+-n
+    This is the "NOT" flag, it means not the -p pattern, so if you want to
+    make sure the output of the script does NOT contain -p 'REGEX'
+
+-r
+    This is the "ROOT" flag, it means run as root via sudo. You will need a
+    line in your /etc/sudoers file like:
+    nagios ALL=(root) NOPASSWD: /etc/init.d/* status
+
+-o
+    This is the "SUPPRESS OUTPUT" flag. Some programs have a long output
+    (like iptables), this flag suppresses that output so it is not printed
+    as a part of the nagios message.
+EOF
+}
+
=== added file 'files/nrpe-external-master/check_status_file.py'
--- files/nrpe-external-master/check_status_file.py 1970-01-01 00:00:00 +0000
+++ files/nrpe-external-master/check_status_file.py 2014-11-17 02:32:28 +0000
@@ -0,0 +1,60 @@
+#!/usr/bin/python
+
+#                                        m
+#  mmmm   m   m  mmmm   mmmm    mmm   mm#mm
+#  #" "#  #   #  #" "#  #" "#  #"  #    #
+#  #   #  #   #  #   #  #   #  #""""    #
+#  ##m#"  "mm"#  ##m#"  ##m#"  "#mm"    "mm
+#  #             #      #
+#  "             "      "
+# This file is managed by puppet. Do not make local changes.
+
+#
+# Copyright 2014 Canonical Ltd.
+#
+# Author: Jacek Nykis <jacek.nykis@canonical.com>
+#
+
+import re
+import nagios_plugin
+
+
+def parse_args():
+    import argparse
+
+    parser = argparse.ArgumentParser(
+        description='Read file and return nagios status based on its content',
+        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
+    parser.add_argument('-f', '--status-file', required=True,
+                        help='Status file path')
+    parser.add_argument('-c', '--critical-text', default='CRITICAL',
+                        help='String indicating critical status')
+    parser.add_argument('-w', '--warning-text', default='WARNING',
+                        help='String indicating warning status')
+    parser.add_argument('-o', '--ok-text', default='OK',
+                        help='String indicating OK status')
+    parser.add_argument('-u', '--unknown-text', default='UNKNOWN',
+                        help='String indicating unknown status')
+    return parser.parse_args()
+
+
+def check_status(args):
+    nagios_plugin.check_file_freshness(args.status_file, 43200)
+
+    with open(args.status_file, "r") as f:
+        content = [l.strip() for l in f.readlines()]
+
+    for line in content:
+        if re.search(args.critical_text, line):
+            raise nagios_plugin.CriticalError(line)
+        elif re.search(args.warning_text, line):
+            raise nagios_plugin.WarnError(line)
+        elif re.search(args.unknown_text, line):
+            raise nagios_plugin.UnknownError(line)
+        else:
+            print line
+
+
+if __name__ == '__main__':
+    args = parse_args()
+    nagios_plugin.try_check(check_status, args)
=== added file 'files/nrpe-external-master/check_upstart_job'
--- files/nrpe-external-master/check_upstart_job 1970-01-01 00:00:00 +0000
+++ files/nrpe-external-master/check_upstart_job 2014-11-17 02:32:28 +0000
@@ -0,0 +1,72 @@
+#!/usr/bin/python
+
+#
+# Copyright 2012, 2013 Canonical Ltd.
+#
+# Author: Paul Collins <paul.collins@canonical.com>
+#
+# Based on http://www.eurion.net/python-snippets/snippet/Upstart%20service%20status.html
+#
+
+import sys
+
+import dbus
+
+
+class Upstart(object):
+    def __init__(self):
+        self._bus = dbus.SystemBus()
+        self._upstart = self._bus.get_object('com.ubuntu.Upstart',
+                                             '/com/ubuntu/Upstart')
+    def get_job(self, job_name):
+        path = self._upstart.GetJobByName(job_name,
+                                          dbus_interface='com.ubuntu.Upstart0_6')
+        return self._bus.get_object('com.ubuntu.Upstart', path)
+
+    def get_properties(self, job):
+        path = job.GetInstance([], dbus_interface='com.ubuntu.Upstart0_6.Job')
+        instance = self._bus.get_object('com.ubuntu.Upstart', path)
+        return instance.GetAll('com.ubuntu.Upstart0_6.Instance',
+                               dbus_interface=dbus.PROPERTIES_IFACE)
+
+    def get_job_instances(self, job_name):
+        job = self.get_job(job_name)
+        paths = job.GetAllInstances([], dbus_interface='com.ubuntu.Upstart0_6.Job')
+        return [self._bus.get_object('com.ubuntu.Upstart', path) for path in paths]
+
+    def get_job_instance_properties(self, job):
+        return job.GetAll('com.ubuntu.Upstart0_6.Instance',
+                          dbus_interface=dbus.PROPERTIES_IFACE)
+
+try:
+    upstart = Upstart()
+    try:
+        job = upstart.get_job(sys.argv[1])
+        props = upstart.get_properties(job)
+
+        if props['state'] == 'running':
+            print 'OK: %s is running' % sys.argv[1]
+            sys.exit(0)
+        else:
+            print 'CRITICAL: %s is not running' % sys.argv[1]
+            sys.exit(2)
+
+    except dbus.DBusException as e:
+        instances = upstart.get_job_instances(sys.argv[1])
+        propses = [upstart.get_job_instance_properties(instance) for instance in instances]
+        states = dict([(props['name'], props['state']) for props in propses])
+        if len(states) != states.values().count('running'):
+            not_running = []
+            for name in states.keys():
+                if states[name] != 'running':
+                    not_running.append(name)
+            print 'CRITICAL: %d instances of %s not running: %s' % \
+                (len(not_running), sys.argv[1], not_running.join(', '))
+            sys.exit(2)
+        else:
+            print 'OK: %d instances of %s running' % (len(states), sys.argv[1])
+
+except dbus.DBusException as e:
+    print 'CRITICAL: failed to get properties of \'%s\' from upstart' % sys.argv[1]
+    sys.exit(2)
+
=== added file 'files/nrpe-external-master/nagios_plugin.py'
--- files/nrpe-external-master/nagios_plugin.py 1970-01-01 00:00:00 +0000
+++ files/nrpe-external-master/nagios_plugin.py 2014-11-17 02:32:28 +0000
@@ -0,0 +1,78 @@
+#!/usr/bin/env python
+#                                        m
+#  mmmm   m   m  mmmm   mmmm    mmm   mm#mm
+#  #" "#  #   #  #" "#  #" "#  #"  #    #
+#  #   #  #   #  #   #  #   #  #""""    #
+#  ##m#"  "mm"#  ##m#"  ##m#"  "#mm"    "mm
+#  #             #      #
+#  "             "      "
+# This file is managed by puppet. Do not make local changes.
+
+# Copyright (C) 2005, 2006, 2007, 2012 James Troup <james.troup@canonical.com>
+
+import os
+import stat
+import time
+import traceback
+import sys
+
+
+################################################################################
+
+class CriticalError(Exception):
+    """This indicates a critical error."""
+    pass
+
+
+class WarnError(Exception):
+    """This indicates a warning condition."""
+    pass
+
+
+class UnknownError(Exception):
+    """This indicates a unknown error was encountered."""
+    pass
+
+
+def try_check(function, *args, **kwargs):
+    """Perform a check with error/warn/unknown handling."""
+    try:
+        function(*args, **kwargs)
+    except UnknownError, msg:
+        print msg
+        sys.exit(3)
+    except CriticalError, msg:
+        print msg
+        sys.exit(2)
+    except WarnError, msg:
+        print msg
+        sys.exit(1)
+    except:
+        print "%s raised unknown exception '%s'" % (function, sys.exc_info()[0])
+        print '=' * 60
+        traceback.print_exc(file=sys.stdout)
+        print '=' * 60
+        sys.exit(3)
+
+
+################################################################################
+
+def check_file_freshness(filename, newer_than=600):
+    """Check a file exists, is readable and is newer than <n> seconds (where <n> defaults to 600)."""
+    # First check the file exists and is readable
+    if not os.path.exists(filename):
+        raise CriticalError("%s: does not exist." % (filename))
+    if os.access(filename, os.R_OK) == 0:
+        raise CriticalError("%s: is not readable." % (filename))
+
+    # Then ensure the file is up-to-date enough
+    mtime = os.stat(filename)[stat.ST_MTIME]
+    last_modified = time.time() - mtime
+    if last_modified > newer_than:
+        raise CriticalError("%s: was last modified on %s and is too old (> %s seconds)."
+                            % (filename, time.ctime(mtime), newer_than))
+    if last_modified < 0:
+        raise CriticalError("%s: was last modified on %s which is in the future."
+                            % (filename, time.ctime(mtime)))
+
+################################################################################
=== added directory 'hooks/charmhelpers/contrib/charmsupport'
=== added file 'hooks/charmhelpers/contrib/charmsupport/__init__.py'
=== added file 'hooks/charmhelpers/contrib/charmsupport/nrpe.py'
--- hooks/charmhelpers/contrib/charmsupport/nrpe.py 1970-01-01 00:00:00 +0000
+++ hooks/charmhelpers/contrib/charmsupport/nrpe.py 2014-11-17 02:32:28 +0000
@@ -0,0 +1,222 @@
+"""Compatibility with the nrpe-external-master charm"""
+# Copyright 2012 Canonical Ltd.
+#
+# Authors:
+#  Matthew Wedgwood <matthew.wedgwood@canonical.com>
+
+import subprocess
+import pwd
+import grp
+import os
+import re
+import shlex
+import yaml
+
+from charmhelpers.core.hookenv import (
+    config,
+    local_unit,
+    log,
+    relation_ids,
+    relation_set,
+)
+
+from charmhelpers.core.host import service
+
+# This module adds compatibility with the nrpe-external-master and plain nrpe
+# subordinate charms. To use it in your charm:
+#
+# 1. Update metadata.yaml
+#
+#   provides:
+#     (...)
+#     nrpe-external-master:
+#       interface: nrpe-external-master
+#       scope: container
+#
+#   and/or
+#
+#   provides:
+#     (...)
+#     local-monitors:
+#       interface: local-monitors
+#       scope: container
+
+#
+# 2. Add the following to config.yaml
+#
+#    nagios_context:
+#      default: "juju"
+#      type: string
+#      description: |
+#        Used by the nrpe subordinate charms.
+#        A string that will be prepended to instance name to set the host name
+#        in nagios. So for instance the hostname would be something like:
+#            juju-myservice-0
+#        If you're running multiple environments with the same services in them
+#        this allows you to differentiate between them.
+#
+# 3. Add custom checks (Nagios plugins) to files/nrpe-external-master
+#
+# 4. Update your hooks.py with something like this:
+#
+#    from charmsupport.nrpe import NRPE
+#    (...)
+#    def update_nrpe_config():
+#        nrpe_compat = NRPE()
+#        nrpe_compat.add_check(
+#            shortname = "myservice",
+#            description = "Check MyService",
+#            check_cmd = "check_http -w 2 -c 10 http://localhost"
+#            )
+#        nrpe_compat.add_check(
+#            "myservice_other",
+#            "Check for widget failures",
+#            check_cmd = "/srv/myapp/scripts/widget_check"
+#            )
+#        nrpe_compat.write()
+#
+#    def config_changed():
+#        (...)
+#        update_nrpe_config()
+#
+#    def nrpe_external_master_relation_changed():
+#        update_nrpe_config()
+#
+#    def local_monitors_relation_changed():
+#        update_nrpe_config()
+#
+# 5. ln -s hooks.py nrpe-external-master-relation-changed
+#    ln -s hooks.py local-monitors-relation-changed
+
+
+class CheckException(Exception):
+    pass
+
+
+class Check(object):
+    shortname_re = '[A-Za-z0-9-_]+$'
+    service_template = ("""
+#---------------------------------------------------
+# This file is Juju managed
+#---------------------------------------------------
+define service {{
+    use                             active-service
+    host_name                       {nagios_hostname}
+    service_description             {nagios_hostname}[{shortname}] """
+                        """{description}
+    check_command                   check_nrpe!{command}
+    servicegroups                   {nagios_servicegroup}
+}}
+""")
+
+    def __init__(self, shortname, description, check_cmd):
+        super(Check, self).__init__()
+        # XXX: could be better to calculate this from the service name
+        if not re.match(self.shortname_re, shortname):
+            raise CheckException("shortname must match {}".format(
+                Check.shortname_re))
+        self.shortname = shortname
+        self.command = "check_{}".format(shortname)
+        # Note: a set of invalid characters is defined by the
+        # Nagios server config
+        # The default is: illegal_object_name_chars=`~!$%^&*"|'<>?,()=
+        self.description = description
+        self.check_cmd = self._locate_cmd(check_cmd)
+
+    def _locate_cmd(self, check_cmd):
+        search_path = (
+            '/',
+            os.path.join(os.environ['CHARM_DIR'],
+                         'files/nrpe-external-master'),
+            '/usr/lib/nagios/plugins',
+            '/usr/local/lib/nagios/plugins',
+        )
+        parts = shlex.split(check_cmd)
+        for path in search_path:
+            if os.path.exists(os.path.join(path, parts[0])):
+                command = os.path.join(path, parts[0])
+                if len(parts) > 1:
+                    command += " " + " ".join(parts[1:])
+                return command
+        log('Check command not found: {}'.format(parts[0]))
+        return ''
+
+    def write(self, nagios_context, hostname):
+        nrpe_check_file = '/etc/nagios/nrpe.d/{}.cfg'.format(
+            self.command)
+        with open(nrpe_check_file, 'w') as nrpe_check_config:
+            nrpe_check_config.write("# check {}\n".format(self.shortname))
+            nrpe_check_config.write("command[{}]={}\n".format(
+                self.command, self.check_cmd))
+
+        if not os.path.exists(NRPE.nagios_exportdir):
+            log('Not writing service config as {} is not accessible'.format(
+                NRPE.nagios_exportdir))
+        else:
+            self.write_service_config(nagios_context, hostname)
+
+    def write_service_config(self, nagios_context, hostname):
+        for f in os.listdir(NRPE.nagios_exportdir):
+            if re.search('.*{}.cfg'.format(self.command), f):
+                os.remove(os.path.join(NRPE.nagios_exportdir, f))
+
+        templ_vars = {
+            'nagios_hostname': hostname,
+            'nagios_servicegroup': nagios_context,
+            'description': self.description,
+            'shortname': self.shortname,
+            'command': self.command,
+        }
+        nrpe_service_text = Check.service_template.format(**templ_vars)
+        nrpe_service_file = '{}/service__{}_{}.cfg'.format(
+            NRPE.nagios_exportdir, hostname, self.command)
+        with open(nrpe_service_file, 'w') as nrpe_service_config:
+            nrpe_service_config.write(str(nrpe_service_text))
+
+    def run(self):
+        subprocess.call(self.check_cmd)
+
+
+class NRPE(object):
+    nagios_logdir = '/var/log/nagios'
+    nagios_exportdir = '/var/lib/nagios/export'
+    nrpe_confdir = '/etc/nagios/nrpe.d'
+
+    def __init__(self, hostname=None):
+        super(NRPE, self).__init__()
+        self.config = config()
+        self.nagios_context = self.config['nagios_context']
+        self.unit_name = local_unit().replace('/', '-')
+        if hostname:
+            self.hostname = hostname
+        else:
+            self.hostname = "{}-{}".format(self.nagios_context, self.unit_name)
+        self.checks = []
+
+    def add_check(self, *args, **kwargs):
+        self.checks.append(Check(*args, **kwargs))
+
+    def write(self):
+        try:
+            nagios_uid = pwd.getpwnam('nagios').pw_uid
+            nagios_gid = grp.getgrnam('nagios').gr_gid
+        except:
+            log("Nagios user not set up, nrpe checks not updated")
+            return
+
+        if not os.path.exists(NRPE.nagios_logdir):
+            os.mkdir(NRPE.nagios_logdir)
+            os.chown(NRPE.nagios_logdir, nagios_uid, nagios_gid)
+
+        nrpe_monitors = {}
+        monitors = {"monitors": {"remote": {"nrpe": nrpe_monitors}}}
+        for nrpecheck in self.checks:
+            nrpecheck.write(self.nagios_context, self.hostname)
+            nrpe_monitors[nrpecheck.shortname] = {
+                "command": nrpecheck.command,
+            }
+
+        service('restart', 'nagios-nrpe-server')
+
+        for rid in relation_ids("local-monitors"):
+            relation_set(relation_id=rid, monitors=yaml.dump(monitors))
=== added file 'hooks/charmhelpers/contrib/charmsupport/volumes.py'
--- hooks/charmhelpers/contrib/charmsupport/volumes.py 1970-01-01 00:00:00 +0000
+++ hooks/charmhelpers/contrib/charmsupport/volumes.py 2014-11-17 02:32:28 +0000
@@ -0,0 +1,156 @@
+'''
+Functions for managing volumes in juju units. One volume is supported per unit.
+Subordinates may have their own storage, provided it is on its own partition.
+
+Configuration stanzas:
+  volume-ephemeral:
+    type: boolean
+    default: true
+    description: >
+      If false, a volume is mounted as sepecified in "volume-map"
+      If true, ephemeral storage will be used, meaning that log data
+         will only exist as long as the machine. YOU HAVE BEEN WARNED.
+  volume-map:
+    type: string
+    default: {}
+    description: >
+      YAML map of units to device names, e.g:
+        "{ rsyslog/0: /dev/vdb, rsyslog/1: /dev/vdb }"
+      Service units will raise a configure-error if volume-ephemeral
+      is 'true' and no volume-map value is set. Use 'juju set' to set a
+      value and 'juju resolved' to complete configuration.
+
+Usage:
+    from charmsupport.volumes import configure_volume, VolumeConfigurationError
+    from charmsupport.hookenv import log, ERROR
+    def post_mount_hook():
+        stop_service('myservice')
+    def post_mount_hook():
+        start_service('myservice')
+
+    if __name__ == '__main__':
+        try:
+            configure_volume(before_change=pre_mount_hook,
+                             after_change=post_mount_hook)
+        except VolumeConfigurationError:
+            log('Storage could not be configured', ERROR)
+'''
+
+# XXX: Known limitations
+# - fstab is neither consulted nor updated
+
+import os
+from charmhelpers.core import hookenv
+from charmhelpers.core import host
+import yaml
+
+
+MOUNT_BASE = '/srv/juju/volumes'
+
+
+class VolumeConfigurationError(Exception):
+    '''Volume configuration data is missing or invalid'''
+    pass
+
+
+def get_config():
+    '''Gather and sanity-check volume configuration data'''
+    volume_config = {}
+    config = hookenv.config()
+
+    errors = False
+
+    if config.get('volume-ephemeral') in (True, 'True', 'true', 'Yes', 'yes'):
+        volume_config['ephemeral'] = True
+    else:
+        volume_config['ephemeral'] = False
+
+    try:
+        volume_map = yaml.safe_load(config.get('volume-map', '{}'))
+    except yaml.YAMLError as e:
+        hookenv.log("Error parsing YAML volume-map: {}".format(e),
+                    hookenv.ERROR)
+        errors = True
+    if volume_map is None:
+        # probably an empty string
+        volume_map = {}
+    elif not isinstance(volume_map, dict):
+        hookenv.log("Volume-map should be a dictionary, not {}".format(
+            type(volume_map)))
+        errors = True
+
+    volume_config['device'] = volume_map.get(os.environ['JUJU_UNIT_NAME'])
+    if volume_config['device'] and volume_config['ephemeral']:
+        # asked for ephemeral storage but also defined a volume ID
+        hookenv.log('A volume is defined for this unit, but ephemeral '
+                    'storage was requested', hookenv.ERROR)
+        errors = True
+    elif not volume_config['device'] and not volume_config['ephemeral']:
+        # asked for permanent storage but did not define volume ID
+        hookenv.log('Ephemeral storage was requested, but there is no volume '
+                    'defined for this unit.', hookenv.ERROR)
+        errors = True
+
+    unit_mount_name = hookenv.local_unit().replace('/', '-')
+    volume_config['mountpoint'] = os.path.join(MOUNT_BASE, unit_mount_name)
+
+    if errors:
+        return None
+    return volume_config
+
+
+def mount_volume(config):
+    if os.path.exists(config['mountpoint']):
+        if not os.path.isdir(config['mountpoint']):
+            hookenv.log('Not a directory: {}'.format(config['mountpoint']))
+            raise VolumeConfigurationError()
+    else:
+        host.mkdir(config['mountpoint'])
+    if os.path.ismount(config['mountpoint']):
+        unmount_volume(config)
+    if not host.mount(config['device'], config['mountpoint'], persist=True):
+        raise VolumeConfigurationError()
+
+
+def unmount_volume(config):
+    if os.path.ismount(config['mountpoint']):
+        if not host.umount(config['mountpoint'], persist=True):
+            raise VolumeConfigurationError()
+
+
+def managed_mounts():
+    '''List of all mounted managed volumes'''
+    return filter(lambda mount: mount[0].startswith(MOUNT_BASE), host.mounts())
+
+
+def configure_volume(before_change=lambda: None, after_change=lambda: None):
+    '''Set up storage (or don't) according to the charm's volume configuration.
+       Returns the mount point or "ephemeral". before_change and after_change
+       are optional functions to be called if the volume configuration changes.
+    '''
+
+    config = get_config()
+    if not config:
+        hookenv.log('Failed to read volume configuration', hookenv.CRITICAL)
+        raise VolumeConfigurationError()
+
+    if config['ephemeral']:
+        if os.path.ismount(config['mountpoint']):
+            before_change()
+            unmount_volume(config)
+            after_change()
+        return 'ephemeral'
+    else:
+        # persistent storage
+        if os.path.ismount(config['mountpoint']):
+            mounts = dict(managed_mounts())
+            if mounts.get(config['mountpoint']) != config['device']:
+                before_change()
+                unmount_volume(config)
+                mount_volume(config)
+                after_change()
+        else:
+            before_change()
+            mount_volume(config)
+            after_change()
+        return config['mountpoint']
=== modified file 'hooks/glance_relations.py'
--- hooks/glance_relations.py 2014-10-01 22:14:32 +0000
+++ hooks/glance_relations.py 2014-11-17 02:32:28 +0000
@@ -1,5 +1,6 @@
 #!/usr/bin/python
 import sys
+import os
 
 from glance_utils import (
     do_openstack_upgrade,
@@ -7,6 +8,7 @@
     migrate_database,
     register_configs,
     restart_map,
+    services,
     CLUSTER_RES,
     PACKAGES,
     SERVICES,
@@ -30,6 +32,7 @@
     relation_get,
     relation_set,
     relation_ids,
+    relations_of_type,
     service_name,
     unit_get,
     UnregisteredHookError, )
@@ -73,6 +76,8 @@
 
 from charmhelpers.contrib.openstack.context import ADDRESS_TYPES
 
+from charmhelpers.contrib.charmsupport.nrpe import NRPE
+
 from subprocess import (
     check_call,
     call, )
@@ -297,6 +302,8 @@
     open_port(9292)
     configure_https()
 
+    update_nrpe_config()
+
     # Pickup and changes due to network reference architecture
     # configuration
     [keystone_joined(rid) for rid in relation_ids('identity-service')]
@@ -334,6 +341,7 @@
 def upgrade_charm():
     apt_install(filter_installed_packages(PACKAGES), fatal=True)
     configure_https()
+    update_nrpe_config()
     CONFIGS.write_all()
 
 
@@ -446,6 +454,59 @@
         return
     CONFIGS.write(GLANCE_API_CONF)
 
+
+@hooks.hook('nrpe-external-master-relation-joined',
+            'nrpe-external-master-relation-changed')
+def update_nrpe_config():
+    # Find out if nrpe set nagios_hostname
+    hostname = None
+    host_context = None
+    for rel in relations_of_type('nrpe-external-master'):
+        if 'nagios_hostname' in rel:
+            hostname = rel['nagios_hostname']
+            host_context = rel['nagios_host_context']
+            break
+    nrpe = NRPE(hostname=hostname)
+    apt_install('python-dbus')
+
+    if host_context:
+        current_unit = "%s:%s" % (host_context, local_unit())
+    else:
+        current_unit = local_unit()
+
+    services_to_monitor = services()
+
+    for service in services_to_monitor:
+        upstart_init = '/etc/init/%s.conf' % service
+        sysv_init = '/etc/init.d/%s' % service
+
+        if os.path.exists(upstart_init):
+            nrpe.add_check(
+                shortname=service,
+                description='process check {%s}' % current_unit,
+                check_cmd='check_upstart_job %s' % service,
+            )
+        elif os.path.exists(sysv_init):
+            cronpath = '/etc/cron.d/nagios-service-check-%s' % service
+            checkpath = os.path.join(os.environ['CHARM_DIR'],
+                                     'files/nrpe-external-master',
+                                     'check_exit_status.pl'),
+            cron_template = '*/5 * * * * root \
+%s -s /etc/init.d/%s status > /var/lib/nagios/service-check-%s.txt\n' \
+                % (checkpath[0], service, service)
+            f = open(cronpath, 'w')
+            f.write(cron_template)
+            f.close()
+            nrpe.add_check(
+                shortname=service,
+                description='process check {%s}' % current_unit,
+                check_cmd='check_status_file.py -f \
+                    /var/lib/nagios/service-check-%s.txt' % service,
+            )
+
+    nrpe.write()
+
+
 if __name__ == '__main__':
     try:
         hooks.execute(sys.argv)
=== added symlink 'hooks/nrpe-external-master-relation-changed'
=== target is u'glance_relations.py'
=== added symlink 'hooks/nrpe-external-master-relation-joined'
=== target is u'glance_relations.py'
=== modified file 'metadata.yaml'
--- metadata.yaml 2014-09-11 07:05:30 +0000
+++ metadata.yaml 2014-11-17 02:32:28 +0000
@@ -9,6 +9,9 @@
 categories:
   - miscellaneous
 provides:
+  nrpe-external-master:
+    interface: nrpe-external-master
+    scope: container
   image-service:
     interface: glance
 requires:
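
Note on the sysv branch of update_nrpe_config() in the hooks/glance_relations.py diff above: the trailing comma after the os.path.join(...) call makes checkpath a one-element tuple, which is why the cron template indexes it as checkpath[0]. The following is a minimal sketch of the same cron-line construction with checkpath kept as a plain string. The helper name build_cron_line and the example charm directory '/srv/charm' are hypothetical and not part of this branch; the path components and the cron line format are copied from the diff.

```python
import os


def build_cron_line(charm_dir, service):
    """Return the cron.d entry for a sysv-managed service.

    Hypothetical helper for illustration; the branch builds this
    string inline inside update_nrpe_config().
    """
    # No trailing comma here, so checkpath is a str rather than a tuple
    # and no [0] indexing is needed below.
    checkpath = os.path.join(charm_dir,
                             'files/nrpe-external-master',
                             'check_exit_status.pl')
    # Same format as the diff: run the init script's "status" action every
    # five minutes and write the result where check_status_file.py reads it.
    return ('*/5 * * * * root '
            '%s -s /etc/init.d/%s status > '
            '/var/lib/nagios/service-check-%s.txt\n'
            % (checkpath, service, service))


# Example values only; in the hook the directory comes from
# os.environ['CHARM_DIR'].
line = build_cron_line('/srv/charm', 'glance-api')
```

Using implicit string concatenation instead of a backslash continuation inside the literal also avoids having to start the continued line at column 0.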
