Merge lp:~brad-marshall/charms/trusty/ceilometer/add-nrpe-checks into lp:~openstack-charmers-archive/charms/trusty/ceilometer/trunk

Proposed by Brad Marshall
Status: Superseded
Proposed branch: lp:~brad-marshall/charms/trusty/ceilometer/add-nrpe-checks
Merge into: lp:~openstack-charmers-archive/charms/trusty/ceilometer/trunk
Diff against target: 1020 lines (+869/-1)
11 files modified
charm-helpers.yaml (+1/-0)
config.yaml (+10/-1)
files/nrpe-external-master/check_exit_status.pl (+189/-0)
files/nrpe-external-master/check_status_file.py (+60/-0)
files/nrpe-external-master/check_upstart_job (+72/-0)
files/nrpe-external-master/nagios_plugin.py (+78/-0)
hooks/ceilometer_hooks.py (+59/-0)
hooks/ceilometer_utils.py (+19/-0)
hooks/charmhelpers/contrib/charmsupport/nrpe.py (+222/-0)
hooks/charmhelpers/contrib/charmsupport/volumes.py (+156/-0)
metadata.yaml (+3/-0)
To merge this branch: bzr merge lp:~brad-marshall/charms/trusty/ceilometer/add-nrpe-checks
Reviewer Review Type Date Requested Status
Liam Young (community) Needs Fixing
Review via email: mp+241498@code.launchpad.net

This proposal has been superseded by a proposal from 2014-11-17.

Description of the change

Adds nrpe-external-master interface and adds basic nrpe checks.

Ryan Beisner (1chb1n) wrote :

UOSCI bot says:
charm_lint_check #992 trusty-ceilometer for brad-marshall mp241498
    LINT FAIL: lint-test failed

LINT Results (max last 5 lines):
ERROR:root:Make target returned non-zero.
  hooks/ceilometer_hooks.py:146:80: E501 line too long (92 > 79 characters)
  hooks/ceilometer_hooks.py:174:22: E251 unexpected spaces around keyword / parameter equals
  hooks/ceilometer_hooks.py:174:24: E251 unexpected spaces around keyword / parameter equals
  make: *** [lint] Error 1

Full lint test output: http://paste.ubuntu.com/8955755/
Build: http://10.98.191.181:8080/job/charm_lint_check/992/

Ryan Beisner (1chb1n) wrote :

UOSCI bot says:
charm_unit_test #827 trusty-ceilometer for brad-marshall mp241498
    UNIT FAIL: unit-test failed

UNIT Results (max last 5 lines):
  hooks/ceilometer_utils 58 1 98% 100
  TOTAL 180 24 87%
  Ran 25 tests in 0.729s
  FAILED (errors=3)
  make: *** [test] Error 1

Full unit test output: http://paste.ubuntu.com/8955757/
Build: http://10.98.191.181:8080/job/charm_unit_test/827/

Ryan Beisner (1chb1n) wrote :

UOSCI bot says:
charm_amulet_test #372 trusty-ceilometer for brad-marshall mp241498
    AMULET FAIL: amulet-test missing

AMULET Results (max last 5 lines):
INFO:root:Workspace dir: /var/lib/jenkins/workspace/charm_amulet_test
INFO:root:Reading file: Makefile
INFO:root:Searching for: ['@juju test']
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/8955875/
Build: http://10.98.191.181:8080/job/charm_amulet_test/372/

Liam Young (gnuoy) wrote :

Thank you for the mp, the nagios checks are sorely needed. It looks fine; there are just a few things it would be good to get fixed up.

The list of services that comprise a ceilometer deployment is already compiled as part of the service context in ceilometer_utils.py. This includes logic to adjust the list depending on which OpenStack release is being deployed. I think this mechanism should be used rather than defining a new list directly in update_nrpe_config().

That being said, it looks like the existing charm has a list of new icehouse packages which are being added to the package list in ceilometer_utils.py, but the corresponding services are not being added to the CONFIG_FILES OrderedDict. This means that the ceilometer-alarm* and ceilometer-agent-notification services are not going to be restarted on changes to ceilometer.conf.

So, what I think is needed is:
1) Steal the services() method from nova-cloud-controller and use that to get a list of services in update_nrpe_config()
2) Define a list of ICEHOUSE_SERVICES (probably exactly the same as ICEHOUSE_PACKAGES) and conditionally add it (depending on the OpenStack release) to the CEILOMETER_CONF services in register_configs():
    if (get_os_codename_install_source(config('openstack-origin'))
            >= 'icehouse'):
        CONFIG_FILES[CEILOMETER_CONF]['services'] = CONFIG_FILES[CEILOMETER_CONF]['services'] + ICEHOUSE_SERVICES
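
The two steps above can be sketched as follows. This is only an illustration, not the charm's code: the base service list, the build_config_files() helper and its release argument are assumptions made so the snippet runs on its own, whereas the real charm reads the release via config('openstack-origin') and keeps CONFIG_FILES at module level. The >= comparison also relies on OpenStack release codenames of that era sorting alphabetically.

```python
from collections import OrderedDict

# Illustrative values only; the real charm defines these in ceilometer_utils.py.
CEILOMETER_CONF = '/etc/ceilometer/ceilometer.conf'
BASE_SERVICES = [
    'ceilometer-agent-central',
    'ceilometer-collector',
    'ceilometer-api',
]
ICEHOUSE_SERVICES = [
    'ceilometer-alarm-notifier',
    'ceilometer-alarm-evaluator',
    'ceilometer-agent-notification',
]


def build_config_files(release):
    """Hypothetical helper: build the config map for a release codename."""
    config_files = OrderedDict([
        (CEILOMETER_CONF, {'services': list(BASE_SERVICES)}),
    ])
    # Codenames are compared as strings, so 'havana' < 'icehouse' < 'juno'.
    if release >= 'icehouse':
        config_files[CEILOMETER_CONF]['services'] = (
            config_files[CEILOMETER_CONF]['services'] + ICEHOUSE_SERVICES)
    return config_files


def restart_map(config_files):
    """Map each config file to the services restarted when it changes."""
    return OrderedDict(
        (path, entry['services'])
        for path, entry in config_files.items()
        if entry['services'])


def services(config_files):
    """Return the de-duplicated, sorted list of services in the restart map."""
    found = set()
    for svc_list in restart_map(config_files).values():
        found.update(svc_list)
    return sorted(found)
```

Deriving the monitored services from the restart map means the NRPE checks and the restart-on-change logic cannot drift apart when a release adds daemons.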

review: Needs Fixing
Liam Young (gnuoy) wrote :

Also, could you move the check_upstart_job into charmhelpers as it seems to be common across these mps?

62. By Brad Marshall

[bradm] Tweaked nagios checks to use functions to pull out services, added checks for sysv init style daemons, added in icehouse daemons, ran pep8 over the whole thing
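
The init-style handling mentioned in this revision could look roughly like the sketch below. pick_check() and its exists parameter are hypothetical names introduced here for illustration. The paths and check commands follow those used in update_nrpe_config() in the diff, and the cron job that writes the status file is left out.

```python
import os


def pick_check(service, exists=os.path.exists):
    """Return (init_style, check_cmd) for a service, or None if unmanaged.

    `exists` is injectable so the decision can be exercised without
    touching the real filesystem.
    """
    upstart_conf = '/etc/init/%s.conf' % service
    sysv_script = '/etc/init.d/%s' % service
    if exists(upstart_conf):
        # Upstart jobs are queried directly by check_upstart_job.
        return ('upstart', 'check_upstart_job %s' % service)
    if exists(sysv_script):
        # SysV daemons are checked through a status file that a
        # root cron job is expected to refresh.
        status_file = '/var/lib/nagios/service-check-%s.txt' % service
        return ('sysv', 'check_status_file.py -f %s' % status_file)
    return None
```

Keeping the decision in a small pure function makes it straightforward to unit test, which may help with the failing unit-test runs reported above.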

uosci-testing-bot (uosci-testing-bot) wrote :

UOSCI bot says:
charm_unit_test #917 trusty-ceilometer for brad-marshall mp241498
    UNIT FAIL: unit-test failed

UNIT Results (max last 5 lines):
  hooks/ceilometer_utils 66 6 91% 101, 111, 136-139
  TOTAL 200 41 80%
  Ran 25 tests in 0.747s
  FAILED (errors=3)
  make: *** [test] Error 1

Full unit test output: http://paste.ubuntu.com/9051427/
Build: http://10.98.191.181:8080/job/charm_unit_test/917/

uosci-testing-bot (uosci-testing-bot) wrote :

UOSCI bot says:
charm_amulet_test #425 trusty-ceilometer for brad-marshall mp241498
    AMULET FAIL: amulet-test missing

AMULET Results (max last 5 lines):
INFO:root:Workspace dir: /var/lib/jenkins/workspace/charm_amulet_test
INFO:root:Reading file: Makefile
INFO:root:Searching for: ['@juju test']
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/9051426/
Build: http://10.98.191.181:8080/job/charm_amulet_test/425/

uosci-testing-bot (uosci-testing-bot) wrote :

UOSCI bot says:
charm_lint_check #1083 trusty-ceilometer for brad-marshall mp241498
    LINT OK: passed

LINT Results (max last 5 lines):
  I: config.yaml: option os-internal-network has no default value
  I: config.yaml: option os-admin-network has no default value
  I: config.yaml: option ssl_ca has no default value
  I: config.yaml: option ssl_cert has no default value
  I: config.yaml: option os-public-network has no default value

Full lint test output: http://paste.ubuntu.com/9051428/
Build: http://10.98.191.181:8080/job/charm_lint_check/1083/

63. By Brad Marshall

[bradm] Removed puppet header from nagios_plugin module

64. By Brad Marshall

[bradm] Removed nagios check files that were moved to nrpe-external-master charm

Unmerged revisions

Preview Diff

1=== modified file 'charm-helpers.yaml'
2--- charm-helpers.yaml 2014-07-24 10:23:25 +0000
3+++ charm-helpers.yaml 2014-11-17 02:28:34 +0000
4@@ -7,3 +7,4 @@
5 - contrib.hahelpers
6 - contrib.storage.linux
7 - contrib.network.ip
8+ - contrib.charmsupport
9
10=== modified file 'config.yaml'
11--- config.yaml 2014-10-01 15:22:55 +0000
12+++ config.yaml 2014-11-17 02:28:34 +0000
13@@ -58,6 +58,16 @@
14 description: |
15 SSL CA to use with the certificate and key provided - this is only
16 required if you are providing a privately signed ssl_cert and ssl_key.
17+ nagios_context:
18+ default: "juju"
19+ type: string
20+ description: |
21+ Used by the nrpe-external-master subordinate charm.
22+ A string that will be prepended to instance name to set the host name
23+ in nagios. So for instance the hostname would be something like:
24+ juju-myservice-0
25+ If you're running multiple environments with the same services in them
26+ this allows you to differentiate between them.
27 # Network configuration options
28 # by default all access is over 'private-address'
29 os-admin-network:
30@@ -84,4 +94,3 @@
31 192.168.0.0/24)
32 .
33 This network will be used for public endpoints.
34-
35
36=== added directory 'files'
37=== added directory 'files/nrpe-external-master'
38=== added file 'files/nrpe-external-master/check_exit_status.pl'
39--- files/nrpe-external-master/check_exit_status.pl 1970-01-01 00:00:00 +0000
40+++ files/nrpe-external-master/check_exit_status.pl 2014-11-17 02:28:34 +0000
41@@ -0,0 +1,189 @@
42+#!/usr/bin/perl
43+################################################################################
44+# #
45+# Copyright (C) 2011 Chad Columbus <ccolumbu@hotmail.com> #
46+# #
47+# This program is free software; you can redistribute it and/or modify #
48+# it under the terms of the GNU General Public License as published by #
49+# the Free Software Foundation; either version 2 of the License, or #
50+# (at your option) any later version. #
51+# #
52+# This program is distributed in the hope that it will be useful, #
53+# but WITHOUT ANY WARRANTY; without even the implied warranty of #
54+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
55+# GNU General Public License for more details. #
56+# #
57+# You should have received a copy of the GNU General Public License #
58+# along with this program; if not, write to the Free Software #
59+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA #
60+# #
61+################################################################################
62+
63+use strict;
64+use Getopt::Std;
65+$| = 1;
66+
67+my %opts;
68+getopts('heronp:s:', \%opts);
69+
70+my $VERSION = "Version 1.0";
71+my $AUTHOR = '(c) 2011 Chad Columbus <ccolumbu@hotmail.com>';
72+
73+# Default values:
74+my $script_to_check;
75+my $pattern = 'is running';
76+my $cmd;
77+my $message;
78+my $error;
79+
80+# Exit codes
81+my $STATE_OK = 0;
82+my $STATE_WARNING = 1;
83+my $STATE_CRITICAL = 2;
84+my $STATE_UNKNOWN = 3;
85+
86+# Parse command line options
87+if ($opts{'h'} || scalar(%opts) == 0) {
88+ &print_help();
89+ exit($STATE_OK);
90+}
91+
92+# Make sure script is provided:
93+if ($opts{'s'} eq '') {
94+ # Script to run not provided
95+ print "\nYou must provide a script to run. Example: -s /etc/init.d/httpd\n";
96+ exit($STATE_UNKNOWN);
97+} else {
98+ $script_to_check = $opts{'s'};
99+}
100+
101+# Make sure only a-z, 0-9, /, _, and - are used in the script.
102+if ($script_to_check =~ /[^a-z0-9\_\-\/\.]/) {
103+ # Script contains illegal characters exit.
104+ print "\nScript to check can only contain Letters, Numbers, Periods, Underscores, Hyphens, and/or Slashes\n";
105+ exit($STATE_UNKNOWN);
106+}
107+
108+# See if script is executable
109+if (! -x "$script_to_check") {
110+ print "\nIt appears you can't execute $script_to_check, $!\n";
111+ exit($STATE_UNKNOWN);
112+}
113+
114+# If a pattern is provided use it:
115+if ($opts{'p'} ne '') {
116+ $pattern = $opts{'p'};
117+}
118+
119+# If -r run command via sudo as root:
120+if ($opts{'r'}) {
121+ $cmd = "sudo -n $script_to_check status" . ' 2>&1';
122+} else {
123+ $cmd = "$script_to_check status" . ' 2>&1';
124+}
125+
126+my $cmd_result = `$cmd`;
127+chomp($cmd_result);
128+if ($cmd_result =~ /sudo/i) {
129+ # This means it could not run the sudo command
130+ $message = "$script_to_check CRITICAL - Could not run: 'sudo -n $script_to_check status'. Result is $cmd_result";
131+ $error = $STATE_UNKNOWN;
132+} else {
133+ # Check exitstatus instead of output:
134+ if ($opts{'e'} == 1) {
135+ if ($? != 0) {
136+ # error
137+ $message = "$script_to_check CRITICAL - Exit code: $?\.";
138+ if ($opts{'o'} == 0) {
139+ $message .= " $cmd_result";
140+ }
141+ $error = $STATE_CRITICAL;
142+ } else {
143+ # success
144+ $message = "$script_to_check OK - Exit code: $?\.";
145+ if ($opts{'o'} == 0) {
146+ $message .= " $cmd_result";
147+ }
148+ $error = $STATE_OK;
149+ }
150+ } else {
151+ my $not_check = 1;
152+ if ($opts{'n'} == 1) {
153+ $not_check = 0;
154+ }
155+ if (($cmd_result =~ /$pattern/i) == $not_check) {
156+ $message = "$script_to_check OK";
157+ if ($opts{'o'} == 0) {
158+ $message .= " - $cmd_result";
159+ }
160+ $error = $STATE_OK;
161+ } else {
162+ $message = "$script_to_check CRITICAL";
163+ if ($opts{'o'} == 0) {
164+ $message .= " - $cmd_result";
165+ }
166+ $error = $STATE_CRITICAL;
167+ }
168+ }
169+}
170+
171+if ($message eq '') {
172+ print "Error: program failed in an unknown way\n";
173+ exit($STATE_UNKNOWN);
174+}
175+
176+if ($error) {
177+ print "$message\n";
178+ exit($error);
179+} else {
180+ # If we get here we are OK
181+ print "$message\n";
182+ exit($STATE_OK);
183+}
184+
185+####################################
186+# Start Subs:
187+####################################
188+sub print_help() {
189+ print << "EOF";
190+Check the output or exit status of a script.
191+$VERSION
192+$AUTHOR
193+
194+Options:
195+-h
196+ Print detailed help screen
197+
198+-s
199+ 'FULL PATH TO SCRIPT' (required)
200+ This is the script to run, the script is designed to run scripts in the
201+ /etc/init.d dir (but can run any script) and will call the script with
202+ a 'status' argument. So if you use another script make sure it will
203+ work with /path/script status, example: /etc/init.d/httpd status
204+
205+-e
206+ This is the "exitstatus" flag, it means check the exit status
207+ code instead of looking for a pattern in the output of the script.
208+
209+-p 'REGEX'
210+ This is a pattern to look for in the output of the script to confirm it
211+ is running, default is 'is running', but not all init.d scripts output
212+ (iptables), so you can specify an arbitrary pattern.
213+ All patterns are case insensitive.
214+
215+-n
216+ This is the "NOT" flag, it means not the -p pattern, so if you want to
217+ make sure the output of the script does NOT contain -p 'REGEX'
218+
219+-r
220+ This is the "ROOT" flag, it means run as root via sudo. You will need a
221+ line in your /etc/sudoers file like:
222+ nagios ALL=(root) NOPASSWD: /etc/init.d/* status
223+
224+-o
225+ This is the "SUPPRESS OUTPUT" flag. Some programs have a long output
226+ (like iptables), this flag suppresses that output so it is not printed
227+ as a part of the nagios message.
228+EOF
229+}
230+
231
232=== added file 'files/nrpe-external-master/check_status_file.py'
233--- files/nrpe-external-master/check_status_file.py 1970-01-01 00:00:00 +0000
234+++ files/nrpe-external-master/check_status_file.py 2014-11-17 02:28:34 +0000
235@@ -0,0 +1,60 @@
236+#!/usr/bin/python
237+
238+# m
239+# mmmm m m mmmm mmmm mmm mm#mm
240+# #" "# # # #" "# #" "# #" # #
241+# # # # # # # # # #"""" #
242+# ##m#" "mm"# ##m#" ##m#" "#mm" "mm
243+# # # #
244+# " " "
245+# This file is managed by puppet. Do not make local changes.
246+
247+#
248+# Copyright 2014 Canonical Ltd.
249+#
250+# Author: Jacek Nykis <jacek.nykis@canonical.com>
251+#
252+
253+import re
254+import nagios_plugin
255+
256+
257+def parse_args():
258+ import argparse
259+
260+ parser = argparse.ArgumentParser(
261+ description='Read file and return nagios status based on its content',
262+ formatter_class=argparse.ArgumentDefaultsHelpFormatter)
263+ parser.add_argument('-f', '--status-file', required=True,
264+ help='Status file path')
265+ parser.add_argument('-c', '--critical-text', default='CRITICAL',
266+ help='String indicating critical status')
267+ parser.add_argument('-w', '--warning-text', default='WARNING',
268+ help='String indicating warning status')
269+ parser.add_argument('-o', '--ok-text', default='OK',
270+ help='String indicating OK status')
271+ parser.add_argument('-u', '--unknown-text', default='UNKNOWN',
272+ help='String indicating unknown status')
273+ return parser.parse_args()
274+
275+
276+def check_status(args):
277+ nagios_plugin.check_file_freshness(args.status_file, 43200)
278+
279+ with open(args.status_file, "r") as f:
280+ content = [l.strip() for l in f.readlines()]
281+
282+ for line in content:
283+ if re.search(args.critical_text, line):
284+ raise nagios_plugin.CriticalError(line)
285+ elif re.search(args.warning_text, line):
286+ raise nagios_plugin.WarnError(line)
287+ elif re.search(args.unknown_text, line):
288+ raise nagios_plugin.UnknownError(line)
289+ else:
290+ print line
291+
292+
293+if __name__ == '__main__':
294+ args = parse_args()
295+ nagios_plugin.try_check(check_status, args)
296
297=== added file 'files/nrpe-external-master/check_upstart_job'
298--- files/nrpe-external-master/check_upstart_job 1970-01-01 00:00:00 +0000
299+++ files/nrpe-external-master/check_upstart_job 2014-11-17 02:28:34 +0000
300@@ -0,0 +1,72 @@
301+#!/usr/bin/python
302+
303+#
304+# Copyright 2012, 2013 Canonical Ltd.
305+#
306+# Author: Paul Collins <paul.collins@canonical.com>
307+#
308+# Based on http://www.eurion.net/python-snippets/snippet/Upstart%20service%20status.html
309+#
310+
311+import sys
312+
313+import dbus
314+
315+
316+class Upstart(object):
317+ def __init__(self):
318+ self._bus = dbus.SystemBus()
319+ self._upstart = self._bus.get_object('com.ubuntu.Upstart',
320+ '/com/ubuntu/Upstart')
321+ def get_job(self, job_name):
322+ path = self._upstart.GetJobByName(job_name,
323+ dbus_interface='com.ubuntu.Upstart0_6')
324+ return self._bus.get_object('com.ubuntu.Upstart', path)
325+
326+ def get_properties(self, job):
327+ path = job.GetInstance([], dbus_interface='com.ubuntu.Upstart0_6.Job')
328+ instance = self._bus.get_object('com.ubuntu.Upstart', path)
329+ return instance.GetAll('com.ubuntu.Upstart0_6.Instance',
330+ dbus_interface=dbus.PROPERTIES_IFACE)
331+
332+ def get_job_instances(self, job_name):
333+ job = self.get_job(job_name)
334+ paths = job.GetAllInstances([], dbus_interface='com.ubuntu.Upstart0_6.Job')
335+ return [self._bus.get_object('com.ubuntu.Upstart', path) for path in paths]
336+
337+ def get_job_instance_properties(self, job):
338+ return job.GetAll('com.ubuntu.Upstart0_6.Instance',
339+ dbus_interface=dbus.PROPERTIES_IFACE)
340+
341+try:
342+ upstart = Upstart()
343+ try:
344+ job = upstart.get_job(sys.argv[1])
345+ props = upstart.get_properties(job)
346+
347+ if props['state'] == 'running':
348+ print 'OK: %s is running' % sys.argv[1]
349+ sys.exit(0)
350+ else:
351+ print 'CRITICAL: %s is not running' % sys.argv[1]
352+ sys.exit(2)
353+
354+ except dbus.DBusException as e:
355+ instances = upstart.get_job_instances(sys.argv[1])
356+ propses = [upstart.get_job_instance_properties(instance) for instance in instances]
357+ states = dict([(props['name'], props['state']) for props in propses])
358+ if len(states) != states.values().count('running'):
359+ not_running = []
360+ for name in states.keys():
361+ if states[name] != 'running':
362+ not_running.append(name)
363+ print 'CRITICAL: %d instances of %s not running: %s' % \
364+ (len(not_running), sys.argv[1], ', '.join(not_running))
365+ sys.exit(2)
366+ else:
367+ print 'OK: %d instances of %s running' % (len(states), sys.argv[1])
368+
369+except dbus.DBusException as e:
370+ print 'CRITICAL: failed to get properties of \'%s\' from upstart' % sys.argv[1]
371+ sys.exit(2)
372+
373
374=== added file 'files/nrpe-external-master/nagios_plugin.py'
375--- files/nrpe-external-master/nagios_plugin.py 1970-01-01 00:00:00 +0000
376+++ files/nrpe-external-master/nagios_plugin.py 2014-11-17 02:28:34 +0000
377@@ -0,0 +1,78 @@
378+#!/usr/bin/env python
379+# m
380+# mmmm m m mmmm mmmm mmm mm#mm
381+# #" "# # # #" "# #" "# #" # #
382+# # # # # # # # # #"""" #
383+# ##m#" "mm"# ##m#" ##m#" "#mm" "mm
384+# # # #
385+# " " "
386+# This file is managed by puppet. Do not make local changes.
387+
388+# Copyright (C) 2005, 2006, 2007, 2012 James Troup <james.troup@canonical.com>
389+
390+import os
391+import stat
392+import time
393+import traceback
394+import sys
395+
396+
397+################################################################################
398+
399+class CriticalError(Exception):
400+ """This indicates a critical error."""
401+ pass
402+
403+
404+class WarnError(Exception):
405+ """This indicates a warning condition."""
406+ pass
407+
408+
409+class UnknownError(Exception):
410+ """This indicates an unknown error was encountered."""
411+ pass
412+
413+
414+def try_check(function, *args, **kwargs):
415+ """Perform a check with error/warn/unknown handling."""
416+ try:
417+ function(*args, **kwargs)
418+ except UnknownError, msg:
419+ print msg
420+ sys.exit(3)
421+ except CriticalError, msg:
422+ print msg
423+ sys.exit(2)
424+ except WarnError, msg:
425+ print msg
426+ sys.exit(1)
427+ except:
428+ print "%s raised unknown exception '%s'" % (function, sys.exc_info()[0])
429+ print '=' * 60
430+ traceback.print_exc(file=sys.stdout)
431+ print '=' * 60
432+ sys.exit(3)
433+
434+
435+################################################################################
436+
437+def check_file_freshness(filename, newer_than=600):
438+ """Check a file exists, is readable and is newer than <n> seconds (where <n> defaults to 600)."""
439+ # First check the file exists and is readable
440+ if not os.path.exists(filename):
441+ raise CriticalError("%s: does not exist." % (filename))
442+ if os.access(filename, os.R_OK) == 0:
443+ raise CriticalError("%s: is not readable." % (filename))
444+
445+ # Then ensure the file is up-to-date enough
446+ mtime = os.stat(filename)[stat.ST_MTIME]
447+ last_modified = time.time() - mtime
448+ if last_modified > newer_than:
449+ raise CriticalError("%s: was last modified on %s and is too old (> %s seconds)."
450+ % (filename, time.ctime(mtime), newer_than))
451+ if last_modified < 0:
452+ raise CriticalError("%s: was last modified on %s which is in the future."
453+ % (filename, time.ctime(mtime)))
454+
455+################################################################################
456
457=== modified file 'hooks/ceilometer_hooks.py'
458--- hooks/ceilometer_hooks.py 2014-10-01 15:22:55 +0000
459+++ hooks/ceilometer_hooks.py 2014-11-17 02:28:34 +0000
460@@ -2,14 +2,17 @@
461
462 import base64
463 import sys
464+import os
465 from charmhelpers.fetch import (
466 apt_install, filter_installed_packages,
467 apt_update
468 )
469 from charmhelpers.core.hookenv import (
470 open_port,
471+ local_unit,
472 relation_set,
473 relation_ids,
474+ relations_of_type,
475 config,
476 Hooks, UnregisteredHookError,
477 log
478@@ -29,6 +32,7 @@
479 CEILOMETER_ROLE,
480 register_configs,
481 restart_map,
482+ services,
483 get_ceilometer_context,
484 do_openstack_upgrade
485 )
486@@ -37,6 +41,7 @@
487 canonical_url,
488 PUBLIC, INTERNAL, ADMIN
489 )
490+from charmhelpers.contrib.charmsupport.nrpe import NRPE
491
492 hooks = Hooks()
493 CONFIGS = register_configs()
494@@ -89,6 +94,7 @@
495 def config_changed():
496 if openstack_upgrade_available('ceilometer-common'):
497 do_openstack_upgrade(CONFIGS)
498+ update_nrpe_config()
499 CONFIGS.write_all()
500 ceilometer_joined()
501 for rid in relation_ids('identity-service'):
502@@ -98,6 +104,7 @@
503 @hooks.hook('upgrade-charm')
504 def upgrade_charm():
505 install()
506+ update_nrpe_config()
507 any_changed()
508
509
510@@ -137,6 +144,58 @@
511 for relid in relation_ids('ceilometer-service'):
512 relation_set(relid, context)
513
514+
515+@hooks.hook('nrpe-external-master-relation-joined',
516+ 'nrpe-external-master-relation-changed')
517+def update_nrpe_config():
518+ # Find out if nrpe set nagios_hostname
519+ hostname = None
520+ host_context = None
521+ for rel in relations_of_type('nrpe-external-master'):
522+ if 'nagios_hostname' in rel:
523+ hostname = rel['nagios_hostname']
524+ host_context = rel['nagios_host_context']
525+ break
526+ nrpe = NRPE(hostname=hostname)
527+ apt_install('python-dbus')
528+
529+ if host_context:
530+ current_unit = "%s:%s" % (host_context, local_unit())
531+ else:
532+ current_unit = local_unit()
533+
534+ services_to_monitor = services()
535+
536+ for service in services_to_monitor:
537+ upstart_init = '/etc/init/%s.conf' % service
538+ sysv_init = '/etc/init.d/%s' % service
539+ if os.path.exists(upstart_init):
540+ nrpe.add_check(
541+ shortname=service,
542+ description='process check {%s}' % current_unit,
543+ check_cmd='check_upstart_job %s' % service,
544+ )
545+ elif os.path.exists(sysv_init):
546+ cronpath = '/etc/cron.d/nagios-service-check-%s' % service
547+ checkpath = os.path.join(os.environ['CHARM_DIR'],
548+ 'files/nrpe-external-master',
549+ 'check_exit_status.pl'),
550+ cron_template = '*/5 * * * * root \
551+%s -s /etc/init.d/%s \
552+status > /var/lib/nagios/service-check-%s.txt\n' \
553+ % (checkpath[0], service, service)
554+ f = open(cronpath, 'w')
555+ f.write(cron_template)
556+ f.close()
557+ nrpe.add_check(
558+ shortname=service,
559+ description='process check {%s}' % current_unit,
560+ check_cmd='check_status_file.py -f \
561+ /var/lib/nagios/service-check-%s.txt' % service,
562+ )
563+
564+ nrpe.write()
565+
566 if __name__ == '__main__':
567 try:
568 hooks.execute(sys.argv)
569
570=== modified file 'hooks/ceilometer_utils.py'
571--- hooks/ceilometer_utils.py 2014-10-23 16:03:49 +0000
572+++ hooks/ceilometer_utils.py 2014-11-17 02:28:34 +0000
573@@ -50,6 +50,12 @@
574 'ceilometer-agent-notification'
575 ]
576
577+ICEHOUSE_SERVICES = [
578+ 'ceilometer-alarm-notifier',
579+ 'ceilometer-alarm-evaluator',
580+ 'ceilometer-agent-notification'
581+]
582+
583 CEILOMETER_ROLE = "ResellerAdmin"
584
585
586@@ -90,6 +96,11 @@
587 configs = templating.OSConfigRenderer(templates_dir=TEMPLATES,
588 openstack_release=release)
589
590+ if (get_os_codename_install_source(config('openstack-origin'))
591+ >= 'icehouse'):
592+ CONFIG_FILES[CEILOMETER_CONF]['services'] = \
593+ CONFIG_FILES[CEILOMETER_CONF]['services'] + ICEHOUSE_SERVICES
594+
595 for conf in CONFIG_FILES:
596 configs.register(conf, CONFIG_FILES[conf]['hook_contexts'])
597
598@@ -120,6 +131,14 @@
599 return _map
600
601
602+def services():
603+ ''' Returns a list of services associated with this charm '''
604+ _services = []
605+ for v in restart_map().values():
606+ _services = _services + v
607+ return list(set(_services))
608+
609+
610 def get_ceilometer_context():
611 ''' Retrieve a map of all current relation data for agent configuration '''
612 ctxt = {}
613
614=== added directory 'hooks/charmhelpers/contrib/charmsupport'
615=== added file 'hooks/charmhelpers/contrib/charmsupport/__init__.py'
616=== added file 'hooks/charmhelpers/contrib/charmsupport/nrpe.py'
617--- hooks/charmhelpers/contrib/charmsupport/nrpe.py 1970-01-01 00:00:00 +0000
618+++ hooks/charmhelpers/contrib/charmsupport/nrpe.py 2014-11-17 02:28:34 +0000
619@@ -0,0 +1,222 @@
620+"""Compatibility with the nrpe-external-master charm"""
621+# Copyright 2012 Canonical Ltd.
622+#
623+# Authors:
624+# Matthew Wedgwood <matthew.wedgwood@canonical.com>
625+
626+import subprocess
627+import pwd
628+import grp
629+import os
630+import re
631+import shlex
632+import yaml
633+
634+from charmhelpers.core.hookenv import (
635+ config,
636+ local_unit,
637+ log,
638+ relation_ids,
639+ relation_set,
640+)
641+
642+from charmhelpers.core.host import service
643+
644+# This module adds compatibility with the nrpe-external-master and plain nrpe
645+# subordinate charms. To use it in your charm:
646+#
647+# 1. Update metadata.yaml
648+#
649+# provides:
650+# (...)
651+# nrpe-external-master:
652+# interface: nrpe-external-master
653+# scope: container
654+#
655+# and/or
656+#
657+# provides:
658+# (...)
659+# local-monitors:
660+# interface: local-monitors
661+# scope: container
662+
663+#
664+# 2. Add the following to config.yaml
665+#
666+# nagios_context:
667+# default: "juju"
668+# type: string
669+# description: |
670+# Used by the nrpe subordinate charms.
671+# A string that will be prepended to instance name to set the host name
672+# in nagios. So for instance the hostname would be something like:
673+# juju-myservice-0
674+# If you're running multiple environments with the same services in them
675+# this allows you to differentiate between them.
676+#
677+# 3. Add custom checks (Nagios plugins) to files/nrpe-external-master
678+#
679+# 4. Update your hooks.py with something like this:
680+#
681+# from charmsupport.nrpe import NRPE
682+# (...)
683+# def update_nrpe_config():
684+# nrpe_compat = NRPE()
685+# nrpe_compat.add_check(
686+# shortname = "myservice",
687+# description = "Check MyService",
688+# check_cmd = "check_http -w 2 -c 10 http://localhost"
689+# )
690+# nrpe_compat.add_check(
691+# "myservice_other",
692+# "Check for widget failures",
693+# check_cmd = "/srv/myapp/scripts/widget_check"
694+# )
695+# nrpe_compat.write()
696+#
697+# def config_changed():
698+# (...)
699+# update_nrpe_config()
700+#
701+# def nrpe_external_master_relation_changed():
702+# update_nrpe_config()
703+#
704+# def local_monitors_relation_changed():
705+# update_nrpe_config()
706+#
707+# 5. ln -s hooks.py nrpe-external-master-relation-changed
708+# ln -s hooks.py local-monitors-relation-changed
709+
710+
711+class CheckException(Exception):
712+ pass
713+
714+
715+class Check(object):
716+ shortname_re = '[A-Za-z0-9-_]+$'
717+ service_template = ("""
718+#---------------------------------------------------
719+# This file is Juju managed
720+#---------------------------------------------------
721+define service {{
722+ use active-service
723+ host_name {nagios_hostname}
724+ service_description {nagios_hostname}[{shortname}] """
725+ """{description}
726+ check_command check_nrpe!{command}
727+ servicegroups {nagios_servicegroup}
728+}}
729+""")
730+
731+ def __init__(self, shortname, description, check_cmd):
732+ super(Check, self).__init__()
733+ # XXX: could be better to calculate this from the service name
734+ if not re.match(self.shortname_re, shortname):
735+ raise CheckException("shortname must match {}".format(
736+ Check.shortname_re))
737+ self.shortname = shortname
738+ self.command = "check_{}".format(shortname)
739+ # Note: a set of invalid characters is defined by the
740+ # Nagios server config
741+ # The default is: illegal_object_name_chars=`~!$%^&*"|'<>?,()=
742+ self.description = description
743+ self.check_cmd = self._locate_cmd(check_cmd)
744+
745+ def _locate_cmd(self, check_cmd):
746+ search_path = (
747+ '/',
748+ os.path.join(os.environ['CHARM_DIR'],
749+ 'files/nrpe-external-master'),
750+ '/usr/lib/nagios/plugins',
751+ '/usr/local/lib/nagios/plugins',
752+ )
753+ parts = shlex.split(check_cmd)
754+ for path in search_path:
755+ if os.path.exists(os.path.join(path, parts[0])):
756+ command = os.path.join(path, parts[0])
757+ if len(parts) > 1:
758+ command += " " + " ".join(parts[1:])
759+ return command
760+ log('Check command not found: {}'.format(parts[0]))
761+ return ''
762+
763+ def write(self, nagios_context, hostname):
764+ nrpe_check_file = '/etc/nagios/nrpe.d/{}.cfg'.format(
765+ self.command)
766+ with open(nrpe_check_file, 'w') as nrpe_check_config:
767+ nrpe_check_config.write("# check {}\n".format(self.shortname))
768+ nrpe_check_config.write("command[{}]={}\n".format(
769+ self.command, self.check_cmd))
770+
771+ if not os.path.exists(NRPE.nagios_exportdir):
772+ log('Not writing service config as {} is not accessible'.format(
773+ NRPE.nagios_exportdir))
774+ else:
775+ self.write_service_config(nagios_context, hostname)
776+
777+ def write_service_config(self, nagios_context, hostname):
778+ for f in os.listdir(NRPE.nagios_exportdir):
779+ if re.search('.*{}.cfg'.format(self.command), f):
780+ os.remove(os.path.join(NRPE.nagios_exportdir, f))
781+
782+ templ_vars = {
783+ 'nagios_hostname': hostname,
784+ 'nagios_servicegroup': nagios_context,
785+ 'description': self.description,
786+ 'shortname': self.shortname,
787+ 'command': self.command,
788+ }
789+ nrpe_service_text = Check.service_template.format(**templ_vars)
790+ nrpe_service_file = '{}/service__{}_{}.cfg'.format(
791+ NRPE.nagios_exportdir, hostname, self.command)
792+ with open(nrpe_service_file, 'w') as nrpe_service_config:
793+ nrpe_service_config.write(str(nrpe_service_text))
794+
795+ def run(self):
796+ subprocess.call(self.check_cmd)
797+
798+
799+class NRPE(object):
800+ nagios_logdir = '/var/log/nagios'
801+ nagios_exportdir = '/var/lib/nagios/export'
802+ nrpe_confdir = '/etc/nagios/nrpe.d'
803+
804+ def __init__(self, hostname=None):
805+ super(NRPE, self).__init__()
806+ self.config = config()
807+ self.nagios_context = self.config['nagios_context']
808+ self.unit_name = local_unit().replace('/', '-')
809+ if hostname:
810+ self.hostname = hostname
811+ else:
812+ self.hostname = "{}-{}".format(self.nagios_context, self.unit_name)
813+ self.checks = []
814+
815+ def add_check(self, *args, **kwargs):
816+ self.checks.append(Check(*args, **kwargs))
817+
818+ def write(self):
819+ try:
820+ nagios_uid = pwd.getpwnam('nagios').pw_uid
821+ nagios_gid = grp.getgrnam('nagios').gr_gid
822+ except KeyError:
823+ log("Nagios user not set up, nrpe checks not updated")
824+ return
825+
826+ if not os.path.exists(NRPE.nagios_logdir):
827+ os.mkdir(NRPE.nagios_logdir)
828+ os.chown(NRPE.nagios_logdir, nagios_uid, nagios_gid)
829+
830+ nrpe_monitors = {}
831+ monitors = {"monitors": {"remote": {"nrpe": nrpe_monitors}}}
832+ for nrpecheck in self.checks:
833+ nrpecheck.write(self.nagios_context, self.hostname)
834+ nrpe_monitors[nrpecheck.shortname] = {
835+ "command": nrpecheck.command,
836+ }
837+
838+ service('restart', 'nagios-nrpe-server')
839+
840+ for rid in relation_ids("local-monitors"):
841+ relation_set(relation_id=rid, monitors=yaml.dump(monitors))
842
843=== added file 'hooks/charmhelpers/contrib/charmsupport/volumes.py'
844--- hooks/charmhelpers/contrib/charmsupport/volumes.py 1970-01-01 00:00:00 +0000
845+++ hooks/charmhelpers/contrib/charmsupport/volumes.py 2014-11-17 02:28:34 +0000
846@@ -0,0 +1,156 @@
847+'''
848+Functions for managing volumes in juju units. One volume is supported per unit.
849+Subordinates may have their own storage, provided it is on its own partition.
850+
851+Configuration stanzas:
852+ volume-ephemeral:
853+ type: boolean
854+ default: true
855+ description: >
856+ If false, a volume is mounted as specified in "volume-map"
857+ If true, ephemeral storage will be used, meaning that log data
858+ will only exist as long as the machine. YOU HAVE BEEN WARNED.
859+ volume-map:
860+ type: string
861+ default: {}
862+ description: >
863+ YAML map of units to device names, e.g:
864+ "{ rsyslog/0: /dev/vdb, rsyslog/1: /dev/vdb }"
865+ Service units will raise a configure-error if volume-ephemeral
866+ is 'false' and no volume-map value is set. Use 'juju set' to set a
867+ value and 'juju resolved' to complete configuration.
868+
869+Usage:
870+ from charmsupport.volumes import configure_volume, VolumeConfigurationError
871+ from charmsupport.hookenv import log, ERROR
872+ def pre_mount_hook():
873+ stop_service('myservice')
874+ def post_mount_hook():
875+ start_service('myservice')
876+
877+ if __name__ == '__main__':
878+ try:
879+ configure_volume(before_change=pre_mount_hook,
880+ after_change=post_mount_hook)
881+ except VolumeConfigurationError:
882+ log('Storage could not be configured', ERROR)
883+'''
884+
885+# XXX: Known limitations
886+# - fstab is neither consulted nor updated
887+
888+import os
889+from charmhelpers.core import hookenv
890+from charmhelpers.core import host
891+import yaml
892+
893+
894+MOUNT_BASE = '/srv/juju/volumes'
895+
896+
897+class VolumeConfigurationError(Exception):
898+ '''Volume configuration data is missing or invalid'''
899+ pass
900+
901+
902+def get_config():
903+ '''Gather and sanity-check volume configuration data'''
904+ volume_config = {}
905+ config = hookenv.config()
906+
907+ errors = False
908+
909+ if config.get('volume-ephemeral') in (True, 'True', 'true', 'Yes', 'yes'):
910+ volume_config['ephemeral'] = True
911+ else:
912+ volume_config['ephemeral'] = False
913+
914+ try:
915+ volume_map = yaml.safe_load(config.get('volume-map', '{}'))
916+ except yaml.YAMLError as e:
917+ hookenv.log("Error parsing YAML volume-map: {}".format(e),
918+ hookenv.ERROR)
919+ errors = True
920+ if volume_map is None:
921+ # probably an empty string
922+ volume_map = {}
923+ elif not isinstance(volume_map, dict):
924+ hookenv.log("Volume-map should be a dictionary, not {}".format(
925+ type(volume_map)))
926+ errors = True
927+
928+ volume_config['device'] = volume_map.get(os.environ['JUJU_UNIT_NAME'])
929+ if volume_config['device'] and volume_config['ephemeral']:
930+ # asked for ephemeral storage but also defined a volume ID
931+ hookenv.log('A volume is defined for this unit, but ephemeral '
932+ 'storage was requested', hookenv.ERROR)
933+ errors = True
934+ elif not volume_config['device'] and not volume_config['ephemeral']:
935+ # asked for permanent storage but did not define volume ID
936+ hookenv.log('Persistent storage was requested, but there is no volume '
937+ 'defined for this unit.', hookenv.ERROR)
938+ errors = True
939+
940+ unit_mount_name = hookenv.local_unit().replace('/', '-')
941+ volume_config['mountpoint'] = os.path.join(MOUNT_BASE, unit_mount_name)
942+
943+ if errors:
944+ return None
945+ return volume_config
946+
947+
948+def mount_volume(config):
949+ if os.path.exists(config['mountpoint']):
950+ if not os.path.isdir(config['mountpoint']):
951+ hookenv.log('Not a directory: {}'.format(config['mountpoint']))
952+ raise VolumeConfigurationError()
953+ else:
954+ host.mkdir(config['mountpoint'])
955+ if os.path.ismount(config['mountpoint']):
956+ unmount_volume(config)
957+ if not host.mount(config['device'], config['mountpoint'], persist=True):
958+ raise VolumeConfigurationError()
959+
960+
961+def unmount_volume(config):
962+ if os.path.ismount(config['mountpoint']):
963+ if not host.umount(config['mountpoint'], persist=True):
964+ raise VolumeConfigurationError()
965+
966+
967+def managed_mounts():
968+ '''List of all mounted managed volumes'''
969+ return filter(lambda mount: mount[0].startswith(MOUNT_BASE), host.mounts())
970+
971+
972+def configure_volume(before_change=lambda: None, after_change=lambda: None):
973+ '''Set up storage (or don't) according to the charm's volume configuration.
974+ Returns the mount point or "ephemeral". before_change and after_change
975+ are optional functions to be called if the volume configuration changes.
976+ '''
977+
978+ config = get_config()
979+ if not config:
980+ hookenv.log('Failed to read volume configuration', hookenv.CRITICAL)
981+ raise VolumeConfigurationError()
982+
983+ if config['ephemeral']:
984+ if os.path.ismount(config['mountpoint']):
985+ before_change()
986+ unmount_volume(config)
987+ after_change()
988+ return 'ephemeral'
989+ else:
990+ # persistent storage
991+ if os.path.ismount(config['mountpoint']):
992+ mounts = dict(managed_mounts())
993+ if mounts.get(config['mountpoint']) != config['device']:
994+ before_change()
995+ unmount_volume(config)
996+ mount_volume(config)
997+ after_change()
998+ else:
999+ before_change()
1000+ mount_volume(config)
1001+ after_change()
1002+ return config['mountpoint']
1003
1004=== added symlink 'hooks/nrpe-external-master-relation-changed'
1005=== target is u'ceilometer_hooks.py'
1006=== added symlink 'hooks/nrpe-external-master-relation-joined'
1007=== target is u'ceilometer_hooks.py'
1008=== modified file 'metadata.yaml'
1009--- metadata.yaml 2013-10-20 22:30:27 +0000
1010+++ metadata.yaml 2014-11-17 02:28:34 +0000
1011@@ -12,6 +12,9 @@
1012 - miscellaneous
1013 - openstack
1014 provides:
1015+ nrpe-external-master:
1016+ interface: nrpe-external-master
1017+ scope: container
1018 ceilometer-service:
1019 interface: ceilometer
1020 requires:
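
The consistency rule that get_config() in volumes.py applies to volume-ephemeral and volume-map can be sketched as a standalone function. This is a minimal illustration and not part of the proposed branch: check_volume_settings is a hypothetical helper name, and it assumes volume-map has already been parsed from YAML into a Python object.

```python
def check_volume_settings(ephemeral, volume_map, unit_name):
    """Return a list of problems with one unit's volume settings.

    Hypothetical helper mirroring the checks in get_config():
    a unit must have either ephemeral storage or a mapped device,
    never both and never neither.
    """
    if volume_map is None:
        # An empty volume-map string parses to None; treat it as no mapping.
        volume_map = {}
    if not isinstance(volume_map, dict):
        return ['volume-map should be a dictionary, not {}'.format(
            type(volume_map).__name__)]

    problems = []
    device = volume_map.get(unit_name)
    if device and ephemeral:
        problems.append('a volume is defined for this unit, '
                        'but ephemeral storage was requested')
    elif not device and not ephemeral:
        problems.append('persistent storage was requested, '
                        'but no volume is defined for this unit')
    return problems


if __name__ == '__main__':
    # Unit name and device are taken from the volume-map example in the docstring.
    print(check_volume_settings(False, {'rsyslog/0': '/dev/vdb'}, 'rsyslog/0'))
    print(check_volume_settings(False, {}, 'rsyslog/0'))
```

Returning a list of problems rather than a boolean is a choice made for this sketch only; the code in the diff sets an errors flag and returns None instead.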
