Merge lp:~brad-marshall/charms/trusty/ceilometer/add-nrpe-checks into lp:~openstack-charmers-archive/charms/trusty/ceilometer/trunk
Status: | Superseded | ||||
---|---|---|---|---|---|
Proposed branch: | lp:~brad-marshall/charms/trusty/ceilometer/add-nrpe-checks | ||||
Merge into: | lp:~openstack-charmers-archive/charms/trusty/ceilometer/trunk | ||||
Diff against target: |
1020 lines (+869/-1) 11 files modified
charm-helpers.yaml (+1/-0)
config.yaml (+10/-1)
files/nrpe-external-master/check_exit_status.pl (+189/-0)
files/nrpe-external-master/check_status_file.py (+60/-0)
files/nrpe-external-master/check_upstart_job (+72/-0)
files/nrpe-external-master/nagios_plugin.py (+78/-0)
hooks/ceilometer_hooks.py (+59/-0)
hooks/ceilometer_utils.py (+19/-0)
hooks/charmhelpers/contrib/charmsupport/nrpe.py (+222/-0)
hooks/charmhelpers/contrib/charmsupport/volumes.py (+156/-0)
metadata.yaml (+3/-0) |
||||
To merge this branch: | bzr merge lp:~brad-marshall/charms/trusty/ceilometer/add-nrpe-checks | ||||
Related bugs: |
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Liam Young (community) | Needs Fixing | ||
Review via email: mp+241498@code.launchpad.net |
This proposal has been superseded by a proposal from 2014-11-17.
Commit message
Description of the change
Adds nrpe-external-
Ryan Beisner (1chb1n) wrote : | # |
UOSCI bot says:
charm_unit_test #827 trusty-ceilometer for brad-marshall mp241498
UNIT FAIL: unit-test failed
UNIT Results (max last 5 lines):
hooks/
TOTAL 180 24 87%
Ran 25 tests in 0.729s
FAILED (errors=3)
make: *** [test] Error 1
Full unit test output: http://
Build: http://
Ryan Beisner (1chb1n) wrote : | # |
UOSCI bot says:
charm_amulet_test #372 trusty-ceilometer for brad-marshall mp241498
AMULET FAIL: amulet-test missing
AMULET Results (max last 5 lines):
INFO:root:Workspace dir: /var/lib/
INFO:root:Reading file: Makefile
INFO:root:Searching for: ['@juju test']
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.
Full amulet test output: http://
Build: http://
Liam Young (gnuoy) wrote : | # |
Thank you for the MP; the Nagios checks are sorely needed. It looks fine, there are just a few things it would be good to get fixed up.
The list of services that comprise a ceilometer deployment is already compiled as part of the service context in ceilometer_
That being said, it looks like the existing charm has a list of new icehouse packages which are being added to the package list in ceilometer_utils.py but the corresponding services are not being added to the CONFIG_FILES OrderedDict. This means that the ceilometer-alarm* and ceilometer-
So, what I think is needed is:
1) Steal the services() method from nova-cloud-
2) Define a list of ICEHOUSE_SERVICES (probably exactly the same as ICEHOUSE_PACKAGES) and conditionally add (depending on ostack release) to the CEILOMETER_CONF service in register_configs():
if (get_os_
>= 'icehouse'):
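A minimal sketch of the two steps suggested above, assuming simplified stand-ins for the charm's real structures. `BASE_SERVICES`, `build_restart_map()` and the config path are hypothetical names used only for illustration; the Icehouse service names are the ones discussed in this proposal. It is not the charm's actual implementation.

```python
from collections import OrderedDict

# Hypothetical stand-ins for the charm's real definitions; the base service
# names and the config path are illustrative only.
CEILOMETER_CONF = '/etc/ceilometer/ceilometer.conf'
BASE_SERVICES = ['ceilometer-api', 'ceilometer-collector']
ICEHOUSE_SERVICES = [
    'ceilometer-alarm-notifier',
    'ceilometer-alarm-evaluator',
    'ceilometer-agent-notification',
]


def build_restart_map(release):
    """Map each config file to the services restarted when it changes."""
    svcs = list(BASE_SERVICES)
    # OpenStack codenames of this era sort alphabetically, so a plain
    # string comparison is enough, as in the review snippet.
    if release >= 'icehouse':
        svcs += ICEHOUSE_SERVICES
    return OrderedDict([(CEILOMETER_CONF, svcs)])


def services(restart_map):
    """Return the de-duplicated, sorted list of services in a restart map."""
    found = set()
    for svcs in restart_map.values():
        found.update(svcs)
    return sorted(found)


print(services(build_restart_map('havana')))
print(services(build_restart_map('icehouse')))
```

Deriving the monitored services from the restart map keeps a single source of truth, so a service added for a newer release is picked up by the Nagios checks without a second list to maintain.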
Liam Young (gnuoy) wrote : | # |
Also, could you move the check_upstart_job into charmhelpers as it seems to be common across these mps?
- 62. By Brad Marshall
-
[bradm] Tweaked nagios checks to use functions to pull out services, added checks for sysv init style daemons, added in icehouse daemons, ran pep8 over the whole thing
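The init-style selection mentioned in this revision could look roughly like the sketch below. `choose_check()` is a hypothetical helper, not a function from the branch; the paths and check commands mirror those in the preview diff, and the `exists` parameter is an assumption added so the logic can be tried without a real filesystem.

```python
import os


def choose_check(service, exists=os.path.exists):
    """Return the NRPE check command for a service, or None if no init
    configuration is found. `exists` is injectable so the logic can be
    exercised without touching the real filesystem."""
    # Prefer the upstart job when one is installed.
    if exists('/etc/init/%s.conf' % service):
        return 'check_upstart_job %s' % service
    # Fall back to a status file written by a cron job for sysv services.
    if exists('/etc/init.d/%s' % service):
        return ('check_status_file.py -f '
                '/var/lib/nagios/service-check-%s.txt' % service)
    return None


# Simulated filesystem: one upstart job and one sysv init script.
fake_fs = {'/etc/init/ceilometer-api.conf', '/etc/init.d/apache2'}
for name in ('ceilometer-api', 'apache2', 'missing'):
    print(name, choose_check(name, exists=fake_fs.__contains__))
```

Returning `None` for a service with neither init file makes the "no check registered" case explicit, rather than silently skipping it.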
uosci-testing-bot (uosci-testing-bot) wrote : | # |
UOSCI bot says:
charm_unit_test #917 trusty-ceilometer for brad-marshall mp241498
UNIT FAIL: unit-test failed
UNIT Results (max last 5 lines):
hooks/
TOTAL 200 41 80%
Ran 25 tests in 0.747s
FAILED (errors=3)
make: *** [test] Error 1
Full unit test output: http://
Build: http://
uosci-testing-bot (uosci-testing-bot) wrote : | # |
UOSCI bot says:
charm_amulet_test #425 trusty-ceilometer for brad-marshall mp241498
AMULET FAIL: amulet-test missing
AMULET Results (max last 5 lines):
INFO:root:Workspace dir: /var/lib/
INFO:root:Reading file: Makefile
INFO:root:Searching for: ['@juju test']
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.
Full amulet test output: http://
Build: http://
uosci-testing-bot (uosci-testing-bot) wrote : | # |
UOSCI bot says:
charm_lint_check #1083 trusty-ceilometer for brad-marshall mp241498
LINT OK: passed
LINT Results (max last 5 lines):
I: config.yaml: option os-internal-network has no default value
I: config.yaml: option os-admin-network has no default value
I: config.yaml: option ssl_ca has no default value
I: config.yaml: option ssl_cert has no default value
I: config.yaml: option os-public-network has no default value
Full lint test output: http://
Build: http://
- 63. By Brad Marshall
-
[bradm] Removed puppet header from nagios_plugin module
- 64. By Brad Marshall
-
[bradm] Removed nagios check files that were moved to nrpe-external-master charm
Unmerged revisions
Preview Diff
1 | === modified file 'charm-helpers.yaml' |
2 | --- charm-helpers.yaml 2014-07-24 10:23:25 +0000 |
3 | +++ charm-helpers.yaml 2014-11-17 02:28:34 +0000 |
4 | @@ -7,3 +7,4 @@ |
5 | - contrib.hahelpers |
6 | - contrib.storage.linux |
7 | - contrib.network.ip |
8 | + - contrib.charmsupport |
9 | |
10 | === modified file 'config.yaml' |
11 | --- config.yaml 2014-10-01 15:22:55 +0000 |
12 | +++ config.yaml 2014-11-17 02:28:34 +0000 |
13 | @@ -58,6 +58,16 @@ |
14 | description: | |
15 | SSL CA to use with the certificate and key provided - this is only |
16 | required if you are providing a privately signed ssl_cert and ssl_key. |
17 | + nagios_context: |
18 | + default: "juju" |
19 | + type: string |
20 | + description: | |
21 | + Used by the nrpe-external-master subordinate charm. |
22 | + A string that will be prepended to instance name to set the host name |
23 | + in nagios. So for instance the hostname would be something like: |
24 | + juju-myservice-0 |
25 | + If you're running multiple environments with the same services in them |
26 | + this allows you to differentiate between them. |
27 | # Network configuration options |
28 | # by default all access is over 'private-address' |
29 | os-admin-network: |
30 | @@ -84,4 +94,3 @@ |
31 | 192.168.0.0/24) |
32 | . |
33 | This network will be used for public endpoints. |
34 | - |
35 | |
36 | === added directory 'files' |
37 | === added directory 'files/nrpe-external-master' |
38 | === added file 'files/nrpe-external-master/check_exit_status.pl' |
39 | --- files/nrpe-external-master/check_exit_status.pl 1970-01-01 00:00:00 +0000 |
40 | +++ files/nrpe-external-master/check_exit_status.pl 2014-11-17 02:28:34 +0000 |
41 | @@ -0,0 +1,189 @@ |
42 | +#!/usr/bin/perl |
43 | +################################################################################ |
44 | +# # |
45 | +# Copyright (C) 2011 Chad Columbus <ccolumbu@hotmail.com> # |
46 | +# # |
47 | +# This program is free software; you can redistribute it and/or modify # |
48 | +# it under the terms of the GNU General Public License as published by # |
49 | +# the Free Software Foundation; either version 2 of the License, or # |
50 | +# (at your option) any later version. # |
51 | +# # |
52 | +# This program is distributed in the hope that it will be useful, # |
53 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of # |
54 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # |
55 | +# GNU General Public License for more details. # |
56 | +# # |
57 | +# You should have received a copy of the GNU General Public License # |
58 | +# along with this program; if not, write to the Free Software # |
59 | +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # |
60 | +# # |
61 | +################################################################################ |
62 | + |
63 | +use strict; |
64 | +use Getopt::Std; |
65 | +$| = 1; |
66 | + |
67 | +my %opts; |
68 | +getopts('heronp:s:', \%opts); |
69 | + |
70 | +my $VERSION = "Version 1.0"; |
71 | +my $AUTHOR = '(c) 2011 Chad Columbus <ccolumbu@hotmail.com>'; |
72 | + |
73 | +# Default values: |
74 | +my $script_to_check; |
75 | +my $pattern = 'is running'; |
76 | +my $cmd; |
77 | +my $message; |
78 | +my $error; |
79 | + |
80 | +# Exit codes |
81 | +my $STATE_OK = 0; |
82 | +my $STATE_WARNING = 1; |
83 | +my $STATE_CRITICAL = 2; |
84 | +my $STATE_UNKNOWN = 3; |
85 | + |
86 | +# Parse command line options |
87 | +if ($opts{'h'} || scalar(%opts) == 0) { |
88 | + &print_help(); |
89 | + exit($STATE_OK); |
90 | +} |
91 | + |
92 | +# Make sure scipt is provided: |
93 | +if ($opts{'s'} eq '') { |
94 | + # Script to run not provided |
95 | + print "\nYou must provide a script to run. Example: -s /etc/init.d/httpd\n"; |
96 | + exit($STATE_UNKNOWN); |
97 | +} else { |
98 | + $script_to_check = $opts{'s'}; |
99 | +} |
100 | + |
101 | +# Make sure only a-z, 0-9, /, _, and - are used in the script. |
102 | +if ($script_to_check =~ /[^a-z0-9\_\-\/\.]/) { |
103 | + # Script contains illegal characters exit. |
104 | + print "\nScript to check can only contain Letters, Numbers, Periods, Underscores, Hyphens, and/or Slashes\n"; |
105 | + exit($STATE_UNKNOWN); |
106 | +} |
107 | + |
108 | +# See if script is executable |
109 | +if (! -x "$script_to_check") { |
110 | + print "\nIt appears you can't execute $script_to_check, $!\n"; |
111 | + exit($STATE_UNKNOWN); |
112 | +} |
113 | + |
114 | +# If a pattern is provided use it: |
115 | +if ($opts{'p'} ne '') { |
116 | + $pattern = $opts{'p'}; |
117 | +} |
118 | + |
119 | +# If -r run command via sudo as root: |
120 | +if ($opts{'r'}) { |
121 | + $cmd = "sudo -n $script_to_check status" . ' 2>&1'; |
122 | +} else { |
123 | + $cmd = "$script_to_check status" . ' 2>&1'; |
124 | +} |
125 | + |
126 | +my $cmd_result = `$cmd`; |
127 | +chomp($cmd_result); |
128 | +if ($cmd_result =~ /sudo/i) { |
129 | + # This means it could not run the sudo command |
130 | + $message = "$script_to_check CRITICAL - Could not run: 'sudo -n $script_to_check status'. Result is $cmd_result"; |
131 | + $error = $STATE_UNKNOWN; |
132 | +} else { |
133 | + # Check exitstatus instead of output: |
134 | + if ($opts{'e'} == 1) { |
135 | + if ($? != 0) { |
136 | + # error |
137 | + $message = "$script_to_check CRITICAL - Exit code: $?\."; |
138 | + if ($opts{'o'} == 0) { |
139 | + $message .= " $cmd_result"; |
140 | + } |
141 | + $error = $STATE_CRITICAL; |
142 | + } else { |
143 | + # success |
144 | + $message = "$script_to_check OK - Exit code: $?\."; |
145 | + if ($opts{'o'} == 0) { |
146 | + $message .= " $cmd_result"; |
147 | + } |
148 | + $error = $STATE_OK; |
149 | + } |
150 | + } else { |
151 | + my $not_check = 1; |
152 | + if ($opts{'n'} == 1) { |
153 | + $not_check = 0; |
154 | + } |
155 | + if (($cmd_result =~ /$pattern/i) == $not_check) { |
156 | + $message = "$script_to_check OK"; |
157 | + if ($opts{'o'} == 0) { |
158 | + $message .= " - $cmd_result"; |
159 | + } |
160 | + $error = $STATE_OK; |
161 | + } else { |
162 | + $message = "$script_to_check CRITICAL"; |
163 | + if ($opts{'o'} == 0) { |
164 | + $message .= " - $cmd_result"; |
165 | + } |
166 | + $error = $STATE_CRITICAL; |
167 | + } |
168 | + } |
169 | +} |
170 | + |
171 | +if ($message eq '') { |
172 | + print "Error: program failed in an unknown way\n"; |
173 | + exit($STATE_UNKNOWN); |
174 | +} |
175 | + |
176 | +if ($error) { |
177 | + print "$message\n"; |
178 | + exit($error); |
179 | +} else { |
180 | + # If we get here we are OK |
181 | + print "$message\n"; |
182 | + exit($STATE_OK); |
183 | +} |
184 | + |
185 | +#################################### |
186 | +# Start Subs: |
187 | +#################################### |
188 | +sub print_help() { |
189 | + print << "EOF"; |
190 | +Check the output or exit status of a script. |
191 | +$VERSION |
192 | +$AUTHOR |
193 | + |
194 | +Options: |
195 | +-h |
196 | + Print detailed help screen |
197 | + |
198 | +-s |
199 | + 'FULL PATH TO SCRIPT' (required) |
200 | + This is the script to run, the script is designed to run scripts in the |
201 | + /etc/init.d dir (but can run any script) and will call the script with |
202 | + a 'status' argument. So if you use another script make sure it will |
203 | + work with /path/script status, example: /etc/init.d/httpd status |
204 | + |
205 | +-e |
206 | + This is the "exitstaus" flag, it means check the exit status |
207 | + code instead of looking for a pattern in the output of the script. |
208 | + |
209 | +-p 'REGEX' |
210 | + This is a pattern to look for in the output of the script to confirm it |
211 | + is running, default is 'is running', but not all init.d scripts output |
212 | + (iptables), so you can specify an arbitrary pattern. |
213 | + All patterns are case insensitive. |
214 | + |
215 | +-n |
216 | + This is the "NOT" flag, it means not the -p pattern, so if you want to |
217 | + make sure the output of the script does NOT contain -p 'REGEX' |
218 | + |
219 | +-r |
220 | + This is the "ROOT" flag, it means run as root via sudo. You will need a |
221 | + line in your /etc/sudoers file like: |
222 | + nagios ALL=(root) NOPASSWD: /etc/init.d/* status |
223 | + |
224 | +-o |
225 | + This is the "SUPPRESS OUTPUT" flag. Some programs have a long output |
226 | + (like iptables), this flag suppresses that output so it is not printed |
227 | + as a part of the nagios message. |
228 | +EOF |
229 | +} |
230 | + |
231 | |
232 | === added file 'files/nrpe-external-master/check_status_file.py' |
233 | --- files/nrpe-external-master/check_status_file.py 1970-01-01 00:00:00 +0000 |
234 | +++ files/nrpe-external-master/check_status_file.py 2014-11-17 02:28:34 +0000 |
235 | @@ -0,0 +1,60 @@ |
236 | +#!/usr/bin/python |
237 | + |
238 | +# m |
239 | +# mmmm m m mmmm mmmm mmm mm#mm |
240 | +# #" "# # # #" "# #" "# #" # # |
241 | +# # # # # # # # # #"""" # |
242 | +# ##m#" "mm"# ##m#" ##m#" "#mm" "mm |
243 | +# # # # |
244 | +# " " " |
245 | +# This file is managed by puppet. Do not make local changes. |
246 | + |
247 | +# |
248 | +# Copyright 2014 Canonical Ltd. |
249 | +# |
250 | +# Author: Jacek Nykis <jacek.nykis@canonical.com> |
251 | +# |
252 | + |
253 | +import re |
254 | +import nagios_plugin |
255 | + |
256 | + |
257 | +def parse_args(): |
258 | + import argparse |
259 | + |
260 | + parser = argparse.ArgumentParser( |
261 | + description='Read file and return nagios status based on its content', |
262 | + formatter_class=argparse.ArgumentDefaultsHelpFormatter) |
263 | + parser.add_argument('-f', '--status-file', required=True, |
264 | + help='Status file path') |
265 | + parser.add_argument('-c', '--critical-text', default='CRITICAL', |
266 | + help='String indicating critical status') |
267 | + parser.add_argument('-w', '--warning-text', default='WARNING', |
268 | + help='String indicating warning status') |
269 | + parser.add_argument('-o', '--ok-text', default='OK', |
270 | + help='String indicating OK status') |
271 | + parser.add_argument('-u', '--unknown-text', default='UNKNOWN', |
272 | + help='String indicating unknown status') |
273 | + return parser.parse_args() |
274 | + |
275 | + |
276 | +def check_status(args): |
277 | + nagios_plugin.check_file_freshness(args.status_file, 43200) |
278 | + |
279 | + with open(args.status_file, "r") as f: |
280 | + content = [l.strip() for l in f.readlines()] |
281 | + |
282 | + for line in content: |
283 | + if re.search(args.critical_text, line): |
284 | + raise nagios_plugin.CriticalError(line) |
285 | + elif re.search(args.warning_text, line): |
286 | + raise nagios_plugin.WarnError(line) |
287 | + elif re.search(args.unknown_text, line): |
288 | + raise nagios_plugin.UnknownError(line) |
289 | + else: |
290 | + print line |
291 | + |
292 | + |
293 | +if __name__ == '__main__': |
294 | + args = parse_args() |
295 | + nagios_plugin.try_check(check_status, args) |
296 | |
297 | === added file 'files/nrpe-external-master/check_upstart_job' |
298 | --- files/nrpe-external-master/check_upstart_job 1970-01-01 00:00:00 +0000 |
299 | +++ files/nrpe-external-master/check_upstart_job 2014-11-17 02:28:34 +0000 |
300 | @@ -0,0 +1,72 @@ |
301 | +#!/usr/bin/python |
302 | + |
303 | +# |
304 | +# Copyright 2012, 2013 Canonical Ltd. |
305 | +# |
306 | +# Author: Paul Collins <paul.collins@canonical.com> |
307 | +# |
308 | +# Based on http://www.eurion.net/python-snippets/snippet/Upstart%20service%20status.html |
309 | +# |
310 | + |
311 | +import sys |
312 | + |
313 | +import dbus |
314 | + |
315 | + |
316 | +class Upstart(object): |
317 | + def __init__(self): |
318 | + self._bus = dbus.SystemBus() |
319 | + self._upstart = self._bus.get_object('com.ubuntu.Upstart', |
320 | + '/com/ubuntu/Upstart') |
321 | + def get_job(self, job_name): |
322 | + path = self._upstart.GetJobByName(job_name, |
323 | + dbus_interface='com.ubuntu.Upstart0_6') |
324 | + return self._bus.get_object('com.ubuntu.Upstart', path) |
325 | + |
326 | + def get_properties(self, job): |
327 | + path = job.GetInstance([], dbus_interface='com.ubuntu.Upstart0_6.Job') |
328 | + instance = self._bus.get_object('com.ubuntu.Upstart', path) |
329 | + return instance.GetAll('com.ubuntu.Upstart0_6.Instance', |
330 | + dbus_interface=dbus.PROPERTIES_IFACE) |
331 | + |
332 | + def get_job_instances(self, job_name): |
333 | + job = self.get_job(job_name) |
334 | + paths = job.GetAllInstances([], dbus_interface='com.ubuntu.Upstart0_6.Job') |
335 | + return [self._bus.get_object('com.ubuntu.Upstart', path) for path in paths] |
336 | + |
337 | + def get_job_instance_properties(self, job): |
338 | + return job.GetAll('com.ubuntu.Upstart0_6.Instance', |
339 | + dbus_interface=dbus.PROPERTIES_IFACE) |
340 | + |
341 | +try: |
342 | + upstart = Upstart() |
343 | + try: |
344 | + job = upstart.get_job(sys.argv[1]) |
345 | + props = upstart.get_properties(job) |
346 | + |
347 | + if props['state'] == 'running': |
348 | + print 'OK: %s is running' % sys.argv[1] |
349 | + sys.exit(0) |
350 | + else: |
351 | + print 'CRITICAL: %s is not running' % sys.argv[1] |
352 | + sys.exit(2) |
353 | + |
354 | + except dbus.DBusException as e: |
355 | + instances = upstart.get_job_instances(sys.argv[1]) |
356 | + propses = [upstart.get_job_instance_properties(instance) for instance in instances] |
357 | + states = dict([(props['name'], props['state']) for props in propses]) |
358 | + if len(states) != states.values().count('running'): |
359 | + not_running = [] |
360 | + for name in states.keys(): |
361 | + if states[name] != 'running': |
362 | + not_running.append(name) |
363 | + print 'CRITICAL: %d instances of %s not running: %s' % \ |
364 | + (len(not_running), sys.argv[1], ', '.join(not_running)) |
365 | + sys.exit(2) |
366 | + else: |
367 | + print 'OK: %d instances of %s running' % (len(states), sys.argv[1]) |
368 | + |
369 | +except dbus.DBusException as e: |
370 | + print 'CRITICAL: failed to get properties of \'%s\' from upstart' % sys.argv[1] |
371 | + sys.exit(2) |
372 | + |
373 | |
374 | === added file 'files/nrpe-external-master/nagios_plugin.py' |
375 | --- files/nrpe-external-master/nagios_plugin.py 1970-01-01 00:00:00 +0000 |
376 | +++ files/nrpe-external-master/nagios_plugin.py 2014-11-17 02:28:34 +0000 |
377 | @@ -0,0 +1,78 @@ |
378 | +#!/usr/bin/env python |
379 | +# m |
380 | +# mmmm m m mmmm mmmm mmm mm#mm |
381 | +# #" "# # # #" "# #" "# #" # # |
382 | +# # # # # # # # # #"""" # |
383 | +# ##m#" "mm"# ##m#" ##m#" "#mm" "mm |
384 | +# # # # |
385 | +# " " " |
386 | +# This file is managed by puppet. Do not make local changes. |
387 | + |
388 | +# Copyright (C) 2005, 2006, 2007, 2012 James Troup <james.troup@canonical.com> |
389 | + |
390 | +import os |
391 | +import stat |
392 | +import time |
393 | +import traceback |
394 | +import sys |
395 | + |
396 | + |
397 | +################################################################################ |
398 | + |
399 | +class CriticalError(Exception): |
400 | + """This indicates a critical error.""" |
401 | + pass |
402 | + |
403 | + |
404 | +class WarnError(Exception): |
405 | + """This indicates a warning condition.""" |
406 | + pass |
407 | + |
408 | + |
409 | +class UnknownError(Exception): |
410 | + """This indicates a unknown error was encountered.""" |
411 | + pass |
412 | + |
413 | + |
414 | +def try_check(function, *args, **kwargs): |
415 | + """Perform a check with error/warn/unknown handling.""" |
416 | + try: |
417 | + function(*args, **kwargs) |
418 | + except UnknownError, msg: |
419 | + print msg |
420 | + sys.exit(3) |
421 | + except CriticalError, msg: |
422 | + print msg |
423 | + sys.exit(2) |
424 | + except WarnError, msg: |
425 | + print msg |
426 | + sys.exit(1) |
427 | + except: |
428 | + print "%s raised unknown exception '%s'" % (function, sys.exc_info()[0]) |
429 | + print '=' * 60 |
430 | + traceback.print_exc(file=sys.stdout) |
431 | + print '=' * 60 |
432 | + sys.exit(3) |
433 | + |
434 | + |
435 | +################################################################################ |
436 | + |
437 | +def check_file_freshness(filename, newer_than=600): |
438 | + """Check a file exists, is readable and is newer than <n> seconds (where <n> defaults to 600).""" |
439 | + # First check the file exists and is readable |
440 | + if not os.path.exists(filename): |
441 | + raise CriticalError("%s: does not exist." % (filename)) |
442 | + if os.access(filename, os.R_OK) == 0: |
443 | + raise CriticalError("%s: is not readable." % (filename)) |
444 | + |
445 | + # Then ensure the file is up-to-date enough |
446 | + mtime = os.stat(filename)[stat.ST_MTIME] |
447 | + last_modified = time.time() - mtime |
448 | + if last_modified > newer_than: |
449 | + raise CriticalError("%s: was last modified on %s and is too old (> %s seconds)." |
450 | + % (filename, time.ctime(mtime), newer_than)) |
451 | + if last_modified < 0: |
452 | + raise CriticalError("%s: was last modified on %s which is in the future." |
453 | + % (filename, time.ctime(mtime))) |
454 | + |
455 | +################################################################################ |
456 | |
457 | === modified file 'hooks/ceilometer_hooks.py' |
458 | --- hooks/ceilometer_hooks.py 2014-10-01 15:22:55 +0000 |
459 | +++ hooks/ceilometer_hooks.py 2014-11-17 02:28:34 +0000 |
460 | @@ -2,14 +2,17 @@ |
461 | |
462 | import base64 |
463 | import sys |
464 | +import os |
465 | from charmhelpers.fetch import ( |
466 | apt_install, filter_installed_packages, |
467 | apt_update |
468 | ) |
469 | from charmhelpers.core.hookenv import ( |
470 | open_port, |
471 | + local_unit, |
472 | relation_set, |
473 | relation_ids, |
474 | + relations_of_type, |
475 | config, |
476 | Hooks, UnregisteredHookError, |
477 | log |
478 | @@ -29,6 +32,7 @@ |
479 | CEILOMETER_ROLE, |
480 | register_configs, |
481 | restart_map, |
482 | + services, |
483 | get_ceilometer_context, |
484 | do_openstack_upgrade |
485 | ) |
486 | @@ -37,6 +41,7 @@ |
487 | canonical_url, |
488 | PUBLIC, INTERNAL, ADMIN |
489 | ) |
490 | +from charmhelpers.contrib.charmsupport.nrpe import NRPE |
491 | |
492 | hooks = Hooks() |
493 | CONFIGS = register_configs() |
494 | @@ -89,6 +94,7 @@ |
495 | def config_changed(): |
496 | if openstack_upgrade_available('ceilometer-common'): |
497 | do_openstack_upgrade(CONFIGS) |
498 | + update_nrpe_config() |
499 | CONFIGS.write_all() |
500 | ceilometer_joined() |
501 | for rid in relation_ids('identity-service'): |
502 | @@ -98,6 +104,7 @@ |
503 | @hooks.hook('upgrade-charm') |
504 | def upgrade_charm(): |
505 | install() |
506 | + update_nrpe_config() |
507 | any_changed() |
508 | |
509 | |
510 | @@ -137,6 +144,58 @@ |
511 | for relid in relation_ids('ceilometer-service'): |
512 | relation_set(relid, context) |
513 | |
514 | + |
515 | +@hooks.hook('nrpe-external-master-relation-joined', |
516 | + 'nrpe-external-master-relation-changed') |
517 | +def update_nrpe_config(): |
518 | + # Find out if nrpe set nagios_hostname |
519 | + hostname = None |
520 | + host_context = None |
521 | + for rel in relations_of_type('nrpe-external-master'): |
522 | + if 'nagios_hostname' in rel: |
523 | + hostname = rel['nagios_hostname'] |
524 | + host_context = rel['nagios_host_context'] |
525 | + break |
526 | + nrpe = NRPE(hostname=hostname) |
527 | + apt_install('python-dbus') |
528 | + |
529 | + if host_context: |
530 | + current_unit = "%s:%s" % (host_context, local_unit()) |
531 | + else: |
532 | + current_unit = local_unit() |
533 | + |
534 | + services_to_monitor = services() |
535 | + |
536 | + for service in services_to_monitor: |
537 | + upstart_init = '/etc/init/%s.conf' % service |
538 | + sysv_init = '/etc/init.d/%s' % service |
539 | + if os.path.exists(upstart_init): |
540 | + nrpe.add_check( |
541 | + shortname=service, |
542 | + description='process check {%s}' % current_unit, |
543 | + check_cmd='check_upstart_job %s' % service, |
544 | + ) |
545 | + elif os.path.exists(sysv_init): |
546 | + cronpath = '/etc/cron.d/nagios-service-check-%s' % service |
547 | + checkpath = os.path.join(os.environ['CHARM_DIR'], |
548 | + 'files/nrpe-external-master', |
549 | + 'check_exit_status.pl'), |
550 | + cron_template = '*/5 * * * * root \ |
551 | +%s -s /etc/init.d/%s \ |
552 | +status > /var/lib/nagios/service-check-%s.txt\n' \ |
553 | + % (checkpath[0], service, service) |
554 | + f = open(cronpath, 'w') |
555 | + f.write(cron_template) |
556 | + f.close() |
557 | + nrpe.add_check( |
558 | + shortname=service, |
559 | + description='process check {%s}' % current_unit, |
560 | + check_cmd='check_status_file.py -f \ |
561 | + /var/lib/nagios/service-check-%s.txt' % service, |
562 | + ) |
563 | + |
564 | + nrpe.write() |
565 | + |
566 | if __name__ == '__main__': |
567 | try: |
568 | hooks.execute(sys.argv) |
569 | |
570 | === modified file 'hooks/ceilometer_utils.py' |
571 | --- hooks/ceilometer_utils.py 2014-10-23 16:03:49 +0000 |
572 | +++ hooks/ceilometer_utils.py 2014-11-17 02:28:34 +0000 |
573 | @@ -50,6 +50,12 @@ |
574 | 'ceilometer-agent-notification' |
575 | ] |
576 | |
577 | +ICEHOUSE_SERVICES = [ |
578 | + 'ceilometer-alarm-notifier', |
579 | + 'ceilometer-alarm-evaluator', |
580 | + 'ceilometer-agent-notification' |
581 | +] |
582 | + |
583 | CEILOMETER_ROLE = "ResellerAdmin" |
584 | |
585 | |
586 | @@ -90,6 +96,11 @@ |
587 | configs = templating.OSConfigRenderer(templates_dir=TEMPLATES, |
588 | openstack_release=release) |
589 | |
590 | + if (get_os_codename_install_source(config('openstack-origin')) |
591 | + >= 'icehouse'): |
592 | + CONFIG_FILES[CEILOMETER_CONF]['services'] = \ |
593 | + CONFIG_FILES[CEILOMETER_CONF]['services'] + ICEHOUSE_SERVICES |
594 | + |
595 | for conf in CONFIG_FILES: |
596 | configs.register(conf, CONFIG_FILES[conf]['hook_contexts']) |
597 | |
598 | @@ -120,6 +131,14 @@ |
599 | return _map |
600 | |
601 | |
602 | +def services(): |
603 | + ''' Returns a list of services associate with this charm ''' |
604 | + _services = [] |
605 | + for v in restart_map().values(): |
606 | + _services = _services + v |
607 | + return list(set(_services)) |
608 | + |
609 | + |
610 | def get_ceilometer_context(): |
611 | ''' Retrieve a map of all current relation data for agent configuration ''' |
612 | ctxt = {} |
613 | |
614 | === added directory 'hooks/charmhelpers/contrib/charmsupport' |
615 | === added file 'hooks/charmhelpers/contrib/charmsupport/__init__.py' |
616 | === added file 'hooks/charmhelpers/contrib/charmsupport/nrpe.py' |
617 | --- hooks/charmhelpers/contrib/charmsupport/nrpe.py 1970-01-01 00:00:00 +0000 |
618 | +++ hooks/charmhelpers/contrib/charmsupport/nrpe.py 2014-11-17 02:28:34 +0000 |
619 | @@ -0,0 +1,222 @@ |
620 | +"""Compatibility with the nrpe-external-master charm""" |
621 | +# Copyright 2012 Canonical Ltd. |
622 | +# |
623 | +# Authors: |
624 | +# Matthew Wedgwood <matthew.wedgwood@canonical.com> |
625 | + |
626 | +import subprocess |
627 | +import pwd |
628 | +import grp |
629 | +import os |
630 | +import re |
631 | +import shlex |
632 | +import yaml |
633 | + |
634 | +from charmhelpers.core.hookenv import ( |
635 | + config, |
636 | + local_unit, |
637 | + log, |
638 | + relation_ids, |
639 | + relation_set, |
640 | +) |
641 | + |
642 | +from charmhelpers.core.host import service |
643 | + |
644 | +# This module adds compatibility with the nrpe-external-master and plain nrpe |
645 | +# subordinate charms. To use it in your charm: |
646 | +# |
647 | +# 1. Update metadata.yaml |
648 | +# |
649 | +# provides: |
650 | +# (...) |
651 | +# nrpe-external-master: |
652 | +# interface: nrpe-external-master |
653 | +# scope: container |
654 | +# |
655 | +# and/or |
656 | +# |
657 | +# provides: |
658 | +# (...) |
659 | +# local-monitors: |
660 | +# interface: local-monitors |
661 | +# scope: container |
662 | + |
663 | +# |
664 | +# 2. Add the following to config.yaml |
665 | +# |
666 | +# nagios_context: |
667 | +# default: "juju" |
668 | +# type: string |
669 | +# description: | |
670 | +# Used by the nrpe subordinate charms. |
671 | +# A string that will be prepended to instance name to set the host name |
672 | +# in nagios. So for instance the hostname would be something like: |
673 | +# juju-myservice-0 |
674 | +# If you're running multiple environments with the same services in them |
675 | +# this allows you to differentiate between them. |
676 | +# |
677 | +# 3. Add custom checks (Nagios plugins) to files/nrpe-external-master |
678 | +# |
679 | +# 4. Update your hooks.py with something like this: |
680 | +# |
681 | +# from charmsupport.nrpe import NRPE |
682 | +# (...) |
683 | +# def update_nrpe_config(): |
684 | +# nrpe_compat = NRPE() |
685 | +# nrpe_compat.add_check( |
686 | +# shortname = "myservice", |
687 | +# description = "Check MyService", |
688 | +# check_cmd = "check_http -w 2 -c 10 http://localhost" |
689 | +# ) |
690 | +# nrpe_compat.add_check( |
691 | +# "myservice_other", |
692 | +# "Check for widget failures", |
693 | +# check_cmd = "/srv/myapp/scripts/widget_check" |
694 | +# ) |
695 | +# nrpe_compat.write() |
696 | +# |
697 | +# def config_changed(): |
698 | +# (...) |
699 | +# update_nrpe_config() |
700 | +# |
701 | +# def nrpe_external_master_relation_changed(): |
702 | +# update_nrpe_config() |
703 | +# |
704 | +# def local_monitors_relation_changed(): |
705 | +# update_nrpe_config() |
706 | +# |
707 | +# 5. ln -s hooks.py nrpe-external-master-relation-changed |
708 | +# ln -s hooks.py local-monitors-relation-changed |
709 | + |
710 | + |
711 | +class CheckException(Exception): |
712 | + pass |
713 | + |
714 | + |
715 | +class Check(object): |
716 | + shortname_re = '[A-Za-z0-9-_]+$' |
717 | + service_template = (""" |
718 | +#--------------------------------------------------- |
719 | +# This file is Juju managed |
720 | +#--------------------------------------------------- |
721 | +define service {{ |
722 | + use active-service |
723 | + host_name {nagios_hostname} |
724 | + service_description {nagios_hostname}[{shortname}] """ |
725 | + """{description} |
726 | + check_command check_nrpe!{command} |
727 | + servicegroups {nagios_servicegroup} |
728 | +}} |
729 | +""") |
730 | + |
731 | + def __init__(self, shortname, description, check_cmd): |
732 | + super(Check, self).__init__() |
733 | + # XXX: could be better to calculate this from the service name |
734 | + if not re.match(self.shortname_re, shortname): |
735 | + raise CheckException("shortname must match {}".format( |
736 | + Check.shortname_re)) |
737 | + self.shortname = shortname |
738 | + self.command = "check_{}".format(shortname) |
739 | + # Note: a set of invalid characters is defined by the |
740 | + # Nagios server config |
741 | + # The default is: illegal_object_name_chars=`~!$%^&*"|'<>?,()= |
742 | + self.description = description |
743 | + self.check_cmd = self._locate_cmd(check_cmd) |
744 | + |
745 | + def _locate_cmd(self, check_cmd): |
746 | + search_path = ( |
747 | + '/', |
748 | + os.path.join(os.environ['CHARM_DIR'], |
749 | + 'files/nrpe-external-master'), |
750 | + '/usr/lib/nagios/plugins', |
751 | + '/usr/local/lib/nagios/plugins', |
752 | + ) |
753 | + parts = shlex.split(check_cmd) |
754 | + for path in search_path: |
755 | + if os.path.exists(os.path.join(path, parts[0])): |
756 | + command = os.path.join(path, parts[0]) |
757 | + if len(parts) > 1: |
758 | + command += " " + " ".join(parts[1:]) |
759 | + return command |
760 | + log('Check command not found: {}'.format(parts[0])) |
761 | + return '' |
762 | + |
763 | + def write(self, nagios_context, hostname): |
764 | + nrpe_check_file = '/etc/nagios/nrpe.d/{}.cfg'.format( |
765 | + self.command) |
766 | + with open(nrpe_check_file, 'w') as nrpe_check_config: |
767 | + nrpe_check_config.write("# check {}\n".format(self.shortname)) |
768 | + nrpe_check_config.write("command[{}]={}\n".format( |
769 | + self.command, self.check_cmd)) |
770 | + |
771 | + if not os.path.exists(NRPE.nagios_exportdir): |
772 | + log('Not writing service config as {} is not accessible'.format( |
773 | + NRPE.nagios_exportdir)) |
774 | + else: |
775 | + self.write_service_config(nagios_context, hostname) |
776 | + |
777 | + def write_service_config(self, nagios_context, hostname): |
778 | + for f in os.listdir(NRPE.nagios_exportdir): |
779 | + if re.search('.*{}.cfg'.format(self.command), f): |
780 | + os.remove(os.path.join(NRPE.nagios_exportdir, f)) |
781 | + |
782 | + templ_vars = { |
783 | + 'nagios_hostname': hostname, |
784 | + 'nagios_servicegroup': nagios_context, |
785 | + 'description': self.description, |
786 | + 'shortname': self.shortname, |
787 | + 'command': self.command, |
788 | + } |
789 | + nrpe_service_text = Check.service_template.format(**templ_vars) |
790 | + nrpe_service_file = '{}/service__{}_{}.cfg'.format( |
791 | + NRPE.nagios_exportdir, hostname, self.command) |
792 | + with open(nrpe_service_file, 'w') as nrpe_service_config: |
793 | + nrpe_service_config.write(str(nrpe_service_text)) |
794 | + |
795 | + def run(self): |
796 | + subprocess.call(self.check_cmd) |
797 | + |
798 | + |
799 | +class NRPE(object): |
800 | + nagios_logdir = '/var/log/nagios' |
801 | + nagios_exportdir = '/var/lib/nagios/export' |
802 | + nrpe_confdir = '/etc/nagios/nrpe.d' |
803 | + |
804 | + def __init__(self, hostname=None): |
805 | + super(NRPE, self).__init__() |
806 | + self.config = config() |
807 | + self.nagios_context = self.config['nagios_context'] |
808 | + self.unit_name = local_unit().replace('/', '-') |
809 | + if hostname: |
810 | + self.hostname = hostname |
811 | + else: |
812 | + self.hostname = "{}-{}".format(self.nagios_context, self.unit_name) |
813 | + self.checks = [] |
814 | + |
815 | + def add_check(self, *args, **kwargs): |
816 | + self.checks.append(Check(*args, **kwargs)) |
817 | + |
818 | + def write(self): |
819 | + try: |
820 | + nagios_uid = pwd.getpwnam('nagios').pw_uid |
821 | + nagios_gid = grp.getgrnam('nagios').gr_gid |
822 | + except: |
823 | + log("Nagios user not set up, nrpe checks not updated") |
824 | + return |
825 | + |
826 | + if not os.path.exists(NRPE.nagios_logdir): |
827 | + os.mkdir(NRPE.nagios_logdir) |
828 | + os.chown(NRPE.nagios_logdir, nagios_uid, nagios_gid) |
829 | + |
830 | + nrpe_monitors = {} |
831 | + monitors = {"monitors": {"remote": {"nrpe": nrpe_monitors}}} |
832 | + for nrpecheck in self.checks: |
833 | + nrpecheck.write(self.nagios_context, self.hostname) |
834 | + nrpe_monitors[nrpecheck.shortname] = { |
835 | + "command": nrpecheck.command, |
836 | + } |
837 | + |
838 | + service('restart', 'nagios-nrpe-server') |
839 | + |
840 | + for rid in relation_ids("local-monitors"): |
841 | + relation_set(relation_id=rid, monitors=yaml.dump(monitors)) |
842 | |
843 | === added file 'hooks/charmhelpers/contrib/charmsupport/volumes.py' |
844 | --- hooks/charmhelpers/contrib/charmsupport/volumes.py 1970-01-01 00:00:00 +0000 |
845 | +++ hooks/charmhelpers/contrib/charmsupport/volumes.py 2014-11-17 02:28:34 +0000 |
846 | @@ -0,0 +1,156 @@ |
847 | +''' |
848 | +Functions for managing volumes in juju units. One volume is supported per unit. |
849 | +Subordinates may have their own storage, provided it is on its own partition. |
850 | + |
851 | +Configuration stanzas: |
852 | + volume-ephemeral: |
853 | + type: boolean |
854 | + default: true |
855 | + description: > |
856 | + If false, a volume is mounted as specified in "volume-map" |
857 | + If true, ephemeral storage will be used, meaning that log data |
858 | + will only exist as long as the machine. YOU HAVE BEEN WARNED. |
859 | + volume-map: |
860 | + type: string |
861 | + default: {} |
862 | + description: > |
863 | + YAML map of units to device names, e.g: |
864 | + "{ rsyslog/0: /dev/vdb, rsyslog/1: /dev/vdb }" |
865 | + Service units will raise a configure-error if volume-ephemeral |
866 | + is 'false' and no volume-map value is set. Use 'juju set' to set a |
867 | + value and 'juju resolved' to complete configuration. |
868 | + |
869 | +Usage: |
870 | + from charmsupport.volumes import configure_volume, VolumeConfigurationError |
871 | + from charmsupport.hookenv import log, ERROR |
872 | + def pre_mount_hook(): |
873 | + stop_service('myservice') |
874 | + def post_mount_hook(): |
875 | + start_service('myservice') |
876 | + |
877 | + if __name__ == '__main__': |
878 | + try: |
879 | + configure_volume(before_change=pre_mount_hook, |
880 | + after_change=post_mount_hook) |
881 | + except VolumeConfigurationError: |
882 | + log('Storage could not be configured', ERROR) |
883 | +''' |
884 | + |
885 | +# XXX: Known limitations |
886 | +# - fstab is neither consulted nor updated |
887 | + |
888 | +import os |
889 | +from charmhelpers.core import hookenv |
890 | +from charmhelpers.core import host |
891 | +import yaml |
892 | + |
893 | + |
894 | +MOUNT_BASE = '/srv/juju/volumes' |
895 | + |
896 | + |
897 | +class VolumeConfigurationError(Exception): |
898 | + '''Volume configuration data is missing or invalid''' |
899 | + pass |
900 | + |
901 | + |
902 | +def get_config(): |
903 | + '''Gather and sanity-check volume configuration data''' |
904 | + volume_config = {} |
905 | + config = hookenv.config() |
906 | + |
907 | + errors = False |
908 | + |
909 | + if config.get('volume-ephemeral') in (True, 'True', 'true', 'Yes', 'yes'): |
910 | + volume_config['ephemeral'] = True |
911 | + else: |
912 | + volume_config['ephemeral'] = False |
913 | + |
914 | + try: |
915 | + volume_map = yaml.safe_load(config.get('volume-map', '{}')) |
916 | + except yaml.YAMLError as e: |
917 | + hookenv.log("Error parsing YAML volume-map: {}".format(e), |
918 | + hookenv.ERROR) |
919 | + errors = True |
920 | + if volume_map is None: |
921 | + # probably an empty string |
922 | + volume_map = {} |
923 | + elif not isinstance(volume_map, dict): |
924 | + hookenv.log("Volume-map should be a dictionary, not {}".format( |
925 | + type(volume_map))) |
926 | + errors = True |
927 | + |
928 | + volume_config['device'] = volume_map.get(os.environ['JUJU_UNIT_NAME']) |
929 | + if volume_config['device'] and volume_config['ephemeral']: |
930 | + # asked for ephemeral storage but also defined a volume ID |
931 | + hookenv.log('A volume is defined for this unit, but ephemeral ' |
932 | + 'storage was requested', hookenv.ERROR) |
933 | + errors = True |
934 | + elif not volume_config['device'] and not volume_config['ephemeral']: |
935 | + # asked for permanent storage but did not define volume ID |
936 | + hookenv.log('Persistent storage was requested, but there is no volume ' |
937 | + 'defined for this unit.', hookenv.ERROR) |
938 | + errors = True |
939 | + |
940 | + unit_mount_name = hookenv.local_unit().replace('/', '-') |
941 | + volume_config['mountpoint'] = os.path.join(MOUNT_BASE, unit_mount_name) |
942 | + |
943 | + if errors: |
944 | + return None |
945 | + return volume_config |
946 | + |
947 | + |
948 | +def mount_volume(config): |
949 | + if os.path.exists(config['mountpoint']): |
950 | + if not os.path.isdir(config['mountpoint']): |
951 | + hookenv.log('Not a directory: {}'.format(config['mountpoint'])) |
952 | + raise VolumeConfigurationError() |
953 | + else: |
954 | + host.mkdir(config['mountpoint']) |
955 | + if os.path.ismount(config['mountpoint']): |
956 | + unmount_volume(config) |
957 | + if not host.mount(config['device'], config['mountpoint'], persist=True): |
958 | + raise VolumeConfigurationError() |
959 | + |
960 | + |
961 | +def unmount_volume(config): |
962 | + if os.path.ismount(config['mountpoint']): |
963 | + if not host.umount(config['mountpoint'], persist=True): |
964 | + raise VolumeConfigurationError() |
965 | + |
966 | + |
967 | +def managed_mounts(): |
968 | + '''List of all mounted managed volumes''' |
969 | + return filter(lambda mount: mount[0].startswith(MOUNT_BASE), host.mounts()) |
970 | + |
971 | + |
972 | +def configure_volume(before_change=lambda: None, after_change=lambda: None): |
973 | + '''Set up storage (or don't) according to the charm's volume configuration. |
974 | + Returns the mount point or "ephemeral". before_change and after_change |
975 | + are optional functions to be called if the volume configuration changes. |
976 | + ''' |
977 | + |
978 | + config = get_config() |
979 | + if not config: |
980 | + hookenv.log('Failed to read volume configuration', hookenv.CRITICAL) |
981 | + raise VolumeConfigurationError() |
982 | + |
983 | + if config['ephemeral']: |
984 | + if os.path.ismount(config['mountpoint']): |
985 | + before_change() |
986 | + unmount_volume(config) |
987 | + after_change() |
988 | + return 'ephemeral' |
989 | + else: |
990 | + # persistent storage |
991 | + if os.path.ismount(config['mountpoint']): |
992 | + mounts = dict(managed_mounts()) |
993 | + if mounts.get(config['mountpoint']) != config['device']: |
994 | + before_change() |
995 | + unmount_volume(config) |
996 | + mount_volume(config) |
997 | + after_change() |
998 | + else: |
999 | + before_change() |
1000 | + mount_volume(config) |
1001 | + after_change() |
1002 | + return config['mountpoint'] |
1003 | |
1004 | === added symlink 'hooks/nrpe-external-master-relation-changed' |
1005 | === target is u'ceilometer_hooks.py' |
1006 | === added symlink 'hooks/nrpe-external-master-relation-joined' |
1007 | === target is u'ceilometer_hooks.py' |
1008 | === modified file 'metadata.yaml' |
1009 | --- metadata.yaml 2013-10-20 22:30:27 +0000 |
1010 | +++ metadata.yaml 2014-11-17 02:28:34 +0000 |
1011 | @@ -12,6 +12,9 @@ |
1012 | - miscellaneous |
1013 | - openstack |
1014 | provides: |
1015 | + nrpe-external-master: |
1016 | + interface: nrpe-external-master |
1017 | + scope: container |
1018 | ceilometer-service: |
1019 | interface: ceilometer |
1020 | requires: |
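The combinations of volume-ephemeral and volume-map that get_config() in volumes.py accepts or rejects can be sketched as a standalone function. This is only an illustration under stated assumptions, not the charm-helpers code: validate_volume_config and its arguments are hypothetical names, and the volume map is passed in as an already-parsed dict rather than read from the charm config.

```python
def validate_volume_config(ephemeral, volume_map, unit_name):
    """Return (device, error) for one unit.

    Hypothetical helper mirroring the checks in get_config():
    ephemeral storage and a mapped device are mutually exclusive,
    and persistent storage requires a mapped device.
    """
    if not isinstance(volume_map, dict):
        return None, "volume-map should be a dictionary"

    device = volume_map.get(unit_name)
    if device and ephemeral:
        # Ephemeral storage requested, but a device is also mapped.
        return device, "volume defined but ephemeral storage requested"
    if not device and not ephemeral:
        # Persistent storage requested, but no device is mapped.
        return None, "persistent storage requested but no volume defined"
    return device, None


# Usage: only two of the four combinations are valid.
ephemeral_ok = validate_volume_config(True, {}, "rsyslog/0")
persistent_ok = validate_volume_config(
    False, {"rsyslog/0": "/dev/vdb"}, "rsyslog/0")
```

Returning the error alongside the device, rather than raising, keeps the sketch close to get_config(), which logs each problem and returns None once at the end.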
UOSCI bot says:
charm_lint_check #992 trusty-ceilometer for brad-marshall mp241498
LINT FAIL: lint-test failed
LINT Results (max last 5 lines):
ERROR:root:Make target returned non-zero.
hooks/ceilometer_hooks.py:146:80: E501 line too long (92 > 79 characters)
hooks/ceilometer_hooks.py:174:22: E251 unexpected spaces around keyword / parameter equals
hooks/ceilometer_hooks.py:174:24: E251 unexpected spaces around keyword / parameter equals
make: *** [lint] Error 1
Full lint test output: http://paste.ubuntu.com/8955755/
Build: http://10.98.191.181:8080/job/charm_lint_check/992/
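For reference, E251 is reported when a keyword argument is written with spaces around its equals sign. The offending line 174 of ceilometer_hooks.py is not shown in this output, so the following is a hypothetical illustration only: relation_set here is a local stand-in defined for the example, not the charm-helpers function, and the argument values are made up.

```python
def relation_set(relation_id=None, **settings):
    # Hypothetical stand-in so the example runs on its own; it simply
    # returns what it was given.
    return relation_id, settings


# Flagged by E251 (spaces around '=' in a keyword argument):
#     relation_set(relation_id = "rid:0", monitors="{}")
# Conforming form, with no spaces around '=':
result = relation_set(relation_id="rid:0", monitors="{}")
```

The E501 report is usually resolved in the same pass by breaking the long call across lines inside its parentheses.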