Merge lp:~gnuoy/charms/trusty/ceph/add-nrpe-checks into lp:~openstack-charmers-archive/charms/trusty/ceph/next

Proposed by Liam Young
Status: Merged
Merged at revision: 93
Proposed branch: lp:~gnuoy/charms/trusty/ceph/add-nrpe-checks
Merge into: lp:~openstack-charmers-archive/charms/trusty/ceph/next
Diff against target: 830 lines (+689/-6)
12 files modified
charm-helpers-hooks.yaml (+1/-0)
config.yaml (+9/-0)
files/nagios/check_ceph_status.py (+44/-0)
files/nagios/collect_ceph_status.sh (+18/-0)
hooks/charmhelpers/contrib/charmsupport/nrpe.py (+308/-0)
hooks/charmhelpers/contrib/charmsupport/volumes.py (+159/-0)
hooks/charmhelpers/contrib/storage/linux/ceph.py (+43/-0)
hooks/charmhelpers/core/decorators.py (+41/-0)
hooks/charmhelpers/core/host.py (+7/-4)
hooks/charmhelpers/fetch/__init__.py (+8/-1)
hooks/hooks.py (+44/-1)
metadata.yaml (+7/-0)
To merge this branch: bzr merge lp:~gnuoy/charms/trusty/ceph/add-nrpe-checks
Reviewer | Review Type | Date Requested | Status
Liam Young (community) | | | Approve
Review via email: mp+246155@code.launchpad.net

Description of the change

Add NRPE support. Based on a branch from bradm, with a few tweaks.

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #705 ceph-next for gnuoy mp246155
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/705/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #676 ceph-next for gnuoy mp246155
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/676/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #861 ceph-next for gnuoy mp246155
    AMULET OK: passed

Build: http://10.245.162.77:8080/job/charm_amulet_test/861/

Liam Young (gnuoy) wrote :

<jamespage> gnuoy, as they are re-syncs + tweaks to the nrpe stuff in the charms, I'm happy to give a conditional +1 across the board based on osci checking things out OK
<gnuoy> jamespage, I'll take that! thanks
...
<gnuoy> jamespage, osci is still working through. But on the subject of those mps, does your +1 stand for branches with no amulet tests?
<jamespage> gnuoy, yes

review: Approve

Preview Diff

1=== modified file 'charm-helpers-hooks.yaml'
2--- charm-helpers-hooks.yaml 2014-12-11 16:46:45 +0000
3+++ charm-helpers-hooks.yaml 2015-01-12 14:00:31 +0000
4@@ -9,3 +9,4 @@
5 - payload.execd
6 - contrib.openstack.alternatives
7 - contrib.network.ip
8+ - contrib.charmsupport
9
10=== modified file 'config.yaml'
11--- config.yaml 2014-11-25 18:29:07 +0000
12+++ config.yaml 2015-01-12 14:00:31 +0000
13@@ -161,3 +161,12 @@
14 description: |
15 YAML-formatted associative array of sysctl key/value pairs to be set
16 persistently e.g. '{ kernel.pid_max : 4194303 }'.
17+ nagios_context:
18+ default: "juju"
19+ description: |
20+ Used by the nrpe-external-master subordinate charm.
21+ A string that will be prepended to instance name to set the host name
22+ in nagios. So for instance the hostname would be something like:
23+ juju-myservice-0
24+ If you're running multiple environments with the same services in them
25+ this allows you to differentiate between them.
26
27=== added directory 'files/nagios'
28=== added file 'files/nagios/check_ceph_status.py'
29--- files/nagios/check_ceph_status.py 1970-01-01 00:00:00 +0000
30+++ files/nagios/check_ceph_status.py 2015-01-12 14:00:31 +0000
31@@ -0,0 +1,44 @@
32+#!/usr/bin/env python
33+
34+# Copyright (C) 2014 Canonical
35+# All Rights Reserved
36+# Author: Jacek Nykis <jacek.nykis@canonical.com>
37+
38+import re
39+import argparse
40+import subprocess
41+import nagios_plugin
42+
43+
44+def check_ceph_status(args):
45+ if args.status_file:
46+ nagios_plugin.check_file_freshness(args.status_file, 3600)
47+ with open(args.status_file, "r") as f:
48+ lines = f.readlines()
49+ status_data = dict(l.strip().split(' ', 1) for l in lines if len(l) > 1)
50+ else:
51+ lines = subprocess.check_output(["ceph", "status"]).split('\n')
52+ status_data = dict(l.strip().split(' ', 1) for l in lines if len(l) > 1)
53+
54+ if ('health' not in status_data
55+ or 'monmap' not in status_data
56+ or 'osdmap' not in status_data):
57+ raise nagios_plugin.UnknownError('UNKNOWN: status data is incomplete')
58+
59+ if status_data['health'] != 'HEALTH_OK':
60+ msg = 'CRITICAL: ceph health status: "{}"'.format(status_data['health'])
61+ raise nagios_plugin.CriticalError(msg)
62+ osds = re.search("^.*: (\d+) osds: (\d+) up, (\d+) in", status_data['osdmap'])
63+ if int(osds.group(1)) > int(osds.group(2)): # not all OSDs are "up"
64+ msg = 'CRITICAL: Some OSDs are not up. Total: {}, up: {}'.format(
65+ osds.group(1), osds.group(2))
66+ raise nagios_plugin.CriticalError(msg)
67+ print "All OK"
68+
69+
70+if __name__ == '__main__':
71+ parser = argparse.ArgumentParser(description='Check ceph status')
72+ parser.add_argument('-f', '--file', dest='status_file',
73+ default=False, help='Optional file with "ceph status" output')
74+ args = parser.parse_args()
75+ nagios_plugin.try_check(check_ceph_status, args)
76
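Note on the OSD comparison in check_ceph_status.py above: re match groups are strings, so comparing them directly is lexicographic ('10' > '9' is False). The following is a minimal standalone sketch of the same parsing idea with the counts converted to integers. The sample status text and the helper names parse_status and osd_counts are illustrative assumptions, not part of the branch.

```python
import re

# Illustrative sample only; real "ceph status" output varies by release.
SAMPLE_STATUS = """\
    cluster 00000000-0000-0000-0000-000000000000
     health HEALTH_OK
     monmap e1: 1 mons at {a=10.0.0.1:6789/0}, election epoch 2, quorum 0 a
     osdmap e15: 10 osds: 9 up, 10 in
"""


def parse_status(text):
    """Split each non-empty line into a (first word, remainder) pair."""
    pairs = {}
    for line in text.splitlines():
        parts = line.strip().split(' ', 1)
        if len(parts) == 2:  # skip blank or single-word lines
            pairs[parts[0]] = parts[1]
    return pairs


def osd_counts(osdmap):
    """Return (total, up, in) as integers, or None if the line is unparsable."""
    match = re.search(r"(\d+) osds: (\d+) up, (\d+) in", osdmap)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


status = parse_status(SAMPLE_STATUS)
total, up, in_cluster = osd_counts(status['osdmap'])
if total > up:
    print("CRITICAL: Some OSDs are not up. Total: {}, up: {}".format(total, up))
```

Returning None for an unparsable osdmap line lets the caller report UNKNOWN instead of failing on a missing match object.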
77=== added file 'files/nagios/collect_ceph_status.sh'
78--- files/nagios/collect_ceph_status.sh 1970-01-01 00:00:00 +0000
79+++ files/nagios/collect_ceph_status.sh 2015-01-12 14:00:31 +0000
80@@ -0,0 +1,18 @@
81+#!/bin/bash
82+# Copyright (C) 2014 Canonical
83+# All Rights Reserved
84+# Author: Jacek Nykis <jacek.nykis@canonical.com>
85+
86+LOCK=/var/lock/ceph-status.lock
87+lockfile-create -r2 --lock-name $LOCK > /dev/null 2>&1
88+if [ $? -ne 0 ]; then
89+ exit 1
90+fi
91+trap "rm -f $LOCK > /dev/null 2>&1" exit
92+
93+DATA_DIR="/var/lib/nagios"
94+if [ ! -d $DATA_DIR ]; then
95+ mkdir -p $DATA_DIR
96+fi
97+
98+ceph status >${DATA_DIR}/cat-ceph-status.txt
99
100=== added directory 'hooks/charmhelpers/contrib/charmsupport'
101=== added file 'hooks/charmhelpers/contrib/charmsupport/__init__.py'
102=== added file 'hooks/charmhelpers/contrib/charmsupport/nrpe.py'
103--- hooks/charmhelpers/contrib/charmsupport/nrpe.py 1970-01-01 00:00:00 +0000
104+++ hooks/charmhelpers/contrib/charmsupport/nrpe.py 2015-01-12 14:00:31 +0000
105@@ -0,0 +1,308 @@
106+"""Compatibility with the nrpe-external-master charm"""
107+# Copyright 2012 Canonical Ltd.
108+#
109+# Authors:
110+# Matthew Wedgwood <matthew.wedgwood@canonical.com>
111+
112+import subprocess
113+import pwd
114+import grp
115+import os
116+import re
117+import shlex
118+import yaml
119+
120+from charmhelpers.core.hookenv import (
121+ config,
122+ local_unit,
123+ log,
124+ relation_ids,
125+ relation_set,
126+ relations_of_type,
127+)
128+
129+from charmhelpers.core.host import service
130+
131+# This module adds compatibility with the nrpe-external-master and plain nrpe
132+# subordinate charms. To use it in your charm:
133+#
134+# 1. Update metadata.yaml
135+#
136+# provides:
137+# (...)
138+# nrpe-external-master:
139+# interface: nrpe-external-master
140+# scope: container
141+#
142+# and/or
143+#
144+# provides:
145+# (...)
146+# local-monitors:
147+# interface: local-monitors
148+# scope: container
149+
150+#
151+# 2. Add the following to config.yaml
152+#
153+# nagios_context:
154+# default: "juju"
155+# type: string
156+# description: |
157+# Used by the nrpe subordinate charms.
158+# A string that will be prepended to instance name to set the host name
159+# in nagios. So for instance the hostname would be something like:
160+# juju-myservice-0
161+# If you're running multiple environments with the same services in them
162+# this allows you to differentiate between them.
163+# nagios_servicegroups:
164+# default: ""
165+# type: string
166+# description: |
167+# A comma-separated list of nagios servicegroups.
168+# If left empty, the nagios_context will be used as the servicegroup
169+#
170+# 3. Add custom checks (Nagios plugins) to files/nrpe-external-master
171+#
172+# 4. Update your hooks.py with something like this:
173+#
174+# from charmsupport.nrpe import NRPE
175+# (...)
176+# def update_nrpe_config():
177+# nrpe_compat = NRPE()
178+# nrpe_compat.add_check(
179+# shortname = "myservice",
180+# description = "Check MyService",
181+# check_cmd = "check_http -w 2 -c 10 http://localhost"
182+# )
183+# nrpe_compat.add_check(
184+# "myservice_other",
185+# "Check for widget failures",
186+# check_cmd = "/srv/myapp/scripts/widget_check"
187+# )
188+# nrpe_compat.write()
189+#
190+# def config_changed():
191+# (...)
192+# update_nrpe_config()
193+#
194+# def nrpe_external_master_relation_changed():
195+# update_nrpe_config()
196+#
197+# def local_monitors_relation_changed():
198+# update_nrpe_config()
199+#
200+# 5. ln -s hooks.py nrpe-external-master-relation-changed
201+# ln -s hooks.py local-monitors-relation-changed
202+
203+
204+class CheckException(Exception):
205+ pass
206+
207+
208+class Check(object):
209+ shortname_re = '[A-Za-z0-9-_]+$'
210+ service_template = ("""
211+#---------------------------------------------------
212+# This file is Juju managed
213+#---------------------------------------------------
214+define service {{
215+ use active-service
216+ host_name {nagios_hostname}
217+ service_description {nagios_hostname}[{shortname}] """
218+ """{description}
219+ check_command check_nrpe!{command}
220+ servicegroups {nagios_servicegroup}
221+}}
222+""")
223+
224+ def __init__(self, shortname, description, check_cmd):
225+ super(Check, self).__init__()
226+ # XXX: could be better to calculate this from the service name
227+ if not re.match(self.shortname_re, shortname):
228+ raise CheckException("shortname must match {}".format(
229+ Check.shortname_re))
230+ self.shortname = shortname
231+ self.command = "check_{}".format(shortname)
232+ # Note: a set of invalid characters is defined by the
233+ # Nagios server config
234+ # The default is: illegal_object_name_chars=`~!$%^&*"|'<>?,()=
235+ self.description = description
236+ self.check_cmd = self._locate_cmd(check_cmd)
237+
238+ def _locate_cmd(self, check_cmd):
239+ search_path = (
240+ '/usr/lib/nagios/plugins',
241+ '/usr/local/lib/nagios/plugins',
242+ )
243+ parts = shlex.split(check_cmd)
244+ for path in search_path:
245+ if os.path.exists(os.path.join(path, parts[0])):
246+ command = os.path.join(path, parts[0])
247+ if len(parts) > 1:
248+ command += " " + " ".join(parts[1:])
249+ return command
250+ log('Check command not found: {}'.format(parts[0]))
251+ return ''
252+
253+ def write(self, nagios_context, hostname, nagios_servicegroups=None):
254+ nrpe_check_file = '/etc/nagios/nrpe.d/{}.cfg'.format(
255+ self.command)
256+ with open(nrpe_check_file, 'w') as nrpe_check_config:
257+ nrpe_check_config.write("# check {}\n".format(self.shortname))
258+ nrpe_check_config.write("command[{}]={}\n".format(
259+ self.command, self.check_cmd))
260+
261+ if not os.path.exists(NRPE.nagios_exportdir):
262+ log('Not writing service config as {} is not accessible'.format(
263+ NRPE.nagios_exportdir))
264+ else:
265+ self.write_service_config(nagios_context, hostname,
266+ nagios_servicegroups)
267+
268+ def write_service_config(self, nagios_context, hostname,
269+ nagios_servicegroups=None):
270+ for f in os.listdir(NRPE.nagios_exportdir):
271+ if re.search('.*{}.cfg'.format(self.command), f):
272+ os.remove(os.path.join(NRPE.nagios_exportdir, f))
273+
274+ if not nagios_servicegroups:
275+ nagios_servicegroups = nagios_context
276+
277+ templ_vars = {
278+ 'nagios_hostname': hostname,
279+ 'nagios_servicegroup': nagios_servicegroups,
280+ 'description': self.description,
281+ 'shortname': self.shortname,
282+ 'command': self.command,
283+ }
284+ nrpe_service_text = Check.service_template.format(**templ_vars)
285+ nrpe_service_file = '{}/service__{}_{}.cfg'.format(
286+ NRPE.nagios_exportdir, hostname, self.command)
287+ with open(nrpe_service_file, 'w') as nrpe_service_config:
288+ nrpe_service_config.write(str(nrpe_service_text))
289+
290+ def run(self):
291+ subprocess.call(self.check_cmd)
292+
293+
294+class NRPE(object):
295+ nagios_logdir = '/var/log/nagios'
296+ nagios_exportdir = '/var/lib/nagios/export'
297+ nrpe_confdir = '/etc/nagios/nrpe.d'
298+
299+ def __init__(self, hostname=None):
300+ super(NRPE, self).__init__()
301+ self.config = config()
302+ self.nagios_context = self.config['nagios_context']
303+ if 'nagios_servicegroups' in self.config:
304+ self.nagios_servicegroups = self.config['nagios_servicegroups']
305+ else:
306+ self.nagios_servicegroups = 'juju'
307+ self.unit_name = local_unit().replace('/', '-')
308+ if hostname:
309+ self.hostname = hostname
310+ else:
311+ self.hostname = "{}-{}".format(self.nagios_context, self.unit_name)
312+ self.checks = []
313+
314+ def add_check(self, *args, **kwargs):
315+ self.checks.append(Check(*args, **kwargs))
316+
317+ def write(self):
318+ try:
319+ nagios_uid = pwd.getpwnam('nagios').pw_uid
320+ nagios_gid = grp.getgrnam('nagios').gr_gid
321+ except:
322+ log("Nagios user not set up, nrpe checks not updated")
323+ return
324+
325+ if not os.path.exists(NRPE.nagios_logdir):
326+ os.mkdir(NRPE.nagios_logdir)
327+ os.chown(NRPE.nagios_logdir, nagios_uid, nagios_gid)
328+
329+ nrpe_monitors = {}
330+ monitors = {"monitors": {"remote": {"nrpe": nrpe_monitors}}}
331+ for nrpecheck in self.checks:
332+ nrpecheck.write(self.nagios_context, self.hostname,
333+ self.nagios_servicegroups)
334+ nrpe_monitors[nrpecheck.shortname] = {
335+ "command": nrpecheck.command,
336+ }
337+
338+ service('restart', 'nagios-nrpe-server')
339+
340+ for rid in relation_ids("local-monitors"):
341+ relation_set(relation_id=rid, monitors=yaml.dump(monitors))
342+
343+
344+def get_nagios_hostcontext(relation_name='nrpe-external-master'):
345+ """
346+ Query relation with nrpe subordinate, return the nagios_host_context
347+
348+ :param str relation_name: Name of relation nrpe sub joined to
349+ """
350+ for rel in relations_of_type(relation_name):
351+ if 'nagios_host_context' in rel:
352+ return rel['nagios_host_context']
353+
354+
355+def get_nagios_hostname(relation_name='nrpe-external-master'):
356+ """
357+ Query relation with nrpe subordinate, return the nagios_hostname
358+
359+ :param str relation_name: Name of relation nrpe sub joined to
360+ """
361+ for rel in relations_of_type(relation_name):
362+ if 'nagios_hostname' in rel:
363+ return rel['nagios_hostname']
364+
365+
366+def get_nagios_unit_name(relation_name='nrpe-external-master'):
367+ """
368+ Return the nagios unit name prepended with host_context if needed
369+
370+ :param str relation_name: Name of relation nrpe sub joined to
371+ """
372+ host_context = get_nagios_hostcontext(relation_name)
373+ if host_context:
374+ unit = "%s:%s" % (host_context, local_unit())
375+ else:
376+ unit = local_unit()
377+ return unit
378+
379+
380+def add_init_service_checks(nrpe, services, unit_name):
381+ """
382+ Add checks for each service in list
383+
384+ :param NRPE nrpe: NRPE object to add check to
385+ :param list services: List of services to check
386+ :param str unit_name: Unit name to use in check description
387+ """
388+ for svc in services:
389+ upstart_init = '/etc/init/%s.conf' % svc
390+ sysv_init = '/etc/init.d/%s' % svc
391+ if os.path.exists(upstart_init):
392+ nrpe.add_check(
393+ shortname=svc,
394+ description='process check {%s}' % unit_name,
395+ check_cmd='check_upstart_job %s' % svc
396+ )
397+ elif os.path.exists(sysv_init):
398+ cronpath = '/etc/cron.d/nagios-service-check-%s' % svc
399+ cron_file = ('*/5 * * * * root '
400+ '/usr/local/lib/nagios/plugins/check_exit_status.pl '
401+ '-s /etc/init.d/%s status > '
402+ '/var/lib/nagios/service-check-%s.txt\n' % (svc,
403+ svc)
404+ )
405+ f = open(cronpath, 'w')
406+ f.write(cron_file)
407+ f.close()
408+ nrpe.add_check(
409+ shortname=svc,
410+ description='process check {%s}' % unit_name,
411+ check_cmd='check_status_file.py -f '
412+ '/var/lib/nagios/service-check-%s.txt' % svc,
413+ )
414
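Note on nrpe.py above: Check.write() emits one "command[...]" line per check into /etc/nagios/nrpe.d. The sketch below reproduces only that formatting step so the resulting line for the ceph check can be seen in isolation. The helper name nrpe_command_line is hypothetical, the plugin path is assumed to have been resolved already (the real class searches the plugin directories), and the pattern is an equivalent rewrite of shortname_re with the hyphen moved to the end of the character class.

```python
import re

# Equivalent to Check.shortname_re, written with the hyphen last so it is
# unambiguously a literal character.
SHORTNAME_RE = r'[A-Za-z0-9_-]+$'


def nrpe_command_line(shortname, resolved_cmd):
    """Return the config line for a check, mirroring the format used above."""
    if not re.match(SHORTNAME_RE, shortname):
        raise ValueError("shortname must match {}".format(SHORTNAME_RE))
    return "command[check_{}]={}\n".format(shortname, resolved_cmd)


line = nrpe_command_line(
    "ceph",
    "/usr/local/lib/nagios/plugins/check_ceph_status.py "
    "-f /var/lib/nagios/cat-ceph-status.txt",
)
print(line, end="")
```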
415=== added file 'hooks/charmhelpers/contrib/charmsupport/volumes.py'
416--- hooks/charmhelpers/contrib/charmsupport/volumes.py 1970-01-01 00:00:00 +0000
417+++ hooks/charmhelpers/contrib/charmsupport/volumes.py 2015-01-12 14:00:31 +0000
418@@ -0,0 +1,159 @@
419+'''
420+Functions for managing volumes in juju units. One volume is supported per unit.
421+Subordinates may have their own storage, provided it is on its own partition.
422+
423+Configuration stanzas::
424+
425+ volume-ephemeral:
426+ type: boolean
427+ default: true
428+ description: >
429+ If false, a volume is mounted as specified in "volume-map"
430+ If true, ephemeral storage will be used, meaning that log data
431+ will only exist as long as the machine. YOU HAVE BEEN WARNED.
432+ volume-map:
433+ type: string
434+ default: {}
435+ description: >
436+ YAML map of units to device names, e.g:
437+ "{ rsyslog/0: /dev/vdb, rsyslog/1: /dev/vdb }"
438+ Service units will raise a configure-error if volume-ephemeral
439+ is 'false' and no volume-map value is set. Use 'juju set' to set a
440+ value and 'juju resolved' to complete configuration.
441+
442+Usage::
443+
444+ from charmsupport.volumes import configure_volume, VolumeConfigurationError
445+ from charmsupport.hookenv import log, ERROR
446+ def pre_mount_hook():
447+ stop_service('myservice')
448+ def post_mount_hook():
449+ start_service('myservice')
450+
451+ if __name__ == '__main__':
452+ try:
453+ configure_volume(before_change=pre_mount_hook,
454+ after_change=post_mount_hook)
455+ except VolumeConfigurationError:
456+ log('Storage could not be configured', ERROR)
457+
458+'''
459+
460+# XXX: Known limitations
461+# - fstab is neither consulted nor updated
462+
463+import os
464+from charmhelpers.core import hookenv
465+from charmhelpers.core import host
466+import yaml
467+
468+
469+MOUNT_BASE = '/srv/juju/volumes'
470+
471+
472+class VolumeConfigurationError(Exception):
473+ '''Volume configuration data is missing or invalid'''
474+ pass
475+
476+
477+def get_config():
478+ '''Gather and sanity-check volume configuration data'''
479+ volume_config = {}
480+ config = hookenv.config()
481+
482+ errors = False
483+
484+ if config.get('volume-ephemeral') in (True, 'True', 'true', 'Yes', 'yes'):
485+ volume_config['ephemeral'] = True
486+ else:
487+ volume_config['ephemeral'] = False
488+
489+ try:
490+ volume_map = yaml.safe_load(config.get('volume-map', '{}'))
491+ except yaml.YAMLError as e:
492+ hookenv.log("Error parsing YAML volume-map: {}".format(e),
493+ hookenv.ERROR)
494+ errors = True
495+ if volume_map is None:
496+ # probably an empty string
497+ volume_map = {}
498+ elif not isinstance(volume_map, dict):
499+ hookenv.log("Volume-map should be a dictionary, not {}".format(
500+ type(volume_map)))
501+ errors = True
502+
503+ volume_config['device'] = volume_map.get(os.environ['JUJU_UNIT_NAME'])
504+ if volume_config['device'] and volume_config['ephemeral']:
505+ # asked for ephemeral storage but also defined a volume ID
506+ hookenv.log('A volume is defined for this unit, but ephemeral '
507+ 'storage was requested', hookenv.ERROR)
508+ errors = True
509+ elif not volume_config['device'] and not volume_config['ephemeral']:
510+ # asked for permanent storage but did not define volume ID
511+ hookenv.log('Persistent storage was requested, but there is no volume '
512+ 'defined for this unit.', hookenv.ERROR)
513+ errors = True
514+
515+ unit_mount_name = hookenv.local_unit().replace('/', '-')
516+ volume_config['mountpoint'] = os.path.join(MOUNT_BASE, unit_mount_name)
517+
518+ if errors:
519+ return None
520+ return volume_config
521+
522+
523+def mount_volume(config):
524+ if os.path.exists(config['mountpoint']):
525+ if not os.path.isdir(config['mountpoint']):
526+ hookenv.log('Not a directory: {}'.format(config['mountpoint']))
527+ raise VolumeConfigurationError()
528+ else:
529+ host.mkdir(config['mountpoint'])
530+ if os.path.ismount(config['mountpoint']):
531+ unmount_volume(config)
532+ if not host.mount(config['device'], config['mountpoint'], persist=True):
533+ raise VolumeConfigurationError()
534+
535+
536+def unmount_volume(config):
537+ if os.path.ismount(config['mountpoint']):
538+ if not host.umount(config['mountpoint'], persist=True):
539+ raise VolumeConfigurationError()
540+
541+
542+def managed_mounts():
543+ '''List of all mounted managed volumes'''
544+ return filter(lambda mount: mount[0].startswith(MOUNT_BASE), host.mounts())
545+
546+
547+def configure_volume(before_change=lambda: None, after_change=lambda: None):
548+ '''Set up storage (or don't) according to the charm's volume configuration.
549+ Returns the mount point or "ephemeral". before_change and after_change
550+ are optional functions to be called if the volume configuration changes.
551+ '''
552+
553+ config = get_config()
554+ if not config:
555+ hookenv.log('Failed to read volume configuration', hookenv.CRITICAL)
556+ raise VolumeConfigurationError()
557+
558+ if config['ephemeral']:
559+ if os.path.ismount(config['mountpoint']):
560+ before_change()
561+ unmount_volume(config)
562+ after_change()
563+ return 'ephemeral'
564+ else:
565+ # persistent storage
566+ if os.path.ismount(config['mountpoint']):
567+ mounts = dict(managed_mounts())
568+ if mounts.get(config['mountpoint']) != config['device']:
569+ before_change()
570+ unmount_volume(config)
571+ mount_volume(config)
572+ after_change()
573+ else:
574+ before_change()
575+ mount_volume(config)
576+ after_change()
577+ return config['mountpoint']
578
579=== modified file 'hooks/charmhelpers/contrib/storage/linux/ceph.py'
580--- hooks/charmhelpers/contrib/storage/linux/ceph.py 2014-11-26 09:07:27 +0000
581+++ hooks/charmhelpers/contrib/storage/linux/ceph.py 2015-01-12 14:00:31 +0000
582@@ -372,3 +372,46 @@
583 return None
584 else:
585 return None
586+
587+
588+class CephBrokerRq(object):
589+ """Ceph broker request.
590+
591+ Multiple operations can be added to a request and sent to the Ceph broker
592+ to be executed.
593+
594+ Request is json-encoded for sending over the wire.
595+
596+ The API is versioned and defaults to version 1.
597+ """
598+ def __init__(self, api_version=1):
599+ self.api_version = api_version
600+ self.ops = []
601+
602+ def add_op_create_pool(self, name, replica_count=3):
603+ self.ops.append({'op': 'create-pool', 'name': name,
604+ 'replicas': replica_count})
605+
606+ @property
607+ def request(self):
608+ return json.dumps({'api-version': self.api_version, 'ops': self.ops})
609+
610+
611+class CephBrokerRsp(object):
612+ """Ceph broker response.
613+
614+ Response is json-decoded and contents provided as methods/properties.
615+
616+ The API is versioned and defaults to version 1.
617+ """
618+ def __init__(self, encoded_rsp):
619+ self.api_version = None
620+ self.rsp = json.loads(encoded_rsp)
621+
622+ @property
623+ def exit_code(self):
624+ return self.rsp.get('exit-code')
625+
626+ @property
627+ def exit_msg(self):
628+ return self.rsp.get('stderr')
629
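Note on CephBrokerRq above: a request accumulates operations and serialises them as one JSON document. The sketch below copies the class from the hunk into a standalone file (with its json import) to show what the encoded request looks like. The pool names are illustrative assumptions.

```python
import json


class CephBrokerRq(object):
    """Standalone copy of the request class from the hunk above."""

    def __init__(self, api_version=1):
        self.api_version = api_version
        self.ops = []

    def add_op_create_pool(self, name, replica_count=3):
        self.ops.append({'op': 'create-pool', 'name': name,
                         'replicas': replica_count})

    @property
    def request(self):
        return json.dumps({'api-version': self.api_version, 'ops': self.ops})


rq = CephBrokerRq()
rq.add_op_create_pool('images')                    # default of 3 replicas
rq.add_op_create_pool('volumes', replica_count=2)  # explicit replica count
decoded = json.loads(rq.request)
print(decoded)
```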
630=== added file 'hooks/charmhelpers/core/decorators.py'
631--- hooks/charmhelpers/core/decorators.py 1970-01-01 00:00:00 +0000
632+++ hooks/charmhelpers/core/decorators.py 2015-01-12 14:00:31 +0000
633@@ -0,0 +1,41 @@
634+#
635+# Copyright 2014 Canonical Ltd.
636+#
637+# Authors:
638+# Edward Hope-Morley <opentastic@gmail.com>
639+#
640+
641+import time
642+
643+from charmhelpers.core.hookenv import (
644+ log,
645+ INFO,
646+)
647+
648+
649+def retry_on_exception(num_retries, base_delay=0, exc_type=Exception):
650+ """If the decorated function raises exception exc_type, allow num_retries
651+ retry attempts before raising the exception.
652+ """
653+ def _retry_on_exception_inner_1(f):
654+ def _retry_on_exception_inner_2(*args, **kwargs):
655+ retries = num_retries
656+ multiplier = 1
657+ while True:
658+ try:
659+ return f(*args, **kwargs)
660+ except exc_type:
661+ if not retries:
662+ raise
663+
664+ delay = base_delay * multiplier
665+ multiplier += 1
666+ log("Retrying '%s' %d more times (delay=%s)" %
667+ (f.__name__, retries, delay), level=INFO)
668+ retries -= 1
669+ if delay:
670+ time.sleep(delay)
671+
672+ return _retry_on_exception_inner_2
673+
674+ return _retry_on_exception_inner_1
675
676=== modified file 'hooks/charmhelpers/core/host.py'
677--- hooks/charmhelpers/core/host.py 2014-12-11 16:47:24 +0000
678+++ hooks/charmhelpers/core/host.py 2015-01-12 14:00:31 +0000
679@@ -162,13 +162,16 @@
680 uid = pwd.getpwnam(owner).pw_uid
681 gid = grp.getgrnam(group).gr_gid
682 realpath = os.path.abspath(path)
683- if os.path.exists(realpath):
684- if force and not os.path.isdir(realpath):
685+ path_exists = os.path.exists(realpath)
686+ if path_exists and force:
687+ if not os.path.isdir(realpath):
688 log("Removing non-directory file {} prior to mkdir()".format(path))
689 os.unlink(realpath)
690- else:
691+ os.makedirs(realpath, perms)
692+ os.chown(realpath, uid, gid)
693+ elif not path_exists:
694 os.makedirs(realpath, perms)
695- os.chown(realpath, uid, gid)
696+ os.chown(realpath, uid, gid)
697
698
699 def write_file(path, content, owner='root', group='root', perms=0o444):
700
701=== modified file 'hooks/charmhelpers/fetch/__init__.py'
702--- hooks/charmhelpers/fetch/__init__.py 2014-11-25 17:07:46 +0000
703+++ hooks/charmhelpers/fetch/__init__.py 2015-01-12 14:00:31 +0000
704@@ -64,9 +64,16 @@
705 'trusty-juno/updates': 'trusty-updates/juno',
706 'trusty-updates/juno': 'trusty-updates/juno',
707 'juno/proposed': 'trusty-proposed/juno',
708- 'juno/proposed': 'trusty-proposed/juno',
709 'trusty-juno/proposed': 'trusty-proposed/juno',
710 'trusty-proposed/juno': 'trusty-proposed/juno',
711+ # Kilo
712+ 'kilo': 'trusty-updates/kilo',
713+ 'trusty-kilo': 'trusty-updates/kilo',
714+ 'trusty-kilo/updates': 'trusty-updates/kilo',
715+ 'trusty-updates/kilo': 'trusty-updates/kilo',
716+ 'kilo/proposed': 'trusty-proposed/kilo',
717+ 'trusty-kilo/proposed': 'trusty-proposed/kilo',
718+ 'trusty-proposed/kilo': 'trusty-proposed/kilo',
719 }
720
721 # The order of this list is very important. Handlers should be listed in from
722
723=== modified file 'hooks/hooks.py'
724--- hooks/hooks.py 2014-11-25 18:29:07 +0000
725+++ hooks/hooks.py 2015-01-12 14:00:31 +0000
726@@ -25,12 +25,15 @@
727 relation_set,
728 remote_unit,
729 Hooks, UnregisteredHookError,
730- service_name
731+ service_name,
732+ relations_of_type
733 )
734 from charmhelpers.core.host import (
735 service_restart,
736 umount,
737 mkdir,
738+ write_file,
739+ rsync,
740 cmp_pkgrevno
741 )
742 from charmhelpers.fetch import (
743@@ -56,8 +59,15 @@
744 process_requests
745 )
746
747+from charmhelpers.contrib.charmsupport import nrpe
748+
749 hooks = Hooks()
750
751+NAGIOS_PLUGINS = '/usr/local/lib/nagios/plugins'
752+SCRIPTS_DIR = '/usr/local/bin'
753+STATUS_FILE = '/var/lib/nagios/cat-ceph-status.txt'
754+STATUS_CRONFILE = '/etc/cron.d/cat-ceph-health'
755+
756
757 def install_upstart_scripts():
758 # Only install upstart configurations for older versions
759@@ -152,6 +162,9 @@
760 reformat_osd(), config('ignore-device-errors'))
761 ceph.start_osds(get_devices())
762
763+ if relations_of_type('nrpe-external-master'):
764+ update_nrpe_config()
765+
766
767 def get_mon_hosts():
768 hosts = []
769@@ -334,6 +347,36 @@
770 ceph.start_osds(get_devices())
771
772
773+@hooks.hook('nrpe-external-master-relation-joined')
774+@hooks.hook('nrpe-external-master-relation-changed')
775+def update_nrpe_config():
776+ # python-dbus is used by check_upstart_job
777+ apt_install('python-dbus')
778+ log('Refreshing nagios checks')
779+ if os.path.isdir(NAGIOS_PLUGINS):
780+ rsync(os.path.join(os.getenv('CHARM_DIR'), 'files', 'nagios',
781+ 'check_ceph_status.py'),
782+ os.path.join(NAGIOS_PLUGINS, 'check_ceph_status.py'))
783+
784+ script = os.path.join(SCRIPTS_DIR, 'collect_ceph_status.sh')
785+ rsync(os.path.join(os.getenv('CHARM_DIR'), 'files',
786+ 'nagios', 'collect_ceph_status.sh'),
787+ script)
788+ cronjob = "{} root {}\n".format('*/5 * * * *', script)
789+ write_file(STATUS_CRONFILE, cronjob)
790+
791+ # Find out if nrpe set nagios_hostname
792+ hostname = nrpe.get_nagios_hostname()
793+ current_unit = nrpe.get_nagios_unit_name()
794+ nrpe_setup = nrpe.NRPE(hostname=hostname)
795+ nrpe_setup.add_check(
796+ shortname="ceph",
797+ description='Check Ceph health {%s}' % current_unit,
798+ check_cmd='check_ceph_status.py -f {}'.format(STATUS_FILE)
799+ )
800+ nrpe_setup.write()
801+
802+
803 if __name__ == '__main__':
804 try:
805 hooks.execute(sys.argv)
806
807=== added symlink 'hooks/nrpe-external-master-relation-changed'
808=== target is u'hooks.py'
809=== added symlink 'hooks/nrpe-external-master-relation-joined'
810=== target is u'hooks.py'
811=== modified file 'metadata.yaml'
812--- metadata.yaml 2013-07-14 19:46:24 +0000
813+++ metadata.yaml 2015-01-12 14:00:31 +0000
814@@ -10,9 +10,16 @@
815 mon:
816 interface: ceph
817 provides:
818+ nrpe-external-master:
819+ interface: nrpe-external-master
820+ scope: container
821 client:
822 interface: ceph-client
823 osd:
824 interface: ceph-osd
825 radosgw:
826 interface: ceph-radosgw
827+ nrpe-external-master:
828+ interface: nrpe-external-master
829+ scope: container
830+ gets: [nagios_hostname, nagios_host_context]
