Merge lp:~gnuoy/charms/trusty/ceph/add-nrpe-checks into lp:~openstack-charmers-archive/charms/trusty/ceph/next

Proposed by Liam Young
Status: Merged
Merged at revision: 93
Proposed branch: lp:~gnuoy/charms/trusty/ceph/add-nrpe-checks
Merge into: lp:~openstack-charmers-archive/charms/trusty/ceph/next
Diff against target: 830 lines (+689/-6)
12 files modified
charm-helpers-hooks.yaml (+1/-0)
config.yaml (+9/-0)
files/nagios/check_ceph_status.py (+44/-0)
files/nagios/collect_ceph_status.sh (+18/-0)
hooks/charmhelpers/contrib/charmsupport/nrpe.py (+308/-0)
hooks/charmhelpers/contrib/charmsupport/volumes.py (+159/-0)
hooks/charmhelpers/contrib/storage/linux/ceph.py (+43/-0)
hooks/charmhelpers/core/decorators.py (+41/-0)
hooks/charmhelpers/core/host.py (+7/-4)
hooks/charmhelpers/fetch/__init__.py (+8/-1)
hooks/hooks.py (+44/-1)
metadata.yaml (+7/-0)
To merge this branch: bzr merge lp:~gnuoy/charms/trusty/ceph/add-nrpe-checks
Reviewer                  Review Type    Date Requested    Status
Liam Young (community)                                     Approve
Review via email: mp+246155@code.launchpad.net

Description of the change

Add NRPE support. Based on a branch from bradm, with a few tweaks.

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #705 ceph-next for gnuoy mp246155
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/705/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #676 ceph-next for gnuoy mp246155
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/676/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #861 ceph-next for gnuoy mp246155
    AMULET OK: passed

Build: http://10.245.162.77:8080/job/charm_amulet_test/861/

Liam Young (gnuoy) wrote :

<jamespage> gnuoy, as they are re-syncs + tweaks to the nrpe stuff in the charms, I'm happy to give a conditional +1 across the board based on osci checking things out OK
<gnuoy> jamespage, I'll take that! thanks
...
<gnuoy> jamespage, osci is still working through. But on the subject of those mps, does your +1 stand for branches with no amulet tests?
<jamespage> gnuoy, yes

review: Approve

Preview Diff

=== modified file 'charm-helpers-hooks.yaml'
--- charm-helpers-hooks.yaml 2014-12-11 16:46:45 +0000
+++ charm-helpers-hooks.yaml 2015-01-12 14:00:31 +0000
@@ -9,3 +9,4 @@
     - payload.execd
     - contrib.openstack.alternatives
     - contrib.network.ip
+    - contrib.charmsupport
=== modified file 'config.yaml'
--- config.yaml 2014-11-25 18:29:07 +0000
+++ config.yaml 2015-01-12 14:00:31 +0000
@@ -161,3 +161,12 @@
     description: |
       YAML-formatted associative array of sysctl key/value pairs to be set
       persistently e.g. '{ kernel.pid_max : 4194303 }'.
+  nagios_context:
+    default: "juju"
+    description: |
+      Used by the nrpe-external-master subordinate charm.
+      A string that will be prepended to instance name to set the host name
+      in nagios. So for instance the hostname would be something like:
+          juju-myservice-0
+      If you're running multiple environments with the same services in them
+      this allows you to differentiate between them.
=== added directory 'files/nagios'
=== added file 'files/nagios/check_ceph_status.py'
--- files/nagios/check_ceph_status.py 1970-01-01 00:00:00 +0000
+++ files/nagios/check_ceph_status.py 2015-01-12 14:00:31 +0000
@@ -0,0 +1,44 @@
+#!/usr/bin/env python
+
+# Copyright (C) 2014 Canonical
+# All Rights Reserved
+# Author: Jacek Nykis <jacek.nykis@canonical.com>
+
+import re
+import argparse
+import subprocess
+import nagios_plugin
+
+
+def check_ceph_status(args):
+    if args.status_file:
+        nagios_plugin.check_file_freshness(args.status_file, 3600)
+        with open(args.status_file, "r") as f:
+            lines = f.readlines()
+            status_data = dict(l.strip().split(' ', 1) for l in lines if len(l) > 1)
+    else:
+        lines = subprocess.check_output(["ceph", "status"]).split('\n')
+        status_data = dict(l.strip().split(' ', 1) for l in lines if len(l) > 1)
+
+    if ('health' not in status_data
+            or 'monmap' not in status_data
+            or 'osdmap'not in status_data):
+        raise nagios_plugin.UnknownError('UNKNOWN: status data is incomplete')
+
+    if status_data['health'] != 'HEALTH_OK':
+        msg = 'CRITICAL: ceph health status: "{}"'.format(status_data['health'])
+        raise nagios_plugin.CriticalError(msg)
+    osds = re.search("^.*: (\d+) osds: (\d+) up, (\d+) in", status_data['osdmap'])
+    if osds.group(1) > osds.group(2):  # not all OSDs are "up"
+        msg = 'CRITICAL: Some OSDs are not up. Total: {}, up: {}'.format(
+            osds.group(1), osds.group(2))
+        raise nagios_plugin.CriticalError(msg)
+    print "All OK"
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(description='Check ceph status')
+    parser.add_argument('-f', '--file', dest='status_file',
+                        default=False, help='Optional file with "ceph status" output')
+    args = parser.parse_args()
+    nagios_plugin.try_check(check_ceph_status, args)
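
Note on the OSD comparison in check_ceph_status.py: the check splits each line of `ceph status` output into a first word and the rest of the line, then compares the OSD counts captured from the `osdmap` line. Regex groups are strings, so comparing them directly is lexicographic ("10" > "9" is false). The sketch below is a minimal standalone illustration that converts the groups to integers first. It is not the merged plugin: `parse_status` and `osd_counts` are hypothetical helper names, and `SAMPLE` is an assumed example of the output shape the regex expects.

```python
import re

# Assumed sample of `ceph status` output in the shape the plugin's regex expects.
SAMPLE = """\
    cluster 6547bd3e-1397-11e2-82e5-53567c8d32dc
     health HEALTH_OK
     monmap e1: 3 mons at {a=10.0.0.1:6789/0}, election epoch 8, quorum 0,1,2 a,b,c
     osdmap e42: 10 osds: 9 up, 10 in
"""


def parse_status(text):
    """Map the first word of each non-empty line to the rest of that line."""
    pairs = (line.strip().split(' ', 1) for line in text.splitlines())
    return {p[0]: p[1] for p in pairs if len(p) == 2}


def osd_counts(osdmap):
    """Return (total, up) as integers, or None if the line does not match."""
    match = re.search(r"^.*: (\d+) osds: (\d+) up, (\d+) in", osdmap)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


status = parse_status(SAMPLE)
total, up = osd_counts(status['osdmap'])
# Integer comparison detects the missing OSD; comparing the raw strings would not.
some_osds_down = total > up
```

Returning `None` on a failed match also avoids calling `.group()` on `None` when the `osdmap` line has an unexpected shape.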
=== added file 'files/nagios/collect_ceph_status.sh'
--- files/nagios/collect_ceph_status.sh 1970-01-01 00:00:00 +0000
+++ files/nagios/collect_ceph_status.sh 2015-01-12 14:00:31 +0000
@@ -0,0 +1,18 @@
+#!/bin/bash
+# Copyright (C) 2014 Canonical
+# All Rights Reserved
+# Author: Jacek Nykis <jacek.nykis@canonical.com>
+
+LOCK=/var/lock/ceph-status.lock
+lockfile-create -r2 --lock-name $LOCK > /dev/null 2>&1
+if [ $? -ne 0 ]; then
+    exit 1
+fi
+trap "rm -f $LOCK > /dev/null 2>&1" exit
+
+DATA_DIR="/var/lib/nagios"
+if [ ! -d $DATA_DIR ]; then
+    mkdir -p $DATA_DIR
+fi
+
+ceph status >${DATA_DIR}/cat-ceph-status.txt
=== added directory 'hooks/charmhelpers/contrib/charmsupport'
=== added file 'hooks/charmhelpers/contrib/charmsupport/__init__.py'
=== added file 'hooks/charmhelpers/contrib/charmsupport/nrpe.py'
--- hooks/charmhelpers/contrib/charmsupport/nrpe.py 1970-01-01 00:00:00 +0000
+++ hooks/charmhelpers/contrib/charmsupport/nrpe.py 2015-01-12 14:00:31 +0000
@@ -0,0 +1,308 @@
+"""Compatibility with the nrpe-external-master charm"""
+# Copyright 2012 Canonical Ltd.
+#
+# Authors:
+#  Matthew Wedgwood <matthew.wedgwood@canonical.com>
+
+import subprocess
+import pwd
+import grp
+import os
+import re
+import shlex
+import yaml
+
+from charmhelpers.core.hookenv import (
+    config,
+    local_unit,
+    log,
+    relation_ids,
+    relation_set,
+    relations_of_type,
+)
+
+from charmhelpers.core.host import service
+
+# This module adds compatibility with the nrpe-external-master and plain nrpe
+# subordinate charms. To use it in your charm:
+#
+# 1. Update metadata.yaml
+#
+#   provides:
+#     (...)
+#     nrpe-external-master:
+#       interface: nrpe-external-master
+#       scope: container
+#
+#   and/or
+#
+#   provides:
+#     (...)
+#     local-monitors:
+#       interface: local-monitors
+#       scope: container
+
+#
+# 2. Add the following to config.yaml
+#
+#    nagios_context:
+#      default: "juju"
+#      type: string
+#      description: |
+#        Used by the nrpe subordinate charms.
+#        A string that will be prepended to instance name to set the host name
+#        in nagios. So for instance the hostname would be something like:
+#            juju-myservice-0
+#        If you're running multiple environments with the same services in them
+#        this allows you to differentiate between them.
+#    nagios_servicegroups:
+#      default: ""
+#      type: string
+#      description: |
+#        A comma-separated list of nagios servicegroups.
+#        If left empty, the nagios_context will be used as the servicegroup
+#
+# 3. Add custom checks (Nagios plugins) to files/nrpe-external-master
+#
+# 4. Update your hooks.py with something like this:
+#
+#    from charmsupport.nrpe import NRPE
+#    (...)
+#    def update_nrpe_config():
+#        nrpe_compat = NRPE()
+#        nrpe_compat.add_check(
+#            shortname = "myservice",
+#            description = "Check MyService",
+#            check_cmd = "check_http -w 2 -c 10 http://localhost"
+#            )
+#        nrpe_compat.add_check(
+#            "myservice_other",
+#            "Check for widget failures",
+#            check_cmd = "/srv/myapp/scripts/widget_check"
+#            )
+#        nrpe_compat.write()
+#
+#    def config_changed():
+#        (...)
+#        update_nrpe_config()
+#
+#    def nrpe_external_master_relation_changed():
+#        update_nrpe_config()
+#
+#    def local_monitors_relation_changed():
+#        update_nrpe_config()
+#
+# 5. ln -s hooks.py nrpe-external-master-relation-changed
+#    ln -s hooks.py local-monitors-relation-changed
+
+
+class CheckException(Exception):
+    pass
+
+
+class Check(object):
+    shortname_re = '[A-Za-z0-9-_]+$'
+    service_template = ("""
+#---------------------------------------------------
+# This file is Juju managed
+#---------------------------------------------------
+define service {{
+    use                             active-service
+    host_name                       {nagios_hostname}
+    service_description             {nagios_hostname}[{shortname}] """
+                        """{description}
+    check_command                   check_nrpe!{command}
+    servicegroups                   {nagios_servicegroup}
+}}
+""")
+
+    def __init__(self, shortname, description, check_cmd):
+        super(Check, self).__init__()
+        # XXX: could be better to calculate this from the service name
+        if not re.match(self.shortname_re, shortname):
+            raise CheckException("shortname must match {}".format(
+                Check.shortname_re))
+        self.shortname = shortname
+        self.command = "check_{}".format(shortname)
+        # Note: a set of invalid characters is defined by the
+        # Nagios server config
+        # The default is: illegal_object_name_chars=`~!$%^&*"|'<>?,()=
+        self.description = description
+        self.check_cmd = self._locate_cmd(check_cmd)
+
+    def _locate_cmd(self, check_cmd):
+        search_path = (
+            '/usr/lib/nagios/plugins',
+            '/usr/local/lib/nagios/plugins',
+        )
+        parts = shlex.split(check_cmd)
+        for path in search_path:
+            if os.path.exists(os.path.join(path, parts[0])):
+                command = os.path.join(path, parts[0])
+                if len(parts) > 1:
+                    command += " " + " ".join(parts[1:])
+                return command
+        log('Check command not found: {}'.format(parts[0]))
+        return ''
+
+    def write(self, nagios_context, hostname, nagios_servicegroups=None):
+        nrpe_check_file = '/etc/nagios/nrpe.d/{}.cfg'.format(
+            self.command)
+        with open(nrpe_check_file, 'w') as nrpe_check_config:
+            nrpe_check_config.write("# check {}\n".format(self.shortname))
+            nrpe_check_config.write("command[{}]={}\n".format(
+                self.command, self.check_cmd))
+
+        if not os.path.exists(NRPE.nagios_exportdir):
+            log('Not writing service config as {} is not accessible'.format(
+                NRPE.nagios_exportdir))
+        else:
+            self.write_service_config(nagios_context, hostname,
+                                      nagios_servicegroups)
+
+    def write_service_config(self, nagios_context, hostname,
+                             nagios_servicegroups=None):
+        for f in os.listdir(NRPE.nagios_exportdir):
+            if re.search('.*{}.cfg'.format(self.command), f):
+                os.remove(os.path.join(NRPE.nagios_exportdir, f))
+
+        if not nagios_servicegroups:
+            nagios_servicegroups = nagios_context
+
+        templ_vars = {
+            'nagios_hostname': hostname,
+            'nagios_servicegroup': nagios_servicegroups,
+            'description': self.description,
+            'shortname': self.shortname,
+            'command': self.command,
+        }
+        nrpe_service_text = Check.service_template.format(**templ_vars)
+        nrpe_service_file = '{}/service__{}_{}.cfg'.format(
+            NRPE.nagios_exportdir, hostname, self.command)
+        with open(nrpe_service_file, 'w') as nrpe_service_config:
+            nrpe_service_config.write(str(nrpe_service_text))
+
+    def run(self):
+        subprocess.call(self.check_cmd)
+
+
+class NRPE(object):
+    nagios_logdir = '/var/log/nagios'
+    nagios_exportdir = '/var/lib/nagios/export'
+    nrpe_confdir = '/etc/nagios/nrpe.d'
+
+    def __init__(self, hostname=None):
+        super(NRPE, self).__init__()
+        self.config = config()
+        self.nagios_context = self.config['nagios_context']
+        if 'nagios_servicegroups' in self.config:
+            self.nagios_servicegroups = self.config['nagios_servicegroups']
+        else:
+            self.nagios_servicegroups = 'juju'
+        self.unit_name = local_unit().replace('/', '-')
+        if hostname:
+            self.hostname = hostname
+        else:
+            self.hostname = "{}-{}".format(self.nagios_context, self.unit_name)
+        self.checks = []
+
+    def add_check(self, *args, **kwargs):
+        self.checks.append(Check(*args, **kwargs))
+
+    def write(self):
+        try:
+            nagios_uid = pwd.getpwnam('nagios').pw_uid
+            nagios_gid = grp.getgrnam('nagios').gr_gid
+        except:
+            log("Nagios user not set up, nrpe checks not updated")
+            return
+
+        if not os.path.exists(NRPE.nagios_logdir):
+            os.mkdir(NRPE.nagios_logdir)
+            os.chown(NRPE.nagios_logdir, nagios_uid, nagios_gid)
+
+        nrpe_monitors = {}
+        monitors = {"monitors": {"remote": {"nrpe": nrpe_monitors}}}
+        for nrpecheck in self.checks:
+            nrpecheck.write(self.nagios_context, self.hostname,
+                            self.nagios_servicegroups)
+            nrpe_monitors[nrpecheck.shortname] = {
+                "command": nrpecheck.command,
+            }
+
+        service('restart', 'nagios-nrpe-server')
+
+        for rid in relation_ids("local-monitors"):
+            relation_set(relation_id=rid, monitors=yaml.dump(monitors))
+
+
+def get_nagios_hostcontext(relation_name='nrpe-external-master'):
+    """
+    Query relation with nrpe subordinate, return the nagios_host_context
+
+    :param str relation_name: Name of relation nrpe sub joined to
+    """
+    for rel in relations_of_type(relation_name):
+        if 'nagios_hostname' in rel:
+            return rel['nagios_host_context']
+
+
+def get_nagios_hostname(relation_name='nrpe-external-master'):
+    """
+    Query relation with nrpe subordinate, return the nagios_hostname
+
+    :param str relation_name: Name of relation nrpe sub joined to
+    """
+    for rel in relations_of_type(relation_name):
+        if 'nagios_hostname' in rel:
+            return rel['nagios_hostname']
+
+
+def get_nagios_unit_name(relation_name='nrpe-external-master'):
+    """
+    Return the nagios unit name prepended with host_context if needed
+
+    :param str relation_name: Name of relation nrpe sub joined to
+    """
+    host_context = get_nagios_hostcontext(relation_name)
+    if host_context:
+        unit = "%s:%s" % (host_context, local_unit())
+    else:
+        unit = local_unit()
+    return unit
+
+
+def add_init_service_checks(nrpe, services, unit_name):
+    """
+    Add checks for each service in list
+
+    :param NRPE nrpe: NRPE object to add check to
+    :param list services: List of services to check
+    :param str unit_name: Unit name to use in check description
+    """
+    for svc in services:
+        upstart_init = '/etc/init/%s.conf' % svc
+        sysv_init = '/etc/init.d/%s' % svc
+        if os.path.exists(upstart_init):
+            nrpe.add_check(
+                shortname=svc,
+                description='process check {%s}' % unit_name,
+                check_cmd='check_upstart_job %s' % svc
+            )
+        elif os.path.exists(sysv_init):
+            cronpath = '/etc/cron.d/nagios-service-check-%s' % svc
+            cron_file = ('*/5 * * * * root '
+                         '/usr/local/lib/nagios/plugins/check_exit_status.pl '
+                         '-s /etc/init.d/%s status > '
+                         '/var/lib/nagios/service-check-%s.txt\n' % (svc,
+                                                                     svc)
+                         )
+            f = open(cronpath, 'w')
+            f.write(cron_file)
+            f.close()
+            nrpe.add_check(
+                shortname=svc,
+                description='process check {%s}' % unit_name,
+                check_cmd='check_status_file.py -f '
+                          '/var/lib/nagios/service-check-%s.txt' % svc,
+            )
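
For reference, `Check.__init__` validates the shortname against `shortname_re` and `Check.write()` emits one `command[...]` line per check into `/etc/nagios/nrpe.d/`. The sketch below reproduces just those two steps in isolation, assuming the plugin path has already been resolved (the real class resolves it through `_locate_cmd`). `nrpe_command_line` is a hypothetical helper name and `ValueError` stands in for `CheckException`.

```python
import re

# Same pattern as Check.shortname_re in the diff.
SHORTNAME_RE = r'[A-Za-z0-9-_]+$'


def nrpe_command_line(shortname, check_cmd):
    """Return the 'command[...]' line for a check, validating the shortname."""
    if not re.match(SHORTNAME_RE, shortname):
        raise ValueError("shortname must match {}".format(SHORTNAME_RE))
    # The NRPE command name is the shortname prefixed with 'check_'.
    return "command[check_{}]={}\n".format(shortname, check_cmd)


line = nrpe_command_line(
    "ceph",
    "/usr/local/lib/nagios/plugins/check_ceph_status.py "
    "-f /var/lib/nagios/cat-ceph-status.txt",
)
```

With the values used by this charm's `update_nrpe_config` hook, the resulting line defines the NRPE command `check_ceph`.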
=== added file 'hooks/charmhelpers/contrib/charmsupport/volumes.py'
--- hooks/charmhelpers/contrib/charmsupport/volumes.py 1970-01-01 00:00:00 +0000
+++ hooks/charmhelpers/contrib/charmsupport/volumes.py 2015-01-12 14:00:31 +0000
@@ -0,0 +1,159 @@
+'''
+Functions for managing volumes in juju units. One volume is supported per unit.
+Subordinates may have their own storage, provided it is on its own partition.
+
+Configuration stanzas::
+
+  volume-ephemeral:
+    type: boolean
+    default: true
+    description: >
+      If false, a volume is mounted as sepecified in "volume-map"
+      If true, ephemeral storage will be used, meaning that log data
+         will only exist as long as the machine. YOU HAVE BEEN WARNED.
+  volume-map:
+    type: string
+    default: {}
+    description: >
+      YAML map of units to device names, e.g:
+        "{ rsyslog/0: /dev/vdb, rsyslog/1: /dev/vdb }"
+      Service units will raise a configure-error if volume-ephemeral
+      is 'true' and no volume-map value is set. Use 'juju set' to set a
+      value and 'juju resolved' to complete configuration.
+
+Usage::
+
+    from charmsupport.volumes import configure_volume, VolumeConfigurationError
+    from charmsupport.hookenv import log, ERROR
+    def post_mount_hook():
+        stop_service('myservice')
+    def post_mount_hook():
+        start_service('myservice')
+
+    if __name__ == '__main__':
+        try:
+            configure_volume(before_change=pre_mount_hook,
+                             after_change=post_mount_hook)
+        except VolumeConfigurationError:
+            log('Storage could not be configured', ERROR)
+
+'''
+
+# XXX: Known limitations
+# - fstab is neither consulted nor updated
+
+import os
+from charmhelpers.core import hookenv
+from charmhelpers.core import host
+import yaml
+
+
+MOUNT_BASE = '/srv/juju/volumes'
+
+
+class VolumeConfigurationError(Exception):
+    '''Volume configuration data is missing or invalid'''
+    pass
+
+
+def get_config():
+    '''Gather and sanity-check volume configuration data'''
+    volume_config = {}
+    config = hookenv.config()
+
+    errors = False
+
+    if config.get('volume-ephemeral') in (True, 'True', 'true', 'Yes', 'yes'):
+        volume_config['ephemeral'] = True
+    else:
+        volume_config['ephemeral'] = False
+
+    try:
+        volume_map = yaml.safe_load(config.get('volume-map', '{}'))
+    except yaml.YAMLError as e:
+        hookenv.log("Error parsing YAML volume-map: {}".format(e),
+                    hookenv.ERROR)
+        errors = True
+    if volume_map is None:
+        # probably an empty string
+        volume_map = {}
+    elif not isinstance(volume_map, dict):
+        hookenv.log("Volume-map should be a dictionary, not {}".format(
+            type(volume_map)))
+        errors = True
+
+    volume_config['device'] = volume_map.get(os.environ['JUJU_UNIT_NAME'])
+    if volume_config['device'] and volume_config['ephemeral']:
+        # asked for ephemeral storage but also defined a volume ID
+        hookenv.log('A volume is defined for this unit, but ephemeral '
+                    'storage was requested', hookenv.ERROR)
+        errors = True
+    elif not volume_config['device'] and not volume_config['ephemeral']:
+        # asked for permanent storage but did not define volume ID
+        hookenv.log('Ephemeral storage was requested, but there is no volume '
+                    'defined for this unit.', hookenv.ERROR)
+        errors = True
+
+    unit_mount_name = hookenv.local_unit().replace('/', '-')
+    volume_config['mountpoint'] = os.path.join(MOUNT_BASE, unit_mount_name)
+
+    if errors:
+        return None
+    return volume_config
+
+
+def mount_volume(config):
+    if os.path.exists(config['mountpoint']):
+        if not os.path.isdir(config['mountpoint']):
+            hookenv.log('Not a directory: {}'.format(config['mountpoint']))
+            raise VolumeConfigurationError()
+    else:
+        host.mkdir(config['mountpoint'])
+    if os.path.ismount(config['mountpoint']):
+        unmount_volume(config)
+    if not host.mount(config['device'], config['mountpoint'], persist=True):
+        raise VolumeConfigurationError()
+
+
+def unmount_volume(config):
+    if os.path.ismount(config['mountpoint']):
+        if not host.umount(config['mountpoint'], persist=True):
+            raise VolumeConfigurationError()
+
+
+def managed_mounts():
+    '''List of all mounted managed volumes'''
+    return filter(lambda mount: mount[0].startswith(MOUNT_BASE), host.mounts())
+
+
+def configure_volume(before_change=lambda: None, after_change=lambda: None):
+    '''Set up storage (or don't) according to the charm's volume configuration.
+       Returns the mount point or "ephemeral". before_change and after_change
+       are optional functions to be called if the volume configuration changes.
+    '''
+
+    config = get_config()
+    if not config:
+        hookenv.log('Failed to read volume configuration', hookenv.CRITICAL)
+        raise VolumeConfigurationError()
+
+    if config['ephemeral']:
+        if os.path.ismount(config['mountpoint']):
+            before_change()
+            unmount_volume(config)
+            after_change()
+        return 'ephemeral'
+    else:
+        # persistent storage
+        if os.path.ismount(config['mountpoint']):
+            mounts = dict(managed_mounts())
+            if mounts.get(config['mountpoint']) != config['device']:
+                before_change()
+                unmount_volume(config)
+                mount_volume(config)
+                after_change()
+        else:
+            before_change()
+            mount_volume(config)
+            after_change()
+        return config['mountpoint']
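
`get_config()` in volumes.py accepts only two combinations of settings: ephemeral storage with no device mapped for the unit, or persistent storage with a device mapped. The sketch below restates that consistency rule as a small pure function. `check_volume_settings` is a hypothetical name, and the error strings follow the code comments in the diff rather than the logged messages.

```python
def check_volume_settings(ephemeral, device):
    """Return an error string for inconsistent settings, or None if consistent."""
    if device and ephemeral:
        # asked for ephemeral storage but also defined a volume
        return 'volume defined for this unit, but ephemeral storage requested'
    if not device and not ephemeral:
        # asked for persistent storage but did not define a volume
        return 'persistent storage requested, but no volume defined'
    return None


# The two accepted combinations, followed by the two rejected ones.
ok_ephemeral = check_volume_settings(True, None)
ok_persistent = check_volume_settings(False, '/dev/vdb')
bad_both = check_volume_settings(True, '/dev/vdb')
bad_neither = check_volume_settings(False, None)
```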
=== modified file 'hooks/charmhelpers/contrib/storage/linux/ceph.py'
--- hooks/charmhelpers/contrib/storage/linux/ceph.py 2014-11-26 09:07:27 +0000
+++ hooks/charmhelpers/contrib/storage/linux/ceph.py 2015-01-12 14:00:31 +0000
@@ -372,3 +372,46 @@
             return None
     else:
         return None
+
+
+class CephBrokerRq(object):
+    """Ceph broker request.
+
+    Multiple operations can be added to a request and sent to the Ceph broker
+    to be executed.
+
+    Request is json-encoded for sending over the wire.
+
+    The API is versioned and defaults to version 1.
+    """
+    def __init__(self, api_version=1):
+        self.api_version = api_version
+        self.ops = []
+
+    def add_op_create_pool(self, name, replica_count=3):
+        self.ops.append({'op': 'create-pool', 'name': name,
+                         'replicas': replica_count})
+
+    @property
+    def request(self):
+        return json.dumps({'api-version': self.api_version, 'ops': self.ops})
+
+
+class CephBrokerRsp(object):
+    """Ceph broker response.
+
+    Response is json-decoded and contents provided as methods/properties.
+
+    The API is versioned and defaults to version 1.
+    """
+    def __init__(self, encoded_rsp):
+        self.api_version = None
+        self.rsp = json.loads(encoded_rsp)
+
+    @property
+    def exit_code(self):
+        return self.rsp.get('exit-code')
+
+    @property
+    def exit_msg(self):
+        return self.rsp.get('stderr')
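
To make the wire format of `CephBrokerRq` concrete, the sketch below copies the request class from the hunk above into a standalone snippet and decodes the JSON it produces. The `json` import is added here so the snippet runs on its own (the hunk does not show where the module imports it), and the pool names are arbitrary examples.

```python
import json


class CephBrokerRq(object):
    """Standalone copy of the request class from the diff, for illustration."""

    def __init__(self, api_version=1):
        self.api_version = api_version
        self.ops = []

    def add_op_create_pool(self, name, replica_count=3):
        self.ops.append({'op': 'create-pool', 'name': name,
                         'replicas': replica_count})

    @property
    def request(self):
        return json.dumps({'api-version': self.api_version, 'ops': self.ops})


rq = CephBrokerRq()
rq.add_op_create_pool('glance')                   # default of 3 replicas
rq.add_op_create_pool('cinder', replica_count=2)  # explicit replica count
decoded = json.loads(rq.request)
```

Operations are kept in the order they were added, so one request can carry several pool creations.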
=== added file 'hooks/charmhelpers/core/decorators.py'
--- hooks/charmhelpers/core/decorators.py 1970-01-01 00:00:00 +0000
+++ hooks/charmhelpers/core/decorators.py 2015-01-12 14:00:31 +0000
@@ -0,0 +1,41 @@
+#
+# Copyright 2014 Canonical Ltd.
+#
+# Authors:
+#  Edward Hope-Morley <opentastic@gmail.com>
+#
+
+import time
+
+from charmhelpers.core.hookenv import (
+    log,
+    INFO,
+)
+
+
+def retry_on_exception(num_retries, base_delay=0, exc_type=Exception):
+    """If the decorated function raises exception exc_type, allow num_retries
+    retry attempts before raise the exception.
+    """
+    def _retry_on_exception_inner_1(f):
+        def _retry_on_exception_inner_2(*args, **kwargs):
+            retries = num_retries
+            multiplier = 1
+            while True:
+                try:
+                    return f(*args, **kwargs)
+                except exc_type:
+                    if not retries:
+                        raise
+
+                delay = base_delay * multiplier
+                multiplier += 1
+                log("Retrying '%s' %d more times (delay=%s)" %
+                    (f.__name__, retries, delay), level=INFO)
+                retries -= 1
+                if delay:
+                    time.sleep(delay)
+
+        return _retry_on_exception_inner_2
+
+    return _retry_on_exception_inner_1
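
A usage sketch for `retry_on_exception`: the decorator is copied locally with the charmhelpers logging call left out, so the snippet runs without charmhelpers installed. `flaky` is a hypothetical function that fails twice before succeeding. The delay before retry n is `base_delay * n`, so the back-off grows linearly; the sketch uses a zero delay.

```python
import time


def retry_on_exception(num_retries, base_delay=0, exc_type=Exception):
    """Local copy of the decorator from the diff, with logging left out."""
    def decorator(f):
        def wrapper(*args, **kwargs):
            retries = num_retries
            multiplier = 1
            while True:
                try:
                    return f(*args, **kwargs)
                except exc_type:
                    if not retries:
                        raise  # retries exhausted: propagate the last error
                delay = base_delay * multiplier
                multiplier += 1
                retries -= 1
                if delay:
                    time.sleep(delay)
        return wrapper
    return decorator


calls = []


@retry_on_exception(3, base_delay=0, exc_type=IOError)
def flaky():
    """Hypothetical function that fails twice before succeeding."""
    calls.append(1)
    if len(calls) < 3:
        raise IOError("transient failure")
    return "ok"


result = flaky()
```

With `num_retries=3` the function is attempted at most four times in total: the initial call plus three retries.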
=== modified file 'hooks/charmhelpers/core/host.py'
--- hooks/charmhelpers/core/host.py 2014-12-11 16:47:24 +0000
+++ hooks/charmhelpers/core/host.py 2015-01-12 14:00:31 +0000
@@ -162,13 +162,16 @@
     uid = pwd.getpwnam(owner).pw_uid
     gid = grp.getgrnam(group).gr_gid
     realpath = os.path.abspath(path)
-    if os.path.exists(realpath):
-        if force and not os.path.isdir(realpath):
+    path_exists = os.path.exists(realpath)
+    if path_exists and force:
+        if not os.path.isdir(realpath):
             log("Removing non-directory file {} prior to mkdir()".format(path))
             os.unlink(realpath)
-    else:
+            os.makedirs(realpath, perms)
+            os.chown(realpath, uid, gid)
+    elif not path_exists:
         os.makedirs(realpath, perms)
-    os.chown(realpath, uid, gid)
+        os.chown(realpath, uid, gid)
 
 
 def write_file(path, content, owner='root', group='root', perms=0o444):
=== modified file 'hooks/charmhelpers/fetch/__init__.py'
--- hooks/charmhelpers/fetch/__init__.py 2014-11-25 17:07:46 +0000
+++ hooks/charmhelpers/fetch/__init__.py 2015-01-12 14:00:31 +0000
@@ -64,9 +64,16 @@
     'trusty-juno/updates': 'trusty-updates/juno',
     'trusty-updates/juno': 'trusty-updates/juno',
     'juno/proposed': 'trusty-proposed/juno',
-    'juno/proposed': 'trusty-proposed/juno',
     'trusty-juno/proposed': 'trusty-proposed/juno',
     'trusty-proposed/juno': 'trusty-proposed/juno',
+    # Kilo
+    'kilo': 'trusty-updates/kilo',
+    'trusty-kilo': 'trusty-updates/kilo',
+    'trusty-kilo/updates': 'trusty-updates/kilo',
+    'trusty-updates/kilo': 'trusty-updates/kilo',
+    'kilo/proposed': 'trusty-proposed/kilo',
+    'trusty-kilo/proposed': 'trusty-proposed/kilo',
+    'trusty-proposed/kilo': 'trusty-proposed/kilo',
 }
 
 # The order of this list is very important. Handlers should be listed in from
=== modified file 'hooks/hooks.py'
--- hooks/hooks.py 2014-11-25 18:29:07 +0000
+++ hooks/hooks.py 2015-01-12 14:00:31 +0000
@@ -25,12 +25,15 @@
     relation_set,
     remote_unit,
     Hooks, UnregisteredHookError,
-    service_name
+    service_name,
+    relations_of_type
 )
 from charmhelpers.core.host import (
     service_restart,
     umount,
     mkdir,
+    write_file,
+    rsync,
     cmp_pkgrevno
 )
 from charmhelpers.fetch import (
@@ -56,8 +59,15 @@
     process_requests
 )
 
+from charmhelpers.contrib.charmsupport import nrpe
+
 hooks = Hooks()
 
+NAGIOS_PLUGINS = '/usr/local/lib/nagios/plugins'
+SCRIPTS_DIR = '/usr/local/bin'
+STATUS_FILE = '/var/lib/nagios/cat-ceph-status.txt'
+STATUS_CRONFILE = '/etc/cron.d/cat-ceph-health'
+
 
 def install_upstart_scripts():
     # Only install upstart configurations for older versions
@@ -152,6 +162,9 @@
                         reformat_osd(), config('ignore-device-errors'))
         ceph.start_osds(get_devices())
 
+    if relations_of_type('nrpe-external-master'):
+        update_nrpe_config()
+
 
 def get_mon_hosts():
     hosts = []
@@ -334,6 +347,36 @@
         ceph.start_osds(get_devices())
 
 
+@hooks.hook('nrpe-external-master-relation-joined')
+@hooks.hook('nrpe-external-master-relation-changed')
+def update_nrpe_config():
+    # python-dbus is used by check_upstart_job
+    apt_install('python-dbus')
+    log('Refreshing nagios checks')
+    if os.path.isdir(NAGIOS_PLUGINS):
+        rsync(os.path.join(os.getenv('CHARM_DIR'), 'files', 'nagios',
+                           'check_ceph_status.py'),
+              os.path.join(NAGIOS_PLUGINS, 'check_ceph_status.py'))
+
+    script = os.path.join(SCRIPTS_DIR, 'collect_ceph_status.sh')
+    rsync(os.path.join(os.getenv('CHARM_DIR'), 'files',
+                       'nagios', 'collect_ceph_status.sh'),
+          script)
+    cronjob = "{} root {}\n".format('*/5 * * * *', script)
+    write_file(STATUS_CRONFILE, cronjob)
+
+    # Find out if nrpe set nagios_hostname
+    hostname = nrpe.get_nagios_hostname()
+    current_unit = nrpe.get_nagios_unit_name()
+    nrpe_setup = nrpe.NRPE(hostname=hostname)
+    nrpe_setup.add_check(
+        shortname="ceph",
+        description='Check Ceph health {%s}' % current_unit,
+        check_cmd='check_ceph_status.py -f {}'.format(STATUS_FILE)
+    )
+    nrpe_setup.write()
+
+
 if __name__ == '__main__':
     try:
         hooks.execute(sys.argv)
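
The `update_nrpe_config` hook ties the two new files together: a cron entry runs `collect_ceph_status.sh` every five minutes to refresh the status file, and the NRPE check reads that file, which it treats as stale after 3600 seconds. The snippet below evaluates the two strings the hook builds, using the constants from hooks.py. It is an illustration only and touches neither cron nor NRPE.

```python
import os

# Constants as defined in hooks.py in this diff.
SCRIPTS_DIR = '/usr/local/bin'
STATUS_FILE = '/var/lib/nagios/cat-ceph-status.txt'

# Cron entry written to /etc/cron.d/cat-ceph-health by the hook.
script = os.path.join(SCRIPTS_DIR, 'collect_ceph_status.sh')
cronjob = "{} root {}\n".format('*/5 * * * *', script)

# Command string passed to nrpe_setup.add_check() by the hook.
check_cmd = 'check_ceph_status.py -f {}'.format(STATUS_FILE)
```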
=== added symlink 'hooks/nrpe-external-master-relation-changed'
=== target is u'hooks.py'
=== added symlink 'hooks/nrpe-external-master-relation-joined'
=== target is u'hooks.py'
=== modified file 'metadata.yaml'
--- metadata.yaml 2013-07-14 19:46:24 +0000
+++ metadata.yaml 2015-01-12 14:00:31 +0000
@@ -10,9 +10,16 @@
   mon:
     interface: ceph
 provides:
+  nrpe-external-master:
+    interface: nrpe-external-master
+    scope: container
   client:
     interface: ceph-client
   osd:
     interface: ceph-osd
   radosgw:
     interface: ceph-radosgw
+  nrpe-external-master:
+    interface: nrpe-external-master
+    scope: container
+    gets: [nagios_hostname, nagios_host_context]
