Merge lp:~clint-fewbar/charms/precise/nagios/add-monitors-2 into lp:charms/nagios
Status: | Merged |
---|---|
Approved by: | Mark Mims |
Approved revision: | 38 |
Merged at revision: | 8 |
Proposed branch: | lp:~clint-fewbar/charms/precise/nagios/add-monitors-2 |
Merge into: | lp:charms/nagios |
Diff against target: | 809 lines (+517/-163), 16 files modified: .bzrignore (+1/-0), README (+18/-5), config.yaml (+8/-0), example.monitors.yaml (+30/-0), hooks/common.py (+222/-54), hooks/install (+16/-1), hooks/monitors-relation-changed (+125/-0), hooks/mymonitors-relation-joined (+17/-0), hooks/nagios-relation-broken (+0/-12), hooks/nagios-relation-changed (+0/-79), hooks/nagios-relation-departed (+0/-10), hooks/test-common.py (+50/-0), hooks/upgrade-charm (+14/-1), metadata.yaml (+4/-0), monitors.yaml (+11/-0), revision (+1/-1) |
To merge this branch: | bzr merge lp:~clint-fewbar/charms/precise/nagios/add-monitors-2 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Mark Mims (community) | Approve | ||
Review via email: mp+117999@code.launchpad.net |
Commit message
A significant refactor of the charm's relations.
* Deprecate the 'monitoring' interface - never used in any other charm in the official store.
* Adds the 'monitors' interface to communicate complex monitoring information.
* Reworks to use 'pynag' library (embedded in a deb) for Nagios configuration editing.
* Adds 'extraconfig' option for users to add extra config options.
* Enables 'external commands' so that checks can be rescheduled by an administrator.
Description of the change
A significant refactor of the charm's relations.
* Deprecate the 'monitoring' interface - never used in any other charm in the official store.
* Adds the 'monitors' interface to communicate complex monitoring information.
* Reworks to use 'pynag' library (embedded in a deb) for Nagios configuration editing.
* Adds 'extraconfig' option for users to add extra config options.
* Enables 'external commands' so that checks can be rescheduled by an administrator.
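As a rough illustration of the 'monitors' interface described above, the sketch below walks the `remote` section of a monitors document and builds a `check_http` command line plus its `!`-separated check arguments. It loosely follows `customize_http` in hooks/common.py from this branch, but it is standalone Python 3 with no pynag or Nagios dependency. `build_http_check` and the sample dict are hypothetical names used only for this example.

```python
import os.path

PLUGIN_PATH = '/usr/lib/nagios/plugins'  # same plugin path the charm's common.py uses


def _extend_args(args, cmd_args, switch, value):
    """Append a value and reference it as the next $ARGn$ macro."""
    args.append(value)
    cmd_args.extend((switch, '"$ARG%d$"' % len(args)))


def build_http_check(extra):
    """Hypothetical helper: return (command_line, check_args) for one http monitor."""
    plugin = os.path.join(PLUGIN_PATH, 'check_http')
    args = [extra.get('port', 80), extra.get('path', '/')]
    cmd_args = [plugin, '-p', '"$ARG1$"', '-u', '"$ARG2$"']
    if 'status' in extra:
        _extend_args(args, cmd_args, '-e', extra['status'])
    if 'host' in extra:
        # Send the given Host: header but still connect to the unit's address.
        _extend_args(args, cmd_args, '-H', extra['host'])
        cmd_args.extend(('-I', '$HOSTADDRESS$'))
    else:
        cmd_args.extend(('-H', '$HOSTADDRESS$'))
    return ' '.join(cmd_args), '!'.join(str(a) for a in args)


# Sample data mirroring the 'remote' part of example.monitors.yaml (assumed shape).
monitors = {
    'version': '0.3',
    'monitors': {
        'remote': {
            'http': {
                'nagios': {'port': 80, 'path': '/nagios3/',
                           'status': 'HTTP/1.1 401'},
            },
        },
    },
}

for name, extra in monitors['monitors']['remote']['http'].items():
    command_line, check_args = build_http_check(extra)
    print(name, command_line, check_args)
```

In the charm itself the command line is stored as a Nagios command object and the service's `check_command` becomes the command name followed by the `!`-joined arguments; the sketch only shows how the two strings are assembled.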
Clint Byrum (clint-fewbar) wrote:
Mark Mims (mark-mims) wrote:
Please add default user/pass instructions...
maybe mention how to use `juju ssh nagios/0 sudo cat /var/lib/
Preview Diff
=== added file '.bzrignore'
--- .bzrignore 1970-01-01 00:00:00 +0000
+++ .bzrignore 2012-08-02 21:28:44 +0000
@@ -0,0 +1,1 @@
+data
=== modified file 'README'
--- README 2012-05-14 07:24:34 +0000
+++ README 2012-08-02 21:28:44 +0000
@@ -10,8 +10,21 @@
 
 This should result in your Nagios monitoring all of the service units.
 
-TODO:
-- Add a nagios-nrpe subordinate charm to make it easier to use local
-  plugins via NRPE
-- Create a proper 'monitoring' interface that charms can use to define
-  what they want monitored remotely.
+monitors interface
+==================
+
+The monitors interface expects three fields:
+
+* monitors - YAML matching the monitors yaml spec. See
+  example.monitors.yaml for more information.
+* target-id - Assign any monitors to this target host definition.
+* target-address - Optional, specifies the host of the target to
+  monitor. This must be specified by at least one unit so that the
+  intended target-id will be monitorable.
+
+nrpe
+====
+
+There is an NRPE subordinate charm which must be used for any local
+monitors. See the 'nrpe' charm's README for information on how to
+make use of it.
=== added file 'config.yaml'
--- config.yaml 1970-01-01 00:00:00 +0000
+++ config.yaml 2012-08-02 21:28:44 +0000
@@ -0,0 +1,8 @@
+options:
+  extraconfig:
+    type: string
+    default: ""
+    description: |
+      Any additional nagios configuration you would like to
+      add can be set into this element. It will be placed in
+      /etc/nagios3/conf.d/extra.cfg
=== added directory 'debs'
=== added file 'debs/pynag_0.4.2-1_all.deb'
Binary files debs/pynag_0.4.2-1_all.deb 1970-01-01 00:00:00 +0000 and debs/pynag_0.4.2-1_all.deb 2012-08-02 21:28:44 +0000 differ
=== added file 'example.monitors.yaml'
--- example.monitors.yaml 1970-01-01 00:00:00 +0000
+++ example.monitors.yaml 2012-08-02 21:28:44 +0000
@@ -0,0 +1,30 @@
+# Version of the spec, mostly ignored but 0.3 is the current one
+version: '0.3'
+# Dict with just 'local' and 'remote' as parts
+monitors:
+    # local monitors need an agent to be handled. See nrpe charm for
+    # some example implementations
+    local:
+        # procrunning checks for a running process named X (no path)
+        procrunning:
+            # Multiple procrunning can be defined, this is the "name" of it
+            nagios3:
+                min: 1
+                max: 1
+                executable: nagios3
+    # Remote monitors can be polled directly by a remote system
+    remote:
+        # do a request on the HTTP protocol
+        http:
+            nagios:
+                port: 80
+                path: /nagios3/
+                # expected status response (otherwise just look for 200)
+                status: 'HTTP/1.1 401'
+                # Use as the Host: header (the server address will still be used to connect() to)
+                host: www.fewbar.com
+        mysql:
+            # Named basic check
+            basic:
+                username: monitors
+                password: abcdefg123456
=== modified file 'hooks/common.py'
--- hooks/common.py 2012-05-14 23:48:24 +0000
+++ hooks/common.py 2012-08-02 21:28:44 +0000
@@ -2,7 +2,26 @@
 import socket
 import os
 import os.path
-
+import re
+import sqlite3
+import shutil
+import tempfile
+
+from pynag import Model
+
+INPROGRESS_DIR = '/etc/nagios3-inprogress'
+INPROGRESS_CFG = '/etc/nagios3-inprogress/nagios.cfg'
+INPROGRESS_CONF_D = '/etc/nagios3-inprogress/conf.d'
+CHARM_CFG = '/etc/nagios3-inprogress/conf.d/charm.cfg'
+MAIN_NAGIOS_BAK = '/etc/nagios3.bak'
+MAIN_NAGIOS_DIR = '/etc/nagios3'
+MAIN_NAGIOS_CFG = '/etc/nagios3/nagios.cfg'
+PLUGIN_PATH = '/usr/lib/nagios/plugins'
+
+Model.cfg_file = INPROGRESS_CFG
+Model.pynag_directory = INPROGRESS_CONF_D
+
+reduce_RE = re.compile('[\W_]')
 
 def check_ip(n):
     try:
@@ -28,59 +47,208 @@
     if check_ip(hostname):
         # Some providers don't provide hostnames, so use the remote unit name.
         ip_address = hostname
-        hostname = remote_unit.replace('/','-')
     else:
         ip_address = socket.getaddrinfo(hostname, None)[0][4][0]
-    return (ip_address, hostname)
-
-# relationId-hostname-config.cfg
-host_config_path_template = '/etc/nagios3/conf.d/%s-%s-config.cfg'
-
-hostgroup_template = """
-define hostgroup {
-    hostgroup_name %(name)s
-    alias %(alias)s
-    members %(members)s
-}
-"""
-hostgroup_path_template = '/etc/nagios3/conf.d/%s-hostgroup.cfg'
-
-
-def remove_hostgroup(relation_id):
-    hostgroup_path = hostgroup_path_template % (relation_id)
-    if os.path.exists(hostgroup_path):
-        os.unlink(hostgroup_path)
-
-
-def handle_hostgroup(relation_id):
-    p = subprocess.Popen(["relation-list","-r",relation_id],
-                         stdout=subprocess.PIPE)
-    services = {}
-    for unit in p.stdout:
-        unit = unit.strip()
-        service_name = unit.strip().split('/')[0]
-        (_, hostname) = get_ip_and_hostname(unit, relation_id)
-        if service_name in services:
-            services[service_name].add(hostname)
+    return (ip_address, remote_unit.replace('/', '-'))
+
+
+def refresh_hostgroups():
+    """ Not the most efficient thing but since we're only
+        parsing what is already on disk here its not too bad """
+    hosts = [ x['host_name'] for x in Model.Host.objects.all if x['host_name'] ]
+
+    hgroups = {}
+    for host in hosts:
+        try:
+            (service, unit_id) = host.rsplit('-', 1)
+        except ValueError:
+            continue
+        if service in hgroups:
+            hgroups[service].append(host)
         else:
-            services[service_name] = set([hostname])
-    p.communicate()
-    if p.returncode != 0:
-        raise RuntimeError('relation-list failed with code %d' % p.returncode)
-
-    hostgroup_path = hostgroup_path_template % (relation_id)
-    for service, members in services.iteritems():
-        with open(hostgroup_path, 'w') as outfile:
-            outfile.write(hostgroup_template % {'name': service,
-                'alias': service, 'members': ','.join(members)})
-
-def refresh_hostgroups(relation_name):
-    p = subprocess.Popen(["relation-ids",relation_name],
-                         stdout=subprocess.PIPE)
-    relids = [ relation_id.strip() for relation_id in p.stdout ]
-    for relation_id in relids:
-        remove_hostgroup(relation_id)
-        handle_hostgroup(relation_id)
-    p.communicate()
-    if p.returncode != 0:
-        raise RuntimeError('relation-ids failed with code %d' % p.returncode)
+            hgroups[service] = [host]
+
+    # Find existing autogenerated
+    auto_hgroups = Model.Hostgroup.objects.filter(notes__contains='#autogenerated#')
+    auto_hgroups = [ x.get_attribute('hostgroup_name') for x in auto_hgroups ]
+
+    # Delete the ones not in hgroups
+    to_delete = set(auto_hgroups).difference(set(hgroups.keys()))
+    for hgroup_name in to_delete:
+        try:
+            hgroup = Model.Hostgroup.objects.get_by_shortname(hgroup_name)
+            hgroup.delete()
+        except ValueError:
+            pass
+
+    for hgroup_name, members in hgroups.iteritems():
+        try:
+            hgroup = Model.Hostgroup.objects.get_by_shortname(hgroup_name)
+        except ValueError:
+            hgroup = Model.Hostgroup()
+            hgroup.set_filename(CHARM_CFG)
+            hgroup.set_attribute('hostgroup_name', hgroup_name)
+            hgroup.set_attribute('notes', '#autogenerated#')
+
+        hgroup.set_attribute('members', ','.join(members))
+        hgroup.save()
+
+
+def _make_check_command(args):
+    args = [str(arg) for arg in args]
+    # There is some worry of collision, but the uniqueness of the initial
+    # command should be enough.
+    signature = reduce_RE.sub('_', ''.join(
+        [os.path.basename(arg) for arg in args]))
+    Model.Command.objects.reload_cache()
+    try:
+        cmd = Model.Command.objects.get_by_shortname(signature)
+    except ValueError:
+        cmd = Model.Command()
+        cmd.set_attribute('command_name', signature)
+        cmd.set_attribute('command_line', ' '.join(args))
+        cmd.save()
+    return signature
+
+def _extend_args(args, cmd_args, switch, value):
+    args.append(value)
+    cmd_args.extend((switch, '"$ARG%d$"' % len(args)))
+
+def customize_http(service, name, extra):
+    args = []
+    cmd_args = []
+    plugin = os.path.join(PLUGIN_PATH, 'check_http')
+    port = extra.get('port', 80)
+    path = extra.get('path', '/')
+    args = [port, path]
+    cmd_args = [plugin, '-p', '"$ARG1$"', '-u', '"$ARG2$"']
+    if 'status' in extra:
+        _extend_args(args, cmd_args, '-e', extra['status'])
+    if 'host' in extra:
+        _extend_args(args, cmd_args, '-H', extra['host'])
+        cmd_args.extend(('-I', '$HOSTADDRESS$'))
+    else:
+        cmd_args.extend(('-H', '$HOSTADDRESS$'))
+    check_command = _make_check_command(cmd_args)
+    cmd = '%s!%s' % (check_command, '!'.join([str(x) for x in args]))
+    service.set_attribute('check_command', cmd)
+    return True
+
+
+def customize_mysql(service, name, extra):
+    plugin = os.path.join(PLUGIN_PATH, 'check_mysql')
+    args = []
+    cmd_args = [plugin,'-H', '$HOSTADDRESS$']
+    if 'user' in extra:
+        _extend_args(args, cmd_args, '-u', extra['user'])
+    if 'password' in extra:
+        _extend_args(args, cmd_args, '-p', extra['password'])
+    check_command = _make_check_command(cmd_args)
+    cmd = '%s!%s' % (check_command, '!'.join([str(x) for x in args]))
+    service.set_attribute('check_command', cmd)
+    return True
+
+
+def customize_nrpe(service, name, extra):
+    plugin = os.path.join(PLUGIN_PATH, 'check_nrpe')
+    args = []
+    cmd_args = [plugin,'-H', '$HOSTADDRESS$']
+    if name in ('mem','swap'):
+        cmd_args.extend(('-c', 'check_%s' % name))
+    elif 'command' in extra:
+        cmd_args.extend(('-c', extra['command']))
+    else:
+        return False
+    check_command = _make_check_command(cmd_args)
+    cmd = '%s!%s' % (check_command, '!'.join([str(x) for x in args]))
+    service.set_attribute('check_command', cmd)
+    return True
+
+
+def customize_service(service, family, name, extra):
+    customs = { 'http': customize_http,
+                'mysql': customize_mysql,
+                'nrpe': customize_nrpe}
+    if family in customs:
+        return customs[family](service, name, extra)
+    return False
+
+
+def get_pynag_host(target_id, owner_unit=None, owner_relation=None):
+    try:
+        host = Model.Host.objects.get_by_shortname(target_id)
+    except ValueError:
+        host = Model.Host()
+        host.set_filename(CHARM_CFG)
+        host.set_attribute('host_name', target_id)
+        host.set_attribute('use', 'generic-host')
+        host.save()
+        host = Model.Host.objects.get_by_shortname(target_id)
+    apply_host_policy(target_id, owner_unit, owner_relation)
+    return host
+
+
+def get_pynag_service(target_id, service_name):
+    services = Model.Service.objects.filter(host_name=target_id,
+                                            service_description=service_name)
+    if len(services) == 0:
+        service = Model.Service()
+        service.set_filename(CHARM_CFG)
+        service.set_attribute('service_description', service_name)
+        service.set_attribute('host_name', target_id)
+        service.set_attribute('use', 'generic-service')
+    else:
+        service = services[0]
+    return service
+
+
+def apply_host_policy(target_id, owner_unit, owner_relation):
+    ssh_service = get_pynag_service(target_id, 'SSH')
+    ssh_service.set_attribute('check_command', 'check_ssh')
+    ssh_service.save()
+
+
+def get_valid_relations():
+    for x in subprocess.Popen(['relation-ids', 'monitors'],
+                              stdout=subprocess.PIPE).stdout:
+        yield x.strip()
+    for x in subprocess.Popen(['relation-ids', 'nagios'],
+                              stdout=subprocess.PIPE).stdout:
+        yield x.strip()
+
+
+def get_valid_units(relation_id):
+    for x in subprocess.Popen(['relation-list', '-r', relation_id],
+                              stdout=subprocess.PIPE).stdout:
+        yield x.strip()
+
+
+def _replace_in_config(find_me, replacement):
+    with open(INPROGRESS_CFG) as cf:
+        with tempfile.NamedTemporaryFile(dir=INPROGRESS_DIR, delete=False) as new_cf:
+            for line in cf:
+                new_cf.write(line.replace(find_me, replacement))
+            new_cf.flush()
+            os.chmod(new_cf.name, 0644)
+            os.unlink(INPROGRESS_CFG)
+            os.rename(new_cf.name, INPROGRESS_CFG)
+
+
+def initialize_inprogress_config():
+    if os.path.exists(INPROGRESS_DIR):
+        shutil.rmtree(INPROGRESS_DIR)
+    shutil.copytree(MAIN_NAGIOS_DIR, INPROGRESS_DIR)
+    _replace_in_config(MAIN_NAGIOS_DIR, INPROGRESS_DIR)
+    if os.path.exists(CHARM_CFG):
+        os.unlink(CHARM_CFG)
+
+
+def flush_inprogress_config():
+    if not os.path.exists(INPROGRESS_DIR):
+        return
+    _replace_in_config(INPROGRESS_DIR, MAIN_NAGIOS_DIR)
+    if os.path.exists(MAIN_NAGIOS_BAK):
+        shutil.rmtree(MAIN_NAGIOS_BAK)
+    if os.path.exists(MAIN_NAGIOS_DIR):
+        shutil.move(MAIN_NAGIOS_DIR, MAIN_NAGIOS_BAK)
+    shutil.move(INPROGRESS_DIR, MAIN_NAGIOS_DIR)
=== added symlink 'hooks/config-changed'
=== target is u'upgrade-charm'
=== modified file 'hooks/install'
--- hooks/install 2012-05-13 23:04:19 +0000
+++ hooks/install 2012-08-02 21:28:44 +0000
@@ -17,7 +17,22 @@
 echo nagios3-cgi nagios3/adminpassword-repeat password $PASSWORD | debconf-set-selections
 
 DEBIAN_FRONTEND=noninteractive apt-get -qy \
-  install nagios3 nagios-plugins python-cheetah dnsutils debconf-utils
+  install nagios3 nagios-plugins python-cheetah dnsutils debconf-utils nagios-nrpe-plugin
+
+# Ideally these would be moved into the distro ASAP
+if [ -d debs ] ; then
+    dpkg -i debs/*.deb
+fi
+
+# enable external commands per README.Debian file
+if ! grep '^check_external_commands=1$' /etc/nagios3/nagios.cfg ; then
+    echo check_external_commands=1 >> /etc/nagios3/nagios.cfg
+fi
+# || :'s are for idempotency
+service nagios3 stop || :
+dpkg-statoverride --update --add nagios www-data 2710 /var/lib/nagios3/rw || :
+dpkg-statoverride --update --add nagios nagios 751 /var/lib/nagios3 || :
+service nagios3 start
 
 # For the admin interface
 open-port 80
=== removed symlink 'hooks/legacy-relation-changed'
=== target was u'nagios-relation-changed'
=== removed symlink 'hooks/legacy-relation-departed'
=== target was u'nagios-relation-departed'
=== added symlink 'hooks/monitors-relation-broken'
=== target is u'monitors-relation-changed'
=== added file 'hooks/monitors-relation-changed'
--- hooks/monitors-relation-changed 1970-01-01 00:00:00 +0000
+++ hooks/monitors-relation-changed 2012-08-02 21:28:44 +0000
@@ -0,0 +1,125 @@
+#!/usr/bin/python
+# monitors-relation-changed - Process monitors.yaml into remote nagios monitors
+# Copyright Canonical 2012 Canonical Ltd. All Rights Reserved
+# Author: Clint Byrum <clint.byrum@canonical.com>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+import sys
+import os
+import subprocess
+import yaml
+import json
+import re
+import string
+
+
+from common import (customize_service, get_pynag_host,
+        get_pynag_service, refresh_hostgroups,
+        get_valid_relations, get_valid_units,
+        initialize_inprogress_config, flush_inprogress_config)
+
+
+def main(argv):
+    # Note that one can pass in args positionally, 'monitors.yaml targetid
+    # and target-address' so the hook can be tested without being in a hook
+    # context.
+    #
+    if len(argv) > 1:
+        relation_settings = {'monitors': open(argv[1]).read(),
+                             'target-id': argv[2]}
+        if len(argv) > 3:
+            relation_settings['target-address'] = argv[3]
+        all_relations = {'monitors:99': {'testing/0': relation_settings}}
+    else:
+        all_relations = {}
+        for relid in get_valid_relations():
+            (relname, relnum) = relid.split(':')
+            for unit in get_valid_units(relid):
+                relation_settings = json.loads(
+                    subprocess.check_output(['relation-get', '--format=json',
+                                             '-r', relid,
+                                             '-',unit]).strip())
+
+                if relation_settings is None or relation_settings == '':
+                    continue
+
+                if relname == 'monitors':
+                    if ('monitors' not in relation_settings
+                            or 'target-id' not in relation_settings):
+                        continue
+                else:
+                    # Fake it for the more generic 'nagios' relation'
+                    relation_settings['target-id'] = unit.replace('/','-')
+                    relation_settings['target-address'] = relation_settings['private-address']
+                    relation_settings['monitors'] = {'monitors': {'remote': {} } }
+
+                if relid not in all_relations:
+                    all_relations[relid] = {}
+
+                all_relations[relid][unit] = relation_settings
+
+    # Hack to work around http://pad.lv/1025478
+    targets_with_addresses = set()
+    for relid, units in all_relations.iteritems():
+        for unit, relation_settings in units.iteritems():
+            if 'target-address' in relation_settings:
+                targets_with_addresses.add(relation_settings['target-id'])
+    new_all_relations = {}
+    for relid, units in all_relations.iteritems():
+        for unit, relation_settings in units.iteritems():
+            if relation_settings['target-id'] in targets_with_addresses:
+                if relid not in new_all_relations:
+                    new_all_relations[relid] = {}
+                new_all_relations[relid][unit] = relation_settings
+    all_relations = new_all_relations
+
+    initialize_inprogress_config()
+    for relid, units in all_relations.iteritems():
+        apply_relation_config(relid, units)
+    refresh_hostgroups()
+    flush_inprogress_config()
+    os.system('service nagios3 reload')
+
+def apply_relation_config(relid, units):
+    for unit, relation_settings in units.iteritems():
+        monitors = relation_settings['monitors']
+        target_id = relation_settings['target-id']
+        # If not set, we don't mess with it, as multiple services may feed
+        # monitors in for a particular address. Generally a primary will set this
+        # to its own private-address
+        target_address = relation_settings.get('target-address', None)
+
+        if type(monitors) != dict:
+            monitors = yaml.safe_load(monitors)
+
+        # Output nagios config
+        host = get_pynag_host(target_id)
+
+        if target_address is not None:
+            host.set_attribute('address', target_address)
+        host.save()
+
+        for mon_family, mons in monitors['monitors']['remote'].iteritems():
+            for mon_name, mon in mons.iteritems():
+                service_name = '%s-%s' % (target_id, mon_name)
+                service = get_pynag_service(target_id, service_name)
+                if customize_service(service, mon_family, mon_name, mon):
+                    service.save()
+                else:
+                    print('Ignoring %s due to unknown family %s' % (mon_name,
+                                                                    mon_family))
+
+if __name__ == '__main__':
+    main(sys.argv)
=== added symlink 'hooks/monitors-relation-departed'
=== target is u'monitors-relation-changed'
=== added file 'hooks/mymonitors-relation-joined'
--- hooks/mymonitors-relation-joined 1970-01-01 00:00:00 +0000
+++ hooks/mymonitors-relation-joined 2012-08-02 21:28:44 +0000
@@ -0,0 +1,17 @@
+#!/bin/bash
+if [ -n "$JUJU_RELATION_ID" ] ; then
+    # single relation joined
+    rels=$JUJU_RELATION_ID
+else
+    # Refresh from upgrade or some other place
+    rels=`relation-ids mymonitors`
+fi
+
+target_id=${JUJU_UNIT_NAME//\//-}
+
+for rel in $rels ; do
+    relation-set -r $rel \
+        monitors="`cat monitors.yaml`" \
+        target-address=`unit-get private-address` \
+        target-id=$target_id
+done
=== added symlink 'hooks/nagios-relation-broken'
=== target is u'monitors-relation-changed'
=== removed file 'hooks/nagios-relation-broken'
--- hooks/nagios-relation-broken 2012-05-14 07:15:54 +0000
+++ hooks/nagios-relation-broken 1970-01-01 00:00:00 +0000
@@ -1,12 +0,0 @@
-#!/usr/bin/python
-import glob
-import os
-
-import common
-
-common.remove_hostgroup(os.environ['JUJU_RELATION_ID'])
-glob_target = common.host_config_path_template % (os.environ['JUJU_RELATION_ID'], '*')
-print 'Removing relation config files: %s' % (glob_target)
-for oldconfig in glob.glob(glob_target):
-    print 'Removing %s' % (oldconfig)
-    os.unlink(oldconfig)
=== added symlink 'hooks/nagios-relation-changed'
=== target is u'monitors-relation-changed'
=== removed file 'hooks/nagios-relation-changed'
--- hooks/nagios-relation-changed 2012-05-14 07:15:54 +0000
+++ hooks/nagios-relation-changed 1970-01-01 00:00:00 +0000
@@ -1,79 +0,0 @@
-#!/usr/bin/env python
-
-import string
-import sys
-import os
-import os.path
-import yaml
-import subprocess
-from common import *
-
-from Cheetah.Template import Template
-
-def write_service_template(service, host, description, command):
-    service = service.replace("__hostname__", host)
-    service = service.replace("__description__", description)
-    service = service.replace("__command__", command)
-    return service
-
-def write_host_template(host, hostname, ip_address):
-    host = host.replace("__hostname__", hostname)
-    host = host.replace("__alias__", hostname)
-    host = host.replace("__address__", ip_address)
-    return host
-
-
-def main():
-    for var in ['JUJU_REMOTE_UNIT', 'JUJU_RELATION_ID']:
-        if var not in os.environ:
-            print "%s must be set" % (var)
-            return 1
-    relation_id = os.environ["JUJU_RELATION_ID"]
-    relation_name = os.path.basename(sys.argv[0]).split('-')[0]
-    remote_unit = os.environ["JUJU_REMOTE_UNIT"]
-
-    service_name, _ = remote_unit.split("/")
-    (ip_address, hostname) = get_ip_and_hostname(remote_unit)
-
-    nagios_service = ""
-    host_template = """
-define host {
-    use generic-host ; Name of host template to use
-    host_name __hostname__
-    alias __alias__
-    address __address__
-}
-"""
-    service_template = """
-define service {
-    use generic-service ; Name of service template to use
-    host_name __hostname__
-    service_description __description__
-    check_command __command__
-}
-"""
-
-    # write a single host
-    host_template = write_host_template(host_template, hostname, ip_address)
-    nagios_service += host_template
-
-    # all hosts should be running SSH
-    nagios_service += write_service_template(service_template, hostname, 'SSH', 'check_ssh')
-
-    namespace = {'hostname': hostname, 'nagios_config':nagios_service}
-    t = Template(open('hooks/templates/nagios.tmpl').read(), searchList=[namespace])
-    config_file = host_config_path_template % (relation_id, hostname)
-    f = open(config_file, 'w')
-    f.write(str(t))
-    f.close()
-
-    refresh_hostgroups(relation_name)
-
-    print "Restarting nagios"
-    subprocess.call(["service", "nagios3", "restart"])
-    return 0
-
-if __name__ == '__main__':
-    sys.exit(main())
-
-
678 | === added symlink 'hooks/nagios-relation-departed' | |||
679 | === target is u'monitors-relation-changed' | |||
680 | === removed file 'hooks/nagios-relation-departed' | |||
681 | --- hooks/nagios-relation-departed 2012-05-14 07:15:54 +0000 | |||
682 | +++ hooks/nagios-relation-departed 1970-01-01 00:00:00 +0000 | |||
683 | @@ -1,10 +0,0 @@ | |||
684 | 1 | #!/usr/bin/python | ||
685 | 2 | |||
686 | 3 | import common | ||
687 | 4 | import os | ||
688 | 5 | |||
689 | 6 | relation_id = os.environ['JUJU_RELATION_ID'] | ||
690 | 7 | (_,hostname) = common.get_ip_and_hostname(os.environ['JUJU_REMOTE_UNIT']) | ||
691 | 8 | os.unlink(common.host_config_path_template % (relation_id, hostname)) | ||
692 | 9 | common.refresh_hostgroups(os.path.basename(sys.argv[0]).split('-')[0]) | ||
693 | 10 | subprocess.call(['service','nagios3','restart']) | ||
694 | 11 | 0 | ||
695 | === added file 'hooks/test-common.py' | |||
696 | --- hooks/test-common.py 1970-01-01 00:00:00 +0000 | |||
697 | +++ hooks/test-common.py 2012-08-02 21:28:44 +0000 | |||
698 | @@ -0,0 +1,50 @@ | |||
699 | 1 | from common import ObjectTagCollection | ||
700 | 2 | import os | ||
701 | 3 | |||
702 | 4 | from tempfile import NamedTemporaryFile | ||
703 | 5 | |||
704 | 6 | """ This is meant to test the ObjectTagCollection bits. It should | ||
705 | 7 | probably be made into a proper unit test. """ | ||
706 | 8 | |||
707 | 9 | x = ObjectTagCollection('test-units') | ||
708 | 10 | y = ObjectTagCollection('test-relids') | ||
709 | 11 | |||
710 | 12 | o = NamedTemporaryFile(delete=False) | ||
711 | 13 | o2 = NamedTemporaryFile(delete=False) | ||
712 | 14 | o3 = NamedTemporaryFile(delete=True) | ||
713 | 15 | o.write('some content') | ||
714 | 16 | o.flush() | ||
715 | 17 | |||
716 | 18 | x.tag_object(o.name, 'box-9') | ||
717 | 19 | x.tag_object(o.name, 'nrpe-1') | ||
718 | 20 | y.tag_object(o.name, 'monitors:2') | ||
719 | 21 | x.tag_object(o2.name, 'box-10') | ||
720 | 22 | x.tag_object(o2.name, 'nrpe-2') | ||
721 | 23 | y.tag_object(o2.name, 'monitors:2') | ||
722 | 24 | x.tag_object(o3.name, 'other-0') | ||
723 | 25 | y.tag_object(o3.name, 'monitors:3') | ||
724 | 26 | x.untag_object(o.name, 'box-9') | ||
725 | 27 | x.cleanup_untagged() | ||
726 | 28 | |||
727 | 29 | if not os.path.exists(o.name): | ||
728 | 30 | raise RuntimeError(o.name) | ||
729 | 31 | |||
730 | 32 | x.kill_tag('nrpe-1') | ||
731 | 33 | x.cleanup_untagged() | ||
732 | 34 | |||
733 | 35 | if os.path.exists(o.name): | ||
734 | 36 | raise RuntimeError(o.name) | ||
735 | 37 | |||
736 | 38 | if not os.path.exists(o2.name): | ||
737 | 39 | raise RuntimeError(o2.name) | ||
738 | 40 | |||
739 | 41 | y.kill_tag('monitors:2') | ||
740 | 42 | y.cleanup_untagged(['monitors:1','monitors:3']) | ||
741 | 43 | |||
742 | 44 | if os.path.exists(o.name): | ||
743 | 45 | raise RuntimeError(o.name) | ||
744 | 46 | |||
745 | 47 | if os.path.exists(o2.name): | ||
746 | 48 | raise RuntimeError(o2.name) | ||
747 | 49 | |||
748 | 50 | x.destroy() | ||
749 | 0 | 51 | ||
750 | === modified file 'hooks/upgrade-charm' | |||
751 | --- hooks/upgrade-charm 2012-05-14 07:15:54 +0000 | |||
752 | +++ hooks/upgrade-charm 2012-08-02 21:28:44 +0000 | |||
753 | @@ -1,2 +1,15 @@ | |||
754 | 1 | #!/bin/sh | 1 | #!/bin/sh |
756 | 2 | juju-log -l WARNING 'Relations have been radically changed. Its best to remove any existing relationships and re-establish them.' | 2 | set -e |
757 | 3 | legacy_relations="`relation-ids legacy`" | ||
758 | 4 | if [ -n "$legacy_relations" ] ; then | ||
759 | 5 | juju-log -l WARNING 'Relations have been radically changed. The monitoring interface is not supported anymore.' | ||
760 | 6 | juju-log -l WARNING 'Please use the generic juju-info or the monitors interface' | ||
761 | 7 | fi | ||
762 | 8 | if [ -n "`config-get extraconfig`" ] ; then | ||
763 | 9 | config-get extraconfig > /etc/nagios3/conf.d/extra.cfg | ||
764 | 10 | else | ||
765 | 11 | rm -f /etc/nagios3/conf.d/extra.cfg | ||
766 | 12 | fi | ||
767 | 13 | # Refresh these hooks entirely | ||
768 | 14 | hooks/mymonitors-relation-joined | ||
769 | 15 | hooks/monitors-relation-changed | ||
770 | 3 | 16 | ||
771 | === modified file 'metadata.yaml' | |||
772 | --- metadata.yaml 2012-05-22 22:10:53 +0000 | |||
773 | +++ metadata.yaml 2012-08-02 21:28:44 +0000 | |||
774 | @@ -8,8 +8,12 @@ | |||
775 | 8 | provides: | 8 | provides: |
776 | 9 | website: | 9 | website: |
777 | 10 | interface: http | 10 | interface: http |
778 | 11 | mymonitors: | ||
779 | 12 | interface: monitors | ||
780 | 11 | requires: | 13 | requires: |
781 | 12 | legacy: | 14 | legacy: |
782 | 13 | interface: monitoring | 15 | interface: monitoring |
783 | 14 | nagios: | 16 | nagios: |
784 | 15 | interface: juju-info | 17 | interface: juju-info |
785 | 18 | monitors: | ||
786 | 19 | interface: monitors | ||
787 | 16 | 20 | ||
788 | === added file 'monitors.yaml' | |||
789 | --- monitors.yaml 1970-01-01 00:00:00 +0000 | |||
790 | +++ monitors.yaml 2012-08-02 21:28:44 +0000 | |||
791 | @@ -0,0 +1,11 @@ | |||
792 | 1 | version: '0.3' | ||
793 | 2 | monitors: | ||
794 | 3 | local: | ||
795 | 4 | procrunning: | ||
796 | 5 | min: 1 | ||
797 | 6 | name: '/usr/sbin/nagios3' | ||
798 | 7 | remote: | ||
799 | 8 | http: | ||
800 | 9 | nagios: | ||
801 | 10 | path: /nagios3/ | ||
802 | 11 | status: 'HTTP/1.1 401' | ||
803 | 0 | 12 | ||
804 | === modified file 'revision' | |||
805 | --- revision 2012-05-14 23:48:24 +0000 | |||
806 | +++ revision 2012-08-02 21:28:44 +0000 | |||
807 | @@ -1,1 +1,1 @@ | |||
809 | 1 | 22 | 1 | 40 |
For an example of the changes needed to add remote and local monitors, see
https://code.launchpad.net/~clint-fewbar/charms/precise/mysql/add-monitors/+merge/118000
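
As a rough illustration of the remote monitor layout used in this branch's monitors.yaml (check type, then check name, then parameters), here is a minimal Python sketch. The function name `list_remote_checks` and the in-memory dict are hypothetical stand-ins; how the charm's hooks actually consume this data is not shown here, so treat this only as a way to read the structure.

```python
# Hypothetical in-memory copy of the 'remote' section of monitors.yaml
# from this diff. A real hook would load it from the relation or file.
MONITORS = {
    "version": "0.3",
    "monitors": {
        "remote": {
            "http": {
                "nagios": {"path": "/nagios3/", "status": "HTTP/1.1 401"},
            },
        },
    },
}


def list_remote_checks(monitors_doc):
    """Return (check_type, check_name, params) tuples for remote monitors.

    Assumes the nesting remote -> type -> name -> params, as in the
    monitors.yaml added by this branch. Missing sections yield no checks.
    """
    remote = monitors_doc.get("monitors", {}).get("remote", {})
    checks = []
    for check_type, named_checks in sorted(remote.items()):
        for check_name, params in sorted(named_checks.items()):
            checks.append((check_type, check_name, dict(params)))
    return checks


for check_type, check_name, params in list_remote_checks(MONITORS):
    print(check_type, check_name, params)
```

Each tuple names one check that a monitoring service could run against the unit from outside, here an HTTP request to /nagios3/ that expects a 401 status line.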