Merge lp:~sinzui/charms/precise/juju-gui/nagios into lp:charms/juju-gui

Proposed by Curtis Hovey
Status: Merged
Merged at revision: 74
Proposed branch: lp:~sinzui/charms/precise/juju-gui/nagios
Merge into: lp:charms/juju-gui
Diff against target: 446 lines (+378/-3)
8 files modified
config.yaml (+11/-1)
files/nrpe-external-master/check-app-access.sh (+18/-0)
metadata.yaml (+3/-0)
revision (+1/-1)
scripts/charmsupport/hookenv.py (+150/-0)
scripts/charmsupport/nrpe.py (+169/-0)
scripts/update-nrpe.py (+14/-0)
tests/20-functional.test (+12/-1)
To merge this branch: bzr merge lp:~sinzui/charms/precise/juju-gui/nagios
Reviewer                  Review Type   Date Requested   Status
Benji York (community)                                   Approve
charmers                                                 Pending
Review via email: mp+177588@code.launchpad.net

Description of the change

Add nagios support.

RULES

    Pre-implementation: sidnei, moon127, wedgwood
    * We want to support the nagios subordinate charm to monitor production.
    * Webops prefer to use the nrpe module from charmsupport.
      * The project merged with charmhelpers (not related to the charm-helpers
        ppa that all PPA configured juju deploys get).
      * Webops write a custom script for each charm they deploy to embed the
        library in the charm; they always deploy a forked charm. The charm
        runs with a modified python path to find the included libs.
      * Webops require the charm to ensure errors installing dependencies
        are ignored since they have ensured the libs are already in the charm.
        * Or have they? Every change to the charm requires review of a webop
          to ensure the building of a fat charm really works.
    * I decided to include just the nrpe module and its dependencies in
      the charm's tree.
      * No action is needed to build out a fat charm.
      * No future coordination with webops to ensure the charm really
        installs its deps.
      * I chose an older version of nrpe because it has fewer deps than
        the current version. I chose the same version used by charmworld.
        I'd like to keep these two charms synced to simplify the parts used
        to deploy charmworld.com
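
Including the library in the charm's tree works without a modified python path because CPython resolves symlinks before it puts the script's directory on sys.path. A minimal sketch of that lookup, assuming the layout of this branch (a hook in hooks/ symlinked to scripts/update-nrpe.py, with scripts/charmsupport/ next to it); the helper name bundled_lib_dir is hypothetical and not part of the charm:

```python
import os


def bundled_lib_dir(hook_path):
    """Return the directory CPython searches first when running a hook.

    The hook is a symlink; the interpreter follows it to the real script,
    so packages that sit beside the real script (here: charmsupport/)
    can be imported without changing PYTHONPATH.
    """
    return os.path.dirname(os.path.realpath(hook_path))
```

For the hook added here, this would point at the charm's scripts/ directory, which is where charmsupport lives.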

QA

    DEPLOY JUJU-GUI AND NAGIOS
    juju bootstrap
    juju deploy --repo=~/Work/juju-charms/ local:precise/juju-gui
    juju deploy nagios
    juju add-relation nagios juju-gui
    juju expose nagios
    Verify you can see juju-gui in nagios with a single ssh check

    DEPLOY NRPE-EXTERNAL-MASTER
    juju deploy nrpe-external-master
    juju set nrpe-external-master nagios_master=http://10.0.3.45/<nagio-public-ip>
    juju add-relation nrpe-external-master:nrpe-external-master juju-gui:nrpe-external-master

    Verify you can see the nagios external cfg files in juju-gui's file system.
    ls /var/lib/nagios/export/
    Verify the unit config "host__juju-juju-gui-0.cfg" has the nagios IP address.
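
The file name in the last QA step follows from how nrpe.py in this branch builds the Nagios host name: the nagios_context value, a dash, then the unit name with "/" replaced by "-". A small sketch of that derivation; the function names are hypothetical, and the "host__NAME.cfg" pattern is taken from the QA note rather than from code in this branch (the file itself is written by the subordinate charm):

```python
def nagios_hostname(nagios_context, unit_name):
    """Build the Nagios host name the same way NRPE.__init__ does."""
    return "{}-{}".format(nagios_context, unit_name.replace('/', '-'))


def host_config_name(nagios_context, unit_name):
    """Name of the exported host file, assuming the host__NAME.cfg pattern."""
    return "host__{}.cfg".format(nagios_hostname(nagios_context, unit_name))
```

With the default nagios_context of "juju" and the unit juju-gui/0, this gives the file name used in the QA step.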

IMPLEMENTATION

    * I added just the parts of the nrpe library needed.
      scripts/charmsupport/__init__.py
      scripts/charmsupport/hookenv.py
      scripts/charmsupport/nrpe.py

    * I added a hook that calls the update-nrpe.py script that in turn
      registers the check-front-page.sh script. nrpe requires the
      check scripts to be in files/nrpe-external-master/.
      hooks/nrpe-external-master-relation-changed
      scripts/update-nrpe.py
      files/nrpe-external-master/check-front-page.sh

    * I updated the metadata and config to enable the subordinate charm.
      config.yaml
      metadata.yaml
      revision
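
For reference, registering a check ends with a small nrpe command file being written on the unit. The sketch below mirrors the format strings used by Check.write in scripts/charmsupport/nrpe.py; the function names are hypothetical, and the example arguments in the note are assumptions, not values from this branch:

```python
import os

# Directory used by nrpe.py in this branch for per-check command files.
NRPE_CONFDIR = '/etc/nagios/nrpe.d'


def nrpe_check_path(shortname):
    """Path of the command file written for a check."""
    return os.path.join(NRPE_CONFDIR, 'check_{}.cfg'.format(shortname))


def render_nrpe_check(shortname, check_cmd):
    """Text of the command file: a comment line plus the command mapping."""
    return "# check {0}\ncommand[check_{0}]={1}\n".format(shortname, check_cmd)
```

Calling render_nrpe_check('app', '/path/to/check-script.sh') yields a "command[check_app]=..." line that the nagios-nrpe-server picks up after the reload.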

78. By Curtis Hovey

Merged tip and resolved conflicts.

79. By Curtis Hovey

Merged tip.

Benji York (benji) wrote :

Thanks for wiring this up. Monitoring is essential.

Given that this MP doesn't have Rietveld links, I assume you didn't use
lbox to propose it. Despite its many annoyances, lbox is the
prescribed way of proposing and landing GUI branches. The most
important thing it does is to run "make check" before proposing a
branch. Please manually run that command before landing. Thanks.

> I decided to include just the nrpe module and its dependencies in the
> charm's tree.

It's a shame that our policies push us toward doing things like copying
chunks of code from outdated dependencies, but I can't proffer anything
better.

In check-front-page.sh (line 31 of the diff) shouldn't it be "sites-enabled"
instead of "sites-available"?

Also, I suggest using "https://127.0.0.1:443/juju-ui/version.js" for ADDRESS
and "jujuGuiVersionInfo" for LIFE_SIGN. Being code, that is less likely
to change because of a UI/UX redesign, and we can write a test named
"testMonitoringEndpoint" that fails if it does change. Oh, and we need
that test in this branch too.

Once the above are addressed, this will be ready to land.

review: Approve
80. By Curtis Hovey

change script to check-app-access.sh because it now checks that the
GUI can be downloaded.
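
The decision the renamed script makes can be sketched in Python: look for the life sign in the downloaded body and map the result to a Nagios plugin exit status (0 is OK, 2 is CRITICAL). This is an illustration only, not the shell script in the diff; the function name evaluate_response and the OK message are hypothetical:

```python
# Nagios plugin exit statuses used by the check.
OK = 0
CRITICAL = 2


def evaluate_response(body, life_sign='jujuGuiVersionInfo'):
    """Return (exit_status, message) for the body fetched from ADDRESS."""
    if life_sign in body:
        return OK, 'juju-gui returned version.js.'
    return CRITICAL, 'juju-gui did not return content indicating it was loading.'
```

A wrapper would fetch the ADDRESS, print the message, and exit with the returned status.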

81. By Curtis Hovey

Added test to correlate check-app-access.sh to the path it checks.

Gary Poster (gary) wrote :

Hey Curtis. Thank you! Is it reasonable/easy to merge these revisions into the ~juju-gui charm branch as well, or are there conflicts?

Preview Diff

=== modified file 'config.yaml'
--- config.yaml 2013-07-15 20:40:15 +0000
+++ config.yaml 2013-08-08 17:05:15 +0000
@@ -159,7 +159,7 @@
             canvas. This is also known as browse mode.
         - 'minimized': the charmbrowser will be minimized by default, and hidden.
     type: string
-    default: sidebar 
+    default: sidebar
   show-get-juju-button:
     description: |
       There are deployment modes for Juju GUI which are not intended as regular
@@ -167,3 +167,13 @@
       link to juju.ubuntu.com
     type: boolean
     default: false
+  nagios_context:
+    default: "juju"
+    type: string
+    description: |
+      Used by the nrpe-external-master subordinate charm.
+      A string that will be prepended to instance name to set the host name
+      in nagios. So for instance the hostname would be something like:
+          juju-myservice-0
+      If you're running multiple environments with the same services in them
+      this allows you to differentiate between them.
=== added directory 'files'
=== added directory 'files/nrpe-external-master'
=== added file 'files/nrpe-external-master/check-app-access.sh'
--- files/nrpe-external-master/check-app-access.sh 1970-01-01 00:00:00 +0000
+++ files/nrpe-external-master/check-app-access.sh 2013-08-08 17:05:15 +0000
@@ -0,0 +1,18 @@
+#!/bin/bash
+SITE_CONF='/etc/apache2/sites-enabled/juju-gui'
+ADDRESS='https://127.0.0.1:443/juju-ui/version.js'
+LIFE_SIGN='jujuGuiVersionInfo'
+
+if [[ ! -f $SITE_CONF ]]; then
+    echo Apache is not configured serve juju-gui.
+    exit 2
+fi
+
+match=$(curl -k $ADDRESS | grep "$LIFE_SIGN")
+
+if [[ -n "$match" ]]; then
+    exit 0
+else
+    echo juju-gui did not return content indicating it was loading.
+    exit 2
+fi
=== added symlink 'hooks/nrpe-external-master-relation-changed'
=== target is u'../scripts/update-nrpe.py'
=== modified file 'metadata.yaml'
--- metadata.yaml 2013-06-11 14:13:45 +0000
+++ metadata.yaml 2013-08-08 17:05:15 +0000
@@ -22,3 +22,6 @@
 provides:
   web:
     interface: http
+  nrpe-external-master:
+    interface: nrpe-external-master
+    scope: container
=== modified file 'revision'
--- revision 2013-08-02 14:19:45 +0000
+++ revision 2013-08-08 17:05:15 +0000
@@ -1,1 +1,1 @@
-165
+166
=== added directory 'scripts'
=== added directory 'scripts/charmsupport'
=== added file 'scripts/charmsupport/__init__.py'
=== added file 'scripts/charmsupport/hookenv.py'
--- scripts/charmsupport/hookenv.py 1970-01-01 00:00:00 +0000
+++ scripts/charmsupport/hookenv.py 2013-08-08 17:05:15 +0000
@@ -0,0 +1,150 @@
+"Interactions with the Juju environment"
+# source: 27:lp:charmsupport
+# Copyright 2012 Canonical Ltd.
+#
+# Authors:
+# Matthew Wedgwood <matthew.wedgwood@canonical.com>
+
+import os
+import json
+import yaml
+import subprocess
+
+CRITICAL = "CRITICAL"
+ERROR = "ERROR"
+WARNING = "WARNING"
+INFO = "INFO"
+DEBUG = "DEBUG"
+def log(message, level=DEBUG):
+    "Write a message to the juju log"
+    subprocess.call( [ 'juju-log', '-l', level, message ] )
+
+class Serializable(object):
+    "Wrapper, an object that can be serialized to yaml or json"
+    def __init__(self, obj):
+        # wrap the object
+        super(Serializable, self).__init__()
+        self._wrapped_obj = obj
+
+    def __getattr__(self, attr):
+        # see if this object has attr
+        if attr in self.__dict__:
+            return getattr(self, attr)
+        # proxy to the wrapped object
+        return self[attr]
+
+    def __getitem__(self, key):
+        return self._wrapped_obj[key]
+
+    def json(self):
+        "Serialize the object to json"
+        return json.dumps(self._wrapped_obj)
+
+    def yaml(self):
+        "Serialize the object to yaml"
+        return yaml.dump(self._wrapped_obj)
+
+def execution_environment():
+    """A convenient bundling of the current execution context"""
+    context = {}
+    context['conf'] = config()
+    context['unit'] = local_unit()
+    context['rel'] = relations_of_type()
+    context['env'] = os.environ
+    return context
+
+def in_relation_hook():
+    "Determine whether we're running in a relation hook"
+    return os.environ.has_key('JUJU_RELATION')
+
+def relation_type():
+    "The scope for the current relation hook"
+    return os.environ['JUJU_RELATION']
+def relation_id():
+    "The relation ID for the current relation hook"
+    return os.environ['JUJU_RELATION_ID']
+def local_unit():
+    "Local unit ID"
+    return os.environ['JUJU_UNIT_NAME']
+def remote_unit():
+    "The remote unit for the current relation hook"
+    return os.environ['JUJU_REMOTE_UNIT']
+
+def config(scope=None):
+    "Juju charm configuration"
+    config_cmd_line = ['config-get']
+    if scope is not None:
+        config_cmd_line.append(scope)
+    config_cmd_line.append('--format=json')
+    try:
+        config_data = json.loads(subprocess.check_output(config_cmd_line))
+    except (ValueError, OSError, subprocess.CalledProcessError) as err:
+        log(str(err), level=ERROR)
+        raise err
+    return Serializable(config_data)
+
+def relation_ids(reltype=None):
+    "A list of relation_ids"
+    reltype = reltype or relation_type()
+    relids = []
+    relid_cmd_line = ['relation-ids', '--format=json', reltype]
+    relids.extend(json.loads(subprocess.check_output(relid_cmd_line)))
+    return relids
+
+def related_units(relid=None):
+    "A list of related units"
+    relid = relid or relation_id()
+    units_cmd_line = ['relation-list', '--format=json', '-r', relid]
+    units = json.loads(subprocess.check_output(units_cmd_line))
+    return units
+
+def relation_for_unit(unit=None):
+    "Get the json represenation of a unit's relation"
+    unit = unit or remote_unit()
+    relation_cmd_line = ['relation-get', '--format=json', '-', unit]
+    try:
+        relation = json.loads(subprocess.check_output(relation_cmd_line))
+    except (ValueError, OSError, subprocess.CalledProcessError), err:
+        log(str(err), level=ERROR)
+        raise err
+    for key in relation:
+        if key.endswith('-list'):
+            relation[key] = relation[key].split()
+    relation['__unit__'] = unit
+    return Serializable(relation)
+
+def relations_for_id(relid=None):
+    "Get relations of a specific relation ID"
+    relation_data = []
+    relid = relid or relation_ids()
+    for unit in related_units(relid):
+        unit_data = relation_for_unit(unit)
+        unit_data['__relid__'] = relid
+        relation_data.append(unit_data)
+    return relation_data
+
+def relations_of_type(reltype=None):
+    "Get relations of a specific type"
+    relation_data = []
+    if in_relation_hook():
+        reltype = reltype or relation_type()
+        for relid in relation_ids(reltype):
+            for relation in relations_for_id(relid):
+                relation['__relid__'] = relid
+                relation_data.append(relation)
+    return relation_data
+
+class UnregisteredHookError(Exception): pass
+
+class Hooks(object):
+    def __init__(self):
+        super(Hooks, self).__init__()
+        self._hooks = {}
+    def register(self, name, function):
+        self._hooks[name] = function
+    def execute(self, args):
+        hook_name = os.path.basename(args[0])
+        if hook_name in self._hooks:
+            self._hooks[hook_name]()
+        else:
+            raise UnregisteredHookError(hook_name)
=== added file 'scripts/charmsupport/nrpe.py'
--- scripts/charmsupport/nrpe.py 1970-01-01 00:00:00 +0000
+++ scripts/charmsupport/nrpe.py 2013-08-08 17:05:15 +0000
@@ -0,0 +1,169 @@
+"""Compatibility with the nrpe-external-master charm"""
+# source: 27:lp:charmsupport
+# Copyright 2012 Canonical Ltd.
+#
+# Authors:
+# Matthew Wedgwood <matthew.wedgwood@canonical.com>
+
+import subprocess
+import pwd
+import grp
+import os
+import re
+import shlex
+
+from hookenv import config, local_unit
+
+# This module adds compatibility with the nrpe_external_master
+# subordinate charm. To use it in your charm:
+#
+# 1. Update metadata.yaml
+#
+#    provides:
+#      (...)
+#      nrpe-external-master:
+#        interface: nrpe-external-master
+#        scope: container
+#
+# 2. Add the following to config.yaml
+#
+#    nagios_context:
+#      default: "juju"
+#      type: string
+#      description: |
+#        Used by the nrpe-external-master subordinate charm.
+#        A string that will be prepended to instance name to set the host name
+#        in nagios. So for instance the hostname would be something like:
+#            juju-myservice-0
+#        If you're running multiple environments with the same services in them
+#        this allows you to differentiate between them.
+#
+# 3. Add custom checks (Nagios plugins) to files/nrpe-external-master
+#
+# 4. Update your hooks.py with something like this:
+#
+#    import nrpe
+#    (...)
+#    def update_nrpe_config():
+#        nrpe_compat = NRPE("myservice")
+#        nrpe_compat.add_check(
+#            shortname = "myservice",
+#            description = "Check MyService",
+#            check_cmd = "check_http -w 2 -c 10 http://localhost"
+#            )
+#        nrpe_compat.add_check(
+#            "myservice_other",
+#            "Check for widget failures",
+#            check_cmd = "/srv/myapp/scripts/widget_check"
+#            )
+#        nrpe_compat.write()
+#
+#    def config_changed():
+#        (...)
+#        update_nrpe_config()
+#    def nrpe_external_master_relation_changed():
+#        update_nrpe_config()
+#
+# 5. ln -s hooks.py nrpe-external-master-relation-changed
+
+class CheckException(Exception): pass
+class Check(object):
+    shortname_re = '[A-Za-z0-9-_]*'
+    service_template = """
+#---------------------------------------------------
+# This file is Juju managed
+#---------------------------------------------------
+define service {{
+    use active-service
+    host_name {nagios_hostname}
+    service_description {nagios_hostname}[{shortname}] {description}
+    check_command check_nrpe!check_{shortname}
+    servicegroups {nagios_servicegroup}
+}}
+"""
+    def __init__(self, shortname, description, check_cmd):
+        super(Check, self).__init__()
+        # XXX: could be better to calculate this from the service name
+        if not re.match(self.shortname_re, shortname):
+            raise CheckException("shortname must match {}".format(Check.shortname_re))
+        self.shortname = shortname
+        # Note: a set of invalid characters is defined by the Nagios server config
+        # The default is: illegal_object_name_chars=`~!$%^&*"|'<>?,()=
+        self.description = description
+        self.check_cmd = self._locate_cmd(check_cmd)
+
+    def _locate_cmd(self, check_cmd):
+        search_path = (
+            '/',
+            os.path.join(os.environ['CHARM_DIR'], 'files/nrpe-external-master'),
+            '/usr/lib/nagios/plugins',
+        )
+        command = shlex.split(check_cmd)
+        for path in search_path:
+            if os.path.exists(os.path.join(path,command[0])):
+                return os.path.join(path, command[0]) + " " + " ".join(command[1:])
+        subprocess.call(['juju-log', 'Check command not found: {}'.format(command[0])])
+        return ''
+
+    def write(self, nagios_context, hostname):
+        for f in os.listdir(NRPE.nagios_exportdir):
+            if re.search('.*check_{}.cfg'.format(self.shortname), f):
+                os.remove(os.path.join(NRPE.nagios_exportdir, f))
+
+        templ_vars = {
+            'nagios_hostname': hostname,
+            'nagios_servicegroup': nagios_context,
+            'description': self.description,
+            'shortname': self.shortname,
+        }
+        nrpe_service_text = Check.service_template.format(**templ_vars)
+        nrpe_service_file = '{}/service__{}_check_{}.cfg'.format(
+            NRPE.nagios_exportdir, hostname, self.shortname)
+        with open(nrpe_service_file, 'w') as nrpe_service_config:
+            nrpe_service_config.write(str(nrpe_service_text))
+
+        nrpe_check_file = '/etc/nagios/nrpe.d/check_{}.cfg'.format(self.shortname)
+        with open(nrpe_check_file, 'w') as nrpe_check_config:
+            nrpe_check_config.write("# check {}\n".format(self.shortname))
+            nrpe_check_config.write("command[check_{}]={}\n".format(
+                self.shortname, self.check_cmd))
+
+    def run(self):
+        subprocess.call(self.check_cmd)
+
+class NRPE(object):
+    nagios_logdir = '/var/log/nagios'
+    nagios_exportdir = '/var/lib/nagios/export'
+    nrpe_confdir = '/etc/nagios/nrpe.d'
+    def __init__(self):
+        super(NRPE, self).__init__()
+        self.config = config()
+        self.nagios_context = self.config['nagios_context']
+        self.unit_name = local_unit().replace('/', '-')
+        self.hostname = "{}-{}".format(self.nagios_context, self.unit_name)
+        self.checks = []
+
+    def add_check(self, *args, **kwargs):
+        self.checks.append( Check(*args, **kwargs) )
+
+    def write(self):
+        try:
+            nagios_uid = pwd.getpwnam('nagios').pw_uid
+            nagios_gid = grp.getgrnam('nagios').gr_gid
+        except:
+            subprocess.call(['juju-log', "Nagios user not set up, nrpe checks not updated"])
+            return
+
+        if not os.path.exists(NRPE.nagios_exportdir):
+            subprocess.call(['juju-log', 'Exiting as {} is not accessible'.format(NRPE.nagios_exportdir)])
+            return
+
+        if not os.path.exists(NRPE.nagios_logdir):
+            os.mkdir(NRPE.nagios_logdir)
+            os.chown(NRPE.nagios_logdir, nagios_uid, nagios_gid)
+
+        for nrpecheck in self.checks:
+            nrpecheck.write(self.nagios_context, self.hostname)
+
+        if os.path.isfile('/etc/init.d/nagios-nrpe-server'):
+            subprocess.call(['service', 'nagios-nrpe-server', 'reload'])
=== added file 'scripts/update-nrpe.py'
--- scripts/update-nrpe.py 1970-01-01 00:00:00 +0000
+++ scripts/update-nrpe.py 2013-08-08 17:05:15 +0000
@@ -0,0 +1,14 @@
+#!/usr/bin/env python
+from charmsupport import nrpe
+
+
+def update_nrpe_config():
+    nrpe_compat = nrpe.NRPE()
+    nrpe_compat.add_check(
+        'App is accessible', 'Check the app can be downloaded',
+        'check-app-access.sh')
+    nrpe_compat.write()
+
+
+if __name__ == '__main__':
+    update_nrpe_config()
=== modified file 'tests/20-functional.test'
--- tests/20-functional.test 2013-08-01 14:33:00 +0000
+++ tests/20-functional.test 2013-08-08 17:05:15 +0000
@@ -202,7 +202,18 @@
         self.assertIn('max-age=0', cache_directives)
         self.assertIn('public', cache_directives)
         self.assertIn('must-revalidate', cache_directives)
-        
+
+    def test_nrpe_check_available(self):
+        # Make sure the check-app-access.sh script's ADDRESS is available.
+        unit_info = self.juju_deploy(
+            self.charm, options={'juju-gui-source': JUJU_GUI_TEST_BRANCH})
+        hostname = unit_info['public-address']
+        conn = httplib.HTTPSConnection(hostname)
+        # This request matches the ADDRESS var in the script.
+        conn.request('GET', '/juju-ui/version.js')
+        message = 'ADDRESS in check-app-access.sh is not accessible.'
+        self.assertEqual(200, conn.getresponse().status, message)
+
 
     @unittest.skipIf(is_legacy_juju, 'force-machine only works in juju-core')
     def test_force_machine(self):
