Merge lp:~thedac/capomastro/charm-configurable-nagios-context into lp:~ubuntuone-hackers/capomastro/charm

Proposed by David Ames
Status: Merged
Merged at revision: 59
Proposed branch: lp:~thedac/capomastro/charm-configurable-nagios-context
Merge into: lp:~ubuntuone-hackers/capomastro/charm
Diff against target: 66 lines (+20/-7)
3 files modified
config.yaml (+10/-0)
hooks/nrpe-external-master-relation-changed (+9/-6)
templates/nrpe_service_file.tmpl (+1/-1)
To merge this branch: bzr merge lp:~thedac/capomastro/charm-configurable-nagios-context
Reviewer Review Type Date Requested Status
Daniel Manrique (community) Approve
Review via email: mp+250242@code.launchpad.net

Description of the change

Make nagios_context configurable rather than hardcoded so staging and production use different service/hostgroups

Fix bug missing celery description for the celery check

This does not have to be the solution but something very much like this is required.

To post a comment you must log in.
Revision history for this message
Daniel Manrique (roadmr) wrote :

Thanks for contributing this to our charm :)

TL;DR We can merge this if you think the issues I comment on below are not going to affect your use case, though we should fix them down the line (and that's work that we can tackle later, no need to trouble you with this).

Now for the gory details.

I noticed one thing while testing this (which I did by running the spec, while forgetting to set the proper nagios_context). nagios_context is only used in the nrpe-external-master-relation-changed hook; so:

1) If the nagios_context is changed via a juju set, the .cfg file won't be updated.
2) If the nagios_context is changed and *then* the relation is destroyed/readded, then a .cfg file is created with the new nagios_context *but* the old one is not deleted.

I'm talking about the files in /etc/nagios/nrpe.d; I notice (and checked) that you nuke /var/lib/nagios/export/service_* in the hook, so that part is covered.

I don't know enough about nagios to tell if this is important; I also imagine that for deployment purposes you won't make the same rookie mistake I made by not setting nagios_context correctly from the getgo. I'm OK with this addition to our charm if it serves your purposes within the constraints I just mentioned, but (pending your input on whether it's needed), I'd like to later fix the hook so it does the right thing with files in /etc/ (unless you tell me they're not really needed), and so that they also get updated upon changing the option in question.

review: Needs Information
Revision history for this message
David Ames (thedac) wrote :

Daniel,

Thanks for looking at this.

> Thanks for contributing this to our charm :)
>
> TL;DR We can merge this if you think the issues I comment on below are not
> going to affect your use case, though we should fix them down the line (and
> that's work that we can tackle later, no need to trouble you with this).
>
> Now for the gory details.
>
> I noticed one thing while testing this (which I did by running the spec, while
> forgetting to set the proper nagios_context). nagios_context is only used in
> the nrpe-external-master-relation-changed hook; so:
>
> 1) If the nagios_context is changed via a juju set, the .cfg file won't be
> updated.

This would be ideal. That would mean the nagios configs need to be generated during a config-changed hook and not just nrpe-external-master changed. I didn't want to muck about in someone else’s charm so I changed as little as possible.

However, what you are describing is the right thing to do. Nagios config generation code should be in a common area that multiple hooks can call. I'll leave that to your team.

> 2) If the nagios_context is changed and *then* the relation is
> destroyed/readded, then a .cfg file is created with the new nagios_context
> *but* the old one is not deleted.

Right see above, due nagios configs not being generated in config-changed yet. The left over file in this case is OK, though it may make sense to name it in such a way that the nagios_context does not affect the file name.

> I'm talking about the files in /etc/nagios/nrpe.d; I notice (and checked) that
> you nuke /var/lib/nagios/export/service_* in the hook, so that part is
> covered.

Right the above is only true for /etc/nagios/nrpe.d

> I don't know enough about nagios to tell if this is important; I also imagine
> that for deployment purposes you won't make the same rookie mistake I made by
> not setting nagios_context correctly from the getgo. I'm OK with this addition
> to our charm if it serves your purposes within the constraints I just
> mentioned, but (pending your input on whether it's needed), I'd like to later
> fix the hook so it does the right thing with files in /etc/ (unless you tell
> me they're not really needed), and so that they also get updated upon changing
> the option in question.

Ultimately we (IS) just need this functionality added. This MP was as much an example as a requested MP. So feel free to dump this MP and have your team work on a complete solution. Having said that, this MP will fix our immediate problem and could be a stepping stone.

Revision history for this message
Daniel Manrique (roadmr) wrote :

Gotcha, OK, I'll merge this now to unblock you guys, as it won't break anything *we* depend on and it fixes things for you. But I'll file a bug in our project to improve this down the road.

Thanks again!

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'config.yaml'
2--- config.yaml 2014-12-16 16:53:12 +0000
3+++ config.yaml 2015-02-19 01:03:00 +0000
4@@ -47,3 +47,13 @@
5 type: string
6 default: "ppa:ce-infrastructure/capomastro"
7 description: "The PPA address or private repository line for apt-add-repository"
8+ nagios_context:
9+ default: "juju"
10+ type: string
11+ description: >
12+ Used by the nrpe-external-master subordinate charm.
13+ A string that will be prepended to instance name to set the host name
14+ in nagios. So for instance the hostname would be something like:
15+ juju-postgresql-0
16+ If you're running multiple environments with the same services in them
17+ this allows you to differentiate between them.
18
19=== modified file 'hooks/nrpe-external-master-relation-changed'
20--- hooks/nrpe-external-master-relation-changed 2014-12-19 11:00:12 +0000
21+++ hooks/nrpe-external-master-relation-changed 2015-02-19 01:03:00 +0000
22@@ -11,22 +11,25 @@
23 # a very opinionated charm by providing hardcoded values
24 # that possibly no one would need to change, hence we'll
25 # have a less cluttered config.yaml file
26+ export t_nagios_context="$(config-get nagios_context)"
27 export t_nagios_project="PES"
28- export t_nagios_instance_type="production"
29- export t_nagios_servicegroup="pes-prodstack-cert-infra"
30- export t_nagios_hostname="pes-prodstack-cert-infra-${unit_name//\//-}"
31+ export t_nagios_servicegroup="$t_nagios_context"
32+ export t_nagios_hostname="${t_nagios_context}-${unit_name//\//-}"
33+
34+ # Clean slate
35+ rm -f /var/lib/nagios/export/service_*
36
37 # check for the actual web page being available or not
38 export t_nagios_service_description="Capomastro build server"
39- export t_nagios_command_name="check_pes_capomastro_app_${t_nagios_instance_type}"
40+ export t_nagios_command_name="check_${t_nagios_context}_capomastro_app"
41 export t_nagios_command="/usr/lib/nagios/plugins/check_http -I ${private_address} -p ${port} -u / -e '302 FOUND' -u / -d 'http://capomastro/accounts/login/'"
42
43 cheetah fill --env -p templates/nrpe_command_file.tmpl > /etc/nagios/nrpe.d/${t_nagios_command_name}.cfg
44 cheetah fill --env -p templates/nrpe_service_file.tmpl > /var/lib/nagios/export/service_${t_nagios_hostname}_${t_nagios_command_name}.cfg
45
46 # check the status of the celery workers
47- export t_service_description="Capomastro celery workers"
48- export t_nagios_command_name="check_pes_capomastro_celery_${t_nagios_instance_type}"
49+ export t_nagios_service_description="Capomastro celery workers"
50+ export t_nagios_command_name="check_${t_nagios_context}_capomastro_celery"
51 export t_nagios_command="/usr/local/lib/nagios/plugins/check_status_file.sh /var/log/capomastro/capomastro-celery-status"
52
53 cheetah fill --env -p templates/nrpe_command_file.tmpl > /etc/nagios/nrpe.d/${t_nagios_command_name}.cfg
54
55=== modified file 'templates/nrpe_service_file.tmpl'
56--- templates/nrpe_service_file.tmpl 2014-11-04 19:33:38 +0000
57+++ templates/nrpe_service_file.tmpl 2015-02-19 01:03:00 +0000
58@@ -2,7 +2,7 @@
59 define service {
60 use active-service
61 host_name ${t_nagios_hostname}
62- service_description ${t_nagios_hostname} ${t_nagios_project} ${t_nagios_instance_type} ${t_nagios_service_description}
63+ service_description ${t_nagios_hostname} ${t_nagios_project} ${t_nagios_context} ${t_nagios_service_description}
64 check_command check_nrpe!${t_nagios_command_name}
65 servicegroups ${t_nagios_servicegroup},
66 }

Subscribers

People subscribed via source and target branches

to all changes: