Merge lp:~brad-marshall/charms/trusty/hacluster/add-nrpe-checks-update-charmhelpers into lp:~openstack-charmers/charms/trusty/hacluster/next

Proposed by Billy Olsen
Status: Merged
Merged at revision: 46
Proposed branch: lp:~brad-marshall/charms/trusty/hacluster/add-nrpe-checks-update-charmhelpers
Merge into: lp:~openstack-charmers/charms/trusty/hacluster/next
Diff against target: 1577 lines (+1456/-1) (has conflicts)
12 files modified
charm-helpers.yaml (+1/-0)
config.yaml (+16/-0)
files/nrpe/check_corosync_rings (+99/-0)
files/nrpe/check_crm (+201/-0)
files/sudoers/nagios (+5/-0)
hooks/charmhelpers/contrib/charmsupport/__init__.py (+15/-0)
hooks/charmhelpers/contrib/charmsupport/nrpe.py (+360/-0)
hooks/charmhelpers/contrib/charmsupport/volumes.py (+175/-0)
hooks/charmhelpers/core/strutils.py (+42/-0)
hooks/charmhelpers/core/unitdata.py (+477/-0)
hooks/hooks.py (+62/-1)
metadata.yaml (+3/-0)
Conflict adding file hooks/charmhelpers/core/strutils.py.  Moved existing file to hooks/charmhelpers/core/strutils.py.moved.
Conflict adding file hooks/charmhelpers/core/unitdata.py.  Moved existing file to hooks/charmhelpers/core/unitdata.py.moved.
To merge this branch: bzr merge lp:~brad-marshall/charms/trusty/hacluster/add-nrpe-checks-update-charmhelpers
Reviewer Review Type Date Requested Status
Liam Young (community) Approve
Billy Olsen Needs Information
OpenStack Charmers Pending
Review via email: mp+249570@code.launchpad.net

This proposal supersedes a proposal from 2015-02-12.

Description of the change

Add NRPE checks to the hacluster charm to check the status of corosync.
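
For context, an NRPE check is ultimately a one-line command definition that the NRPE daemon reads from /etc/nagios/nrpe.d. The following is a minimal sketch of the file content that the charmhelpers Check.write() method in the preview diff appears to produce. The function name render_nrpe_command, the shortname, and the plugin path are illustrative assumptions, not values taken from hooks.py.

```python
def render_nrpe_command(shortname, check_cmd):
    """Return the text of an NRPE command definition file.

    Mirrors the two lines written by Check.write() in the diff's nrpe.py;
    this helper itself is hypothetical.
    """
    command = "check_{}".format(shortname)
    return "# check {}\ncommand[{}]={}\n".format(shortname, command, check_cmd)


# Hypothetical shortname and plugin path, for illustration only.
example = render_nrpe_command(
    "corosync_rings",
    "/usr/local/lib/nagios/plugins/check_corosync_rings",
)
print(example)
```

The real helper additionally resolves the plugin path and, when the Nagios export directory exists, writes a matching service definition.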

Billy Olsen (billy-olsen) wrote :

Brad, thanks for the submission! The hacluster charm is maintained by the openstack-charmers team and development operates off the /next branch. I've re-targeted this merge proposal towards the /next branch.

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #1956 hacluster-next for billy-olsen mp249570
    LINT FAIL: lint-test failed

LINT Results (max last 2 lines):
  hooks/hooks.py:22:1: F401 'relations_of_type' imported but unused
  make: *** [lint] Error 1

Full lint test output: http://paste.ubuntu.com/10193691/
Build: http://10.245.162.77:8080/job/charm_lint_check/1956/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #1746 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/1746/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #1897 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10193709/
Build: http://10.245.162.77:8080/job/charm_amulet_test/1897/

Billy Olsen (billy-olsen) wrote :

Brad, please fix the linting issue that's been identified by uosci-testing-bot.

When expanding a cluster by adding a node, the hacluster charm is somewhat inefficient as it will restart corosync whenever the corosync.conf file changes. If the nrpe check runs during this process I think it's going to trigger a false alarm. Can you confirm this? I'm thinking we may want to be a bit more intelligent about possibly disabling certain nagios checks while changes to the cluster are made. This is more of a question than anything.

review: Needs Fixing
Billy Olsen (billy-olsen) wrote :

As another comment, in the 15.01 release of the OpenStack charms, haproxy is now set up and configured on each service by default (it avoids port jumps, reconfig, the amount of work needed to be performed when clustering, etc). So I think that those charms need the haproxy check as well, but this will need to make sure it is okay to pair with a charm that has already enabled the haproxy nagios plugin. It may do so (I'm not overly familiar with nagios), but thought I'd drop it in and let you comment. Thanks.

49. By Brad Marshall

[bradm] Removed unneeded import of relations_of_type

Brad Marshall (brad-marshall) wrote :

Hi,

> Brad, please fix the linting issue that's been identified by uosci-testing-bot.

Done.

> When expanding a cluster by adding a node, the hacluster charm is somewhat
> inefficient as it will restart corosync whenever the corosync.conf file changes.
> If the nrpe check runs during this process I think it's going to trigger a false
> alarm. Can you confirm this? I'm thinking we may want to be a bit more intelligent
> about possibly disabling certain nagios checks while changes to the cluster are
> made. This is more of a question than anything.

Hmm, it's an interesting question. Enabling and disabling checks in nagios isn't trivial,
so I'm not actually sure how it would work. Also, you need to be down for an extended
period before it'll alert, so I don't know if it's worth the effort. I'd like to see
what we have here landed, and we can work out if there's a more efficient way
down the track?

Thanks,
Brad
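
The point that a check has to fail for an extended period before Nagios alerts can be illustrated with a simplified model of Nagios soft and hard states: a failing service is rechecked, and notifications are only sent once max_check_attempts consecutive checks have failed. The sketch below is a rough model under that assumption, not Nagios code; the function name and the sample values are hypothetical.

```python
def first_hard_failure(results, max_check_attempts):
    """Return the index of the check that would turn a failure HARD.

    results is a sequence of booleans (True means the check passed).
    Simplified model: a failure becomes HARD, and alerts, only after
    max_check_attempts consecutive failed checks. Returns None if that
    never happens.
    """
    consecutive_failures = 0
    for index, passed in enumerate(results):
        if passed:
            consecutive_failures = 0
            continue
        consecutive_failures += 1
        if consecutive_failures >= max_check_attempts:
            return index
    return None


# A brief corosync restart: two failed checks, then recovery.
brief_restart = [True, False, False, True, True]
# A real outage: the check keeps failing.
outage = [True, False, False, False, False]

print(first_hard_failure(brief_restart, max_check_attempts=4))
print(first_hard_failure(outage, max_check_attempts=4))
```

Whether a corosync restart stays below the threshold depends on the check interval and retry settings configured on the Nagios server, which this thread does not state.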

Brad Marshall (brad-marshall) wrote :

> As another comment, in the 15.01 release of the OpenStack charms,
> haproxy is now set up and configured on each service by default
> (it avoids port jumps, reconfig, the amount of work needed to be
> performed when clustering, etc). So I think that those charms need
> the haproxy check as well, but this will need to make sure it is
> okay to pair with a charm that has already enabled the haproxy
> nagios plugin. It may do so (I'm not overly familiar with nagios),
> but thought I'd drop it in and let you comment. Thanks.

My understanding is that it should already be monitoring haproxy
with the latest charms, but it could be made better - it seems
that there's only a process check, which isn't really sufficient.

I'll try to look into what we could do with this, and how it would
interact with the subordinate charms.

Brad.

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #1958 hacluster-next for billy-olsen mp249570
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/1958/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #1748 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/1748/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #1899 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10196422/
Build: http://10.245.162.77:8080/job/charm_amulet_test/1899/

50. By Brad Marshall

[bradm] Removed haproxy nrpe checks

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #2044 hacluster-next for billy-olsen mp249570
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/2044/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #1834 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/1834/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #1982 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10268549/
Build: http://10.245.162.77:8080/job/charm_amulet_test/1982/

51. By Brad Marshall

[bradm] Sync charmhelpers

52. By Brad Marshall

[bradm] Add nagios_servicegroups config option

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #2107 hacluster-next for billy-olsen mp249570
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/2107/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #1897 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/1897/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #2013 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10302877/
Build: http://10.245.162.77:8080/job/charm_amulet_test/2013/

53. By Brad Marshall

[bradm] Handle case of empty nagios_servicegroups setting
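
The preview diff's NRPE.__init__ shows one such fallback: when nagios_servicegroups is missing or empty, nagios_context is used instead. A minimal standalone sketch of that behaviour follows, with a plain dict standing in for the charm config; the function name resolve_servicegroups and the sample values are hypothetical.

```python
def resolve_servicegroups(charm_config):
    """Pick the Nagios servicegroups string for a unit.

    Uses nagios_servicegroups when it is set and non-empty; otherwise
    falls back to nagios_context, as NRPE.__init__ does in the diff.
    """
    servicegroups = charm_config.get('nagios_servicegroups')
    if servicegroups:
        return servicegroups
    return charm_config['nagios_context']


# Sample configs (values are illustrative only).
default_config = {'nagios_context': 'juju', 'nagios_servicegroups': ''}
custom_config = {'nagios_context': 'juju', 'nagios_servicegroups': 'ha,prod'}

print(resolve_servicegroups(default_config))
print(resolve_servicegroups(custom_config))
```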

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #2205 hacluster-next for billy-olsen mp249570
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/2205/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #1994 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/1994/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #2151 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10396224/
Build: http://10.245.162.77:8080/job/charm_amulet_test/2151/

Brad Marshall (brad-marshall) wrote :

This should now be ready for a re-review.

54. By Brad Marshall

[bradm] Add process check for corosync and pacemakerd

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #2490 hacluster-next for billy-olsen mp249570
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/2490/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #2280 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/2280/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #2362 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10545084/
Build: http://10.245.162.77:8080/job/charm_amulet_test/2362/

Billy Olsen (billy-olsen) wrote :

Hey Brad,

I think this is almost there... almost there... I just have a question about the size of your charmhelpers sync and why you included more files in your sync. I couldn't figure it out.

Also, it's somewhat easier to review if the c-h sync comes as a different merge proposal (rumors are this is to be automated), which this can then depend on so we can focus on the changes in here. No biggie, it just makes the diff to review much larger.

As always, thanks for the submission :)

review: Needs Information
Brad Marshall (brad-marshall) wrote :

It looks like I changed it so it would pull in the haproxy checks we have in charmhelpers.contrib.openstack, but later on they were moved to the individual charms. I'll drop it back to charmhelpers.contrib.openstack.utils and see how that goes.

55. By Brad Marshall

[bradm] Dropped back to just requiring charmhelpers.contrib.openstack.utils, don't need the rest now.

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #3666 hacluster-next for billy-olsen mp249570
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/3666/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #3454 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/3454/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #3456 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10854117/
Build: http://10.245.162.77:8080/job/charm_amulet_test/3456/

Liam Young (gnuoy) wrote :

Approve

review: Approve

Preview Diff

1=== modified file 'charm-helpers.yaml'
2--- charm-helpers.yaml 2015-01-26 10:44:46 +0000
3+++ charm-helpers.yaml 2015-04-20 00:40:18 +0000
4@@ -9,3 +9,4 @@
5 - contrib.network.ip
6 - contrib.openstack.utils
7 - contrib.python.packages
8+ - contrib.charmsupport
9
10=== modified file 'config.yaml'
11--- config.yaml 2015-04-14 10:16:50 +0000
12+++ config.yaml 2015-04-20 00:40:18 +0000
13@@ -97,3 +97,19 @@
14 default: False
15 type: boolean
16 description: Enable debug logging
17+ nagios_context:
18+ default: "juju"
19+ type: string
20+ description: |
21+ Used by the nrpe-external-master subordinate charm.
22+ A string that will be prepended to instance name to set the host name
23+ in nagios. So for instance the hostname would be something like:
24+ juju-postgresql-0
25+ If you're running multiple environments with the same services in them
26+ this allows you to differentiate between them.
27+ nagios_servicegroups:
28+ default: ""
29+ type: string
30+ description: |
31+ A comma-separated list of nagios servicegroups.
32+ If left empty, the nagios_context will be used as the servicegroup
33
34=== added directory 'files'
35=== added directory 'files/nrpe'
36=== added file 'files/nrpe/check_corosync_rings'
37--- files/nrpe/check_corosync_rings 1970-01-01 00:00:00 +0000
38+++ files/nrpe/check_corosync_rings 2015-04-20 00:40:18 +0000
39@@ -0,0 +1,99 @@
40+#!/usr/bin/perl
41+#
42+# check_corosync_rings
43+#
44+# Copyright © 2011 Phil Garner, Sysnix Consultants Limited
45+#
46+# This program is free software: you can redistribute it and/or modify
47+# it under the terms of the GNU General Public License as published by
48+# the Free Software Foundation, either version 3 of the License, or
49+# (at your option) any later version.
50+#
51+# This program is distributed in the hope that it will be useful,
52+# but WITHOUT ANY WARRANTY; without even the implied warranty of
53+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
54+# GNU General Public License for more details.
55+#
56+# You should have received a copy of the GNU General Public License
57+# along with this program. If not, see <http://www.gnu.org/licenses/>.
58+#
59+# Authors: Phil Garner - phil@sysnix.com & Peter Mottram peter@sysnix.com
60+#
61+# v0.1 05/01/2011
62+# v0.2 31/10/2011 - additional crit when closing the file handle and additional
63+# comments added
64+#
65+# NOTE:- Requires Perl 5.8 or higher & the Perl Module Nagios::Plugin
66+# Nagios user will need sudo acces - suggest adding line below to
67+# sudoers.
68+# nagios ALL=(ALL) NOPASSWD: /usr/sbin/corosync-cfgtool -s
69+#
70+# In sudoers if requiretty is on (off state is default)
71+# you will also need to add the line below
72+# Defaults:nagios !requiretty
73+#
74+
75+use warnings;
76+use strict;
77+use Nagios::Plugin;
78+
79+# Lines below may need changing if corosync-cfgtool or sudo installed in a
80+# diffrent location.
81+
82+my $sudo = '/usr/bin/sudo';
83+my $cfgtool = '/usr/sbin/corosync-cfgtool -s';
84+
85+# Now set up the plugin
86+my $np = Nagios::Plugin->new(
87+ shortname => 'check_cororings',
88+ version => '0.2',
89+ usage => "Usage: %s <ARGS>\n\t\t--help for help\n",
90+ license => "License - GPL v3 see code for more details",
91+ url => "http://www.sysnix.com",
92+ blurb =>
93+"\tNagios plugin that checks the status of corosync rings, requires Perl \t5.8+ and CPAN modules Nagios::Plugin.",
94+);
95+
96+#Args
97+$np->add_arg(
98+ spec => 'rings|r=s',
99+ help =>
100+'How many rings should be running (optinal) sends Crit if incorrect number of rings found.',
101+ required => 0,
102+);
103+
104+$np->getopts;
105+
106+my $found = 0;
107+my $fh;
108+my $rings = $np->opts->rings;
109+
110+# Run cfgtools spin through output and get info needed
111+
112+open( $fh, "$sudo $cfgtool |" )
113+ or $np->nagios_exit( CRITICAL, "Running corosync-cfgtool failed" );
114+
115+foreach my $line (<$fh>) {
116+ if ( $line =~ m/status\s*=\s*(\S.+)/ ) {
117+ my $status = $1;
118+ if ( $status =~ m/^ring (\d+) active with no faults/ ) {
119+ $np->add_message( OK, "ring $1 OK" );
120+ }
121+ else {
122+ $np->add_message( CRITICAL, $status );
123+ }
124+ $found++;
125+ }
126+}
127+
128+close($fh) or $np->nagios_exit( CRITICAL, "Running corosync-cfgtool failed" );
129+
130+# Check we found some rings and apply -r arg if needed
131+if ( $found == 0 ) {
132+ $np->nagios_exit( CRITICAL, "No Rings Found" );
133+}
134+elsif ( defined $rings && $rings != $found ) {
135+ $np->nagios_exit( CRITICAL, "Expected $rings rings but found $found" );
136+}
137+
138+$np->nagios_exit( $np->check_messages() );
139
140=== added file 'files/nrpe/check_crm'
141--- files/nrpe/check_crm 1970-01-01 00:00:00 +0000
142+++ files/nrpe/check_crm 2015-04-20 00:40:18 +0000
143@@ -0,0 +1,201 @@
144+#!/usr/bin/perl
145+#
146+# check_crm_v0_7
147+#
148+# Copyright © 2013 Philip Garner, Sysnix Consultants Limited
149+#
150+# This program is free software: you can redistribute it and/or modify
151+# it under the terms of the GNU General Public License as published by
152+# the Free Software Foundation, either version 3 of the License, or
153+# (at your option) any later version.
154+#
155+# This program is distributed in the hope that it will be useful,
156+# but WITHOUT ANY WARRANTY; without even the implied warranty of
157+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
158+# GNU General Public License for more details.
159+#
160+# You should have received a copy of the GNU General Public License
161+# along with this program. If not, see <http://www.gnu.org/licenses/>.
162+#
163+# Authors: Phil Garner - phil@sysnix.com & Peter Mottram - peter@sysnix.com
164+#
165+# v0.1 09/01/2011
166+# v0.2 11/01/2011
167+# v0.3 22/08/2011 - bug fix and changes suggested by Vadym Chepkov
168+# v0.4 23/08/2011 - update for spelling and anchor regex capture (Vadym Chepkov)
169+# v0.5 29/09/2011 - Add standby warn/crit suggested by Sönke Martens & removal
170+# of 'our' to 'my' to completely avoid problems with ePN
171+# v0.6 14/03/2013 - Change from \w+ to \S+ in stopped check to cope with
172+# Servers that have non word charachters in. Suggested by
173+# Igal Baevsky.
174+# v0.7 01/09/2013 - In testing as still not fully tested. Adds optional
175+# constraints check (Boris Wesslowski). Adds fail count
176+# threshold ( Zoran Bosnjak & Marko Hrastovec )
177+#
178+# NOTES: Requires Perl 5.8 or higher & the Perl Module Nagios::Plugin
179+# Nagios user will need sudo acces - suggest adding line below to
180+# sudoers
181+# nagios ALL=(ALL) NOPASSWD: /usr/sbin/crm_mon -1 -r -f
182+#
183+# if you want to check for location constraints (-c) also add
184+# nagios ALL=(ALL) NOPASSWD: /usr/sbin/crm configure show
185+#
186+# In sudoers if requiretty is on (off state is default)
187+# you will also need to add the line below
188+# Defaults:nagios !requiretty
189+#
190+
191+use warnings;
192+use strict;
193+use Nagios::Plugin;
194+
195+# Lines below may need changing if crm_mon or sudo installed in a
196+# different location.
197+
198+my $sudo = '/usr/bin/sudo';
199+my $crm_mon = '/usr/sbin/crm_mon -1 -r -f';
200+my $crm_configure_show = '/usr/sbin/crm configure show';
201+
202+my $np = Nagios::Plugin->new(
203+ shortname => 'check_crm',
204+ version => '0.7',
205+ usage => "Usage: %s <ARGS>\n\t\t--help for help\n",
206+);
207+
208+$np->add_arg(
209+ spec => 'warning|w',
210+ help =>
211+'If failed Nodes, stopped Resources detected or Standby Nodes sends Warning instead of Critical (default) as long as there are no other errors and there is Quorum',
212+ required => 0,
213+);
214+
215+$np->add_arg(
216+ spec => 'standbyignore|s',
217+ help => 'Ignore any node(s) in standby, by default sends Critical',
218+ required => 0,
219+);
220+
221+$np->add_arg(
222+ spec => 'constraint|constraints|c',
223+ help => 'Also check configuration for location constraints (caused by migrations) and warn if there are any. Requires additional privileges see notes',
224+ required => 0,
225+);
226+
227+$np->add_arg(
228+ spec => 'failcount|failcounts|f=i',
229+ help => 'resource fail count to start warning on [default = 1].',
230+ required => 0,
231+ default => 1,
232+);
233+
234+$np->getopts;
235+my $ConstraintsFlag = $np->opts->constraint;
236+
237+my @standby;
238+
239+# Check for -w option set warn if this is case instead of crit
240+my $warn_or_crit = 'CRITICAL';
241+$warn_or_crit = 'WARNING' if $np->opts->warning;
242+
243+my $fh;
244+
245+open( $fh, "$sudo $crm_mon |" )
246+ or $np->nagios_exit( CRITICAL, "Running $sudo $crm_mon has failed" );
247+
248+foreach my $line (<$fh>) {
249+
250+ if ( $line =~ m/Connection to cluster failed\:(.*)/i ) {
251+
252+ # Check Cluster connected
253+ $np->nagios_exit( CRITICAL, "Connection to cluster FAILED: $1" );
254+ }
255+ elsif ( $line =~ m/Current DC:/ ) {
256+
257+ # Check for Quorum
258+ if ( $line =~ m/partition with quorum$/ ) {
259+
260+ # Assume cluster is OK - we only add warn/crit after here
261+
262+ $np->add_message( OK, "Cluster OK" );
263+ }
264+ else {
265+ $np->add_message( CRITICAL, "No Quorum" );
266+ }
267+ }
268+ elsif ( $line =~ m/^offline:\s*\[\s*(\S.*?)\s*\]/i ) {
269+
270+ # Count offline nodes
271+ my @offline = split( /\s+/, $1 );
272+ my $numoffline = scalar @offline;
273+ $np->add_message( $warn_or_crit, ": $numoffline Nodes Offline" );
274+ }
275+ elsif ( $line =~ m/^node\s+(\S.*):\s*standby/i ) {
276+
277+ # Check for standby nodes (suggested by Sönke Martens)
278+ # See later in code for message created from this
279+ push @standby, $1;
280+ }
281+
282+ elsif ( $line =~ m/\s*(\S+)\s+\(\S+\)\:\s+Stopped/ ) {
283+
284+ # Check Resources Stopped
285+ $np->add_message( $warn_or_crit, ": $1 Stopped" );
286+ }
287+ elsif ( $line =~ m/\s*stopped\:\s*\[(.*)\]/i ) {
288+
289+ # Check Master/Slave stopped
290+ $np->add_message( $warn_or_crit, ": $1 Stopped" );
291+ }
292+ elsif ( $line =~ m/^Failed actions\:/ ) {
293+
294+ # Check Failed Actions
295+ $np->add_message( CRITICAL,
296+ ": FAILED actions detected or not cleaned up" );
297+ }
298+ elsif ( $line =~ m/\s*(\S+?)\s+ \(.*\)\:\s+\w+\s+\w+\s+\(unmanaged\)\s+/i )
299+ {
300+
301+ # Check Unmanaged
302+ $np->add_message( CRITICAL, ": $1 unmanaged FAILED" );
303+ }
304+ elsif ( $line =~ m/\s*(\S+?)\s+ \(.*\)\:\s+not installed/i ) {
305+
306+ # Check for errors
307+ $np->add_message( CRITICAL, ": $1 not installed" );
308+ }
309+ elsif ( $line =~ m/\s*(\S+?):.*fail-count=(\d+)/i ) {
310+ if ( $2 >= $np->opts->failcount ) {
311+
312+ # Check for resource Fail count (suggested by Vadym Chepkov)
313+ $np->add_message( WARNING, ": $1 failure detected, fail-count=$2" );
314+ }
315+ }
316+}
317+
318+# If found any Nodes in standby & no -s option used send warn/crit
319+if ( scalar @standby > 0 && !$np->opts->standbyignore ) {
320+ $np->add_message( $warn_or_crit,
321+ ": " . join( ', ', @standby ) . " in Standby" );
322+}
323+
324+close($fh) or $np->nagios_exit( CRITICAL, "Running $crm_mon FAILED" );
325+
326+# if -c flag set check configuration for constraints
327+if ($ConstraintsFlag) {
328+
329+ open( $fh, "$sudo $crm_configure_show|" )
330+ or $np->nagios_exit( CRITICAL,
331+ "Running $sudo $crm_configure_show has failed" );
332+
333+ foreach my $line (<$fh>) {
334+ if ( $line =~ m/location cli-(prefer|standby)-\S+\s+(\S+)/ ) {
335+ $np->add_message( WARNING,
336+ ": $2 blocking location constraint detected" );
337+ }
338+ }
339+ close($fh)
340+ or $np->nagios_exit( CRITICAL, "Running $crm_configure_show FAILED" );
341+}
342+
343+$np->nagios_exit( $np->check_messages() );
344+
345
346=== added directory 'files/sudoers'
347=== added file 'files/sudoers/nagios'
348--- files/sudoers/nagios 1970-01-01 00:00:00 +0000
349+++ files/sudoers/nagios 2015-04-20 00:40:18 +0000
350@@ -0,0 +1,5 @@
351+Defaults:nagios !requiretty
352+
353+nagios ALL=(ALL) NOPASSWD: /usr/sbin/corosync-cfgtool -s
354+nagios ALL=(ALL) NOPASSWD: /usr/sbin/crm_mon -1 -r -f
355+nagios ALL=(ALL) NOPASSWD: /usr/sbin/crm configure show
356
357=== added directory 'hooks/charmhelpers/contrib/charmsupport'
358=== added file 'hooks/charmhelpers/contrib/charmsupport/__init__.py'
359--- hooks/charmhelpers/contrib/charmsupport/__init__.py 1970-01-01 00:00:00 +0000
360+++ hooks/charmhelpers/contrib/charmsupport/__init__.py 2015-04-20 00:40:18 +0000
361@@ -0,0 +1,15 @@
362+# Copyright 2014-2015 Canonical Limited.
363+#
364+# This file is part of charm-helpers.
365+#
366+# charm-helpers is free software: you can redistribute it and/or modify
367+# it under the terms of the GNU Lesser General Public License version 3 as
368+# published by the Free Software Foundation.
369+#
370+# charm-helpers is distributed in the hope that it will be useful,
371+# but WITHOUT ANY WARRANTY; without even the implied warranty of
372+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
373+# GNU Lesser General Public License for more details.
374+#
375+# You should have received a copy of the GNU Lesser General Public License
376+# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
377
378=== added file 'hooks/charmhelpers/contrib/charmsupport/nrpe.py'
379--- hooks/charmhelpers/contrib/charmsupport/nrpe.py 1970-01-01 00:00:00 +0000
380+++ hooks/charmhelpers/contrib/charmsupport/nrpe.py 2015-04-20 00:40:18 +0000
381@@ -0,0 +1,360 @@
382+# Copyright 2014-2015 Canonical Limited.
383+#
384+# This file is part of charm-helpers.
385+#
386+# charm-helpers is free software: you can redistribute it and/or modify
387+# it under the terms of the GNU Lesser General Public License version 3 as
388+# published by the Free Software Foundation.
389+#
390+# charm-helpers is distributed in the hope that it will be useful,
391+# but WITHOUT ANY WARRANTY; without even the implied warranty of
392+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
393+# GNU Lesser General Public License for more details.
394+#
395+# You should have received a copy of the GNU Lesser General Public License
396+# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
397+
398+"""Compatibility with the nrpe-external-master charm"""
399+# Copyright 2012 Canonical Ltd.
400+#
401+# Authors:
402+# Matthew Wedgwood <matthew.wedgwood@canonical.com>
403+
404+import subprocess
405+import pwd
406+import grp
407+import os
408+import glob
409+import shutil
410+import re
411+import shlex
412+import yaml
413+
414+from charmhelpers.core.hookenv import (
415+ config,
416+ local_unit,
417+ log,
418+ relation_ids,
419+ relation_set,
420+ relations_of_type,
421+)
422+
423+from charmhelpers.core.host import service
424+
425+# This module adds compatibility with the nrpe-external-master and plain nrpe
426+# subordinate charms. To use it in your charm:
427+#
428+# 1. Update metadata.yaml
429+#
430+# provides:
431+# (...)
432+# nrpe-external-master:
433+# interface: nrpe-external-master
434+# scope: container
435+#
436+# and/or
437+#
438+# provides:
439+# (...)
440+# local-monitors:
441+# interface: local-monitors
442+# scope: container
443+
444+#
445+# 2. Add the following to config.yaml
446+#
447+# nagios_context:
448+# default: "juju"
449+# type: string
450+# description: |
451+# Used by the nrpe subordinate charms.
452+# A string that will be prepended to instance name to set the host name
453+# in nagios. So for instance the hostname would be something like:
454+# juju-myservice-0
455+# If you're running multiple environments with the same services in them
456+# this allows you to differentiate between them.
457+# nagios_servicegroups:
458+# default: ""
459+# type: string
460+# description: |
461+# A comma-separated list of nagios servicegroups.
462+# If left empty, the nagios_context will be used as the servicegroup
463+#
464+# 3. Add custom checks (Nagios plugins) to files/nrpe-external-master
465+#
466+# 4. Update your hooks.py with something like this:
467+#
468+# from charmsupport.nrpe import NRPE
469+# (...)
470+# def update_nrpe_config():
471+# nrpe_compat = NRPE()
472+# nrpe_compat.add_check(
473+# shortname = "myservice",
474+# description = "Check MyService",
475+# check_cmd = "check_http -w 2 -c 10 http://localhost"
476+# )
477+# nrpe_compat.add_check(
478+# "myservice_other",
479+# "Check for widget failures",
480+# check_cmd = "/srv/myapp/scripts/widget_check"
481+# )
482+# nrpe_compat.write()
483+#
484+# def config_changed():
485+# (...)
486+# update_nrpe_config()
487+#
488+# def nrpe_external_master_relation_changed():
489+# update_nrpe_config()
490+#
491+# def local_monitors_relation_changed():
492+# update_nrpe_config()
493+#
494+# 5. ln -s hooks.py nrpe-external-master-relation-changed
495+# ln -s hooks.py local-monitors-relation-changed
496+
497+
498+class CheckException(Exception):
499+ pass
500+
501+
502+class Check(object):
503+ shortname_re = '[A-Za-z0-9-_]+$'
504+ service_template = ("""
505+#---------------------------------------------------
506+# This file is Juju managed
507+#---------------------------------------------------
508+define service {{
509+ use active-service
510+ host_name {nagios_hostname}
511+ service_description {nagios_hostname}[{shortname}] """
512+ """{description}
513+ check_command check_nrpe!{command}
514+ servicegroups {nagios_servicegroup}
515+}}
516+""")
517+
518+ def __init__(self, shortname, description, check_cmd):
519+ super(Check, self).__init__()
520+ # XXX: could be better to calculate this from the service name
521+ if not re.match(self.shortname_re, shortname):
522+ raise CheckException("shortname must match {}".format(
523+ Check.shortname_re))
524+ self.shortname = shortname
525+ self.command = "check_{}".format(shortname)
526+ # Note: a set of invalid characters is defined by the
527+ # Nagios server config
528+ # The default is: illegal_object_name_chars=`~!$%^&*"|'<>?,()=
529+ self.description = description
530+ self.check_cmd = self._locate_cmd(check_cmd)
531+
532+ def _locate_cmd(self, check_cmd):
533+ search_path = (
534+ '/usr/lib/nagios/plugins',
535+ '/usr/local/lib/nagios/plugins',
536+ )
537+ parts = shlex.split(check_cmd)
538+ for path in search_path:
539+ if os.path.exists(os.path.join(path, parts[0])):
540+ command = os.path.join(path, parts[0])
541+ if len(parts) > 1:
542+ command += " " + " ".join(parts[1:])
543+ return command
544+ log('Check command not found: {}'.format(parts[0]))
545+ return ''
546+
547+ def write(self, nagios_context, hostname, nagios_servicegroups):
548+ nrpe_check_file = '/etc/nagios/nrpe.d/{}.cfg'.format(
549+ self.command)
550+ with open(nrpe_check_file, 'w') as nrpe_check_config:
551+ nrpe_check_config.write("# check {}\n".format(self.shortname))
552+ nrpe_check_config.write("command[{}]={}\n".format(
553+ self.command, self.check_cmd))
554+
555+ if not os.path.exists(NRPE.nagios_exportdir):
556+ log('Not writing service config as {} is not accessible'.format(
557+ NRPE.nagios_exportdir))
558+ else:
559+ self.write_service_config(nagios_context, hostname,
560+ nagios_servicegroups)
561+
562+ def write_service_config(self, nagios_context, hostname,
563+ nagios_servicegroups):
564+ for f in os.listdir(NRPE.nagios_exportdir):
565+ if re.search('.*{}.cfg'.format(self.command), f):
566+ os.remove(os.path.join(NRPE.nagios_exportdir, f))
567+
568+ templ_vars = {
569+ 'nagios_hostname': hostname,
570+ 'nagios_servicegroup': nagios_servicegroups,
571+ 'description': self.description,
572+ 'shortname': self.shortname,
573+ 'command': self.command,
574+ }
575+ nrpe_service_text = Check.service_template.format(**templ_vars)
576+ nrpe_service_file = '{}/service__{}_{}.cfg'.format(
577+ NRPE.nagios_exportdir, hostname, self.command)
578+ with open(nrpe_service_file, 'w') as nrpe_service_config:
579+ nrpe_service_config.write(str(nrpe_service_text))
580+
581+ def run(self):
582+ subprocess.call(self.check_cmd)
583+
584+
585+class NRPE(object):
586+ nagios_logdir = '/var/log/nagios'
587+ nagios_exportdir = '/var/lib/nagios/export'
588+ nrpe_confdir = '/etc/nagios/nrpe.d'
589+
590+ def __init__(self, hostname=None):
591+ super(NRPE, self).__init__()
592+ self.config = config()
593+ self.nagios_context = self.config['nagios_context']
594+ if 'nagios_servicegroups' in self.config and self.config['nagios_servicegroups']:
595+ self.nagios_servicegroups = self.config['nagios_servicegroups']
596+ else:
597+ self.nagios_servicegroups = self.nagios_context
598+ self.unit_name = local_unit().replace('/', '-')
599+ if hostname:
600+ self.hostname = hostname
601+ else:
602+ self.hostname = "{}-{}".format(self.nagios_context, self.unit_name)
603+ self.checks = []
604+
605+ def add_check(self, *args, **kwargs):
606+ self.checks.append(Check(*args, **kwargs))
607+
608+ def write(self):
609+ try:
610+ nagios_uid = pwd.getpwnam('nagios').pw_uid
611+ nagios_gid = grp.getgrnam('nagios').gr_gid
612+ except KeyError:
613+ log("Nagios user not set up, nrpe checks not updated")
614+ return
615+
616+ if not os.path.exists(NRPE.nagios_logdir):
617+ os.mkdir(NRPE.nagios_logdir)
618+ os.chown(NRPE.nagios_logdir, nagios_uid, nagios_gid)
619+
620+ nrpe_monitors = {}
621+ monitors = {"monitors": {"remote": {"nrpe": nrpe_monitors}}}
622+ for nrpecheck in self.checks:
623+ nrpecheck.write(self.nagios_context, self.hostname,
624+ self.nagios_servicegroups)
625+ nrpe_monitors[nrpecheck.shortname] = {
626+ "command": nrpecheck.command,
627+ }
628+
629+ service('restart', 'nagios-nrpe-server')
630+
631+ monitor_ids = relation_ids("local-monitors") + \
632+ relation_ids("nrpe-external-master")
633+ for rid in monitor_ids:
634+ relation_set(relation_id=rid, monitors=yaml.dump(monitors))
635+
636+
637+def get_nagios_hostcontext(relation_name='nrpe-external-master'):
638+ """
639+ Query relation with nrpe subordinate, return the nagios_host_context
640+
641+ :param str relation_name: Name of relation nrpe sub joined to
642+ """
643+ for rel in relations_of_type(relation_name):
644+ if 'nagios_host_context' in rel:
645+ return rel['nagios_host_context']
646+
647+
648+def get_nagios_hostname(relation_name='nrpe-external-master'):
649+ """
650+ Query relation with nrpe subordinate, return the nagios_hostname
651+
652+ :param str relation_name: Name of relation nrpe sub joined to
653+ """
654+ for rel in relations_of_type(relation_name):
655+ if 'nagios_hostname' in rel:
656+ return rel['nagios_hostname']
657+
658+
659+def get_nagios_unit_name(relation_name='nrpe-external-master'):
660+ """
661+ Return the nagios unit name prepended with host_context if needed
662+
663+ :param str relation_name: Name of relation nrpe sub joined to
664+ """
665+ host_context = get_nagios_hostcontext(relation_name)
666+ if host_context:
667+ unit = "%s:%s" % (host_context, local_unit())
668+ else:
669+ unit = local_unit()
670+ return unit
671+
672+
673+def add_init_service_checks(nrpe, services, unit_name):
674+ """
675+ Add checks for each service in list
676+
677+ :param NRPE nrpe: NRPE object to add check to
678+ :param list services: List of services to check
679+ :param str unit_name: Unit name to use in check description
680+ """
681+ for svc in services:
682+ upstart_init = '/etc/init/%s.conf' % svc
683+ sysv_init = '/etc/init.d/%s' % svc
684+ if os.path.exists(upstart_init):
685+ nrpe.add_check(
686+ shortname=svc,
687+ description='process check {%s}' % unit_name,
688+ check_cmd='check_upstart_job %s' % svc
689+ )
690+ elif os.path.exists(sysv_init):
691+ cronpath = '/etc/cron.d/nagios-service-check-%s' % svc
692+ cron_file = ('*/5 * * * * root '
693+ '/usr/local/lib/nagios/plugins/check_exit_status.pl '
694+ '-s /etc/init.d/%s status > '
695+ '/var/lib/nagios/service-check-%s.txt\n' % (svc,
696+ svc)
697+ )
698+ f = open(cronpath, 'w')
699+ f.write(cron_file)
700+ f.close()
701+ nrpe.add_check(
702+ shortname=svc,
703+ description='process check {%s}' % unit_name,
704+ check_cmd='check_status_file.py -f '
705+ '/var/lib/nagios/service-check-%s.txt' % svc,
706+ )
707+
708+
709+def copy_nrpe_checks():
710+ """
711+ Copy the nrpe checks into place
712+
713+ """
714+ NAGIOS_PLUGINS = '/usr/local/lib/nagios/plugins'
715+ nrpe_files_dir = os.path.join(os.getenv('CHARM_DIR'), 'hooks',
716+ 'charmhelpers', 'contrib', 'openstack',
717+ 'files')
718+
719+ if not os.path.exists(NAGIOS_PLUGINS):
720+ os.makedirs(NAGIOS_PLUGINS)
721+ for fname in glob.glob(os.path.join(nrpe_files_dir, "check_*")):
722+ if os.path.isfile(fname):
723+ shutil.copy2(fname,
724+ os.path.join(NAGIOS_PLUGINS, os.path.basename(fname)))
725+
726+
727+def add_haproxy_checks(nrpe, unit_name):
728+ """
729+ Add checks for each service in list
730+
731+ :param NRPE nrpe: NRPE object to add check to
732+ :param str unit_name: Unit name to use in check description
733+ """
734+ nrpe.add_check(
735+ shortname='haproxy_servers',
736+ description='Check HAProxy {%s}' % unit_name,
737+ check_cmd='check_haproxy.sh')
738+ nrpe.add_check(
739+ shortname='haproxy_queue',
740+ description='Check HAProxy queue depth {%s}' % unit_name,
741+ check_cmd='check_haproxy_queue_depth.sh')
742
743=== added file 'hooks/charmhelpers/contrib/charmsupport/volumes.py'
744--- hooks/charmhelpers/contrib/charmsupport/volumes.py 1970-01-01 00:00:00 +0000
745+++ hooks/charmhelpers/contrib/charmsupport/volumes.py 2015-04-20 00:40:18 +0000
746@@ -0,0 +1,175 @@
747+# Copyright 2014-2015 Canonical Limited.
748+#
749+# This file is part of charm-helpers.
750+#
751+# charm-helpers is free software: you can redistribute it and/or modify
752+# it under the terms of the GNU Lesser General Public License version 3 as
753+# published by the Free Software Foundation.
754+#
755+# charm-helpers is distributed in the hope that it will be useful,
756+# but WITHOUT ANY WARRANTY; without even the implied warranty of
757+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
758+# GNU Lesser General Public License for more details.
759+#
760+# You should have received a copy of the GNU Lesser General Public License
761+# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
762+
763+'''
764+Functions for managing volumes in juju units. One volume is supported per unit.
765+Subordinates may have their own storage, provided it is on its own partition.
766+
767+Configuration stanzas::
768+
769+ volume-ephemeral:
770+ type: boolean
771+ default: true
772+ description: >
773+ If false, a volume is mounted as specified in "volume-map"
774+ If true, ephemeral storage will be used, meaning that log data
775+ will only exist as long as the machine. YOU HAVE BEEN WARNED.
776+ volume-map:
777+ type: string
778+ default: {}
779+ description: >
780+ YAML map of units to device names, e.g:
781+ "{ rsyslog/0: /dev/vdb, rsyslog/1: /dev/vdb }"
782+ Service units will raise a configure-error if volume-ephemeral
783+ is 'false' and no volume-map value is set. Use 'juju set' to set a
784+ value and 'juju resolved' to complete configuration.
785+
786+Usage::
787+
788+ from charmhelpers.contrib.charmsupport.volumes import configure_volume, VolumeConfigurationError
789+ from charmhelpers.core.hookenv import log, ERROR
790+ def pre_mount_hook():
791+ stop_service('myservice')
792+ def post_mount_hook():
793+ start_service('myservice')
794+
795+ if __name__ == '__main__':
796+ try:
797+ configure_volume(before_change=pre_mount_hook,
798+ after_change=post_mount_hook)
799+ except VolumeConfigurationError:
800+ log('Storage could not be configured', ERROR)
801+
802+'''
803+
804+# XXX: Known limitations
805+# - fstab is neither consulted nor updated
806+
807+import os
808+from charmhelpers.core import hookenv
809+from charmhelpers.core import host
810+import yaml
811+
812+
813+MOUNT_BASE = '/srv/juju/volumes'
814+
815+
816+class VolumeConfigurationError(Exception):
817+ '''Volume configuration data is missing or invalid'''
818+ pass
819+
820+
821+def get_config():
822+ '''Gather and sanity-check volume configuration data'''
823+ volume_config = {}
824+ config = hookenv.config()
825+
826+ errors = False
827+
828+ if config.get('volume-ephemeral') in (True, 'True', 'true', 'Yes', 'yes'):
829+ volume_config['ephemeral'] = True
830+ else:
831+ volume_config['ephemeral'] = False
832+
833+ try:
834+ volume_map = yaml.safe_load(config.get('volume-map', '{}'))
835+ except yaml.YAMLError as e:
836+ hookenv.log("Error parsing YAML volume-map: {}".format(e),
837+ hookenv.ERROR)
838+ errors = True
839+ if volume_map is None:
840+ # probably an empty string
841+ volume_map = {}
842+ elif not isinstance(volume_map, dict):
843+ hookenv.log("Volume-map should be a dictionary, not {}".format(
844+ type(volume_map)))
845+ errors = True
846+
847+ volume_config['device'] = volume_map.get(os.environ['JUJU_UNIT_NAME'])
848+ if volume_config['device'] and volume_config['ephemeral']:
849+ # asked for ephemeral storage but also defined a volume ID
850+ hookenv.log('A volume is defined for this unit, but ephemeral '
851+ 'storage was requested', hookenv.ERROR)
852+ errors = True
853+ elif not volume_config['device'] and not volume_config['ephemeral']:
854+ # asked for permanent storage but did not define volume ID
855+ hookenv.log('Persistent storage was requested, but there is no volume '
856+ 'defined for this unit.', hookenv.ERROR)
857+ errors = True
858+
859+ unit_mount_name = hookenv.local_unit().replace('/', '-')
860+ volume_config['mountpoint'] = os.path.join(MOUNT_BASE, unit_mount_name)
861+
862+ if errors:
863+ return None
864+ return volume_config
865+
866+
867+def mount_volume(config):
868+ if os.path.exists(config['mountpoint']):
869+ if not os.path.isdir(config['mountpoint']):
870+ hookenv.log('Not a directory: {}'.format(config['mountpoint']))
871+ raise VolumeConfigurationError()
872+ else:
873+ host.mkdir(config['mountpoint'])
874+ if os.path.ismount(config['mountpoint']):
875+ unmount_volume(config)
876+ if not host.mount(config['device'], config['mountpoint'], persist=True):
877+ raise VolumeConfigurationError()
878+
879+
880+def unmount_volume(config):
881+ if os.path.ismount(config['mountpoint']):
882+ if not host.umount(config['mountpoint'], persist=True):
883+ raise VolumeConfigurationError()
884+
885+
886+def managed_mounts():
887+ '''List of all mounted managed volumes'''
888+ return filter(lambda mount: mount[0].startswith(MOUNT_BASE), host.mounts())
889+
890+
891+def configure_volume(before_change=lambda: None, after_change=lambda: None):
892+ '''Set up storage (or don't) according to the charm's volume configuration.
893+ Returns the mount point or "ephemeral". before_change and after_change
894+ are optional functions to be called if the volume configuration changes.
895+ '''
896+
897+ config = get_config()
898+ if not config:
899+ hookenv.log('Failed to read volume configuration', hookenv.CRITICAL)
900+ raise VolumeConfigurationError()
901+
902+ if config['ephemeral']:
903+ if os.path.ismount(config['mountpoint']):
904+ before_change()
905+ unmount_volume(config)
906+ after_change()
907+ return 'ephemeral'
908+ else:
909+ # persistent storage
910+ if os.path.ismount(config['mountpoint']):
911+ mounts = dict(managed_mounts())
912+ if mounts.get(config['mountpoint']) != config['device']:
913+ before_change()
914+ unmount_volume(config)
915+ mount_volume(config)
916+ after_change()
917+ else:
918+ before_change()
919+ mount_volume(config)
920+ after_change()
921+ return config['mountpoint']
922
923=== added file 'hooks/charmhelpers/core/strutils.py'
924--- hooks/charmhelpers/core/strutils.py 1970-01-01 00:00:00 +0000
925+++ hooks/charmhelpers/core/strutils.py 2015-04-20 00:40:18 +0000
926@@ -0,0 +1,42 @@
927+#!/usr/bin/env python
928+# -*- coding: utf-8 -*-
929+
930+# Copyright 2014-2015 Canonical Limited.
931+#
932+# This file is part of charm-helpers.
933+#
934+# charm-helpers is free software: you can redistribute it and/or modify
935+# it under the terms of the GNU Lesser General Public License version 3 as
936+# published by the Free Software Foundation.
937+#
938+# charm-helpers is distributed in the hope that it will be useful,
939+# but WITHOUT ANY WARRANTY; without even the implied warranty of
940+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
941+# GNU Lesser General Public License for more details.
942+#
943+# You should have received a copy of the GNU Lesser General Public License
944+# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
945+
946+import six
947+
948+
949+def bool_from_string(value):
950+ """Interpret string value as boolean.
951+
952+ Returns True if value translates to True otherwise False.
953+ """
954+ if isinstance(value, six.string_types):
955+ value = six.text_type(value)
956+ else:
957+ msg = "Unable to interpret non-string value '%s' as boolean" % (value)
958+ raise ValueError(msg)
959+
960+ value = value.strip().lower()
961+
962+ if value in ['y', 'yes', 'true', 't', 'on']:
963+ return True
964+ elif value in ['n', 'no', 'false', 'f', 'off']:
965+ return False
966+
967+ msg = "Unable to interpret string value '%s' as boolean" % (value)
968+ raise ValueError(msg)
969
970=== renamed file 'hooks/charmhelpers/core/strutils.py' => 'hooks/charmhelpers/core/strutils.py.moved'
971=== added file 'hooks/charmhelpers/core/unitdata.py'
972--- hooks/charmhelpers/core/unitdata.py 1970-01-01 00:00:00 +0000
973+++ hooks/charmhelpers/core/unitdata.py 2015-04-20 00:40:18 +0000
974@@ -0,0 +1,477 @@
975+#!/usr/bin/env python
976+# -*- coding: utf-8 -*-
977+#
978+# Copyright 2014-2015 Canonical Limited.
979+#
980+# This file is part of charm-helpers.
981+#
982+# charm-helpers is free software: you can redistribute it and/or modify
983+# it under the terms of the GNU Lesser General Public License version 3 as
984+# published by the Free Software Foundation.
985+#
986+# charm-helpers is distributed in the hope that it will be useful,
987+# but WITHOUT ANY WARRANTY; without even the implied warranty of
988+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
989+# GNU Lesser General Public License for more details.
990+#
991+# You should have received a copy of the GNU Lesser General Public License
992+# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
993+#
994+#
995+# Authors:
996+# Kapil Thangavelu <kapil.foss@gmail.com>
997+#
998+"""
999+Intro
1000+-----
1001+
1002+A simple way to store state in units. This provides a key value
1003+storage with support for versioned, transactional operation,
1004+and can calculate deltas from previous values to simplify unit logic
1005+when processing changes.
1006+
1007+
1008+Hook Integration
1009+----------------
1010+
1011+There are several extant frameworks for hook execution, including
1012+
1013+ - charmhelpers.core.hookenv.Hooks
1014+ - charmhelpers.core.services.ServiceManager
1015+
1016+The storage classes are framework agnostic. One simple integration is
1017+via the HookData contextmanager. It will record the current hook
1018+execution environment (including relation data, config data, etc.),
1019+set up a transaction and allow easy access to the changes from
1020+previously seen values. One consequence of the integration is the
1021+reservation of particular keys ('rels', 'unit', 'env', 'config',
1022+'charm_revisions') for their respective values.
1023+
1024+Here's a fully worked integration example using hookenv.Hooks::
1025+
1026+ from charmhelpers.core import hookenv, unitdata
1027+
1028+ hook_data = unitdata.HookData()
1029+ db = unitdata.kv()
1030+ hooks = hookenv.Hooks()
1031+
1032+ @hooks.hook
1033+ def config_changed():
1034+ # Print all changes to configuration from previously seen
1035+ # values.
1036+ for changed, (prev, cur) in hook_data.conf.items():
1037+ print('config changed', changed,
1038+ 'previous value', prev,
1039+ 'current value', cur)
1040+
1041+ # Get some unit specific bookkeeping
1042+ if not db.get('pkg_key'):
1043+ key = urllib.urlopen('https://example.com/pkg_key').read()
1044+ db.set('pkg_key', key)
1045+
1046+ # Directly access all charm config as a mapping.
1047+ conf = db.getrange('config', True)
1048+
1049+ # Directly access all relation data as a mapping
1050+ rels = db.getrange('rels', True)
1051+
1052+ if __name__ == '__main__':
1053+ with hook_data():
1054+ hooks.execute(sys.argv)
1055+
1056+
1057+A more basic integration is via the hook_scope context manager which simply
1058+manages transaction scope (and records hook name, and timestamp)::
1059+
1060+ >>> from unitdata import kv
1061+ >>> db = kv()
1062+ >>> with db.hook_scope('install'):
1063+ ... # do work, in transactional scope.
1064+ ... db.set('x', 1)
1065+ >>> db.get('x')
1066+ 1
1067+
1068+
1069+Usage
1070+-----
1071+
1072+Values are automatically json de/serialized to preserve basic typing
1073+and complex data struct capabilities (dicts, lists, ints, booleans, etc).
1074+
1075+Individual values can be manipulated via get/set::
1076+
1077+ >>> kv.set('y', True)
1078+ >>> kv.get('y')
1079+ True
1080+
1081+ # We can set complex values (dicts, lists) as a single key.
1082+ >>> kv.set('config', {'a': 1, 'b': True})
1083+
1084+ # Also supports returning dictionaries as a record which
1085+ # provides attribute access.
1086+ >>> config = kv.get('config', record=True)
1087+ >>> config.b
1088+ True
1089+
1090+
1091+Groups of keys can be manipulated with update/getrange::
1092+
1093+ >>> kv.update({'z': 1, 'y': 2}, prefix="gui.")
1094+ >>> kv.getrange('gui.', strip=True)
1095+ {'z': 1, 'y': 2}
1096+
1097+When updating values, it's very helpful to understand which values
1098+have actually changed and how they have changed. The storage
1099+provides a delta method for this::
1100+
1101+ >>> data = {'debug': True, 'option': 2}
1102+ >>> delta = kv.delta(data, 'config.')
1103+ >>> delta.debug.previous
1104+ None
1105+ >>> delta.debug.current
1106+ True
1107+ >>> delta
1108+ {'debug': (None, True), 'option': (None, 2)}
1109+
1110+Note that the delta method does not persist the actual change; it needs
1111+to be explicitly saved via the 'update' method::
1112+
1113+ >>> kv.update(data, 'config.')
1114+
1115+Values modified in the context of a hook scope retain historical values
1116+associated with the hook name::
1117+
1118+ >>> with db.hook_scope('config-changed'):
1119+ ... db.set('x', 42)
1120+ >>> db.gethistory('x')
1121+ [(1, u'x', 1, u'install', u'2015-01-21T16:49:30.038372'),
1122+ (2, u'x', 42, u'config-changed', u'2015-01-21T16:49:30.038786')]
1123+
1124+"""
1125+
1126+import collections
1127+import contextlib
1128+import datetime
1129+import json
1130+import os
1131+import pprint
1132+import sqlite3
1133+import sys
1134+
1135+__author__ = 'Kapil Thangavelu <kapil.foss@gmail.com>'
1136+
1137+
1138+class Storage(object):
1139+ """Simple key value database for local unit state within charms.
1140+
1141+ Modifications are automatically committed at hook exit. That's
1142+ currently regardless of exit code.
1143+
1144+ To support dicts, lists, integers, floats, and booleans, values
1145+ are automatically JSON encoded/decoded.
1146+ """
1147+ def __init__(self, path=None):
1148+ self.db_path = path
1149+ if path is None:
1150+ self.db_path = os.path.join(
1151+ os.environ.get('CHARM_DIR', ''), '.unit-state.db')
1152+ self.conn = sqlite3.connect('%s' % self.db_path)
1153+ self.cursor = self.conn.cursor()
1154+ self.revision = None
1155+ self._closed = False
1156+ self._init()
1157+
1158+ def close(self):
1159+ if self._closed:
1160+ return
1161+ self.flush(False)
1162+ self.cursor.close()
1163+ self.conn.close()
1164+ self._closed = True
1165+
1166+ def _scoped_query(self, stmt, params=None):
1167+ if params is None:
1168+ params = []
1169+ return stmt, params
1170+
1171+ def get(self, key, default=None, record=False):
1172+ self.cursor.execute(
1173+ *self._scoped_query(
1174+ 'select data from kv where key=?', [key]))
1175+ result = self.cursor.fetchone()
1176+ if not result:
1177+ return default
1178+ if record:
1179+ return Record(json.loads(result[0]))
1180+ return json.loads(result[0])
1181+
1182+ def getrange(self, key_prefix, strip=False):
1183+ stmt = "select key, data from kv where key like ?"
1184+ self.cursor.execute(*self._scoped_query(stmt, [key_prefix + '%']))
1185+ result = self.cursor.fetchall()
1186+
1187+ if not result:
1188+ return None
1189+ if not strip:
1190+ key_prefix = ''
1191+ return dict([
1192+ (k[len(key_prefix):], json.loads(v)) for k, v in result])
1193+
1194+ def update(self, mapping, prefix=""):
1195+ for k, v in mapping.items():
1196+ self.set("%s%s" % (prefix, k), v)
1197+
1198+ def unset(self, key):
1199+ self.cursor.execute('delete from kv where key=?', [key])
1200+ if self.revision and self.cursor.rowcount:
1201+ self.cursor.execute(
1202+ 'insert into kv_revisions values (?, ?, ?)',
1203+ [key, self.revision, json.dumps('DELETED')])
1204+
1205+ def set(self, key, value):
1206+ serialized = json.dumps(value)
1207+
1208+ self.cursor.execute(
1209+ 'select data from kv where key=?', [key])
1210+ exists = self.cursor.fetchone()
1211+
1212+ # Skip mutations to the same value
1213+ if exists:
1214+ if exists[0] == serialized:
1215+ return value
1216+
1217+ if not exists:
1218+ self.cursor.execute(
1219+ 'insert into kv (key, data) values (?, ?)',
1220+ (key, serialized))
1221+ else:
1222+ self.cursor.execute('''
1223+ update kv
1224+ set data = ?
1225+ where key = ?''', [serialized, key])
1226+
1227+ # Save
1228+ if not self.revision:
1229+ return value
1230+
1231+ self.cursor.execute(
1232+ 'select 1 from kv_revisions where key=? and revision=?',
1233+ [key, self.revision])
1234+ exists = self.cursor.fetchone()
1235+
1236+ if not exists:
1237+ self.cursor.execute(
1238+ '''insert into kv_revisions (
1239+ revision, key, data) values (?, ?, ?)''',
1240+ (self.revision, key, serialized))
1241+ else:
1242+ self.cursor.execute(
1243+ '''
1244+ update kv_revisions
1245+ set data = ?
1246+ where key = ?
1247+ and revision = ?''',
1248+ [serialized, key, self.revision])
1249+
1250+ return value
1251+
1252+ def delta(self, mapping, prefix):
1253+ """
1254+ return a delta containing values that have changed.
1255+ """
1256+ previous = self.getrange(prefix, strip=True)
1257+ if not previous:
1258+ pk = set()
1259+ else:
1260+ pk = set(previous.keys())
1261+ ck = set(mapping.keys())
1262+ delta = DeltaSet()
1263+
1264+ # added
1265+ for k in ck.difference(pk):
1266+ delta[k] = Delta(None, mapping[k])
1267+
1268+ # removed
1269+ for k in pk.difference(ck):
1270+ delta[k] = Delta(previous[k], None)
1271+
1272+ # changed
1273+ for k in pk.intersection(ck):
1274+ c = mapping[k]
1275+ p = previous[k]
1276+ if c != p:
1277+ delta[k] = Delta(p, c)
1278+
1279+ return delta
1280+
1281+ @contextlib.contextmanager
1282+ def hook_scope(self, name=""):
1283+ """Scope all future interactions to the current hook execution
1284+ revision."""
1285+ assert not self.revision
1286+ self.cursor.execute(
1287+ 'insert into hooks (hook, date) values (?, ?)',
1288+ (name or sys.argv[0],
1289+ datetime.datetime.utcnow().isoformat()))
1290+ self.revision = self.cursor.lastrowid
1291+ try:
1292+ yield self.revision
1293+ self.revision = None
1294+ except:
1295+ self.flush(False)
1296+ self.revision = None
1297+ raise
1298+ else:
1299+ self.flush()
1300+
1301+ def flush(self, save=True):
1302+ if save:
1303+ self.conn.commit()
1304+ elif self._closed:
1305+ return
1306+ else:
1307+ self.conn.rollback()
1308+
1309+ def _init(self):
1310+ self.cursor.execute('''
1311+ create table if not exists kv (
1312+ key text,
1313+ data text,
1314+ primary key (key)
1315+ )''')
1316+ self.cursor.execute('''
1317+ create table if not exists kv_revisions (
1318+ key text,
1319+ revision integer,
1320+ data text,
1321+ primary key (key, revision)
1322+ )''')
1323+ self.cursor.execute('''
1324+ create table if not exists hooks (
1325+ version integer primary key autoincrement,
1326+ hook text,
1327+ date text
1328+ )''')
1329+ self.conn.commit()
1330+
1331+ def gethistory(self, key, deserialize=False):
1332+ self.cursor.execute(
1333+ '''
1334+ select kv.revision, kv.key, kv.data, h.hook, h.date
1335+ from kv_revisions kv,
1336+ hooks h
1337+ where kv.key=?
1338+ and kv.revision = h.version
1339+ ''', [key])
1340+ if deserialize is False:
1341+ return self.cursor.fetchall()
1342+ return map(_parse_history, self.cursor.fetchall())
1343+
1344+ def debug(self, fh=sys.stderr):
1345+ self.cursor.execute('select * from kv')
1346+ pprint.pprint(self.cursor.fetchall(), stream=fh)
1347+ self.cursor.execute('select * from kv_revisions')
1348+ pprint.pprint(self.cursor.fetchall(), stream=fh)
1349+
1350+
1351+def _parse_history(d):
1352+ return (d[0], d[1], json.loads(d[2]), d[3],
1353+ datetime.datetime.strptime(d[-1], "%Y-%m-%dT%H:%M:%S.%f"))
1354+
1355+
1356+class HookData(object):
1357+ """Simple integration for existing hook exec frameworks.
1358+
1359+ Records all unit information, and stores deltas for processing
1360+ by the hook.
1361+
1362+ Sample::
1363+
1364+ from charmhelpers.core import hookenv, unitdata
1365+
1366+ changes = unitdata.HookData()
1367+ db = unitdata.kv()
1368+ hooks = hookenv.Hooks()
1369+
1370+ @hooks.hook
1371+ def config_changed():
1372+ # View all changes to configuration
1373+ for changed, (prev, cur) in changes.conf.items():
1374+ print('config changed', changed,
1375+ 'previous value', prev,
1376+ 'current value', cur)
1377+
1378+ # Get some unit specific bookkeeping
1379+ if not db.get('pkg_key'):
1380+ key = urllib.urlopen('https://example.com/pkg_key').read()
1381+ db.set('pkg_key', key)
1382+
1383+ if __name__ == '__main__':
1384+ with changes():
1385+ hooks.execute(sys.argv)
1386+
1387+ """
1388+ def __init__(self):
1389+ self.kv = kv()
1390+ self.conf = None
1391+ self.rels = None
1392+
1393+ @contextlib.contextmanager
1394+ def __call__(self):
1395+ from charmhelpers.core import hookenv
1396+ hook_name = hookenv.hook_name()
1397+
1398+ with self.kv.hook_scope(hook_name):
1399+ self._record_charm_version(hookenv.charm_dir())
1400+ delta_config, delta_relation = self._record_hook(hookenv)
1401+ yield self.kv, delta_config, delta_relation
1402+
1403+ def _record_charm_version(self, charm_dir):
1404+ # Record revisions. Charm revisions are meaningless
1405+ # to charm authors as they don't control the revision,
1406+ # so logic dependent on revision is not particularly
1407+ # useful; however, it is useful for debugging analysis.
1408+ charm_rev = open(
1409+ os.path.join(charm_dir, 'revision')).read().strip()
1410+ charm_rev = charm_rev or '0'
1411+ revs = self.kv.get('charm_revisions', [])
1412+ if charm_rev not in revs:
1413+ revs.append(charm_rev.strip() or '0')
1414+ self.kv.set('charm_revisions', revs)
1415+
1416+ def _record_hook(self, hookenv):
1417+ data = hookenv.execution_environment()
1418+ self.conf = conf_delta = self.kv.delta(data['conf'], 'config')
1419+ self.rels = rels_delta = self.kv.delta(data['rels'], 'rels')
1420+ self.kv.set('env', dict(data['env']))
1421+ self.kv.set('unit', data['unit'])
1422+ self.kv.set('relid', data.get('relid'))
1423+ return conf_delta, rels_delta
1424+
1425+
1426+class Record(dict):
1427+
1428+ __slots__ = ()
1429+
1430+ def __getattr__(self, k):
1431+ if k in self:
1432+ return self[k]
1433+ raise AttributeError(k)
1434+
1435+
1436+class DeltaSet(Record):
1437+
1438+ __slots__ = ()
1439+
1440+
1441+Delta = collections.namedtuple('Delta', ['previous', 'current'])
1442+
1443+
1444+_KV = None
1445+
1446+
1447+def kv():
1448+ global _KV
1449+ if _KV is None:
1450+ _KV = Storage()
1451+ return _KV
1452
1453=== renamed file 'hooks/charmhelpers/core/unitdata.py' => 'hooks/charmhelpers/core/unitdata.py.moved'
1454=== modified file 'hooks/hooks.py'
1455--- hooks/hooks.py 2015-04-14 10:16:50 +0000
1456+++ hooks/hooks.py 2015-04-20 00:40:18 +0000
1457@@ -11,6 +11,7 @@
1458 import shutil
1459 import sys
1460 import os
1461+import glob
1462 from base64 import b64decode
1463
1464 import maas as MAAS
1465@@ -56,6 +57,8 @@
1466
1467 from charmhelpers.contrib.openstack.utils import get_host_ip
1468
1469+from charmhelpers.contrib.charmsupport import nrpe
1470+
1471 hooks = Hooks()
1472
1473 COROSYNC_CONF = '/etc/corosync/corosync.conf'
1474@@ -68,7 +71,8 @@
1475 COROSYNC_CONF
1476 ]
1477
1478-PACKAGES = ['corosync', 'pacemaker', 'python-netaddr', 'ipmitool']
1479+PACKAGES = ['corosync', 'pacemaker', 'python-netaddr', 'ipmitool',
1480+ 'libnagios-plugin-perl']
1481 SUPPORTED_TRANSPORTS = ['udp', 'udpu', 'multicast', 'unicast']
1482
1483
1484@@ -209,11 +213,15 @@
1485 configure_monitor_host()
1486 configure_stonith()
1487
1488+ update_nrpe_config()
1489+
1490
1491 @hooks.hook()
1492 def upgrade_charm():
1493 install()
1494
1495+ update_nrpe_config()
1496+
1497
1498 def restart_corosync():
1499 if service_running("pacemaker"):
1500@@ -594,6 +602,59 @@
1501 "versions less than Trusty 14.04")
1502
1503
1504+@hooks.hook('nrpe-external-master-relation-joined',
1505+ 'nrpe-external-master-relation-changed')
1506+def update_nrpe_config():
1507+ scripts_src = os.path.join(os.environ["CHARM_DIR"], "files",
1508+ "nrpe")
1509+ scripts_dst = "/usr/local/lib/nagios/plugins"
1510+ if not os.path.exists(scripts_dst):
1511+ os.makedirs(scripts_dst)
1512+ for fname in glob.glob(os.path.join(scripts_src, "*")):
1513+ if os.path.isfile(fname):
1514+ shutil.copy2(fname,
1515+ os.path.join(scripts_dst, os.path.basename(fname)))
1516+
1517+ sudoers_src = os.path.join(os.environ["CHARM_DIR"], "files",
1518+ "sudoers")
1519+ sudoers_dst = "/etc/sudoers.d"
1520+ for fname in glob.glob(os.path.join(sudoers_src, "*")):
1521+ if os.path.isfile(fname):
1522+ shutil.copy2(fname,
1523+ os.path.join(sudoers_dst, os.path.basename(fname)))
1524+
1525+ hostname = nrpe.get_nagios_hostname()
1526+ current_unit = nrpe.get_nagios_unit_name()
1527+
1528+ nrpe_setup = nrpe.NRPE(hostname=hostname)
1529+
1530+ apt_install('python-dbus')
1531+
1532+ # corosync/crm checks
1533+ nrpe_setup.add_check(
1534+ shortname='corosync_rings',
1535+ description='Check Corosync rings {%s}' % current_unit,
1536+ check_cmd='check_corosync_rings')
1537+ nrpe_setup.add_check(
1538+ shortname='crm_status',
1539+ description='Check crm status {%s}' % current_unit,
1540+ check_cmd='check_crm')
1541+
1542+ # process checks
1543+ nrpe_setup.add_check(
1544+ shortname='corosync_proc',
1545+ description='Check Corosync process {%s}' % current_unit,
1546+ check_cmd='check_procs -c 1:1 -C corosync'
1547+ )
1548+ nrpe_setup.add_check(
1549+ shortname='pacemakerd_proc',
1550+ description='Check Pacemakerd process {%s}' % current_unit,
1551+ check_cmd='check_procs -c 1:1 -C pacemakerd'
1552+ )
1553+
1554+ nrpe_setup.write()
1555+
1556+
1557 if __name__ == '__main__':
1558 try:
1559 hooks.execute(sys.argv)
1560
1561=== added symlink 'hooks/nrpe-external-master-relation-changed'
1562=== target is u'hooks.py'
1563=== added symlink 'hooks/nrpe-external-master-relation-joined'
1564=== target is u'hooks.py'
1565=== modified file 'metadata.yaml'
1566--- metadata.yaml 2014-04-11 11:22:46 +0000
1567+++ metadata.yaml 2015-04-20 00:40:18 +0000
1568@@ -14,6 +14,9 @@
1569 ha:
1570 interface: hacluster
1571 scope: container
1572+ nrpe-external-master:
1573+ interface: nrpe-external-master
1574+ scope: container
1575 peers:
1576 hanode:
1577 interface: hacluster
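
For reviewers who want to see what Check.write() ends up putting in /etc/nagios/nrpe.d, the following is a minimal standalone sketch of the same logic, not the charm-helpers code itself. The names render_nrpe_command, locate_cmd and the exists parameter are hypothetical and exist only for illustration, and SHORTNAME_RE is an assumed pattern, since Check.shortname_re is defined outside the part of the diff shown here.

```python
import os
import re
import shlex

# Assumed pattern; the real Check.shortname_re is defined elsewhere in nrpe.py.
SHORTNAME_RE = r'[A-Za-z0-9_-]+$'
SEARCH_PATH = ('/usr/lib/nagios/plugins', '/usr/local/lib/nagios/plugins')


def locate_cmd(check_cmd, search_path=SEARCH_PATH, exists=os.path.exists):
    """Resolve the plugin name to a full path, keeping its arguments.

    Returns '' when the plugin is not found, as Check._locate_cmd does.
    """
    parts = shlex.split(check_cmd)
    for path in search_path:
        candidate = os.path.join(path, parts[0])
        if exists(candidate):
            return " ".join([candidate] + parts[1:])
    return ''


def render_nrpe_command(shortname, check_cmd, exists=os.path.exists):
    """Build the text of the nrpe.d file for a single check (hypothetical helper)."""
    if not re.match(SHORTNAME_RE, shortname):
        raise ValueError("shortname must match {}".format(SHORTNAME_RE))
    command = "check_{}".format(shortname)
    return "# check {}\ncommand[{}]={}\n".format(
        shortname, command, locate_cmd(check_cmd, exists=exists))


if __name__ == '__main__':
    # Pretend check_procs is installed so the sketch runs on any machine.
    fake = lambda p: p == '/usr/lib/nagios/plugins/check_procs'
    print(render_nrpe_command('corosync_proc',
                              'check_procs -c 1:1 -C corosync',
                              exists=fake))
```

The exists hook is there only so the sketch can be exercised without Nagios plugins installed; the code in the diff calls os.path.exists directly.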

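The delta logic in unitdata.Storage.delta() can be tried without a database. The sketch below reproduces the added/removed/changed classification on plain dicts; compute_delta is a hypothetical name used for illustration, and only the Delta namedtuple mirrors the one in the diff.

```python
import collections

# Same shape as the Delta namedtuple defined in unitdata.py.
Delta = collections.namedtuple('Delta', ['previous', 'current'])


def compute_delta(previous, current):
    """Return {key: Delta(previous, current)} for every key that differs.

    previous may be None, matching getrange() returning None when
    nothing has been stored under the prefix yet.
    """
    previous = previous or {}
    delta = {}
    # Keys present only in the new mapping were added.
    for key in set(current) - set(previous):
        delta[key] = Delta(None, current[key])
    # Keys present only in the stored mapping were removed.
    for key in set(previous) - set(current):
        delta[key] = Delta(previous[key], None)
    # Keys present in both are reported only if the value changed.
    for key in set(previous) & set(current):
        if previous[key] != current[key]:
            delta[key] = Delta(previous[key], current[key])
    return delta


if __name__ == '__main__':
    stored = {'debug': False, 'old': 1, 'same': 3}
    incoming = {'debug': True, 'option': 2, 'same': 3}
    for key, change in sorted(compute_delta(stored, incoming).items()):
        print(key, change.previous, change.current)
```

As in the diff, unchanged keys are left out of the result, so a hook can iterate over the delta and react only to real changes.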