Merge lp:~brad-marshall/charms/trusty/hacluster/add-nrpe-checks-update-charmhelpers into lp:~openstack-charmers/charms/trusty/hacluster/next

Proposed by Billy Olsen
Status: Merged
Merged at revision: 46
Proposed branch: lp:~brad-marshall/charms/trusty/hacluster/add-nrpe-checks-update-charmhelpers
Merge into: lp:~openstack-charmers/charms/trusty/hacluster/next
Diff against target: 1577 lines (+1456/-1) (has conflicts)
12 files modified
charm-helpers.yaml (+1/-0)
config.yaml (+16/-0)
files/nrpe/check_corosync_rings (+99/-0)
files/nrpe/check_crm (+201/-0)
files/sudoers/nagios (+5/-0)
hooks/charmhelpers/contrib/charmsupport/__init__.py (+15/-0)
hooks/charmhelpers/contrib/charmsupport/nrpe.py (+360/-0)
hooks/charmhelpers/contrib/charmsupport/volumes.py (+175/-0)
hooks/charmhelpers/core/strutils.py (+42/-0)
hooks/charmhelpers/core/unitdata.py (+477/-0)
hooks/hooks.py (+62/-1)
metadata.yaml (+3/-0)
Conflict adding file hooks/charmhelpers/core/strutils.py.  Moved existing file to hooks/charmhelpers/core/strutils.py.moved.
Conflict adding file hooks/charmhelpers/core/unitdata.py.  Moved existing file to hooks/charmhelpers/core/unitdata.py.moved.
To merge this branch: bzr merge lp:~brad-marshall/charms/trusty/hacluster/add-nrpe-checks-update-charmhelpers
Reviewer                 Review Type   Date Requested   Status
Liam Young (community)                                  Approve
Billy Olsen                                             Needs Information
OpenStack Charmers                                      Pending
Review via email: mp+249570@code.launchpad.net

This proposal supersedes a proposal from 2015-02-12.

Description of the change

Add NRPE checks to the hacluster charm to check the status of corosync.

Billy Olsen (billy-olsen) wrote :

Brad, thanks for the submission! The hacluster charm is maintained by the openstack-charmers team and development operates off the /next branch. I've re-targeted this merge proposal towards the /next branch.

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #1956 hacluster-next for billy-olsen mp249570
    LINT FAIL: lint-test failed

LINT Results (max last 2 lines):
  hooks/hooks.py:22:1: F401 'relations_of_type' imported but unused
  make: *** [lint] Error 1

Full lint test output: http://paste.ubuntu.com/10193691/
Build: http://10.245.162.77:8080/job/charm_lint_check/1956/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #1746 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/1746/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #1897 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10193709/
Build: http://10.245.162.77:8080/job/charm_amulet_test/1897/

Billy Olsen (billy-olsen) wrote :

Brad, please fix the linting issue that's been identified by uosci-testing-bot.

When expanding a cluster by adding a node, the hacluster charm is somewhat inefficient as it will restart corosync whenever the corosync.conf file changes. If the nrpe check runs during this process I think it's going to trigger a false alarm. Can you confirm this? I'm thinking we may want to be a bit more intelligent about possibly disabling certain nagios checks while changes to the cluster are made. This is more of a question than anything.

review: Needs Fixing
Billy Olsen (billy-olsen) wrote :

As another comment, in the 15.01 release of the OpenStack charms, haproxy is now set up and configured on each service by default (it avoids port jumps, reconfig, the amount of work needed to be performed when clustering, etc). So I think that those charms need the haproxy check as well, but this will need to make sure it is okay to pair with a charm that has already enabled the haproxy nagios plugin. It may do so (I'm not overly familiar with nagios), but thought I'd drop it in and let you comment. Thanks.

49. By Brad Marshall

[bradm] Removed unneeded import of relations_of_type

Brad Marshall (brad-marshall) wrote :

Hi,

> Brad, please fix the linting issue that's been identified by uosci-testing-bot.

Done.

> When expanding a cluster by adding a node, the hacluster charm is somewhat
> inefficient as it will restart corosync whenever the corosync.conf file changes.
> If the nrpe check runs during this process I think it's going to trigger a false
> alarm. Can you confirm this? I'm thinking we may want to be a bit more intelligent
> about possibly disabling certain nagios checks while changes to the cluster are
> made. This is more of a question than anything.

Hmm, it's an interesting question. Enabling and disabling checks in nagios isn't trivial,
so I'm not actually sure how it would work. Also, the service needs to be down for an extended
period before it'll alert, so I don't know if it's worth the effort. I'd like to see
what we have here landed, and we can work out if there's a more efficient way
down the track?

Thanks,
Brad
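
Brad's point about needing to be down for an extended period can be illustrated with a small sketch. It assumes the usual Nagios soft/hard state behaviour, where a service only notifies after a configured number of consecutive failed checks. The function name and the threshold of 3 are illustrative assumptions, not values taken from this charm or from any Nagios configuration in this proposal.

```python
def first_alert_index(results, max_check_attempts=3):
    """Return the index of the check that would trigger an alert, or None.

    `results` is a sequence of booleans (True = check passed). An alert is
    raised only once `max_check_attempts` consecutive checks have failed,
    which loosely models Nagios moving from a soft to a hard state.
    The default of 3 is an assumption for illustration only.
    """
    consecutive_failures = 0
    for index, passed in enumerate(results):
        if passed:
            # Any successful check resets the failure streak.
            consecutive_failures = 0
            continue
        consecutive_failures += 1
        if consecutive_failures >= max_check_attempts:
            return index
    return None


# A short corosync restart fails a single check and then recovers.
restart_blip = [True, False, True, True]
# A sustained outage fails every subsequent check.
outage = [True, False, False, False, False]
```

Under these assumptions the brief restart never reaches the threshold, while the sustained outage does, which is why a check firing during a corosync restart may not page anyone.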

Brad Marshall (brad-marshall) wrote :

> As another comment, in the 15.01 release of the OpenStack charms,
> haproxy is now set up and configured on each service by default
> (it avoids port jumps, reconfig, the amount of work needed to be
> performed when clustering, etc). So I think that those charms need
> the haproxy check as well, but this will need to make sure it is
> okay to pair with a charm that has already enabled the haproxy
> nagios plugin. It may do so (I'm not overly familiar with nagios),
> but thought I'd drop it in and let you comment. Thanks.

My understanding is that it should already be monitoring haproxy
with the latest charms, but it could be made better - it seems
that there's only a process check, which isn't really sufficient.

I'll try to look into what we could do with this, and how it would
interact with the subordinate charms.

Brad.

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #1958 hacluster-next for billy-olsen mp249570
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/1958/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #1748 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/1748/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #1899 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10196422/
Build: http://10.245.162.77:8080/job/charm_amulet_test/1899/

50. By Brad Marshall

[bradm] Removed haproxy nrpe checks

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #2044 hacluster-next for billy-olsen mp249570
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/2044/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #1834 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/1834/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #1982 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10268549/
Build: http://10.245.162.77:8080/job/charm_amulet_test/1982/

51. By Brad Marshall

[bradm] Sync charmhelpers

52. By Brad Marshall

[bradm] Add nagios_servicegroups config option

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #2107 hacluster-next for billy-olsen mp249570
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/2107/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #1897 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/1897/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #2013 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10302877/
Build: http://10.245.162.77:8080/job/charm_amulet_test/2013/

53. By Brad Marshall

[bradm] Handle case of empty nagios_servicegroups setting
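
The fallback this revision describes can be sketched in isolation. The helper below is hypothetical (it is not part of the charm or of charm-helpers) and takes a plain dict standing in for the charm config. It mirrors the behaviour described in config.yaml: an unset or empty `nagios_servicegroups` falls back to `nagios_context`.

```python
def resolve_servicegroups(charm_config):
    """Pick the servicegroups string, falling back to nagios_context.

    `charm_config` is a plain dict used here as a stand-in for the charm's
    config; the function name is hypothetical.
    """
    servicegroups = charm_config.get('nagios_servicegroups')
    if servicegroups:
        return servicegroups
    # Missing key or empty string: use the context as the servicegroup.
    return charm_config['nagios_context']


default_config = {'nagios_context': 'juju', 'nagios_servicegroups': ''}
custom_config = {'nagios_context': 'juju', 'nagios_servicegroups': 'ha,openstack'}
```

Treating the empty string and the missing key the same way keeps older deployments, which predate the option, working without any config change.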

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #2205 hacluster-next for billy-olsen mp249570
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/2205/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #1994 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/1994/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #2151 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10396224/
Build: http://10.245.162.77:8080/job/charm_amulet_test/2151/

Brad Marshall (brad-marshall) wrote :

This should now be ready for a re-review.

54. By Brad Marshall

[bradm] Add process check for corosync and pacemakerd
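
As a rough idea of what process checks for these two daemons could look like, the sketch below builds the keyword arguments that the `NRPE.add_check` helper in charmhelpers' nrpe.py accepts (`shortname`, `description`, `check_cmd`). The helper name `process_check`, the unit name, and the exact `check_procs` thresholds are assumptions for illustration; the hooks.py change in this revision is not reproduced here.

```python
def process_check(process_name, unit_name):
    """Build add_check() keyword arguments for a single-process check.

    Hypothetical helper: the shortname/description wording and the
    `-c 1:1` critical range (exactly one process) are assumptions.
    """
    return {
        'shortname': '{}_proc'.format(process_name),
        'description': 'Check {} process {}'.format(process_name, unit_name),
        # check_procs: -C matches the command name, -c sets the critical range.
        'check_cmd': 'check_procs -c 1:1 -C {}'.format(process_name),
    }


checks = [process_check(name, 'juju-hacluster-0')
          for name in ('corosync', 'pacemakerd')]
```

Each dict could then be passed along as `nrpe_setup.add_check(**check)` on an `NRPE()` instance, followed by a single `write()` call.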

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #2490 hacluster-next for billy-olsen mp249570
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/2490/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #2280 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/2280/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #2362 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10545084/
Build: http://10.245.162.77:8080/job/charm_amulet_test/2362/

Billy Olsen (billy-olsen) wrote :

Hey Brad,

I think this is almost there... almost there... I just have a question about the size of your charmhelpers sync and why you included more files in your sync. I couldn't figure it out.

Also, it's somewhat easier to review if the c-h sync comes as a separate merge proposal (rumor is this will be automated), which this one can then depend on, so we can focus on the changes in here. No biggie, it just makes the diff to review much larger.

As always, thanks for the submission :)

review: Needs Information
Brad Marshall (brad-marshall) wrote :

It looks like I changed it so it would pull in the haproxy checks we have in charmhelpers.contrib.openstack, but later on they were moved to the individual charms. I'll drop it back to charmhelpers.contrib.openstack.utils and see how that goes.

55. By Brad Marshall

[bradm] Dropped back to just requiring charmhelpers.contrib.openstack.utils, don't need the rest now.

uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #3666 hacluster-next for billy-olsen mp249570
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/3666/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #3454 hacluster-next for billy-olsen mp249570
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/3454/

uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #3456 hacluster-next for billy-olsen mp249570
    AMULET FAIL: amulet-test missing

AMULET Results (max last 2 lines):
INFO:root:Search string not found in makefile target commands.
ERROR:root:No make target was executed.

Full amulet test output: http://paste.ubuntu.com/10854117/
Build: http://10.245.162.77:8080/job/charm_amulet_test/3456/

Liam Young (gnuoy) wrote :

Approve

review: Approve

Preview Diff

=== modified file 'charm-helpers.yaml'
--- charm-helpers.yaml 2015-01-26 10:44:46 +0000
+++ charm-helpers.yaml 2015-04-20 00:40:18 +0000
@@ -9,3 +9,4 @@
     - contrib.network.ip
     - contrib.openstack.utils
     - contrib.python.packages
+    - contrib.charmsupport
=== modified file 'config.yaml'
--- config.yaml 2015-04-14 10:16:50 +0000
+++ config.yaml 2015-04-20 00:40:18 +0000
@@ -97,3 +97,19 @@
     default: False
     type: boolean
     description: Enable debug logging
+  nagios_context:
+    default: "juju"
+    type: string
+    description: |
+      Used by the nrpe-external-master subordinate charm.
+      A string that will be prepended to instance name to set the host name
+      in nagios. So for instance the hostname would be something like:
+          juju-postgresql-0
+      If you're running multiple environments with the same services in them
+      this allows you to differentiate between them.
+  nagios_servicegroups:
+    default: ""
+    type: string
+    description: |
+      A comma-separated list of nagios servicegroups.
+      If left empty, the nagios_context will be used as the servicegroup
=== added directory 'files'
=== added directory 'files/nrpe'
=== added file 'files/nrpe/check_corosync_rings'
--- files/nrpe/check_corosync_rings 1970-01-01 00:00:00 +0000
+++ files/nrpe/check_corosync_rings 2015-04-20 00:40:18 +0000
@@ -0,0 +1,99 @@
+#!/usr/bin/perl
+#
+# check_corosync_rings
+#
+# Copyright © 2011 Phil Garner, Sysnix Consultants Limited
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+# Authors: Phil Garner - phil@sysnix.com & Peter Mottram peter@sysnix.com
+#
+# v0.1 05/01/2011
+# v0.2 31/10/2011 - additional crit when closing the file handle and additional
+#                   comments added
+#
+# NOTE:- Requires Perl 5.8 or higher & the Perl Module Nagios::Plugin
+#        Nagios user will need sudo acces - suggest adding line below to
+#        sudoers.
+#            nagios ALL=(ALL) NOPASSWD: /usr/sbin/corosync-cfgtool -s
+#
+#        In sudoers if requiretty is on (off state is default)
+#        you will also need to add the line below
+#            Defaults:nagios !requiretty
+#
+
+use warnings;
+use strict;
+use Nagios::Plugin;
+
+# Lines below may need changing if corosync-cfgtool or sudo installed in a
+# diffrent location.
+
+my $sudo = '/usr/bin/sudo';
+my $cfgtool = '/usr/sbin/corosync-cfgtool -s';
+
+# Now set up the plugin
+my $np = Nagios::Plugin->new(
+    shortname => 'check_cororings',
+    version => '0.2',
+    usage => "Usage: %s <ARGS>\n\t\t--help for help\n",
+    license => "License - GPL v3 see code for more details",
+    url => "http://www.sysnix.com",
+    blurb =>
+"\tNagios plugin that checks the status of corosync rings, requires Perl \t5.8+ and CPAN modules Nagios::Plugin.",
+);
+
+#Args
+$np->add_arg(
+    spec => 'rings|r=s',
+    help =>
+'How many rings should be running (optinal) sends Crit if incorrect number of rings found.',
+    required => 0,
+);
+
+$np->getopts;
+
+my $found = 0;
+my $fh;
+my $rings = $np->opts->rings;
+
+# Run cfgtools spin through output and get info needed
+
+open( $fh, "$sudo $cfgtool |" )
+    or $np->nagios_exit( CRITICAL, "Running corosync-cfgtool failed" );
+
+foreach my $line (<$fh>) {
+    if ( $line =~ m/status\s*=\s*(\S.+)/ ) {
+        my $status = $1;
+        if ( $status =~ m/^ring (\d+) active with no faults/ ) {
+            $np->add_message( OK, "ring $1 OK" );
+        }
+        else {
+            $np->add_message( CRITICAL, $status );
+        }
+        $found++;
+    }
+}
+
+close($fh) or $np->nagios_exit( CRITICAL, "Running corosync-cfgtool failed" );
+
+# Check we found some rings and apply -r arg if needed
+if ( $found == 0 ) {
+    $np->nagios_exit( CRITICAL, "No Rings Found" );
+}
+elsif ( defined $rings && $rings != $found ) {
+    $np->nagios_exit( CRITICAL, "Expected $rings rings but found $found" );
+}
+
+$np->nagios_exit( $np->check_messages() );
=== added file 'files/nrpe/check_crm'
--- files/nrpe/check_crm 1970-01-01 00:00:00 +0000
+++ files/nrpe/check_crm 2015-04-20 00:40:18 +0000
@@ -0,0 +1,201 @@
+#!/usr/bin/perl
+#
+# check_crm_v0_7
+#
+# Copyright © 2013 Philip Garner, Sysnix Consultants Limited
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+# Authors: Phil Garner - phil@sysnix.com & Peter Mottram - peter@sysnix.com
+#
+# v0.1 09/01/2011
+# v0.2 11/01/2011
+# v0.3 22/08/2011 - bug fix and changes suggested by Vadym Chepkov
+# v0.4 23/08/2011 - update for spelling and anchor regex capture (Vadym Chepkov)
+# v0.5 29/09/2011 - Add standby warn/crit suggested by Sönke Martens & removal
+#                   of 'our' to 'my' to completely avoid problems with ePN
+# v0.6 14/03/2013 - Change from \w+ to \S+ in stopped check to cope with
+#                   Servers that have non word charachters in. Suggested by
+#                   Igal Baevsky.
+# v0.7 01/09/2013 - In testing as still not fully tested. Adds optional
+#                   constraints check (Boris Wesslowski). Adds fail count
+#                   threshold ( Zoran Bosnjak & Marko Hrastovec )
+#
+# NOTES: Requires Perl 5.8 or higher & the Perl Module Nagios::Plugin
+#        Nagios user will need sudo acces - suggest adding line below to
+#        sudoers
+#            nagios ALL=(ALL) NOPASSWD: /usr/sbin/crm_mon -1 -r -f
+#
+#        if you want to check for location constraints (-c) also add
+#            nagios ALL=(ALL) NOPASSWD: /usr/sbin/crm configure show
+#
+#        In sudoers if requiretty is on (off state is default)
+#        you will also need to add the line below
+#            Defaults:nagios !requiretty
+#
+
+use warnings;
+use strict;
+use Nagios::Plugin;
+
+# Lines below may need changing if crm_mon or sudo installed in a
+# different location.
+
+my $sudo = '/usr/bin/sudo';
+my $crm_mon = '/usr/sbin/crm_mon -1 -r -f';
+my $crm_configure_show = '/usr/sbin/crm configure show';
+
+my $np = Nagios::Plugin->new(
+    shortname => 'check_crm',
+    version => '0.7',
+    usage => "Usage: %s <ARGS>\n\t\t--help for help\n",
+);
+
+$np->add_arg(
+    spec => 'warning|w',
+    help =>
+'If failed Nodes, stopped Resources detected or Standby Nodes sends Warning instead of Critical (default) as long as there are no other errors and there is Quorum',
+    required => 0,
+);
+
+$np->add_arg(
+    spec => 'standbyignore|s',
+    help => 'Ignore any node(s) in standby, by default sends Critical',
+    required => 0,
+);
+
+$np->add_arg(
+    spec => 'constraint|constraints|c',
+    help => 'Also check configuration for location constraints (caused by migrations) and warn if there are any. Requires additional privileges see notes',
+    required => 0,
+);
+
+$np->add_arg(
+    spec => 'failcount|failcounts|f=i',
+    help => 'resource fail count to start warning on [default = 1].',
+    required => 0,
+    default => 1,
+);
+
+$np->getopts;
+my $ConstraintsFlag = $np->opts->constraint;
+
+my @standby;
+
+# Check for -w option set warn if this is case instead of crit
+my $warn_or_crit = 'CRITICAL';
+$warn_or_crit = 'WARNING' if $np->opts->warning;
+
+my $fh;
+
+open( $fh, "$sudo $crm_mon |" )
+    or $np->nagios_exit( CRITICAL, "Running $sudo $crm_mon has failed" );
+
+foreach my $line (<$fh>) {
+
+    if ( $line =~ m/Connection to cluster failed\:(.*)/i ) {
+
+        # Check Cluster connected
+        $np->nagios_exit( CRITICAL, "Connection to cluster FAILED: $1" );
+    }
+    elsif ( $line =~ m/Current DC:/ ) {
+
+        # Check for Quorum
+        if ( $line =~ m/partition with quorum$/ ) {
+
+            # Assume cluster is OK - we only add warn/crit after here
+
+            $np->add_message( OK, "Cluster OK" );
+        }
+        else {
+            $np->add_message( CRITICAL, "No Quorum" );
+        }
+    }
+    elsif ( $line =~ m/^offline:\s*\[\s*(\S.*?)\s*\]/i ) {
+
+        # Count offline nodes
+        my @offline = split( /\s+/, $1 );
+        my $numoffline = scalar @offline;
+        $np->add_message( $warn_or_crit, ": $numoffline Nodes Offline" );
+    }
+    elsif ( $line =~ m/^node\s+(\S.*):\s*standby/i ) {
+
+        # Check for standby nodes (suggested by Sönke Martens)
+        # See later in code for message created from this
+        push @standby, $1;
+    }
+
+    elsif ( $line =~ m/\s*(\S+)\s+\(\S+\)\:\s+Stopped/ ) {
+
+        # Check Resources Stopped
+        $np->add_message( $warn_or_crit, ": $1 Stopped" );
+    }
+    elsif ( $line =~ m/\s*stopped\:\s*\[(.*)\]/i ) {
+
+        # Check Master/Slave stopped
+        $np->add_message( $warn_or_crit, ": $1 Stopped" );
+    }
+    elsif ( $line =~ m/^Failed actions\:/ ) {
+
+        # Check Failed Actions
+        $np->add_message( CRITICAL,
+            ": FAILED actions detected or not cleaned up" );
+    }
+    elsif ( $line =~ m/\s*(\S+?)\s+ \(.*\)\:\s+\w+\s+\w+\s+\(unmanaged\)\s+/i )
+    {
+
+        # Check Unmanaged
+        $np->add_message( CRITICAL, ": $1 unmanaged FAILED" );
+    }
+    elsif ( $line =~ m/\s*(\S+?)\s+ \(.*\)\:\s+not installed/i ) {
+
+        # Check for errors
+        $np->add_message( CRITICAL, ": $1 not installed" );
+    }
+    elsif ( $line =~ m/\s*(\S+?):.*fail-count=(\d+)/i ) {
+        if ( $2 >= $np->opts->failcount ) {
+
+            # Check for resource Fail count (suggested by Vadym Chepkov)
+            $np->add_message( WARNING, ": $1 failure detected, fail-count=$2" );
+        }
+    }
+}
+
+# If found any Nodes in standby & no -s option used send warn/crit
+if ( scalar @standby > 0 && !$np->opts->standbyignore ) {
+    $np->add_message( $warn_or_crit,
+        ": " . join( ', ', @standby ) . " in Standby" );
+}
+
+close($fh) or $np->nagios_exit( CRITICAL, "Running $crm_mon FAILED" );
+
+# if -c flag set check configuration for constraints
+if ($ConstraintsFlag) {
+
+    open( $fh, "$sudo $crm_configure_show|" )
+        or $np->nagios_exit( CRITICAL,
+        "Running $sudo $crm_configure_show has failed" );
+
+    foreach my $line (<$fh>) {
+        if ( $line =~ m/location cli-(prefer|standby)-\S+\s+(\S+)/ ) {
+            $np->add_message( WARNING,
+                ": $2 blocking location constraint detected" );
+        }
+    }
+    close($fh)
+        or $np->nagios_exit( CRITICAL, "Running $crm_configure_show FAILED" );
+}
+
+$np->nagios_exit( $np->check_messages() );
+
=== added directory 'files/sudoers'
=== added file 'files/sudoers/nagios'
--- files/sudoers/nagios 1970-01-01 00:00:00 +0000
+++ files/sudoers/nagios 2015-04-20 00:40:18 +0000
@@ -0,0 +1,5 @@
+Defaults:nagios !requiretty
+
+nagios ALL=(ALL) NOPASSWD: /usr/sbin/corosync-cfgtool -s
+nagios ALL=(ALL) NOPASSWD: /usr/sbin/crm_mon -1 -r -f
+nagios ALL=(ALL) NOPASSWD: /usr/sbin/crm configure show
=== added directory 'hooks/charmhelpers/contrib/charmsupport'
=== added file 'hooks/charmhelpers/contrib/charmsupport/__init__.py'
--- hooks/charmhelpers/contrib/charmsupport/__init__.py 1970-01-01 00:00:00 +0000
+++ hooks/charmhelpers/contrib/charmsupport/__init__.py 2015-04-20 00:40:18 +0000
@@ -0,0 +1,15 @@
+# Copyright 2014-2015 Canonical Limited.
+#
+# This file is part of charm-helpers.
+#
+# charm-helpers is free software: you can redistribute it and/or modify
+# it under the terms of the GNU Lesser General Public License version 3 as
+# published by the Free Software Foundation.
+#
+# charm-helpers is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public License
+# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
=== added file 'hooks/charmhelpers/contrib/charmsupport/nrpe.py'
--- hooks/charmhelpers/contrib/charmsupport/nrpe.py 1970-01-01 00:00:00 +0000
+++ hooks/charmhelpers/contrib/charmsupport/nrpe.py 2015-04-20 00:40:18 +0000
@@ -0,0 +1,360 @@
+# Copyright 2014-2015 Canonical Limited.
+#
+# This file is part of charm-helpers.
+#
+# charm-helpers is free software: you can redistribute it and/or modify
+# it under the terms of the GNU Lesser General Public License version 3 as
+# published by the Free Software Foundation.
+#
+# charm-helpers is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public License
+# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
+
+"""Compatibility with the nrpe-external-master charm"""
+# Copyright 2012 Canonical Ltd.
+#
+# Authors:
+#  Matthew Wedgwood <matthew.wedgwood@canonical.com>
+
+import subprocess
+import pwd
+import grp
+import os
+import glob
+import shutil
+import re
+import shlex
+import yaml
+
+from charmhelpers.core.hookenv import (
+    config,
+    local_unit,
+    log,
+    relation_ids,
+    relation_set,
+    relations_of_type,
+)
+
+from charmhelpers.core.host import service
+
+# This module adds compatibility with the nrpe-external-master and plain nrpe
+# subordinate charms. To use it in your charm:
+#
+# 1. Update metadata.yaml
+#
+#   provides:
+#     (...)
+#     nrpe-external-master:
+#       interface: nrpe-external-master
+#       scope: container
+#
+#   and/or
+#
+#   provides:
+#     (...)
+#     local-monitors:
+#       interface: local-monitors
+#       scope: container
+
+#
+# 2. Add the following to config.yaml
+#
+#    nagios_context:
+#      default: "juju"
+#      type: string
+#      description: |
+#        Used by the nrpe subordinate charms.
+#        A string that will be prepended to instance name to set the host name
+#        in nagios. So for instance the hostname would be something like:
+#            juju-myservice-0
+#        If you're running multiple environments with the same services in them
+#        this allows you to differentiate between them.
+#    nagios_servicegroups:
+#      default: ""
+#      type: string
+#      description: |
+#        A comma-separated list of nagios servicegroups.
+#        If left empty, the nagios_context will be used as the servicegroup
+#
+# 3. Add custom checks (Nagios plugins) to files/nrpe-external-master
+#
+# 4. Update your hooks.py with something like this:
+#
+#    from charmsupport.nrpe import NRPE
+#    (...)
+#    def update_nrpe_config():
+#        nrpe_compat = NRPE()
+#        nrpe_compat.add_check(
+#            shortname = "myservice",
+#            description = "Check MyService",
+#            check_cmd = "check_http -w 2 -c 10 http://localhost"
+#            )
+#        nrpe_compat.add_check(
+#            "myservice_other",
+#            "Check for widget failures",
+#            check_cmd = "/srv/myapp/scripts/widget_check"
+#            )
+#        nrpe_compat.write()
+#
+#    def config_changed():
+#        (...)
+#        update_nrpe_config()
+#
+#    def nrpe_external_master_relation_changed():
+#        update_nrpe_config()
+#
+#    def local_monitors_relation_changed():
+#        update_nrpe_config()
+#
+# 5. ln -s hooks.py nrpe-external-master-relation-changed
+#    ln -s hooks.py local-monitors-relation-changed
+
+
+class CheckException(Exception):
+    pass
+
+
+class Check(object):
+    shortname_re = '[A-Za-z0-9-_]+$'
+    service_template = ("""
+#---------------------------------------------------
+# This file is Juju managed
+#---------------------------------------------------
+define service {{
+    use active-service
+    host_name {nagios_hostname}
+    service_description {nagios_hostname}[{shortname}] """
+                        """{description}
+    check_command check_nrpe!{command}
+    servicegroups {nagios_servicegroup}
+}}
+""")
+
+    def __init__(self, shortname, description, check_cmd):
+        super(Check, self).__init__()
+        # XXX: could be better to calculate this from the service name
+        if not re.match(self.shortname_re, shortname):
+            raise CheckException("shortname must match {}".format(
+                Check.shortname_re))
+        self.shortname = shortname
+        self.command = "check_{}".format(shortname)
+        # Note: a set of invalid characters is defined by the
+        # Nagios server config
+        # The default is: illegal_object_name_chars=`~!$%^&*"|'<>?,()=
+        self.description = description
+        self.check_cmd = self._locate_cmd(check_cmd)
+
+    def _locate_cmd(self, check_cmd):
+        search_path = (
+            '/usr/lib/nagios/plugins',
+            '/usr/local/lib/nagios/plugins',
+        )
+        parts = shlex.split(check_cmd)
+        for path in search_path:
+            if os.path.exists(os.path.join(path, parts[0])):
+                command = os.path.join(path, parts[0])
+                if len(parts) > 1:
+                    command += " " + " ".join(parts[1:])
+                return command
+        log('Check command not found: {}'.format(parts[0]))
+        return ''
+
+    def write(self, nagios_context, hostname, nagios_servicegroups):
+        nrpe_check_file = '/etc/nagios/nrpe.d/{}.cfg'.format(
+            self.command)
+        with open(nrpe_check_file, 'w') as nrpe_check_config:
+            nrpe_check_config.write("# check {}\n".format(self.shortname))
+            nrpe_check_config.write("command[{}]={}\n".format(
+                self.command, self.check_cmd))
+
+        if not os.path.exists(NRPE.nagios_exportdir):
+            log('Not writing service config as {} is not accessible'.format(
+                NRPE.nagios_exportdir))
+        else:
+            self.write_service_config(nagios_context, hostname,
+                                      nagios_servicegroups)
+
+    def write_service_config(self, nagios_context, hostname,
+                             nagios_servicegroups):
+        for f in os.listdir(NRPE.nagios_exportdir):
+            if re.search('.*{}.cfg'.format(self.command), f):
+                os.remove(os.path.join(NRPE.nagios_exportdir, f))
+
+        templ_vars = {
+            'nagios_hostname': hostname,
+            'nagios_servicegroup': nagios_servicegroups,
+            'description': self.description,
+            'shortname': self.shortname,
+            'command': self.command,
+        }
+        nrpe_service_text = Check.service_template.format(**templ_vars)
+        nrpe_service_file = '{}/service__{}_{}.cfg'.format(
+            NRPE.nagios_exportdir, hostname, self.command)
+        with open(nrpe_service_file, 'w') as nrpe_service_config:
+            nrpe_service_config.write(str(nrpe_service_text))
+
+    def run(self):
+        subprocess.call(self.check_cmd)
+
+
+class NRPE(object):
+    nagios_logdir = '/var/log/nagios'
+    nagios_exportdir = '/var/lib/nagios/export'
+    nrpe_confdir = '/etc/nagios/nrpe.d'
+
+    def __init__(self, hostname=None):
+        super(NRPE, self).__init__()
+        self.config = config()
+        self.nagios_context = self.config['nagios_context']
+        if 'nagios_servicegroups' in self.config and self.config['nagios_servicegroups']:
+            self.nagios_servicegroups = self.config['nagios_servicegroups']
+        else:
+            self.nagios_servicegroups = self.nagios_context
+        self.unit_name = local_unit().replace('/', '-')
+        if hostname:
+            self.hostname = hostname
+        else:
+            self.hostname = "{}-{}".format(self.nagios_context, self.unit_name)
+        self.checks = []
+
+    def add_check(self, *args, **kwargs):
+        self.checks.append(Check(*args, **kwargs))
+
+    def write(self):
+        try:
+            nagios_uid = pwd.getpwnam('nagios').pw_uid
+            nagios_gid = grp.getgrnam('nagios').gr_gid
+        except:
+            log("Nagios user not set up, nrpe checks not updated")
+            return
+
+        if not os.path.exists(NRPE.nagios_logdir):
+            os.mkdir(NRPE.nagios_logdir)
+            os.chown(NRPE.nagios_logdir, nagios_uid, nagios_gid)
+
+        nrpe_monitors = {}
+        monitors = {"monitors": {"remote": {"nrpe": nrpe_monitors}}}
+        for nrpecheck in self.checks:
+            nrpecheck.write(self.nagios_context, self.hostname,
+                            self.nagios_servicegroups)
+            nrpe_monitors[nrpecheck.shortname] = {
+                "command": nrpecheck.command,
+            }
+
+        service('restart', 'nagios-nrpe-server')
+
+        monitor_ids = relation_ids("local-monitors") + \
+            relation_ids("nrpe-external-master")
+        for rid in monitor_ids:
+            relation_set(relation_id=rid, monitors=yaml.dump(monitors))
+
+
+def get_nagios_hostcontext(relation_name='nrpe-external-master'):
257 """
258 Query relation with nrpe subordinate, return the nagios_host_context
259
260 :param str relation_name: Name of relation nrpe sub joined to
261 """
262 for rel in relations_of_type(relation_name):
263 if 'nagios_host_context' in rel:
264 return rel['nagios_host_context']
265
266
267def get_nagios_hostname(relation_name='nrpe-external-master'):
268 """
269 Query relation with nrpe subordinate, return the nagios_hostname
270
271 :param str relation_name: Name of relation nrpe sub joined to
272 """
273 for rel in relations_of_type(relation_name):
274 if 'nagios_hostname' in rel:
275 return rel['nagios_hostname']
276
277
278def get_nagios_unit_name(relation_name='nrpe-external-master'):
279 """
280 Return the nagios unit name prepended with host_context if needed
281
282 :param str relation_name: Name of relation nrpe sub joined to
283 """
284 host_context = get_nagios_hostcontext(relation_name)
285 if host_context:
286 unit = "%s:%s" % (host_context, local_unit())
287 else:
288 unit = local_unit()
289 return unit
290
291
292def add_init_service_checks(nrpe, services, unit_name):
293 """
294 Add checks for each service in list
295
296 :param NRPE nrpe: NRPE object to add check to
297 :param list services: List of services to check
298 :param str unit_name: Unit name to use in check description
299 """
300 for svc in services:
301 upstart_init = '/etc/init/%s.conf' % svc
302 sysv_init = '/etc/init.d/%s' % svc
303 if os.path.exists(upstart_init):
304 nrpe.add_check(
305 shortname=svc,
306 description='process check {%s}' % unit_name,
307 check_cmd='check_upstart_job %s' % svc
308 )
309 elif os.path.exists(sysv_init):
310 cronpath = '/etc/cron.d/nagios-service-check-%s' % svc
311 cron_file = ('*/5 * * * * root '
312 '/usr/local/lib/nagios/plugins/check_exit_status.pl '
313 '-s /etc/init.d/%s status > '
314 '/var/lib/nagios/service-check-%s.txt\n' % (svc,
315 svc)
316 )
317 f = open(cronpath, 'w')
318 f.write(cron_file)
319 f.close()
320 nrpe.add_check(
321 shortname=svc,
322 description='process check {%s}' % unit_name,
323 check_cmd='check_status_file.py -f '
324 '/var/lib/nagios/service-check-%s.txt' % svc,
325 )
326
327
328def copy_nrpe_checks():
329 """
330 Copy the nrpe checks into place
331
332 """
333 NAGIOS_PLUGINS = '/usr/local/lib/nagios/plugins'
334 nrpe_files_dir = os.path.join(os.getenv('CHARM_DIR'), 'hooks',
335 'charmhelpers', 'contrib', 'openstack',
336 'files')
337
338 if not os.path.exists(NAGIOS_PLUGINS):
339 os.makedirs(NAGIOS_PLUGINS)
340 for fname in glob.glob(os.path.join(nrpe_files_dir, "check_*")):
341 if os.path.isfile(fname):
342 shutil.copy2(fname,
343 os.path.join(NAGIOS_PLUGINS, os.path.basename(fname)))
344
345
346def add_haproxy_checks(nrpe, unit_name):
347 """
348 Add HAProxy server and queue depth checks
349
350 :param NRPE nrpe: NRPE object to add check to
351 :param str unit_name: Unit name to use in check description
352 """
353 nrpe.add_check(
354 shortname='haproxy_servers',
355 description='Check HAProxy {%s}' % unit_name,
356 check_cmd='check_haproxy.sh')
357 nrpe.add_check(
358 shortname='haproxy_queue',
359 description='Check HAProxy queue depth {%s}' % unit_name,
360 check_cmd='check_haproxy_queue_depth.sh')
0361
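Reviewer note: the plugin lookup in Check._locate_cmd and the command line emitted by Check.write can be exercised in isolation. The following is a minimal sketch, not the charm-helpers code itself. The names locate_cmd and render_nrpe_command are hypothetical, and the search path is passed in as a parameter (an assumption made here) so the logic can be tried without Nagios installed.

```python
import os
import shlex


def locate_cmd(check_cmd, search_path):
    """Resolve the plugin named in check_cmd against search_path.

    Returns the full command line, or '' when the plugin is not found,
    which is the same fallback Check._locate_cmd uses.
    """
    parts = shlex.split(check_cmd)
    for path in search_path:
        candidate = os.path.join(path, parts[0])
        if os.path.exists(candidate):
            # Re-attach the arguments after the resolved plugin path.
            return " ".join([candidate] + parts[1:])
    return ''


def render_nrpe_command(shortname, resolved_cmd):
    """Build the text written to nrpe.d/check_<shortname>.cfg."""
    command = "check_{}".format(shortname)
    return "# check {}\ncommand[{}]={}\n".format(
        shortname, command, resolved_cmd)
```

As in the original, arguments are split with shlex and re-joined with plain spaces, so an argument that contained a quoted space loses its quoting. An empty return value still produces a "command[...]=" line with nothing after the equals sign, which may be worth guarding against.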
=== added file 'hooks/charmhelpers/contrib/charmsupport/volumes.py'
--- hooks/charmhelpers/contrib/charmsupport/volumes.py 1970-01-01 00:00:00 +0000
+++ hooks/charmhelpers/contrib/charmsupport/volumes.py 2015-04-20 00:40:18 +0000
@@ -0,0 +1,175 @@
1# Copyright 2014-2015 Canonical Limited.
2#
3# This file is part of charm-helpers.
4#
5# charm-helpers is free software: you can redistribute it and/or modify
6# it under the terms of the GNU Lesser General Public License version 3 as
7# published by the Free Software Foundation.
8#
9# charm-helpers is distributed in the hope that it will be useful,
10# but WITHOUT ANY WARRANTY; without even the implied warranty of
11# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
12# GNU Lesser General Public License for more details.
13#
14# You should have received a copy of the GNU Lesser General Public License
15# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
16
17'''
18Functions for managing volumes in juju units. One volume is supported per unit.
19Subordinates may have their own storage, provided it is on its own partition.
20
21Configuration stanzas::
22
23 volume-ephemeral:
24 type: boolean
25 default: true
26 description: >
27 If false, a volume is mounted as specified in "volume-map"
28 If true, ephemeral storage will be used, meaning that log data
29 will only exist as long as the machine. YOU HAVE BEEN WARNED.
30 volume-map:
31 type: string
32 default: {}
33 description: >
34 YAML map of units to device names, e.g:
35 "{ rsyslog/0: /dev/vdb, rsyslog/1: /dev/vdb }"
36 Service units will raise a configure-error if volume-ephemeral
37 is 'false' and no volume-map value is set. Use 'juju set' to set a
38 value and 'juju resolved' to complete configuration.
39
40Usage::
41
42 from charmsupport.volumes import configure_volume, VolumeConfigurationError
43 from charmsupport.hookenv import log, ERROR
44 def pre_mount_hook():
45 stop_service('myservice')
46 def post_mount_hook():
47 start_service('myservice')
48
49 if __name__ == '__main__':
50 try:
51 configure_volume(before_change=pre_mount_hook,
52 after_change=post_mount_hook)
53 except VolumeConfigurationError:
54 log('Storage could not be configured', ERROR)
55
56'''
57
58# XXX: Known limitations
59# - fstab is neither consulted nor updated
60
61import os
62from charmhelpers.core import hookenv
63from charmhelpers.core import host
64import yaml
65
66
67MOUNT_BASE = '/srv/juju/volumes'
68
69
70class VolumeConfigurationError(Exception):
71 '''Volume configuration data is missing or invalid'''
72 pass
73
74
75def get_config():
76 '''Gather and sanity-check volume configuration data'''
77 volume_config = {}
78 config = hookenv.config()
79
80 errors = False
81
82 if config.get('volume-ephemeral') in (True, 'True', 'true', 'Yes', 'yes'):
83 volume_config['ephemeral'] = True
84 else:
85 volume_config['ephemeral'] = False
86
87 try:
88 volume_map = yaml.safe_load(config.get('volume-map', '{}'))
89 except yaml.YAMLError as e:
90 hookenv.log("Error parsing YAML volume-map: {}".format(e),
91 hookenv.ERROR)
92 errors = True
93 if volume_map is None:
94 # probably an empty string
95 volume_map = {}
96 elif not isinstance(volume_map, dict):
97 hookenv.log("Volume-map should be a dictionary, not {}".format(
98 type(volume_map)))
99 errors = True
100
101 volume_config['device'] = volume_map.get(os.environ['JUJU_UNIT_NAME'])
102 if volume_config['device'] and volume_config['ephemeral']:
103 # asked for ephemeral storage but also defined a volume ID
104 hookenv.log('A volume is defined for this unit, but ephemeral '
105 'storage was requested', hookenv.ERROR)
106 errors = True
107 elif not volume_config['device'] and not volume_config['ephemeral']:
108 # asked for permanent storage but did not define volume ID
109 hookenv.log('Persistent storage was requested, but there is no volume '
110 'defined for this unit.', hookenv.ERROR)
111 errors = True
112
113 unit_mount_name = hookenv.local_unit().replace('/', '-')
114 volume_config['mountpoint'] = os.path.join(MOUNT_BASE, unit_mount_name)
115
116 if errors:
117 return None
118 return volume_config
119
120
121def mount_volume(config):
122 if os.path.exists(config['mountpoint']):
123 if not os.path.isdir(config['mountpoint']):
124 hookenv.log('Not a directory: {}'.format(config['mountpoint']))
125 raise VolumeConfigurationError()
126 else:
127 host.mkdir(config['mountpoint'])
128 if os.path.ismount(config['mountpoint']):
129 unmount_volume(config)
130 if not host.mount(config['device'], config['mountpoint'], persist=True):
131 raise VolumeConfigurationError()
132
133
134def unmount_volume(config):
135 if os.path.ismount(config['mountpoint']):
136 if not host.umount(config['mountpoint'], persist=True):
137 raise VolumeConfigurationError()
138
139
140def managed_mounts():
141 '''List of all mounted managed volumes'''
142 return filter(lambda mount: mount[0].startswith(MOUNT_BASE), host.mounts())
143
144
145def configure_volume(before_change=lambda: None, after_change=lambda: None):
146 '''Set up storage (or don't) according to the charm's volume configuration.
147 Returns the mount point or "ephemeral". before_change and after_change
148 are optional functions to be called if the volume configuration changes.
149 '''
150
151 config = get_config()
152 if not config:
153 hookenv.log('Failed to read volume configuration', hookenv.CRITICAL)
154 raise VolumeConfigurationError()
155
156 if config['ephemeral']:
157 if os.path.ismount(config['mountpoint']):
158 before_change()
159 unmount_volume(config)
160 after_change()
161 return 'ephemeral'
162 else:
163 # persistent storage
164 if os.path.ismount(config['mountpoint']):
165 mounts = dict(managed_mounts())
166 if mounts.get(config['mountpoint']) != config['device']:
167 before_change()
168 unmount_volume(config)
169 mount_volume(config)
170 after_change()
171 else:
172 before_change()
173 mount_volume(config)
174 after_change()
175 return config['mountpoint']
0176
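Reviewer note: the consistency rules that get_config applies to 'volume-ephemeral' and 'volume-map' can be written as a pure function, which makes the four combinations easy to test. This is a sketch under assumptions: check_volume_settings is a hypothetical name, it takes an already-parsed volume map rather than a YAML string, and it collects error messages in a list instead of logging through hookenv.

```python
def check_volume_settings(ephemeral, volume_map, unit_name):
    """Return (config, errors) for one unit's volume settings.

    ephemeral:  bool, the 'volume-ephemeral' option
    volume_map: already-parsed 'volume-map' value (dict or None)
    unit_name:  for example 'rsyslog/0'
    """
    errors = []
    if volume_map is None:
        # An empty option string parses to None; treat it as "no mapping".
        volume_map = {}
    if not isinstance(volume_map, dict):
        errors.append('volume-map should be a dictionary, not {}'.format(
            type(volume_map).__name__))
        volume_map = {}

    device = volume_map.get(unit_name)
    if device and ephemeral:
        errors.append('a volume is defined for this unit, but ephemeral '
                      'storage was requested')
    elif not device and not ephemeral:
        errors.append('persistent storage was requested, but no volume is '
                      'defined for this unit')

    config = {'ephemeral': ephemeral, 'device': device}
    return (config if not errors else None), errors
```

Falling back to an empty dict after a type error keeps the later lookup from raising. In get_config, volume_map is left unbound when yaml.safe_load fails and is still a non-dict when the type check fails, so the subsequent volume_map.get call may raise instead of returning None.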
=== added file 'hooks/charmhelpers/core/strutils.py'
--- hooks/charmhelpers/core/strutils.py 1970-01-01 00:00:00 +0000
+++ hooks/charmhelpers/core/strutils.py 2015-04-20 00:40:18 +0000
@@ -0,0 +1,42 @@
1#!/usr/bin/env python
2# -*- coding: utf-8 -*-
3
4# Copyright 2014-2015 Canonical Limited.
5#
6# This file is part of charm-helpers.
7#
8# charm-helpers is free software: you can redistribute it and/or modify
9# it under the terms of the GNU Lesser General Public License version 3 as
10# published by the Free Software Foundation.
11#
12# charm-helpers is distributed in the hope that it will be useful,
13# but WITHOUT ANY WARRANTY; without even the implied warranty of
14# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15# GNU Lesser General Public License for more details.
16#
17# You should have received a copy of the GNU Lesser General Public License
18# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
19
20import six
21
22
23def bool_from_string(value):
24 """Interpret string value as boolean.
25
26 Returns True if value translates to True otherwise False.
27 """
28 if isinstance(value, six.string_types):
29 value = six.text_type(value)
30 else:
31 msg = "Unable to interpret non-string value '%s' as boolean" % (value)
32 raise ValueError(msg)
33
34 value = value.strip().lower()
35
36 if value in ['y', 'yes', 'true', 't', 'on']:
37 return True
38 elif value in ['n', 'no', 'false', 'f', 'off']:
39 return False
40
41 msg = "Unable to interpret string value '%s' as boolean" % (value)
42 raise ValueError(msg)
043
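Reviewer note: for readers without six available, the behaviour of bool_from_string can be reproduced on Python 3 alone. This is a sketch, not a proposed replacement: the constants TRUE_STRINGS and FALSE_STRINGS are names introduced here, and the check against str assumes Python 3 only.

```python
TRUE_STRINGS = ('y', 'yes', 'true', 't', 'on')
FALSE_STRINGS = ('n', 'no', 'false', 'f', 'off')


def bool_from_string(value):
    """Interpret a string as a boolean, raising ValueError otherwise."""
    if not isinstance(value, str):
        # Wrap value in a tuple so tuple inputs format safely.
        raise ValueError(
            "Unable to interpret non-string value %r as boolean" % (value,))
    normalized = value.strip().lower()
    if normalized in TRUE_STRINGS:
        return True
    if normalized in FALSE_STRINGS:
        return False
    raise ValueError(
        "Unable to interpret string value %r as boolean" % (value,))
```

Surrounding whitespace and case are ignored, so ' Yes ' and 'OFF' are accepted, while anything outside the two lists raises ValueError rather than silently mapping to False.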
=== renamed file 'hooks/charmhelpers/core/strutils.py' => 'hooks/charmhelpers/core/strutils.py.moved'
=== added file 'hooks/charmhelpers/core/unitdata.py'
--- hooks/charmhelpers/core/unitdata.py 1970-01-01 00:00:00 +0000
+++ hooks/charmhelpers/core/unitdata.py 2015-04-20 00:40:18 +0000
@@ -0,0 +1,477 @@
1#!/usr/bin/env python
2# -*- coding: utf-8 -*-
3#
4# Copyright 2014-2015 Canonical Limited.
5#
6# This file is part of charm-helpers.
7#
8# charm-helpers is free software: you can redistribute it and/or modify
9# it under the terms of the GNU Lesser General Public License version 3 as
10# published by the Free Software Foundation.
11#
12# charm-helpers is distributed in the hope that it will be useful,
13# but WITHOUT ANY WARRANTY; without even the implied warranty of
14# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15# GNU Lesser General Public License for more details.
16#
17# You should have received a copy of the GNU Lesser General Public License
18# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
19#
20#
21# Authors:
22# Kapil Thangavelu <kapil.foss@gmail.com>
23#
24"""
25Intro
26-----
27
28A simple way to store state in units. This provides a key value
29storage with support for versioned, transactional operation,
30and can calculate deltas from previous values to simplify unit logic
31when processing changes.
32
33
34Hook Integration
35----------------
36
37There are several extant frameworks for hook execution, including
38
39 - charmhelpers.core.hookenv.Hooks
40 - charmhelpers.core.services.ServiceManager
41
42The storage classes are framework agnostic, one simple integration is
43via the HookData contextmanager. It will record the current hook
44execution environment (including relation data, config data, etc.),
45setup a transaction and allow easy access to the changes from
46previously seen values. One consequence of the integration is the
47reservation of particular keys ('rels', 'unit', 'env', 'config',
48'charm_revisions') for their respective values.
49
50Here's a fully worked integration example using hookenv.Hooks::
51
52 from charmhelpers.core import hookenv, unitdata
53
54 hook_data = unitdata.HookData()
55 db = unitdata.kv()
56 hooks = hookenv.Hooks()
57
58 @hooks.hook
59 def config_changed():
60 # Print all changes to configuration from previously seen
61 # values.
62 for changed, (prev, cur) in hook_data.conf.items():
63 print('config changed', changed,
64 'previous value', prev,
65 'current value', cur)
66
67 # Get some unit specific bookkeeping
68 if not db.get('pkg_key'):
69 key = urllib.urlopen('https://example.com/pkg_key').read()
70 db.set('pkg_key', key)
71
72 # Directly access all charm config as a mapping.
73 conf = db.getrange('config', True)
74
75 # Directly access all relation data as a mapping
76 rels = db.getrange('rels', True)
77
78 if __name__ == '__main__':
79 with hook_data():
80 hooks.execute()
81
82
83A more basic integration is via the hook_scope context manager which simply
84manages transaction scope (and records hook name, and timestamp)::
85
86 >>> from unitdata import kv
87 >>> db = kv()
88 >>> with db.hook_scope('install'):
89 ... # do work, in transactional scope.
90 ... db.set('x', 1)
91 >>> db.get('x')
92 1
93
94
95Usage
96-----
97
98Values are automatically json de/serialized to preserve basic typing
99and complex data struct capabilities (dicts, lists, ints, booleans, etc).
100
101Individual values can be manipulated via get/set::
102
103 >>> kv.set('y', True)
104 >>> kv.get('y')
105 True
106
107 # We can set complex values (dicts, lists) as a single key.
108 >>> kv.set('config', {'a': 1, 'b': True})
109
110 # Also supports returning dictionaries as a record which
111 # provides attribute access.
112 >>> config = kv.get('config', record=True)
113 >>> config.b
114 True
115
116
117Groups of keys can be manipulated with update/getrange::
118
119 >>> kv.update({'z': 1, 'y': 2}, prefix="gui.")
120 >>> kv.getrange('gui.', strip=True)
121 {'z': 1, 'y': 2}
122
123When updating values, it's very helpful to understand which values
124have actually changed and how they have changed. The storage
125provides a delta method to provide for this::
126
127 >>> data = {'debug': True, 'option': 2}
128 >>> delta = kv.delta(data, 'config.')
129 >>> delta.debug.previous
130 None
131 >>> delta.debug.current
132 True
133 >>> delta
134 {'debug': (None, True), 'option': (None, 2)}
135
136Note the delta method does not persist the actual change, it needs to
137be explicitly saved via 'update' method::
138
139 >>> kv.update(data, 'config.')
140
141Values modified in the context of a hook scope retain historical values
142associated with the hook name::
143
144 >>> with db.hook_scope('config-changed'):
145 ... db.set('x', 42)
146 >>> db.gethistory('x')
147 [(1, u'x', 1, u'install', u'2015-01-21T16:49:30.038372'),
148 (2, u'x', 42, u'config-changed', u'2015-01-21T16:49:30.038786')]
149
150"""
151
152import collections
153import contextlib
154import datetime
155import json
156import os
157import pprint
158import sqlite3
159import sys
160
161__author__ = 'Kapil Thangavelu <kapil.foss@gmail.com>'
162
163
164class Storage(object):
165 """Simple key value database for local unit state within charms.
166
167 Modifications are automatically committed at hook exit. That's
168 currently regardless of exit code.
169
170 To support dicts, lists, integer, floats, and booleans values
171 are automatically json encoded/decoded.
172 """
173 def __init__(self, path=None):
174 self.db_path = path
175 if path is None:
176 self.db_path = os.path.join(
177 os.environ.get('CHARM_DIR', ''), '.unit-state.db')
178 self.conn = sqlite3.connect('%s' % self.db_path)
179 self.cursor = self.conn.cursor()
180 self.revision = None
181 self._closed = False
182 self._init()
183
184 def close(self):
185 if self._closed:
186 return
187 self.flush(False)
188 self.cursor.close()
189 self.conn.close()
190 self._closed = True
191
192 def _scoped_query(self, stmt, params=None):
193 if params is None:
194 params = []
195 return stmt, params
196
197 def get(self, key, default=None, record=False):
198 self.cursor.execute(
199 *self._scoped_query(
200 'select data from kv where key=?', [key]))
201 result = self.cursor.fetchone()
202 if not result:
203 return default
204 if record:
205 return Record(json.loads(result[0]))
206 return json.loads(result[0])
207
208 def getrange(self, key_prefix, strip=False):
209 stmt = "select key, data from kv where key like '%s%%'" % key_prefix
210 self.cursor.execute(*self._scoped_query(stmt))
211 result = self.cursor.fetchall()
212
213 if not result:
214 return None
215 if not strip:
216 key_prefix = ''
217 return dict([
218 (k[len(key_prefix):], json.loads(v)) for k, v in result])
219
220 def update(self, mapping, prefix=""):
221 for k, v in mapping.items():
222 self.set("%s%s" % (prefix, k), v)
223
224 def unset(self, key):
225 self.cursor.execute('delete from kv where key=?', [key])
226 if self.revision and self.cursor.rowcount:
227 self.cursor.execute(
228 'insert into kv_revisions values (?, ?, ?)',
229 [key, self.revision, json.dumps('DELETED')])
230
231 def set(self, key, value):
232 serialized = json.dumps(value)
233
234 self.cursor.execute(
235 'select data from kv where key=?', [key])
236 exists = self.cursor.fetchone()
237
238 # Skip mutations to the same value
239 if exists:
240 if exists[0] == serialized:
241 return value
242
243 if not exists:
244 self.cursor.execute(
245 'insert into kv (key, data) values (?, ?)',
246 (key, serialized))
247 else:
248 self.cursor.execute('''
249 update kv
250 set data = ?
251 where key = ?''', [serialized, key])
252
253 # Save
254 if not self.revision:
255 return value
256
257 self.cursor.execute(
258 'select 1 from kv_revisions where key=? and revision=?',
259 [key, self.revision])
260 exists = self.cursor.fetchone()
261
262 if not exists:
263 self.cursor.execute(
264 '''insert into kv_revisions (
265 revision, key, data) values (?, ?, ?)''',
266 (self.revision, key, serialized))
267 else:
268 self.cursor.execute(
269 '''
270 update kv_revisions
271 set data = ?
272 where key = ?
273 and revision = ?''',
274 [serialized, key, self.revision])
275
276 return value
277
278 def delta(self, mapping, prefix):
279 """
280 return a delta containing values that have changed.
281 """
282 previous = self.getrange(prefix, strip=True)
283 if not previous:
284 pk = set()
285 else:
286 pk = set(previous.keys())
287 ck = set(mapping.keys())
288 delta = DeltaSet()
289
290 # added
291 for k in ck.difference(pk):
292 delta[k] = Delta(None, mapping[k])
293
294 # removed
295 for k in pk.difference(ck):
296 delta[k] = Delta(previous[k], None)
297
298 # changed
299 for k in pk.intersection(ck):
300 c = mapping[k]
301 p = previous[k]
302 if c != p:
303 delta[k] = Delta(p, c)
304
305 return delta
306
307 @contextlib.contextmanager
308 def hook_scope(self, name=""):
309 """Scope all future interactions to the current hook execution
310 revision."""
311 assert not self.revision
312 self.cursor.execute(
313 'insert into hooks (hook, date) values (?, ?)',
314 (name or sys.argv[0],
315 datetime.datetime.utcnow().isoformat()))
316 self.revision = self.cursor.lastrowid
317 try:
318 yield self.revision
319 self.revision = None
320 except:
321 self.flush(False)
322 self.revision = None
323 raise
324 else:
325 self.flush()
326
327 def flush(self, save=True):
328 if save:
329 self.conn.commit()
330 elif self._closed:
331 return
332 else:
333 self.conn.rollback()
334
335 def _init(self):
336 self.cursor.execute('''
337 create table if not exists kv (
338 key text,
339 data text,
340 primary key (key)
341 )''')
342 self.cursor.execute('''
343 create table if not exists kv_revisions (
344 key text,
345 revision integer,
346 data text,
347 primary key (key, revision)
348 )''')
349 self.cursor.execute('''
350 create table if not exists hooks (
351 version integer primary key autoincrement,
352 hook text,
353 date text
354 )''')
355 self.conn.commit()
356
357 def gethistory(self, key, deserialize=False):
358 self.cursor.execute(
359 '''
360 select kv.revision, kv.key, kv.data, h.hook, h.date
361 from kv_revisions kv,
362 hooks h
363 where kv.key=?
364 and kv.revision = h.version
365 ''', [key])
366 if deserialize is False:
367 return self.cursor.fetchall()
368 return map(_parse_history, self.cursor.fetchall())
369
370 def debug(self, fh=sys.stderr):
371 self.cursor.execute('select * from kv')
372 pprint.pprint(self.cursor.fetchall(), stream=fh)
373 self.cursor.execute('select * from kv_revisions')
374 pprint.pprint(self.cursor.fetchall(), stream=fh)
375
376
377def _parse_history(d):
378 return (d[0], d[1], json.loads(d[2]), d[3],
379 datetime.datetime.strptime(d[-1], "%Y-%m-%dT%H:%M:%S.%f"))
380
381
382class HookData(object):
383 """Simple integration for existing hook exec frameworks.
384
385 Records all unit information, and stores deltas for processing
386 by the hook.
387
388 Sample::
389
390 from charmhelpers.core import hookenv, unitdata
391
392 changes = unitdata.HookData()
393 db = unitdata.kv()
394 hooks = hookenv.Hooks()
395
396 @hooks.hook
397 def config_changed():
398 # View all changes to configuration
399 for changed, (prev, cur) in changes.conf.items():
400 print('config changed', changed,
401 'previous value', prev,
402 'current value', cur)
403
404 # Get some unit specific bookkeeping
405 if not db.get('pkg_key'):
406 key = urllib.urlopen('https://example.com/pkg_key').read()
407 db.set('pkg_key', key)
408
409 if __name__ == '__main__':
410 with changes():
411 hooks.execute()
412
413 """
414 def __init__(self):
415 self.kv = kv()
416 self.conf = None
417 self.rels = None
418
419 @contextlib.contextmanager
420 def __call__(self):
421 from charmhelpers.core import hookenv
422 hook_name = hookenv.hook_name()
423
424 with self.kv.hook_scope(hook_name):
425 self._record_charm_version(hookenv.charm_dir())
426 delta_config, delta_relation = self._record_hook(hookenv)
427 yield self.kv, delta_config, delta_relation
428
429 def _record_charm_version(self, charm_dir):
430 # Record revisions. Charm revisions are meaningless
431 # to charm authors as they don't control the revision,
432 # so logic dependent on revision is not particularly
433 # useful; however, it is useful for debugging analysis.
434 charm_rev = open(
435 os.path.join(charm_dir, 'revision')).read().strip()
436 charm_rev = charm_rev or '0'
437 revs = self.kv.get('charm_revisions', [])
438 if charm_rev not in revs:
439 revs.append(charm_rev.strip() or '0')
440 self.kv.set('charm_revisions', revs)
441
442 def _record_hook(self, hookenv):
443 data = hookenv.execution_environment()
444 self.conf = conf_delta = self.kv.delta(data['conf'], 'config')
445 self.rels = rels_delta = self.kv.delta(data['rels'], 'rels')
446 self.kv.set('env', dict(data['env']))
447 self.kv.set('unit', data['unit'])
448 self.kv.set('relid', data.get('relid'))
449 return conf_delta, rels_delta
450
451
452class Record(dict):
453
454 __slots__ = ()
455
456 def __getattr__(self, k):
457 if k in self:
458 return self[k]
459 raise AttributeError(k)
460
461
462class DeltaSet(Record):
463
464 __slots__ = ()
465
466
467Delta = collections.namedtuple('Delta', ['previous', 'current'])
468
469
470_KV = None
471
472
473def kv():
474 global _KV
475 if _KV is None:
476 _KV = Storage()
477 return _KV
0478
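Reviewer note: the comparison performed by Storage.delta does not depend on sqlite and can be illustrated with plain dicts. This is a sketch under assumptions: compute_delta is a hypothetical name, and it takes the previously stored mapping as an argument where the real method reads it back with getrange.

```python
import collections

Delta = collections.namedtuple('Delta', ['previous', 'current'])


def compute_delta(previous, current):
    """Return {key: Delta(previous, current)} for added, removed and changed keys."""
    previous = previous or {}  # getrange returns None when nothing is stored
    delta = {}

    # Keys present only in the new mapping were added.
    for key in current.keys() - previous.keys():
        delta[key] = Delta(None, current[key])

    # Keys present only in the old mapping were removed.
    for key in previous.keys() - current.keys():
        delta[key] = Delta(previous[key], None)

    # Keys present in both are reported only when the value differs.
    for key in previous.keys() & current.keys():
        if previous[key] != current[key]:
            delta[key] = Delta(previous[key], current[key])

    return delta
```

Because unchanged keys are left out, a hook can iterate over the result and react only to real changes. As with the original, computing the delta stores nothing; the caller still has to save the new values.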
=== renamed file 'hooks/charmhelpers/core/unitdata.py' => 'hooks/charmhelpers/core/unitdata.py.moved'
=== modified file 'hooks/hooks.py'
--- hooks/hooks.py 2015-04-14 10:16:50 +0000
+++ hooks/hooks.py 2015-04-20 00:40:18 +0000
@@ -11,6 +11,7 @@
 import shutil
 import sys
 import os
+import glob
 from base64 import b64decode
 
 import maas as MAAS
@@ -56,6 +57,8 @@
 
 from charmhelpers.contrib.openstack.utils import get_host_ip
 
+from charmhelpers.contrib.charmsupport import nrpe
+
 hooks = Hooks()
 
 COROSYNC_CONF = '/etc/corosync/corosync.conf'
@@ -68,7 +71,8 @@
     COROSYNC_CONF
 ]
 
-PACKAGES = ['corosync', 'pacemaker', 'python-netaddr', 'ipmitool']
+PACKAGES = ['corosync', 'pacemaker', 'python-netaddr', 'ipmitool',
+            'libnagios-plugin-perl']
 SUPPORTED_TRANSPORTS = ['udp', 'udpu', 'multicast', 'unicast']
 
 
@@ -209,11 +213,15 @@
     configure_monitor_host()
     configure_stonith()
 
+    update_nrpe_config()
+
 
 @hooks.hook()
 def upgrade_charm():
     install()
 
+    update_nrpe_config()
+
 
 def restart_corosync():
     if service_running("pacemaker"):
@@ -594,6 +602,59 @@
                         "versions less than Trusty 14.04")
 
 
+@hooks.hook('nrpe-external-master-relation-joined',
+            'nrpe-external-master-relation-changed')
+def update_nrpe_config():
+    scripts_src = os.path.join(os.environ["CHARM_DIR"], "files",
+                               "nrpe")
+    scripts_dst = "/usr/local/lib/nagios/plugins"
+    if not os.path.exists(scripts_dst):
+        os.makedirs(scripts_dst)
+    for fname in glob.glob(os.path.join(scripts_src, "*")):
+        if os.path.isfile(fname):
+            shutil.copy2(fname,
+                         os.path.join(scripts_dst, os.path.basename(fname)))
+
+    sudoers_src = os.path.join(os.environ["CHARM_DIR"], "files",
+                               "sudoers")
+    sudoers_dst = "/etc/sudoers.d"
+    for fname in glob.glob(os.path.join(sudoers_src, "*")):
+        if os.path.isfile(fname):
+            shutil.copy2(fname,
+                         os.path.join(sudoers_dst, os.path.basename(fname)))
+
+    hostname = nrpe.get_nagios_hostname()
+    current_unit = nrpe.get_nagios_unit_name()
+
+    nrpe_setup = nrpe.NRPE(hostname=hostname)
+
+    apt_install('python-dbus')
+
+    # corosync/crm checks
+    nrpe_setup.add_check(
+        shortname='corosync_rings',
+        description='Check Corosync rings {%s}' % current_unit,
+        check_cmd='check_corosync_rings')
+    nrpe_setup.add_check(
+        shortname='crm_status',
+        description='Check crm status {%s}' % current_unit,
+        check_cmd='check_crm')
+
+    # process checks
+    nrpe_setup.add_check(
+        shortname='corosync_proc',
+        description='Check Corosync process {%s}' % current_unit,
+        check_cmd='check_procs -c 1:1 -C corosync'
+    )
+    nrpe_setup.add_check(
+        shortname='pacemakerd_proc',
+        description='Check Pacemakerd process {%s}' % current_unit,
+        check_cmd='check_procs -c 1:1 -C pacemakerd'
+    )
+
+    nrpe_setup.write()
+
+
 if __name__ == '__main__':
     try:
         hooks.execute(sys.argv)
600661
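Reviewer note: update_nrpe_config in hooks/hooks.py repeats the same glob-and-copy loop for the nrpe scripts and the sudoers fragments. One way to factor it is sketched below. copy_plain_files is a hypothetical helper, not part of this branch or of charm-helpers, and unlike the sudoers loop in the diff it always creates the destination directory when it is missing.

```python
import glob
import os
import shutil


def copy_plain_files(src_dir, dst_dir):
    """Copy every regular file in src_dir into dst_dir and return the targets."""
    if not os.path.exists(dst_dir):
        os.makedirs(dst_dir)
    copied = []
    for fname in sorted(glob.glob(os.path.join(src_dir, "*"))):
        # Skip subdirectories; only plain files are plugin scripts.
        if os.path.isfile(fname):
            target = os.path.join(dst_dir, os.path.basename(fname))
            # copy2 keeps the mode bits, so executable scripts stay executable.
            shutil.copy2(fname, target)
            copied.append(target)
    return copied
```

With such a helper the hook body could call it once for files/nrpe and once for files/sudoers. Files under /etc/sudoers.d are normally expected to have restrictive permissions, so the mode of the source file in the charm matters when copy2 is used.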
=== added symlink 'hooks/nrpe-external-master-relation-changed'
=== target is u'hooks.py'
=== added symlink 'hooks/nrpe-external-master-relation-joined'
=== target is u'hooks.py'
=== modified file 'metadata.yaml'
--- metadata.yaml 2014-04-11 11:22:46 +0000
+++ metadata.yaml 2015-04-20 00:40:18 +0000
@@ -14,6 +14,9 @@
   ha:
     interface: hacluster
     scope: container
+  nrpe-external-master:
+    interface: nrpe-external-master
+    scope: container
 peers:
   hanode:
     interface: hacluster
