Merge lp:~james-page/charms/trusty/rabbitmq-server/status-check into lp:~openstack-charmers-archive/charms/trusty/rabbitmq-server/next

Proposed by James Page
Status: Merged
Merged at revision: 116
Proposed branch: lp:~james-page/charms/trusty/rabbitmq-server/status-check
Merge into: lp:~openstack-charmers-archive/charms/trusty/rabbitmq-server/next
Diff against target: 261 lines (+131/-18)
3 files modified
hooks/rabbit_utils.py (+67/-16)
hooks/rabbitmq_server_relations.py (+7/-0)
unit_tests/test_rabbit_utils.py (+57/-2)
To merge this branch: bzr merge lp:~james-page/charms/trusty/rabbitmq-server/status-check
Reviewer Review Type Date Requested Status
David Ames (community) Approve
James Page Needs Resubmitting
Review via email: mp+273037@code.launchpad.net

Description of the change

Add basic use of status feature in Juju

This includes:

1) Evaluation of current rabbitmq state on the server using 'status'
2) Evaluation of cluster status once peers appear on the cluster relation
3) status set when installing packages and when configuring mirroring (long ops)

I also found a race in leader-settings-changed - it can fire prior to installation of rabbitmq in config-changed, so added a basic defer until installed check.

To post a comment you must log in.
Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #11153 rabbitmq-server-next for james-page mp273037
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/11153/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #10358 rabbitmq-server-next for james-page mp273037
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/10358/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #6934 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 124
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12634394/
Build: http://10.245.162.77:8080/job/charm_amulet_test/6934/

Revision history for this message
David Ames (thedac) wrote :

I really like the workgroup status approach in this MP.

I also like your solution to https://bugs.launchpad.net/charms/+source/rabbitmq-server/+bug/1501048 better than mine.
mkdir is imported in hooks/rabbitmq_server_relations.py but is never used. I suspect you used os.path.exists instead.

I also really like breaking out clustered() as a cached function on its own. However, one of the bugs I am fighting (https://bugs.launchpad.net/charms/+source/rabbitmq-server/+bug/1500204) is a direct result of checking for more than one running node.

if len(running_nodes()) > 1:

Consider, the 3rd, 4th and 5th nodes to attempt clustering. There will already by 2+ running nodes and they will assume incorrectly they are already clustered.

In my testing I have removed this check entirely with some success. I would be interested in finding a more robust clustered check.

I plan to merge your MP into my branch and do some more testing and see if I can come up with a clustered check that works in all cases.

review: Needs Fixing
Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #6937 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
2015-10-01 21:19:58,241 publish_amqp_message_by_unit DEBUG: Publishing message to test queue:
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12634746/
Build: http://10.245.162.77:8080/job/charm_amulet_test/6937/

Revision history for this message
David Ames (thedac) wrote :

> However, one of the bugs I am fighting
> (https://bugs.launchpad.net/charms/+source/rabbitmq-server/+bug/1500204) is a
> direct result of checking for more than one running node.
>
> if len(running_nodes()) > 1:
>
> Consider, the 3rd, 4th and 5th nodes to attempt clustering. There will already
> by 2+ running nodes and they will assume incorrectly they are already
> clustered.

Ignore this. I see what it is supposed to do. As the local unit will only ever see 1 if not clustered or more if clustered.

> I plan to merge your MP into my branch and do some more testing and see if I
> can come up with a clustered check that works in all cases.

I am testing a combination of my MP which ignores min-cluster-size when leadership election is available and your MP now. I'll see if any more race conditions reveal themselves.

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #6938 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 1
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12635100/
Build: http://10.245.162.77:8080/job/charm_amulet_test/6938/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #11156 rabbitmq-server-next for james-page mp273037
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/11156/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #10364 rabbitmq-server-next for james-page mp273037
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/10364/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #6970 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 1
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12638811/
Build: http://10.245.162.77:8080/job/charm_amulet_test/6970/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #11169 rabbitmq-server-next for james-page mp273037
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/11169/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #10372 rabbitmq-server-next for james-page mp273037
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/10372/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #6992 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 124
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12640571/
Build: http://10.245.162.77:8080/job/charm_amulet_test/6992/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #7002 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 1
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12641390/
Build: http://10.245.162.77:8080/job/charm_amulet_test/7002/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #7009 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 1
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12642068/
Build: http://10.245.162.77:8080/job/charm_amulet_test/7009/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #7025 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 1
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12644560/
Build: http://10.245.162.77:8080/job/charm_amulet_test/7025/

Revision history for this message
James Page (james-page) :
review: Needs Resubmitting
Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #10521 rabbitmq-server-next for james-page mp273037
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/10521/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #11327 rabbitmq-server-next for james-page mp273037
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/11327/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #7112 rabbitmq-server-next for james-page mp273037
    AMULET OK: passed

Build: http://10.245.162.77:8080/job/charm_amulet_test/7112/

Revision history for this message
David Ames (thedac) wrote :

Apporved

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'hooks/rabbit_utils.py'
2--- hooks/rabbit_utils.py 2015-09-23 15:53:31 +0000
3+++ hooks/rabbit_utils.py 2015-10-01 09:52:45 +0000
4@@ -23,7 +23,9 @@
5 related_units,
6 log, ERROR,
7 INFO,
8- service_name
9+ service_name,
10+ status_set,
11+ cached
12 )
13
14 from charmhelpers.core.host import (
15@@ -193,6 +195,11 @@
16 subprocess.check_call(cmd)
17
18
19+@cached
20+def caching_cmp_pkgrevno(package, revno, pkgcache=None):
21+ return cmp_pkgrevno(package, revno, pkgcache)
22+
23+
24 def set_ha_mode(vhost, mode, params=None, sync_mode='automatic'):
25 """Valid mode values:
26
27@@ -212,7 +219,7 @@
28 http://www.rabbitmq.com./ha.html#eager-synchronisation
29 """
30
31- if cmp_pkgrevno('rabbitmq-server', '3.0.0') < 0:
32+ if caching_cmp_pkgrevno('rabbitmq-server', '3.0.0') < 0:
33 log(("Mirroring queues cannot be enabled, only supported "
34 "in rabbitmq-server >= 3.0"), level='WARN')
35 log(("More information at http://www.rabbitmq.com/blog/"
36@@ -266,6 +273,11 @@
37 "2012/11/19/breaking-things-with-rabbitmq-3-0"), level='INFO')
38 return
39
40+ if enable:
41+ status_set('active', 'Enabling queue mirroring')
42+ else:
43+ status_set('active', 'Disabling queue mirroring')
44+
45 for vhost in list_vhosts():
46 if enable:
47 set_ha_mode(vhost, 'all')
48@@ -284,19 +296,8 @@
49 cluster_cmd = 'join_cluster'
50 else:
51 cluster_cmd = 'cluster'
52- out = subprocess.check_output([RABBITMQ_CTL, 'cluster_status'])
53- log('cluster status is %s' % str(out))
54-
55- # check if node is already clustered
56- total_nodes = 1
57- running_nodes = []
58- m = re.search("\{running_nodes,\[(.*?)\]\}", out.strip(), re.DOTALL)
59- if m is not None:
60- running_nodes = m.group(1).split(',')
61- running_nodes = [x.replace("'", '') for x in running_nodes]
62- total_nodes = len(running_nodes)
63-
64- if total_nodes > 1:
65+
66+ if clustered():
67 log('Node is already clustered, skipping')
68 return False
69
70@@ -324,10 +325,11 @@
71 return False
72
73 # iterate over all the nodes, join to the first available
74+ active_nodes = running_nodes()
75 num_tries = 0
76 for node in available_nodes:
77 log('Clustering with remote rabbit host (%s).' % node)
78- if node in running_nodes:
79+ if node in active_nodes:
80 log('Host already clustered with %s.' % node)
81 return False
82
83@@ -600,3 +602,52 @@
84 for v in restart_map().values():
85 _services = _services + v
86 return list(set(_services))
87+
88+
89+@cached
90+def running_nodes():
91+ ''' Determine the current set of running rabbitmq-units in the cluster '''
92+ out = subprocess.check_output([RABBITMQ_CTL, 'cluster_status'])
93+
94+ running_nodes = []
95+ m = re.search("\{running_nodes,\[(.*?)\]\}", out.strip(), re.DOTALL)
96+ if m is not None:
97+ running_nodes = m.group(1).split(',')
98+ running_nodes = [x.replace("'", '').strip() for x in running_nodes]
99+
100+ return running_nodes
101+
102+
103+@cached
104+def clustered():
105+ ''' Determine whether local rabbitmq-server is clustered '''
106+ if len(running_nodes()) > 1:
107+ return True
108+ else:
109+ return False
110+
111+
112+def assess_status():
113+ ''' Assess the status for the current running unit '''
114+ # NOTE: ensure rabbitmq is actually installed before doing
115+ # any checks
116+ if os.path.exists(RABBITMQ_CTL):
117+ # Clustering Check
118+ peer_ids = relation_ids('cluster')
119+ if peer_ids and len(related_units(peer_ids[0])):
120+ if not clustered():
121+ status_set('waiting',
122+ 'Unit has peers, but RabbitMQ not clustered')
123+ return
124+ # General status check
125+ status_cmd = ['rabbitmqctl', 'status']
126+ ret = subprocess.call(status_cmd)
127+ if ret > 0:
128+ status_set('blocked', 'RabbitMQ server is not running')
129+ else:
130+ if clustered():
131+ status_set('active', 'Unit is ready and clustered')
132+ else:
133+ status_set('active', 'Unit is ready')
134+ else:
135+ status_set('waiting', 'RabbitMQ is not yet installed')
136
137=== modified file 'hooks/rabbitmq_server_relations.py'
138--- hooks/rabbitmq_server_relations.py 2015-09-23 13:16:23 +0000
139+++ hooks/rabbitmq_server_relations.py 2015-10-01 09:52:45 +0000
140@@ -65,6 +65,7 @@
141 UnregisteredHookError,
142 is_leader,
143 charm_dir,
144+ status_set,
145 )
146 from charmhelpers.core.host import (
147 cmp_pkgrevno,
148@@ -73,6 +74,7 @@
149 service_stop,
150 service_restart,
151 write_file,
152+ mkdir,
153 )
154 from charmhelpers.contrib.charmsupport import nrpe
155
156@@ -644,6 +646,7 @@
157 '/etc/default/rabbitmq-server')
158 # Install packages to ensure any changes to source
159 # result in an upgrade if applicable.
160+ status_set('maintenance', 'Installing/upgrading RabbitMQ packages')
161 apt_install(rabbit.PACKAGES, fatal=True)
162
163 open_port(5672)
164@@ -688,6 +691,9 @@
165
166 @hooks.hook('leader-settings-changed')
167 def leader_settings_changed():
168+ if not os.path.exists(rabbit.RABBITMQ_CTL):
169+ log('Deferring cookie configuration, RabbitMQ not yet installed')
170+ return
171 # Get cookie from leader, update cookie locally and
172 # force cluster-relation-changed hooks to run on peers
173 cookie = leader_get(attribute='cookie')
174@@ -716,5 +722,6 @@
175 if __name__ == '__main__':
176 try:
177 hooks.execute(sys.argv)
178+ rabbit.assess_status()
179 except UnregisteredHookError as e:
180 log('Unknown hook {} - skipping.'.format(e))
181
182=== modified file 'unit_tests/test_rabbit_utils.py'
183--- unit_tests/test_rabbit_utils.py 2015-09-24 21:49:44 +0000
184+++ unit_tests/test_rabbit_utils.py 2015-10-01 09:52:45 +0000
185@@ -4,8 +4,19 @@
186 import tempfile
187 import sys
188 import collections
189-
190-import rabbit_utils
191+from functools import wraps
192+
193+
194+with mock.patch('charmhelpers.core.hookenv.cached') as cached:
195+ def passthrough(func):
196+ @wraps(func)
197+ def wrapper(*args, **kwargs):
198+ return func(*args, **kwargs)
199+ wrapper._wrapped = func
200+ return wrapper
201+ cached.side_effect = passthrough
202+ import rabbit_utils
203+
204 sys.modules['MySQLdb'] = mock.Mock()
205
206
207@@ -41,6 +52,23 @@
208 self.assertTrue(log.called)
209
210
211+RABBITMQCTL_CLUSTERSTATUS_RUNNING = """Cluster status of node 'rabbit@juju-devel3-machine-19' ...
212+[{nodes,[{disc,['rabbit@juju-devel3-machine-14',
213+ 'rabbit@juju-devel3-machine-19']}]},
214+ {running_nodes,['rabbit@juju-devel3-machine-14',
215+ 'rabbit@juju-devel3-machine-19']},
216+ {cluster_name,<<"rabbit@juju-devel3-machine-14.openstacklocal">>},
217+ {partitions,[]}]
218+ """
219+
220+RABBITMQCTL_CLUSTERSTATUS_SOLO = """Cluster status of node 'rabbit@juju-devel3-machine-14' ...
221+[{nodes,[{disc,['rabbit@juju-devel3-machine-14']}]},
222+ {running_nodes,['rabbit@juju-devel3-machine-14']},
223+ {cluster_name,<<"rabbit@juju-devel3-machine-14.openstacklocal">>},
224+ {partitions,[]}]
225+ """
226+
227+
228 class UtilsTests(unittest.TestCase):
229 def setUp(self):
230 super(UtilsTests, self).setUp()
231@@ -102,3 +130,30 @@
232 self.assertEqual(lines[0], "#somedata\n")
233 self.assertEqual(lines[1], "%s %s\n" % (map.items()[0]))
234 self.assertEqual(lines[4], "%s %s\n" % (map.items()[3]))
235+
236+ @mock.patch('rabbit_utils.running_nodes')
237+ def test_not_clustered(self, mock_running_nodes):
238+ mock_running_nodes.return_value = []
239+ self.assertFalse(rabbit_utils.clustered())
240+
241+ @mock.patch('rabbit_utils.running_nodes')
242+ def test_clustered(self, mock_running_nodes):
243+ mock_running_nodes.return_value = ['a', 'b']
244+ self.assertTrue(rabbit_utils.clustered())
245+
246+ @mock.patch('rabbit_utils.subprocess')
247+ def test_running_nodes(self, mock_subprocess):
248+ '''Ensure cluster_status can be parsed for a clustered deployment'''
249+ mock_subprocess.check_output.return_value = \
250+ RABBITMQCTL_CLUSTERSTATUS_RUNNING
251+ self.assertEqual(rabbit_utils.running_nodes(),
252+ ['rabbit@juju-devel3-machine-14',
253+ 'rabbit@juju-devel3-machine-19'])
254+
255+ @mock.patch('rabbit_utils.subprocess')
256+ def test_running_nodes_solo(self, mock_subprocess):
257+ '''Ensure cluster_status can be parsed for a single unit deployment'''
258+ mock_subprocess.check_output.return_value = \
259+ RABBITMQCTL_CLUSTERSTATUS_SOLO
260+ self.assertEqual(rabbit_utils.running_nodes(),
261+ ['rabbit@juju-devel3-machine-14'])

Subscribers

People subscribed via source and target branches