Merge lp:~james-page/charms/trusty/rabbitmq-server/status-check into lp:~openstack-charmers-archive/charms/trusty/rabbitmq-server/next

Proposed by James Page
Status: Merged
Merged at revision: 116
Proposed branch: lp:~james-page/charms/trusty/rabbitmq-server/status-check
Merge into: lp:~openstack-charmers-archive/charms/trusty/rabbitmq-server/next
Diff against target: 261 lines (+131/-18)
3 files modified
hooks/rabbit_utils.py (+67/-16)
hooks/rabbitmq_server_relations.py (+7/-0)
unit_tests/test_rabbit_utils.py (+57/-2)
To merge this branch: bzr merge lp:~james-page/charms/trusty/rabbitmq-server/status-check
Reviewer Review Type Date Requested Status
David Ames (community) Approve
James Page Needs Resubmitting
Review via email: mp+273037@code.launchpad.net

Description of the change

Add basic use of status feature in Juju

This includes:

1) Evaluation of current rabbitmq state on the server using 'status'
2) Evaluation of cluster status once peers appear on the cluster relation
3) status set when installing packages and when configuring mirroring (long ops)

I also found a race in leader-settings-changed - it can fire prior to installation of rabbitmq in config-changed, so added a basic defer until installed check.

To post a comment you must log in.
Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #11153 rabbitmq-server-next for james-page mp273037
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/11153/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #10358 rabbitmq-server-next for james-page mp273037
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/10358/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #6934 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 124
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12634394/
Build: http://10.245.162.77:8080/job/charm_amulet_test/6934/

Revision history for this message
David Ames (thedac) wrote :

I really like the workgroup status approach in this MP.

I also like your solution to https://bugs.launchpad.net/charms/+source/rabbitmq-server/+bug/1501048 better than mine.
mkdir is imported in hooks/rabbitmq_server_relations.py but is never used. I suspect you used os.path.exists instead.

I also really like breaking out clustered() as a cached function on its own. However, one of the bugs I am fighting (https://bugs.launchpad.net/charms/+source/rabbitmq-server/+bug/1500204) is a direct result of checking for more than one running node.

if len(running_nodes()) > 1:

Consider, the 3rd, 4th and 5th nodes to attempt clustering. There will already by 2+ running nodes and they will assume incorrectly they are already clustered.

In my testing I have removed this check entirely with some success. I would be interested in finding a more robust clustered check.

I plan to merge your MP into my branch and do some more testing and see if I can come up with a clustered check that works in all cases.

review: Needs Fixing
Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #6937 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
2015-10-01 21:19:58,241 publish_amqp_message_by_unit DEBUG: Publishing message to test queue:
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12634746/
Build: http://10.245.162.77:8080/job/charm_amulet_test/6937/

Revision history for this message
David Ames (thedac) wrote :

> However, one of the bugs I am fighting
> (https://bugs.launchpad.net/charms/+source/rabbitmq-server/+bug/1500204) is a
> direct result of checking for more than one running node.
>
> if len(running_nodes()) > 1:
>
> Consider, the 3rd, 4th and 5th nodes to attempt clustering. There will already
> by 2+ running nodes and they will assume incorrectly they are already
> clustered.

Ignore this. I see what it is supposed to do. As the local unit will only ever see 1 if not clustered or more if clustered.

> I plan to merge your MP into my branch and do some more testing and see if I
> can come up with a clustered check that works in all cases.

I am testing a combination of my MP which ignores min-cluster-size when leadership election is available and your MP now. I'll see if any more race conditions reveal themselves.

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #6938 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 1
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12635100/
Build: http://10.245.162.77:8080/job/charm_amulet_test/6938/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #11156 rabbitmq-server-next for james-page mp273037
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/11156/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #10364 rabbitmq-server-next for james-page mp273037
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/10364/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #6970 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 1
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12638811/
Build: http://10.245.162.77:8080/job/charm_amulet_test/6970/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #11169 rabbitmq-server-next for james-page mp273037
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/11169/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #10372 rabbitmq-server-next for james-page mp273037
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/10372/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #6992 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 124
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12640571/
Build: http://10.245.162.77:8080/job/charm_amulet_test/6992/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #7002 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 1
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12641390/
Build: http://10.245.162.77:8080/job/charm_amulet_test/7002/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #7009 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 1
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12642068/
Build: http://10.245.162.77:8080/job/charm_amulet_test/7009/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #7025 rabbitmq-server-next for james-page mp273037
    AMULET FAIL: amulet-test failed

AMULET Results (max last 2 lines):
make: *** [functional_test] Error 1
ERROR:root:Make target returned non-zero.

Full amulet test output: http://paste.ubuntu.com/12644560/
Build: http://10.245.162.77:8080/job/charm_amulet_test/7025/

Revision history for this message
James Page (james-page) :
review: Needs Resubmitting
Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #10521 rabbitmq-server-next for james-page mp273037
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/10521/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #11327 rabbitmq-server-next for james-page mp273037
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/11327/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #7112 rabbitmq-server-next for james-page mp273037
    AMULET OK: passed

Build: http://10.245.162.77:8080/job/charm_amulet_test/7112/

Revision history for this message
David Ames (thedac) wrote :

Apporved

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'hooks/rabbit_utils.py'
--- hooks/rabbit_utils.py 2015-09-23 15:53:31 +0000
+++ hooks/rabbit_utils.py 2015-10-01 09:52:45 +0000
@@ -23,7 +23,9 @@
23 related_units,23 related_units,
24 log, ERROR,24 log, ERROR,
25 INFO,25 INFO,
26 service_name26 service_name,
27 status_set,
28 cached
27)29)
2830
29from charmhelpers.core.host import (31from charmhelpers.core.host import (
@@ -193,6 +195,11 @@
193 subprocess.check_call(cmd)195 subprocess.check_call(cmd)
194196
195197
198@cached
199def caching_cmp_pkgrevno(package, revno, pkgcache=None):
200 return cmp_pkgrevno(package, revno, pkgcache)
201
202
196def set_ha_mode(vhost, mode, params=None, sync_mode='automatic'):203def set_ha_mode(vhost, mode, params=None, sync_mode='automatic'):
197 """Valid mode values:204 """Valid mode values:
198205
@@ -212,7 +219,7 @@
212 http://www.rabbitmq.com./ha.html#eager-synchronisation219 http://www.rabbitmq.com./ha.html#eager-synchronisation
213 """220 """
214221
215 if cmp_pkgrevno('rabbitmq-server', '3.0.0') < 0:222 if caching_cmp_pkgrevno('rabbitmq-server', '3.0.0') < 0:
216 log(("Mirroring queues cannot be enabled, only supported "223 log(("Mirroring queues cannot be enabled, only supported "
217 "in rabbitmq-server >= 3.0"), level='WARN')224 "in rabbitmq-server >= 3.0"), level='WARN')
218 log(("More information at http://www.rabbitmq.com/blog/"225 log(("More information at http://www.rabbitmq.com/blog/"
@@ -266,6 +273,11 @@
266 "2012/11/19/breaking-things-with-rabbitmq-3-0"), level='INFO')273 "2012/11/19/breaking-things-with-rabbitmq-3-0"), level='INFO')
267 return274 return
268275
276 if enable:
277 status_set('active', 'Enabling queue mirroring')
278 else:
279 status_set('active', 'Disabling queue mirroring')
280
269 for vhost in list_vhosts():281 for vhost in list_vhosts():
270 if enable:282 if enable:
271 set_ha_mode(vhost, 'all')283 set_ha_mode(vhost, 'all')
@@ -284,19 +296,8 @@
284 cluster_cmd = 'join_cluster'296 cluster_cmd = 'join_cluster'
285 else:297 else:
286 cluster_cmd = 'cluster'298 cluster_cmd = 'cluster'
287 out = subprocess.check_output([RABBITMQ_CTL, 'cluster_status'])299
288 log('cluster status is %s' % str(out))300 if clustered():
289
290 # check if node is already clustered
291 total_nodes = 1
292 running_nodes = []
293 m = re.search("\{running_nodes,\[(.*?)\]\}", out.strip(), re.DOTALL)
294 if m is not None:
295 running_nodes = m.group(1).split(',')
296 running_nodes = [x.replace("'", '') for x in running_nodes]
297 total_nodes = len(running_nodes)
298
299 if total_nodes > 1:
300 log('Node is already clustered, skipping')301 log('Node is already clustered, skipping')
301 return False302 return False
302303
@@ -324,10 +325,11 @@
324 return False325 return False
325326
326 # iterate over all the nodes, join to the first available327 # iterate over all the nodes, join to the first available
328 active_nodes = running_nodes()
327 num_tries = 0329 num_tries = 0
328 for node in available_nodes:330 for node in available_nodes:
329 log('Clustering with remote rabbit host (%s).' % node)331 log('Clustering with remote rabbit host (%s).' % node)
330 if node in running_nodes:332 if node in active_nodes:
331 log('Host already clustered with %s.' % node)333 log('Host already clustered with %s.' % node)
332 return False334 return False
333335
@@ -600,3 +602,52 @@
600 for v in restart_map().values():602 for v in restart_map().values():
601 _services = _services + v603 _services = _services + v
602 return list(set(_services))604 return list(set(_services))
605
606
607@cached
608def running_nodes():
609 ''' Determine the current set of running rabbitmq-units in the cluster '''
610 out = subprocess.check_output([RABBITMQ_CTL, 'cluster_status'])
611
612 running_nodes = []
613 m = re.search("\{running_nodes,\[(.*?)\]\}", out.strip(), re.DOTALL)
614 if m is not None:
615 running_nodes = m.group(1).split(',')
616 running_nodes = [x.replace("'", '').strip() for x in running_nodes]
617
618 return running_nodes
619
620
621@cached
622def clustered():
623 ''' Determine whether local rabbitmq-server is clustered '''
624 if len(running_nodes()) > 1:
625 return True
626 else:
627 return False
628
629
630def assess_status():
631 ''' Assess the status for the current running unit '''
632 # NOTE: ensure rabbitmq is actually installed before doing
633 # any checks
634 if os.path.exists(RABBITMQ_CTL):
635 # Clustering Check
636 peer_ids = relation_ids('cluster')
637 if peer_ids and len(related_units(peer_ids[0])):
638 if not clustered():
639 status_set('waiting',
640 'Unit has peers, but RabbitMQ not clustered')
641 return
642 # General status check
643 status_cmd = ['rabbitmqctl', 'status']
644 ret = subprocess.call(status_cmd)
645 if ret > 0:
646 status_set('blocked', 'RabbitMQ server is not running')
647 else:
648 if clustered():
649 status_set('active', 'Unit is ready and clustered')
650 else:
651 status_set('active', 'Unit is ready')
652 else:
653 status_set('waiting', 'RabbitMQ is not yet installed')
603654
=== modified file 'hooks/rabbitmq_server_relations.py'
--- hooks/rabbitmq_server_relations.py 2015-09-23 13:16:23 +0000
+++ hooks/rabbitmq_server_relations.py 2015-10-01 09:52:45 +0000
@@ -65,6 +65,7 @@
65 UnregisteredHookError,65 UnregisteredHookError,
66 is_leader,66 is_leader,
67 charm_dir,67 charm_dir,
68 status_set,
68)69)
69from charmhelpers.core.host import (70from charmhelpers.core.host import (
70 cmp_pkgrevno,71 cmp_pkgrevno,
@@ -73,6 +74,7 @@
73 service_stop,74 service_stop,
74 service_restart,75 service_restart,
75 write_file,76 write_file,
77 mkdir,
76)78)
77from charmhelpers.contrib.charmsupport import nrpe79from charmhelpers.contrib.charmsupport import nrpe
7880
@@ -644,6 +646,7 @@
644 '/etc/default/rabbitmq-server')646 '/etc/default/rabbitmq-server')
645 # Install packages to ensure any changes to source647 # Install packages to ensure any changes to source
646 # result in an upgrade if applicable.648 # result in an upgrade if applicable.
649 status_set('maintenance', 'Installing/upgrading RabbitMQ packages')
647 apt_install(rabbit.PACKAGES, fatal=True)650 apt_install(rabbit.PACKAGES, fatal=True)
648651
649 open_port(5672)652 open_port(5672)
@@ -688,6 +691,9 @@
688691
689@hooks.hook('leader-settings-changed')692@hooks.hook('leader-settings-changed')
690def leader_settings_changed():693def leader_settings_changed():
694 if not os.path.exists(rabbit.RABBITMQ_CTL):
695 log('Deferring cookie configuration, RabbitMQ not yet installed')
696 return
691 # Get cookie from leader, update cookie locally and697 # Get cookie from leader, update cookie locally and
692 # force cluster-relation-changed hooks to run on peers698 # force cluster-relation-changed hooks to run on peers
693 cookie = leader_get(attribute='cookie')699 cookie = leader_get(attribute='cookie')
@@ -716,5 +722,6 @@
716if __name__ == '__main__':722if __name__ == '__main__':
717 try:723 try:
718 hooks.execute(sys.argv)724 hooks.execute(sys.argv)
725 rabbit.assess_status()
719 except UnregisteredHookError as e:726 except UnregisteredHookError as e:
720 log('Unknown hook {} - skipping.'.format(e))727 log('Unknown hook {} - skipping.'.format(e))
721728
=== modified file 'unit_tests/test_rabbit_utils.py'
--- unit_tests/test_rabbit_utils.py 2015-09-24 21:49:44 +0000
+++ unit_tests/test_rabbit_utils.py 2015-10-01 09:52:45 +0000
@@ -4,8 +4,19 @@
4import tempfile4import tempfile
5import sys5import sys
6import collections6import collections
77from functools import wraps
8import rabbit_utils8
9
10with mock.patch('charmhelpers.core.hookenv.cached') as cached:
11 def passthrough(func):
12 @wraps(func)
13 def wrapper(*args, **kwargs):
14 return func(*args, **kwargs)
15 wrapper._wrapped = func
16 return wrapper
17 cached.side_effect = passthrough
18 import rabbit_utils
19
9sys.modules['MySQLdb'] = mock.Mock()20sys.modules['MySQLdb'] = mock.Mock()
1021
1122
@@ -41,6 +52,23 @@
41 self.assertTrue(log.called)52 self.assertTrue(log.called)
4253
4354
55RABBITMQCTL_CLUSTERSTATUS_RUNNING = """Cluster status of node 'rabbit@juju-devel3-machine-19' ...
56[{nodes,[{disc,['rabbit@juju-devel3-machine-14',
57 'rabbit@juju-devel3-machine-19']}]},
58 {running_nodes,['rabbit@juju-devel3-machine-14',
59 'rabbit@juju-devel3-machine-19']},
60 {cluster_name,<<"rabbit@juju-devel3-machine-14.openstacklocal">>},
61 {partitions,[]}]
62 """
63
64RABBITMQCTL_CLUSTERSTATUS_SOLO = """Cluster status of node 'rabbit@juju-devel3-machine-14' ...
65[{nodes,[{disc,['rabbit@juju-devel3-machine-14']}]},
66 {running_nodes,['rabbit@juju-devel3-machine-14']},
67 {cluster_name,<<"rabbit@juju-devel3-machine-14.openstacklocal">>},
68 {partitions,[]}]
69 """
70
71
44class UtilsTests(unittest.TestCase):72class UtilsTests(unittest.TestCase):
45 def setUp(self):73 def setUp(self):
46 super(UtilsTests, self).setUp()74 super(UtilsTests, self).setUp()
@@ -102,3 +130,30 @@
102 self.assertEqual(lines[0], "#somedata\n")130 self.assertEqual(lines[0], "#somedata\n")
103 self.assertEqual(lines[1], "%s %s\n" % (map.items()[0]))131 self.assertEqual(lines[1], "%s %s\n" % (map.items()[0]))
104 self.assertEqual(lines[4], "%s %s\n" % (map.items()[3]))132 self.assertEqual(lines[4], "%s %s\n" % (map.items()[3]))
133
134 @mock.patch('rabbit_utils.running_nodes')
135 def test_not_clustered(self, mock_running_nodes):
136 mock_running_nodes.return_value = []
137 self.assertFalse(rabbit_utils.clustered())
138
139 @mock.patch('rabbit_utils.running_nodes')
140 def test_clustered(self, mock_running_nodes):
141 mock_running_nodes.return_value = ['a', 'b']
142 self.assertTrue(rabbit_utils.clustered())
143
144 @mock.patch('rabbit_utils.subprocess')
145 def test_running_nodes(self, mock_subprocess):
146 '''Ensure cluster_status can be parsed for a clustered deployment'''
147 mock_subprocess.check_output.return_value = \
148 RABBITMQCTL_CLUSTERSTATUS_RUNNING
149 self.assertEqual(rabbit_utils.running_nodes(),
150 ['rabbit@juju-devel3-machine-14',
151 'rabbit@juju-devel3-machine-19'])
152
153 @mock.patch('rabbit_utils.subprocess')
154 def test_running_nodes_solo(self, mock_subprocess):
155 '''Ensure cluster_status can be parsed for a single unit deployment'''
156 mock_subprocess.check_output.return_value = \
157 RABBITMQCTL_CLUSTERSTATUS_SOLO
158 self.assertEqual(rabbit_utils.running_nodes(),
159 ['rabbit@juju-devel3-machine-14'])

Subscribers

People subscribed via source and target branches