Merge lp:~gmb/maas/add-alerts-for-disconnected-clusters-bug-1341121 into lp:~maas-committers/maas/trunk

Proposed by Graham Binns
Status: Merged
Approved by: Graham Binns
Approved revision: no longer in the source branch.
Merged at revision: 3195
Proposed branch: lp:~gmb/maas/add-alerts-for-disconnected-clusters-bug-1341121
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 186 lines (+115/-2)
4 files modified
src/maasserver/api/tests/test_nodegroup.py (+3/-0)
src/maasserver/enum.py (+1/-0)
src/maasserver/middleware.py (+21/-2)
src/maasserver/tests/test_middleware.py (+90/-0)
To merge this branch: bzr merge lp:~gmb/maas/add-alerts-for-disconnected-clusters-bug-1341121
Reviewer Review Type Date Requested Status
Raphaël Badin (community) Approve
Jeroen T. Vermeulen (community) Approve
Review via email: mp+237039@code.launchpad.net

Commit message

Add a persistent error when there are disconnected clusters, and remove it when all the clusters are connected again.

Previously, the only way to know about clusters being disconnected was to go to the cluster listing and look.

To post a comment you must log in.
Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

I'm sure this will be appreciated, even if it may be a little annoying to some people with distributed clusters.

Review notes inline.

review: Approve
Revision history for this message
Raphaël Badin (rvb) :
Revision history for this message
Graham Binns (gmb) wrote :
Download full text (3.9 KiB)

On 3 October 2014 12:09, Jeroen T. Vermeulen <email address hidden> wrote:
> I'm sure this will be appreciated, even if it may be a little annoying to some people with distributed clusters.

Definitely! Thank you! Don't worry about being annoying.

Oh, wait, you mean the branch. Nevermind ;P

> Review notes inline.
>
> Diff comments:
>
> Tiny suggestion: instead of...
>
> any(
> not cluster.is_connected() for cluster in accepted_clusters)
>
> you could probably write...
>
> not all(
> cluster.is_connected() for cluster in accepted_clusters)

Yep, done.

>> +class ExternalComponentsMiddlewareTest(MAASServerTestCase):
>> + """Tests for the ExternalComponentsMiddleware."""
>> +
>> + def test__checks_connectivity_of_accepted_clusters(self):
>> + get_client_for = self.patch(nodegroup_module, 'getClientFor')
>> + middleware = ExternalComponentsMiddleware()
>> + clusters = {
>> + factory.make_NodeGroup(status=NODEGROUP_STATUS.ACCEPTED)
>> + for _ in range(3)}
>
> Why are we creating 3 of everything? For luck? If the code is correct for 2, it's probably correct for 3. (And in Python, if it's correct for 1 chances are it's correct for 2. It's not like C where you might do *ptr instead of ptr[i] for every i.)
>
> Also, we've taken to putting the closing brace/bracket of a comprehension on a line of its own.

You know, I honestly don't know why. Habit, I guess. Three is a magic
number. It's not really relevant to have > 1 cluster in this test
anyway, I'll change it.

>> +
>> + request = factory.make_fake_request(factory.make_string(), 'GET')
>> + middleware.process_request(request)
>> +
>> + self.assertThat(
>> + get_client_for, MockCallsMatch(
>> + *[call(cluster.uuid, timeout=0) for cluster in clusters]))
>> +
>> + def test__ignores_non_accepted_clusters(self):
>> + get_client_for = self.patch(nodegroup_module, 'getClientFor')
>> + middleware = ExternalComponentsMiddleware()
>> + statuses = (
>> + NODEGROUP_STATUS.PENDING,
>> + NODEGROUP_STATUS.ACCEPTED,
>> + NODEGROUP_STATUS.REJECTED)
>> + clusters = {
>> + factory.make_NodeGroup(status=random.choice(statuses))
>> + for _ in range(3)}
>
> Again with the three! But why bother testing three random items from a list of 3 values, when it's actually easier to test all values?
>
> Although in this case I personally would do it differently: write one simple test just for ACCEPTED, and one simple test for a random other status. In the event that we get it wrong for pending clusters but not rejected ones, or vice versa, that leaves a fifty-fifty chance of the test passing by accident on any given run, so up to you to judge how likely that is.

Right.

> Also, there's no need to enumerate the statuses manually; just use factory.pick_enum(NODEGROUP_STATUS, but_not=[NODEGROUP_STATUS.ACCEPTED]).

Ah, didn't know about that, thanks.

>> + accepted_cluster_uuids = (
>> + cluster.uuid for cluster in clusters
>> + if cluster.status == NODEGROUP_STATUS....

Read more...

Revision history for this message
Graham Binns (gmb) wrote :

On 3 October 2014 12:11, Raphaël Badin <email address hidden> wrote:
>
>
> Diff comments:
>
>> === modified file 'src/maasserver/enum.py'
>> --- src/maasserver/enum.py 2014-09-27 02:22:32 +0000
>> +++ src/maasserver/enum.py 2014-10-03 10:53:21 +0000
>> @@ -41,6 +41,7 @@
>> """Major moving parts of the application that may have failure states."""
>> PSERV = 'provisioning server'
>> IMPORT_PXE_FILES = 'maas-import-pxe-files script'
>> + CLUSTERS = 'clusters'
>
> We might want to have other component errors related to clusters. Maybe this should be renamed "DISCONNECTED_CLUSTERS"

Agreed.

>> class NODE_STATUS:
>>
>> + accepted_clusters = NodeGroup.objects.filter(
>> + status=NODEGROUP_STATUS.ACCEPTED)
>> + disconnected_clusters_found = any(
>> + not cluster.is_connected() for cluster in accepted_clusters)
>> + if disconnected_clusters_found:
>> + register_persistent_error(
>> + COMPONENT.CLUSTERS,
>> + "One or more clusters are currently disconnected. Visit "
>> + "the <a href=\"%s\">clusters</a> page for more "
>
> I think the link should apply to "clusters page" instead of just "clusters" but it's a detail.
>

Fixed.

Revision history for this message
Raphaël Badin (rvb) :
review: Approve
Revision history for this message
MAAS Lander (maas-lander) wrote :
Download full text (23.0 KiB)

The attempt to merge lp:~gmb/maas/add-alerts-for-disconnected-clusters-bug-1341121 into lp:maas failed. Below is the output from the failed tests.

Ign http://security.ubuntu.com trusty-security InRelease
Get:1 http://security.ubuntu.com trusty-security Release.gpg [933 B]
Ign http://nova.clouds.archive.ubuntu.com trusty InRelease
Get:2 http://security.ubuntu.com trusty-security Release [59.7 kB]
Ign http://nova.clouds.archive.ubuntu.com trusty-updates InRelease
Hit http://nova.clouds.archive.ubuntu.com trusty Release.gpg
Get:3 http://nova.clouds.archive.ubuntu.com trusty-updates Release.gpg [933 B]
Hit http://nova.clouds.archive.ubuntu.com trusty Release
Get:4 http://nova.clouds.archive.ubuntu.com trusty-updates Release [59.7 kB]
Get:5 http://security.ubuntu.com trusty-security/main Sources [46.3 kB]
Hit http://nova.clouds.archive.ubuntu.com trusty/main Sources
Get:6 http://security.ubuntu.com trusty-security/universe Sources [10.8 kB]
Get:7 http://security.ubuntu.com trusty-security/main amd64 Packages [146 kB]
Hit http://nova.clouds.archive.ubuntu.com trusty/universe Sources
Hit http://nova.clouds.archive.ubuntu.com trusty/main amd64 Packages
Hit http://nova.clouds.archive.ubuntu.com trusty/universe amd64 Packages
Hit http://nova.clouds.archive.ubuntu.com trusty/main Translation-en
Hit http://nova.clouds.archive.ubuntu.com trusty/universe Translation-en
Get:8 http://security.ubuntu.com trusty-security/universe amd64 Packages [49.1 kB]
Hit http://security.ubuntu.com trusty-security/main Translation-en
Hit http://security.ubuntu.com trusty-security/universe Translation-en
Ign http://nova.clouds.archive.ubuntu.com trusty/main Translation-en_US
Ign http://nova.clouds.archive.ubuntu.com trusty/universe Translation-en_US
Get:9 http://nova.clouds.archive.ubuntu.com trusty-updates/main Sources [125 kB]
Get:10 http://nova.clouds.archive.ubuntu.com trusty-updates/universe Sources [86.2 kB]
Get:11 http://nova.clouds.archive.ubuntu.com trusty-updates/main amd64 Packages [336 kB]
Get:12 http://nova.clouds.archive.ubuntu.com trusty-updates/universe amd64 Packages [209 kB]
Hit http://nova.clouds.archive.ubuntu.com trusty-updates/main Translation-en
Hit http://nova.clouds.archive.ubuntu.com trusty-updates/universe Translation-en
Fetched 1,129 kB in 0s (1,852 kB/s)
Reading package lists...
sudo DEBIAN_FRONTEND=noninteractive apt-get -y \
     --no-install-recommends install apache2 authbind bind9 bind9utils build-essential bzr-builddeb curl daemontools debhelper dh-apport distro-info dnsutils firefox freeipmi-tools gjs ipython isc-dhcp-common libjs-raphael libjs-yui3-full libjs-yui3-min libpq-dev make pep8 postgresql pyflakes python-amqplib python-bzrlib python-celery python-convoy python-crochet python-cssselect python-curtin python-dev python-distro-info python-django python-django-piston python-django-south python-djorm-ext-pgarray python-docutils python-extras python-fixtures python-flake8 python-formencode python-hivex python-httplib2 python-jinja2 python-jsonschema python-lockfile python-lxml python-mimeparse python-mock python-netaddr python-netifaces python-nose python-oauth python-oops python-oops-amqp python-oops-datedir-repo python-oops-twisted ...

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/maasserver/api/tests/test_nodegroup.py'
2--- src/maasserver/api/tests/test_nodegroup.py 2014-09-24 14:20:51 +0000
3+++ src/maasserver/api/tests/test_nodegroup.py 2014-10-03 14:52:27 +0000
4@@ -218,6 +218,9 @@
5 self.assertEqual(httplib.FORBIDDEN, response.status_code)
6
7 def test_import_boot_images_for_all_accepted_clusters(self):
8+ # Patch out the is_connected method of NodeGroup so as to avoid
9+ # extra calls to getClient()
10+ self.patch(nodegroup_module.NodeGroup, 'is_connected')
11 self.become_admin()
12 mock_getClientFor = self.patch(nodegroup_module, 'getClientFor')
13 accepted_nodegroups = [
14
15=== modified file 'src/maasserver/enum.py'
16--- src/maasserver/enum.py 2014-09-27 02:22:32 +0000
17+++ src/maasserver/enum.py 2014-10-03 14:52:27 +0000
18@@ -41,6 +41,7 @@
19 """Major moving parts of the application that may have failure states."""
20 PSERV = 'provisioning server'
21 IMPORT_PXE_FILES = 'maas-import-pxe-files script'
22+ CLUSTERS = 'clusters'
23
24
25 class NODE_STATUS:
26
27=== modified file 'src/maasserver/middleware.py'
28--- src/maasserver/middleware.py 2014-10-01 16:50:53 +0000
29+++ src/maasserver/middleware.py 2014-10-03 14:52:27 +0000
30@@ -44,6 +44,14 @@
31 HttpResponseForbidden,
32 HttpResponseRedirect,
33 )
34+from maasserver.components import (
35+ discard_persistent_error,
36+ register_persistent_error,
37+ )
38+from maasserver.enum import (
39+ COMPONENT,
40+ NODEGROUP_STATUS,
41+ )
42 from maasserver.models.nodegroup import NodeGroup
43 from provisioningserver.rpc.exceptions import (
44 MultipleFailures,
45@@ -139,8 +147,19 @@
46 # disturbing the handling of the request. Proper error reporting
47 # should be handled in the check method itself.
48 try:
49- # TODO: Components checks here.
50- pass
51+ # Check cluster connectivity.
52+ accepted_clusters = NodeGroup.objects.filter(
53+ status=NODEGROUP_STATUS.ACCEPTED)
54+ disconnected_clusters_found = not all(
55+ cluster.is_connected() for cluster in accepted_clusters)
56+ if disconnected_clusters_found:
57+ register_persistent_error(
58+ COMPONENT.CLUSTERS,
59+ "One or more clusters are currently disconnected. Visit "
60+ "the <a href=\"%s\">clusters page</a> for more "
61+ "information." % reverse('cluster-list'))
62+ else:
63+ discard_persistent_error(COMPONENT.CLUSTERS)
64 except Exception:
65 pass
66 return None
67
68=== modified file 'src/maasserver/tests/test_middleware.py'
69--- src/maasserver/tests/test_middleware.py 2014-10-01 16:50:53 +0000
70+++ src/maasserver/tests/test_middleware.py 2014-10-03 14:52:27 +0000
71@@ -25,9 +25,18 @@
72 PermissionDenied,
73 ValidationError,
74 )
75+from django.core.urlresolvers import reverse
76 from django.http import HttpResponse
77 from django.http.request import build_request_repr
78 from fixtures import FakeLogger
79+from maasserver.components import (
80+ get_persistent_error,
81+ register_persistent_error,
82+ )
83+from maasserver.enum import (
84+ COMPONENT,
85+ NODEGROUP_STATUS,
86+ )
87 from maasserver.exceptions import (
88 ExternalComponentException,
89 MAASAPIException,
90@@ -40,11 +49,17 @@
91 DebuggingLoggerMiddleware,
92 ErrorsMiddleware,
93 ExceptionMiddleware,
94+ ExternalComponentsMiddleware,
95 RPCErrorsMiddleware,
96 )
97+from maasserver.models import nodegroup as nodegroup_module
98 from maasserver.testing import extract_redirect
99 from maasserver.testing.factory import factory
100 from maasserver.testing.testcase import MAASServerTestCase
101+from maastesting.matchers import (
102+ MockCalledOnceWith,
103+ MockNotCalled,
104+ )
105 from maastesting.utils import sample_binary_data
106 from provisioningserver.rpc.exceptions import (
107 MultipleFailures,
108@@ -506,3 +521,78 @@
109 "but that's just peanuts to space!")
110 response = middleware.process_exception(request, exception)
111 self.assertIsNone(response)
112+
113+
114+class ExternalComponentsMiddlewareTest(MAASServerTestCase):
115+ """Tests for the ExternalComponentsMiddleware."""
116+
117+ def test__checks_connectivity_of_accepted_clusters(self):
118+ get_client_for = self.patch(nodegroup_module, 'getClientFor')
119+ middleware = ExternalComponentsMiddleware()
120+ cluster = factory.make_NodeGroup(status=NODEGROUP_STATUS.ACCEPTED)
121+
122+ request = factory.make_fake_request(factory.make_string(), 'GET')
123+ middleware.process_request(request)
124+
125+ self.assertThat(
126+ get_client_for, MockCalledOnceWith(cluster.uuid, timeout=0))
127+
128+ def test__ignores_non_accepted_clusters(self):
129+ get_client_for = self.patch(nodegroup_module, 'getClientFor')
130+ factory.make_NodeGroup(
131+ status=factory.pick_enum(
132+ NODEGROUP_STATUS, but_not=[NODEGROUP_STATUS.ACCEPTED]))
133+ middleware = ExternalComponentsMiddleware()
134+ request = factory.make_fake_request(factory.make_string(), 'GET')
135+ middleware.process_request(request)
136+
137+ self.assertThat(get_client_for, MockNotCalled())
138+
139+ def test__registers_error_if_all_clusters_are_disconnected(self):
140+ get_client_for = self.patch(nodegroup_module, 'getClientFor')
141+ get_client_for.side_effect = NoConnectionsAvailable(
142+ "Why, it's a jet-propelled, guided NAAFI!")
143+ middleware = ExternalComponentsMiddleware()
144+ factory.make_NodeGroup(status=NODEGROUP_STATUS.ACCEPTED)
145+
146+ request = factory.make_fake_request(factory.make_string(), 'GET')
147+ middleware.process_request(request)
148+
149+ error = get_persistent_error(COMPONENT.CLUSTERS)
150+ self.assertEqual(
151+ "One or more clusters are currently disconnected. Visit the "
152+ "<a href=\"%s\">clusters page</a> for more information." %
153+ reverse('cluster-list'),
154+ error)
155+
156+ def test__registers_error_if_any_clusters_are_disconnected(self):
157+ get_client_for = self.patch(nodegroup_module, 'getClientFor')
158+ get_client_for.side_effect = [
159+ NoConnectionsAvailable("Why, it's a jet-propelled, guided NAAFI!"),
160+ None,
161+ ]
162+ middleware = ExternalComponentsMiddleware()
163+ factory.make_NodeGroup(status=NODEGROUP_STATUS.ACCEPTED)
164+
165+ request = factory.make_fake_request(factory.make_string(), 'GET')
166+ middleware.process_request(request)
167+
168+ error = get_persistent_error(COMPONENT.CLUSTERS)
169+ self.assertEqual(
170+ "One or more clusters are currently disconnected. Visit the "
171+ "<a href=\"%s\">clusters page</a> for more information." %
172+ reverse('cluster-list'),
173+ error)
174+
175+ def test__removes_error_once_all_clusters_are_connected(self):
176+ # Patch getClientFor() to ensure that we don't actually try to
177+ # connect to the cluster.
178+ self.patch(nodegroup_module, 'getClientFor')
179+ middleware = ExternalComponentsMiddleware()
180+ factory.make_NodeGroup(status=NODEGROUP_STATUS.ACCEPTED)
181+
182+ register_persistent_error(
183+ COMPONENT.CLUSTERS, "Who flung that batter pudding?")
184+ request = factory.make_fake_request(factory.make_string(), 'GET')
185+ middleware.process_request(request)
186+ self.assertIsNone(get_persistent_error(COMPONENT.CLUSTERS))