Merge lp:~gmb/maas/add-alerts-for-disconnected-clusters-bug-1341121 into lp:~maas-committers/maas/trunk
- add-alerts-for-disconnected-clusters-bug-1341121
- Merge into trunk
Status: | Merged |
---|---|
Approved by: | Graham Binns |
Approved revision: | no longer in the source branch. |
Merged at revision: | 3195 |
Proposed branch: | lp:~gmb/maas/add-alerts-for-disconnected-clusters-bug-1341121 |
Merge into: | lp:~maas-committers/maas/trunk |
Diff against target: |
186 lines (+115/-2) 4 files modified
src/maasserver/api/tests/test_nodegroup.py (+3/-0) src/maasserver/enum.py (+1/-0) src/maasserver/middleware.py (+21/-2) src/maasserver/tests/test_middleware.py (+90/-0) |
To merge this branch: | bzr merge lp:~gmb/maas/add-alerts-for-disconnected-clusters-bug-1341121 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Raphaël Badin (community) | Approve | ||
Jeroen T. Vermeulen (community) | Approve | ||
Review via email: mp+237039@code.launchpad.net |
Commit message
Add a persistent error when there are disconnected clusters, and remove it when all the clusters are connected again.
Previously, the only way to know about clusters being disconnected was to go to the cluster listing and look.
Description of the change
Raphaël Badin (rvb) : | # |
Graham Binns (gmb) wrote : | # |
On 3 October 2014 12:09, Jeroen T. Vermeulen <email address hidden> wrote:
> I'm sure this will be appreciated, even if it may be a little annoying to some people with distributed clusters.
Definitely! Thank you! Don't worry about being annoying.
Oh, wait, you mean the branch. Nevermind ;P
> Review notes inline.
>
> Diff comments:
>
> Tiny suggestion: instead of...
>
> any(
> not cluster.
>
> you could probably write...
>
> not all(
> cluster.
Yep, done.
>> +class ExternalCompone
>> + """Tests for the ExternalCompone
>> +
>> + def test__checks_
>> + get_client_for = self.patch(
>> + middleware = ExternalCompone
>> + clusters = {
>> + factory.
>> + for _ in range(3)}
>
> Why are we creating 3 of everything? For luck? If the code is correct for 2, it's probably correct for 3. (And in Python, if it's correct for 1 chances are it's correct for 2. It's not like C where you might do *ptr instead of ptr[i] for every i.)
>
> Also, we've taken to putting the closing brace/bracket of a comprehension on a line of its own.
You know, I honestly don't know why. Habit, I guess. Three is a magic
number. It's not really relevant to have > 1 cluster in this test
anyway, I'll change it.
>> +
>> + request = factory.
>> + middleware.
>> +
>> + self.assertThat(
>> + get_client_for, MockCallsMatch(
>> + *[call(
>> +
>> + def test__ignores_
>> + get_client_for = self.patch(
>> + middleware = ExternalCompone
>> + statuses = (
>> + NODEGROUP_
>> + NODEGROUP_
>> + NODEGROUP_
>> + clusters = {
>> + factory.
>> + for _ in range(3)}
>
> Again with the three! But why bother testing three random items from a list of 3 values, when it's actually easier to test all values?
>
> Although in this case I personally would do it differently: write one simple test just for ACCEPTED, and one simple test for a random other status. In the event that we get it wrong for pending clusters but not rejected ones, or vice versa, that leaves a fifty-fifty chance of the test passing by accident on any given run, so up to you to judge how likely that is.
Right.
> Also, there's no need to enumerate the statuses manually; just use factory.
Ah, didn't know about that, thanks.
>> + accepted_
>> + cluster.uuid for cluster in clusters
>> + if cluster.status == NODEGROUP_
Graham Binns (gmb) wrote : | # |
On 3 October 2014 12:11, Raphaël Badin <email address hidden> wrote:
>
>
> Diff comments:
>
>> === modified file 'src/maasserver
>> --- src/maasserver/
>> +++ src/maasserver/
>> @@ -41,6 +41,7 @@
>> """Major moving parts of the application that may have failure states."""
>> PSERV = 'provisioning server'
>> IMPORT_PXE_FILES = 'maas-import-
>> + CLUSTERS = 'clusters'
>
> We might want to have other component errors related to clusters. Maybe this should be renamed "DISCONNECTED_
Agreed.
>> class NODE_STATUS:
>>
>> + accepted_clusters = NodeGroup.
>> + status=
>> + disconnected_
>> + not cluster.
>> + if disconnected_
>> + register_
>> + COMPONENT.CLUSTERS,
>> + "One or more clusters are currently disconnected. Visit "
>> + "the <a href=\"
>
> I think the link should apply to "clusters page" instead of just "clusters" but it's a detail.
>
Fixed.
Raphaël Badin (rvb) : | # |
MAAS Lander (maas-lander) wrote : | # |
The attempt to merge lp:~gmb/maas/add-alerts-for-disconnected-clusters-bug-1341121 into lp:maas failed. Below is the output from the failed tests.
Ign http://
Get:1 http://
Ign http://
Get:2 http://
Ign http://
Hit http://
Get:3 http://
Hit http://
Get:4 http://
Get:5 http://
Hit http://
Get:6 http://
Get:7 http://
Hit http://
Hit http://
Hit http://
Hit http://
Hit http://
Get:8 http://
Hit http://
Hit http://
Ign http://
Ign http://
Get:9 http://
Get:10 http://
Get:11 http://
Get:12 http://
Hit http://
Hit http://
Fetched 1,129 kB in 0s (1,852 kB/s)
Reading package lists...
sudo DEBIAN_
--
Preview Diff
1 | === modified file 'src/maasserver/api/tests/test_nodegroup.py' | |||
2 | --- src/maasserver/api/tests/test_nodegroup.py 2014-09-24 14:20:51 +0000 | |||
3 | +++ src/maasserver/api/tests/test_nodegroup.py 2014-10-03 14:52:27 +0000 | |||
4 | @@ -218,6 +218,9 @@ | |||
5 | 218 | self.assertEqual(httplib.FORBIDDEN, response.status_code) | 218 | self.assertEqual(httplib.FORBIDDEN, response.status_code) |
6 | 219 | 219 | ||
7 | 220 | def test_import_boot_images_for_all_accepted_clusters(self): | 220 | def test_import_boot_images_for_all_accepted_clusters(self): |
8 | 221 | # Patch out the is_connected method of NodeGroup so as to avoid | ||
9 | 222 | # extra calls to getClient() | ||
10 | 223 | self.patch(nodegroup_module.NodeGroup, 'is_connected') | ||
11 | 221 | self.become_admin() | 224 | self.become_admin() |
12 | 222 | mock_getClientFor = self.patch(nodegroup_module, 'getClientFor') | 225 | mock_getClientFor = self.patch(nodegroup_module, 'getClientFor') |
13 | 223 | accepted_nodegroups = [ | 226 | accepted_nodegroups = [ |
14 | 224 | 227 | ||
15 | === modified file 'src/maasserver/enum.py' | |||
16 | --- src/maasserver/enum.py 2014-09-27 02:22:32 +0000 | |||
17 | +++ src/maasserver/enum.py 2014-10-03 14:52:27 +0000 | |||
18 | @@ -41,6 +41,7 @@ | |||
19 | 41 | """Major moving parts of the application that may have failure states.""" | 41 | """Major moving parts of the application that may have failure states.""" |
20 | 42 | PSERV = 'provisioning server' | 42 | PSERV = 'provisioning server' |
21 | 43 | IMPORT_PXE_FILES = 'maas-import-pxe-files script' | 43 | IMPORT_PXE_FILES = 'maas-import-pxe-files script' |
22 | 44 | CLUSTERS = 'clusters' | ||
23 | 44 | 45 | ||
24 | 45 | 46 | ||
25 | 46 | class NODE_STATUS: | 47 | class NODE_STATUS: |
26 | 47 | 48 | ||
27 | === modified file 'src/maasserver/middleware.py' | |||
28 | --- src/maasserver/middleware.py 2014-10-01 16:50:53 +0000 | |||
29 | +++ src/maasserver/middleware.py 2014-10-03 14:52:27 +0000 | |||
30 | @@ -44,6 +44,14 @@ | |||
31 | 44 | HttpResponseForbidden, | 44 | HttpResponseForbidden, |
32 | 45 | HttpResponseRedirect, | 45 | HttpResponseRedirect, |
33 | 46 | ) | 46 | ) |
34 | 47 | from maasserver.components import ( | ||
35 | 48 | discard_persistent_error, | ||
36 | 49 | register_persistent_error, | ||
37 | 50 | ) | ||
38 | 51 | from maasserver.enum import ( | ||
39 | 52 | COMPONENT, | ||
40 | 53 | NODEGROUP_STATUS, | ||
41 | 54 | ) | ||
42 | 47 | from maasserver.models.nodegroup import NodeGroup | 55 | from maasserver.models.nodegroup import NodeGroup |
43 | 48 | from provisioningserver.rpc.exceptions import ( | 56 | from provisioningserver.rpc.exceptions import ( |
44 | 49 | MultipleFailures, | 57 | MultipleFailures, |
45 | @@ -139,8 +147,19 @@ | |||
46 | 139 | # disturbing the handling of the request. Proper error reporting | 147 | # disturbing the handling of the request. Proper error reporting |
47 | 140 | # should be handled in the check method itself. | 148 | # should be handled in the check method itself. |
48 | 141 | try: | 149 | try: |
51 | 142 | # TODO: Components checks here. | 150 | # Check cluster connectivity. |
52 | 143 | pass | 151 | accepted_clusters = NodeGroup.objects.filter( |
53 | 152 | status=NODEGROUP_STATUS.ACCEPTED) | ||
54 | 153 | disconnected_clusters_found = not all( | ||
55 | 154 | cluster.is_connected() for cluster in accepted_clusters) | ||
56 | 155 | if disconnected_clusters_found: | ||
57 | 156 | register_persistent_error( | ||
58 | 157 | COMPONENT.CLUSTERS, | ||
59 | 158 | "One or more clusters are currently disconnected. Visit " | ||
60 | 159 | "the <a href=\"%s\">clusters page</a> for more " | ||
61 | 160 | "information." % reverse('cluster-list')) | ||
62 | 161 | else: | ||
63 | 162 | discard_persistent_error(COMPONENT.CLUSTERS) | ||
64 | 144 | except Exception: | 163 | except Exception: |
65 | 145 | pass | 164 | pass |
66 | 146 | return None | 165 | return None |
67 | 147 | 166 | ||
68 | === modified file 'src/maasserver/tests/test_middleware.py' | |||
69 | --- src/maasserver/tests/test_middleware.py 2014-10-01 16:50:53 +0000 | |||
70 | +++ src/maasserver/tests/test_middleware.py 2014-10-03 14:52:27 +0000 | |||
71 | @@ -25,9 +25,18 @@ | |||
72 | 25 | PermissionDenied, | 25 | PermissionDenied, |
73 | 26 | ValidationError, | 26 | ValidationError, |
74 | 27 | ) | 27 | ) |
75 | 28 | from django.core.urlresolvers import reverse | ||
76 | 28 | from django.http import HttpResponse | 29 | from django.http import HttpResponse |
77 | 29 | from django.http.request import build_request_repr | 30 | from django.http.request import build_request_repr |
78 | 30 | from fixtures import FakeLogger | 31 | from fixtures import FakeLogger |
79 | 32 | from maasserver.components import ( | ||
80 | 33 | get_persistent_error, | ||
81 | 34 | register_persistent_error, | ||
82 | 35 | ) | ||
83 | 36 | from maasserver.enum import ( | ||
84 | 37 | COMPONENT, | ||
85 | 38 | NODEGROUP_STATUS, | ||
86 | 39 | ) | ||
87 | 31 | from maasserver.exceptions import ( | 40 | from maasserver.exceptions import ( |
88 | 32 | ExternalComponentException, | 41 | ExternalComponentException, |
89 | 33 | MAASAPIException, | 42 | MAASAPIException, |
90 | @@ -40,11 +49,17 @@ | |||
91 | 40 | DebuggingLoggerMiddleware, | 49 | DebuggingLoggerMiddleware, |
92 | 41 | ErrorsMiddleware, | 50 | ErrorsMiddleware, |
93 | 42 | ExceptionMiddleware, | 51 | ExceptionMiddleware, |
94 | 52 | ExternalComponentsMiddleware, | ||
95 | 43 | RPCErrorsMiddleware, | 53 | RPCErrorsMiddleware, |
96 | 44 | ) | 54 | ) |
97 | 55 | from maasserver.models import nodegroup as nodegroup_module | ||
98 | 45 | from maasserver.testing import extract_redirect | 56 | from maasserver.testing import extract_redirect |
99 | 46 | from maasserver.testing.factory import factory | 57 | from maasserver.testing.factory import factory |
100 | 47 | from maasserver.testing.testcase import MAASServerTestCase | 58 | from maasserver.testing.testcase import MAASServerTestCase |
101 | 59 | from maastesting.matchers import ( | ||
102 | 60 | MockCalledOnceWith, | ||
103 | 61 | MockNotCalled, | ||
104 | 62 | ) | ||
105 | 48 | from maastesting.utils import sample_binary_data | 63 | from maastesting.utils import sample_binary_data |
106 | 49 | from provisioningserver.rpc.exceptions import ( | 64 | from provisioningserver.rpc.exceptions import ( |
107 | 50 | MultipleFailures, | 65 | MultipleFailures, |
108 | @@ -506,3 +521,78 @@ | |||
109 | 506 | "but that's just peanuts to space!") | 521 | "but that's just peanuts to space!") |
110 | 507 | response = middleware.process_exception(request, exception) | 522 | response = middleware.process_exception(request, exception) |
111 | 508 | self.assertIsNone(response) | 523 | self.assertIsNone(response) |
112 | 524 | |||
113 | 525 | |||
114 | 526 | class ExternalComponentsMiddlewareTest(MAASServerTestCase): | ||
115 | 527 | """Tests for the ExternalComponentsMiddleware.""" | ||
116 | 528 | |||
117 | 529 | def test__checks_connectivity_of_accepted_clusters(self): | ||
118 | 530 | get_client_for = self.patch(nodegroup_module, 'getClientFor') | ||
119 | 531 | middleware = ExternalComponentsMiddleware() | ||
120 | 532 | cluster = factory.make_NodeGroup(status=NODEGROUP_STATUS.ACCEPTED) | ||
121 | 533 | |||
122 | 534 | request = factory.make_fake_request(factory.make_string(), 'GET') | ||
123 | 535 | middleware.process_request(request) | ||
124 | 536 | |||
125 | 537 | self.assertThat( | ||
126 | 538 | get_client_for, MockCalledOnceWith(cluster.uuid, timeout=0)) | ||
127 | 539 | |||
128 | 540 | def test__ignores_non_accepted_clusters(self): | ||
129 | 541 | get_client_for = self.patch(nodegroup_module, 'getClientFor') | ||
130 | 542 | factory.make_NodeGroup( | ||
131 | 543 | status=factory.pick_enum( | ||
132 | 544 | NODEGROUP_STATUS, but_not=[NODEGROUP_STATUS.ACCEPTED])) | ||
133 | 545 | middleware = ExternalComponentsMiddleware() | ||
134 | 546 | request = factory.make_fake_request(factory.make_string(), 'GET') | ||
135 | 547 | middleware.process_request(request) | ||
136 | 548 | |||
137 | 549 | self.assertThat(get_client_for, MockNotCalled()) | ||
138 | 550 | |||
139 | 551 | def test__registers_error_if_all_clusters_are_disconnected(self): | ||
140 | 552 | get_client_for = self.patch(nodegroup_module, 'getClientFor') | ||
141 | 553 | get_client_for.side_effect = NoConnectionsAvailable( | ||
142 | 554 | "Why, it's a jet-propelled, guided NAAFI!") | ||
143 | 555 | middleware = ExternalComponentsMiddleware() | ||
144 | 556 | factory.make_NodeGroup(status=NODEGROUP_STATUS.ACCEPTED) | ||
145 | 557 | |||
146 | 558 | request = factory.make_fake_request(factory.make_string(), 'GET') | ||
147 | 559 | middleware.process_request(request) | ||
148 | 560 | |||
149 | 561 | error = get_persistent_error(COMPONENT.CLUSTERS) | ||
150 | 562 | self.assertEqual( | ||
151 | 563 | "One or more clusters are currently disconnected. Visit the " | ||
152 | 564 | "<a href=\"%s\">clusters page</a> for more information." % | ||
153 | 565 | reverse('cluster-list'), | ||
154 | 566 | error) | ||
155 | 567 | |||
156 | 568 | def test__registers_error_if_any_clusters_are_disconnected(self): | ||
157 | 569 | get_client_for = self.patch(nodegroup_module, 'getClientFor') | ||
158 | 570 | get_client_for.side_effect = [ | ||
159 | 571 | NoConnectionsAvailable("Why, it's a jet-propelled, guided NAAFI!"), | ||
160 | 572 | None, | ||
161 | 573 | ] | ||
162 | 574 | middleware = ExternalComponentsMiddleware() | ||
163 | 575 | factory.make_NodeGroup(status=NODEGROUP_STATUS.ACCEPTED) | ||
164 | 576 | |||
165 | 577 | request = factory.make_fake_request(factory.make_string(), 'GET') | ||
166 | 578 | middleware.process_request(request) | ||
167 | 579 | |||
168 | 580 | error = get_persistent_error(COMPONENT.CLUSTERS) | ||
169 | 581 | self.assertEqual( | ||
170 | 582 | "One or more clusters are currently disconnected. Visit the " | ||
171 | 583 | "<a href=\"%s\">clusters page</a> for more information." % | ||
172 | 584 | reverse('cluster-list'), | ||
173 | 585 | error) | ||
174 | 586 | |||
175 | 587 | def test__removes_error_once_all_clusters_are_connected(self): | ||
176 | 588 | # Patch getClientFor() to ensure that we don't actually try to | ||
177 | 589 | # connect to the cluster. | ||
178 | 590 | self.patch(nodegroup_module, 'getClientFor') | ||
179 | 591 | middleware = ExternalComponentsMiddleware() | ||
180 | 592 | factory.make_NodeGroup(status=NODEGROUP_STATUS.ACCEPTED) | ||
181 | 593 | |||
182 | 594 | register_persistent_error( | ||
183 | 595 | COMPONENT.CLUSTERS, "Who flung that batter pudding?") | ||
184 | 596 | request = factory.make_fake_request(factory.make_string(), 'GET') | ||
185 | 597 | middleware.process_request(request) | ||
186 | 598 | self.assertIsNone(get_persistent_error(COMPONENT.CLUSTERS)) |
I'm sure this will be appreciated, even if it may be a little annoying to some people with distributed clusters.
Review notes inline.