Merge lp:~blake-rouse/maas/power-query-multiple into lp:~maas-committers/maas/trunk

Proposed by Blake Rouse
Status: Merged
Approved by: Blake Rouse
Approved revision: no longer in the source branch.
Merged at revision: 3274
Proposed branch: lp:~blake-rouse/maas/power-query-multiple
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 232 lines (+98/-30)
4 files modified
src/provisioningserver/pserv_services/node_power_monitor_service.py (+4/-1)
src/provisioningserver/pserv_services/tests/test_node_power_monitor_service.py (+9/-2)
src/provisioningserver/rpc/power.py (+47/-24)
src/provisioningserver/rpc/tests/test_power.py (+38/-3)
To merge this branch: bzr merge lp:~blake-rouse/maas/power-query-multiple
Reviewer Review Type Date Requested Status
Gavin Panella (community) Approve
Blake Rouse (community) Abstain
Review via email: mp+237499@code.launchpad.net

Commit message

Run 5 queries in parallel, when querying the power status of nodes.

To post a comment you must log in.
Revision history for this message
Julian Edwards (julian-edwards) wrote :

Looking good so far. I have a suggestion to make the parallelism better since you're blocking on arbitrary Deferreds rather than the ones that might have fired.

Revision history for this message
Blake Rouse (blake-rouse) wrote :

I have updated the branch to stop the blocking on just the oldest defer, if you could give it another test that would be great.

review: Approve
Revision history for this message
Blake Rouse (blake-rouse) wrote :

Haha. Sorry about the "Approve" I must press that to much. ;-)

Revision history for this message
Blake Rouse (blake-rouse) :
review: Abstain
Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Friday 10 Oct 2014 12:34:47 you wrote:
> Haha. Sorry about the "Approve" I must press that to much. ;-)

I laughed :)

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Friday 10 Oct 2014 12:34:07 you wrote:
> I think I will keep it the way it is, because it allows me to make the log
> message type different per exception and the message different for each. I

You could still do that with the single callback, FWIW.

> choose not to catch general exceptions, do you think this is a good idea?
> Or should I just catch all exceptions and maaslog as an unknown error
> printing the exception message.

I think the service has a general catch-all at the top doesn't it? Remember
that the service stops completely if there's any exception, so make sure
that's the case. I think it's fine to omit the catch-all here if there's one
at the top of the service.

Right, I'll go and test this now, the code looks much nicer now with the
semaphore.

Revision history for this message
Julian Edwards (julian-edwards) :
Revision history for this message
Julian Edwards (julian-edwards) wrote :

Ok tested and it doesn't work. I manually powered up a node and it was not reflected in the UI.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

(And I have a good test rig here, I intentionally left power disconnected from some of the BMCs so I have a mix of Microservers with power, some without and some NUCs)

Revision history for this message
Blake Rouse (blake-rouse) wrote :

There is an issue using semaphore.run so I went back to using acquire and release.

I was able to test this, and it is now working correctly.

Revision history for this message
Gavin Panella (allenap) wrote :

Not a full review yet, but some questions.

review: Needs Information
Revision history for this message
Blake Rouse (blake-rouse) :
Revision history for this message
Gavin Panella (allenap) wrote :

Comment about handling failures.

Revision history for this message
Gavin Panella (allenap) wrote :

Info about cooperate/coiterate.

Revision history for this message
Gavin Panella (allenap) wrote :

Looks good.

review: Approve
Revision history for this message
Gavin Panella (allenap) wrote :

You may also need http://paste.ubuntu.com/8578497/ to get test_node_power_monitor_service to run successfully. The tests need to run in the reactor thread, otherwise the behaviour of @asynchronous becomes blocking.

Revision history for this message
Gavin Panella (allenap) wrote :

Applying http://paste.ubuntu.com/8578519/ (to use semaphore.run) in addition to the above also appears to work fine.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/provisioningserver/pserv_services/node_power_monitor_service.py'
2--- src/provisioningserver/pserv_services/node_power_monitor_service.py 2014-08-27 09:41:37 +0000
3+++ src/provisioningserver/pserv_services/node_power_monitor_service.py 2014-10-20 21:07:30 +0000
4@@ -43,6 +43,7 @@
5 """Service to monitor the power status of all nodes in this cluster."""
6
7 check_interval = timedelta(minutes=5).total_seconds()
8+ max_nodes_at_once = 5
9
10 def __init__(self, cluster_uuid, clock=None):
11 # Call self.query_nodes() every self.check_interval.
12@@ -92,4 +93,6 @@
13 uuid)
14 else:
15 node_power_parameters = response['nodes']
16- yield query_all_nodes(node_power_parameters, clock=self.clock)
17+ yield query_all_nodes(
18+ node_power_parameters,
19+ max_concurrency=self.max_nodes_at_once, clock=self.clock)
20
21=== modified file 'src/provisioningserver/pserv_services/tests/test_node_power_monitor_service.py'
22--- src/provisioningserver/pserv_services/tests/test_node_power_monitor_service.py 2014-08-27 09:41:37 +0000
23+++ src/provisioningserver/pserv_services/tests/test_node_power_monitor_service.py 2014-10-20 21:07:30 +0000
24@@ -21,7 +21,10 @@
25 MockCalledOnceWith,
26 MockCallsMatch,
27 )
28-from maastesting.testcase import MAASTestCase
29+from maastesting.testcase import (
30+ MAASTestCase,
31+ MAASTwistedRunTest,
32+ )
33 from mock import (
34 ANY,
35 call,
36@@ -48,6 +51,8 @@
37
38 class TestNodePowerMonitorService(MAASTestCase):
39
40+ run_tests_with = MAASTwistedRunTest.make_factory(timeout=5)
41+
42 def test_init_sets_up_timer_correctly(self):
43 cluster_uuid = factory.make_UUID()
44 service = npms.NodePowerMonitorService(cluster_uuid)
45@@ -118,7 +123,9 @@
46 self.assertEqual(None, extract_result(d))
47 self.assertThat(
48 query_all_nodes,
49- MockCalledOnceWith([], clock=service.clock))
50+ MockCalledOnceWith(
51+ [], max_concurrency=service.max_nodes_at_once,
52+ clock=service.clock))
53
54 def test_query_nodes_copes_with_NoSuchCluster(self):
55 cluster_uuid, service = self.make_monitor_service()
56
57=== modified file 'src/provisioningserver/rpc/power.py'
58--- src/provisioningserver/rpc/power.py 2014-10-06 11:02:29 +0000
59+++ src/provisioningserver/rpc/power.py 2014-10-20 21:07:30 +0000
60@@ -18,6 +18,7 @@
61 ]
62
63 from datetime import timedelta
64+from functools import partial
65
66 from provisioningserver.events import (
67 EVENT_TYPES,
68@@ -46,6 +47,8 @@
69 )
70 from twisted.internet import reactor
71 from twisted.internet.defer import (
72+ DeferredList,
73+ DeferredSemaphore,
74 inlineCallbacks,
75 returnValue,
76 )
77@@ -314,29 +317,49 @@
78 raise PowerActionFail(error)
79
80
81-@asynchronous
82-@inlineCallbacks
83-def query_all_nodes(nodes, clock=reactor):
84+def maaslog_query_failure(node, failure):
85+ """Log failure to query node."""
86+ if failure.check(PowerActionFail):
87+ maaslog.error(
88+ "%s: Failed to query power state: %s.",
89+ node['hostname'], failure.getErrorMessage())
90+ elif failure.check(NoSuchNode):
91+ maaslog.debug(
92+ "%s: Could not update power status; "
93+ "no such node.", node['hostname'])
94+ else:
95+ maaslog.error(
96+ "%s: Failed to query power state, unknown error: %s",
97+ node['hostname'], failure.getErrorMessage())
98+
99+
100+def maaslog_query(node, power_state):
101+ """Log change in power state for node."""
102+ if node['power_state'] != power_state:
103+ maaslog.info(
104+ "%s: Power state has changed from %s to %s.", node['hostname'],
105+ node['power_state'], power_state)
106+ return power_state
107+
108+
109+def _query_node(node, clock):
110+ """Performs `power_query` on the given node."""
111+ # Start the query of the power state for the given node, logging to
112+ # maaslog as errors and power states change.
113+ d = get_power_state(
114+ node['system_id'], node['hostname'],
115+ node['power_type'], node['context'],
116+ clock=clock)
117+ d.addCallbacks(
118+ partial(maaslog_query, node),
119+ partial(maaslog_query_failure, node))
120+ return d
121+
122+
123+def query_all_nodes(nodes, max_concurrency=5, clock=reactor):
124 """Performs `power_query` on all nodes. If the nodes state has changed,
125 then that is sent back to the region."""
126- for node in nodes:
127- system_id = node['system_id']
128- hostname = node['hostname']
129- power_state_recorded = node['power_state']
130- try:
131- power_state_observed = yield get_power_state(
132- system_id, hostname, node['power_type'], node['context'],
133- clock=clock)
134- except PowerActionFail as e:
135- maaslog.error(
136- "%s: Failed to query power state: %s.", hostname, e)
137- continue
138- except NoSuchNode:
139- maaslog.debug(
140- "%s: Could not update power status; "
141- "no such node.", hostname)
142- continue
143- if power_state_observed != power_state_recorded:
144- maaslog.info(
145- "%s: Power state has changed from %s to %s.", hostname,
146- power_state_recorded, power_state_observed)
147+ semaphore = DeferredSemaphore(tokens=max_concurrency)
148+ return DeferredList(
149+ semaphore.run(_query_node, node, clock)
150+ for node in nodes)
151
152=== modified file 'src/provisioningserver/rpc/tests/test_power.py'
153--- src/provisioningserver/rpc/tests/test_power.py 2014-10-06 11:02:29 +0000
154+++ src/provisioningserver/rpc/tests/test_power.py 2014-10-20 21:07:30 +0000
155@@ -58,9 +58,11 @@
156 from twisted.internet import reactor
157 from twisted.internet.defer import (
158 Deferred,
159+ fail,
160 inlineCallbacks,
161 maybeDeferred,
162 returnValue,
163+ succeed,
164 )
165 from twisted.internet.task import Clock
166
167@@ -620,7 +622,10 @@
168 # Report back that all nodes' power states are as recorded.
169 power_states = [node['power_state'] for node in nodes]
170 get_power_state = self.patch(power, 'get_power_state')
171- get_power_state.side_effect = power_states
172+ get_power_state.side_effect = [
173+ succeed(power_state)
174+ for power_state in power_states
175+ ]
176
177 yield power.query_all_nodes(nodes)
178 self.assertThat(get_power_state, MockCallsMatch(*(
179@@ -638,7 +643,7 @@
180 get_power_state = self.patch(power, 'get_power_state')
181 error_msg = factory.make_name("error")
182 get_power_state.side_effect = [
183- PowerActionFail(error_msg), new_state_2]
184+ fail(PowerActionFail(error_msg)), succeed(new_state_2)]
185
186 with FakeLogger("maas.power", level=logging.DEBUG) as maaslog:
187 yield power.query_all_nodes([node1, node2])
188@@ -656,7 +661,7 @@
189 new_state_2 = self.pick_alternate_state(node2['power_state'])
190 get_power_state = self.patch(power, 'get_power_state')
191 get_power_state.side_effect = [
192- exceptions.NoSuchNode(), new_state_2]
193+ fail(exceptions.NoSuchNode()), succeed(new_state_2)]
194
195 with FakeLogger("maas.power", level=logging.DEBUG) as maaslog:
196 yield power.query_all_nodes([node1, node2])
197@@ -668,6 +673,36 @@
198 """,
199 maaslog.output)
200
201+ @inlineCallbacks
202+ def test_query_all_nodes_swallows_Exception(self):
203+ node1, node2 = self.make_nodes(2)
204+ new_state_2 = self.pick_alternate_state(node2['power_state'])
205+ get_power_state = self.patch(power, 'get_power_state')
206+ get_power_state.side_effect = [
207+ fail(Exception('unknown')), succeed(new_state_2)]
208+
209+ with FakeLogger("maas.power", level=logging.DEBUG) as maaslog:
210+ yield power.query_all_nodes([node1, node2])
211+
212+ self.assertDocTestMatches(
213+ """\
214+ hostname-...: Failed to query power state, unknown error: unknown
215+ hostname-...: Power state has changed from ... to ...
216+ """,
217+ maaslog.output)
218+
219+ @inlineCallbacks
220+ def test_query_all_nodes_returns_deferredlist_of_number_of_nodes(self):
221+ node1, node2 = self.make_nodes(2)
222+ get_power_state = self.patch(power, 'get_power_state')
223+ get_power_state.side_effect = [
224+ succeed(node1['power_state']), succeed(node2['power_state'])]
225+
226+ results = yield power.query_all_nodes([node1, node2])
227+ self.assertEqual(
228+ [(True, node1['power_state']), (True, node2['power_state'])],
229+ results)
230+
231
232 class TestMaybeChangePowerState(MAASTestCase):
233