Merge lp:~blake-rouse/maas/power-query-multiple into lp:~maas-committers/maas/trunk

Proposed by Blake Rouse
Status: Merged
Approved by: Blake Rouse
Approved revision: no longer in the source branch.
Merged at revision: 3274
Proposed branch: lp:~blake-rouse/maas/power-query-multiple
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 232 lines (+98/-30)
4 files modified
src/provisioningserver/pserv_services/node_power_monitor_service.py (+4/-1)
src/provisioningserver/pserv_services/tests/test_node_power_monitor_service.py (+9/-2)
src/provisioningserver/rpc/power.py (+47/-24)
src/provisioningserver/rpc/tests/test_power.py (+38/-3)
To merge this branch: bzr merge lp:~blake-rouse/maas/power-query-multiple
Reviewer Review Type Date Requested Status
Gavin Panella (community) Approve
Blake Rouse (community) Abstain
Review via email: mp+237499@code.launchpad.net

Commit message

Run 5 queries in parallel, when querying the power status of nodes.

To post a comment you must log in.
Revision history for this message
Julian Edwards (julian-edwards) wrote :

Looking good so far. I have a suggestion to make the parallelism better since you're blocking on arbitrary Deferreds rather than the ones that might have fired.

Revision history for this message
Blake Rouse (blake-rouse) wrote :

I have updated the branch to stop the blocking on just the oldest defer, if you could give it another test that would be great.

review: Approve
Revision history for this message
Blake Rouse (blake-rouse) wrote :

Haha. Sorry about the "Approve" I must press that to much. ;-)

Revision history for this message
Blake Rouse (blake-rouse) :
review: Abstain
Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Friday 10 Oct 2014 12:34:47 you wrote:
> Haha. Sorry about the "Approve" I must press that to much. ;-)

I laughed :)

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Friday 10 Oct 2014 12:34:07 you wrote:
> I think I will keep it the way it is, because it allows me to make the log
> message type different per exception and the message different for each. I

You could still do that with the single callback, FWIW.

> choose not to catch general exceptions, do you think this is a good idea?
> Or should I just catch all exceptions and maaslog as an unknown error
> printing the exception message.

I think the service has a general catch-all at the top doesn't it? Remember
that the service stops completely if there's any exception, so make sure
that's the case. I think it's fine to omit the catch-all here if there's one
at the top of the service.

Right, I'll go and test this now, the code looks much nicer now with the
semaphore.

Revision history for this message
Julian Edwards (julian-edwards) :
Revision history for this message
Julian Edwards (julian-edwards) wrote :

Ok tested and it doesn't work. I manually powered up a node and it was not reflected in the UI.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

(And I have a good test rig here, I intentionally left power disconnected from some of the BMCs so I have a mix of Microservers with power, some without and some NUCs)

Revision history for this message
Blake Rouse (blake-rouse) wrote :

There is an issue using semaphore.run so I went back to using acquire and release.

I was able to test this, and it is now working correctly.

Revision history for this message
Gavin Panella (allenap) wrote :

Not a full review yet, but some questions.

review: Needs Information
Revision history for this message
Blake Rouse (blake-rouse) :
Revision history for this message
Gavin Panella (allenap) wrote :

Comment about handling failures.

Revision history for this message
Gavin Panella (allenap) wrote :

Info about cooperate/coiterate.

Revision history for this message
Gavin Panella (allenap) wrote :

Looks good.

review: Approve
Revision history for this message
Gavin Panella (allenap) wrote :

You may also need http://paste.ubuntu.com/8578497/ to get test_node_power_monitor_service to run successfully. The tests need to run in the reactor thread, otherwise the behaviour of @asynchronous becomes blocking.

Revision history for this message
Gavin Panella (allenap) wrote :

Applying http://paste.ubuntu.com/8578519/ (to use semaphore.run) in addition to the above also appears to work fine.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'src/provisioningserver/pserv_services/node_power_monitor_service.py'
--- src/provisioningserver/pserv_services/node_power_monitor_service.py 2014-08-27 09:41:37 +0000
+++ src/provisioningserver/pserv_services/node_power_monitor_service.py 2014-10-20 21:07:30 +0000
@@ -43,6 +43,7 @@
43 """Service to monitor the power status of all nodes in this cluster."""43 """Service to monitor the power status of all nodes in this cluster."""
4444
45 check_interval = timedelta(minutes=5).total_seconds()45 check_interval = timedelta(minutes=5).total_seconds()
46 max_nodes_at_once = 5
4647
47 def __init__(self, cluster_uuid, clock=None):48 def __init__(self, cluster_uuid, clock=None):
48 # Call self.query_nodes() every self.check_interval.49 # Call self.query_nodes() every self.check_interval.
@@ -92,4 +93,6 @@
92 uuid)93 uuid)
93 else:94 else:
94 node_power_parameters = response['nodes']95 node_power_parameters = response['nodes']
95 yield query_all_nodes(node_power_parameters, clock=self.clock)96 yield query_all_nodes(
97 node_power_parameters,
98 max_concurrency=self.max_nodes_at_once, clock=self.clock)
9699
=== modified file 'src/provisioningserver/pserv_services/tests/test_node_power_monitor_service.py'
--- src/provisioningserver/pserv_services/tests/test_node_power_monitor_service.py 2014-08-27 09:41:37 +0000
+++ src/provisioningserver/pserv_services/tests/test_node_power_monitor_service.py 2014-10-20 21:07:30 +0000
@@ -21,7 +21,10 @@
21 MockCalledOnceWith,21 MockCalledOnceWith,
22 MockCallsMatch,22 MockCallsMatch,
23 )23 )
24from maastesting.testcase import MAASTestCase24from maastesting.testcase import (
25 MAASTestCase,
26 MAASTwistedRunTest,
27 )
25from mock import (28from mock import (
26 ANY,29 ANY,
27 call,30 call,
@@ -48,6 +51,8 @@
4851
49class TestNodePowerMonitorService(MAASTestCase):52class TestNodePowerMonitorService(MAASTestCase):
5053
54 run_tests_with = MAASTwistedRunTest.make_factory(timeout=5)
55
51 def test_init_sets_up_timer_correctly(self):56 def test_init_sets_up_timer_correctly(self):
52 cluster_uuid = factory.make_UUID()57 cluster_uuid = factory.make_UUID()
53 service = npms.NodePowerMonitorService(cluster_uuid)58 service = npms.NodePowerMonitorService(cluster_uuid)
@@ -118,7 +123,9 @@
118 self.assertEqual(None, extract_result(d))123 self.assertEqual(None, extract_result(d))
119 self.assertThat(124 self.assertThat(
120 query_all_nodes,125 query_all_nodes,
121 MockCalledOnceWith([], clock=service.clock))126 MockCalledOnceWith(
127 [], max_concurrency=service.max_nodes_at_once,
128 clock=service.clock))
122129
123 def test_query_nodes_copes_with_NoSuchCluster(self):130 def test_query_nodes_copes_with_NoSuchCluster(self):
124 cluster_uuid, service = self.make_monitor_service()131 cluster_uuid, service = self.make_monitor_service()
125132
=== modified file 'src/provisioningserver/rpc/power.py'
--- src/provisioningserver/rpc/power.py 2014-10-06 11:02:29 +0000
+++ src/provisioningserver/rpc/power.py 2014-10-20 21:07:30 +0000
@@ -18,6 +18,7 @@
18]18]
1919
20from datetime import timedelta20from datetime import timedelta
21from functools import partial
2122
22from provisioningserver.events import (23from provisioningserver.events import (
23 EVENT_TYPES,24 EVENT_TYPES,
@@ -46,6 +47,8 @@
46 )47 )
47from twisted.internet import reactor48from twisted.internet import reactor
48from twisted.internet.defer import (49from twisted.internet.defer import (
50 DeferredList,
51 DeferredSemaphore,
49 inlineCallbacks,52 inlineCallbacks,
50 returnValue,53 returnValue,
51 )54 )
@@ -314,29 +317,49 @@
314 raise PowerActionFail(error)317 raise PowerActionFail(error)
315318
316319
317@asynchronous320def maaslog_query_failure(node, failure):
318@inlineCallbacks321 """Log failure to query node."""
319def query_all_nodes(nodes, clock=reactor):322 if failure.check(PowerActionFail):
323 maaslog.error(
324 "%s: Failed to query power state: %s.",
325 node['hostname'], failure.getErrorMessage())
326 elif failure.check(NoSuchNode):
327 maaslog.debug(
328 "%s: Could not update power status; "
329 "no such node.", node['hostname'])
330 else:
331 maaslog.error(
332 "%s: Failed to query power state, unknown error: %s",
333 node['hostname'], failure.getErrorMessage())
334
335
336def maaslog_query(node, power_state):
337 """Log change in power state for node."""
338 if node['power_state'] != power_state:
339 maaslog.info(
340 "%s: Power state has changed from %s to %s.", node['hostname'],
341 node['power_state'], power_state)
342 return power_state
343
344
345def _query_node(node, clock):
346 """Performs `power_query` on the given node."""
347 # Start the query of the power state for the given node, logging to
348 # maaslog as errors and power states change.
349 d = get_power_state(
350 node['system_id'], node['hostname'],
351 node['power_type'], node['context'],
352 clock=clock)
353 d.addCallbacks(
354 partial(maaslog_query, node),
355 partial(maaslog_query_failure, node))
356 return d
357
358
359def query_all_nodes(nodes, max_concurrency=5, clock=reactor):
320 """Performs `power_query` on all nodes. If the nodes state has changed,360 """Performs `power_query` on all nodes. If the nodes state has changed,
321 then that is sent back to the region."""361 then that is sent back to the region."""
322 for node in nodes:362 semaphore = DeferredSemaphore(tokens=max_concurrency)
323 system_id = node['system_id']363 return DeferredList(
324 hostname = node['hostname']364 semaphore.run(_query_node, node, clock)
325 power_state_recorded = node['power_state']365 for node in nodes)
326 try:
327 power_state_observed = yield get_power_state(
328 system_id, hostname, node['power_type'], node['context'],
329 clock=clock)
330 except PowerActionFail as e:
331 maaslog.error(
332 "%s: Failed to query power state: %s.", hostname, e)
333 continue
334 except NoSuchNode:
335 maaslog.debug(
336 "%s: Could not update power status; "
337 "no such node.", hostname)
338 continue
339 if power_state_observed != power_state_recorded:
340 maaslog.info(
341 "%s: Power state has changed from %s to %s.", hostname,
342 power_state_recorded, power_state_observed)
343366
=== modified file 'src/provisioningserver/rpc/tests/test_power.py'
--- src/provisioningserver/rpc/tests/test_power.py 2014-10-06 11:02:29 +0000
+++ src/provisioningserver/rpc/tests/test_power.py 2014-10-20 21:07:30 +0000
@@ -58,9 +58,11 @@
58from twisted.internet import reactor58from twisted.internet import reactor
59from twisted.internet.defer import (59from twisted.internet.defer import (
60 Deferred,60 Deferred,
61 fail,
61 inlineCallbacks,62 inlineCallbacks,
62 maybeDeferred,63 maybeDeferred,
63 returnValue,64 returnValue,
65 succeed,
64 )66 )
65from twisted.internet.task import Clock67from twisted.internet.task import Clock
6668
@@ -620,7 +622,10 @@
620 # Report back that all nodes' power states are as recorded.622 # Report back that all nodes' power states are as recorded.
621 power_states = [node['power_state'] for node in nodes]623 power_states = [node['power_state'] for node in nodes]
622 get_power_state = self.patch(power, 'get_power_state')624 get_power_state = self.patch(power, 'get_power_state')
623 get_power_state.side_effect = power_states625 get_power_state.side_effect = [
626 succeed(power_state)
627 for power_state in power_states
628 ]
624629
625 yield power.query_all_nodes(nodes)630 yield power.query_all_nodes(nodes)
626 self.assertThat(get_power_state, MockCallsMatch(*(631 self.assertThat(get_power_state, MockCallsMatch(*(
@@ -638,7 +643,7 @@
638 get_power_state = self.patch(power, 'get_power_state')643 get_power_state = self.patch(power, 'get_power_state')
639 error_msg = factory.make_name("error")644 error_msg = factory.make_name("error")
640 get_power_state.side_effect = [645 get_power_state.side_effect = [
641 PowerActionFail(error_msg), new_state_2]646 fail(PowerActionFail(error_msg)), succeed(new_state_2)]
642647
643 with FakeLogger("maas.power", level=logging.DEBUG) as maaslog:648 with FakeLogger("maas.power", level=logging.DEBUG) as maaslog:
644 yield power.query_all_nodes([node1, node2])649 yield power.query_all_nodes([node1, node2])
@@ -656,7 +661,7 @@
656 new_state_2 = self.pick_alternate_state(node2['power_state'])661 new_state_2 = self.pick_alternate_state(node2['power_state'])
657 get_power_state = self.patch(power, 'get_power_state')662 get_power_state = self.patch(power, 'get_power_state')
658 get_power_state.side_effect = [663 get_power_state.side_effect = [
659 exceptions.NoSuchNode(), new_state_2]664 fail(exceptions.NoSuchNode()), succeed(new_state_2)]
660665
661 with FakeLogger("maas.power", level=logging.DEBUG) as maaslog:666 with FakeLogger("maas.power", level=logging.DEBUG) as maaslog:
662 yield power.query_all_nodes([node1, node2])667 yield power.query_all_nodes([node1, node2])
@@ -668,6 +673,36 @@
668 """,673 """,
669 maaslog.output)674 maaslog.output)
670675
676 @inlineCallbacks
677 def test_query_all_nodes_swallows_Exception(self):
678 node1, node2 = self.make_nodes(2)
679 new_state_2 = self.pick_alternate_state(node2['power_state'])
680 get_power_state = self.patch(power, 'get_power_state')
681 get_power_state.side_effect = [
682 fail(Exception('unknown')), succeed(new_state_2)]
683
684 with FakeLogger("maas.power", level=logging.DEBUG) as maaslog:
685 yield power.query_all_nodes([node1, node2])
686
687 self.assertDocTestMatches(
688 """\
689 hostname-...: Failed to query power state, unknown error: unknown
690 hostname-...: Power state has changed from ... to ...
691 """,
692 maaslog.output)
693
694 @inlineCallbacks
695 def test_query_all_nodes_returns_deferredlist_of_number_of_nodes(self):
696 node1, node2 = self.make_nodes(2)
697 get_power_state = self.patch(power, 'get_power_state')
698 get_power_state.side_effect = [
699 succeed(node1['power_state']), succeed(node2['power_state'])]
700
701 results = yield power.query_all_nodes([node1, node2])
702 self.assertEqual(
703 [(True, node1['power_state']), (True, node2['power_state'])],
704 results)
705
671706
672class TestMaybeChangePowerState(MAASTestCase):707class TestMaybeChangePowerState(MAASTestCase):
673708