Merge lp:~newell-jensen/maas/fix-1665839 into lp:~maas-committers/maas/trunk

Proposed by Newell Jensen
Status: Merged
Approved by: Newell Jensen
Approved revision: no longer in the source branch.
Merged at revision: 5753
Proposed branch: lp:~newell-jensen/maas/fix-1665839
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 39 lines (+12/-6)
1 file modified
src/provisioningserver/drivers/pod/rsd.py (+12/-6)
To merge this branch: bzr merge lp:~newell-jensen/maas/fix-1665839
Reviewer Review Type Date Requested Status
LaMont Jones (community) Approve
Review via email: mp+317899@code.launchpad.net

Commit message

Sieve the newly composed machine based off of node_id.

To post a comment you must log in.
Revision history for this message
Newell Jensen (newell-jensen) wrote :

This change is still covered by current unit tests.

Revision history for this message
Christian Reis (kiko) wrote :

On Tue, Feb 21, 2017 at 07:29:29PM -0000, Newell Jensen wrote:
> - discovered_machine = [m for m in new_machines if m not in machines][0]
> + discovered_machine = [
> + nm for nm in new_machines if (
> + nm.power_parameters.get('node_id') not in [
> + m.power_parameters.get('node_id') for m in machines])][0]

Ugh, I really don't like these nested comprehensions. The code is super
hard to understand!

Also, I don't like the assumption that this will only have one useful
item, which is not very defensive. Is this necessarily always
new_machines?

How about:

    discovered_machines = []
    power_node_ids = [m.power_parameters.get("node_id") for m in machines]
    for nm in new_machines:
        if nm.power_parameters.get('node_id') not in power_node_ids:
            discovered_machines.append(nm)

    # if you actually can guarantee that you are only getting one
    assert len(discovered_machines) == 1
    discovered_machine = discovered_machines[0]

This code may need a deeper look, as it's not very intuitive, and I
don't know it well.

Revision history for this message
Christian Reis (kiko) wrote :

On Tue, Feb 21, 2017 at 07:28:25PM -0000, Newell Jensen wrote:
> This change is still covered by current unit tests.

I don't get it -- you don't have a case which reproduces the bug, do you?

Revision history for this message
Newell Jensen (newell-jensen) wrote :

> On Tue, Feb 21, 2017 at 07:28:25PM -0000, Newell Jensen wrote:
> > This change is still covered by current unit tests.
>
> I don't get it -- you don't have a case which reproduces the bug, do you?

Before, the discovered_machine was being sieved based off of object equality. Something funky was going on with this equality check as I only saw the bug appear randomly. Maybe something to do with the object's __hash__ method, which Python uses to check object equality.

In any event, now that code is checking for the actual node_id (which I should have been checking for in the first place as this is ultimately what is needed) instead of checking for the new object from the list, I do not have a test to test the original bug.

However, the compose allocation error has been moved so we are testing whether or not the discovered_machine was found which tests whether or not this code fails to find a node_id for the new machine. The test_compose_raises_error_for_no_allocation tests this.

Thanks for the valid review comments. Hope all is going well for you as well Kiko :)

Revision history for this message
LaMont Jones (lamont) wrote :

Seems good to me.

Revision history for this message
LaMont Jones (lamont) wrote :

Seems good to me.

review: Approve
Revision history for this message
Christian Reis (kiko) wrote :

On Tue, Feb 21, 2017 at 09:20:57PM -0000, Newell Jensen wrote:
> - new_machines = yield self.get_pod_machines(url, headers)
> - if new_machines == machines:
> + # Sieve the new machine.
> + discovered_machine = None
> + current_machines = yield self.get_pod_machines(url, headers)
> + previous_node_ids = [
> + pm.power_parameters.get('node_id') for pm in previous_machines]
> + for cm in current_machines:
> + if cm.power_parameters.get('node_id') not in previous_node_ids:
> + discovered_machine = cm

Much nicer!

Can you explain why you only care about the first machine you find in
current_machines? This class of code is always a source of weird bugs --
the list might come sorted the way you expect, or it might contain
something unexpected.

If current_machines is always a list of 1 item, then assert() that. And
if not, checking that the list contains what you expect is a good idea.

Revision history for this message
Newell Jensen (newell-jensen) wrote :

> Can you explain why you only care about the first machine you find in
> current_machines?

There will only be a difference of one machine and that difference is the new machine that was just created in the RSD Pod.

> And if not, checking that the list contains what you expect is a good idea.

The list will be more than 1 item, so checking as is done is what we are doing.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/provisioningserver/drivers/pod/rsd.py'
2--- src/provisioningserver/drivers/pod/rsd.py 2017-02-15 15:49:54 +0000
3+++ src/provisioningserver/drivers/pod/rsd.py 2017-02-21 21:11:20 +0000
4@@ -607,8 +607,8 @@
5 url = self.get_url(context)
6 headers = self.make_auth_headers(**context)
7 endpoint = b"redfish/v1/Nodes/Actions/Allocate"
8- # Get current list of composed machines in pod.
9- machines = yield self.get_pod_machines(
10+ # Get previous list of composed machines in pod.
11+ previous_machines = yield self.get_pod_machines(
12 url, headers)
13 # Create allocate payload.
14 requested_cores = request.cores
15@@ -640,14 +640,20 @@
16 break
17 continue
18
19- new_machines = yield self.get_pod_machines(url, headers)
20- if new_machines == machines:
21+ # Sieve the new machine.
22+ discovered_machine = None
23+ current_machines = yield self.get_pod_machines(url, headers)
24+ previous_node_ids = [
25+ pm.power_parameters.get('node_id') for pm in previous_machines]
26+ for cm in current_machines:
27+ if cm.power_parameters.get('node_id') not in previous_node_ids:
28+ discovered_machine = cm
29+
30+ if discovered_machine is None:
31 # Allocation did not succeed.
32 raise PodInvalidResources(
33 "Unable to allocate machine with requested resources.")
34
35- # Sieve the new machine.
36- discovered_machine = [m for m in new_machines if m not in machines][0]
37 node_id = discovered_machine.power_parameters.get(
38 'node_id').encode('utf-8')
39 # Assemble the node.