MAAS

Merge lp:~newell-jensen/maas/fix-1665839 into lp:~maas-committers/maas/trunk

Proposed by Newell Jensen on 2017-02-21

Status: Merged
Approved by: Newell Jensen on 2017-02-21
Approved revision: no longer in the source branch.
Merged at revision: 5753
Proposed branch: lp:~newell-jensen/maas/fix-1665839
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 39 lines (+12/-6), 1 file modified
    src/provisioningserver/drivers/pod/rsd.py (+12/-6)

To merge this branch:

bzr merge lp:~newell-jensen/maas/fix-1665839

Critical

Fix Released

Reviewer	Review Type	Date Requested	Status
LaMont Jones (community)		2017-02-21	Approve on 2017-02-21
Review via email: mp+317899@code.launchpad.net

Commit message

Sieve the newly composed machine based off of node_id.

Newell Jensen (newell-jensen) wrote on 2017-02-21:

This change is still covered by current unit tests.

Christian Reis (kiko) wrote on 2017-02-21:

On Tue, Feb 21, 2017 at 07:29:29PM -0000, Newell Jensen wrote:
> -        discovered_machine = [m for m in new_machines if m not in machines][0]
> +        discovered_machine = [
> +            nm for nm in new_machines if (
> +                nm.power_parameters.get('node_id') not in [
> +                    m.power_parameters.get('node_id') for m in machines])][0]

Ugh, I really don't like these nested comprehensions. The code is super
hard to understand!

Also, I don't like the assumption that this will only have one useful
item, which is not very defensive. Is this necessarily always
new_machines?

How about:

    discovered_machines = []
    power_node_ids = [m.power_parameters.get("node_id") for m in machines]
    for nm in new_machines:
        if nm.power_parameters.get('node_id') not in power_node_ids:
            discovered_machines.append(nm)

    # if you actually can guarantee that you are only getting one
    assert len(discovered_machines) == 1
    discovered_machine = discovered_machines[0]

This code may need a deeper look, as it's not very intuitive, and I
don't know it well.
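
For reference, the suggestion above can be exercised in isolation. The sketch below is a minimal stand-in and not the MAAS code: FakeMachine and find_new_machines are hypothetical names, and the only thing assumed about the real objects is that they expose a power_parameters dict with a node_id key, as in the quoted diff.

```python
from collections import namedtuple

# Hypothetical stand-in for the discovered-machine objects; only the
# power_parameters dict used in the quoted diff is modelled.
FakeMachine = namedtuple("FakeMachine", ["power_parameters"])


def find_new_machines(previous_machines, current_machines):
    """Return the machines whose node_id was not present before composing."""
    previous_node_ids = {
        m.power_parameters.get("node_id") for m in previous_machines}
    return [
        m for m in current_machines
        if m.power_parameters.get("node_id") not in previous_node_ids
    ]


before = [FakeMachine({"node_id": "1"}), FakeMachine({"node_id": "2"})]
after = before + [FakeMachine({"node_id": "3"})]

discovered = find_new_machines(before, after)
# Only valid if exactly one machine is expected to appear.
assert len(discovered) == 1
discovered_machine = discovered[0]
```

Collecting the previous node_ids into a set keeps the membership test cheap, and returning the whole list leaves the "exactly one" decision to the caller.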

Christian Reis (kiko) wrote on 2017-02-21:

On Tue, Feb 21, 2017 at 07:28:25PM -0000, Newell Jensen wrote:
> This change is still covered by current unit tests.

I don't get it -- you don't have a case which reproduces the bug, do you?

Newell Jensen (newell-jensen) wrote on 2017-02-21:

> On Tue, Feb 21, 2017 at 07:28:25PM -0000, Newell Jensen wrote:
> > This change is still covered by current unit tests.
>
> I don't get it -- you don't have a case which reproduces the bug, do you?

Before, the discovered_machine was being sieved based on object equality. Something funky was going on with this equality check, as I only saw the bug appear randomly. Maybe it has something to do with the object's __eq__ method, which is what Python uses for the "m not in machines" check (__hash__ only matters for sets and dicts).

In any event, now that the code checks the actual node_id (which I should have been checking in the first place, as this is ultimately what is needed) instead of checking for the new object in the list, I do not have a test that reproduces the original bug.

However, the compose allocation error has been moved, so we now test whether the discovered_machine was found, which in turn tests whether this code fails to find a node_id for the new machine. test_compose_raises_error_for_no_allocation covers this.

Thanks for the valid review comments. Hope all is going well for you as well Kiko :)
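
As a side note on the equality theory: for lists, "x in seq" and "x not in seq" compare by identity first and then with == (that is, __eq__). A class that does not define __eq__ falls back to identity comparison, so two separately built objects describing the same node never compare equal. The sketch below illustrates that with a hypothetical Plain class; whether the MAAS machine objects behave this way is not established in this thread.

```python
class Plain:
    """Hypothetical object without __eq__, so it compares by identity."""

    def __init__(self, node_id):
        self.power_parameters = {"node_id": node_id}


old = [Plain("1"), Plain("2")]
# A second query builds fresh objects for the same two nodes.
new = [Plain("1"), Plain("2")]

# Object-based sieve: every fresh object looks "new".
by_object = [m for m in new if m not in old]

# node_id-based sieve: nothing is new, which is the right answer here.
old_ids = {m.power_parameters.get("node_id") for m in old}
by_node_id = [
    m for m in new if m.power_parameters.get("node_id") not in old_ids]
```

Comparing on node_id sidesteps the question of how the objects implement equality, which is the point of the change under review.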

LaMont Jones (lamont) wrote on 2017-02-21:

Seems good to me.

review: Approve

Christian Reis (kiko) wrote on 2017-02-22:

On Tue, Feb 21, 2017 at 09:20:57PM -0000, Newell Jensen wrote:
> -        new_machines = yield self.get_pod_machines(url, headers)
> -        if new_machines == machines:
> +        # Sieve the new machine.
> +        discovered_machine = None
> +        current_machines = yield self.get_pod_machines(url, headers)
> +        previous_node_ids = [
> +            pm.power_parameters.get('node_id') for pm in previous_machines]
> +        for cm in current_machines:
> +            if cm.power_parameters.get('node_id') not in previous_node_ids:
> +                discovered_machine = cm

Much nicer!

Can you explain why you only care about the first machine you find in
current_machines? This class of code is always a source of weird bugs --
the list might come sorted the way you expect, or it might contain
something unexpected.

If current_machines is always a list of 1 item, then assert() that. And
if not, checking that the list contains what you expect is a good idea.

Newell Jensen (newell-jensen) wrote on 2017-02-24:

> Can you explain why you only care about the first machine you find in
> current_machines?

There will only be a difference of one machine and that difference is the new machine that was just created in the RSD Pod.

> And if not, checking that the list contains what you expect is a good idea.

The list will contain more than one item, so checking each entry against the previous node_ids, as the code now does, is the check we want.
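
One detail worth noting about the merged loop: it has no break, so if more than one unseen node_id ever showed up, the last match would be kept silently. A hedged sketch of a stricter variant follows. sieve_one and FakeMachine are hypothetical names, and ValueError stands in for whatever exception the driver would actually raise (the merged code raises PodInvalidResources when nothing is found).

```python
from collections import namedtuple

# Hypothetical stand-in exposing only the power_parameters dict.
FakeMachine = namedtuple("FakeMachine", ["power_parameters"])


def sieve_one(previous_machines, current_machines):
    """Return the single machine whose node_id is new, or raise."""
    previous_ids = {
        pm.power_parameters.get("node_id") for pm in previous_machines}
    new_machines = [
        cm for cm in current_machines
        if cm.power_parameters.get("node_id") not in previous_ids
    ]
    if len(new_machines) != 1:
        # Zero means allocation failed; more than one is ambiguous.
        raise ValueError(
            "expected exactly one new machine, found %d" % len(new_machines))
    return new_machines[0]
```

This makes the "difference of exactly one machine" assumption explicit instead of relying on list order.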

Preview Diff

=== modified file 'src/provisioningserver/drivers/pod/rsd.py'
--- src/provisioningserver/drivers/pod/rsd.py	2017-02-15 15:49:54 +0000
+++ src/provisioningserver/drivers/pod/rsd.py	2017-02-21 21:11:20 +0000
@@ -607,8 +607,8 @@
         url = self.get_url(context)
         headers = self.make_auth_headers(**context)
         endpoint = b"redfish/v1/Nodes/Actions/Allocate"
-        # Get current list of composed machines in pod.
-        machines = yield self.get_pod_machines(
+        # Get previous list of composed machines in pod.
+        previous_machines = yield self.get_pod_machines(
             url, headers)
         # Create allocate payload.
         requested_cores = request.cores
@@ -640,14 +640,20 @@
                     break
                 continue
-        new_machines = yield self.get_pod_machines(url, headers)
-        if new_machines == machines:
+        # Sieve the new machine.
+        discovered_machine = None
+        current_machines = yield self.get_pod_machines(url, headers)
+        previous_node_ids = [
+            pm.power_parameters.get('node_id') for pm in previous_machines]
+        for cm in current_machines:
+            if cm.power_parameters.get('node_id') not in previous_node_ids:
+                discovered_machine = cm
+
+        if discovered_machine is None:
             # Allocation did not succeed.
             raise PodInvalidResources(
                 "Unable to allocate machine with requested resources.")
-        # Sieve the new machine.
-        discovered_machine = [m for m in new_machines if m not in machines][0]
         node_id = discovered_machine.power_parameters.get(
             'node_id').encode('utf-8')
         # Assemble the node.
