Merge ~pguimaraes/maas:db-fail-lp-1843493 into maas:master

Proposed by Pedro Guimarães
Status: Rejected
Rejected by: Blake Rouse
Proposed branch: ~pguimaraes/maas:db-fail-lp-1843493
Merge into: maas:master
Diff against target: 30 lines (+3/-2)
1 file modified
src/maasserver/forms/pods.py (+3/-2)
Reviewer Review Type Date Requested Status
Blake Rouse (community) Disapprove
MAAS Lander Needs Fixing
Review via email: mp+372595@code.launchpad.net

Commit message

* forms/pods.py ComposeMachineForPodsForm: reducing scope of except catch on compose method.
* only PodProblem exceptions should be accepted since they represent a node that cannot overcommit to receive that specific VM. Other exceptions should be raised

Closes-bug: #1843493

Description of the change

[WIP]

forms/pods.py ComposeMachineForPodsForm: reducing scope of except catch on compose method.
Only PodProblem exceptions should be accepted since they represent a node that cannot overcommit to receive that specific VM. Other exceptions should be raised.

To post a comment you must log in.
Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b db-fail-lp-1843493 lp:~pguimaraes/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: FAILED
LOG: http://maas-ci-jenkins.internal:8080/job/maas/job/branch-tester/6379/console
COMMIT: 3541b062f6af81f90c9f8b0f24b30046c36e6f07

review: Needs Fixing
Revision history for this message
Blake Rouse (blake-rouse) wrote :

Tests will need to be fixed to handle those specific exception types. Also I think this might still have an issue, of causing MAAS to try to compose another machine on the retry leaving the machine that was composed during the serialization error still inside libvirt (which would be very bad).

review: Needs Fixing
Revision history for this message
Pedro Guimarães (pguimaraes) wrote :

Hey blake-rouse, thanks for the review.
Well, I am essentially seeing two kinds of errors: PodProblem-related (which is expected, nothing wrong here); and OperationalError (check: https://bugs.launchpad.net/maas/+bug/1843493/comments/2)
The later is a database issue, actually, which I think comes before libvirt setup, since libvirt is configured once I fire an RPC to provisioningserver, right?

I don't think I am near the solution however, I'm still seeing DB conflicts, even if I set my thread to sleep for several seconds (which is recommended approach as per postgres docs).

I feel we're missing a "with transaction.atomic():" somewhere to make all other pod allocations to hold. However, whenever I add this to compose method, it breaks with TransactionManagementError.

Revision history for this message
Blake Rouse (blake-rouse) wrote :

Pedro,

Thanks for digging deep into this, I have done an MP(https://code.launchpad.net/~blake-rouse/maas/+git/maas/+merge/372623) that I think will solve the issue. Could you test it out for me and let me know if this solves it?

Thanks,
Blake

Revision history for this message
Blake Rouse (blake-rouse) wrote :
review: Disapprove

Unmerged commits

3541b06... by Pedro Guimarães

forms/pods.py ComposeMachineForPodsForm: reducing scope of except catch on compose method.
Only PodProblem exceptions should be accepted since they represent a node that cannot overcommit to receive that specific VM. Other exceptions should be raised.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/src/maasserver/forms/pods.py b/src/maasserver/forms/pods.py
2index 1a68329..96ebb1b 100644
3--- a/src/maasserver/forms/pods.py
4+++ b/src/maasserver/forms/pods.py
5@@ -81,6 +81,7 @@ from provisioningserver.utils.twisted import asynchronous
6 from twisted.internet.defer import inlineCallbacks
7 from twisted.python.threadable import isInIOThread
8
9+from maasserver.exceptions import PodProblem
10
11 log = LegacyLogger()
12
13@@ -720,7 +721,7 @@ class ComposeMachineForPodsForm(forms.Form):
14 return form.compose(
15 skip_commissioning=True,
16 creation_type=NODE_CREATION_TYPE.DYNAMIC)
17- except Exception:
18+ except PodProblem:
19 continue
20 # Second, try to compose a machine from commitable pods
21 for form in commit_forms:
22@@ -728,7 +729,7 @@ class ComposeMachineForPodsForm(forms.Form):
23 return form.compose(
24 skip_commissioning=True,
25 creation_type=NODE_CREATION_TYPE.DYNAMIC)
26- except Exception:
27+ except PodProblem:
28 continue
29 # No machine found.
30 return None

Subscribers

People subscribed via source and target branches