Merge lp:~lamont/maas/bug-1567176 into lp:maas/trunk

Proposed by LaMont Jones on 2016-04-11
Status: Rejected
Rejected by: MAAS Lander on 2017-06-22
Proposed branch: lp:~lamont/maas/bug-1567176
Merge into: lp:maas/trunk
Diff against target: 224 lines (+103/-16)
4 files modified
src/maasserver/models/fabric.py (+32/-8)
src/maasserver/models/space.py (+31/-8)
src/maasserver/models/tests/test_fabric.py (+20/-0)
src/maasserver/models/tests/test_space.py (+20/-0)
To merge this branch: bzr merge lp:~lamont/maas/bug-1567176
Reviewer Review Type Date Requested Status
Gavin Panella (community) Disapprove on 2016-04-12
Mike Pontillo (community) 2016-04-11 Needs Fixing on 2016-04-12
Review via email: mp+291545@code.launchpad.net

Commit message

Harden space/fabric creation to avoid duplicate names, even temporarily.

Description of the change

Harden space/fabric creation to avoid duplicate names, even temporarily.

To post a comment you must log in.
lp:~lamont/maas/bug-1567176 updated on 2016-04-11
4902. By LaMont Jones on 2016-04-11

restore CleanSave.save before the assertEqual.

LaMont Jones (lamont) wrote :

The crux of the issue here is that if postgres is enforcing unique=True on name, then (since many things are not done inside of transactions), then default names become a source of conflict (the default name relies on the id, which is not known until the commit succeeds, and multiple branches can be in the middle of the insert-and-update logic at the same time.)

Making everything that inserts a space or fabric be transactional would be a solution, but I believe it's an invasive one to consider.

If we don't have the database enforce uniqueness, then we get near-uniqueness from the very sorts of situations that we run into above (where two inserts of spaces _with_ names that are the same, happen to pass in the night and both succeed.) TBF, this is an extremely unlikely case. If we do hit it, then we have the situation where two users create spaces (with the same name), and then whoever updates next fails, even though they didn't change the name....

Mike Pontillo (mpontillo) wrote :

I don't like the random numbers used here, but I don't have a better idea at the moment. We might have to let it slide unless someone else can think of a better way to do it.

You should be able to use self.patch() to mock rather than replacing methods with your mock by hand. This will ensure that all the proper cleanup happens, even in exception cases.

review: Needs Fixing
Lee Trager (ltrager) wrote :

Looks good, I'm just wondering if the retry loop can ever get into a deadlock state.

Gavin Panella (allenap) wrote :

Why is it important to give fabrics names like "fabric-$id", where $id
is the primary key? This seems arbitrary, is causing problems, and
exposing surrogate keys to users is often considered bad form anyway.

My suggestion is to use a "maasserve_fabric_name_id_seq" sequence:

  fabric_name_id = Sequence(
      "maasserver_fabric_name_id_seq", cycle=False,
      owner="maasserver_fabric.name")

   def generate_fabric_name():
       return "fabric-%d" % next(fabric_name_id)

Then there's no need for looping, checking for IntegrityError, and so
forth.

Another alternative might be to lock the table at the start of save():

  LOCK TABLE maasserver_fabric IN SHARE MODE

A SHARE lock will prevent concurrent data changes.

http://www.postgresql.org/docs/current/static/sql-lock.html
http://www.postgresql.org/docs/current/static/explicit-locking.html

I may have missed an important justification for what's going on in this
branch. However, on the basis of what I currently know, this branch is
an over-complicated fix to the problem.

> ... (since many things are not done inside of transactions) ...
[...]
> Making everything that inserts a space or fabric be transactional
> would be a solution, but I believe it's an invasive one to consider.

The intention in MAAS is that ALL database access is transactional, run
with at least REPEATABLE READ isolation. If you find something (not
including the test suite) that's writing to OR reading from the database
outside of an explicit transaction, it's highly likely to be a bug.

review: Disapprove
Blake Rouse (blake-rouse) wrote :

> Why is it important to give fabrics names like "fabric-$id", where $id
> is the primary key? This seems arbitrary, is causing problems, and
> exposing surrogate keys to users is often considered bad form anyway.

It is important that the fabric/space names come from the ID of the fabric. That way it makes since when the user is looking at the result of the API.

>
> My suggestion is to use a "maasserve_fabric_name_id_seq" sequence:
>
> fabric_name_id = Sequence(
> "maasserver_fabric_name_id_seq", cycle=False,
> owner="maasserver_fabric.name")

Doesn't postgres already create a sequence for the ID of the fabric/space. Why not just get the nextval of the fabric/space ID and set the ID and name in the initial INSERT call so its always correct. That will prevent the conflict and ensure that the ID and name always match. It would also simplify this logic a lot.

>
> def generate_fabric_name():
> return "fabric-%d" % next(fabric_name_id)
>
> Then there's no need for looping, checking for IntegrityError, and so
> forth.
>
> Another alternative might be to lock the table at the start of save():
>
> LOCK TABLE maasserver_fabric IN SHARE MODE
>
> A SHARE lock will prevent concurrent data changes.
>
> http://www.postgresql.org/docs/current/static/sql-lock.html
> http://www.postgresql.org/docs/current/static/explicit-locking.html
>
> I may have missed an important justification for what's going on in this
> branch. However, on the basis of what I currently know, this branch is
> an over-complicated fix to the problem.
>
> > ... (since many things are not done inside of transactions) ...
> [...]
> > Making everything that inserts a space or fabric be transactional
> > would be a solution, but I believe it's an invasive one to consider.
>
> The intention in MAAS is that ALL database access is transactional, run
> with at least REPEATABLE READ isolation. If you find something (not
> including the test suite) that's writing to OR reading from the database
> outside of an explicit transaction, it's highly likely to be a bug.

Gavin Panella (allenap) wrote :

> > Why is it important to give fabrics names like "fabric-$id", where
> > $id is the primary key? This seems arbitrary, is causing problems,
> > and exposing surrogate keys to users is often considered bad form
> > anyway.
>
> It is important that the fabric/space names come from the ID of the
> fabric. That way it makes since when the user is looking at the result
> of the API.

Typically a user ought not to be looking at the result of the API;
that's for machines. True, our CLI will currently show them all this,
but that won't be the case in future. It's not an absolute requirement,
but we should seek to hide surrogate keys from users.

[...]
> Doesn't postgres already create a sequence for the ID of the
> fabric/space. Why not just get the nextval of the fabric/space ID and
> set the ID and name in the initial INSERT call so its always correct.
> That will prevent the conflict and ensure that the ID and name always
> match. It would also simplify this logic a lot.

Yes, this is possible too.

MAAS Lander (maas-lander) wrote :

Transitioned to Git.

lp:maas has now moved from Bzr to Git.
Please propose your branches with Launchpad using Git.

git clone https://git.launchpad.net/maas

Unmerged revisions

4902. By LaMont Jones on 2016-04-11

restore CleanSave.save before the assertEqual.

4901. By LaMont Jones on 2016-04-11

Harden space/fabric creation to avoid duplicate names, even temporarily.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/maasserver/models/fabric.py'
2--- src/maasserver/models/fabric.py 2016-04-07 19:37:40 +0000
3+++ src/maasserver/models/fabric.py 2016-04-11 20:53:39 +0000
4@@ -10,6 +10,7 @@
5 ]
6
7 import datetime
8+import random
9 import re
10
11 from django.core.exceptions import (
12@@ -209,24 +210,47 @@
13
14 def save(self, *args, **kwargs):
15 # Name will get set by clean_name() if None or empty, and there is an
16- # id. We just need to handle names here for creation.
17+ # id. We just need to handle names here for creation. It is possible
18+ # that we could collide with another thread creating a fabric as well,
19+ # that gets resolved by picking random almost-correct (reserved) names
20+ # until the save works, then renaming it to the proper default name.
21 created = self.id is None
22- super().save(*args, **kwargs)
23- if self.name is None or self.name == '':
24+ saved = False
25+ save_name = self.name
26+ while not saved:
27+ try:
28+ super().save(*args, **kwargs)
29+ saved = True
30+ except IntegrityError as err:
31+ # A leading zero still hits the reserved names, but won't
32+ # actually be generated by us, regardless of the ID. Pick
33+ # random IDs until we get through.
34+ if (err.args[0].startswith('duplicate key value violates')):
35+ self.name = 'fabric-0%d' % random.randint(1, 999999)
36+ spcl = re.compile('^fabric-0\d+$')
37+ if (save_name is None or save_name == '' or spcl.search(save_name)):
38 # If we got here, then we have a newly created fabric that needs a
39- # default name.
40- self.name = "fabric-%d" % self.id
41+ # default name. Since we may have strayed from None above, we need
42+ # to explicitly set it, rather than relying on clean_name to do so.
43 fabric = Fabric.objects.get(id=self.id)
44+ fabric.name = "fabric-%d" % self.id
45+ self.name = fabric.name
46 fabric.save()
47 # Create default VLAN if this is a fabric creation.
48 if created:
49 self._create_default_vlan()
50
51 def clean_name(self):
52- reserved = re.compile('^fabric-\d+$')
53+ resvd = re.compile('^fabric-(\d+)$')
54 if self.name is not None and self.name != '':
55- if reserved.search(self.name):
56- if (self.id is None or self.name != 'fabric-%d' % self.id):
57+ res = resvd.search(self.name)
58+ if res is not None:
59+ num = res.group(1)
60+ if self.id is None and num.startswith('0') and int(num) > 0:
61+ # "fabric-0\d+ is only valid for creation names, and will
62+ # get replaced in save with the default name.
63+ pass
64+ elif self.id is None or self.name != 'fabric-%d' % self.id:
65 raise ValidationError(
66 {'name': ["Reserved fabric name."]})
67 elif self.id is not None:
68
69=== modified file 'src/maasserver/models/space.py'
70--- src/maasserver/models/space.py 2016-04-07 19:37:40 +0000
71+++ src/maasserver/models/space.py 2016-04-11 20:53:39 +0000
72@@ -9,6 +9,7 @@
73 ]
74
75 import datetime
76+import random
77 import re
78
79 from django.core.exceptions import (
80@@ -155,10 +156,16 @@
81 return "space-%s" % self.id
82
83 def clean_name(self):
84- reserved = re.compile('^space-\d+$')
85+ resvd = re.compile('^space-(\d+)$')
86 if self.name is not None and self.name != '':
87- if reserved.search(self.name):
88- if (self.id is None or self.name != 'space-%d' % self.id):
89+ res = resvd.search(self.name)
90+ if res is not None:
91+ num = res.group(1)
92+ if self.id is None and num.startswith('0') and int(num) > 0:
93+ # "space-0\d+ is only valid for creation names, and will
94+ # get replaced in save with the default name.
95+ pass
96+ elif self.id is None or self.name != 'space-%d' % self.id:
97 raise ValidationError(
98 {'name': ["Reserved space name."]})
99 elif self.id is not None:
100@@ -168,13 +175,29 @@
101
102 def save(self, *args, **kwargs):
103 # Name will get set by clean_name() if None or empty, and there is an
104- # id. We just need to handle names here for creation.
105- super().save(*args, **kwargs)
106- if self.name is None or self.name == '':
107+ # id. We just need to handle names here for creation. It is possible
108+ # that we could collide with another thread creating a space as well,
109+ # that gets resolved by picking random almost-correct (reserved) names
110+ # until the save works, then renaming it to the proper default name.
111+ saved = False
112+ save_name = self.name
113+ while not saved:
114+ try:
115+ super().save(*args, **kwargs)
116+ saved = True
117+ except IntegrityError as err:
118+ # A leading zero means that we are creating a space with a
119+ # default id.
120+ if (err.args[0].startswith('duplicate key value violates')):
121+ self.name = 'space-0%d' % random.randint(1, 999999)
122+ spcl = re.compile('^space-0\d+$')
123+ if (save_name is None or save_name == '' or spcl.search(save_name)):
124 # If we got here, then we have a newly created space that needs a
125- # default name.
126- self.name = "space-%d" % self.id
127+ # default name. Since we may have strayed from None above, we need
128+ # to explicitly set it, rather than relying on clean_name to do so.
129 space = Space.objects.get(id=self.id)
130+ space.name = "space-%d" % self.id
131+ self.name = space.name
132 space.save()
133
134 def clean(self, *args, **kwargs):
135
136=== modified file 'src/maasserver/models/tests/test_fabric.py'
137--- src/maasserver/models/tests/test_fabric.py 2016-04-07 19:37:40 +0000
138+++ src/maasserver/models/tests/test_fabric.py 2016-04-11 20:53:39 +0000
139@@ -6,6 +6,9 @@
140 __all__ = []
141
142
143+import random
144+import re
145+
146 from django.core.exceptions import (
147 PermissionDenied,
148 ValidationError,
149@@ -15,6 +18,7 @@
150 INTERFACE_TYPE,
151 NODE_PERMISSION,
152 )
153+from maasserver.models.cleansave import CleanSave
154 from maasserver.models.fabric import (
155 DEFAULT_FABRIC_NAME,
156 Fabric,
157@@ -154,6 +158,22 @@
158 fabric = factory.make_Fabric(name=name)
159 self.assertEqual(name, fabric.get_name())
160
161+ def test_save_handles_collision(self):
162+ real_save = CleanSave.save
163+
164+ def mock_save(inst, *args, **kwargs):
165+ spcl = re.compile('^fabric-0\d+$')
166+ if (inst.name is None or (
167+ spcl.search(inst.name) and random.randint(0, 1) == 1)):
168+ raise IntegrityError(
169+ 'duplicate key value violates unique constraint ')
170+ else:
171+ real_save(inst, *args, **kwargs)
172+ CleanSave.save = mock_save
173+ fabric = Fabric.objects.create(name=None)
174+ CleanSave.save = real_save
175+ self.assertEqual("fabric-%d" % fabric.id, fabric.name)
176+
177 def test_creates_fabric_with_default_vlan(self):
178 name = factory.make_name('name')
179 fabric = factory.make_Fabric(name=name)
180
181=== modified file 'src/maasserver/models/tests/test_space.py'
182--- src/maasserver/models/tests/test_space.py 2016-04-07 19:37:40 +0000
183+++ src/maasserver/models/tests/test_space.py 2016-04-11 20:53:39 +0000
184@@ -7,6 +7,9 @@
185 __all__ = []
186
187
188+import random
189+import re
190+
191 from django.core.exceptions import (
192 PermissionDenied,
193 ValidationError,
194@@ -14,6 +17,7 @@
195 from django.db.models import ProtectedError
196 from django.db.utils import IntegrityError
197 from maasserver.enum import NODE_PERMISSION
198+from maasserver.models.cleansave import CleanSave
199 from maasserver.models.space import (
200 DEFAULT_SPACE_NAME,
201 Space,
202@@ -165,6 +169,22 @@
203 factory.make_Space,
204 name='space-1999')
205
206+ def test_save_handles_collision(self):
207+ real_save = CleanSave.save
208+
209+ def mock_save(inst, *args, **kwargs):
210+ spcl = re.compile('^space-0\d+$')
211+ if (inst.name is None or (
212+ spcl.search(inst.name) and random.randint(0, 1) == 1)):
213+ raise IntegrityError(
214+ 'duplicate key value violates unique constraint ')
215+ else:
216+ real_save(inst, *args, **kwargs)
217+ CleanSave.save = mock_save
218+ space = Space.objects.create(name=None)
219+ CleanSave.save = real_save
220+ self.assertEqual("space-%d" % space.id, space.name)
221+
222 def test_create_sets_name(self):
223 space = Space.objects.create(name=None)
224 self.assertEqual("space-%d" % space.id, space.name)