Merge lp:~andreserl/maas/fix_lp1382575 into lp:~maas-committers/maas/trunk

Proposed by Andres Rodriguez
Status: Rejected
Rejected by: Christian Reis
Proposed branch: lp:~andreserl/maas/fix_lp1382575
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 66 lines (+4/-10)
3 files modified
contrib/maas_local_settings.py (+0/-6)
src/maas/development.py (+2/-2)
src/maas/settings.py (+2/-2)
To merge this branch: bzr merge lp:~andreserl/maas/fix_lp1382575
Reviewer Review Type Date Requested Status
Gavin Panella (community) Disapprove
MAAS Maintainers Pending
Review via email: mp+238817@code.launchpad.net

Commit message

Use ISOLATION_LEVEL_SERIALIZABLE instead of ISOLATION_LEVEL_READ_COMMIT

To post a comment you must log in.
Revision history for this message
Gavin Panella (allenap) wrote :

We can't do this without code retry failed transactions. This level of isolation will cause all concurrent updates to the same rows in the database (except the first to commit) to fail. We need a way to automatically retry the operation when those failures happen, otherwise users will just see what appear to be spurious failures. This is something learned from Launchpad, which uses this isolation level with PostgreSQL, and also learned on MAAS when it was attempted before.

review: Disapprove
Revision history for this message
Andres Rodriguez (andreserl) wrote :

Gavin,

You do realize that the change you made to the DB settings are creating a critical regression in MAAS that will have a critical impact in ALL deployments right? With the change you made, you are re-introducing bug [1]. This issue is critical and will affect all of our internal deployments and customer's.

[1]: https://bugs.launchpad.net/juju-core/+bug/1314409

Revision history for this message
Gavin Panella (allenap) wrote :

As discussed elsewhere, we never ran SERIALIZABLE in the production configurations. The change to maas_local_settings.py to specify READ COMMITTED was only making explicit what was already the case. My -1 vote isn't an attempt to stop conversation though because it's possible I've misunderstood some side-effect, or that I'm just plain wrong. I'd rather figure this out than be right.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Sunday 19 Oct 2014 22:31:14 you wrote:
> As discussed elsewhere, we never ran SERIALIZABLE in the production
> configurations. The change to maas_local_settings.py to specify READ
> COMMITTED was only making explicit what was already the case. My -1 vote
> isn't an attempt to stop conversation though because it's possible I've
> misunderstood some side-effect, or that I'm just plain wrong. I'd rather
> figure this out than be right.

Agreed.

The lock hack will fix the acquisition bug for now but we need to put in
SERIALIZABLE with a retry mechanism, and a decent error page for when the
retries fail too.

I think this is perfectly do-able for the next point release and in fact we
should try to do this as soon as possible after utopic's beta is released so
that we have a longer testing and bedding-in period.

Revision history for this message
Raphaël Badin (rvb) wrote :

I agree with Gavin and Julian here. What I think we should do is:
- get rid of the deprecated TransactionMiddleware (that might be the source of some of the bugs we're seeing); using ATOMIC_REQUESTS=True in the config might be a drop in replacement (but we might want to control this manually, see my next point); this is mostly a cleanup, to make sure we're not mixing deprecated utilities with more recent utilities (the transaction stuff in Django changed quite a lot in 1.6)
- as Gavin and Julian suggested, implement a retry mechanism; A middleware should be able to do this.
- switch to SERIALIZABLE

Revision history for this message
Gavin Panella (allenap) wrote :

On 20 October 2014 08:48, Raphaël Badin wrote:
> I agree with Gavin and Julian here. What I think we should do is:
> - get rid of the deprecated TransactionMiddleware (that might be the
> source of some of the bugs we're seeing); using ATOMIC_REQUESTS=True
> in the config might be a drop in replacement (but we might want to
> control this manually, see my next point); this is mostly a cleanup,
> to make sure we're not mixing deprecated utilities with more recent
> utilities (the transaction stuff in Django changed quite a lot in
> 1.6)

There is a problem with ATOMIC_REQUESTS: the atomic decorator prevents
explicit commits (by "official" channels at least) while it's active.
Even if we switch to SERIALIZABLE we will still want to explicitly
commit before making some RPC calls.

There is a way around this: decorate a view with non_atomic_requests.
However, the explicit commits we have so far are in the model layer, not
the view, so there probably won't be a nice 1:1 mapping between where we
need to commit and view.

> - as Gavin and Julian suggested, implement a retry mechanism; A
> middleware should be able to do this.
> - switch to SERIALIZABLE

We should consider REPEATABLE READ too. I don't know enough about it in
PostgreSQL to say more, but it was suggested to me.

Revision history for this message
Raphaël Badin (rvb) wrote :

On 10/20/2014 10:24 AM, Gavin Panella wrote:
> On 20 October 2014 08:48, Raphaël Badin wrote:
>> I agree with Gavin and Julian here. What I think we should do is:
>> - get rid of the deprecated TransactionMiddleware (that might be the
>> source of some of the bugs we're seeing); using ATOMIC_REQUESTS=True
>> in the config might be a drop in replacement (but we might want to
>> control this manually, see my next point); this is mostly a cleanup,
>> to make sure we're not mixing deprecated utilities with more recent
>> utilities (the transaction stuff in Django changed quite a lot in
>> 1.6)
>
> There is a problem with ATOMIC_REQUESTS: the atomic decorator prevents
> explicit commits (by "official" channels at least) while it's active.
> Even if we switch to SERIALIZABLE we will still want to explicitly
> commit before making some RPC calls.
>
> There is a way around this: decorate a view with non_atomic_requests.
> However, the explicit commits we have so far are in the model layer, not
> the view, so there probably won't be a nice 1:1 mapping between where we
> need to commit and view.

This is a good point. We need to investigate if switching to
ATOMIC_REQUESTS and using non_atomic_requests for the views we want to
control manually is simpler than the alternative: controlling things
manually everywhere.

>
>> - as Gavin and Julian suggested, implement a retry mechanism; A
>> middleware should be able to do this.
>> - switch to SERIALIZABLE
>
> We should consider REPEATABLE READ too. I don't know enough about it in
> PostgreSQL to say more, but it was suggested to me.

Agreed. The problem we're seeing with a node being acquired twice is,
precisely, a non-repeatable read.

Unmerged revisions

3269. By Andres Rodriguez

Revert commits r3206 and r3225 to use ISOLATION_LEVEL_SERIALIZABLE again

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'contrib/maas_local_settings.py'
2--- contrib/maas_local_settings.py 2014-10-08 22:06:20 +0000
3+++ contrib/maas_local_settings.py 2014-10-19 11:46:21 +0000
4@@ -74,9 +74,6 @@
5 }
6
7 # Database access configuration.
8-from psycopg2.extensions import ISOLATION_LEVEL_READ_COMMITTED
9-
10-
11 DATABASES = {
12 'default': {
13 # 'postgresql_psycopg2', 'postgresql', 'mysql', 'sqlite3' etc.
14@@ -85,8 +82,5 @@
15 'USER': '',
16 'PASSWORD': '',
17 'HOST': 'localhost',
18- 'OPTIONS': {
19- 'isolation_level': ISOLATION_LEVEL_READ_COMMITTED,
20- },
21 }
22 }
23
24=== modified file 'src/maas/development.py'
25--- src/maas/development.py 2014-10-08 11:01:44 +0000
26+++ src/maas/development.py 2014-10-19 11:46:21 +0000
27@@ -26,7 +26,7 @@
28 from metadataserver.address import guess_server_host
29 import provisioningserver.config
30 from provisioningserver.utils import compose_URL
31-from psycopg2.extensions import ISOLATION_LEVEL_READ_COMMITTED
32+from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE
33
34 # We expect the following settings to be overridden. They are mentioned here
35 # to silence lint warnings.
36@@ -70,7 +70,7 @@
37 # Unix socket directory.
38 'HOST': abspath('db'),
39 'OPTIONS': {
40- 'isolation_level': ISOLATION_LEVEL_READ_COMMITTED,
41+ 'isolation_level': ISOLATION_LEVEL_SERIALIZABLE,
42 },
43 },
44 }
45
46=== modified file 'src/maas/settings.py'
47--- src/maas/settings.py 2014-10-15 20:39:07 +0000
48+++ src/maas/settings.py 2014-10-19 11:46:21 +0000
49@@ -23,7 +23,7 @@
50 from maas.monkey import patch_get_script_prefix
51 from metadataserver.address import guess_server_host
52 from provisioningserver.utils import compose_URL
53-from psycopg2.extensions import ISOLATION_LEVEL_READ_COMMITTED
54+from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE
55
56
57 django.template.add_to_builtins('django.templatetags.future')
58@@ -128,7 +128,7 @@
59 'HOST': '',
60 'PORT': '',
61 'OPTIONS': {
62- 'isolation_level': ISOLATION_LEVEL_READ_COMMITTED,
63+ 'isolation_level': ISOLATION_LEVEL_SERIALIZABLE,
64 },
65 },
66 }