Openstack quantum, race condition in ip address creation when starting 50 VMs on a 5-node cluster

Bug #1110807 reported by Spatialist
This bug affects 1 person
Affects            Status         Importance   Assigned to    Milestone
neutron            Fix Released   High         Gary Kotton
neutron (Folsom)   Fix Released   High         Gary Kotton

Bug Description

When starting many VMs, the same IP address is allocated to two different ports.
We use quantum in combination with openvswitch and a PostgreSQL database. This affects Folsom 2012.2.

2013-01-30 16:21:28 DEBUG [quantum.db.db_base_plugin_v2] Generated mac for network 66b5d6c2-3869-4d5f-b239-3896ff502022 is fa:16:3e:85:44:35
2013-01-30 16:21:28 DEBUG [quantum.db.db_base_plugin_v2] Allocated IP - 10.200.200.24 from 10.200.200.24 to 10.200.200.254
2013-01-30 16:21:28 DEBUG [quantum.db.db_base_plugin_v2] Allocated IP 10.200.200.24 (66b5d6c2-3869-4d5f-b239-3896ff502022/73b7eb77-c987-4c83-9ea7-45e57c89869d/e8e60edc-e5ce-46a3-8dbd-cfe5648c07b9)
2013-01-30 16:21:28 DEBUG [routes.middleware] No route matched for POST /ports.json
2013-01-30 16:21:28 DEBUG [routes.middleware] Matched POST /ports.json
2013-01-30 16:21:28 DEBUG [routes.middleware] Route path: '/ports{.format}', defaults: {'action': u'create', 'controller': wsgify(quantum.api.v2.resource.resource, RequestClass=<class 'quantum.api.v2.resource.Request'>)}
2013-01-30 16:21:28 DEBUG [routes.middleware] Match dict: {'action': u'create', 'controller': wsgify(quantum.api.v2.resource.resource, RequestClass=<class 'quantum.api.v2.resource.Request'>), 'format': u'json'}
2013-01-30 16:21:28 DEBUG [quantum.openstack.common.rpc.amqp] Sending port.create.start on notifications.info
2013-01-30 16:21:28 ERROR [quantum.api.v2.resource] create failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/quantum/api/v2/resource.py", line 96, in resource
    result = method(request=request, **args)
  File "/usr/lib/python2.7/dist-packages/quantum/api/v2/base.py", line 335, in create
    obj = obj_creator(request.context, **kwargs)
  File "/usr/lib/python2.7/dist-packages/quantum/db/db_base_plugin_v2.py", line 1216, in create_port
    context.session.add(allocated)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 402, in __exit__
    self.commit()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 314, in commit
    self._prepare_impl()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 298, in _prepare_impl
    self.session.flush()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1583, in flush
    self._flush(objects)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1654, in _flush
    flush_context.execute()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 331, in execute
    rec.execute(self)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 475, in execute
    uow
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 64, in save_obj
    table, insert)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 530, in _emit_insert_statements
    execute(statement, multiparams)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1449, in execute
    params)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1584, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1698, in _execute_context
    context)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1691, in _execute_context
    context)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 331, in do_execute
    cursor.execute(statement, parameters)
IntegrityError: (IntegrityError) duplicate key value violates unique constraint "ipallocations_pkey"
DETAIL: Key (ip_address, subnet_id, network_id)=(10.200.200.24, 73b7eb77-c987-4c83-9ea7-45e57c89869d, 66b5d6c2-3869-4d5f-b239-3896ff502022) already exists.
 'INSERT INTO ipallocations (port_id, ip_address, subnet_id, network_id, expiration) VALUES (%(port_id)s, %(ip_address)s, %(subnet_id)s, %(network_id)s, %(expiration)s)' {'network_id': u'66b5d6c2-3869-4d5f-b239-3896ff502022', 'subnet_id': u'73b7eb77-c987-4c83-9ea7-45e57c89869d', 'port_id': 'e8e60edc-e5ce-46a3-8dbd-cfe5648c07b9', 'ip_address': u'10.200.200.24', 'expiration': datetime.datetime(2013, 1, 30, 21, 23, 28, 665181)}
2013-01-30 16:21:28 DEBUG [eventlet.wsgi.server] 192.168.119.14 - - [30/Jan/2013 16:21:28] "POST /v2.0/ports.json HTTP/1.1" 500 215 0.240021
2013-01-30 16:21:28 DEBUG [amqplib] Closed channel #1
2013-01-30 16:21:28 DEBUG [amqplib] using channel_id: 1
2013-01-30 16:21:28 DEBUG [amqplib] Channel open
2013-01-30 16:21:28 DEBUG [quantum.openstack.common.rpc.amqp] Sending port.create.start on notifications.info
2013-01-30 16:21:28 DEBUG [amqplib] Closed channel #1
2013-01-30 16:21:28 DEBUG [amqplib] using channel_id: 1
2013-01-30 16:21:28 DEBUG [amqplib] Channel open

Tags: l3-ipam-dhcp
Revision history for this message
Spatialist (fsluiter) wrote :

Actually, out of 50 started, more than 20 fail. This seems correlated with the number of connections quantum has to PostgreSQL; after a fresh reboot, only a few fail.

summary: - Openstack quantum, race condition address creation when starting 50 VMs
- on a 5-node cluster
+ Openstack quantum, race condition in ip address creation when starting
+ 50 VMs on a 5-node cluster
Revision history for this message
Spatialist (fsluiter) wrote :

The race condition is that an IP address is selected, but the record is not locked for update. By the time the record is updated, the same IP address has also been given to another port. I included a part of the quantum server log above.

tags: added: l3-ipam-dhcp
Revision history for this message
Spatialist (fsluiter) wrote :

We used the dashboard to boot up a cluster of 50 VMs, all identical and small images. We recently installed quantum 2012-1, and upgraded this week to 2012-2. Both versions are affected.

As seen from the logs, the error is in /usr/lib/python2.7/dist-packages/quantum/db/db_base_plugin_v2.py, line 1216, in create_port. This function is probably called simultaneously for two different ports.
In PostgreSQL, two simultaneous SELECTs on the ipavailabilityranges table can return the same free IP address, even when protected by a transaction. A better, but non-portable, way would be SELECT ... FOR UPDATE.
However, to make it portable, this part should run as a single thread, or perhaps be protected by a semaphore or a similar construct.
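As an illustrative sketch of that suggestion (hypothetical helper names, not the actual quantum code): the locked variant differs from the plain one only in asking the database to hold a row lock until the surrounding transaction commits, whereas a semaphore would only serialize allocations within a single quantum-server process.

# Illustrative sketch only; assumes range_query is a SQLAlchemy ORM query over
# the availability-range table. These helpers are hypothetical, not quantum code.
def pick_first_unlocked(range_query):
    # Plain SELECT: a concurrent transaction can read the same row and hand
    # out the same "free" address before this one commits.
    return range_query.first()


def pick_first_locked(range_query):
    # SELECT ... FOR UPDATE: the row stays locked until the surrounding
    # transaction commits, so a second allocator blocks instead of racing.
    # (Spelled with_lockmode('update') on Folsom-era SQLAlchemy,
    # with_for_update() on newer releases.)
    return range_query.with_for_update().first()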

If any additional info or testing is needed, don't hesitate to ask. Getting this working is really important to us.

Revision history for this message
Spatialist (fsluiter) wrote :

Sorry, I had the version wrong: we used Folsom 2012.2 and, this week, 2012.2.1.

Revision history for this message
Spatialist (fsluiter) wrote :

The actual queries seem to be done in the function def _generate_ip(context, subnets):
        """Generate an IP address.

Revision history for this message
Spatialist (fsluiter) wrote :

The same code is present unchanged in Grizzly, so I would expect the same error in that release. Can anybody verify?

Revision history for this message
dan wendlandt (danwent) wrote :

I cannot reproduce this with MySQL using master. Perhaps it's specific to PostgreSQL?

Here is my test, it would be good if you could try it on your setup:

set port_quota in quantum.conf to -1 (unlimited) and restart quantum-server

In one window, I run:

quantum net-create test1
quantum subnet-create test1 10.0.0.0/24

Then I start two windows in parallel with the same command:

for i in `seq 1 50`; do quantum port-create test1 | grep fixed_ips; done

I don't see any instances of the same IP being allocated, even though there does seem to be interleaving, based on the fact that the same window sometimes gets two IPs in a row.

Changed in quantum:
assignee: nobody → dan wendlandt (danwent)
importance: Undecided → Critical
Revision history for this message
dan wendlandt (danwent) wrote :

Ah, from your earlier comment, it looks like you have already tracked this down to a difference in how SELECTs work between MySQL and PostgreSQL. If so, we should update the title of the bug to indicate this.

Changed in quantum:
importance: Critical → High
Revision history for this message
Spatialist (fsluiter) wrote : Re: [Bug 1110807] Re: Openstack quantum, race condition in ip address creation when starting 50 VMs on a 5-node cluster

Actually, I think it will also happen with MySQL if the requests are handled by multiple quantum threads.

Revision history for this message
Spatialist (fsluiter) wrote :

Hi Dan,

I first ran your test:

> set port_quota in quantum.conf to -1 (unlimited) and restart quantum-
> server
> In one window, i run:
> quantum net-create test1
> quantum subnet-create test1 10.0.0.0/24
>
> Then I start two windows in parallel with the same command:
> for i in `seq 1 50`; do quantum port-create test1 | grep fixed_ips; done

But that didn't show the bug.
But now I slightly modified your test and created 10 parallel threads
(all pasted in one go into the same xterm):

quantum net-create bugtest1
quantum subnet-create bugtest1 10.0.0.0/24
for i in `seq 1 10`; do quantum port-create bugtest1 | grep fixed_ips; done&
for i in `seq 1 10`; do quantum port-create bugtest1 | grep fixed_ips; done&
for i in `seq 1 10`; do quantum port-create bugtest1 | grep fixed_ips; done&
for i in `seq 1 10`; do quantum port-create bugtest1 | grep fixed_ips; done&
for i in `seq 1 10`; do quantum port-create bugtest1 | grep fixed_ips; done&
for i in `seq 1 10`; do quantum port-create bugtest1 | grep fixed_ips; done&
for i in `seq 1 10`; do quantum port-create bugtest1 | grep fixed_ips; done&
for i in `seq 1 10`; do quantum port-create bugtest1 | grep fixed_ips; done&
for i in `seq 1 10`; do quantum port-create bugtest1 | grep fixed_ips; done&
for i in `seq 1 10`; do quantum port-create bugtest1 | grep fixed_ips; done&

Now this failed 3 times:
"Request Failed: internal server error while processing your request."
And the server.log shows the same errors as above: "create failed".
Now this test might seem artificial; however, if you use the dashboard to create 50 VMs (and yes, we do in production), it uses multiple Apache threads to handle the calls, and Apache in turn calls quantum in parallel.

Can you reproduce my errors?

Kind regards,
Floris


Revision history for this message
Spatialist (fsluiter) wrote :

To add some more info: I added the "&" to start 10 simultaneous port-creating processes in the background.

Interestingly, after I deleted these 97 newly created ports (easiest to do in the Horizon dashboard) and reran the 10 threads, almost half of the port creations failed.

Revision history for this message
dan wendlandt (danwent) wrote :

Yeah, backgrounding was going to be the next thing I tried; we're on the same page. I will hopefully get to this today.

Revision history for this message
dan wendlandt (danwent) wrote :

I have run this in parallel in several different ways (including the one you describe above) and I do not see the error with MySQL.

When you say you have reproduced it with the above script, is that with MySQL or PostgreSQL?

We'll address this issue either way; it's just a matter of understanding the scope of who is impacted.

Revision history for this message
Spatialist (fsluiter) wrote :

Hi,

It was with PostgreSQL. However, I expect it to be an issue with more databases, as I think it has to do with not locking the record. PostgreSQL supports concurrent transactions, which means that a record can be read concurrently; updates, of course, lock a record.
Now, in the code I noticed a SELECT and, a few lines further down, the UPDATE. This will not give the expected behaviour of finding and locking an IP address, as in between these statements another thread can grab the same IP address.
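As a toy illustration of that window (plain Python threads standing in for parallel API requests; not quantum code), both workers read the same "free" address before either one removes it:

# Toy illustration of the select-then-update window; not quantum code.
import threading
import time

free_ips = ['10.0.0.2', '10.0.0.3']    # stands in for ipavailabilityranges
allocations = []                        # stands in for ipallocations


def allocate(port):
    ip = free_ips[0]                    # the "SELECT": both workers see 10.0.0.2
    time.sleep(0.01)                    # the gap before the "UPDATE"
    if ip in free_ips:
        free_ips.remove(ip)
    allocations.append((port, ip))      # a duplicate key error in the real table


workers = [threading.Thread(target=allocate, args=(p,)) for p in ('port-a', 'port-b')]
for w in workers:
    w.start()
for w in workers:
    w.join()

# Typically shows both ports holding 10.0.0.2; holding one lock around the
# whole read-modify-write (what SELECT ... FOR UPDATE does on the database
# side) closes the window.
print(allocations)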

I am not sure how transactions are implemented in MySQL, but I would expect them to behave similarly. Maybe try creating 255 ports in one go, multiple times?

Cheers,
Floris


Revision history for this message
Spatialist (fsluiter) wrote :

Hi Dan,
are you working with a developer setup of quantum?

From http://wiki.openstack.org/QuantumDevelopment:
> Here is a list of tools that can (or can't) be used in order to debug Quantum. Please add your findings to the list.
> Eclipse pydev - Free. It works! (Thanks to gong yong sheng). You need to modify quantum-server as follows: From:
> eventlet.monkey_patch(os=False) To: eventlet.monkey_patch(os=False, thread=False)

If so, this bug will never show up, as it occurs due to multi-threading. In Eclipse PyDev it is turned off, it seems.

Revision history for this message
Ante Karamatić (ivoks) wrote :

FWIW, I'm unable to reproduce this problem on Quantum Essex with MySQL.

Revision history for this message
Spatialist (fsluiter) wrote :

Ante, thank you, that is helpful information! Do you use a developer version of quantum or the regular install packages?
In any case, can you verify that all the ports received a unique IP address? The difference in behaviour might be that PostgreSQL does not allow the update of a record whose value was changed by another thread, while MySQL maybe allows this. If so, a few of the ports will probably share the same IP address.

Revision history for this message
Ante Karamatić (ivoks) wrote :

On 06.02.2013 14:13, Spatialist wrote:

> Ante, thank you, that is helpful information! Do you use a developer version of quantum or the regular install packages?

I've used default packages from Ubuntu 12.04.

> In any case, can you verify that all the ports received a unique IP address? The difference in behaviour might be that PostgreSQL does not allow the update of a record whose value was changed by another thread, while MySQL maybe allows this. If so, a few of the ports will probably share the same IP address.

I'll have to check the IPs, because I haven't looked at them in detail. I was just about to reinstall Folsom, so I'll re-check with Folsom today or tomorrow.

--
Ante Karamatic <email address hidden>
Professional and Engineering Services
Canonical Ltd

Revision history for this message
Spatialist (fsluiter) wrote :

Great, thanks!


Revision history for this message
Ante Karamatić (ivoks) wrote :

I was unable to reproduce the problem with Folsom, and in the previous test on Essex all the IPs for the ports were unique. All tests were done with the MySQL backend.

Revision history for this message
Ante Karamatić (ivoks) wrote :

Reproducible with pgsql:

| fixed_ips | {"subnet_id": "3fb0124e-cf87-4ccd-a777-39f8d6f141e6", "ip_address": "10.0.0.3"} |
| fixed_ips | {"subnet_id": "3fb0124e-cf87-4ccd-a777-39f8d6f141e6", "ip_address": "10.0.0.4"} |
| fixed_ips | {"subnet_id": "3fb0124e-cf87-4ccd-a777-39f8d6f141e6", "ip_address": "10.0.0.5"} |
| fixed_ips | {"subnet_id": "3fb0124e-cf87-4ccd-a777-39f8d6f141e6", "ip_address": "10.0.0.6"} |
Request Failed: internal server error while processing your request.
| fixed_ips | {"subnet_id": "3fb0124e-cf87-4ccd-a777-39f8d6f141e6", "ip_address": "10.0.0.7"} |
| fixed_ips | {"subnet_id": "3fb0124e-cf87-4ccd-a777-39f8d6f141e6", "ip_address": "10.0.0.8"} |
| fixed_ips | {"subnet_id": "3fb0124e-cf87-4ccd-a777-39f8d6f141e6", "ip_address": "10.0.0.9"} |
| fixed_ips | {"subnet_id": "3fb0124e-cf87-4ccd-a777-39f8d6f141e6", "ip_address": "10.0.0.10"} |
Request Failed: internal server error while processing your request.
| fixed_ips | {"subnet_id": "3fb0124e-cf87-4ccd-a777-39f8d6f141e6", "ip_address": "10.0.0.11"} |
| fixed_ips | {"subnet_id": "3fb0124e-cf87-4ccd-a777-39f8d6f141e6", "ip_address": "10.0.0.12"} |
| fixed_ips | {"subnet_id": "3fb0124e-cf87-4ccd-a777-39f8d6f141e6", "ip_address": "10.0.0.13"} |
Request Failed: internal server error while processing your request.

Revision history for this message
Ante Karamatić (ivoks) wrote :

Dan, if you don't mind, I'll take over this one. I already have a solution, just doing some testing before I commit it.

Changed in quantum:
assignee: dan wendlandt (danwent) → Ante Karamatić (ivoks)
Revision history for this message
Ante Karamatić (ivoks) wrote :

@Spatialist, the attached patch solves the problem on my end. Could you test it and confirm that it solves the problem on your end as well and doesn't produce any regressions?

On Ubuntu, db_base_plugin_v2.py is located in /usr/lib/python2.7/dist-packages/quantum/db. Make sure you locate it within your distribution. Patch it:

sudo patch -p0 -i lp-1110807.patch
sudo restart quantum-server

and then try reproducing the problem

If everything works out, I'll submit it for review.

Thanks

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to quantum (master)

Fix proposed to branch: master
Review: https://review.openstack.org/21424

Changed in quantum:
status: New → In Progress
Revision history for this message
Spatialist (fsluiter) wrote :

Hi Ante,

yes this fixes it! I was so happy, I tested it 10 times ;-)
I can now start 96 VMs in one go, and kill them again when done without errors.

Maybe this patch can also be applied to Folsom 2012.2.3 and 2012.2.4, as it fixes a race condition that more people might encounter. Some Linux distributions will support Folsom for a long time, so it can affect many people.

Thanks for the quick fix.

Gary Kotton (garyk)
tags: added: folsom-backport-potential
Akihiro Motoki (amotoki)
tags: removed: critical
Changed in quantum:
assignee: Ante Karamatić (ivoks) → Gary Kotton (garyk)
dan wendlandt (danwent)
Changed in quantum:
milestone: none → grizzly-rc1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to quantum (master)

Reviewed: https://review.openstack.org/21424
Committed: http://github.com/openstack/quantum/commit/b6722af476eb7b4b54f7b29e626568c3ac1c7ede
Submitter: Jenkins
Branch: master

commit b6722af476eb7b4b54f7b29e626568c3ac1c7ede
Author: Ante Karamatic <email address hidden>
Date: Thu Feb 7 12:33:49 2013 +0100

    Lock tables for update on allocation/deletion

    Allocating, creating and deleting port might happen
    in parallel and we need to make sure we don't
    assign same IP to multiple different requests.

    Added treatment for vlan tags and tunnel ID's

    Fixes: bug #1110807

    Change-Id: Idbb04d3ce6eacd308b05536f1942a35a0792199e
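The actual diff is in the review linked above. As a rough, self-contained sketch of the pattern the commit message describes (hypothetical stand-in models and modern SQLAlchemy spelling, not the real patch), the availability-range row is read under a FOR UPDATE lock, the range is shrunk, and the allocation is recorded, all before the lock is released at commit:

# Rough sketch of the "lock for update on allocation" pattern; hypothetical
# stand-in models, not the actual patch (see the review link above for that).
import netaddr
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class AvailabilityRange(Base):                 # stands in for ipavailabilityranges
    __tablename__ = 'availability_ranges'
    id = Column(Integer, primary_key=True)
    subnet_id = Column(String(36))
    first_ip = Column(String(64))
    last_ip = Column(String(64))


class IPAllocation(Base):                      # stands in for ipallocations
    __tablename__ = 'ip_allocations'
    port_id = Column(String(36), primary_key=True)
    subnet_id = Column(String(36))
    ip_address = Column(String(64))


def allocate_ip(session, subnet_id, port_id):
    # The FOR UPDATE row lock is held until commit, so concurrent allocators
    # serialize on the range row instead of both seeing the same free address.
    range_ = (session.query(AvailabilityRange)
              .filter_by(subnet_id=subnet_id)
              .with_for_update()               # with_lockmode('update') on Folsom-era SQLAlchemy
              .first())
    ip = range_.first_ip
    if ip == range_.last_ip:
        session.delete(range_)                 # range exhausted, drop the row
    else:
        range_.first_ip = str(netaddr.IPAddress(ip) + 1)
    session.add(IPAllocation(port_id=port_id, subnet_id=subnet_id, ip_address=ip))
    session.commit()                           # releases the row lock
    return ip


engine = create_engine('sqlite://')            # stand-in; the report uses PostgreSQL
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add(AvailabilityRange(subnet_id='subnet-1',
                              first_ip='10.0.0.2', last_ip='10.0.0.254'))
session.commit()
print(allocate_ip(session, 'subnet-1', 'port-1'))   # 10.0.0.2
print(allocate_ip(session, 'subnet-1', 'port-2'))   # 10.0.0.3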

Changed in quantum:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to quantum (stable/folsom)

Fix proposed to branch: stable/folsom
Review: https://review.openstack.org/23900

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to quantum (stable/folsom)

Reviewed: https://review.openstack.org/23900
Committed: http://github.com/openstack/quantum/commit/5a2ef81430a5c91feffc244382657c23b4db57d1
Submitter: Jenkins
Branch: stable/folsom

commit 5a2ef81430a5c91feffc244382657c23b4db57d1
Author: Ante Karamatic <email address hidden>
Date: Thu Feb 7 12:33:49 2013 +0100

    Lock tables for update on allocation/deletion

    Allocating, creating and deleting port might happen
    in parallel and we need to make sure we don't
    assign same IP to multiple different requests.

    Added treatment for vlan tags and tunnel ID's

    Fixes: bug #1110807

    Change-Id: Idbb04d3ce6eacd308b05536f1942a35a0792199e

tags: added: in-stable-folsom
Thierry Carrez (ttx)
Changed in quantum:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in quantum:
milestone: grizzly-rc1 → 2013.1
Alan Pevec (apevec)
tags: removed: folsom-backport-potential in-stable-folsom