[SRU] Attempting to attach the same volume multiple times can cause bdm record for existing attachment to be deleted.

Bug #1349888 reported by git-harry
This bug affects 5 people
Affects                    Status        Importance  Assigned to          Milestone
OpenStack Compute (nova)   Fix Released  High        git-harry
nova (Ubuntu)              Fix Released  High        Edward Hope-Morley
nova (Ubuntu Trusty)       Fix Released  High        Edward Hope-Morley

Bug Description

[Impact]

 * Ensure that attempting to attach an already-attached volume to a second
   instance does not interfere with the existing attachment's block device
   mapping (bdm) record.

[Test Case]

 * Create cinder volume vol1 and two instances vm1 and vm2

 * Attach vol1 to vm1 and check that the attach was successful by doing:

   - cinder list
   - nova show <vm1>

   e.g. http://paste.ubuntu.com/12314443/

 * Attach vol1 to vm2 and check that the attach fails and, crucially, that
   the first attachment is unaffected (check as above). You can also check
   the Nova db as follows:

   select * from block_device_mapping where source_type='volume' and \
       (instance_uuid='<vm1>' or instance_uuid='<vm2>');

   from which you would expect output like http://paste.ubuntu.com/12314416/,
   which shows that vol1 is still attached to vm1 and that the attach to vm2
   failed.

 * Finally, detach vol1 from vm1 and ensure that it succeeds (an example
   command transcript for these steps is sketched below).
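
   For reference, a rough command transcript of the above. Names and UUIDs
   are placeholders and exact client flags may differ between releases:

     $ cinder create --display-name vol1 1
     $ nova boot ... vm1                      # boot arguments elided
     $ nova boot ... vm2
     $ nova volume-attach vm1 <vol1-uuid>
     $ cinder list                            # vol1 should be in-use, attached to vm1
     $ nova show vm1
     $ nova volume-attach vm2 <vol1-uuid>     # expected to fail with HTTP 400
     $ cinder list                            # vol1 must still show attached to vm1
     $ nova volume-detach vm1 <vol1-uuid>     # should succeed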

[Regression Potential]

 * none

---- ---- ---- ----

Nova assumes there is only ever one bdm (block device mapping) per volume. When an attach is initiated, a new bdm is created; if the attach fails, a bdm for the volume is deleted, but it is not necessarily the one that was just created. The following steps show how this can leave a volume stuck in the detaching state.
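
To make the failure mode concrete, here is a small self-contained Python sketch (not Nova code; the table and helper names are made up) that mimics the pattern: the cleanup after a failed attach looks the bdm up by volume id and removes the first match, which can be the existing attachment's record.

    bdm_table = []  # stands in for the block_device_mapping table

    def create_bdm(instance_uuid, volume_id):
        bdm = {'instance_uuid': instance_uuid, 'volume_id': volume_id,
               'connection_info': None}
        bdm_table.append(bdm)
        return bdm

    def get_bdm_by_volume_id(volume_id):
        # First match wins, like block_device_mapping_get_by_volume_id().
        return next(b for b in bdm_table if b['volume_id'] == volume_id)

    # vol1 is already attached to vm1 and has connection_info set.
    existing = create_bdm('vm1-uuid', 'vol1-uuid')
    existing['connection_info'] = '{"driver_volume_type": "iscsi"}'

    # A second attach of vol1 to vm2 is attempted and fails; the cleanup
    # deletes the first bdm found for the volume, which is vm1's record.
    create_bdm('vm2-uuid', 'vol1-uuid')
    bdm_table.remove(get_bdm_by_volume_id('vol1-uuid'))

    print(bdm_table)
    # Only vm2's failed bdm (connection_info=None) remains; a later detach of
    # vol1 from vm1 loads that row and fails decoding connection_info, which
    # is the NoneType traceback shown below.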

$ nova list
+--------------------------------------+--------+--------+------------+-------------+------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+--------+--------+------------+-------------+------------------+
| cb5188f8-3fe1-4461-8a9d-3902f7cc8296 | test13 | ACTIVE | - | Running | private=10.0.0.2 |
+--------------------------------------+--------+--------+------------+-------------+------------------+

$ cinder list
+--------------------------------------+-----------+--------+------+-------------+----------+-------------+
| ID | Status | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+--------+------+-------------+----------+-------------+
| c1e38e93-d566-4c99-bfc3-42e77a428cc4 | available | test10 | 1 | lvm1 | false | |
+--------------------------------------+-----------+--------+------+-------------+----------+-------------+

$ nova volume-attach test13 c1e38e93-d566-4c99-bfc3-42e77a428cc4
+----------+--------------------------------------+
| Property | Value |
+----------+--------------------------------------+
| device | /dev/vdb |
| id | c1e38e93-d566-4c99-bfc3-42e77a428cc4 |
| serverId | cb5188f8-3fe1-4461-8a9d-3902f7cc8296 |
| volumeId | c1e38e93-d566-4c99-bfc3-42e77a428cc4 |
+----------+--------------------------------------+

$ cinder list
+--------------------------------------+--------+--------+------+-------------+----------+--------------------------------------+
| ID | Status | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+--------+--------+------+-------------+----------+--------------------------------------+
| c1e38e93-d566-4c99-bfc3-42e77a428cc4 | in-use | test10 | 1 | lvm1 | false | cb5188f8-3fe1-4461-8a9d-3902f7cc8296 |
+--------------------------------------+--------+--------+------+-------------+----------+--------------------------------------+

$ nova volume-attach test13 c1e38e93-d566-4c99-bfc3-42e77a428cc4
ERROR (BadRequest): Invalid volume: status must be 'available' (HTTP 400) (Request-ID: req-1fa34b54-25b5-4296-9134-b63321b0015d)

$ nova volume-detach test13 c1e38e93-d566-4c99-bfc3-42e77a428cc4

$ cinder list
+--------------------------------------+-----------+--------+------+-------------+----------+--------------------------------------+
| ID | Status | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+--------+------+-------------+----------+--------------------------------------+
| c1e38e93-d566-4c99-bfc3-42e77a428cc4 | detaching | test10 | 1 | lvm1 | false | cb5188f8-3fe1-4461-8a9d-3902f7cc8296 |
+--------------------------------------+-----------+--------+------+-------------+----------+--------------------------------------+

2014-07-29 14:47:13.952 ERROR oslo.messaging.rpc.dispatcher [req-134dfd17-14da-4de0-93fc-5d8d7bbf65a5 admin admin] Exception during message handling: <type 'NoneType'> can't be decoded
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher incoming.message))
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher result = getattr(endpoint, method)(ctxt, **new_args)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 406, in decorated_function
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/exception.py", line 88, in wrapped
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher payload)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/exception.py", line 71, in wrapped
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher return f(self, context, *args, **kw)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 291, in decorated_function
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher pass
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 277, in decorated_function
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 319, in decorated_function
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher kwargs['instance'], e, sys.exc_info())
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/openstack/common/excutils.py", line 82, in __exit__
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 307, in decorated_function
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 4363, in detach_volume
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher self._detach_volume(context, instance, bdm)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 4309, in _detach_volume
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher connection_info = jsonutils.loads(bdm.connection_info)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/openstack/common/jsonutils.py", line 176, in loads
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher return json.loads(strutils.safe_decode(s, encoding), **kwargs)
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/nova/nova/openstack/common/strutils.py", line 134, in safe_decode
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher raise TypeError("%s can't be decoded" % type(text))
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher TypeError: <type 'NoneType'> can't be decoded
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher


git-harry (git-harry)
Changed in nova:
assignee: nobody → git-harry (git-harry)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/110319

Changed in nova:
status: New → In Progress
Revision history for this message
Nikola Đipanov (ndipanov) wrote : Re: Attempting to attach the same volume multiple times can cause bdm record for existing attachment to be deleted.

Interesting. So for clarity: the bug is that BlockDeviceMapping.get_by_volume_id() (actually db_api.block_device_mapping_get_by_volume_id() under the hood) will return the first bdm it finds for the volume, even if there are more.

Changed in nova:
importance: Undecided → High
milestone: none → juno-3
Revision history for this message
git-harry (git-harry) wrote :

Yes, that is essentially it.

tags: added: icehouse-backport-potential
Changed in nova:
assignee: git-harry (git-harry) → Nikola Đipanov (ndipanov)
Changed in nova:
assignee: Nikola Đipanov (ndipanov) → git-harry (git-harry)
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-3 → juno-rc1
Revision history for this message
Michael Still (mikal) wrote :

The review linked to this bug appears to still be under active development.

Changed in nova:
assignee: git-harry (git-harry) → Nikola Đipanov (ndipanov)
Changed in nova:
assignee: Nikola Đipanov (ndipanov) → git-harry (git-harry)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/110319
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=339a97d0f2d17f531cfc79e09cd8b8bc75ce6e2a
Submitter: Jenkins
Branch: master

commit 339a97d0f2d17f531cfc79e09cd8b8bc75ce6e2a
Author: git-harry <email address hidden>
Date: Mon Aug 4 15:17:29 2014 +0100

    Fix creating bdm for failed volume attachment

    This commit modifies the reserve_block_device_name method to return the
    bdm object, when the corresponding keyword argument is True. This
    ensures the correct bdm is destroyed if the attach fails. Currently the
    code assumes only one bdm per volume and so retrieving it can cause the
    incorrect db entry to be returned.

    Change-Id: I22a6db76d2044331d1a846eb4b6d7338c50270e2
    Closes-Bug: #1349888
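
In terms of the toy sketch in the bug description above, the shape of the fix is to hold on to the bdm object created for the failed attempt and destroy exactly that record, rather than re-querying by volume id (sketch only, reusing the earlier create_bdm helper; the actual Nova change is in reserve_block_device_name and the attach error handling):

    # Continuing the toy example: vol1 is attached to vm1, a second attach
    # of vol1 to vm2 fails, and cleanup removes the bdm it just created.
    existing = create_bdm('vm1-uuid', 'vol1-uuid')
    new_bdm = create_bdm('vm2-uuid', 'vol1-uuid')
    bdm_table.remove(new_bdm)

    print(bdm_table)  # vm1's attachment record is untouched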

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-rc1 → 2014.2
Revision history for this message
Kevin Fox (kevpn) wrote : Re: Attempting to attach the same volume multiple times can cause bdm record for existing attachment to be deleted.

I have a vm that looks like it's in this state...

I do have RDO Juno on SL7 with a Ceph backend, and the patch above is already there...

in the dashboard for the instance, I see:

Volumes Attached
Attached To
    LIFT-Dev Extra Storage 2 on /dev/vdb
Attached To
    LIFT-Dev Extra Storage on /dev/vdc
Attached To
    LIFT-Dev Extra Storage on /dev/vdd

Two attachments of the same volume. In the vm, though, I only see one: /dev/vdc.

If I try to delete the attachment, it goes into 'detaching' forever.

On the compute node:
2015-01-23 11:54:15.894 17425 AUDIT nova.compute.manager [req-5c64523f-765b-441d-9559-aa10cd130ab4 None] [instance: e674bf94-a7f4-4483-bfb8-0a065f2c327f] Detach volume 7be1f689-617d-4d90-909d-2e03139a2920 from mountpoint /dev/vdc
2015-01-23 11:54:16.464 17425 INFO nova.scheduler.client.report [req-5c64523f-765b-441d-9559-aa10cd130ab4 None] Compute_service record updated for ('rcn44.cloud.pnnl.gov', 'rcn44.cloud.pnnl.gov')
2015-01-23 11:54:16.475 17425 ERROR oslo.messaging.rpc.dispatcher [req-5c64523f-765b-441d-9559-aa10cd130ab4 ] Exception during message handling: <type 'NoneType'> can't be decoded
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher incoming.message))
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args)
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher result = getattr(endpoint, method)(ctxt, **new_args)
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 414, in decorated_function
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/exception.py", line 88, in wrapped
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher payload)
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/exception.py", line 71, in wrapped
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher return f(self, context, *args, **kw)
2015-01-23 11:54:16.475 17425 TRACE oslo.messaging.rpc.dispatcher File "/usr/l...


Changed in nova (Ubuntu):
status: New → In Progress
assignee: nobody → Edward Hope-Morley (hopem)
importance: Undecided → High
Changed in nova (Ubuntu Trusty):
assignee: nobody → Edward Hope-Morley (hopem)
importance: Undecided → High
status: New → In Progress
summary: - Attempting to attach the same volume multiple times can cause bdm record
- for existing attachment to be deleted.
+ [SRU] Attempting to attach the same volume multiple times can cause bdm
+ record for existing attachment to be deleted.
description: updated
Changed in nova (Ubuntu):
status: In Progress → Fix Released
description: updated
Revision history for this message
Chris J Arges (arges) wrote : Please test proposed package

Hello git-harry, or anyone else affected,

Accepted nova into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nova/1:2014.1.5-0ubuntu1.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nova (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Edward Hope-Morley (hopem) wrote :

Deployed Trusty Icehouse with this nova version, ran the test described above and lgtm +1.

tags: added: verification-done
removed: verification-needed
Revision history for this message
Chris J Arges (arges) wrote : Update Released

The verification of the Stable Release Update for nova has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nova - 1:2014.1.5-0ubuntu1.3

---------------
nova (1:2014.1.5-0ubuntu1.3) trusty; urgency=medium

  * Attempting to attach the same volume multiple times can cause
    bdm record for existing attachment to be deleted. (LP: #1349888)
    - d/p/fix-creating-bdm-for-failed-volume-attachment.patch

 -- Edward Hope-Morley <email address hidden> Tue, 08 Sep 2015 12:32:45 +0100

Changed in nova (Ubuntu Trusty):
status: Fix Committed → Fix Released
Matt Riedemann (mriedem)
tags: removed: icehouse-backport-potential
