Volume can't be deleted if tgt has had a reconnect.

Bug #1159948 reported by Vish Ishaya
This bug affects 2 people

Affects   Status         Importance   Assigned to   Milestone
Cinder    Fix Released   High         Vish Ishaya
Folsom    Fix Released   High         Vish Ishaya

Bug Description

During heavy read/write operations, we are seeing reconnect issues from the connecting machine when using tgt:

Mar 24 20:06:24 node009-cont001 iscsid: Kernel reported iSCSI connection 44:0 error (1021 - ISCSI_ERR_SCSI_EH_SESSION_RST: Session was dropped as a result of SCSI error recovery) state (3)

The system seems to recover correctly but tgt still thinks there is a connection open:

Target 2: iqn.2010-10.org.openstack:volume-23603703-c801-41c7-b357-ab8e802033af
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
        I_T nexus: 2
            Initiator: iqn.1993-08.org.debian:01:19cd983643f alias: node009-cont001
            Connection: 0
                IP Address: 172.18.1.9
        I_T nexus: 3
            Initiator: iqn.1993-08.org.debian:01:19cd983643f alias: node009-cont001
            Connection: 0
                IP Address: 172.18.1.9
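
As a rough illustration of how the stale session could be spotted automatically, the sketch below scans text shaped like the tgt-admin --show output above and reports any initiator that appears under more than one I_T nexus of the same target. The function name, the regular expressions, and the SAMPLE text are assumptions made for this example, based only on the output quoted in this report; this is not code from Cinder or tgt.

```python
import re
from collections import Counter

# Patterns assumed from the output format quoted in this bug report.
TARGET_RE = re.compile(r"^Target (\d+): (\S+)")
INITIATOR_RE = re.compile(r"^\s*Initiator: (\S+)")

SAMPLE = """\
Target 2: iqn.2010-10.org.openstack:volume-23603703-c801-41c7-b357-ab8e802033af
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
        I_T nexus: 2
            Initiator: iqn.1993-08.org.debian:01:19cd983643f alias: node009-cont001
            Connection: 0
                IP Address: 172.18.1.9
        I_T nexus: 3
            Initiator: iqn.1993-08.org.debian:01:19cd983643f alias: node009-cont001
            Connection: 0
                IP Address: 172.18.1.9
"""


def find_duplicate_initiators(show_output):
    """Map each target IQN to the initiators it lists more than once."""
    counts = {}
    current = None
    for line in show_output.splitlines():
        match = TARGET_RE.match(line)
        if match:
            # Start counting initiators for a new target section.
            current = match.group(2)
            counts[current] = Counter()
            continue
        match = INITIATOR_RE.match(line)
        if match and current is not None:
            counts[current][match.group(1)] += 1
    # Keep only targets that have at least one repeated initiator.
    return {
        target: {ini: n for ini, n in seen.items() if n > 1}
        for target, seen in counts.items()
        if any(n > 1 for n in seen.values())
    }


duplicates = find_duplicate_initiators(SAMPLE)
```

A non-empty result for a target suggests that tgt still holds a leftover session for that initiator, which matches the symptom described here.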

This means the volume can't be deleted correctly:

2013-03-24 21:49:03 DEBUG cinder.volume.manager [req-dd6b1079-a9f0-45e6-8c3c-12bc6be46007 533801293fbe4b528db3a01ff70c2cc0 d3c7b21c21844e4e862abc894ef0b14c] volume volume-23603703-c801-41c7-b357-ab8e802033af: removing export delete_volume /usr/lib/python2.7/dist-packages/cinder/volume/manager.py:192
2013-03-24 21:49:03 DEBUG cinder.utils [req-dd6b1079-a9f0-45e6-8c3c-12bc6be46007 533801293fbe4b528db3a01ff70c2cc0 d3c7b21c21844e4e862abc894ef0b14c] Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --show execute /usr/lib/python2.7/dist-packages/cinder/utils.py:163
2013-03-24 21:49:03 INFO cinder.volume.iscsi [req-dd6b1079-a9f0-45e6-8c3c-12bc6be46007 533801293fbe4b528db3a01ff70c2cc0 d3c7b21c21844e4e862abc894ef0b14c] Removing volume: 23603703-c801-41c7-b357-ab8e802033af
2013-03-24 21:49:03 DEBUG cinder.utils [req-dd6b1079-a9f0-45e6-8c3c-12bc6be46007 533801293fbe4b528db3a01ff70c2cc0 d3c7b21c21844e4e862abc894ef0b14c] Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --delete iqn.2010-10.org.openstack:volume-23603703-c801-41c7-b357-ab8e802033af execute /usr/lib/python2.7/dist-packages/cinder/utils.py:163
2013-03-24 21:49:03 DEBUG cinder.volume.manager [req-dd6b1079-a9f0-45e6-8c3c-12bc6be46007 533801293fbe4b528db3a01ff70c2cc0 d3c7b21c21844e4e862abc894ef0b14c] volume volume-23603703-c801-41c7-b357-ab8e802033af: deleting delete_volume /usr/lib/python2.7/dist-packages/cinder/volume/manager.py:194
2013-03-24 21:49:03 DEBUG cinder.utils [req-dd6b1079-a9f0-45e6-8c3c-12bc6be46007 533801293fbe4b528db3a01ff70c2cc0 d3c7b21c21844e4e862abc894ef0b14c] Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf lvdisplay vg.nebula.openstack/volume-23603703-c801-41c7-b357-ab8e802033af execute /usr/lib/python2.7/dist-packages/cinder/utils.py:163
2013-03-24 21:49:03 DEBUG cinder.utils [req-dd6b1079-a9f0-45e6-8c3c-12bc6be46007 533801293fbe4b528db3a01ff70c2cc0 d3c7b21c21844e4e862abc894ef0b14c] Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf lvdisplay --noheading -C -o Attr vg.nebula.openstack/volume-23603703-c801-41c7-b357-ab8e802033af execute /usr/lib/python2.7/dist-packages/cinder/utils.py:163
2013-03-24 21:49:03 DEBUG cinder.utils [req-dd6b1079-a9f0-45e6-8c3c-12bc6be46007 533801293fbe4b528db3a01ff70c2cc0 d3c7b21c21844e4e862abc894ef0b14c] Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf dmsetup remove -f /dev/mapper/vg.nebula.openstack-volume--23603703--c801--41c7--b357--ab8e802033af execute /usr/lib/python2.7/dist-packages/cinder/utils.py:163
2013-03-24 21:49:03 DEBUG cinder.utils [req-dd6b1079-a9f0-45e6-8c3c-12bc6be46007 533801293fbe4b528db3a01ff70c2cc0 d3c7b21c21844e4e862abc894ef0b14c] Result was 1 execute /usr/lib/python2.7/dist-packages/cinder/utils.py:180
2013-03-24 21:49:03 ERROR cinder.volume.driver [req-dd6b1079-a9f0-45e6-8c3c-12bc6be46007 533801293fbe4b528db3a01ff70c2cc0 d3c7b21c21844e4e862abc894ef0b14c] Recovering from a failed execute. Try number 1
2013-03-24 21:49:03 19996 TRACE cinder.volume.driver Traceback (most recent call last):
2013-03-24 21:49:03 19996 TRACE cinder.volume.driver File "/usr/lib/python2.7/dist-packages/cinder/volume/driver.py", line 105, in _try_execute
2013-03-24 21:49:03 19996 TRACE cinder.volume.driver self._execute(*command, **kwargs)
2013-03-24 21:49:03 19996 TRACE cinder.volume.driver File "/usr/lib/python2.7/dist-packages/cinder/utils.py", line 187, in execute
2013-03-24 21:49:03 19996 TRACE cinder.volume.driver cmd=' '.join(cmd))
2013-03-24 21:49:03 19996 TRACE cinder.volume.driver ProcessExecutionError: Unexpected error while running command.
2013-03-24 21:49:03 19996 TRACE cinder.volume.driver Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf dmsetup remove -f /dev/mapper/vg.nebula.openstack-volume--23603703--c801--41c7--b357--ab8e802033af
2013-03-24 21:49:03 19996 TRACE cinder.volume.driver Exit code: 1
2013-03-24 21:49:03 19996 TRACE cinder.volume.driver Stdout: ''
2013-03-24 21:49:03 19996 TRACE cinder.volume.driver Stderr: 'device-mapper: remove ioctl on vg.nebula.openstack-volume--23603703--c801--41c7--b357--ab8e802033af failed: Device or resource busy\nCommand failed\n'
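
The "Recovering from a failed execute. Try number 1" line indicates that the failed command is retried. A minimal sketch of that kind of retry loop is shown below; the names try_execute and CommandFailed, the attempt count, and the quadratic sleep are assumptions for illustration and are not taken from the Cinder source. Note that retrying alone cannot help in this bug, because the device stays busy for as long as tgt holds the stale session.

```python
import time


class CommandFailed(Exception):
    """Hypothetical error type standing in for a failed subprocess call."""

    def __init__(self, exit_code, stderr):
        super().__init__(f"exit {exit_code}: {stderr}")
        self.exit_code = exit_code
        self.stderr = stderr


def try_execute(run, cmd, attempts=3, sleep=time.sleep):
    """Call run(cmd), retrying on CommandFailed up to `attempts` times.

    `run` is any callable that executes the command and raises
    CommandFailed on a non-zero exit code. `sleep` is injectable so the
    loop can be exercised without real delays.
    """
    for attempt in range(1, attempts + 1):
        try:
            return run(cmd)
        except CommandFailed:
            if attempt == attempts:
                # Out of retries: surface the last failure to the caller.
                raise
            # Back off a little longer after each failed attempt.
            sleep(attempt ** 2)
```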

Vish Ishaya (vishvananda) wrote :

A workaround for this issue is to run the delete with --force:

sudo cinder-rootwrap /etc/cinder/rootwrap.conf tgt-admin --force --delete iqn.2010-10.org.openstack:volume-23603703-c801-41c7-b357-ab8e802033af
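
For scripted cleanup, the same command line can be assembled as an argument list. The helper below is a hypothetical sketch (build_tgt_delete_cmd and its parameters are invented for this example, not part of Cinder); it only builds the list and does not run anything. The rootwrap prefix and the --force and --delete flags are copied from the command above.

```python
# Prefix copied from the command quoted in this report.
ROOTWRAP_PREFIX = ["sudo", "cinder-rootwrap", "/etc/cinder/rootwrap.conf"]


def build_tgt_delete_cmd(iqn, force=True, prefix=ROOTWRAP_PREFIX):
    """Return the argv list for deleting a tgt target by IQN.

    With force=True the list includes --force, which is the workaround
    described in this bug for targets with a stale initiator session.
    """
    cmd = list(prefix) + ["tgt-admin"]
    if force:
        cmd.append("--force")
    cmd.extend(["--delete", iqn])
    return cmd


cmd = build_tgt_delete_cmd(
    "iqn.2010-10.org.openstack:volume-23603703-c801-41c7-b357-ab8e802033af"
)
```

The resulting list can be passed to a process runner such as subprocess.run(cmd, check=True) on a host where tgt-admin and cinder-rootwrap are installed.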

tags: added: folsom-backport-potential grizzly-rc-potential
Changed in cinder:
status: New → Confirmed
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/25321

Changed in cinder:
assignee: nobody → Vish Ishaya (vishvananda)
status: Confirmed → In Progress
Vish Ishaya (vishvananda) wrote :

The upstream tgt issue is reported here: https://github.com/fujita/tgt/issues/3

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (milestone-proposed)

Fix proposed to branch: milestone-proposed
Review: https://review.openstack.org/25324

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/folsom)

Fix proposed to branch: stable/folsom
Review: https://review.openstack.org/25325

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/25321
Committed: http://github.com/openstack/cinder/commit/b84218633ab5417adf94c568eb54243f4074ab11
Submitter: Jenkins
Branch: master

commit b84218633ab5417adf94c568eb54243f4074ab11
Author: Vishvananda Ishaya <email address hidden>
Date: Mon Mar 25 12:46:51 2013 -0700

    Force deletes using tgt to workaround bug 1159948

    Tgt has a bug where it can have multiple copies of an initiator
    if there has been a reconnect.

    See https://bugs.launchpad.net/cinder/+bug/1159948

    Change-Id: I9a1b6757eb780efbaa1403016e50de7c0e45d720

Changed in cinder:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/folsom)

Reviewed: https://review.openstack.org/25325
Committed: http://github.com/openstack/cinder/commit/ad2ddddba50a5d822ddde78782142297f992b408
Submitter: Jenkins
Branch: stable/folsom

commit ad2ddddba50a5d822ddde78782142297f992b408
Author: Vishvananda Ishaya <email address hidden>
Date: Mon Mar 25 12:46:51 2013 -0700

    Force deletes using tgt to workaround bug 1159948

    Tgt has a bug where it can have multiple copies of an initiator
    if there has been a reconnect.

    See https://bugs.launchpad.net/cinder/+bug/1159948

    Change-Id: I9a1b6757eb780efbaa1403016e50de7c0e45d720
    (cherry picked from commit b84218633ab5417adf94c568eb54243f4074ab11)

Thierry Carrez (ttx)
no longer affects: cinder/grizzly
Thierry Carrez (ttx)
Changed in cinder:
milestone: none → grizzly-rc3
tags: removed: folsom-backport-potential grizzly-rc-potential
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (milestone-proposed)

Reviewed: https://review.openstack.org/25324
Committed: http://github.com/openstack/cinder/commit/4645123cadc441ed138355d94823310255f548ba
Submitter: Jenkins
Branch: milestone-proposed

commit 4645123cadc441ed138355d94823310255f548ba
Author: Vishvananda Ishaya <email address hidden>
Date: Mon Mar 25 12:46:51 2013 -0700

    Force deletes using tgt to workaround bug 1159948

    Tgt has a bug where it can have multiple copies of an initiator
    if there has been a reconnect.

    See https://bugs.launchpad.net/cinder/+bug/1159948

    Change-Id: I9a1b6757eb780efbaa1403016e50de7c0e45d720
    (cherry picked from commit b84218633ab5417adf94c568eb54243f4074ab11)

Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: grizzly-rc3 → 2013.1