rbd: timeout when creating a volume from a snapshot

Bug #1658037 reported by zhangsong
This bug affects 2 people
Affects   Status         Importance   Assigned to   Milestone
Cinder    Fix Released   Undecided    zhangsong

Bug Description

An error is raised in cinder when I boot VMs from a VM snapshot.
I use rbd as the cinder backend; here is the ceph-related config:

[ceph1]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_flatten_volume_from_snapshot = true
rbd_max_clone_depth = 5
rbd_store_chunk_size = 4
rados_connect_timeout = -1
volume_backend_name = ceph1

Error info in cinder-volume:
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server [-] Can not acknowledge message. Skip processing
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 126, in _process_incoming
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server message.acknowledge()
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 119, in acknowledge
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server self.message.acknowledge()
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 251, in acknowledge
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server self._raw_message.ack()
......
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server return self._send_loop(self.fd.send, data, flags)
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 366, in _send_loop
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server return send_method(data, *args)
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server error: [Errno 32] Broken pipe
2017-01-20 17:09:35.812 14085 ERROR oslo_messaging.rpc.server
2017-01-20 17:09:35.838 14085 DEBUG cinder.volume.flows.manager.create_volume [req-3bbc145a-7453-4f9b-885b-04d6ebfdbf75 1e7c6b8b585e44fb9b2aacc2d0c52f78 fe48a5e0b8e1439abb423fb4a025d064 - default default] Marking volume 898121a2-3251-42bb-afac-4f2d51210a83 as bootable. _enable_bootable_flag /usr/lib/python2.7/site-packages/cinder/volume/flows/manager/create_volume.py:464

This error occurs only when rbd_flatten_volume_from_snapshot is set to true. It always means that the heartbeat between rabbitmq-server and cinder-volume is being missed.

The RBD driver flattens the volume when rbd_flatten_volume_from_snapshot is set. We shouldn't run this time-consuming operation in a greenthread, because it blocks the eventlet loop for as long as the flatten takes. Using a tpool proxy instead may be a good approach.
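
To illustrate why a blocking call starves the heartbeat, and why moving it to a native thread helps, here is a minimal sketch. It uses the standard library's asyncio as a stand-in for eventlet so that it runs without extra packages; in eventlet the corresponding tools would be eventlet.tpool.execute or eventlet.tpool.Proxy. The names blocking_flatten, heartbeat and max_heartbeat_gap are hypothetical, time.sleep only imitates a long librbd call, and none of this is the code from the proposed fix.

```python
import asyncio
import time


def blocking_flatten(seconds):
    """Hypothetical stand-in for a long, non-cooperative call such as a flatten."""
    time.sleep(seconds)  # blocks the calling OS thread and never yields to the loop
    return "flattened"


async def heartbeat(ticks, interval):
    """Record a timestamp every `interval` seconds, like a periodic heartbeat."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def _run(offload, work_seconds, interval):
    ticks = []
    hb = asyncio.ensure_future(heartbeat(ticks, interval))
    await asyncio.sleep(interval * 2)  # let a few heartbeats go out first
    loop = asyncio.get_running_loop()
    if offload:
        # Run the blocking call in a worker thread; the loop keeps scheduling.
        result = await loop.run_in_executor(None, blocking_flatten, work_seconds)
    else:
        # Run the blocking call on the loop's own thread; nothing else can run.
        result = blocking_flatten(work_seconds)
    await asyncio.sleep(interval * 2)  # let the heartbeat resume
    hb.cancel()
    gaps = [later - earlier for earlier, later in zip(ticks, ticks[1:])]
    return result, max(gaps)


def max_heartbeat_gap(offload, work_seconds=0.5, interval=0.05):
    """Return (result, longest pause between two consecutive heartbeats)."""
    return asyncio.run(_run(offload, work_seconds, interval))


if __name__ == "__main__":
    print("inline:   ", max_heartbeat_gap(False))
    print("offloaded:", max_heartbeat_gap(True))
```

When the call runs inline, the longest gap between heartbeats is at least as long as the blocking call itself; when it is offloaded, the gap stays close to the heartbeat interval. A broker that expects heartbeats more often than the flatten takes would drop the connection in the first case.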

This is similar to https://review.openstack.org/#/c/175555/

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/423184

Changed in cinder:
assignee: nobody → zhangsong (zhangsong)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/423184
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=8f2b7f4f20727d35d0dce25fddb5a879f26c9614
Submitter: Jenkins
Branch: master

commit 8f2b7f4f20727d35d0dce25fddb5a879f26c9614
Author: zhangsong <email address hidden>
Date: Fri Jan 20 18:17:28 2017 +0800

    RBD:Move RBDVolume calls to a separate threads

    RBD is a python binding for librados which isn't patched by eventlet.

    Time-consuming operations like flattening and copying volumes
    blocks eventlet loop and all cinder-volume service hangs
    until it finished. It makes cinder-volume services unavailable for
    a while.

    This patch moves all RBDVolume calls to a separate python thread which
    doesn't block eventlet loop.

    Change-Id: Id3f2d48428d74011ba690141cee3afad8e48c52f
    Closes-Bug: #1658037
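
The idea in the commit message above, routing every call on a volume object through a separate thread, can be sketched as a small forwarding proxy. This is an assumption-laden illustration using only the standard library: FakeRBDVolume and ThreadProxy are hypothetical names, FakeRBDVolume does not reproduce the real rbd bindings, and the merged patch itself is not shown here. Note that in this plain-thread version the caller still waits on the result; under eventlet, a tpool-based proxy would suspend only the calling greenthread while other greenthreads keep running.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


class FakeRBDVolume:
    """Hypothetical stand-in for a blocking image handle (not the real rbd API)."""

    def __init__(self, name):
        self.name = name

    def flatten(self):
        # Report which OS thread actually performed the work.
        return threading.current_thread().name


class ThreadProxy:
    """Forward every method call on `target` to a thread in `executor`."""

    def __init__(self, target, executor):
        self._target = target
        self._executor = executor

    def __getattr__(self, attr):
        # Only reached for names not found on the proxy itself.
        value = getattr(self._target, attr)
        if not callable(value):
            return value  # plain attributes are passed through unchanged

        def call_in_thread(*args, **kwargs):
            # Submit the real method to the pool and wait for its result.
            return self._executor.submit(value, *args, **kwargs).result()

        return call_in_thread


def flatten_via_proxy():
    """Call flatten() through the proxy; return (worker thread name, volume name)."""
    with ThreadPoolExecutor(max_workers=1, thread_name_prefix="rbd-worker") as pool:
        volume = ThreadProxy(FakeRBDVolume("volume-1"), pool)
        return volume.flatten(), volume.name


if __name__ == "__main__":
    print(flatten_via_proxy())
```

Wrapping the object once, rather than wrapping each call site, means callers keep using the same method names while the thread hand-off happens in one place.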

Changed in cinder:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/464930

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/ocata)

Reviewed: https://review.openstack.org/464930
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=23d261c8512236b7669b38fc5301bd14b5682c76
Submitter: Jenkins
Branch: stable/ocata

commit 23d261c8512236b7669b38fc5301bd14b5682c76
Author: zhangsong <email address hidden>
Date: Fri Jan 20 18:17:28 2017 +0800

    RBD:Move RBDVolume calls to a separate threads

    RBD is a python binding for librados which isn't patched by eventlet.

    Time-consuming operations like flattening and copying volumes
    blocks eventlet loop and all cinder-volume service hangs
    until it finished. It makes cinder-volume services unavailable for
    a while.

    This patch moves all RBDVolume calls to a separate python thread which
    doesn't block eventlet loop.

    Change-Id: Id3f2d48428d74011ba690141cee3afad8e48c52f
    Closes-Bug: #1658037
    (cherry picked from commit 8f2b7f4f20727d35d0dce25fddb5a879f26c9614)

tags: added: in-stable-ocata
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/467811

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/newton)

Reviewed: https://review.openstack.org/467811
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=3db5c2cc4995965686054af09296f955e0a80181
Submitter: Jenkins
Branch: stable/newton

commit 3db5c2cc4995965686054af09296f955e0a80181
Author: zhangsong <email address hidden>
Date: Fri Jan 20 18:17:28 2017 +0800

    RBD:Move RBDVolume calls to a separate threads

    RBD is a python binding for librados which isn't patched by eventlet.

    Time-consuming operations like flattening and copying volumes
    blocks eventlet loop and all cinder-volume service hangs
    until it finished. It makes cinder-volume services unavailable for
    a while.

    This patch moves all RBDVolume calls to a separate python thread which
    doesn't block eventlet loop.

    Change-Id: Id3f2d48428d74011ba690141cee3afad8e48c52f
    Closes-Bug: #1658037
    (cherry picked from commit 8f2b7f4f20727d35d0dce25fddb5a879f26c9614)

tags: added: in-stable-newton
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 11.0.0.0b2

This issue was fixed in the openstack/cinder 11.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 10.0.3

This issue was fixed in the openstack/cinder 10.0.3 release.
