[SRU] cinder rbd calls block eventlet threads

Bug #1678275 reported by Matt Rae
Affects               Status   Importance  Assigned to  Milestone
Ubuntu Cloud Archive  Invalid  Undecided   Unassigned
  Kilo                Invalid  High        Unassigned

Bug Description

[Impact]

When cinder-volume's rbd driver makes a call out to rbd, it does not yield to eventlet, blocking all other processing. While that call runs, any pending requests sit unacknowledged in the rabbit queue until the current rbd task completes. The result is a cloud that appears unresponsive to the user, with actions such as instance creation failing because nova times out waiting on cinder.
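
The blocking behaviour is easy to demonstrate outside of cinder. The following is a minimal, illustrative sketch (not code from the cinder rbd driver): a plain time.sleep() stands in for a long librbd call that never yields to the eventlet hub, so a concurrently spawned greenthread stalls; routing the same call through eventlet.tpool.execute() runs it in a native thread and lets the other greenthread keep running.

import time
import eventlet
from eventlet import tpool

def heartbeat():
    # Stands in for the RPC/heartbeat processing that should keep running.
    for _ in range(10):
        print("heartbeat", time.strftime("%H:%M:%S"))
        eventlet.sleep(1)

def slow_native_call():
    # Simulates a long librados/librbd call, e.g. deleting a large volume.
    time.sleep(5)

hb = eventlet.spawn(heartbeat)
eventlet.sleep(0)                 # let the heartbeat greenthread start
slow_native_call()                # blocks the hub: heartbeats stop for 5 seconds
tpool.execute(slow_native_call)   # runs in a native thread: heartbeats continue
hb.wait()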

[Test Case]

Steps to reproduce:
1: Create a volume that will take more than an instant to delete.
2: Delete the volume
3: Immediately attempt to create some volumes

Expected results:
Volumes are created in a timely manner and become available.
The volume delete is processed in parallel and finishes without blocking the creations.

[Regression Potential]

This patch moves all rados calls to a separate python thread so that they do not block the eventlet loop.
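
For reference, the general shape of that change is to route librados/librbd calls through eventlet's native thread pool. The snippet below is only an illustrative sketch of the technique, not the patch itself; the helper name and arguments are invented for the example.

import rados
from eventlet import tpool

def connect_to_rados(conffile, pool_name):
    # Illustrative helper: wrapping the client and ioctx in tpool.Proxy makes
    # every method call on them execute in a native worker thread, so the
    # calling greenthread yields to the eventlet hub while waiting.
    client = tpool.Proxy(rados.Rados(conffile=conffile))
    client.connect()
    ioctx = tpool.Proxy(client.open_ioctx(pool_name))
    return client, ioctx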

[Original Description LP: #1401335]

When cinder-volume's rbd driver makes a call out to rbd, it does not yield to eventlet, thus blocking all other processing.
When this happens, any pending requests are stuck unacknowledged in the rabbit queue until the current rbd task completes. This results in an unresponsive cloud being presented to the user, and actions such as instance creation fail due to nova timing out waiting on cinder.

Requirements to reproduce:
1: Ceph set up with an rbd backend. A bundle is attached which was used to reproduce, using 3 Ceph OSDs.
2: A single cinder-volume worker, to prevent the distributed nature of the service from masking the problem.
3: A method of creating a large volume and writing data to it.
4: A method of slowing down the rbd deletion speed.

Steps to reproduce:
1: Create a 100G volume

2: Attach the volume to an instance and fill the volume with random data
fdisk /dev/vdb
mkfs.ext4 /dev/vdb
mkdir /data
mount /dev/vdb /data
cd /data
dd if=/dev/urandom of=bigfile count=90000 bs=10M

3: Add a blkio cgroup on each Ceph OSD host limiting read/write to 1M/s, then restart ceph
mkdir -p /sys/fs/cgroup/blkio
mount -t cgroup -o blkio none /sys/fs/cgroup/blkio
mkdir -p /sys/fs/cgroup/blkio/limit1M/
# replace 8:0 with the major:minor number of the OSD disk.
echo "8:0 1000000" > /sys/fs/cgroup/blkio/limit1M/blkio.throttle.write_bps_device
echo "8:0 1000000" > /sys/fs/cgroup/blkio/limit1M/blkio.throttle.read_bps_device
# update ceph-osd upstart job to run ceph using the cgroup we created
# edit /etc/init/ceph-osd.conf changing exec line to the following
exec cgexec -g blkio:limit1M /usr/bin/ceph-osd --cluster="${cluster:-ceph}" -i "$id" -f
# restart ceph-osd
service ceph-osd-all restart

4: Delete the volume

5: Immediately attempt to create some volumes

6: Verify that cinder-volume has status 'up' in cinder service-list

Expected results:
Volumes are created in a timely manner and become available.
The volume delete is processed in parallel and finishes without blocking the creations.

cinder service-list
+------------------+--------------------+------+---------+-------+----------------------------+-----------------+
| Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+------------------+--------------------+------+---------+-------+----------------------------+-----------------+
| cinder-scheduler | cinder | nova | enabled | up | 2017-05-03T23:41:31.000000 | - |
| cinder-volume | cinder@cinder-ceph | nova | enabled | up | 2017-05-03T23:41:36.000000 | - |
+------------------+--------------------+------+---------+-------+----------------------------+-----------------+

Actual results:
Volume creations are processed only after the delete has finished.
The volume delete blocks the eventlet threads and must be processed first.

cinder service-list
+------------------+--------------------+------+---------+-------+----------------------------+-----------------+
| Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+------------------+--------------------+------+---------+-------+----------------------------+-----------------+
| cinder-scheduler | cinder | nova | enabled | up | 2017-05-03T23:39:41.000000 | - |
| cinder-volume | cinder@cinder-ceph | nova | enabled | down | 2017-05-03T23:38:22.000000 | - |
+------------------+--------------------+------+---------+-------+----------------------------+-----------------+

As RBD commands consume a fair amount of CPU time, we should not simply background them: that would present a DoS risk to the cinder-volume hosts.
One possible way to fix this would be to implement at least two queues that control the spawning of threads, reserving x of y threads for time-sensitive, fast tasks, as in the sketch below.
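
A rough illustration of that idea (not something implemented in cinder): two native thread pools, one sized for slow operations such as deletes and one reserved for fast, time-sensitive work, with a trivial dispatcher choosing between them. The names and pool sizes are invented for the example.

from concurrent.futures import ThreadPoolExecutor

slow_tasks = ThreadPoolExecutor(max_workers=2)   # long-running work, e.g. volume deletes
fast_tasks = ThreadPoolExecutor(max_workers=8)   # time-sensitive work, e.g. creates

def submit(task, *args, slow=False):
    # Hypothetical dispatcher: callers flag long-running RBD operations so they
    # can never occupy the workers reserved for fast tasks.
    pool = slow_tasks if slow else fast_tasks
    return pool.submit(task, *args)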

Matt Rae (mattrae)
summary: - cinder rbd calls block eventlet threads
+ [SRU] cinder rbd calls block eventlet threads
tags: added: sts-sponsor
Matt Rae (mattrae)
tags: removed: sts-sponsor
Felipe Reyes (freyes)
tags: added: sts
Matt Rae (mattrae)
description: updated
Matt Rae (mattrae)
tags: added: sts-sru-needed
James Page (james-page)
Changed in cloud-archive:
status: New → Invalid
Edward Hope-Morley (hopem) wrote :

Moved to the original bug for SRU submission - https://bugs.launchpad.net/cinder/+bug/1401335
