Rebooting instance doesn't restore mounted volume

Bug #747922 reported by Tushar Patil
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Masanori Itoh
Milestone: 2011.3

Bug Description

Tested on Revision No 925.

Steps to reproduce (a scripted sketch follows the traceback below):
1) Run one VM instance
2) Attach volume to the VM instance
3) SSH to the VM instance, mount the volume and logout from SSH
4) reboot the VM instance
5) Again SSH to the VM instance and try to mount the volume.
It fails with the error message:
{{{
Could not stat /dev/vdb --- No such file or directory

The device apparently does not exist; did you specify it correctly?
}}}

6) euca-describe-volumes still shows that the volume is attached to the VM instance and is in use.
{{{
root@ubuntu-openstack-single-server:/home/tpatil# euca-describe-volumes
VOLUME vol-00000001 1 nova in-use (admin, ubuntu-openstack-single-server, i-00000002[ubuntu-openstack-single-server], /dev/vdb) 2011-04-02T00:48:20Z
}}}

7) If I try to detach the volume, it fails with the following error in nova-compute.log:
{{{
2011-04-01 17:59:04,743 ERROR nova [-] Exception during message handling
(nova): TRACE: Traceback (most recent call last):
(nova): TRACE: File "/home/tpatil/nova/nova/rpc.py", line 190, in _receive
(nova): TRACE: rval = node_func(context=ctxt, **node_args)
(nova): TRACE: File "/home/tpatil/nova/nova/exception.py", line 120, in _wrap
(nova): TRACE: return f(*args, **kw)
(nova): TRACE: File "/home/tpatil/nova/nova/compute/manager.py", line 105, in decorated_function
(nova): TRACE: function(self, context, instance_id, *args, **kwargs)
(nova): TRACE: File "/home/tpatil/nova/nova/compute/manager.py", line 779, in detach_volume
(nova): TRACE: volume_ref['mountpoint'])
(nova): TRACE: File "/home/tpatil/nova/nova/exception.py", line 120, in _wrap
(nova): TRACE: return f(*args, **kw)
(nova): TRACE: File "/home/tpatil/nova/nova/virt/libvirt_conn.py", line 405, in detach_volume
(nova): TRACE: raise exception.NotFound(_("No disk at %s") % mount_device)
(nova): TRACE: NotFound: No disk at vdb
(nova): TRACE:
}}}
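
For reference, the reproduction above can also be scripted against nova's EC2 endpoint with boto. This is only a hypothetical sketch, not part of the report: the endpoint, credentials, image ID, keypair and volume ID are placeholders for a particular deployment, and the mount step still has to be done by hand over SSH.

{{{
# Hypothetical reproduction script (a sketch): drives the steps above through
# nova's EC2 API using boto. Endpoint, credentials, image ID and keypair name
# are placeholders.
import time
import boto
from boto.ec2.regioninfo import RegionInfo

conn = boto.connect_ec2(aws_access_key_id='ADMIN_ACCESS_KEY',
                        aws_secret_access_key='ADMIN_SECRET_KEY',
                        is_secure=False,
                        region=RegionInfo(name='nova', endpoint='127.0.0.1'),
                        port=8773, path='/services/Cloud')

# 1) Run one VM instance and wait for it to come up
instance = conn.run_instances('ami-00000001', key_name='mykey').instances[0]
while instance.update() != 'running':
    time.sleep(5)

# 2) Attach a volume to it as /dev/vdb
conn.attach_volume('vol-00000001', instance.id, '/dev/vdb')

# 3) (by hand) SSH in, mount /dev/vdb, log out

# 4) Reboot the VM instance
conn.reboot_instances([instance.id])

# 5) (by hand) SSH in again; mounting /dev/vdb now fails with
#    "No such file or directory"

# 6) The volume still shows up as in-use on the API side
for vol in conn.get_all_volumes():
    print("%s %s" % (vol.id, vol.status))

# 7) Detaching raises NotFound in nova-compute (see the traceback above)
conn.detach_volume('vol-00000001')
}}}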

Related branches

  lp:~itoumsn/nova/lp747922

Thierry Carrez (ttx)
Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
Revision history for this message
Masanori Itoh (itohm) wrote :

Hi Tushar,

You use KVM on Ubuntu, right?
Also, did you reboot the instance from inside the guest OS?
I mean, not via euca-reboot-instances.

I suspect the root cause of this issue is a KVM problem.
If the issue is reproducible, please collect the following information before and after rebooting the instance on the compute node:

  # virsh dumpxml VM_NAME
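
If it helps, the same check can be scripted with the libvirt Python bindings. A minimal sketch only; the connection URI and the domain name below are placeholders for this particular deployment (assuming nova's usual instance-XXXXXXXX naming):

{{{
# Sketch: dump the live domain XML before and after the reboot and see whether
# the attached disk's target (vdb) is still present.
import libvirt

conn = libvirt.open('qemu:///system')            # same hypervisor nova-compute uses
dom = conn.lookupByName('instance-00000002')     # placeholder domain name
xml = dom.XMLDesc(0)                             # same output as `virsh dumpxml`
print("vdb still attached: %s" % ("dev='vdb'" in xml))
}}}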

I guess the device you attached to your VM vanished from the VM configuration after the guest OS reboot.
In that case, all we can do anyway is log the exception and clean up the database, I think.
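
A minimal sketch of that clean-up idea (not an actual fix): the self.driver / self.db names and the NotFound exception follow the manager code quoted in the traceback above; the exact helpers and signatures in the real tree may differ.

{{{
# Hedged sketch: in ComputeManager.detach_volume, tolerate a disk that has
# already vanished from the guest, log it, and still mark the volume detached
# in the database.
def detach_volume(self, context, instance_id, volume_id):
    volume_ref = self.db.volume_get(context, volume_id)
    instance_ref = self.db.instance_get(context, instance_id)
    try:
        self.driver.detach_volume(instance_ref['name'],
                                  volume_ref['mountpoint'])
    except exception.NotFound:
        # e.g. "No disk at vdb" after a reboot: the device is already gone
        # from the domain, so just log it and fall through to the DB cleanup.
        LOG.exception(_("Disk %s already gone from the guest; cleaning up "
                        "the database only"), volume_ref['mountpoint'])
    self.db.volume_detached(context, volume_id)
    return True
}}}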

-Masanori

Revision history for this message
Masanori Itoh (itohm) wrote :

BTW, if you used euca-reboot-instances on a KVM-based system, the issue comes back to the nova side.
At this moment, libvirt does not support rebooting KVM instances, and the current implementation of RebootInstance looks like the following.

  trunk/nova/virt/libvirt_conn.py

{{{
def reboot(self, instance):
    self.destroy(instance, False)  # DESTROY ONCE
    xml = self.to_xml(instance)
}}}

One idea could be calling virsh dumpxml for the instance being rebooted and using its output to update the xml above.

{{{
    self.firewall_driver.setup_basic_filtering(instance)
    self.firewall_driver.prepare_instance_filter(instance)
    self._conn.createXML(xml, 0)  # CREATE AGAIN, AND THERE IS NO CODE TO RE-ATTACH EBSs
    self.firewall_driver.apply_instance_filter(instance)

    timer = utils.LoopingCall(f=None)

    def _wait_for_reboot():
        try:
            state = self.get_info(instance['name'])['state']
            db.instance_set_state(context.get_admin_context(),
                                  instance['id'], state)
            if state == power_state.RUNNING:
                LOG.debug(_('instance %s: rebooted'), instance['name'])
                timer.stop()
        except Exception, exn:
            LOG.exception(_('_wait_for_reboot failed: %s'), exn)
            db.instance_set_state(context.get_admin_context(),
                                  instance['id'],
                                  power_state.SHUTDOWN)
            timer.stop()

    timer.f = _wait_for_reboot
    return timer.start(interval=0.5, now=True)
}}}
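
A minimal sketch of that dumpxml idea (only a sketch, not an actual patch): lookupByName/XMLDesc are the libvirt Python equivalents of virsh dumpxml, and the other names come from the code quoted above.

{{{
# Sketch of the dumpxml idea above:
def reboot(self, instance):
    # Capture the running domain's XML first; unlike to_xml(instance), it
    # still contains the <disk> elements for any attached volumes.
    virt_dom = self._conn.lookupByName(instance['name'])
    xml = virt_dom.XMLDesc(0)           # libvirt equivalent of `virsh dumpxml`

    self.destroy(instance, False)
    self.firewall_driver.setup_basic_filtering(instance)
    self.firewall_driver.prepare_instance_filter(instance)
    self._conn.createXML(xml, 0)        # attached volumes come back with the saved XML
    self.firewall_driver.apply_instance_filter(instance)
    # ... then wait for the reboot as in _wait_for_reboot() above.
}}}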

-Masanori

Revision history for this message
Masanori Itoh (itohm) wrote :

Hi,

I wrote an ultimately STUPID workaround fix for this issue and linked my branch here.
Actually, this issue is hard to resolve in an elegant way, I think.

Anyway, it seems to be working on my Ubuntu 10.10 box at least.

Tushar, if you are testing a volume driver other than iSCSI, or a multi-node nova installation,
could you try the branch below?

  lp:~itoumsn/nova/lp747922

Thanks,

Changed in nova:
status: Confirmed → In Progress
assignee: nobody → Masanori Itoh (itoumsn)
Revision history for this message
Masanori Itoh (itohm) wrote :

I will volunteer as the assignee of this issue until someone with a more elegant solution appears...

-Masanori

Revision history for this message
Tushar Patil (tpatil) wrote :

I tested using your branch lp:~itoumsn/nova/lp747922 and it seems to be working as expected now.

After rebooting the instance, I see the volume is still attached and I can see all files intact.

Thank you.

Revision history for this message
Masanori Itoh (itohm) wrote :

Hi Tushar,

Thanks for testing. :)
I will post a merge request soon after the cactus release.

Thanks,
Masanori

Masanori Itoh (itohm)
Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → diablo-1
Revision history for this message
ady (imelurang) wrote :

Pardon me, I'm a newbie here. How can I apply the above patch to my existing cloud? :)

Thierry Carrez (ttx)
Changed in nova:
milestone: diablo-1 → 2011.3
status: Fix Committed → Fix Released