Instances in vm state DELETED are preventing compute restart

Bug #1053441 reported by Stanislaw Pitucha
This bug affects 1 person
Affects                   Status        Importance  Assigned to        Milestone
OpenStack Compute (nova)  Fix Released  High        Stanislaw Pitucha
Folsom                    Fix Released  High        Chuck Short
nova (Ubuntu)             Fix Released  Undecided   Unassigned
Quantal                   Fix Released  Undecided   Unassigned

Bug Description

Instances which end up with the following values:

    power_state: 1
    vm_state: deleted

will prevent the compute manager from starting correctly: on startup it tries to bring the instance back up, checking only that power_state is RUNNING and ignoring vm_state. The same could potentially happen with the SOFT_DELETED state.

The resulting exception can look like this, for example:

2012-09-20 12:41:55 CRITICAL nova [-] [Errno 2] No such file or directory: '/var/lib/nova/instances/instance-00000212/libvirt.xml'
2012-09-20 12:41:55 TRACE nova Traceback (most recent call last):
2012-09-20 12:41:55 TRACE nova File "/usr/bin/nova-compute", line 48, in <module>
2012-09-20 12:41:55 TRACE nova service.wait()
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/service.py", line 659, in wait
2012-09-20 12:41:55 TRACE nova _launcher.wait()
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/service.py", line 192, in wait
2012-09-20 12:41:55 TRACE nova super(ServiceLauncher, self).wait()
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/service.py", line 162, in wait
2012-09-20 12:41:55 TRACE nova service.wait()
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 166, in wait
2012-09-20 12:41:55 TRACE nova return self._exit_event.wait()
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
2012-09-20 12:41:55 TRACE nova return hubs.get_hub().switch()
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 177, in switch
2012-09-20 12:41:55 TRACE nova return self.greenlet.switch()
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 192, in main
2012-09-20 12:41:55 TRACE nova result = function(*args, **kwargs)
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/service.py", line 132, in run_server
2012-09-20 12:41:55 TRACE nova server.start()
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/service.py", line 398, in start
2012-09-20 12:41:55 TRACE nova self.manager.init_host()
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 313, in init_host
2012-09-20 12:41:55 TRACE nova block_device_info)
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 117, in wrapped
2012-09-20 12:41:55 TRACE nova temp_level, payload)
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2012-09-20 12:41:55 TRACE nova self.gen.next()
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 92, in wrapped
2012-09-20 12:41:55 TRACE nova return f(*args, **kw)
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 953, in resume_state_on_host_boot
2012-09-20 12:41:55 TRACE nova xml = self._get_domain_xml(instance)
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 653, in _get_domain_xml
2012-09-20 12:41:55 TRACE nova xml = libvirt_utils.load_file(xml_path)
2012-09-20 12:41:55 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/utils.py", line 349, in load_file
2012-09-20 12:41:55 TRACE nova with open(path, 'r') as fp:
2012-09-20 12:41:55 TRACE nova IOError: [Errno 2] No such file or directory: '/var/lib/nova/instances/instance-00000212/libvirt.xml'
2012-09-20 12:41:55 TRACE nova
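
The last frames of the traceback come down to a bare open() on a libvirt.xml that the delete has already removed. The following is a minimal sketch of that failure mode outside nova, not nova code: load_file mirrors the final frame, while try_resume and the temporary directory are hypothetical stand-ins for the resume step and the instance directory.

```python
import errno
import os
import tempfile


def load_file(path):
    """Read a file the way the last traceback frame does: a bare open()."""
    with open(path, 'r') as fp:
        return fp.read()


def try_resume(instance_dir):
    """Hypothetical stand-in for the resume step; returns the errno on failure."""
    xml_path = os.path.join(instance_dir, 'libvirt.xml')
    try:
        load_file(xml_path)
    except IOError as exc:  # IOError is an alias of OSError on Python 3
        return exc.errno
    return None


# An instance directory whose libvirt.xml is already gone, as after a delete.
empty_dir = tempfile.mkdtemp()
result = try_resume(empty_dir)
print(result == errno.ENOENT)
```

In the compute service nothing catches this error during init_host, so, as the traceback shows, it propagates all the way up and nova-compute fails to start.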

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/13362

Changed in nova:
assignee: nobody → Stanislaw Pitucha (stanislaw-pitucha)
status: New → In Progress
Changed in nova:
importance: Undecided → High
tags: added: folsom-rc-potential
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/13362
Committed: http://github.com/openstack/nova/commit/fdd9325df75652a95a96ccd4e59b73556df811c6
Submitter: Jenkins
Branch: master

commit fdd9325df75652a95a96ccd4e59b73556df811c6
Author: Stanislaw Pitucha <email address hidden>
Date: Thu Sep 20 15:47:19 2012 +0100

    Fix startup with DELETED instances

    Make sure that compute manager with DELETED and SOFT_DELETED instances
    starts up properly. Before it was possible to end up with only the db
    entry and no local configuration allowing a successful restart (even
    then it would be the wrong decision to try).

    Fixes bug 1053441

    Change-Id: Iab1ca81068733a5e8546c32ad122f1d60d22310b
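
The idea described in the commit message can be sketched as a startup filter that looks at vm_state before power_state. This is an illustration under stated assumptions, not the merged patch: should_resume and the sample instance dicts are hypothetical, and the 'soft-delete' string is assumed to match nova's SOFT_DELETED constant.

```python
# power_state 1 and vm_state 'deleted' are the values quoted in this report.
RUNNING = 1
DELETED = 'deleted'
# Assumption: the string nova uses for its SOFT_DELETED vm_state.
SOFT_DELETED = 'soft-delete'


def should_resume(instance):
    """Return True only for instances that were running and are not deleted."""
    if instance['vm_state'] in (DELETED, SOFT_DELETED):
        # Only the db entry may be left; there is nothing local to resume.
        return False
    return instance['power_state'] == RUNNING


# Hypothetical rows as the compute manager might see them at startup.
instances = [
    {'name': 'instance-00000001', 'power_state': 1, 'vm_state': 'active'},
    {'name': 'instance-00000212', 'power_state': 1, 'vm_state': 'deleted'},
    {'name': 'instance-00000213', 'power_state': 1, 'vm_state': 'soft-delete'},
]

to_resume = [i['name'] for i in instances if should_resume(i)]
print(to_resume)
```

Checking vm_state first means a stale RUNNING power_state left on a deleted row can no longer trigger a resume attempt.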

Changed in nova:
status: In Progress → Fix Committed
tags: added: folsom-backport-potential
removed: folsom-rc-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/folsom)

Fix proposed to branch: stable/folsom
Review: https://review.openstack.org/14052

Mark McLoughlin (markmc)
tags: removed: folsom-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/folsom)

Reviewed: https://review.openstack.org/14052
Committed: http://github.com/openstack/nova/commit/ebbfa9ef5e2ed23b567768e2a375204d434e87fa
Submitter: Jenkins
Branch: stable/folsom

commit ebbfa9ef5e2ed23b567768e2a375204d434e87fa
Author: Stanislaw Pitucha <email address hidden>
Date: Thu Sep 20 15:47:19 2012 +0100

    Fix startup with DELETED instances

    Make sure that compute manager with DELETED and SOFT_DELETED instances
    starts up properly. Before it was possible to end up with only the db
    entry and no local configuration allowing a successful restart (even
    then it would be the wrong decision to try).

    Fixes bug 1053441

    Change-Id: Iab1ca81068733a5e8546c32ad122f1d60d22310b

Thierry Carrez (ttx)
Changed in nova:
milestone: none → grizzly-1
status: Fix Committed → Fix Released
Changed in nova (Ubuntu):
status: New → Fix Released
Changed in nova (Ubuntu Quantal):
status: New → Confirmed
Clint Byrum (clint-fewbar) wrote : Please test proposed package

Hello Stanislaw, or anyone else affected,

Accepted nova into quantal-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/nova/2012.2.1+stable-20121212-a99a802e-0ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nova (Ubuntu Quantal):
status: Confirmed → Fix Committed
tags: added: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nova - 2012.2.1+stable-20121212-a99a802e-0ubuntu1

---------------
nova (2012.2.1+stable-20121212-a99a802e-0ubuntu1) quantal-proposed; urgency=low

  * Ubuntu updates:
    - debian/control: Ensure novaclient is upgraded with nova,
      require python-keystoneclient >= 1:2.9.0. (LP: #1073289)
    - d/p/avoid_setuptools_git_dependency.patch: Refresh.
  * Dropped patches, applied upstream:
    - debian/patches/CVE-2012-5625.patch: [a99a802]
  * Resynchronize with stable/folsom (b55014ca) (LP: #1085255):
    - [a99a802] create_lvm_image allocates dirty blocks (LP: #1070539)
    - [670b388] RPC exchange name defaults to 'openstack' (LP: #1083944)
    - [3ede373] disassociate_floating_ip with multi_host=True fails
      (LP: #1074437)
    - [22d7c3b] libvirt imagecache should handle shared image storage
      (LP: #1075018)
    - [e787786] Detached and deleted RBD volumes remain associated with instance
      (LP: #1083818)
    - [9265eb0] live_migration missing migrate_data parameter in Hyper-V driver
      (LP: #1066513)
    - [3d99848] use_single_default_gateway does not function correctly
      (LP: #1075859)
    - [65a2d0a] resize does not migrate DHCP host information (LP: #1065440)
    - [102c76b] Nova backup image fails (LP: #1065053)
    - [48a3521] Fix config-file overrides for nova-dhcpbridge
    - [69663ee] Cloudpipe in Folsom: no such option: cnt_vpn_clients
      (LP: #1069573)
    - [6e47cc8] DisassociateAddress can cause Internal Server Error
      (LP: #1080406)
    - [22c3d7b] API calls to dis-associate an auto-assigned floating IP should
      return proper warning (LP: #1061499)
    - [bd11d15] libvirt: if exception raised during volume_detach, volume state
      is inconsistent (LP: #1057756)
    - [dcb59c3] admin can't describe all images in ec2 api (LP: #1070138)
    - [78de622] Incorrect Exception raised during Create server when metadata
      over 255 characters (LP: #1004007)
    - [c313de4] Fixed IP isn't released before updating DHCP host file
      (LP: #1078718)
    - [f4ab42d] Enabling Return Reservation ID with XML create server request
      returns no body (LP: #1061124)
    - [3db2a38] 'BackupCreate' should accept rotation parameter greater than or
      equal to zero (LP: #1071168)
    - [f7e5dde] libvirt reboot sometimes fails to reattach volumes
      (LP: #1073720)
    - [ff776d4] libvirt: detaching volume may fail while terminating other
      instances on the same host concurrently (LP: #1060836)
    - [85a8bc2] Used instance uuid rather than id in remove-fixed-ip
    - [42a85c0] Fix error on invalid delete_on_termination value
    - [6a17579] xenapi migrations fail w/ swap (LP: #1064083)
    - [97649b8] attach-time field for volumes is not updated for detach volume
      (LP: #1056122)
    - [8f6a718] libvirt: rebuild is not using kernel and ramdisk associated with
      the new image (LP: #1060925)
    - [fbe835f] live-migration and volume host assignment (LP: #1066887)
    - [c2a9150] typo prevents volume_tmp_dir flag from working (LP: #1071536)
    - [93efa21] Instances deleted during spawn leak network allocations
      (LP: #1068716)
    - [ebabd02] After restartin...


Changed in nova (Ubuntu Quantal):
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: grizzly-1 → 2013.1