Merge lp:~jk0/nova/xs-rescue into lp:~hudson-openstack/nova/trunk

Proposed by Josh Kearney
Status: Merged
Approved by: Rick Harris
Approved revision: 603
Merged at revision: 754
Proposed branch: lp:~jk0/nova/xs-rescue
Merge into: lp:~hudson-openstack/nova/trunk
Diff against target: 641 lines (+220/-62)
13 files modified
.mailmap (+1/-0)
Authors (+1/-1)
nova/api/openstack/__init__.py (+2/-0)
nova/api/openstack/servers.py (+22/-0)
nova/compute/manager.py (+24/-10)
nova/db/sqlalchemy/models.py (+6/-1)
nova/tests/xenapi/stubs.py (+1/-1)
nova/virt/libvirt_conn.py (+2/-2)
nova/virt/xenapi/fake.py (+1/-1)
nova/virt/xenapi/vm_utils.py (+10/-12)
nova/virt/xenapi/vmops.py (+132/-27)
nova/virt/xenapi/volumeops.py (+1/-1)
nova/virt/xenapi_conn.py (+17/-6)
To merge this branch: bzr merge lp:~jk0/nova/xs-rescue
Reviewer Review Type Date Requested Status
Rick Harris (community) Approve
Matt Dietz (community) Approve
Review via email: mp+51780@code.launchpad.net

Description of the change

Provide the ability to rescue and unrescue a XenServer instance.

Revision history for this message
Rick Harris (rconradharris) wrote :

Good work jk0, this is a great start.

Some small nits:

> 80 + self.driver.rescue(instance_ref,
> 81 + lambda result: self._update_state_callback(self,
> 82 + context,
> 83 + instance_id,
> 84 + result))

Breaking this into two lines would probably aid readability.
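
For example, pulling the lambda out into a local first (rough sketch, untested; same names as the quoted code):

        callback = lambda result: self._update_state_callback(
            self, context, instance_id, result)
        self.driver.rescue(instance_ref, callback)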

> 145 + def rescue(self, instance, callback=None):

The callback function isn't firing for rescue or unrescue. It looks like the
other methods are using

        self._wait_with_callback(instance.id, task, callback)

to fire it off.
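
For comparison, this is how resume fires it:

        task = self._session.call_xenapi('Async.VM.resume', vm, False, True)
        self._wait_with_callback(instance.id, task, callback)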

> 342 + def _bootlock(self, vm, unlock=False):
> 343 + """Prevent an instance from booting"""
> 344 + if unlock:
> 345 + self._session.call_xenapi(
> 346 + "VM.remove_from_blocked_operations",
> 347 + vm,
> 348 + "start")
> 349 + else:
> 350 + self._session.call_xenapi(
> 351 + "VM.set_blocked_operations",
> 352 + vm,
> 353 + {"start": ""})
> 354 +

I think this makes more sense as two methods:
_acquire_bootlock and _release_bootlock.

Another thing you could do is assert that we're in the correct state. For
example, that we're not trying to unlock a VM that's already unlocked (that'd
indicate a bug in the code).
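
Something along these lines (untested sketch; assumes VM.get_blocked_operations is reachable through call_xenapi like the other VM calls):

    def _acquire_bootlock(self, vm):
        """Prevent an instance from booting"""
        # Guard against double-locking, which would indicate a bug elsewhere
        blocked = self._session.call_xenapi("VM.get_blocked_operations", vm)
        assert "start" not in blocked, "bootlock is already held"
        self._session.call_xenapi("VM.set_blocked_operations", vm, {"start": ""})

    def _release_bootlock(self, vm):
        """Allow an instance to boot again"""
        # Guard against releasing a lock that was never acquired
        blocked = self._session.call_xenapi("VM.get_blocked_operations", vm)
        assert "start" in blocked, "bootlock was never acquired"
        self._session.call_xenapi("VM.remove_from_blocked_operations", vm, "start")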

> 394 + try:
> 395 + task = self._session.call_xenapi("Async.VM.clean_shutdown", vm)
> 396 + self._session.wait_for_task(task, instance.id)
> 397 + except self.XenAPI.Failure:
> 398 + task = self._session.call_xenapi("Async.VM.hard_shutdown", vm)
> 399 + self._session.wait_for_task(task, instance.id)

By default, I think _shutdown should perform either a clean or hard shutdown
depending on what's requested (Dietz added a hard=true kwarg in his
xs-migrations branch).

If at all possible, I'd like to avoid silent fall-backs (like from clean to
hard), because they're usually indicative of a larger problem.

That said, a _force_shutdown that attempts a clean and then falls back to
hard would be okay since it's very explicit when that behavior is being
requested (and will be easier to remove later on if we want to).
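
In other words, roughly the quoted block above wrapped up behind an explicit name:

    def _force_shutdown(self, instance, vm):
        """Attempt a clean shutdown, explicitly falling back to a hard one"""
        try:
            task = self._session.call_xenapi("Async.VM.clean_shutdown", vm)
            self._session.wait_for_task(task, instance.id)
        except self.XenAPI.Failure:
            task = self._session.call_xenapi("Async.VM.hard_shutdown", vm)
            self._session.wait_for_task(task, instance.id)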

> 484 + if vbd["userdevice"] == str(1):

str(1) can just be "1".

> 488 + task1 = self._session.call_xenapi("Async.VM.hard_shutdown", rescue_vm)
> 489 + self._session.wait_for_task(task1, instance.id)

This could be something like _shutdown(instance, rescue_vm, hard=True).

> 491 + vdis = VMHelper.lookup_vm_vdis(self._session, rescue_vm)
> 492 + for vdi in vdis:
> 493 + try:
> 494 + task = self._session.call_xenapi('Async.VDI.destroy', vdi)
> 495 + self._session.wait_for_task(task, instance.id)
> 496 + except self.XenAPI.Failure:
> 497 + continue

Should be able to use `_destroy_vdis` here.

> 499 + task2 = self._session.call_xenapi('Async.VM.destroy', rescue_vm)
> 500 + self._session.wait_for_task(task2, instance.i...


review: Needs Fixing
Revision history for this message
Josh Kearney (jk0) wrote :

> Good work jk0, this is a great start.
>
>
> Some small nits:
>
> > 80 + self.driver.rescue(instance_ref,
> > 81 + lambda result: self._update_state_callback(self,
> > 82 + context,
> > 83 + instance_id,
> > 84 + result))
>
> Breaking this into two lines would probably aid readability.

Copied this directly from another call, wanted to maintain consistency. I'll go through them all and clean up.

>
> > 145 + def rescue(self, instance, callback=None):
>
> The callback function isn't firing for rescue or unrescue. It looks like the
> other methods are using
>
> self._wait_with_callback(instance.id, task, callback)
>
> to fire it off.

I had to keep this in place since libvirt uses the callback (passed down to the driver).

>
> > 342 + def _bootlock(self, vm, unlock=False):
> > 343 + """Prevent an instance from booting"""
> > 344 + if unlock:
> > 345 + self._session.call_xenapi(
> > 346 + "VM.remove_from_blocked_operations",
> > 347 + vm,
> > 348 + "start")
> > 349 + else:
> > 350 + self._session.call_xenapi(
> > 351 + "VM.set_blocked_operations",
> > 352 + vm,
> > 353 + {"start": ""})
> > 354 +
>
> I think this makes more sense as two methods:
> _acquire_bootlock and _release_bootlock.
>
> Another thing you could do is assert that we're in the correct state. For
> example, that we're not trying to unlock a VM that's already unlocked (that'd
> indicate a bug in the code).

Good points, I'll break this down into two methods.

>
> > 394 + try:
> > 395 + task =
> self._session.call_xenapi("Async.VM.clean_shutdown", vm)
> > 396 + self._session.wait_for_task(task, instance.id)
> > 397 + except self.XenAPI.Failure:
> > 398 + task =
> self._session.call_xenapi("Async.VM.hard_shutdown", vm)
> > 399 + self._session.wait_for_task(task, instance.id)
>
> By default, I think _shutdown should perform either a clean or hard shutdown
> depending on what's requested (Dietz added a hard=true kwarg in his
> xs-migrations branch).
>
> If at all possible, I'd like to avoid silent fall-backs (like from clean to
> hard), because they're usually indicative of a larger problem.
>
> That said, a _force_shutdown that attempts a clean and then falls back to
> hard would be okay since it's very explicit when that behavior is being
> requested (and will be easier to remove later on if we want to).
>

Sounds like I could get by using Dietz' method. I'll do this to avoid conflicts down the road.

> > 484 + if vbd["userdevice"] == str(1):
>
> str(1) can just be "1".

Nice catch. Not sure why I used str() here.

>
> > 488 + task1 = self._session.call_xenapi("Async.VM.hard_shutdown",
> rescue_vm)
> > 489 + self._session.wait_for_task(task1, instance.id)
>
> This could be som...


lp:~jk0/nova/xs-rescue updated
602. By Josh Kearney

Review feedback

Revision history for this message
Matt Dietz (cerberus) wrote :

Pinning rescue status on the name of the VM seems pretty messy to me. It seems to me that, since rescue is such a critical state to be aware of, we should maintain it either through a flag on the instance model or through the instance status attribute. Is there a reason you didn't go this route?

review: Needs Information
Revision history for this message
Josh Kearney (jk0) wrote :

> Pinning rescue status on the name of the VM seems pretty messy to me. It seems
> to me that, since rescue is such a critical state to be aware of, that we
> should maintain that either through a flag on the instance model or through
> the instance status attribute. Is there a reason you didn't go this route?

We had to go this route to avoid duplicate name-label exceptions. By doing this, we can spin up a rescue image using the same instance object and easily refer back to the original image when the time comes to unrescue. The name-labels are guaranteed to be unique.
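
Concretely, with the name property change in this branch, the same instance object yields both labels (example values assume the default name template):

    instance = db.instance_get(context, instance_id)
    instance.name             # e.g. 'instance-0002'
    instance._rescue = True
    instance.name             # 'instance-0002-rescue', used for the rescue VM's name-label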

Revision history for this message
Matt Dietz (cerberus) wrote :

Sure, I get that. My concern comes in cases where we may want to auto-unrescue instances. If we were to search for anything with '-rescue' on the end, we might try to unrescue stuff that's just unfortunately named.

Revision history for this message
Matt Dietz (cerberus) wrote :

Never mind, feedback received off list. Forgot that name-labels were not instance-descriptions (we should probably rename one or both of those...)

review: Approve
Revision history for this message
Rick Harris (rconradharris) wrote :

Thanks for the fixes, jk0. Have a couple more potential cleanups, nothing big:

> 136 @property
> 137 def name(self):
> 138 - return FLAGS.instance_name_template % self.id
> 139 + base_name = FLAGS.instance_name_template % self.id
> 140 + if getattr(self, '_rescue', False):
> 141 + base_name += "-rescue"
> 142 + return base_name

I know I suggested this, but I'm having second thoughts about putting this
logic here. At least right now, this is XenAPI hackiness, and I'm not sure it
belongs in the generic instance model.

Instead, I think the XenAPI code should have a method, `get_name_label`, which
it uses instead of instance.name to generate the VM name-labels. That code
would look like:

def get_name_label(instance):
    """Generate name label for the XenServer VM record

    We have two scenarios:

        1. We're in normal mode and the INSTANCE VM record uses the instance name as
           its name label

        2. We're in rescue mode and the RESCUE VM Record uses instance name +
           '-rescue' as its name label
    """
    base_name = instance.name
    if getattr(instance, '_rescue', False):
        base_name += "-rescue"
    return base_name

The code is really pretty readable, but it still might be a good idea to
include a big doc-string on the XenAPI rescue method that explains, at a
high level, what's involved in a rescue, like:

"""
Rescue consists of a number of steps:

1. We shut down the INSTANCE VM

2. We apply a 'bootlock' to prevent the instance VM from being started while
   in rescue

3. We spawn a RESCUE VM (the vm name-label in this case will be
   instance-0002-rescue)

4. We attach the INSTANCE VM's VBDs to the RESCUE VM

Unrescue consists of:

1. We unplug the INSTANCE VM's VBDs from the RESCUE VM

2. We tear down the RESCUE VM

3. We release the bootlock to allow the INSTANCE VM to be started

4. We start the INSTANCE VM
"""

lp:~jk0/nova/xs-rescue updated
603. By Josh Kearney

Updated docstrings

Revision history for this message
Rick Harris (rconradharris) wrote :

Looks good.

review: Approve

Preview Diff

=== modified file '.mailmap'
--- .mailmap 2011-02-18 18:27:30 +0000
+++ .mailmap 2011-03-02 23:12:16 +0000
@@ -19,6 +19,7 @@
19<jmckenty@gmail.com> <jmckenty@joshua-mckentys-macbook-pro.local>19<jmckenty@gmail.com> <jmckenty@joshua-mckentys-macbook-pro.local>
20<jmckenty@gmail.com> <jmckenty@yyj-dhcp171.corp.flock.com>20<jmckenty@gmail.com> <jmckenty@yyj-dhcp171.corp.flock.com>
21<jmckenty@gmail.com> <joshua.mckenty@nasa.gov>21<jmckenty@gmail.com> <joshua.mckenty@nasa.gov>
22<josh@jk0.org> <josh.kearney@rackspace.com>
22<justin@fathomdb.com> <justinsb@justinsb-desktop>23<justin@fathomdb.com> <justinsb@justinsb-desktop>
23<justin@fathomdb.com> <superstack@superstack.org>24<justin@fathomdb.com> <superstack@superstack.org>
24<masumotok@nttdata.co.jp> Masumoto<masumotok@nttdata.co.jp>25<masumotok@nttdata.co.jp> Masumoto<masumotok@nttdata.co.jp>
2526
=== modified file 'Authors'
--- Authors 2011-03-01 00:46:39 +0000
+++ Authors 2011-03-02 23:12:16 +0000
@@ -31,7 +31,7 @@
31Jonathan Bryce <jbryce@jbryce.com>31Jonathan Bryce <jbryce@jbryce.com>
32Jordan Rinke <jordan@openstack.org>32Jordan Rinke <jordan@openstack.org>
33Josh Durgin <joshd@hq.newdream.net>33Josh Durgin <joshd@hq.newdream.net>
34Josh Kearney <josh.kearney@rackspace.com>34Josh Kearney <josh@jk0.org>
35Joshua McKenty <jmckenty@gmail.com>35Joshua McKenty <jmckenty@gmail.com>
36Justin Santa Barbara <justin@fathomdb.com>36Justin Santa Barbara <justin@fathomdb.com>
37Kei Masumoto <masumotok@nttdata.co.jp>37Kei Masumoto <masumotok@nttdata.co.jp>
3838
=== modified file 'nova/api/openstack/__init__.py'
--- nova/api/openstack/__init__.py 2011-02-24 23:35:21 +0000
+++ nova/api/openstack/__init__.py 2011-03-02 23:12:16 +0000
@@ -80,6 +80,8 @@
80 server_members["actions"] = "GET"80 server_members["actions"] = "GET"
81 server_members['suspend'] = 'POST'81 server_members['suspend'] = 'POST'
82 server_members['resume'] = 'POST'82 server_members['resume'] = 'POST'
83 server_members['rescue'] = 'POST'
84 server_members['unrescue'] = 'POST'
83 server_members['reset_network'] = 'POST'85 server_members['reset_network'] = 'POST'
84 server_members['inject_network_info'] = 'POST'86 server_members['inject_network_info'] = 'POST'
8587
8688
=== modified file 'nova/api/openstack/servers.py'
--- nova/api/openstack/servers.py 2011-03-01 00:56:46 +0000
+++ nova/api/openstack/servers.py 2011-03-02 23:12:16 +0000
@@ -335,6 +335,28 @@
335 return faults.Fault(exc.HTTPUnprocessableEntity())335 return faults.Fault(exc.HTTPUnprocessableEntity())
336 return exc.HTTPAccepted()336 return exc.HTTPAccepted()
337337
338 def rescue(self, req, id):
339 """Permit users to rescue the server."""
340 context = req.environ["nova.context"]
341 try:
342 self.compute_api.rescue(context, id)
343 except:
344 readable = traceback.format_exc()
345 LOG.exception(_("compute.api::rescue %s"), readable)
346 return faults.Fault(exc.HTTPUnprocessableEntity())
347 return exc.HTTPAccepted()
348
349 def unrescue(self, req, id):
350 """Permit users to unrescue the server."""
351 context = req.environ["nova.context"]
352 try:
353 self.compute_api.unrescue(context, id)
354 except:
355 readable = traceback.format_exc()
356 LOG.exception(_("compute.api::unrescue %s"), readable)
357 return faults.Fault(exc.HTTPUnprocessableEntity())
358 return exc.HTTPAccepted()
359
338 def get_ajax_console(self, req, id):360 def get_ajax_console(self, req, id):
339 """ Returns a url to an instance's ajaxterm console. """361 """ Returns a url to an instance's ajaxterm console. """
340 try:362 try:
341363
=== modified file 'nova/compute/manager.py'
--- nova/compute/manager.py 2011-02-24 23:35:21 +0000
+++ nova/compute/manager.py 2011-03-02 23:12:16 +0000
@@ -370,12 +370,19 @@
370 context = context.elevated()370 context = context.elevated()
371 instance_ref = self.db.instance_get(context, instance_id)371 instance_ref = self.db.instance_get(context, instance_id)
372 LOG.audit(_('instance %s: rescuing'), instance_id, context=context)372 LOG.audit(_('instance %s: rescuing'), instance_id, context=context)
373 self.db.instance_set_state(context,373 self.db.instance_set_state(
374 instance_id,374 context,
375 power_state.NOSTATE,375 instance_id,
376 'rescuing')376 power_state.NOSTATE,
377 'rescuing')
377 self.network_manager.setup_compute_network(context, instance_id)378 self.network_manager.setup_compute_network(context, instance_id)
378 self.driver.rescue(instance_ref)379 self.driver.rescue(
380 instance_ref,
381 lambda result: self._update_state_callback(
382 self,
383 context,
384 instance_id,
385 result))
379 self._update_state(context, instance_id)386 self._update_state(context, instance_id)
380387
381 @exception.wrap_exception388 @exception.wrap_exception
@@ -385,11 +392,18 @@
385 context = context.elevated()392 context = context.elevated()
386 instance_ref = self.db.instance_get(context, instance_id)393 instance_ref = self.db.instance_get(context, instance_id)
387 LOG.audit(_('instance %s: unrescuing'), instance_id, context=context)394 LOG.audit(_('instance %s: unrescuing'), instance_id, context=context)
388 self.db.instance_set_state(context,395 self.db.instance_set_state(
389 instance_id,396 context,
390 power_state.NOSTATE,397 instance_id,
391 'unrescuing')398 power_state.NOSTATE,
392 self.driver.unrescue(instance_ref)399 'unrescuing')
400 self.driver.unrescue(
401 instance_ref,
402 lambda result: self._update_state_callback(
403 self,
404 context,
405 instance_id,
406 result))
393 self._update_state(context, instance_id)407 self._update_state(context, instance_id)
394408
395 @staticmethod409 @staticmethod
396410
=== modified file 'nova/db/sqlalchemy/models.py'
--- nova/db/sqlalchemy/models.py 2011-02-24 05:47:29 +0000
+++ nova/db/sqlalchemy/models.py 2011-03-02 23:12:16 +0000
@@ -126,11 +126,16 @@
126class Instance(BASE, NovaBase):126class Instance(BASE, NovaBase):
127 """Represents a guest vm."""127 """Represents a guest vm."""
128 __tablename__ = 'instances'128 __tablename__ = 'instances'
129 onset_files = []
130
129 id = Column(Integer, primary_key=True, autoincrement=True)131 id = Column(Integer, primary_key=True, autoincrement=True)
130132
131 @property133 @property
132 def name(self):134 def name(self):
133 return FLAGS.instance_name_template % self.id135 base_name = FLAGS.instance_name_template % self.id
136 if getattr(self, '_rescue', False):
137 base_name += "-rescue"
138 return base_name
134139
135 admin_pass = Column(String(255))140 admin_pass = Column(String(255))
136 user_id = Column(String(255))141 user_id = Column(String(255))
137142
=== modified file 'nova/tests/xenapi/stubs.py'
--- nova/tests/xenapi/stubs.py 2011-02-25 16:47:08 +0000
+++ nova/tests/xenapi/stubs.py 2011-03-02 23:12:16 +0000
@@ -27,7 +27,7 @@
27 def fake_fetch_image(cls, session, instance_id, image, user, project,27 def fake_fetch_image(cls, session, instance_id, image, user, project,
28 type):28 type):
29 # Stubout wait_for_task29 # Stubout wait_for_task
30 def fake_wait_for_task(self, id, task):30 def fake_wait_for_task(self, task, id):
31 class FakeEvent:31 class FakeEvent:
3232
33 def send(self, value):33 def send(self, value):
3434
=== modified file 'nova/virt/libvirt_conn.py'
--- nova/virt/libvirt_conn.py 2011-01-27 23:58:22 +0000
+++ nova/virt/libvirt_conn.py 2011-03-02 23:12:16 +0000
@@ -362,7 +362,7 @@
362 raise exception.APIError("resume not supported for libvirt")362 raise exception.APIError("resume not supported for libvirt")
363363
364 @exception.wrap_exception364 @exception.wrap_exception
365 def rescue(self, instance):365 def rescue(self, instance, callback=None):
366 self.destroy(instance, False)366 self.destroy(instance, False)
367367
368 xml = self.to_xml(instance, rescue=True)368 xml = self.to_xml(instance, rescue=True)
@@ -392,7 +392,7 @@
392 return timer.start(interval=0.5, now=True)392 return timer.start(interval=0.5, now=True)
393393
394 @exception.wrap_exception394 @exception.wrap_exception
395 def unrescue(self, instance):395 def unrescue(self, instance, callback=None):
396 # NOTE(vish): Because reboot destroys and recreates an instance using396 # NOTE(vish): Because reboot destroys and recreates an instance using
397 # the normal xml file, we can just call reboot here397 # the normal xml file, we can just call reboot here
398 self.reboot(instance)398 self.reboot(instance)
399399
=== modified file 'nova/virt/xenapi/fake.py'
--- nova/virt/xenapi/fake.py 2011-02-09 10:08:15 +0000
+++ nova/virt/xenapi/fake.py 2011-03-02 23:12:16 +0000
@@ -401,7 +401,7 @@
401 field in _db_content[cls][ref]):401 field in _db_content[cls][ref]):
402 return _db_content[cls][ref][field]402 return _db_content[cls][ref][field]
403403
404 LOG.debuug(_('Raising NotImplemented'))404 LOG.debug(_('Raising NotImplemented'))
405 raise NotImplementedError(405 raise NotImplementedError(
406 _('xenapi.fake does not have an implementation for %s or it has '406 _('xenapi.fake does not have an implementation for %s or it has '
407 'been called with the wrong number of arguments') % name)407 'been called with the wrong number of arguments') % name)
408408
=== modified file 'nova/virt/xenapi/vm_utils.py'
--- nova/virt/xenapi/vm_utils.py 2011-02-25 16:47:08 +0000
+++ nova/virt/xenapi/vm_utils.py 2011-03-02 23:12:16 +0000
@@ -205,19 +205,17 @@
205 """Destroy VBD from host database"""205 """Destroy VBD from host database"""
206 try:206 try:
207 task = session.call_xenapi('Async.VBD.destroy', vbd_ref)207 task = session.call_xenapi('Async.VBD.destroy', vbd_ref)
208 #FIXME(armando): find a solution to missing instance_id208 session.wait_for_task(task)
209 #with Josh Kearney
210 session.wait_for_task(0, task)
211 except cls.XenAPI.Failure, exc:209 except cls.XenAPI.Failure, exc:
212 LOG.exception(exc)210 LOG.exception(exc)
213 raise StorageError(_('Unable to destroy VBD %s') % vbd_ref)211 raise StorageError(_('Unable to destroy VBD %s') % vbd_ref)
214212
215 @classmethod213 @classmethod
216 def create_vif(cls, session, vm_ref, network_ref, mac_address):214 def create_vif(cls, session, vm_ref, network_ref, mac_address, dev="0"):
217 """Create a VIF record. Returns a Deferred that gives the new215 """Create a VIF record. Returns a Deferred that gives the new
218 VIF reference."""216 VIF reference."""
219 vif_rec = {}217 vif_rec = {}
220 vif_rec['device'] = '0'218 vif_rec['device'] = dev
221 vif_rec['network'] = network_ref219 vif_rec['network'] = network_ref
222 vif_rec['VM'] = vm_ref220 vif_rec['VM'] = vm_ref
223 vif_rec['MAC'] = mac_address221 vif_rec['MAC'] = mac_address
@@ -269,7 +267,7 @@
269 original_parent_uuid = get_vhd_parent_uuid(session, vm_vdi_ref)267 original_parent_uuid = get_vhd_parent_uuid(session, vm_vdi_ref)
270268
271 task = session.call_xenapi('Async.VM.snapshot', vm_ref, label)269 task = session.call_xenapi('Async.VM.snapshot', vm_ref, label)
272 template_vm_ref = session.wait_for_task(instance_id, task)270 template_vm_ref = session.wait_for_task(task, instance_id)
273 template_vdi_rec = get_vdi_for_vm_safely(session, template_vm_ref)[1]271 template_vdi_rec = get_vdi_for_vm_safely(session, template_vm_ref)[1]
274 template_vdi_uuid = template_vdi_rec["uuid"]272 template_vdi_uuid = template_vdi_rec["uuid"]
275273
@@ -302,7 +300,7 @@
302300
303 kwargs = {'params': pickle.dumps(params)}301 kwargs = {'params': pickle.dumps(params)}
304 task = session.async_call_plugin('glance', 'upload_vhd', kwargs)302 task = session.async_call_plugin('glance', 'upload_vhd', kwargs)
305 session.wait_for_task(instance_id, task)303 session.wait_for_task(task, instance_id)
306304
307 @classmethod305 @classmethod
308 def fetch_image(cls, session, instance_id, image, user, project,306 def fetch_image(cls, session, instance_id, image, user, project,
@@ -345,7 +343,7 @@
345343
346 kwargs = {'params': pickle.dumps(params)}344 kwargs = {'params': pickle.dumps(params)}
347 task = session.async_call_plugin('glance', 'download_vhd', kwargs)345 task = session.async_call_plugin('glance', 'download_vhd', kwargs)
348 vdi_uuid = session.wait_for_task(instance_id, task)346 vdi_uuid = session.wait_for_task(task, instance_id)
349347
350 scan_sr(session, instance_id, sr_ref)348 scan_sr(session, instance_id, sr_ref)
351349
@@ -401,7 +399,7 @@
401 #let the plugin copy the correct number of bytes399 #let the plugin copy the correct number of bytes
402 args['image-size'] = str(vdi_size)400 args['image-size'] = str(vdi_size)
403 task = session.async_call_plugin('glance', fn, args)401 task = session.async_call_plugin('glance', fn, args)
404 filename = session.wait_for_task(instance_id, task)402 filename = session.wait_for_task(task, instance_id)
405 #remove the VDI as it is not needed anymore403 #remove the VDI as it is not needed anymore
406 session.get_xenapi().VDI.destroy(vdi)404 session.get_xenapi().VDI.destroy(vdi)
407 LOG.debug(_("Kernel/Ramdisk VDI %s destroyed"), vdi)405 LOG.debug(_("Kernel/Ramdisk VDI %s destroyed"), vdi)
@@ -493,7 +491,7 @@
493 if image_type == ImageType.DISK_RAW:491 if image_type == ImageType.DISK_RAW:
494 args['raw'] = 'true'492 args['raw'] = 'true'
495 task = session.async_call_plugin('objectstore', fn, args)493 task = session.async_call_plugin('objectstore', fn, args)
496 uuid = session.wait_for_task(instance_id, task)494 uuid = session.wait_for_task(task, instance_id)
497 return uuid495 return uuid
498496
499 @classmethod497 @classmethod
@@ -513,7 +511,7 @@
513 args = {}511 args = {}
514 args['vdi-ref'] = vdi_ref512 args['vdi-ref'] = vdi_ref
515 task = session.async_call_plugin('objectstore', fn, args)513 task = session.async_call_plugin('objectstore', fn, args)
516 pv_str = session.wait_for_task(instance_id, task)514 pv_str = session.wait_for_task(task, instance_id)
517 pv = None515 pv = None
518 if pv_str.lower() == 'true':516 if pv_str.lower() == 'true':
519 pv = True517 pv = True
@@ -654,7 +652,7 @@
654def scan_sr(session, instance_id, sr_ref):652def scan_sr(session, instance_id, sr_ref):
655 LOG.debug(_("Re-scanning SR %s"), sr_ref)653 LOG.debug(_("Re-scanning SR %s"), sr_ref)
656 task = session.call_xenapi('Async.SR.scan', sr_ref)654 task = session.call_xenapi('Async.SR.scan', sr_ref)
657 session.wait_for_task(instance_id, task)655 session.wait_for_task(task, instance_id)
658656
659657
660def wait_for_vhd_coalesce(session, instance_id, sr_ref, vdi_ref,658def wait_for_vhd_coalesce(session, instance_id, sr_ref, vdi_ref,
661659
=== modified file 'nova/virt/xenapi/vmops.py'
--- nova/virt/xenapi/vmops.py 2011-02-25 16:47:08 +0000
+++ nova/virt/xenapi/vmops.py 2011-03-02 23:12:16 +0000
@@ -49,6 +49,7 @@
49 def __init__(self, session):49 def __init__(self, session):
50 self.XenAPI = session.get_imported_xenapi()50 self.XenAPI = session.get_imported_xenapi()
51 self._session = session51 self._session = session
52
52 VMHelper.XenAPI = self.XenAPI53 VMHelper.XenAPI = self.XenAPI
5354
54 def list_instances(self):55 def list_instances(self):
@@ -62,20 +63,20 @@
6263
63 def spawn(self, instance):64 def spawn(self, instance):
64 """Create VM instance"""65 """Create VM instance"""
65 vm = VMHelper.lookup(self._session, instance.name)66 instance_name = instance.name
67 vm = VMHelper.lookup(self._session, instance_name)
66 if vm is not None:68 if vm is not None:
67 raise exception.Duplicate(_('Attempted to create'69 raise exception.Duplicate(_('Attempted to create'
68 ' non-unique name %s') % instance.name)70 ' non-unique name %s') % instance_name)
6971
70 #ensure enough free memory is available72 #ensure enough free memory is available
71 if not VMHelper.ensure_free_mem(self._session, instance):73 if not VMHelper.ensure_free_mem(self._session, instance):
72 name = instance['name']74 LOG.exception(_('instance %(instance_name)s: not enough free '
73 LOG.exception(_('instance %(name)s: not enough free memory')75 'memory') % locals())
74 % locals())76 db.instance_set_state(context.get_admin_context(),
75 db.instance_set_state(context.get_admin_context(),77 instance['id'],
76 instance['id'],78 power_state.SHUTDOWN)
77 power_state.SHUTDOWN)79 return
78 return
7980
80 user = AuthManager().get_user(instance.user_id)81 user = AuthManager().get_user(instance.user_id)
81 project = AuthManager().get_project(instance.project_id)82 project = AuthManager().get_project(instance.project_id)
@@ -116,10 +117,9 @@
116 self.create_vifs(instance, networks)117 self.create_vifs(instance, networks)
117118
118 LOG.debug(_('Starting VM %s...'), vm_ref)119 LOG.debug(_('Starting VM %s...'), vm_ref)
119 self._session.call_xenapi('VM.start', vm_ref, False, False)120 self._start(instance, vm_ref)
120 instance_name = instance.name
121 LOG.info(_('Spawning VM %(instance_name)s created %(vm_ref)s.')121 LOG.info(_('Spawning VM %(instance_name)s created %(vm_ref)s.')
122 % locals())122 % locals())
123123
124 def _inject_onset_files():124 def _inject_onset_files():
125 onset_files = instance.onset_files125 onset_files = instance.onset_files
@@ -143,18 +143,18 @@
143143
144 def _wait_for_boot():144 def _wait_for_boot():
145 try:145 try:
146 state = self.get_info(instance['name'])['state']146 state = self.get_info(instance_name)['state']
147 db.instance_set_state(context.get_admin_context(),147 db.instance_set_state(context.get_admin_context(),
148 instance['id'], state)148 instance['id'], state)
149 if state == power_state.RUNNING:149 if state == power_state.RUNNING:
150 LOG.debug(_('Instance %s: booted'), instance['name'])150 LOG.debug(_('Instance %s: booted'), instance_name)
151 timer.stop()151 timer.stop()
152 _inject_onset_files()152 _inject_onset_files()
153 return True153 return True
154 except Exception, exc:154 except Exception, exc:
155 LOG.warn(exc)155 LOG.warn(exc)
156 LOG.exception(_('instance %s: failed to boot'),156 LOG.exception(_('instance %s: failed to boot'),
157 instance['name'])157 instance_name)
158 db.instance_set_state(context.get_admin_context(),158 db.instance_set_state(context.get_admin_context(),
159 instance['id'],159 instance['id'],
160 power_state.SHUTDOWN)160 power_state.SHUTDOWN)
@@ -202,6 +202,20 @@
202 _('Instance not present %s') % instance_name)202 _('Instance not present %s') % instance_name)
203 return vm203 return vm
204204
205 def _acquire_bootlock(self, vm):
206 """Prevent an instance from booting"""
207 self._session.call_xenapi(
208 "VM.set_blocked_operations",
209 vm,
210 {"start": ""})
211
212 def _release_bootlock(self, vm):
213 """Allow an instance to boot"""
214 self._session.call_xenapi(
215 "VM.remove_from_blocked_operations",
216 vm,
217 "start")
218
205 def snapshot(self, instance, image_id):219 def snapshot(self, instance, image_id):
206 """ Create snapshot from a running VM instance220 """ Create snapshot from a running VM instance
207221
@@ -254,7 +268,7 @@
254 """Reboot VM instance"""268 """Reboot VM instance"""
255 vm = self._get_vm_opaque_ref(instance)269 vm = self._get_vm_opaque_ref(instance)
256 task = self._session.call_xenapi('Async.VM.clean_reboot', vm)270 task = self._session.call_xenapi('Async.VM.clean_reboot', vm)
257 self._session.wait_for_task(instance.id, task)271 self._session.wait_for_task(task, instance.id)
258272
259 def set_admin_password(self, instance, new_pass):273 def set_admin_password(self, instance, new_pass):
260 """Set the root/admin password on the VM instance. This is done via274 """Set the root/admin password on the VM instance. This is done via
@@ -294,6 +308,11 @@
294 raise RuntimeError(resp_dict['message'])308 raise RuntimeError(resp_dict['message'])
295 return resp_dict['message']309 return resp_dict['message']
296310
311 def _start(self, instance, vm):
312 """Start an instance"""
313 task = self._session.call_xenapi("Async.VM.start", vm, False, False)
314 self._session.wait_for_task(task, instance.id)
315
297 def inject_file(self, instance, b64_path, b64_contents):316 def inject_file(self, instance, b64_path, b64_contents):
298 """Write a file to the VM instance. The path to which it is to be317 """Write a file to the VM instance. The path to which it is to be
299 written and the contents of the file need to be supplied; both should318 written and the contents of the file need to be supplied; both should
@@ -320,8 +339,8 @@
320 raise RuntimeError(resp_dict['message'])339 raise RuntimeError(resp_dict['message'])
321 return resp_dict['message']340 return resp_dict['message']
322341
323 def _shutdown(self, instance, vm):342 def _shutdown(self, instance, vm, hard=True):
324 """Shutdown an instance """343 """Shutdown an instance"""
325 state = self.get_info(instance['name'])['state']344 state = self.get_info(instance['name'])['state']
326 if state == power_state.SHUTDOWN:345 if state == power_state.SHUTDOWN:
327 LOG.warn(_("VM %(vm)s already halted, skipping shutdown...") %346 LOG.warn(_("VM %(vm)s already halted, skipping shutdown...") %
@@ -332,8 +351,13 @@
332 LOG.debug(_("Shutting down VM for Instance %(instance_id)s")351 LOG.debug(_("Shutting down VM for Instance %(instance_id)s")
333 % locals())352 % locals())
334 try:353 try:
335 task = self._session.call_xenapi('Async.VM.hard_shutdown', vm)354 task = None
336 self._session.wait_for_task(instance.id, task)355 if hard:
356 task = self._session.call_xenapi("Async.VM.hard_shutdown", vm)
357 else:
358 task = self._session.call_xenapi("Async.VM.clean_shutdown", vm)
359
360 self._session.wait_for_task(task, instance.id)
337 except self.XenAPI.Failure, exc:361 except self.XenAPI.Failure, exc:
338 LOG.exception(exc)362 LOG.exception(exc)
339363
@@ -350,7 +374,7 @@
350 for vdi in vdis:374 for vdi in vdis:
351 try:375 try:
352 task = self._session.call_xenapi('Async.VDI.destroy', vdi)376 task = self._session.call_xenapi('Async.VDI.destroy', vdi)
353 self._session.wait_for_task(instance.id, task)377 self._session.wait_for_task(task, instance.id)
354 except self.XenAPI.Failure, exc:378 except self.XenAPI.Failure, exc:
355 LOG.exception(exc)379 LOG.exception(exc)
356380
@@ -389,7 +413,7 @@
389 args = {'kernel-file': kernel, 'ramdisk-file': ramdisk}413 args = {'kernel-file': kernel, 'ramdisk-file': ramdisk}
390 task = self._session.async_call_plugin(414 task = self._session.async_call_plugin(
391 'glance', 'remove_kernel_ramdisk', args)415 'glance', 'remove_kernel_ramdisk', args)
392 self._session.wait_for_task(instance.id, task)416 self._session.wait_for_task(task, instance.id)
393417
394 LOG.debug(_("kernel/ramdisk files removed"))418 LOG.debug(_("kernel/ramdisk files removed"))
395419
@@ -398,7 +422,7 @@
398 instance_id = instance.id422 instance_id = instance.id
399 try:423 try:
400 task = self._session.call_xenapi('Async.VM.destroy', vm)424 task = self._session.call_xenapi('Async.VM.destroy', vm)
401 self._session.wait_for_task(instance_id, task)425 self._session.wait_for_task(task, instance_id)
402 except self.XenAPI.Failure, exc:426 except self.XenAPI.Failure, exc:
403 LOG.exception(exc)427 LOG.exception(exc)
404428
@@ -441,7 +465,7 @@
441 def _wait_with_callback(self, instance_id, task, callback):465 def _wait_with_callback(self, instance_id, task, callback):
442 ret = None466 ret = None
443 try:467 try:
444 ret = self._session.wait_for_task(instance_id, task)468 ret = self._session.wait_for_task(task, instance_id)
445 except self.XenAPI.Failure, exc:469 except self.XenAPI.Failure, exc:
446 LOG.exception(exc)470 LOG.exception(exc)
447 callback(ret)471 callback(ret)
@@ -470,6 +494,78 @@
470 task = self._session.call_xenapi('Async.VM.resume', vm, False, True)494 task = self._session.call_xenapi('Async.VM.resume', vm, False, True)
471 self._wait_with_callback(instance.id, task, callback)495 self._wait_with_callback(instance.id, task, callback)
472496
497 def rescue(self, instance, callback):
498 """Rescue the specified instance
499 - shutdown the instance VM
500 - set 'bootlock' to prevent the instance from starting in rescue
501 - spawn a rescue VM (the vm name-label will be instance-N-rescue)
502
503 """
504 rescue_vm = VMHelper.lookup(self._session, instance.name + "-rescue")
505 if rescue_vm:
506 raise RuntimeError(_(
507 "Instance is already in Rescue Mode: %s" % instance.name))
508
509 vm = self._get_vm_opaque_ref(instance)
510 self._shutdown(instance, vm)
511 self._acquire_bootlock(vm)
512
513 instance._rescue = True
514 self.spawn(instance)
515 rescue_vm = self._get_vm_opaque_ref(instance)
516
517 vbd = self._session.get_xenapi().VM.get_VBDs(vm)[0]
518 vdi_ref = self._session.get_xenapi().VBD.get_record(vbd)["VDI"]
519 vbd_ref = VMHelper.create_vbd(
520 self._session,
521 rescue_vm,
522 vdi_ref,
523 1,
524 False)
525
526 self._session.call_xenapi("Async.VBD.plug", vbd_ref)
527
528 def unrescue(self, instance, callback):
529 """Unrescue the specified instance
530 - unplug the instance VM's disk from the rescue VM
531 - teardown the rescue VM
532 - release the bootlock to allow the instance VM to start
533
534 """
535 rescue_vm = VMHelper.lookup(self._session, instance.name + "-rescue")
536
537 if not rescue_vm:
538 raise exception.NotFound(_(
539 "Instance is not in Rescue Mode: %s" % instance.name))
540
541 original_vm = self._get_vm_opaque_ref(instance)
542 vbds = self._session.get_xenapi().VM.get_VBDs(rescue_vm)
543
544 instance._rescue = False
545
546 for vbd_ref in vbds:
547 vbd = self._session.get_xenapi().VBD.get_record(vbd_ref)
548 if vbd["userdevice"] == "1":
549 VMHelper.unplug_vbd(self._session, vbd_ref)
550 VMHelper.destroy_vbd(self._session, vbd_ref)
551
552 task1 = self._session.call_xenapi("Async.VM.hard_shutdown", rescue_vm)
553 self._session.wait_for_task(task1, instance.id)
554
555 vdis = VMHelper.lookup_vm_vdis(self._session, rescue_vm)
556 for vdi in vdis:
557 try:
558 task = self._session.call_xenapi('Async.VDI.destroy', vdi)
559 self._session.wait_for_task(task, instance.id)
560 except self.XenAPI.Failure:
561 continue
562
563 task2 = self._session.call_xenapi('Async.VM.destroy', rescue_vm)
564 self._session.wait_for_task(task2, instance.id)
565
566 self._release_bootlock(original_vm)
567 self._start(instance, original_vm)
568
473 def get_info(self, instance):569 def get_info(self, instance):
474 """Return data about VM instance"""570 """Return data about VM instance"""
475 vm = self._get_vm_opaque_ref(instance)571 vm = self._get_vm_opaque_ref(instance)
@@ -556,8 +652,17 @@
556 NetworkHelper.find_network_with_bridge(self._session, bridge)652 NetworkHelper.find_network_with_bridge(self._session, bridge)
557653
558 if network_ref:654 if network_ref:
559 VMHelper.create_vif(self._session, vm_opaque_ref,655 try:
560 network_ref, instance.mac_address)656 device = "1" if instance._rescue else "0"
657 except AttributeError:
658 device = "0"
659
660 VMHelper.create_vif(
661 self._session,
662 vm_opaque_ref,
663 network_ref,
664 instance.mac_address,
665 device)
561666
562 def reset_network(self, instance):667 def reset_network(self, instance):
563 """668 """
@@ -627,7 +732,7 @@
627 args.update(addl_args)732 args.update(addl_args)
628 try:733 try:
629 task = self._session.async_call_plugin(plugin, method, args)734 task = self._session.async_call_plugin(plugin, method, args)
630 ret = self._session.wait_for_task(instance_id, task)735 ret = self._session.wait_for_task(task, instance_id)
631 except self.XenAPI.Failure, e:736 except self.XenAPI.Failure, e:
632 ret = None737 ret = None
633 err_trace = e.details[-1]738 err_trace = e.details[-1]
634739
=== modified file 'nova/virt/xenapi/volumeops.py'
--- nova/virt/xenapi/volumeops.py 2011-01-19 20:26:09 +0000
+++ nova/virt/xenapi/volumeops.py 2011-03-02 23:12:16 +0000
@@ -83,7 +83,7 @@
83 try:83 try:
84 task = self._session.call_xenapi('Async.VBD.plug',84 task = self._session.call_xenapi('Async.VBD.plug',
85 vbd_ref)85 vbd_ref)
86 self._session.wait_for_task(vol_rec['deviceNumber'], task)86 self._session.wait_for_task(task, vol_rec['deviceNumber'])
87 except self.XenAPI.Failure, exc:87 except self.XenAPI.Failure, exc:
88 LOG.exception(exc)88 LOG.exception(exc)
89 VolumeHelper.destroy_iscsi_storage(self._session,89 VolumeHelper.destroy_iscsi_storage(self._session,
9090
=== modified file 'nova/virt/xenapi_conn.py'
--- nova/virt/xenapi_conn.py 2011-02-25 16:47:08 +0000
+++ nova/virt/xenapi_conn.py 2011-03-02 23:12:16 +0000
@@ -196,6 +196,14 @@
196 """resume the specified instance"""196 """resume the specified instance"""
197 self._vmops.resume(instance, callback)197 self._vmops.resume(instance, callback)
198198
199 def rescue(self, instance, callback):
200 """Rescue the specified instance"""
201 self._vmops.rescue(instance, callback)
202
203 def unrescue(self, instance, callback):
204 """Unrescue the specified instance"""
205 self._vmops.unrescue(instance, callback)
206
199 def reset_network(self, instance):207 def reset_network(self, instance):
200 """reset networking for specified instance"""208 """reset networking for specified instance"""
201 self._vmops.reset_network(instance)209 self._vmops.reset_network(instance)
@@ -279,7 +287,7 @@
279 self._session.xenapi.Async.host.call_plugin,287 self._session.xenapi.Async.host.call_plugin,
280 self.get_xenapi_host(), plugin, fn, args)288 self.get_xenapi_host(), plugin, fn, args)
281289
282 def wait_for_task(self, id, task):290 def wait_for_task(self, task, id=None):
283 """Return the result of the given task. The task is polled291 """Return the result of the given task. The task is polled
284 until it completes. Not re-entrant."""292 until it completes. Not re-entrant."""
285 done = event.Event()293 done = event.Event()
@@ -306,10 +314,11 @@
306 try:314 try:
307 name = self._session.xenapi.task.get_name_label(task)315 name = self._session.xenapi.task.get_name_label(task)
308 status = self._session.xenapi.task.get_status(task)316 status = self._session.xenapi.task.get_status(task)
309 action = dict(317 if id:
310 instance_id=int(id),318 action = dict(
311 action=name[0:255], # Ensure action is never > 255319 instance_id=int(id),
312 error=None)320 action=name[0:255], # Ensure action is never > 255
321 error=None)
313 if status == "pending":322 if status == "pending":
314 return323 return
315 elif status == "success":324 elif status == "success":
@@ -323,7 +332,9 @@
323 LOG.warn(_("Task [%(name)s] %(task)s status:"332 LOG.warn(_("Task [%(name)s] %(task)s status:"
324 " %(status)s %(error_info)s") % locals())333 " %(status)s %(error_info)s") % locals())
325 done.send_exception(self.XenAPI.Failure(error_info))334 done.send_exception(self.XenAPI.Failure(error_info))
326 db.instance_action_create(context.get_admin_context(), action)335
336 if id:
337 db.instance_action_create(context.get_admin_context(), action)
327 except self.XenAPI.Failure, exc:338 except self.XenAPI.Failure, exc:
328 LOG.warn(exc)339 LOG.warn(exc)
329 done.send_exception(*sys.exc_info())340 done.send_exception(*sys.exc_info())