Merge lp:~termie/nova/libvirt_snapshot into lp:~hudson-openstack/nova/trunk

Proposed by termie
Status: Merged
Approved by: Jay Pipes
Approved revision: 899
Merged at revision: 899
Proposed branch: lp:~termie/nova/libvirt_snapshot
Merge into: lp:~hudson-openstack/nova/trunk
Diff against target: 102 lines (+68/-6)
1 file modified
nova/virt/libvirt_conn.py (+68/-6)
To merge this branch: bzr merge lp:~termie/nova/libvirt_snapshot
Reviewer Review Type Date Requested Status
Jay Pipes (community) Approve
Vish Ishaya (community) Approve
Thierry Carrez (community) ffe Approve
Jesse Andrews (community) Approve
Review via email: mp+54621@code.launchpad.net

Description of the change

Adds support for snapshotting (to a new image) in the libvirt code.

To post a comment you must log in.
Revision history for this message
Vish Ishaya (vishvananda) wrote :

Very nice. I think this a huge win to get into cactus. One issue with the metadata. There was a recent change to glance to use image_format and container_format instead of type. Type is now a property. I think you therefore need metadata like the following:

metadata = {'disk_format': base['disk_format'],
            'container_format': base['container_format'],
            'is_public': False,
            'properties': {'architecture': base['architecture'],
                           'type': base['type'], # or 'machine' if you prefer
...
There is also an optional 'name' which could perhaps be something like:
            'name': '%s.%s' (base['name'], snapshot['name'])
although I'm not super attached to that.

review: Needs Fixing
lp:~termie/nova/libvirt_snapshot updated
856. By Soren Hansen

Move all types of locking into utils.synchronize decorator.

Convert all uses of semaphores to use this decorator.

Revision history for this message
termie (termie) wrote :

pushed the changes you asked for, including the name bit, though i used image_id instead of snapshot['name']

lp:~termie/nova/libvirt_snapshot updated
857. By Tushar Patil

boto_v6 module is imported if the flag "use_ipv6" is set to True

858. By Josh Kearney

Offers the ability to run a periodic_task that sweeps through rescued instances older than 24 hours and forcibly unrescues them.

Flag added: rescue_timeout (default is 0 - disabled)

859. By Todd Willey

Fix issues with certificate updating & whitespace removal

860. By justinsb

better logging of exceptions

Revision history for this message
Jesse Andrews (anotherjesse) wrote :

can we get test coverage in the smoketests for this?

lp:~termie/nova/libvirt_snapshot updated
861. By Naveed Massjouni

Added a mechanism for versioned controllers for openstack api versions 1.0/1.1.
Create servers in the 1.1 api now supports imageRef/flavorRef instead of imageId/flavorId.

862. By justinsb

Poll instance states periodically, so that we can detect when something changes 'behind the scenes'.

Beginnings of work on Bug #661214 and Bug #661260.

863. By Soren Hansen

Pass a fake timing source to live_migration_pre in every test that expectes it to fail, shaving off a whole minute of test run time.

864. By Matt Dietz

Fixes lp740322: cannot run test_localization in isolation

865. By Sandy Walsh

Aggregates capabilities from Compute, Network, Volume to the ZoneManager in Scheduler.

866. By Muneyuki Noguchi

Fix lp741514 by declaring libvirt_type in nova-manage.

867. By justinsb

Detect if user is running the default Lucid version of libvirt, and give a nicer error message

868. By justinsb

Don't try to parse the empty string as a datetime

869. By Josh Kleinpeter

Made service_get_all()'s disabled parameter default to None. Pass False for enabled services; True for disabled services. Calls to this method have been updated to remain consistent.

870. By Brian Lamar

Pylint 'Undefined variable' E0602 error fixes.

871. By Todd Willey

Fix api logging to show proper path and controller:action.

872. By justinsb

Fix some errors that pylint found in nova/api/openstack/servers.py

This was meant more as a test that pylint is actually being helpful now (it is), but these are real errors.

873. By termie

Fixes a bug that was causing tests to fail on OS X by ensuring that greenthread sleep is called during retry loops.

874. By Sandy Walsh

In this branch we are forwarding incoming requests to child zones when the requested resource is not found in the current zone.

For example: If 'nova pause 123' is issued against Zone 1, but instance 123 does not live in Zone 1, the call will be forwarded to all child zones hoping someone can deal with it.

NOTE: This currently only works with OpenStack API requests and routing checks are only being done against Compute/instance_id checks.
Specifically:
* servers.get/pause/unpause/diagnostics/suspend/resume/rescue/unrescue/delete
* servers.create is pending for distributed scheduler
* servers.get_all will get added early in Diablo.

What I've been doing for testing:
1. Set up a Nova deployment in a VM (Zone0)
2. Clone the VM and set --zone_name=zone1 (and change all the IP addresses to the new address in nova.conf, glance.conf and novarc)
3. Set --enable_zone_routing=true on all zones
4. use the --connection_type=fake driver for compute to keep things easy
5. Add Zone1 as a child of Zone0 (nova zone-add)

(make sure the instance id's are different in each zone)

Example of calls being sent to child zones:
http://paste.openstack.org/show/964/

875. By termie

Additions to the Direct API:

 * Add an example of a versioned api
 * Add some more docs to direct.py
 * Add Limited, an API limiting/versioning wrapper
 * Improve the formatting of the stack tool
 * Add support for volume and network services to the direct api

876. By Ilya Alekseyev

libvirt driver multi_nic support. In this phase libvirt can work with and without multi_nic support, as in multi_nic support for xenapi: https://code.launchpad.net/~tr3buchet/nova/xs_multi_nic/+merge/53458

877. By Josh Kearney

Adds unit test coverage for XenAPI Rescue & Unrescue.

878. By Dan Prince

Implement API extensions for the Openstack API. Based on the Openstack 1.1 API the following types of extensions are supported:

-Top level resources (extension)
-Action extensions (add an extra action to a core nova controller)
-Response extensions (inject data into response from core nova controllers)

To add an extension simply drop an extension file into the configured osapi_extensions_path (which defaults to /var/lib/nova/extensions).

See nova/tests/api/openstack/extensions/foxinsocks.py for an example Extension.

879. By Salvatore Orlando

Fix for bug #740947
Executing parted with sudo in _write_partition (vm_utils.py)

880. By Rick Harris

Adds serverId to OpenStack API image detail per related_image blueprint

881. By Sateesh

Implementation of blueprint hypervisor-vmware-vsphere-support. (Link to blueprint: https://blueprints.launchpad.net/nova/+spec/hypervisor-vmware-vsphere-support)

Adds support for hypervisor vmware ESX/ESXi server in OpenStack (Nova).

Key features included are,
1) Support for FLAT and VLAN networking model
2) Support for Guest console access through VMware vmrc
3) Integrated with Glance service for image storage and retrival

Documentation: A readme file at "doc/source/vmwareapi_readme.rst" encapsulates configuration/installation instructions required to use this module/feature.

882. By Naveed Massjouni

Support for markers for pagination as defined in the 1.1 spec.

883. By Muneyuki Noguchi

Fix lp741415 by splitting arguments of _execute in the iSCSI driver.

884. By termie

Ports the Tornado version of an S3 server to eventlet and wsgi, first step in deprecating the twistd-based objectstore.

This is a trivial implementation, never meant for production, it exists to provide an s3-look-alike objectstore for use when developing/testing things related to the amazon APIs (eucatools, etc), any production deployment would be expected to use Swift + an S3 interface.

In later patches I expect to be able to remove the old objectstore code entirely.

885. By Naveed Massjouni

Support for markers for pagination as defined in the 1.1 spec.

886. By Dan Prince

Implement metadata resource for Openstack API v1.1. Includes:
      -GET /servers/id/meta
      -POST /servers/id/meta
      -GET /servers/id/meta/key
      -PUT /servers/id/meta/key
      -DELETE /servers/id/meta/key

887. By Brian Waldon

Add a "links" container to flavors entities for Openstack API v1.1.

888. By Dan Prince

Implement image metadata controller for the v1.1 OS API.

Uses image 'properties' to store and retrieve image metadata.

889. By Andy Southgate

This is basic network injection for XenServer, and includes:

o Modification of the /etc/network/interfaces file within the image using code taken from and now shared with libvirt_conn. This is for compatibility with legacy Linux images without a guest agent.

o Setting of xenstore keys before instance boot, intended for the XenServer Windows agent. The agent will use these to configure the network at boot-time.

This change does not implement live reconfiguration, which is on another blueprint:

https://blueprints.launchpad.net/nova/+spec/xs-inject-networking

It does include template code to detect the presence of agents and avoid modifying the filesystem if they are injection-capable.

890. By Brian Waldon

- add a "links" container to versions entities for Openstack API v1.1
- add testing for the openstack api versions resource and create a view builder

Revision history for this message
Thierry Carrez (ttx) wrote :

Sounds rather cool and self-contained... If you want this into Cactus rather than wait for Diablo, could you provide a quick benefit vs. risk of regression rationale, then request a specific FFe review from me ? I think we should try to sneak this in.

lp:~termie/nova/libvirt_snapshot updated
891. By Rick Harris

disk_format is now an ImageService property. Adds tests to prevent regression.

Revision history for this message
Thierry Carrez (ttx) :
review: Needs Information (ffe)
Revision history for this message
Jesse Andrews (anotherjesse) wrote :

Rational is that the bundling process is a PITA.

This brings kvm up to date with xen in the ability to snapshot externally

review: Approve
lp:~termie/nova/libvirt_snapshot updated
892. By Brian Waldon

Adding links container to openstack api v1.1 servers entities.

893. By Vish Ishaya

Updates to the newest version of nova.sh, which includes:
 * Installing new python dependencies
 * Allows for use of interfaces other than eth0
 * Adds a run_detached mode for automated testing

Revision history for this message
Thierry Carrez (ttx) wrote :

It's aligned with our "product coherence" goals for this release and is nicely self-contained. It needs some smoketesting though, so it needs to be merged ASAP. Ffe granted for late merging by Tuesday EOD.

review: Approve (ffe)
lp:~termie/nova/libvirt_snapshot updated
894. By Ilya Alekseyev

use_ipv6 now passing to interfaces.template as first level variable in libvirt_conn

Revision history for this message
Jay Pipes (jaypipes) wrote :

I agree that this is great, Andy. Very nice work.

7 import time
8 +import tempfile

i before e except when listing alphabetically. ;)

52 + 'image_location': 'snapshot',

Since there is a 'location' key already returned from Glance, perhaps you could come up with a slightly different key name than 'image_location'? However, I don't see this key used anywhere else in the code? Could you explain what code uses the 'image_location' property?

53 + 'image_state': 'available',

The image services have a 'status' column, containing ('active', 'queued', 'saving', 'killed'). Is it possible to remove the above custom property in favour of the status column? Could you explain what code uses this custom property, if any?

Just want to make sure we don't add seemingly redundant columns that could potentially introduce confusion into the image service API. We've had some issues in the last months with columns in the various image services meaning different things and have tried to get the problem under control.

Thanks in advance for the answers, Andy.

-jay

review: Needs Fixing
Revision history for this message
Vish Ishaya (vishvananda) wrote :

I'll jump in and answer these questions since they are actually due to the ec2 / s3_api code. It is possible we can come up with better solutions for them, but i they should probably go in separately, since they aren't specifically due to snapshotting.

On Mar 28, 2011, at 6:57 AM, Jay Pipes wrote:

> Review: Needs Fixing
> I agree that this is great, Andy. Very nice work.
>
> 7 import time
> 8 +import tempfile
>
> i before e except when listing alphabetically. ;)
>
> 52 + 'image_location': 'snapshot',
>
> Since there is a 'location' key already returned from Glance, perhaps you could come up with a slightly different key name than 'image_location'? However, I don't see this key used anywhere else in the code? Could you explain what code uses the 'image_location' property?

Image_location is used by the ec2_api. In the amazon world it is the s3 bucket that the image was uploaded to. We use it as kind of a "description" field to show where the image came from. We couldn't use the "location" field in glance because that is used by the backend. The unfortunate naming is just because that is the name for the field in ec2: imageLocation, and it didn't seem worth it to rename it.

>
> 53 + 'image_state': 'available',
>
> The image services have a 'status' column, containing ('active', 'queued', 'saving', 'killed'). Is it possible to remove the above custom property in favour of the status column? Could you explain what code uses this custom property, if any?

When the s3 service is registering an image, it goes through a few state changes so that the user can see the progress ("untarring", "decrypting", "available"). So the status column was not quite robust enough. Perhaps long term we should have two status fields: status and status_description. Status could be a status "code" and description could be specified for ui perposes.

>
> Just want to make sure we don't add seemingly redundant columns that could potentially introduce confusion into the image service API. We've had some issues in the last months with columns in the various image services meaning different things and have tried to get the problem under control.
>
> Thanks in advance for the answers, Andy.
>
> -jay

In short, some cleanup needs to be done here, but I think the cleanup is to the s3 image service / glance integration, not to the snapshotting code specifically.

> --
> https://code.launchpad.net/~termie/nova/libvirt_snapshot/+merge/54621
> You are reviewing the proposed merge of lp:~termie/nova/libvirt_snapshot into lp:nova.

lp:~termie/nova/libvirt_snapshot updated
895. By justinsb

Merged the two periodic_tasks functions, that snuck in due to parallel merges in compute.manager

Revision history for this message
Jay Pipes (jaypipes) wrote :

Thanks, Vish, those were excellent answers, and appreciated. Other than that import order nit, looks fine to me. Cheers. :)

lp:~termie/nova/libvirt_snapshot updated
896. By justinsb

Mixins for tests confuse pylint no end, and aren't necessary... you can stop the base-class from being run as a test by prefixing the class name with an underscore

897. By justinsb

Assume that if we don't find a VM for an instance in the DB, and the DB state is NOSTATE, that the db instance is in the process of being spawned, and don't mark it SHUTOFF.

Fix for bug#744056

Revision history for this message
Vish Ishaya (vishvananda) wrote :

Looks good now. Just waiting on the import order fix.

review: Approve
lp:~termie/nova/libvirt_snapshot updated
898. By termie

add snapshot support for libvirt

899. By termie

update glance params per review

Revision history for this message
termie (termie) wrote :

import order fixed, looks like the result of too many merges.

Revision history for this message
Jay Pipes (jaypipes) wrote :

cool. lgtm. nice work, Andy.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'nova/virt/libvirt_conn.py'
--- nova/virt/libvirt_conn.py 2011-03-27 17:15:47 +0000
+++ nova/virt/libvirt_conn.py 2011-03-28 17:56:24 +0000
@@ -38,12 +38,15 @@
3838
39import multiprocessing39import multiprocessing
40import os40import os
41import random
41import shutil42import shutil
43import subprocess
42import sys44import sys
43import random45import tempfile
44import subprocess46import time
45import uuid47import uuid
46from xml.dom import minidom48from xml.dom import minidom
49from xml.etree import ElementTree
4750
48from eventlet import greenthread51from eventlet import greenthread
49from eventlet import tpool52from eventlet import tpool
@@ -111,6 +114,8 @@
111 'Define live migration behavior.')114 'Define live migration behavior.')
112flags.DEFINE_integer('live_migration_bandwidth', 0,115flags.DEFINE_integer('live_migration_bandwidth', 0,
113 'Define live migration behavior')116 'Define live migration behavior')
117flags.DEFINE_string('qemu_img', 'qemu-img',
118 'binary to use for qemu-img commands')
114119
115120
116def get_connection(read_only):121def get_connection(read_only):
@@ -397,10 +402,67 @@
397402
398 @exception.wrap_exception403 @exception.wrap_exception
399 def snapshot(self, instance, image_id):404 def snapshot(self, instance, image_id):
400 """ Create snapshot from a running VM instance """405 """Create snapshot from a running VM instance.
401 raise NotImplementedError(406
402 _("Instance snapshotting is not supported for libvirt"407 This command only works with qemu 0.14+, the qemu_img flag is
403 "at this time"))408 provided so that a locally compiled binary of qemu-img can be used
409 to support this command.
410
411 """
412 image_service = utils.import_object(FLAGS.image_service)
413 virt_dom = self._conn.lookupByName(instance['name'])
414 elevated = context.get_admin_context()
415
416 base = image_service.show(elevated, instance['image_id'])
417
418 metadata = {'disk_format': base['disk_format'],
419 'container_format': base['container_format'],
420 'is_public': False,
421 'properties': {'architecture': base['architecture'],
422 'type': base['type'],
423 'name': '%s.%s' % (base['name'], image_id),
424 'kernel_id': instance['kernel_id'],
425 'image_location': 'snapshot',
426 'image_state': 'available',
427 'owner_id': instance['project_id'],
428 'ramdisk_id': instance['ramdisk_id'],
429 }
430 }
431
432 # Make the snapshot
433 snapshot_name = uuid.uuid4().hex
434 snapshot_xml = """
435 <domainsnapshot>
436 <name>%s</name>
437 </domainsnapshot>
438 """ % snapshot_name
439 snapshot_ptr = virt_dom.snapshotCreateXML(snapshot_xml, 0)
440
441 # Find the disk
442 xml_desc = virt_dom.XMLDesc(0)
443 domain = ElementTree.fromstring(xml_desc)
444 source = domain.find('devices/disk/source')
445 disk_path = source.get('file')
446
447 # Export the snapshot to a raw image
448 temp_dir = tempfile.mkdtemp()
449 out_path = os.path.join(temp_dir, snapshot_name)
450 qemu_img_cmd = '%s convert -f qcow2 -O raw -s %s %s %s' % (
451 FLAGS.qemu_img,
452 snapshot_name,
453 disk_path,
454 out_path)
455 utils.execute(qemu_img_cmd)
456
457 # Upload that image to the image service
458 with open(out_path) as image_file:
459 image_service.update(elevated,
460 image_id,
461 metadata,
462 image_file)
463
464 # Clean up
465 shutil.rmtree(temp_dir)
404466
405 @exception.wrap_exception467 @exception.wrap_exception
406 def reboot(self, instance):468 def reboot(self, instance):