Merge lp:~termie/nova/libvirt_snapshot into lp:~hudson-openstack/nova/trunk

Proposed by termie
Status: Merged
Approved by: Jay Pipes
Approved revision: 899
Merged at revision: 899
Proposed branch: lp:~termie/nova/libvirt_snapshot
Merge into: lp:~hudson-openstack/nova/trunk
Diff against target: 102 lines (+68/-6)
1 file modified
nova/virt/libvirt_conn.py (+68/-6)
To merge this branch: bzr merge lp:~termie/nova/libvirt_snapshot
Reviewer Review Type Date Requested Status
Jay Pipes (community) Approve
Vish Ishaya (community) Approve
Thierry Carrez (community) ffe Approve
Jesse Andrews (community) Approve
Review via email: mp+54621@code.launchpad.net

Description of the change

Adds support for snapshotting (to a new image) in the libvirt code.

To post a comment you must log in.
Revision history for this message
Vish Ishaya (vishvananda) wrote :

Very nice. I think this a huge win to get into cactus. One issue with the metadata. There was a recent change to glance to use image_format and container_format instead of type. Type is now a property. I think you therefore need metadata like the following:

metadata = {'disk_format': base['disk_format'],
            'container_format': base['container_format'],
            'is_public': False,
            'properties': {'architecture': base['architecture'],
                           'type': base['type'], # or 'machine' if you prefer
...
There is also an optional 'name' which could perhaps be something like:
            'name': '%s.%s' (base['name'], snapshot['name'])
although I'm not super attached to that.

review: Needs Fixing
lp:~termie/nova/libvirt_snapshot updated
856. By Soren Hansen

Move all types of locking into utils.synchronize decorator.

Convert all uses of semaphores to use this decorator.

Revision history for this message
termie (termie) wrote :

pushed the changes you asked for, including the name bit, though i used image_id instead of snapshot['name']

lp:~termie/nova/libvirt_snapshot updated
857. By Tushar Patil

boto_v6 module is imported if the flag "use_ipv6" is set to True

858. By Josh Kearney

Offers the ability to run a periodic_task that sweeps through rescued instances older than 24 hours and forcibly unrescues them.

Flag added: rescue_timeout (default is 0 - disabled)

859. By Todd Willey

Fix issues with certificate updating & whitespace removal

860. By justinsb

better logging of exceptions

Revision history for this message
Jesse Andrews (anotherjesse) wrote :

can we get test coverage in the smoketests for this?

lp:~termie/nova/libvirt_snapshot updated
861. By Naveed Massjouni

Added a mechanism for versioned controllers for openstack api versions 1.0/1.1.
Create servers in the 1.1 api now supports imageRef/flavorRef instead of imageId/flavorId.

862. By justinsb

Poll instance states periodically, so that we can detect when something changes 'behind the scenes'.

Beginnings of work on Bug #661214 and Bug #661260.

863. By Soren Hansen

Pass a fake timing source to live_migration_pre in every test that expectes it to fail, shaving off a whole minute of test run time.

864. By Matt Dietz

Fixes lp740322: cannot run test_localization in isolation

865. By Sandy Walsh

Aggregates capabilities from Compute, Network, Volume to the ZoneManager in Scheduler.

866. By Muneyuki Noguchi

Fix lp741514 by declaring libvirt_type in nova-manage.

867. By justinsb

Detect if user is running the default Lucid version of libvirt, and give a nicer error message

868. By justinsb

Don't try to parse the empty string as a datetime

869. By Josh Kleinpeter

Made service_get_all()'s disabled parameter default to None. Pass False for enabled services; True for disabled services. Calls to this method have been updated to remain consistent.

870. By Brian Lamar

Pylint 'Undefined variable' E0602 error fixes.

871. By Todd Willey

Fix api logging to show proper path and controller:action.

872. By justinsb

Fix some errors that pylint found in nova/api/openstack/servers.py

This was meant more as a test that pylint is actually being helpful now (it is), but these are real errors.

873. By termie

Fixes a bug that was causing tests to fail on OS X by ensuring that greenthread sleep is called during retry loops.

874. By Sandy Walsh

In this branch we are forwarding incoming requests to child zones when the requested resource is not found in the current zone.

For example: If 'nova pause 123' is issued against Zone 1, but instance 123 does not live in Zone 1, the call will be forwarded to all child zones hoping someone can deal with it.

NOTE: This currently only works with OpenStack API requests and routing checks are only being done against Compute/instance_id checks.
Specifically:
* servers.get/pause/unpause/diagnostics/suspend/resume/rescue/unrescue/delete
* servers.create is pending for distributed scheduler
* servers.get_all will get added early in Diablo.

What I've been doing for testing:
1. Set up a Nova deployment in a VM (Zone0)
2. Clone the VM and set --zone_name=zone1 (and change all the IP addresses to the new address in nova.conf, glance.conf and novarc)
3. Set --enable_zone_routing=true on all zones
4. use the --connection_type=fake driver for compute to keep things easy
5. Add Zone1 as a child of Zone0 (nova zone-add)

(make sure the instance id's are different in each zone)

Example of calls being sent to child zones:
http://paste.openstack.org/show/964/

875. By termie

Additions to the Direct API:

 * Add an example of a versioned api
 * Add some more docs to direct.py
 * Add Limited, an API limiting/versioning wrapper
 * Improve the formatting of the stack tool
 * Add support for volume and network services to the direct api

876. By Ilya Alekseyev

libvirt driver multi_nic support. In this phase libvirt can work with and without multi_nic support, as in multi_nic support for xenapi: https://code.launchpad.net/~tr3buchet/nova/xs_multi_nic/+merge/53458

877. By Josh Kearney

Adds unit test coverage for XenAPI Rescue & Unrescue.

878. By Dan Prince

Implement API extensions for the Openstack API. Based on the Openstack 1.1 API the following types of extensions are supported:

-Top level resources (extension)
-Action extensions (add an extra action to a core nova controller)
-Response extensions (inject data into response from core nova controllers)

To add an extension simply drop an extension file into the configured osapi_extensions_path (which defaults to /var/lib/nova/extensions).

See nova/tests/api/openstack/extensions/foxinsocks.py for an example Extension.

879. By Salvatore Orlando

Fix for bug #740947
Executing parted with sudo in _write_partition (vm_utils.py)

880. By Rick Harris

Adds serverId to OpenStack API image detail per related_image blueprint

881. By Sateesh

Implementation of blueprint hypervisor-vmware-vsphere-support. (Link to blueprint: https://blueprints.launchpad.net/nova/+spec/hypervisor-vmware-vsphere-support)

Adds support for hypervisor vmware ESX/ESXi server in OpenStack (Nova).

Key features included are,
1) Support for FLAT and VLAN networking model
2) Support for Guest console access through VMware vmrc
3) Integrated with Glance service for image storage and retrival

Documentation: A readme file at "doc/source/vmwareapi_readme.rst" encapsulates configuration/installation instructions required to use this module/feature.

882. By Naveed Massjouni

Support for markers for pagination as defined in the 1.1 spec.

883. By Muneyuki Noguchi

Fix lp741415 by splitting arguments of _execute in the iSCSI driver.

884. By termie

Ports the Tornado version of an S3 server to eventlet and wsgi, first step in deprecating the twistd-based objectstore.

This is a trivial implementation, never meant for production, it exists to provide an s3-look-alike objectstore for use when developing/testing things related to the amazon APIs (eucatools, etc), any production deployment would be expected to use Swift + an S3 interface.

In later patches I expect to be able to remove the old objectstore code entirely.

885. By Naveed Massjouni

Support for markers for pagination as defined in the 1.1 spec.

886. By Dan Prince

Implement metadata resource for Openstack API v1.1. Includes:
      -GET /servers/id/meta
      -POST /servers/id/meta
      -GET /servers/id/meta/key
      -PUT /servers/id/meta/key
      -DELETE /servers/id/meta/key

887. By Brian Waldon

Add a "links" container to flavors entities for Openstack API v1.1.

888. By Dan Prince

Implement image metadata controller for the v1.1 OS API.

Uses image 'properties' to store and retrieve image metadata.

889. By Andy Southgate

This is basic network injection for XenServer, and includes:

o Modification of the /etc/network/interfaces file within the image using code taken from and now shared with libvirt_conn. This is for compatibility with legacy Linux images without a guest agent.

o Setting of xenstore keys before instance boot, intended for the XenServer Windows agent. The agent will use these to configure the network at boot-time.

This change does not implement live reconfiguration, which is on another blueprint:

https://blueprints.launchpad.net/nova/+spec/xs-inject-networking

It does include template code to detect the presence of agents and avoid modifying the filesystem if they are injection-capable.

890. By Brian Waldon

- add a "links" container to versions entities for Openstack API v1.1
- add testing for the openstack api versions resource and create a view builder

Revision history for this message
Thierry Carrez (ttx) wrote :

Sounds rather cool and self-contained... If you want this into Cactus rather than wait for Diablo, could you provide a quick benefit vs. risk of regression rationale, then request a specific FFe review from me ? I think we should try to sneak this in.

lp:~termie/nova/libvirt_snapshot updated
891. By Rick Harris

disk_format is now an ImageService property. Adds tests to prevent regression.

Revision history for this message
Thierry Carrez (ttx) :
review: Needs Information (ffe)
Revision history for this message
Jesse Andrews (anotherjesse) wrote :

Rational is that the bundling process is a PITA.

This brings kvm up to date with xen in the ability to snapshot externally

review: Approve
lp:~termie/nova/libvirt_snapshot updated
892. By Brian Waldon

Adding links container to openstack api v1.1 servers entities.

893. By Vish Ishaya

Updates to the newest version of nova.sh, which includes:
 * Installing new python dependencies
 * Allows for use of interfaces other than eth0
 * Adds a run_detached mode for automated testing

Revision history for this message
Thierry Carrez (ttx) wrote :

It's aligned with our "product coherence" goals for this release and is nicely self-contained. It needs some smoketesting though, so it needs to be merged ASAP. Ffe granted for late merging by Tuesday EOD.

review: Approve (ffe)
lp:~termie/nova/libvirt_snapshot updated
894. By Ilya Alekseyev

use_ipv6 now passing to interfaces.template as first level variable in libvirt_conn

Revision history for this message
Jay Pipes (jaypipes) wrote :

I agree that this is great, Andy. Very nice work.

7 import time
8 +import tempfile

i before e except when listing alphabetically. ;)

52 + 'image_location': 'snapshot',

Since there is a 'location' key already returned from Glance, perhaps you could come up with a slightly different key name than 'image_location'? However, I don't see this key used anywhere else in the code? Could you explain what code uses the 'image_location' property?

53 + 'image_state': 'available',

The image services have a 'status' column, containing ('active', 'queued', 'saving', 'killed'). Is it possible to remove the above custom property in favour of the status column? Could you explain what code uses this custom property, if any?

Just want to make sure we don't add seemingly redundant columns that could potentially introduce confusion into the image service API. We've had some issues in the last months with columns in the various image services meaning different things and have tried to get the problem under control.

Thanks in advance for the answers, Andy.

-jay

review: Needs Fixing
Revision history for this message
Vish Ishaya (vishvananda) wrote :

I'll jump in and answer these questions since they are actually due to the ec2 / s3_api code. It is possible we can come up with better solutions for them, but i they should probably go in separately, since they aren't specifically due to snapshotting.

On Mar 28, 2011, at 6:57 AM, Jay Pipes wrote:

> Review: Needs Fixing
> I agree that this is great, Andy. Very nice work.
>
> 7 import time
> 8 +import tempfile
>
> i before e except when listing alphabetically. ;)
>
> 52 + 'image_location': 'snapshot',
>
> Since there is a 'location' key already returned from Glance, perhaps you could come up with a slightly different key name than 'image_location'? However, I don't see this key used anywhere else in the code? Could you explain what code uses the 'image_location' property?

Image_location is used by the ec2_api. In the amazon world it is the s3 bucket that the image was uploaded to. We use it as kind of a "description" field to show where the image came from. We couldn't use the "location" field in glance because that is used by the backend. The unfortunate naming is just because that is the name for the field in ec2: imageLocation, and it didn't seem worth it to rename it.

>
> 53 + 'image_state': 'available',
>
> The image services have a 'status' column, containing ('active', 'queued', 'saving', 'killed'). Is it possible to remove the above custom property in favour of the status column? Could you explain what code uses this custom property, if any?

When the s3 service is registering an image, it goes through a few state changes so that the user can see the progress ("untarring", "decrypting", "available"). So the status column was not quite robust enough. Perhaps long term we should have two status fields: status and status_description. Status could be a status "code" and description could be specified for ui perposes.

>
> Just want to make sure we don't add seemingly redundant columns that could potentially introduce confusion into the image service API. We've had some issues in the last months with columns in the various image services meaning different things and have tried to get the problem under control.
>
> Thanks in advance for the answers, Andy.
>
> -jay

In short, some cleanup needs to be done here, but I think the cleanup is to the s3 image service / glance integration, not to the snapshotting code specifically.

> --
> https://code.launchpad.net/~termie/nova/libvirt_snapshot/+merge/54621
> You are reviewing the proposed merge of lp:~termie/nova/libvirt_snapshot into lp:nova.

lp:~termie/nova/libvirt_snapshot updated
895. By justinsb

Merged the two periodic_tasks functions, that snuck in due to parallel merges in compute.manager

Revision history for this message
Jay Pipes (jaypipes) wrote :

Thanks, Vish, those were excellent answers, and appreciated. Other than that import order nit, looks fine to me. Cheers. :)

lp:~termie/nova/libvirt_snapshot updated
896. By justinsb

Mixins for tests confuse pylint no end, and aren't necessary... you can stop the base-class from being run as a test by prefixing the class name with an underscore

897. By justinsb

Assume that if we don't find a VM for an instance in the DB, and the DB state is NOSTATE, that the db instance is in the process of being spawned, and don't mark it SHUTOFF.

Fix for bug#744056

Revision history for this message
Vish Ishaya (vishvananda) wrote :

Looks good now. Just waiting on the import order fix.

review: Approve
lp:~termie/nova/libvirt_snapshot updated
898. By termie

add snapshot support for libvirt

899. By termie

update glance params per review

Revision history for this message
termie (termie) wrote :

import order fixed, looks like the result of too many merges.

Revision history for this message
Jay Pipes (jaypipes) wrote :

cool. lgtm. nice work, Andy.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'nova/virt/libvirt_conn.py'
2--- nova/virt/libvirt_conn.py 2011-03-27 17:15:47 +0000
3+++ nova/virt/libvirt_conn.py 2011-03-28 17:56:24 +0000
4@@ -38,12 +38,15 @@
5
6 import multiprocessing
7 import os
8+import random
9 import shutil
10+import subprocess
11 import sys
12-import random
13-import subprocess
14+import tempfile
15+import time
16 import uuid
17 from xml.dom import minidom
18+from xml.etree import ElementTree
19
20 from eventlet import greenthread
21 from eventlet import tpool
22@@ -111,6 +114,8 @@
23 'Define live migration behavior.')
24 flags.DEFINE_integer('live_migration_bandwidth', 0,
25 'Define live migration behavior')
26+flags.DEFINE_string('qemu_img', 'qemu-img',
27+ 'binary to use for qemu-img commands')
28
29
30 def get_connection(read_only):
31@@ -397,10 +402,67 @@
32
33 @exception.wrap_exception
34 def snapshot(self, instance, image_id):
35- """ Create snapshot from a running VM instance """
36- raise NotImplementedError(
37- _("Instance snapshotting is not supported for libvirt"
38- "at this time"))
39+ """Create snapshot from a running VM instance.
40+
41+ This command only works with qemu 0.14+, the qemu_img flag is
42+ provided so that a locally compiled binary of qemu-img can be used
43+ to support this command.
44+
45+ """
46+ image_service = utils.import_object(FLAGS.image_service)
47+ virt_dom = self._conn.lookupByName(instance['name'])
48+ elevated = context.get_admin_context()
49+
50+ base = image_service.show(elevated, instance['image_id'])
51+
52+ metadata = {'disk_format': base['disk_format'],
53+ 'container_format': base['container_format'],
54+ 'is_public': False,
55+ 'properties': {'architecture': base['architecture'],
56+ 'type': base['type'],
57+ 'name': '%s.%s' % (base['name'], image_id),
58+ 'kernel_id': instance['kernel_id'],
59+ 'image_location': 'snapshot',
60+ 'image_state': 'available',
61+ 'owner_id': instance['project_id'],
62+ 'ramdisk_id': instance['ramdisk_id'],
63+ }
64+ }
65+
66+ # Make the snapshot
67+ snapshot_name = uuid.uuid4().hex
68+ snapshot_xml = """
69+ <domainsnapshot>
70+ <name>%s</name>
71+ </domainsnapshot>
72+ """ % snapshot_name
73+ snapshot_ptr = virt_dom.snapshotCreateXML(snapshot_xml, 0)
74+
75+ # Find the disk
76+ xml_desc = virt_dom.XMLDesc(0)
77+ domain = ElementTree.fromstring(xml_desc)
78+ source = domain.find('devices/disk/source')
79+ disk_path = source.get('file')
80+
81+ # Export the snapshot to a raw image
82+ temp_dir = tempfile.mkdtemp()
83+ out_path = os.path.join(temp_dir, snapshot_name)
84+ qemu_img_cmd = '%s convert -f qcow2 -O raw -s %s %s %s' % (
85+ FLAGS.qemu_img,
86+ snapshot_name,
87+ disk_path,
88+ out_path)
89+ utils.execute(qemu_img_cmd)
90+
91+ # Upload that image to the image service
92+ with open(out_path) as image_file:
93+ image_service.update(elevated,
94+ image_id,
95+ metadata,
96+ image_file)
97+
98+ # Clean up
99+ shutil.rmtree(temp_dir)
100
101 @exception.wrap_exception
102 def reboot(self, instance):