Merge lp:~nttdata/nova/block-migration into lp:~hudson-openstack/nova/trunk

Proposed by Kei Masumoto
Status: Merged
Approved by: Vish Ishaya
Approved revision: 1177
Merged at revision: 1439
Proposed branch: lp:~nttdata/nova/block-migration
Merge into: lp:~hudson-openstack/nova/trunk
Diff against target: 1787 lines (+746/-414)
18 files modified
Authors (+1/-0)
bin/nova-manage (+36/-7)
nova/compute/manager.py (+103/-25)
nova/db/api.py (+0/-21)
nova/db/sqlalchemy/api.py (+0/-39)
nova/db/sqlalchemy/models.py (+8/-8)
nova/exception.py (+9/-0)
nova/scheduler/driver.py (+127/-50)
nova/scheduler/manager.py (+28/-26)
nova/tests/api/openstack/test_extensions.py (+37/-66)
nova/tests/api/openstack/test_limits.py (+64/-114)
nova/tests/api/openstack/test_servers.py (+1/-2)
nova/tests/scheduler/test_scheduler.py (+46/-21)
nova/tests/test_compute.py (+24/-6)
nova/tests/test_libvirt.py (+99/-6)
nova/virt/fake.py (+1/-1)
nova/virt/libvirt/connection.py (+161/-21)
nova/virt/xenapi_conn.py (+1/-1)
To merge this branch: bzr merge lp:~nttdata/nova/block-migration
Reviewer Review Type Date Requested Status
Vish Ishaya (community) Approve
Brian Waldon (community) Approve
Matt Dietz (community) Approve
Review via email: mp+64662@code.launchpad.net

Description of the change

Adding kvm-block-migration feature.

I wrote up a description at the URL below. I hope it helps with the review.
<http://etherpad.openstack.org/kvm-block-migration>

Revision history for this message
Matt Dietz (cerberus) wrote :

I can't vet this branch fully for functionality, but syntactically I found a few things:

some test failures:

======================================================================
ERROR: test_live_migration_works_correctly_no_volume (nova.tests.test_compute.ComputeTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/cerberus/code/python/nova/block-migration/nova/tests/test_compute.py", line 627, in test_live_migration_works_correctly_no_volume
    False)
  File "/Users/cerberus/code/python/nova/block-migration/.nova-venv/lib/python2.6/site-packages/mox.py", line 765, in __call__
    return mock_method(*params, **named_params)
  File "/Users/cerberus/code/python/nova/block-migration/.nova-venv/lib/python2.6/site-packages/mox.py", line 998, in __call__
    self._checker.Check(params, named_params)
  File "/Users/cerberus/code/python/nova/block-migration/.nova-venv/lib/python2.6/site-packages/mox.py", line 913, in Check
    'arguments' % (self._method.__name__, i))
AttributeError: live_migration does not take 5 or more positional arguments
-------------------- >> begin captured logging << --------------------
2011-06-17 13:06:36,744 AUDIT nova.auth.manager [-] Created user fake (admin: False)
2011-06-17 13:06:36,745 DEBUG nova.ldapdriver [-] Local cache hit for __project_to_dn by key pid_dn-fake from (pid=8116) inner /Users/cerberus/code/python/nova/block-migration/nova/auth/ldapdriver.py:153
2011-06-17 13:06:36,745 DEBUG nova.ldapdriver [-] Local cache hit for __dn_to_uid by key dn_uid-uid=fake,ou=Users,dc=example,dc=com from (pid=8116) inner /Users/cerberus/code/python/nova/block-migration/nova/auth/ldapdriver.py:153
2011-06-17 13:06:36,745 AUDIT nova.auth.manager [-] Created project fake with manager fake
--------------------- >> end captured logging << ---------------------

======================================================================
ERROR: test_live_migration_works_correctly_no_volume (nova.tests.test_compute.ComputeTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/cerberus/code/python/nova/block-migration/nova/tests/test_compute.py", line 81, in tearDown
    super(ComputeTestCase, self).tearDown()
  File "/Users/cerberus/code/python/nova/block-migration/nova/test.py", line 94, in tearDown
    self.mox.VerifyAll()
  File "/Users/cerberus/code/python/nova/block-migration/.nova-venv/lib/python2.6/site-packages/mox.py", line 286, in VerifyAll
    mock_obj._Verify()
  File "/Users/cerberus/code/python/nova/block-migration/.nova-venv/lib/python2.6/site-packages/mox.py", line 506, in _Verify
    raise ExpectedMethodCallsError(self._expected_calls_queue)
ExpectedMethodCallsError: Verify: Expected methods never called:
  0. nova.db.instance_get(<nova.context.RequestContext object at 0x10db02ed0>, 1) -> <nova.db.sqlalchemy.models.Instance object at 0x10db021d0>
  1. nova.db.queue_get_for(<nova.context.RequestContext object at 0x10db02ed0>, 'compute', 'dummy') -> 'compute.dummy'
-------------------- >> begin captured logging << --------------------
2011-06-17 13:06:36,744 AUDIT ...

review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hello, thanks for reviewing this!

>
> 81 + block_migration=False, **kwargs):
> 99 + kwargs.get('disk'))
>
> I'd prefer if you just passed in disk as another parameter, here. If there
> was a whole suite of option args, then kwargs would be justifiable, but
> probably not for just one key
Agreed. I've already fixed it. Please check it.
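For illustration, a minimal before/after sketch of that suggestion (the class, method, and parameter names here are placeholders, not the exact ones in the branch):

class ComputeManagerSketch(object):
    # Before: the single optional value travels through **kwargs.
    def pre_live_migration_via_kwargs(self, context, instance_id,
                                      block_migration=False, **kwargs):
        disk = kwargs.get('disk')
        return disk

    # After: "disk" is an explicit keyword parameter, which documents the
    # interface and lets a mistyped argument fail loudly as a TypeError.
    def pre_live_migration(self, context, instance_id,
                           block_migration=False, disk=None):
        return disk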

>
> 192 + # TODO:
>
> Mind attaching a name to the TODO so that person (presumably you) knows
> to follow up on it?
>
Agreed. I've already fixed it. (I added an error-recovery operation and removed the "TODO".)

> 403 + if block_migration:
> 404 + raise
>
> This raise should provide an exception type and message. Additionally,
> is this the exception you hoped to catch in the following except block?
This is our mistake. I've fixed it.

> 1070 + #disk_type = driver_nodes[cnt].get_properties().getContent()
>
> Looks like you meant to remove this line?
>
I rewrote the comment, please check it.

P.S. Regarding the test failures, I confirmed that all tests pass after the code fixes above. If you still see any test case failures, please let me know.

Revision history for this message
Matt Dietz (cerberus) wrote :

I've got several mocked tests failing on my end, but I suspect it's me, as it's not the first branch I've seen this in.

I approve of this syntactically. I'd still prefer several other reviews that can properly vet it for functionality.

review: Approve
Revision history for this message
Kei Masumoto (masumotok) wrote :

Matt, thanks for approval!

Revision history for this message
Dan Prince (dan-prince) wrote :

Marking 'work in progress' to fix conflicts:

Text conflict in nova/scheduler/driver.py
Text conflict in nova/tests/test_libvirt.py

Revision history for this message
Brian Waldon (bcwaldon) wrote :

I can't functionally test this as well as I want to, so I could only review the code and run the unit tests. It looks good to me after these few very minor things are addressed:

390, 1190: misspelled "Destination"

514: You might as well i18n ("_(...)") this string

665-667: This example looks a bit off: missing a comma and duplicate keys memory_mb, local_gb

I'm also having the same test failures as cerberus, even in a clean virtual environment. Can you look into this?

review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hello Brian, thanks for reviewing our branch!

> I'm also having the same test failures as cerberus, even in a clean virtual
> environment. Can you look into this?
Now I can reproduce this error. I always run "run_tests.sh -N", which produces no errors;
plain "run_tests.sh" raises the error that you and cerberus pointed out to me.

I'll fix it soon.

Revision history for this message
Brian Waldon (bcwaldon) wrote :

Thanks, Kei, all the tests pass now. Once the comment on line 666 is fixed (repeated keys), this is good to go.

Revision history for this message
Brian Waldon (bcwaldon) wrote :

It looks good to me, but I want somebody to deploy and run this. I added Vish as a reviewer hoping he can do that for us.

review: Approve
Revision history for this message
Kei Masumoto (masumotok) wrote :

> Review: Approve
> It looks good to me, but I want somebody to deploy and run this. I added Vish
> as a reviewer hoping he can do that for us.

Thank you, Brian! I'm looking forward to his review.

Revision history for this message
Vish Ishaya (vishvananda) wrote :

I ran into a few errors testing this branch:

Trying to migrate from cloudbuilders03 to cloudbuilders02:

First error (is this expected?):

2011-07-18 17:41:51,673 ERROR nova.exception [-] Uncaught exception
(nova.exception): TRACE: Traceback (most recent call last):
(nova.exception): TRACE: File "/root/nova/nova/exception.py", line 97, in wrapped
(nova.exception): TRACE: return f(*args, **kw)
(nova.exception): TRACE: File "/root/nova/nova/compute/manager.py", line 1143, in check_shared_storage_test_file
(nova.exception): TRACE: raise exception.FileNotFound(file_path=tmp_file)
(nova.exception): TRACE: FileNotFound: File /root/nova/nova/..//instances/tmperxzdb could not be found.
(nova.exception): TRACE:
2011-07-18 17:41:51,675 ERROR nova [-] Exception during message handling
(nova): TRACE: Traceback (most recent call last):
(nova): TRACE: File "/root/nova/nova/rpc.py", line 232, in _process_data
(nova): TRACE: rval = node_func(context=ctxt, **node_args)
(nova): TRACE: File "/root/nova/nova/exception.py", line 123, in wrapped
(nova): TRACE: raise Error(str(e))
(nova): TRACE: Error: File /root/nova/nova/..//instances/tmperxzdb could not be found.
(nova): TRACE:

Second error:

Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/eventlet/hubs/poll.py", line 97, in wait
    readers.get(fileno, noop).cb(fileno)
  File "/usr/lib/python2.6/dist-packages/eventlet/greenthread.py", line 192, in main
    result = function(*args, **kwargs)
  File "/root/nova/nova/virt/libvirt/connection.py", line 1563, in _live_migration
    FLAGS.live_migration_bandwidth)
  File "/usr/lib/python2.6/dist-packages/libvirt.py", line 581, in migrateToURI
    if ret == -1: raise libvirtError ('virDomainMigrateToURI() failed', dom=self)
libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+tcp://cloudbuilder02/system
Removing descriptor: 10

Does cloudbuilders03 have to be able to hit qemu on the cloudbuilders02? Is there any way to factor this out? If not, is there any way to determine the ip address without dns lookup?

finally, the rollback failed:

(nova): TRACE: Traceback (most recent call last):
(nova): TRACE: File "/root/nova/nova/rpc.py", line 232, in _process_data
(nova): TRACE: rval = node_func(context=ctxt, **node_args)
(nova): TRACE: TypeError: rollback_live_migration_at_destination() got an unexpected keyword argument 'context'
(nova): TRACE:

looks like this line:

262 + def rollback_live_migration_at_destination(self, ctxt, instance_id):

needs to use context instead of ctxt; rpc passes arguments as kwargs, so the parameter name can't be changed.
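A minimal sketch of why the name matters, mirroring the dispatch visible in the tracebacks above (the class and method bodies here are placeholders, not the actual Nova code):

class ManagerSketch(object):
    # Broken: rpc invokes handlers as method(context=ctxt, **args), so a
    # parameter named "ctxt" raises "unexpected keyword argument 'context'".
    def rollback_live_migration_at_destination_broken(self, ctxt, instance_id):
        pass

    # Fixed: the first argument is named "context", matching the kwarg
    # used by the rpc layer.
    def rollback_live_migration_at_destination(self, context, instance_id):
        pass


def dispatch(manager, ctxt, method, node_args):
    # Equivalent to nova/rpc.py: rval = node_func(context=ctxt, **node_args)
    node_func = getattr(manager, method)
    return node_func(context=ctxt, **node_args)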

review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Thanks for reviewing, Vish!
Regarding the second comment, I need some advice from you. Please see below.

> First error (is this expected?):
(snip)
>
> 2011-07-18 17:41:51,673 ERROR nova.exception [-] Uncaught exception
> (nova.exception): TRACE: Traceback (most recent call last):
> (nova.exception): TRACE: File "/root/nova/nova/exception.py", line 97, in wrapped
> (nova.exception): TRACE: return f(*args, **kw)
> (nova.exception): TRACE: File "/root/nova/nova/compute/manager.py", line 1143, in check_shared_storage_test_file
> (nova.exception): TRACE: raise exception.FileNotFound(file_path=tmp_file)
> (nova.exception): TRACE: FileNotFound: File /root/nova/nova/..//instances/tmperxzdb could not be found.
> (nova.exception): TRACE:

In the case of block migration, the above is an expected error. The check verifies that FLAGS.instances_path is not mounted on shared storage. But this message invites misunderstanding, so I'm going to add a note such as "this can be ignored".

> Second error:
>
> (snip)
> Does cloudbuilders03 have to be able to hit qemu on the cloudbuilders02? Is there any way to factor this out? If not, is there any way to determine the ip address without dns lookup?

Hmm.. libvirtd running at cloudbuilders03 has to be network-reachable to libvirtd running on cloudbuilders02, AFAIK. Also, if dns lookup is unprefered, I propose to write some entry to /etc/hosts.

Users start live migration with something like "nova-manage vm live_migration (or block_migration) i-00000001 hostname". Using the hostname, nova can check that the given hostname is valid, because nova-compute registers its hostname in the Service table. Otherwise, if a user specifies a wrong hostname, nova-compute only notices when it tries to start the live migration (i.e. the source libvirtd connects to the destination libvirtd and has to wait for a timeout). What do you think? Do we need to add a new flag to skip the hostname check?
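A rough sketch of the check being described (the helper name and the lookup callable are illustrative, not the actual scheduler code): because nova-compute registers its hostname in the Service table, an unknown destination can be rejected up front instead of waiting for the source libvirtd to time out.

def validate_migration_destination(context, dest, compute_service_lookup):
    """compute_service_lookup is assumed to query the services table and
    return the compute services registered under the given hostname."""
    if not compute_service_lookup(context, dest):
        raise ValueError('%s is not a registered nova-compute host' % dest)
    return dest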

> finally, the rollback failed:
(snip)
>
> 262 + def rollback_live_migration_at_destination(self, ctxt, instance_id):
>
> needs to use context instead of ctxt; rpc passes arguments as kwargs, so the parameter name can't be changed.

That is our mistake. I'll fix it soon..

Revision history for this message
Dave Walker (davewalker) wrote :

On 19/07/11 05:54, Kei Masumoto wrote:

<SNIP>
>> 2011-07-18 17:41:51,673 ERROR nova.exception [-] Uncaught exception
>> (nova.exception): TRACE: Traceback (most recent call last):
>> (nova.exception): TRACE: File "/root/nova/nova/exception.py", line 97, in wrapped
>> (nova.exception): TRACE: return f(*args, **kw)
>> (nova.exception): TRACE: File "/root/nova/nova/compute/manager.py", line 1143, in check_shared_storage_test_file
>> (nova.exception): TRACE: raise exception.FileNotFound(file_path=tmp_file)
>> (nova.exception): TRACE: FileNotFound: File /root/nova/nova/..//instances/tmperxzdb could not be found.
>> (nova.exception): TRACE:
> In the case of block migration, the above is an expected error. The check verifies that FLAGS.instances_path is not mounted on shared storage. But this message invites misunderstanding, so I'm going to add a note such as "this can be ignored".
>
<SNIP>

I understand this is to check that nova is attempting a live migration
without shared storage, when the instance is in fact on shared storage?

If so, surely this should just be caught, and turned into a LOG.debug..
rather than a "this can be ignored"?

Thanks.

Revision history for this message
Kei Masumoto (masumotok) wrote :

> I understand this is to check that nova is attempting a live migration without shared storage, when the instance is in fact on shared storage?
>
> If so, surely this should just be caught, and turned into a LOG.debug..
> rather than a "this can be ignored"?

There are two patterns. One is normal live migration, where nova expects FLAGS.instances_path to be on shared storage. The other is block migration, where nova expects FLAGS.instances_path *NOT* to be on shared storage. The exception is caught, but for exceptions raised on the compute node (not the scheduler), the exception message cannot be suppressed. That is what I am thinking about.

Revision history for this message
Vish Ishaya (vishvananda) wrote :

On Jul 19, 2011, at 3:37 AM, Kei Masumoto wrote:
>
> There are two patterns. One is normal live migration, where nova expects FLAGS.instances_path to be on shared storage. The other is block migration, where nova expects FLAGS.instances_path *NOT* to be on shared storage. The exception is caught, but for exceptions raised on the compute node (not the scheduler), the exception message cannot be suppressed. That is what I am thinking about.

Why not modify the method to return different values based on whether the exception is hit, rather than passing the exception back to the caller?
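Something along those lines could look like the sketch below (assuming the same temp-file approach as check_shared_storage_test_file in the tracebacks above; the function name and path handling are illustrative):

import os

def instances_path_is_shared(instances_path, filename):
    """Return True if the temp file written by the source host is visible
    under instances_path on this host (i.e. the directory is on shared
    storage), instead of raising FileNotFound when it is not."""
    return os.path.exists(os.path.join(instances_path, filename))

The scheduler could then treat False as the expected answer for block migration and True as the expected answer for normal live migration, without a traceback in either case.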

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi,

>> There are two patterns. One is normal live migration, where nova expects FLAGS.instances_path to be on shared storage. The other is block migration, where nova expects FLAGS.instances_path *NOT* to be on shared storage. The exception is caught, but for exceptions raised on the compute node (not the scheduler), the exception message cannot be suppressed. That is what I am thinking about.
>
> Why not modify the method to return different values based on whether the exception is hit, rather than passing the exception back to the caller?

Please check the updated version (updated 12 hours ago). I have already fixed this the way you suggested. I should have mentioned it before changing the status from "work in progress" to "needs review".

Revision history for this message
Vish Ishaya (vishvananda) wrote :

Still having some trouble with this. Setup was a little tough; I had to
1) add /etc/hosts entries for the other hosts
2) modify /etc/libvirt/libvirt.conf to turn off tls, turn on tcp, and turn off authentication
3) modify /etc/init/libvirt-bin.conf to add the -l parameter
4) stop and start libvirt-bin (restart doesn't add the extra param)

Now I'm getting a new error:

2011-07-20 15:37:57,062 DEBUG nova.rpc [-] Making asynchronous cast on compute.cloudbuilder03... from (pid=20518) cast /root/nova/nova/rpc.py:554
Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/eventlet/hubs/poll.py", line 97, in wait
    readers.get(fileno, noop).cb(fileno)
  File "/usr/lib/python2.6/dist-packages/eventlet/greenthread.py", line 192, in main
    result = function(*args, **kwargs)
  File "/root/nova/nova/virt/libvirt/connection.py", line 1563, in _live_migration
    FLAGS.live_migration_bandwidth)
  File "/usr/lib/python2.6/dist-packages/libvirt.py", line 581, in migrateToURI
    if ret == -1: raise libvirtError ('virDomainMigrateToURI() failed', dom=self)
libvirtError: Unknown failure
Removing descriptor: 12

Any way to find out what might have failed? Do I need a more up-to-date version of kvm?

Vish

Revision history for this message
Vish Ishaya (vishvananda) wrote :

On a positive note, it recovered from the failure OK; the VM just didn't migrate.

Revision history for this message
Vish Ishaya (vishvananda) wrote :

libvirt logs from cloudbuilder02:

16:02:49.825: 21622: info : virDomainLoadAllConfigs:8190 : Scanning for configs in /etc/libvirt/uml
16:03:16.783: 21629: error : qemuMonitorTextMigrate:1260 : operation failed: migration to 'tcp:cloudbuilder03.Marriot.com:49152' failed: migration failed

16:03:17.101: 21629: error : remoteIO:10609 : Unknown failure

and from cloudbuilder03:

16:02:18.197: 19115: info : umlStartup:427 : Adding inotify watch on /var/run/libvirt/uml-guest
16:02:18.197: 19115: info : virDomainLoadAllConfigs:8190 : Scanning for configs in /etc/libvirt/uml
16:03:15.632: 19118: info : virSecurityDACSetOwnership:99 : Setting DAC user and group on '/root/nova/nova/..//instances/instance-00000001/disk' to '103:110'
16:03:15.632: 19118: info : virSecurityDACSetOwnership:99 : Setting DAC user and group on '/root/nova/nova/..//instances/instance-00000001/console.log' to '103:110'
16:03:15.632: 19118: info : virSecurityDACSetOwnership:99 : Setting DAC user and group on '/root/nova/nova/..//instances/instance-00000001/kernel' to '103:110'
16:03:15.632: 19118: info : virSecurityDACSetOwnership:99 : Setting DAC user and group on '/root/nova/nova/..//instances/instance-00000001/ramdisk' to '103:110'
16:03:16.986: 19121: info : virSecurityDACRestoreSecurityFileLabel:139 : Restoring DAC user and group on '/root/nova/nova/..//instances/instance-00000001/disk'
16:03:16.986: 19121: info : virSecurityDACSetOwnership:99 : Setting DAC user and group on '/root/nova/nova/..//instances/instance-00000001/disk' to '0:0'
16:03:16.986: 19121: info : virSecurityDACRestoreSecurityFileLabel:139 : Restoring DAC user and group on '/root/nova/nova/..//instances/instance-00000001/console.log'
16:03:16.986: 19121: info : virSecurityDACSetOwnership:99 : Setting DAC user and group on '/root/nova/nova/..//instances/instance-00000001/console.log' to '0:0'
16:03:16.986: 19121: info : virSecurityDACRestoreSecurityFileLabel:139 : Restoring DAC user and group on '/root/nova/nova/..//instances/instance-00000001/kernel'
16:03:16.987: 19121: info : virSecurityDACSetOwnership:99 : Setting DAC user and group on '/root/nova/nova/..//instances/instance-00000001/kernel' to '0:0'
16:03:16.987: 19121: info : virSecurityDACRestoreSecurityFileLabel:139 : Restoring DAC user and group on '/root/nova/nova/..//instances/instance-00000001/ramdisk'
16:03:16.987: 19121: info : virSecurityDACSetOwnership:99 : Setting DAC user and group on '/root/nova/nova/..//instances/instance-00000001/ramdisk' to '0:0'

Don't know if that helps at all

Revision history for this message
Vish Ishaya (vishvananda) wrote :

I think I found the issue by turning on debug:

16:34:40.921: 20692: debug : virNWFilterLookupByName:11602 : conn=0xf51870, name=nova-instance-instance-00000001-secgroup
16:34:40.921: 20691: debug : virEventMakePollFDs:361 : Prepare n=5 w=6, f=9 e=25
16:34:40.921: 20691: debug : virEventMakePollFDs:361 : Prepare n=6 w=7, f=8 e=25
16:34:40.921: 20691: debug : virEventMakePollFDs:361 : Prepare n=7 w=8, f=7 e=25
16:34:40.921: 20691: debug : virEventMakePollFDs:361 : Prepare n=8 w=9, f=15 e=1
16:34:40.921: 20691: debug : virEventCalculateTimeout:302 : Calculate expiry of 2 timers
16:34:40.921: 20691: debug : virEventCalculateTimeout:332 : Timeout at 0 due in -1 ms
16:34:40.921: 20692: debug : nwfilterLookupByName:257 : Network filter not found: no nwfilter with matching name 'nova-instance-instance-00000001-secgroup'
16:34:40.921: 20691: debug : virEventRunOnce:583 : Poll on 9 handles 0x7f7334081440 timeout -1
16:34:40.921: 20692: debug : remoteSerializeError:146 : prog=536903814 ver=1 proc=175 type=1 serial=15, msg=Network filter not found: no nwfilter with matching name 'nova-instance-instance-00000001-secgroup'

Looks like security groups aren't made on the new host before migration...

Revision history for this message
Kei Masumoto (masumotok) wrote :

Thanks for trying, Vish! Please give us some time to reproduce this, since it's the first time we've seen it.
We expect the secgroup to be prepared before migration under any condition, but the log you sent shows that setting up the secgroup failed. Let me check it.

Revision history for this message
Vish Ishaya (vishvananda) wrote :

Just FYI. I migrated once and it failed to connect. Then I fixed the libvirt
listen and tried to migrate again. If you can't reproduce I will try again
tomorrow with a new instance.
On Jul 20, 2011 5:16 PM, "Kei Masumoto" <email address hidden> wrote:
> Thanks for trying, Vish! Please give us some time to reproduce this, since
> it's the first time we've seen it.
> We expect the secgroup to be prepared before migration under any condition,
> but the log you sent shows that setting up the secgroup failed. Let me check it.
>
>
>

Revision history for this message
Kei Masumoto (masumotok) wrote :

Thank you for your reply! In our development environment, if migration fails over and over for some reason (we go through a lot of trial and error while coding and testing, don't we?), my colleague and I have sometimes run into situations where restarting libvirtd or the OS was necessary. But once we finished coding and testing, it stopped happening. Anyway, my colleague and I are going to try to reproduce this and will report back to you.

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Vish,

We are still trying to reproduce this... let me confirm your environment. From the log you sent, I assume:
1) --firewall_driver=nova.virt.libvirt.firewall.NWFilterFirewall
2) ami/ari/aki are separate images.
3) ami is qcow2
Is that correct? These three points may or may not matter; I just want to make sure we match your setup when reproducing.

Thanks!

Revision history for this message
Vish Ishaya (vishvananda) wrote :

Below,

On Jul 21, 2011, at 6:46 AM, Kei Masumoto wrote:

> Hi Vish,
>
> We are still trying to reproduce this... let me confirm your environment. From the log you sent, I assume:
> 1) --firewall_driver=nova.virt.libvirt.firewall.NWFilterFirewall

No, using the default Iptables driver
> 2) ami/ari/aki are separate images.

correct

> 3) ami is qcow2

the ami is actually raw, but I have --use_cow_images on (which is the default). That means that the image is actually a copy-on-write snapshot that uses the raw base image in instances/_base

> Is that correct? These three points may or may not matter; I just want to make sure we match your setup when reproducing.

Also, I tried with a new instance just in case, and I'm seeing the same issue.

Vish

>
> Thanks!
>

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi, thanks for your reply! Today I managed to reproduce a failure, but I am not entirely sure it is the same one. Please see below.

In today's trial, we used the same kind of image (separate ami/ari/aki, raw image, and use_cow_images=True), and block migration finished successfully.
I also gave a lot of thought to the logs you sent me. Here is my analysis.

>16:34:40.921: 20692: debug : remoteSerializeError:146 : prog=536903814 ver=1 proc=175 type=1 serial=15,
> msg=Network filter not found: no nwfilter with matching name 'nova-instance-instance-00000001-secgroup'

The reason for this log is that unfilter_instance() in the iptables driver is called twice. When live migration finishes, unfilter_instance() is called on the source compute node to remove the filtering rules. In addition, in the case of block migration, the VM image on the source compute node needs to be erased. Since I didn't want to define a new method just for that purpose, I reuse destroy() in the libvirt driver, and destroy() calls unfilter_instance() again. That is when the log above appears.
Although I may have to look for better handling, this is not the cause of the block migration failure.
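A sketch of the call flow described above (the method names follow the thread; the bodies are placeholders, not the actual libvirt driver code):

class SourceHostCleanupSketch(object):
    def post_live_migration(self, instance, block_migration=False):
        self.unfilter_instance(instance)    # normal post-migration cleanup
        if block_migration:
            # Reuse destroy() to delete the local disks on the source host...
            self.destroy(instance)

    def destroy(self, instance):
        # ...but destroy() removes the (already removed) filter rules again,
        # which is where the harmless "no nwfilter with matching name"
        # debug message comes from.
        self.unfilter_instance(instance)
        self.delete_instance_files(instance)

    def unfilter_instance(self, instance):
        pass

    def delete_instance_files(self, instance):
        pass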

One thing I would like to mention is that I did manage to reproduce a block migration failure. In that case, I used a different libvirt-bin package on the source compute node, the same version but rebuilt by me, and I saw (probably) exactly the same failure log you sent. Once I ran "apt-get install libvirt-bin --reinstall", everything was fine. I ignored some test failures while rebuilding the package, which may be the reason :). From this experience, I am wondering whether you are using your own custom libvirt-bin package. Any comment on this? For reference, below is the md5sum in my environment.

> root@compute01:/opt/openstack/dev/nova/block-migration-rev1172 # md5sum /usr/sbin/libvirtd
> c0cc0abbb9e93c6152fef33bcee17255 /usr/sbin/libvirtd

BTW, I would appreciate it if you could show me the nova.conf on cloudbuilder02 and cloudbuilder03. It would be very useful for further analysis.

Thanks in advance,
Kei

Revision history for this message
Vish Ishaya (vishvananda) wrote :

Hope this helps:

root@cloudbuilder02:~# cat nova/bin/nova.conf
--verbose
--nodaemon
--dhcpbridge_flagfile=/root/nova/bin/nova.conf
--network_manager=nova.network.manager.FlatDHCPManager
--my_ip=192.168.1.12
--public_interface=eth0
--vlan_interface=eth0
--sql_connection=mysql://root:nova@192.168.1.12/nova
--auth_driver=nova.auth.dbdriver.DbDriver
--libvirt_type=qemu
--fixed_range=10.0.0.0/24
--glance_api_servers=192.168.1.12:9292
--rabbit_host=192.168.1.12
--flat_interface=eth0
root@cloudbuilder02:~# dpkg -l | grep libvirt
ii libvirt-bin 0.8.8-1ubuntu3~ppamaverick1 the programs for the libvirt library
ii libvirt0 0.8.8-1ubuntu3~ppamaverick1 library for interfacing with different virtualization systems
ii python-libvirt 0.8.8-1ubuntu3~ppamaverick1 libvirt Python bindings

Revision history for this message
Vish Ishaya (vishvananda) wrote :

different version of libvirt for sure:

root@cloudbuilder02:~# md5sum /usr/sbin/libvirtd
69dfa442a2b978b493b555d7dfc88ae0 /usr/sbin/libvirtd

Revision history for this message
Kei Masumoto (masumotok) wrote :

Thanks. Now I have some extra hints. My environment is different from yours; here is ours. I'll report back to you soon.

> root@compute01:/root# dpkg -l | grep libvirt
> ii libvirt-bin 0.8.8-1ubuntu6.2 the programs for the libvirt library
> ii libvirt0 0.8.8-1ubuntu6.2 library for interfacing with different virtualization systems
> ii python-libvirt 0.8.8-1ubuntu6 libvirt Python bindings

> root@compute01:/root# uname -a
> Linux cb-blsv4b1 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:24 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
(meaning Lucid, not Maverick)

> root@compute01:/root# dpkg -l | grep kvm
> ii kvm 1:84+dfsg-0ubuntu16+0.14.0+noroms+0ubuntu4 dummy transitional package from kvm to qemu-kvm
> ii kvm-pxe 5.4.4-7ubuntu2 PXE ROM's for KVM
> ii qemu-kvm 0.14.0+noroms-0ubuntu4 Full virtualization on i386 and amd64 hardware

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Vish, let me ask one additional question: are you running KVM on VMware (or, more generally, KVM on some other hypervisor)?
Today we had a lot of network problems on our side, so further analysis hasn't been done yet (even installing Maverick took a long time...). I'm trying to check in a Maverick environment, but if you are running KVM nested like that, I think it is also worth checking.

Thanks in advance,
Kei

Revision history for this message
Vish Ishaya (vishvananda) wrote :

No, kvm is running on bare metal. I have it set to use software (qemu) mode though. I have also tried with kvm mode on and I get the same error.

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hello Vish,

I installed Maverick and tried to reproduce the issue today. Unfortunately, I still cannot reproduce the error, but it clearly looks like a KVM/libvirt-related error. Under that assumption, I would like to share our libvirtd.conf once more so we can check for any additional config differences. I have also attached version info for reference.

--- [ libvirtd.conf ]---
(p.s. comments are removed.)

listen_tls = 0
listen_tcp = 1
unix_sock_group = "libvirtd"
unix_sock_rw_perms = "0770"
auth_unix_ro = "none"
auth_unix_rw = "none"
auth_tcp = "none"

--- [ versions ] ---
ii libvirt-bin 0.8.8-1ubuntu3~ppamaverick1 the programs for the libvirt library
ii libvirt0 0.8.8-1ubuntu3~ppamaverick1 library for interfacing with different virtualization systems
ii python-libvirt 0.8.8-1ubuntu3~ppamaverick1 libvirt Python bindings
ii kvm 1:84+dfsg-0ubuntu16+0.14.0~rc1+noroms+0ubuntu4~ppamaverick1 dummy transitional package from kvm to qemu-kvm
ii qemu-common 0.14.0~rc1+noroms-0ubuntu4~ppamaverick1 qemu common functionality (bios, documentation, etc)
ii qemu-kvm 0.14.0~rc1+noroms-0ubuntu4~ppamaverick1 Full virtualization on i386 and amd64 hardware

Thanks in advance!
Kei

Revision history for this message
Vish Ishaya (vishvananda) wrote :

So I'm getting a few other errors when trying to reproduce with your branch. I think I might have isolated the difference. It looks like you are using vlan mode and I am using flat dhcp.

There are some issues with using flatdhcp in your branch that have been fixed in trunk. Can you merge with trunk so I can test again?

Thanks,

Vish

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi!
I tried "bzr merge lp:nova" and ran block migration again. Block migration itself is OK, but I got the error below. I'll let you know once I fix it and try again.

2011-07-27 17:11:10,773 ERROR nova [-] Exception during message handling
(nova): TRACE: Traceback (most recent call last):
(nova): TRACE: File "/opt/openstack/dev/block-migration-rev1173-merge/nova/rpc.py", line 232, in _process_data
(nova): TRACE: rval = node_func(context=ctxt, **node_args)
(nova): TRACE: File "/opt/openstack/dev/block-migration-rev1173-merge/nova/compute/manager.py", line 1414, in post_live_migration_at_destination
(nova): TRACE: block_migration)
(nova): TRACE: File "/opt/openstack/dev/block-migration-rev1173-merge/nova/virt/libvirt/connection.py", line 1643, in post_live_migration_at_destination
(nova): TRACE: xml = self.to_xml(instance_ref)
(nova): TRACE: File "/opt/openstack/dev/block-migration-rev1173-merge/nova/virt/libvirt/connection.py", line 1040, in to_xml
(nova): TRACE: block_device_mapping)
(nova): TRACE: File "/opt/openstack/dev/block-migration-rev1173-merge/nova/virt/libvirt/connection.py", line 984, in _prepare_xml_info
(nova): TRACE: nics.append(self.vif_driver.plug(instance, network, mapping))
(nova): TRACE: File "/opt/openstack/dev/block-migration-rev1173-merge/nova/virt/libvirt/vif.py", line 90, in plug
(nova): TRACE: return self._get_configurations(network, mapping)
(nova): TRACE: File "/opt/openstack/dev/block-migration-rev1173-merge/nova/virt/libvirt/vif.py", line 65, in _get_configurations
(nova): TRACE: 'dhcp_server': mapping['dhcp_server'],
(nova): TRACE: KeyError: 'dhcp_server'
(nova): TRACE:
2011-07-27 17:11:10,774 ERROR nova.rpc [-] Returning exception 'dhcp_server' to caller

Thanks,
Kei

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi!
I merged recent trunk and confirmed that no error occurs. Could you please check it?

One additional note: the following are possible reasons why block migration fails.
1) libvirt/qemu configuration
The source and destination hosts have to have exactly the same libvirt.conf and qemu.conf.
For example, if the source host has "user=root" and "group=root" but the destination host does not, block migration fails.

2) Name resolution
The source and destination hosts must be able to resolve each other's names (e.g. via entries in /etc/hosts); a quick connectivity check is sketched after this list.
For example, if the destination host uses DNS and the DNS entry is wrong, or the DNS entry and the hostname differ, block migration fails.
Apart from nova, even if we run "virsh migrate --live --copy-storage-inc instance_id dest"
with dest specified by IP address, such as "qemu+tcp://192.168.0.1/system", libvirt still exchanges the hosts' names.

3) Install dir
Nova has to be installed in exactly the same directory on the source and destination hosts.

4) Permissions
Can libvirt access FLAGS.instances_path?
"/etc/init.d/apparmor teardown" may help.
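For point 2), a quick check that the other host's name resolves and its libvirtd is reachable over tcp can be scripted from each host roughly like this (assumes the python-libvirt bindings are installed; the hostname is only an example):

import libvirt

def libvirtd_reachable(dest):
    """Try opening the same qemu+tcp URI that live/block migration uses."""
    try:
        conn = libvirt.open('qemu+tcp://%s/system' % dest)
        conn.close()
        return True
    except libvirt.libvirtError:
        return False


if __name__ == '__main__':
    print(libvirtd_reachable('cloudbuilder02'))  # example hostname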

Thanks,
Kei

Revision history for this message
Vish Ishaya (vishvananda) wrote :

Still no luck with the new code. Are you using vlan mode? I'm still getting the migrateToURI failure.

Revision history for this message
Kei Masumoto (masumotok) wrote :

Thanks for checking our branch. I should have mentioned this earlier: my colleagues and I have already tried FlatDHCPManager and could not reproduce the issue. However, since I may have made some mistakes, I'll send you a how-to doc that explains our environment. Could you please check it?

Revision history for this message
Vish Ishaya (vishvananda) wrote :

Kei came by and got this working. Turns out there was an issue with images smaller than one GB that he fixed.

review: Approve
Revision history for this message
Kei Masumoto (masumotok) wrote :

Thanks for review, Vish!

Revision history for this message
madhavi (prasanna-desaraju) wrote :

Hi Kei,

I am facing an issue while doing block migration. Can you please help me out with this?

Instance files are being destroyed.

Please check the log below.

ed': False} from (pid=15050) _unpack_context /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:646
2011-11-24 21:38:09,155 INFO nova.compute.manager [-] #################### in pre_live_migration
2011-11-24 21:38:09,252 INFO nova.compute.manager [-] instance-00000002 has no volume.
2011-11-24 21:38:09,253 INFO nova.compute.manager [-] @@@@@@@@@@@@@@@@@before driver.pre_live_migration instance_id 2
2011-11-24 21:38:09,253 INFO nova.compute.manager [-] @@@@@@@@@@@@@@@@@ after driver.pre_live_migration instance_id 2
2011-11-24 21:38:09,254 DEBUG nova.rpc [-] Making asynchronous call on network ... from (pid=15050) multicall /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:721
2011-11-24 21:38:09,254 DEBUG nova.rpc [-] MSG_ID is fba4149739ec484682ccece37cc0bcbb from (pid=15050) multicall /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:724
2011-11-24 21:38:09,527 INFO nova.compute.manager [-] @@@@@@@@@@@@@@@@@ after nstance nw_info 2 nwinfo [[{u'bridge': u'br100', u'multi_host': True, u'bridge_interface': u'eth0', u'vlan': 100, u'id': 1, u'injected': False, u'cidr': u'10.0.0.0/24', u'cidr_v6': None}, {u'should_create_bridge': True, u'dns': [], u'vif_uuid': u'93f0e47c-d3b5-4614-a760-34145e67317c', u'label': u'private', u'broadcast': u'10.0.0.255', u'ips': [{u'ip': u'10.0.0.5', u'netmask': u'255.255.255.0', u'enabled': u'1'}], u'mac': u'02:16:3e:00:70:4a', u'rxtx_cap': 0, u'should_create_vlan': False, u'dhcp_server': u'10.0.0.8', u'gateway': u'10.0.0.1'}]]
2011-11-24 21:38:09,528 INFO nova.virt.libvirt_conn [-] ######### endure_filtering_rules_for_instance
2011-11-24 21:38:09,528 INFO nova [-] called setup_basic_filtering in nwfilter
2011-11-24 21:38:09,529 INFO nova [-] ensuring static filters
2011-11-24 21:38:10,059 DEBUG nova.virt.libvirt.firewall [-] iptables firewall: Setup Basic Filtering from (pid=15050) setup_basic_filtering /usr/lib/python2.7/dist-packages/nova/virt/libvirt/firewall.py:525
2011-11-24 21:38:10,060 DEBUG nova.utils [-] Attempting to grab semaphore "iptables" for method "_do_refresh_provider_fw_rules"... from (pid=15050) inner /usr/lib/python2.7/dist-packages/nova/utils.py:721
2011-11-24 21:38:10,060 DEBUG nova.utils [-] Attempting to grab file lock "iptables" for method "_do_refresh_provider_fw_rules"... from (pid=15050) inner /usr/lib/python2.7/dist-packages/nova/utils.py:726
2011-11-24 21:38:10,067 DEBUG nova.utils [-] Attempting to grab semaphore "iptables" for method "apply"... from (pid=15050) inner /usr/lib/python2.7/dist-packages/nova/utils.py:721
2011-11-24 21:38:10,067 DEBUG nova.utils [-] Attempting to grab file lock "iptables" for method "apply"... from (pid=15050) inner /usr/lib/python2.7/dist-packages/nova/utils.py:726
2011-11-24 21:38:10,068 DEBUG nova.utils [-] Running cmd (subprocess): sudo iptables-save -t filter from (pid=15050) execute /usr/lib/python2.7/dist-packages/nova/utils.py:169
2011-11-24 21:38:10,090 DEBUG nova.utils [-] Running cmd (subprocess): sudo iptables-restore from (pid=15050) execute /usr/lib/python2.7/dist-packa...

Revision history for this message
Kei Masumoto (masumotok) wrote :

I expect there must be some exception (stack trace) somewhere; I recommend finding it. As far as I remember, any exception is logged to the console/logfile.
I don't know which revision you are using, but exceptions sometimes get suppressed unintentionally in trunk nova. Kei

Revision history for this message
madhavi (prasanna-desaraju) wrote :

Hi Kei,

Thanks a lot for your reply.

This is the environment we are using.

root@Compute104:/usr/share/pyshared/nova# dpkg -l | grep libvirt
ii libvirt-bin 0.8.8-1ubuntu6.5 the programs for the libvirt library
ii libvirt0 0.8.8-1ubuntu6.5 library for interfacing with different virtualization systems
ii python-libvirt 0.8.8-1ubuntu6.5 libvirt Python bindings

root@Compute104:/usr/share/pyshared/nova# uname -a
Linux Compute104 2.6.38-12-server #51-Ubuntu SMP Wed Sep 28 16:07:08 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

root@Compute104:/usr/share/pyshared/nova# dpkg -l | grep kvm
ii nova-compute-kvm 2012.1~e2~20111116.11495-0ubuntu0ppa1~natty1 OpenStack Compute - compute node (KVM)
ii qemu-kvm 0.14.0+noroms-0ubuntu4.4 Full virtualization on i386 and amd64 hardware

Apart from this, what else do I need to focus on? I have followed all the changes you mentioned before but am still facing the same issue.

When I check the destination compute log, it says the instance files were deleted successfully.

Please help me out with this.

Thanks in advance

Revision history for this message
madhavi (prasanna-desaraju) wrote :

Actually, we are working on a multi-node setup with one controller node and two compute nodes pointing to the same controller. The nova.conf file on each node is as follows:

--dhcpbridge_flagfile=/etc/nova/nova.conf
--dhcpbridge=/usr/bin/nova-dhcpbridge
--logdir=/var/log/nova
--state_path=/var/lib/nova
--lock_path=/var/lock/nova
--use_deprecated_auth
--verbose
--flagfile=/etc/nova/nova-compute.conf
--sql_connection=mysql://novadbuser:novaDBsekret@10.233.52.102/nova
--network_manager=nova.network.manager.FlatDHCPManager
--auth_driver=nova.auth.dbdriver.DbDriver
--libvirt_type=qemu
--flat_network_bridge=br100
--flat_interface=eth0
--vlan_interface=eth0
--public_interface=eth0
--vncproxy_url=http://10.233.52.102:6080
--daemonize=1
--rabbit_host=10.233.52.102
--cc_host=10.233.52.102
--osapi_host=10.233.52.102
--ec2_host=10.233.52.102
--image_service=nova.image.glance.GlanceImageService
--glance_api_servers=10.233.52.102:9292
--use_syslog

Revision history for this message
madhavi (prasanna-desaraju) wrote :

Hi Kei,

Please tell me the configuration details needed for block migration. Please give me the details of:

1. nova.conf at compute and controller ends
2. libvirt.conf
3. qemu.conf

Revision history for this message
madhavi (prasanna-desaraju) wrote :

Hi Vish,

Can you please provide me the configuration details for block migration? I have mentioned my configuration details above; I am not sure where I am going wrong.

Can you please help me out with this?

The log shows an error while adding the security group for the instance. I got the same error that you got:

nwfilter(nova-instance-instance-00000002-secgroup) for instance-00000002 is not found. from (pid=676) unfilter_instance /usr/lib/python2.7/dist-packages/nova/virt/libvirt/firewall.py:320
2011-11-23 19:45:00,695 INFO nova.virt.libvirt_conn [-] instance instance-00000002: deleting instance files /var/lib/nova/instances/instance-00000002

Vish, can you please help me with this?

Revision history for this message
Vish Ishaya (vishvananda) wrote :

That security group error always happens on delete, it isn't an issue. The migration is failing for some other reason. You should verify that you can connect to libvirt remotely from each host to the other.

On Nov 28, 2011, at 3:40 AM, madhavi wrote:

> Hi Vish,
>
> Can you please provide me the configuration details for block migration? I have mentioned my configuration details above; I am not sure where I am going wrong.
>
> Can you please help me out with this?
>
> The log shows an error while adding the security group for the instance. I got the same error that you got:
>
> nwfilter(nova-instance-instance-00000002-secgroup) for instance-00000002 is not found. from (pid=676) unfilter_instance /usr/lib/python2.7/dist-packages/nova/virt/libvirt/firewall.py:320
> 2011-11-23 19:45:00,695 INFO nova.virt.libvirt_conn [-] instance instance-00000002: deleting instance files /var/lib/nova/instances/instance-00000002
>
> Vish, can you please help me with this?

Revision history for this message
Kei Masumoto (masumotok) wrote :

> Please tell me the configuration details
<http://docs.openstack.org/diablo/openstack-compute/admin/content/configuring-live-migrations.html>
The libvirt config is the same as for normal live migration. Also, make sure "user=root" is set in qemu.conf.

Revision history for this message
madhavi (prasanna-desaraju) wrote :

root@Compute104:/usr/share/pyshared/nova# dpkg -l | grep libvirt
ii libvirt-bin 0.8.8-1ubuntu6.5 the programs for the libvirt library
ii libvirt0 0.8.8-1ubuntu6.5 library for interfacing with different virtualization systems
ii python-libvirt 0.8.8-1ubuntu6.5 libvirt Python bindings

root@Compute104:/usr/share/pyshared/nova# uname -a
Linux Compute104 2.6.38-12-server #51-Ubuntu SMP Wed Sep 28 16:07:08 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

root@Compute104:/usr/share/pyshared/nova# dpkg -l | grep kvm
ii nova-compute-kvm 2012.1~e2~20111116.11495-0ubuntu0ppa1~natty1 OpenStack Compute - compute node (KVM)
ii qemu-kvm 0.14.0+noroms-0ubuntu4.4 Full virtualization on i386 and amd64 hardware

Revision history for this message
madhavi (prasanna-desaraju) wrote :

Actually, we are working on a multi-node setup with one controller node and two compute nodes pointing to the same controller. The nova.conf file on each node is as follows:

--dhcpbridge_flagfile=/etc/nova/nova.conf
--dhcpbridge=/usr/bin/nova-dhcpbridge
--logdir=/var/log/nova
--state_path=/var/lib/nova
--lock_path=/var/lock/nova
--use_deprecated_auth
--verbose
--flagfile=/etc/nova/nova-compute.conf
--sql_connection=mysql://novadbuser:novaDBsekret@10.233.52.102/nova
--network_manager=nova.network.manager.FlatDHCPManager
--auth_driver=nova.auth.dbdriver.DbDriver
--libvirt_type=qemu
--flat_network_bridge=br100
--flat_interface=eth0
--vlan_interface=eth0
--public_interface=eth0
--vncproxy_url=http://10.233.52.102:6080
--daemonize=1
--rabbit_host=10.233.52.102
--cc_host=10.233.52.102
--osapi_host=10.233.52.102
--ec2_host=10.233.52.102
--image_service=nova.image.glance.GlanceImageService
--glance_api_servers=10.233.52.102:9292
--use_syslog

Do I need to add any additional flags?

Revision history for this message
madhavi (prasanna-desaraju) wrote :

Hi Kei/Vish,

These are the configuration details at my end. Please help me out; I need this to happen urgently.

Special thanks to both of you for responding immediately. Thanks a lot for the replies.

libvirt.conf
=============

listen_tls = 0
listen_tcp = 1
unix_sock_group = "libvirtd"
unix_sock_rw_perms = "0770"
auth_unix_ro = "none"
auth_unix_rw = "none"
auth_tcp = "none

qemu.conf
========

vnc_listen = "0.0.0.0"
vnc_auto_unix_socket = 1
vnc_tls = 1
vnc_tls_x509_cert_dir = "/etc/pki/libvirt-vnc"
vnc_tls_x509_verify = 1
vnc_password = "XYZ12345"
vnc_sasl = 1
spice_listen = "0.0.0.0"
spice_tls = 1
spice_password = "XYZ12345"
spice_tls_x509_cert_dir = "/etc/pki/libvirt-spice"
security_driver = "selinux"
user = "root"

Revision history for this message
Kei Masumoto (masumotok) wrote :

> security_driver = "selinux"
I can't say anything without the error log messages, but one thing I noticed is the line above. I always comment it out.

Revision history for this message
madhavi (prasanna-desaraju) wrote :
Download full text (7.2 KiB)

Hi Kei,

qemu.conf at host and destination nodes
===============================
user = root
group=root in qemu.conf

libvirt.conf at host and destination nodes
===============================
listen_tls = 0
listen_tcp = 1
unix_sock_group = "libvirtd"
unix_sock_rw_perms = "0770"
auth_unix_ro = "none"
auth_unix_rw = "none"
auth_tcp = "none

and nova.conf at source and destination nodes
=============================================

--dhcpbridge_flagfile=/etc/nova/nova.conf
--dhcpbridge=/usr/bin/nova-dhcpbridge
--logdir=/var/log/nova
--state_path=/var/lib/nova
--lock_path=/var/lock/nova
--flagfile=/etc/nova/nova-compute.conf
#--force_dhcp_release=True
--use_deprecated_auth
--verbose
--sql_connection=mysql://novadbuser:novaDBsekret@10.233.52.103/nova
#--network_manager=nova.network.manager.FlatDHCPManager
--network_manager=nova.network.manager.VlanManager
--vlan_interface=eth0
--auth_driver=nova.auth.dbdriver.DbDriver
#--flat_network_dhcp_start=10.0.0.2
--flat_network_bridge=br100
#--flat_injected=False
#--routing_source_ip=10.233.52.3
--flat_interface=eth0
--libv
--connection_type=libvirt
--public_interface=eth0
--daemonize=1
--rabbit_host=10.233.52.103
--cc_host=10.233.52.103
--s3_host=10.233.52.103
--osapi_host=10.233.52.103
--ec2_url=http://10.233.52.103:8773/services/Cloud
--ec2_host=10.233.52.103
--image_service=nova.image.glance.GlanceImageService
--glance_api_servers=10.233.52.103:9292
--vncproxy_url=http://10.233.52.3:6080
--use_syslog
--allow_admin_api=True
--fixed_range=10.0.0.0/24
--sql_min_pool_size=1

I am working with KVM on bare metal but I still hit the same issue. Can you please help me a little with this?

Below is the compute-log at destination host side
================================================

11-12-01 23:56:22,781 DEBUG nova.rpc [-] unpacked context: {'user_id': None, 'roles': [], 'timestamp': u'2011-12-01T18:26:41.555857', 'auth_token': None, 'msg_id': u'ff651bedf5db4821a5f2760a4eff6282', 'remote_address': None, 'strategy': u'noauth', 'is_admin': True, 'request_id': u'18760f6d-e82f-404f-8fae-c42849112555', 'project_id': None, 'read_deleted': False} from (pid=4836) _unpack_context /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:646
2011-12-01 23:56:22,937 INFO nova.compute.manager [-] server-39 has no volume.
2011-12-01 23:56:22,938 DEBUG nova.rpc [-] Making asynchronous call on network ... from (pid=4836) multicall /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:721
2011-12-01 23:56:22,939 DEBUG nova.rpc [-] MSG_ID is 807151e24ed94860bc8e576a95333382 from (pid=4836) multicall /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:724
2011-12-01 23:56:23,324 INFO nova [-] called setup_basic_filtering in nwfilter
2011-12-01 23:56:23,324 INFO nova [-] ensuring static filters
2011-12-01 23:56:24,176 DEBUG nova.utils [-] Attempting to grab semaphore "iptables" for method "apply"... from (pid=4836) inner /usr/lib/python2.7/dist-packages/nova/utils.py:672
2011-12-01 23:56:24,177 DEBUG nova.utils [-] Attempting to grab file lock "iptables" for method "apply"... from (pid=4836) inner /usr/lib/python2.7/dist-packages/nova/utils.py:677
2011-12-01 23:56:24,178 DEBUG nova.utils [-] Running cmd (subprocess): sudo ...


Revision history for this message
Kei Masumoto (masumotok) wrote :

Usually, libvirt-related errors show up in the source compute node's log. I recommend checking the source/destination compute, scheduler, and network logs. The destination compute log looks fine.

Revision history for this message
madhavi (prasanna-desaraju) wrote :

Hi Kei,

Can you please tell me what might be the reasons for the rollback happening at the destination node?

Will block migration work if I am using qemu, or do I need to use KVM on bare metal?

Revision history for this message
madhavi (prasanna-desaraju) wrote :

Hi Kei,

If you don't mind, can you please give the configuration details necessary for block migration? I have been trying for a month and I still get the same "instance files deleted successfully" behaviour described in the comments above. I do not understand the root cause of this.
Can you please share your configuration details and your environment details?

At my end I have tried qemu and also KVM on bare metal, but the issue remains the same.

Please help me out with this; I need it urgently.

My setup is as follows:
========================

1.) I have a single-node setup on one node.
2.) I have only compute on the other node. The controller and remaining components are shared by both nodes.
3.) Both use KVM on bare metal. They are physical boxes with KVM as the underlying hypervisor, and I installed OpenStack on top of it.

libvirt.conf at both nodes
=========================
listen_tls = 0
listen_tcp = 1
unix_sock_group = "libvirtd"
unix_sock_rw_perms = "0770"
auth_unix_ro = "none"
auth_unix_rw = "none"
auth_tcp = "none

qemu.conf at both nodes
========================
user = root
group=root in qemu.conf

nova.conf
==========

--dhcpbridge_flagfile=/etc/nova/nova.conf
--dhcpbridge=/usr/bin/nova-dhcpbridge
--logdir=/var/log/nova
--state_path=/var/lib/nova
--lock_path=/var/lock/nova
--flagfile=/etc/nova/nova-compute.conf
--use_deprecated_auth
--verbose
--sql_connection=mysql://novadbuser:novaDBsekret@10.233.52.103/nova
--network_manager=nova.network.manager.FlatDHCPManager
--vlan_interface=eth0
--auth_driver=nova.auth.dbdriver.DbDriver
--flat_network_bridge=br100
--routing_source_ip=10.233.52.103
--flat_interface=eth0
--libvirt_type=kvm
--connection_type=libvirt
--public_interface=eth0
--daemonize=1
--rabbit_host=10.233.52.103
--cc_host=10.233.52.103
--s3_host=10.233.52.103
--osapi_host=10.233.52.103
--ec2_url=http://10.233.52.103:8773/services/Cloud
--ec2_host=10.233.52.103
--image_service=nova.image.glance.GlanceImageService
--glance_api_servers=10.233.52.103:9292
--vncproxy_url=http://10.233.52.103:6080
--use_syslog
--allow_admin_api=True
--fixed_range=10.0.0.0/24
--sql_min_pool_size=1

The source log shows no errors.

Below is the compute-log at destination host side
================================================

11-12-01 23:56:22,781 DEBUG nova.rpc [-] unpacked context: {'user_id': None, 'roles': [], 'timestamp': u'2011-12-01T18:26:41.555857', 'auth_token': None, 'msg_id': u'ff651bedf5db4821a5f2760a4eff6282', 'remote_address': None, 'strategy': u'noauth', 'is_admin': True, 'request_id': u'18760f6d-e82f-404f-8fae-c42849112555', 'project_id': None, 'read_deleted': False} from (pid=4836) _unpack_context /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:646
2011-12-01 23:56:22,937 INFO nova.compute.manager [-] server-39 has no volume.
2011-12-01 23:56:22,938 DEBUG nova.rpc [-] Making asynchronous call on network ... from (pid=4836) multicall /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:721
2011-12-01 23:56:22,939 DEBUG nova.rpc [-] MSG_ID is 807151e24ed94860bc8e576a95333382 from (pid=4836) multicall /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:724
2011-12-01 23:56:23,324 INFO nova [-] cal...


Revision history for this message
madhavi (prasanna-desaraju) wrote :

Hi Kei,

The previous log was with qemu.

The log below is with KVM on bare metal; the configuration otherwise remains the same. Please see the log below from the destination compute node.

    <features name="ss" />
    <features name="acpi" />
    <features name="ds" />
    <features name="vme" />
</cpu>

2011-12-09 11:24:44,321 DEBUG nova.rpc [-] skipping a message reply due to a value of None returned by the func: <bound method ComputeManager.compare_cpu of <nova.compute.manager.ComputeManager object at 0x237f950>> from (pid=2338) _process_data /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:656
2011-12-09 11:24:46,371 DEBUG nova.rpc [-] received {u'_context_roles': [], u'_msg_id': u'5683b393e463469f8cce9a0d154c58fe', u'_context_read_deleted': False, u'_context_request_id': u'62c8b42e-399b-402e-9889-44ea5a9ddb07', u'args': {u'instance_id': 16639, u'block_migration': True, u'disk': u'[{"path": "/var/lib/nova/instances/instance-000040ff/disk", "local_gb": "4G", "type": "qcow2"}, {"path": "/var/lib/nova/instances/instance-000040ff/disk.local", "local_gb": "40G", "type": "qcow2"}]'}, u'_context_auth_token': None, u'_context_strategy': u'noauth', u'_context_is_admin': True, u'_context_project_id': None, u'_context_timestamp': u'2011-12-09T11:24:41.886399', u'_context_user_id': None, u'method': u'pre_live_migration', u'_context_remote_address': None} from (pid=2338) __call__ /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:621
2011-12-09 11:24:46,372 DEBUG nova.rpc [-] unpacked context: {'user_id': None, 'roles': [], 'timestamp': u'2011-12-09T11:24:41.886399', 'auth_token': None, 'msg_id': u'5683b393e463469f8cce9a0d154c58fe', 'remote_address': None, 'strategy': u'noauth', 'is_admin': True, 'request_id': u'62c8b42e-399b-402e-9889-44ea5a9ddb07', 'project_id': None, 'read_deleted': False} from (pid=2338) _unpack_context /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:676
2011-12-09 11:24:46,559 INFO nova.compute.manager [-] server-16639 has no volume.
2011-12-09 11:24:46,560 DEBUG nova.rpc [-] Making asynchronous call on network ... from (pid=2338) multicall /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:751
2011-12-09 11:24:46,560 DEBUG nova.rpc [-] MSG_ID is 22f0c8be2d5047d98e1ec88ba201a8fe from (pid=2338) multicall /usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py:754
2011-12-09 11:24:46,784 DEBUG nova.virt.libvirt.vif [-] Ensuring vlan 2688 and bridge br2688 from (pid=2338) plug /usr/lib/python2.7/dist-packages/nova/virt/libvirt/vif.py:82
2011-12-09 11:24:46,785 DEBUG nova.utils [-] Attempting to grab semaphore "ensure_vlan" for method "ensure_vlan"... from (pid=2338) inner /usr/lib/python2.7/dist-packages/nova/utils.py:719
2011-12-09 11:24:46,785 DEBUG nova.utils [-] Attempting to grab file lock "ensure_vlan" for method "ensure_vlan"... from (pid=2338) inner /usr/lib/python2.7/dist-packages/nova/utils.py:724
2011-12-09 11:24:46,785 DEBUG nova.utils [-] Running cmd (subprocess): ip link show dev vlan2688 from (pid=2338) execute /usr/lib/python2.7/dist-packages/nova/utils.py:165
2011-12-09 11:24:46,826 DEBUG nova.utils [-] Result was 255 from (pid=2338) execute /usr/lib/python2.7/dist-packages/nova/utils.py:180
2011-12-09 11:24:46,826 DEBU...

Revision history for this message
Kei Masumoto (masumotok) wrote :

> Can you please tell me what might be the reasons for the rollback happening at the destination node?
Because some error occurred. The "rollback" here is also the cleanup step for pre-block-migration, so it is nothing suspicious by itself.
> Will block migration work if I am using qemu, or do I need to use KVM on bare metal?
I recommend using kvm.

Revision history for this message
madhavi (prasanna-desaraju) wrote :

hi kei,

I tried using KVM and it shows the same messages: the instance files are destroyed successfully.

INFO nova.virt.libvirt_conn [-] Instance instance-000040ff destroyed successfully.

INFO nova.virt.libvirt_conn [-] instance instance-000040ff: deleting instance files /var/lib/nova/instances/instance-000040ff
The destination log shows these entries but doesn't show any errors. I have pasted the log above.

How can I trace the problem?

Revision history for this message
madhavi (prasanna-desaraju) wrote :

Do we have any prerequisites for doing block migration? I have made sure everything is fine, but I still get the same issue. Please guide me a little on this.
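
For what it's worth, the scheduler checks added in this branch amount to a prerequisite list: the instance must be running, the destination must be a different live compute host with the same hypervisor type and an equal-or-newer hypervisor version, it must have enough free memory and disk, and for block migration the instances_path must not be on storage shared with the source. A rough checklist in code form (a sketch of those checks with illustrative parameter names, not the scheduler code itself):

    def block_migration_preconditions(instance, src_node, dest_node,
                                      dest_free_mb, dest_free_gb,
                                      instances_path_is_shared):
        """Rough checklist mirroring the scheduler checks in this branch."""
        checks = {
            'instance is running':
                instance['state_description'] == 'running',
            'destination differs from source':
                dest_node['host'] != instance['host'],
            'same hypervisor type':
                src_node['hypervisor_type'] == dest_node['hypervisor_type'],
            'destination hypervisor not older':
                dest_node['hypervisor_version'] >= src_node['hypervisor_version'],
            'enough free memory on destination':
                dest_free_mb > instance['memory_mb'],
            'enough free disk on destination':
                dest_free_gb > instance['local_gb'],
            'instances_path not on shared storage':
                not instances_path_is_shared,
        }
        # Return the names of any checks that fail.
        return [name for name, ok in checks.items() if not ok]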

Revision history for this message
madhavi (prasanna-desaraju) wrote :

Hi Kei,

How do I check whether name resolution is working fine from one host to the other? I mean, how do I ensure that libvirtd on one host is reachable from the other?
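
One way to sanity-check both from the source host is a quick script with the python-libvirt bindings (a sketch; the destination hostname is a placeholder, and the URI assumes the qemu+tcp style that FLAGS.live_migration_uri is formatted with, so adjust it if you use qemu+ssh):

    import socket
    import libvirt  # python-libvirt bindings

    dest = 'dest-compute'  # placeholder: destination compute hostname

    # 1. Name resolution: the source must resolve the destination's hostname.
    print(socket.gethostbyname(dest))

    # 2. libvirtd reachability: open a read-only connection using the same
    #    style of URI that live migration uses. If this fails, check that
    #    libvirtd on the destination accepts remote (tcp) connections.
    conn = libvirt.openReadOnly('qemu+tcp://%s/system' % dest)
    print(conn.getHostname())
    conn.close()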

Preview Diff

1=== modified file 'Authors'
2--- Authors 2011-08-09 21:17:56 +0000
3+++ Authors 2011-08-12 19:02:28 +0000
4@@ -58,6 +58,7 @@
5 Justin Santa Barbara <justin@fathomdb.com>
6 Justin Shepherd <jshepher@rackspace.com>
7 Kei Masumoto <masumotok@nttdata.co.jp>
8+masumoto<masumotok@nttdata.co.jp>
9 Ken Pepple <ken.pepple@gmail.com>
10 Kevin Bringard <kbringard@attinteractive.com>
11 Kevin L. Mitchell <kevin.mitchell@rackspace.com>
12
13=== modified file 'bin/nova-manage'
14--- bin/nova-manage 2011-08-09 15:41:55 +0000
15+++ bin/nova-manage 2011-08-12 19:02:28 +0000
16@@ -834,11 +834,13 @@
17 instance['availability_zone'],
18 instance['launch_index'])
19
20- @args('--ec2_id', dest='ec2_id', metavar='<ec2 id>', help='EC2 ID')
21- @args('--dest', dest='dest', metavar='<Destanation>',
22- help='destanation node')
23- def live_migration(self, ec2_id, dest):
24- """Migrates a running instance to a new machine."""
25+ def _migration(self, ec2_id, dest, block_migration=False):
26+ """Migrates a running instance to a new machine.
27+ :param ec2_id: instance id which comes from euca-describe-instance.
28+ :param dest: destination host name.
29+ :param block_migration: if True, do block_migration.
30+
31+ """
32
33 ctxt = context.get_admin_context()
34 instance_id = ec2utils.ec2_id_to_id(ec2_id)
35@@ -859,11 +861,28 @@
36 {"method": "live_migration",
37 "args": {"instance_id": instance_id,
38 "dest": dest,
39- "topic": FLAGS.compute_topic}})
40+ "topic": FLAGS.compute_topic,
41+ "block_migration": block_migration}})
42
43 print _('Migration of %s initiated.'
44 'Check its progress using euca-describe-instances.') % ec2_id
45
46+ @args('--ec2_id', dest='ec2_id', metavar='<ec2 id>', help='EC2 ID')
47+ @args('--dest', dest='dest', metavar='<Destanation>',
48+ help='destanation node')
49+ def live_migration(self, ec2_id, dest):
50+ """Migrates a running instance to a new machine."""
51+
52+ self._migration(ec2_id, dest)
53+
54+ @args('--ec2_id', dest='ec2_id', metavar='<ec2 id>', help='EC2 ID')
55+ @args('--dest', dest='dest', metavar='<Destanation>',
56+ help='destanation node')
57+ def block_migration(self, ec2_id, dest):
58+ """Migrates a running instance to a new machine with storage data."""
59+
60+ self._migration(ec2_id, dest, True)
61+
62
63 class ServiceCommands(object):
64 """Enable and disable running services"""
65@@ -937,9 +956,19 @@
66 mem_u = result['resource']['memory_mb_used']
67 hdd_u = result['resource']['local_gb_used']
68
69+ cpu_sum = 0
70+ mem_sum = 0
71+ hdd_sum = 0
72 print 'HOST\t\t\tPROJECT\t\tcpu\tmem(mb)\tdisk(gb)'
73 print '%s(total)\t\t\t%s\t%s\t%s' % (host, cpu, mem, hdd)
74- print '%s(used)\t\t\t%s\t%s\t%s' % (host, cpu_u, mem_u, hdd_u)
75+ print '%s(used_now)\t\t\t%s\t%s\t%s' % (host, cpu_u, mem_u, hdd_u)
76+ for p_id, val in result['usage'].items():
77+ cpu_sum += val['vcpus']
78+ mem_sum += val['memory_mb']
79+ hdd_sum += val['local_gb']
80+ print '%s(used_max)\t\t\t%s\t%s\t%s' % (host, cpu_sum,
81+ mem_sum, hdd_sum)
82+
83 for p_id, val in result['usage'].items():
84 print '%s\t\t%s\t\t%s\t%s\t%s' % (host,
85 p_id,
86
87=== modified file 'nova/compute/manager.py'
88--- nova/compute/manager.py 2011-08-09 09:54:51 +0000
89+++ nova/compute/manager.py 2011-08-12 19:02:28 +0000
90@@ -1224,6 +1224,7 @@
91 @exception.wrap_exception(notifier=notifier, publisher_id=publisher_id())
92 def check_shared_storage_test_file(self, context, filename):
93 """Confirms existence of the tmpfile under FLAGS.instances_path.
94+ Cannot confirm tmpfile return False.
95
96 :param context: security context
97 :param filename: confirm existence of FLAGS.instances_path/thisfile
98@@ -1231,7 +1232,9 @@
99 """
100 tmp_file = os.path.join(FLAGS.instances_path, filename)
101 if not os.path.exists(tmp_file):
102- raise exception.FileNotFound(file_path=tmp_file)
103+ return False
104+ else:
105+ return True
106
107 @exception.wrap_exception(notifier=notifier, publisher_id=publisher_id())
108 def cleanup_shared_storage_test_file(self, context, filename):
109@@ -1254,11 +1257,13 @@
110 """
111 return self.driver.update_available_resource(context, self.host)
112
113- def pre_live_migration(self, context, instance_id, time=None):
114+ def pre_live_migration(self, context, instance_id, time=None,
115+ block_migration=False, disk=None):
116 """Preparations for live migration at dest host.
117
118 :param context: security context
119 :param instance_id: nova.db.sqlalchemy.models.Instance.Id
120+ :param block_migration: if true, prepare for block migration
121
122 """
123 if not time:
124@@ -1310,17 +1315,24 @@
125 # onto destination host.
126 self.driver.ensure_filtering_rules_for_instance(instance_ref)
127
128- def live_migration(self, context, instance_id, dest):
129+ # Preparation for block migration
130+ if block_migration:
131+ self.driver.pre_block_migration(context,
132+ instance_ref,
133+ disk)
134+
135+ def live_migration(self, context, instance_id,
136+ dest, block_migration=False):
137 """Executing live migration.
138
139 :param context: security context
140 :param instance_id: nova.db.sqlalchemy.models.Instance.Id
141 :param dest: destination host
142+ :param block_migration: if true, do block migration
143
144 """
145 # Get instance for error handling.
146 instance_ref = self.db.instance_get(context, instance_id)
147- i_name = instance_ref.name
148
149 try:
150 # Checking volume node is working correctly when any volumes
151@@ -1331,16 +1343,25 @@
152 {"method": "check_for_export",
153 "args": {'instance_id': instance_id}})
154
155- # Asking dest host to preparing live migration.
156+ if block_migration:
157+ disk = self.driver.get_instance_disk_info(context,
158+ instance_ref)
159+ else:
160+ disk = None
161+
162 rpc.call(context,
163 self.db.queue_get_for(context, FLAGS.compute_topic, dest),
164 {"method": "pre_live_migration",
165- "args": {'instance_id': instance_id}})
166+ "args": {'instance_id': instance_id,
167+ 'block_migration': block_migration,
168+ 'disk': disk}})
169
170 except Exception:
171+ i_name = instance_ref.name
172 msg = _("Pre live migration for %(i_name)s failed at %(dest)s")
173 LOG.error(msg % locals())
174- self.recover_live_migration(context, instance_ref)
175+ self.rollback_live_migration(context, instance_ref,
176+ dest, block_migration)
177 raise
178
179 # Executing live migration
180@@ -1348,9 +1369,11 @@
181 # nothing must be recovered in this version.
182 self.driver.live_migration(context, instance_ref, dest,
183 self.post_live_migration,
184- self.recover_live_migration)
185+ self.rollback_live_migration,
186+ block_migration)
187
188- def post_live_migration(self, ctxt, instance_ref, dest):
189+ def post_live_migration(self, ctxt, instance_ref,
190+ dest, block_migration=False):
191 """Post operations for live migration.
192
193 This method is called from live_migration
194@@ -1359,6 +1382,7 @@
195 :param ctxt: security context
196 :param instance_id: nova.db.sqlalchemy.models.Instance.Id
197 :param dest: destination host
198+ :param block_migration: if true, do block migration
199
200 """
201
202@@ -1401,8 +1425,29 @@
203 "%(i_name)s cannot inherit floating "
204 "ip.\n%(e)s") % (locals()))
205
206- # Restore instance/volume state
207- self.recover_live_migration(ctxt, instance_ref, dest)
208+ # Define domain at destination host, without doing it,
209+ # pause/suspend/terminate do not work.
210+ rpc.call(ctxt,
211+ self.db.queue_get_for(ctxt, FLAGS.compute_topic, dest),
212+ {"method": "post_live_migration_at_destination",
213+ "args": {'instance_id': instance_ref.id,
214+ 'block_migration': block_migration}})
215+
216+ # Restore instance state
217+ self.db.instance_update(ctxt,
218+ instance_ref['id'],
219+ {'state_description': 'running',
220+ 'state': power_state.RUNNING,
221+ 'host': dest})
222+ # Restore volume state
223+ for volume_ref in instance_ref['volumes']:
224+ volume_id = volume_ref['id']
225+ self.db.volume_update(ctxt, volume_id, {'status': 'in-use'})
226+
227+ # No instance booting at source host, but instance dir
228+ # must be deleted for preparing next block migration
229+ if block_migration:
230+ self.driver.destroy(instance_ref, network_info)
231
232 LOG.info(_('Migrating %(i_name)s to %(dest)s finished successfully.')
233 % locals())
234@@ -1410,31 +1455,64 @@
235 "Domain not found: no domain with matching name.\" "
236 "This error can be safely ignored."))
237
238- def recover_live_migration(self, ctxt, instance_ref, host=None, dest=None):
239+ def post_live_migration_at_destination(self, context,
240+ instance_id, block_migration=False):
241+ """Post operations for live migration .
242+
243+ :param context: security context
244+ :param instance_id: nova.db.sqlalchemy.models.Instance.Id
245+ :param block_migration: block_migration
246+
247+ """
248+ instance_ref = self.db.instance_get(context, instance_id)
249+ LOG.info(_('Post operation of migraton started for %s .')
250+ % instance_ref.name)
251+ network_info = self._get_instance_nw_info(context, instance_ref)
252+ self.driver.post_live_migration_at_destination(context,
253+ instance_ref,
254+ network_info,
255+ block_migration)
256+
257+ def rollback_live_migration(self, context, instance_ref,
258+ dest, block_migration):
259 """Recovers Instance/volume state from migrating -> running.
260
261- :param ctxt: security context
262+ :param context: security context
263 :param instance_id: nova.db.sqlalchemy.models.Instance.Id
264- :param host: DB column value is updated by this hostname.
265- If none, the host instance currently running is selected.
266-
267+ :param dest:
268+ This method is called from live migration src host.
269+ This param specifies destination host.
270 """
271- if not host:
272- host = instance_ref['host']
273-
274- self.db.instance_update(ctxt,
275+ host = instance_ref['host']
276+ self.db.instance_update(context,
277 instance_ref['id'],
278 {'state_description': 'running',
279 'state': power_state.RUNNING,
280 'host': host})
281
282- if dest:
283- volume_api = volume.API()
284 for volume_ref in instance_ref['volumes']:
285 volume_id = volume_ref['id']
286- self.db.volume_update(ctxt, volume_id, {'status': 'in-use'})
287- if dest:
288- volume_api.remove_from_compute(ctxt, volume_id, dest)
289+ self.db.volume_update(context, volume_id, {'status': 'in-use'})
290+ volume.API().remove_from_compute(context, volume_id, dest)
291+
292+ # Block migration needs empty image at destination host
293+ # before migration starts, so if any failure occurs,
294+ # any empty images has to be deleted.
295+ if block_migration:
296+ rpc.cast(context,
297+ self.db.queue_get_for(context, FLAGS.compute_topic, dest),
298+ {"method": "rollback_live_migration_at_destination",
299+ "args": {'instance_id': instance_ref['id']}})
300+
301+ def rollback_live_migration_at_destination(self, context, instance_id):
302+ """ Cleaning up image directory that is created pre_live_migration.
303+
304+ :param context: security context
305+ :param instance_id: nova.db.sqlalchemy.models.Instance.Id
306+ """
307+ instances_ref = self.db.instance_get(context, instance_id)
308+ network_info = self._get_instance_nw_info(context, instances_ref)
309+ self.driver.destroy(instances_ref, network_info)
310
311 def periodic_tasks(self, context=None):
312 """Tasks to be run at a periodic interval."""
313
314=== modified file 'nova/db/api.py'
315--- nova/db/api.py 2011-08-09 23:59:51 +0000
316+++ nova/db/api.py 2011-08-12 19:02:28 +0000
317@@ -570,27 +570,6 @@
318 security_group_id)
319
320
321-def instance_get_vcpu_sum_by_host_and_project(context, hostname, proj_id):
322- """Get instances.vcpus by host and project."""
323- return IMPL.instance_get_vcpu_sum_by_host_and_project(context,
324- hostname,
325- proj_id)
326-
327-
328-def instance_get_memory_sum_by_host_and_project(context, hostname, proj_id):
329- """Get amount of memory by host and project."""
330- return IMPL.instance_get_memory_sum_by_host_and_project(context,
331- hostname,
332- proj_id)
333-
334-
335-def instance_get_disk_sum_by_host_and_project(context, hostname, proj_id):
336- """Get total amount of disk by host and project."""
337- return IMPL.instance_get_disk_sum_by_host_and_project(context,
338- hostname,
339- proj_id)
340-
341-
342 def instance_action_create(context, values):
343 """Create an instance action from the values dictionary."""
344 return IMPL.instance_action_create(context, values)
345
346=== modified file 'nova/db/sqlalchemy/api.py'
347--- nova/db/sqlalchemy/api.py 2011-08-12 05:02:21 +0000
348+++ nova/db/sqlalchemy/api.py 2011-08-12 19:02:28 +0000
349@@ -1483,45 +1483,6 @@
350
351
352 @require_context
353-def instance_get_vcpu_sum_by_host_and_project(context, hostname, proj_id):
354- session = get_session()
355- result = session.query(models.Instance).\
356- filter_by(host=hostname).\
357- filter_by(project_id=proj_id).\
358- filter_by(deleted=False).\
359- value(func.sum(models.Instance.vcpus))
360- if not result:
361- return 0
362- return result
363-
364-
365-@require_context
366-def instance_get_memory_sum_by_host_and_project(context, hostname, proj_id):
367- session = get_session()
368- result = session.query(models.Instance).\
369- filter_by(host=hostname).\
370- filter_by(project_id=proj_id).\
371- filter_by(deleted=False).\
372- value(func.sum(models.Instance.memory_mb))
373- if not result:
374- return 0
375- return result
376-
377-
378-@require_context
379-def instance_get_disk_sum_by_host_and_project(context, hostname, proj_id):
380- session = get_session()
381- result = session.query(models.Instance).\
382- filter_by(host=hostname).\
383- filter_by(project_id=proj_id).\
384- filter_by(deleted=False).\
385- value(func.sum(models.Instance.local_gb))
386- if not result:
387- return 0
388- return result
389-
390-
391-@require_context
392 def instance_action_create(context, values):
393 """Create an instance action from the values dictionary."""
394 action_ref = models.InstanceActions()
395
396=== modified file 'nova/db/sqlalchemy/models.py'
397--- nova/db/sqlalchemy/models.py 2011-08-04 21:58:42 +0000
398+++ nova/db/sqlalchemy/models.py 2011-08-12 19:02:28 +0000
399@@ -127,14 +127,14 @@
400 'ComputeNode.service_id == Service.id,'
401 'ComputeNode.deleted == False)')
402
403- vcpus = Column(Integer, nullable=True)
404- memory_mb = Column(Integer, nullable=True)
405- local_gb = Column(Integer, nullable=True)
406- vcpus_used = Column(Integer, nullable=True)
407- memory_mb_used = Column(Integer, nullable=True)
408- local_gb_used = Column(Integer, nullable=True)
409- hypervisor_type = Column(Text, nullable=True)
410- hypervisor_version = Column(Integer, nullable=True)
411+ vcpus = Column(Integer)
412+ memory_mb = Column(Integer)
413+ local_gb = Column(Integer)
414+ vcpus_used = Column(Integer)
415+ memory_mb_used = Column(Integer)
416+ local_gb_used = Column(Integer)
417+ hypervisor_type = Column(Text)
418+ hypervisor_version = Column(Integer)
419
420 # Note(masumotok): Expected Strings example:
421 #
422
423=== modified file 'nova/exception.py'
424--- nova/exception.py 2011-08-09 23:59:51 +0000
425+++ nova/exception.py 2011-08-12 19:02:28 +0000
426@@ -273,6 +273,11 @@
427 "has been provided.")
428
429
430+class DestinationDiskExists(Invalid):
431+ message = _("The supplied disk path (%(path)s) already exists, "
432+ "it is expected not to exist.")
433+
434+
435 class InvalidDevicePath(Invalid):
436 message = _("The supplied device path (%(path)s) is invalid.")
437
438@@ -699,6 +704,10 @@
439 message = _("Instance %(name)s already exists.")
440
441
442+class InvalidSharedStorage(NovaException):
443+ message = _("%(path)s is on shared storage: %(reason)s")
444+
445+
446 class MigrationError(NovaException):
447 message = _("Migration error") + ": %(reason)s"
448
449
450=== modified file 'nova/scheduler/driver.py'
451--- nova/scheduler/driver.py 2011-07-01 14:53:20 +0000
452+++ nova/scheduler/driver.py 2011-08-12 19:02:28 +0000
453@@ -30,6 +30,7 @@
454 from nova import rpc
455 from nova import utils
456 from nova.compute import power_state
457+from nova.api.ec2 import ec2utils
458
459
460 FLAGS = flags.FLAGS
461@@ -78,7 +79,8 @@
462 """Must override at least this method for scheduler to work."""
463 raise NotImplementedError(_("Must implement a fallback schedule"))
464
465- def schedule_live_migration(self, context, instance_id, dest):
466+ def schedule_live_migration(self, context, instance_id, dest,
467+ block_migration=False):
468 """Live migration scheduling method.
469
470 :param context:
471@@ -87,9 +89,7 @@
472 :return:
473 The host where instance is running currently.
474 Then scheduler send request that host.
475-
476 """
477-
478 # Whether instance exists and is running.
479 instance_ref = db.instance_get(context, instance_id)
480
481@@ -97,10 +97,11 @@
482 self._live_migration_src_check(context, instance_ref)
483
484 # Checking destination host.
485- self._live_migration_dest_check(context, instance_ref, dest)
486-
487+ self._live_migration_dest_check(context, instance_ref,
488+ dest, block_migration)
489 # Common checking.
490- self._live_migration_common_check(context, instance_ref, dest)
491+ self._live_migration_common_check(context, instance_ref,
492+ dest, block_migration)
493
494 # Changing instance_state.
495 db.instance_set_state(context,
496@@ -130,7 +131,8 @@
497 # Checking instance is running.
498 if (power_state.RUNNING != instance_ref['state'] or \
499 'running' != instance_ref['state_description']):
500- raise exception.InstanceNotRunning(instance_id=instance_ref['id'])
501+ instance_id = ec2utils.id_to_ec2_id(instance_ref['id'])
502+ raise exception.InstanceNotRunning(instance_id=instance_id)
503
504 # Checing volume node is running when any volumes are mounted
505 # to the instance.
506@@ -147,7 +149,8 @@
507 if not self.service_is_up(services[0]):
508 raise exception.ComputeServiceUnavailable(host=src)
509
510- def _live_migration_dest_check(self, context, instance_ref, dest):
511+ def _live_migration_dest_check(self, context, instance_ref, dest,
512+ block_migration):
513 """Live migration check routine (for destination host).
514
515 :param context: security context
516@@ -168,16 +171,18 @@
517 # and dest is not same.
518 src = instance_ref['host']
519 if dest == src:
520- raise exception.UnableToMigrateToSelf(
521- instance_id=instance_ref['id'],
522- host=dest)
523+ instance_id = ec2utils.id_to_ec2_id(instance_ref['id'])
524+ raise exception.UnableToMigrateToSelf(instance_id=instance_id,
525+ host=dest)
526
527 # Checking dst host still has enough capacities.
528 self.assert_compute_node_has_enough_resources(context,
529 instance_ref,
530- dest)
531+ dest,
532+ block_migration)
533
534- def _live_migration_common_check(self, context, instance_ref, dest):
535+ def _live_migration_common_check(self, context, instance_ref, dest,
536+ block_migration):
537 """Live migration common check routine.
538
539 Below checkings are followed by
540@@ -186,11 +191,26 @@
541 :param context: security context
542 :param instance_ref: nova.db.sqlalchemy.models.Instance object
543 :param dest: destination host
544+ :param block_migration if True, check for block_migration.
545
546 """
547
548 # Checking shared storage connectivity
549- self.mounted_on_same_shared_storage(context, instance_ref, dest)
550+ # if block migration, instances_paths should not be on shared storage.
551+ try:
552+ self.mounted_on_same_shared_storage(context, instance_ref, dest)
553+ if block_migration:
554+ reason = _("Block migration can not be used "
555+ "with shared storage.")
556+ raise exception.InvalidSharedStorage(reason=reason, path=dest)
557+ except exception.FileNotFound:
558+ if not block_migration:
559+ src = instance_ref['host']
560+ ipath = FLAGS.instances_path
561+ logging.error(_("Cannot confirm tmpfile at %(ipath)s is on "
562+ "same shared storage between %(src)s "
563+ "and %(dest)s.") % locals())
564+ raise
565
566 # Checking dest exists.
567 dservice_refs = db.service_get_all_compute_by_host(context, dest)
568@@ -229,37 +249,96 @@
569 "original host %(src)s.") % locals())
570 raise
571
572- def assert_compute_node_has_enough_resources(self, context,
573- instance_ref, dest):
574+ def assert_compute_node_has_enough_resources(self, context, instance_ref,
575+ dest, block_migration):
576+
577 """Checks if destination host has enough resource for live migration.
578
579- Currently, only memory checking has been done.
580- If storage migration(block migration, meaning live-migration
581- without any shared storage) will be available, local storage
582- checking is also necessary.
583-
584- :param context: security context
585- :param instance_ref: nova.db.sqlalchemy.models.Instance object
586- :param dest: destination host
587-
588- """
589-
590- # Getting instance information
591- hostname = instance_ref['hostname']
592-
593- # Getting host information
594- service_refs = db.service_get_all_compute_by_host(context, dest)
595- compute_node_ref = service_refs[0]['compute_node'][0]
596-
597- mem_total = int(compute_node_ref['memory_mb'])
598- mem_used = int(compute_node_ref['memory_mb_used'])
599- mem_avail = mem_total - mem_used
600+ :param context: security context
601+ :param instance_ref: nova.db.sqlalchemy.models.Instance object
602+ :param dest: destination host
603+ :param block_migration: if True, disk checking has been done
604+
605+ """
606+ self.assert_compute_node_has_enough_memory(context, instance_ref, dest)
607+ if not block_migration:
608+ return
609+ self.assert_compute_node_has_enough_disk(context, instance_ref, dest)
610+
611+ def assert_compute_node_has_enough_memory(self, context,
612+ instance_ref, dest):
613+ """Checks if destination host has enough memory for live migration.
614+
615+
616+ :param context: security context
617+ :param instance_ref: nova.db.sqlalchemy.models.Instance object
618+ :param dest: destination host
619+
620+ """
621+
622+ # Getting total available memory and disk of host
623+ avail = self._get_compute_info(context, dest, 'memory_mb')
624+
625+ # Getting total used memory and disk of host
626+ # It should be sum of memories that are assigned as max value,
627+ # because overcommiting is risky.
628+ used = 0
629+ instance_refs = db.instance_get_all_by_host(context, dest)
630+ used_list = [i['memory_mb'] for i in instance_refs]
631+ if used_list:
632+ used = reduce(lambda x, y: x + y, used_list)
633+
634 mem_inst = instance_ref['memory_mb']
635- if mem_avail <= mem_inst:
636- reason = _("Unable to migrate %(hostname)s to destination: "
637- "%(dest)s (host:%(mem_avail)s <= instance:"
638- "%(mem_inst)s)")
639- raise exception.MigrationError(reason=reason % locals())
640+ avail = avail - used
641+ if avail <= mem_inst:
642+ instance_id = ec2utils.id_to_ec2_id(instance_ref['id'])
643+ reason = _("Unable to migrate %(instance_id)s to %(dest)s: "
644+ "Lack of disk(host:%(avail)s <= instance:%(mem_inst)s)")
645+ raise exception.MigrationError(reason=reason % locals())
646+
647+ def assert_compute_node_has_enough_disk(self, context,
648+ instance_ref, dest):
649+ """Checks if destination host has enough disk for block migration.
650+
651+ :param context: security context
652+ :param instance_ref: nova.db.sqlalchemy.models.Instance object
653+ :param dest: destination host
654+
655+ """
656+
657+ # Getting total available memory and disk of host
658+ avail = self._get_compute_info(context, dest, 'local_gb')
659+
660+ # Getting total used memory and disk of host
661+ # It should be sum of disks that are assigned as max value
662+ # because overcommiting is risky.
663+ used = 0
664+ instance_refs = db.instance_get_all_by_host(context, dest)
665+ used_list = [i['local_gb'] for i in instance_refs]
666+ if used_list:
667+ used = reduce(lambda x, y: x + y, used_list)
668+
669+ disk_inst = instance_ref['local_gb']
670+ avail = avail - used
671+ if avail <= disk_inst:
672+ instance_id = ec2utils.id_to_ec2_id(instance_ref['id'])
673+ reason = _("Unable to migrate %(instance_id)s to %(dest)s: "
674+ "Lack of disk(host:%(avail)s "
675+ "<= instance:%(disk_inst)s)")
676+ raise exception.MigrationError(reason=reason % locals())
677+
678+ def _get_compute_info(self, context, host, key):
679+ """get compute node's infomation specified by key
680+
681+ :param context: security context
682+ :param host: hostname(must be compute node)
683+ :param key: column name of compute_nodes
684+ :return: value specified by key
685+
686+ """
687+ compute_node_ref = db.service_get_all_compute_by_host(context, host)
688+ compute_node_ref = compute_node_ref[0]['compute_node'][0]
689+ return compute_node_ref[key]
690
691 def mounted_on_same_shared_storage(self, context, instance_ref, dest):
692 """Check if the src and dest host mount same shared storage.
693@@ -283,15 +362,13 @@
694 {"method": 'create_shared_storage_test_file'})
695
696 # make sure existence at src host.
697- rpc.call(context, src_t,
698- {"method": 'check_shared_storage_test_file',
699- "args": {'filename': filename}})
700+ ret = rpc.call(context, src_t,
701+ {"method": 'check_shared_storage_test_file',
702+ "args": {'filename': filename}})
703+ if not ret:
704+ raise exception.FileNotFound(file_path=filename)
705
706- except rpc.RemoteError:
707- ipath = FLAGS.instances_path
708- logging.error(_("Cannot confirm tmpfile at %(ipath)s is on "
709- "same shared storage between %(src)s "
710- "and %(dest)s.") % locals())
711+ except exception.FileNotFound:
712 raise
713
714 finally:
715
716=== modified file 'nova/scheduler/manager.py'
717--- nova/scheduler/manager.py 2011-08-04 00:26:37 +0000
718+++ nova/scheduler/manager.py 2011-08-12 19:02:28 +0000
719@@ -113,7 +113,7 @@
720 # NOTE (masumotok) : This method should be moved to nova.api.ec2.admin.
721 # Based on bexar design summit discussion,
722 # just put this here for bexar release.
723- def show_host_resources(self, context, host, *args):
724+ def show_host_resources(self, context, host):
725 """Shows the physical/usage resource given by hosts.
726
727 :param context: security context
728@@ -121,43 +121,45 @@
729 :returns:
730 example format is below.
731 {'resource':D, 'usage':{proj_id1:D, proj_id2:D}}
732- D: {'vcpus':3, 'memory_mb':2048, 'local_gb':2048}
733+ D: {'vcpus': 3, 'memory_mb': 2048, 'local_gb': 2048,
734+ 'vcpus_used': 12, 'memory_mb_used': 10240,
735+ 'local_gb_used': 64}
736
737 """
738
739+ # Getting compute node info and related instances info
740 compute_ref = db.service_get_all_compute_by_host(context, host)
741 compute_ref = compute_ref[0]
742-
743- # Getting physical resource information
744- compute_node_ref = compute_ref['compute_node'][0]
745- resource = {'vcpus': compute_node_ref['vcpus'],
746- 'memory_mb': compute_node_ref['memory_mb'],
747- 'local_gb': compute_node_ref['local_gb'],
748- 'vcpus_used': compute_node_ref['vcpus_used'],
749- 'memory_mb_used': compute_node_ref['memory_mb_used'],
750- 'local_gb_used': compute_node_ref['local_gb_used']}
751-
752- # Getting usage resource information
753- usage = {}
754 instance_refs = db.instance_get_all_by_host(context,
755 compute_ref['host'])
756+
757+ # Getting total available/used resource
758+ compute_ref = compute_ref['compute_node'][0]
759+ resource = {'vcpus': compute_ref['vcpus'],
760+ 'memory_mb': compute_ref['memory_mb'],
761+ 'local_gb': compute_ref['local_gb'],
762+ 'vcpus_used': compute_ref['vcpus_used'],
763+ 'memory_mb_used': compute_ref['memory_mb_used'],
764+ 'local_gb_used': compute_ref['local_gb_used']}
765+ usage = dict()
766 if not instance_refs:
767 return {'resource': resource, 'usage': usage}
768
769+ # Getting usage resource per project
770 project_ids = [i['project_id'] for i in instance_refs]
771 project_ids = list(set(project_ids))
772 for project_id in project_ids:
773- vcpus = db.instance_get_vcpu_sum_by_host_and_project(context,
774- host,
775- project_id)
776- mem = db.instance_get_memory_sum_by_host_and_project(context,
777- host,
778- project_id)
779- hdd = db.instance_get_disk_sum_by_host_and_project(context,
780- host,
781- project_id)
782- usage[project_id] = {'vcpus': int(vcpus),
783- 'memory_mb': int(mem),
784- 'local_gb': int(hdd)}
785+ vcpus = [i['vcpus'] for i in instance_refs \
786+ if i['project_id'] == project_id]
787+
788+ mem = [i['memory_mb'] for i in instance_refs \
789+ if i['project_id'] == project_id]
790+
791+ disk = [i['local_gb'] for i in instance_refs \
792+ if i['project_id'] == project_id]
793+
794+ usage[project_id] = {'vcpus': reduce(lambda x, y: x + y, vcpus),
795+ 'memory_mb': reduce(lambda x, y: x + y, mem),
796+ 'local_gb': reduce(lambda x, y: x + y, disk)}
797
798 return {'resource': resource, 'usage': usage}
799
800=== modified file 'nova/tests/api/openstack/contrib/test_keypairs.py'
801=== modified file 'nova/tests/api/openstack/test_extensions.py'
802--- nova/tests/api/openstack/test_extensions.py 2011-08-10 21:16:13 +0000
803+++ nova/tests/api/openstack/test_extensions.py 2011-08-12 19:02:28 +0000
804@@ -127,9 +127,7 @@
805 "updated": "2011-01-22T13:25:27-06:00",
806 "description": "The Fox In Socks Extension",
807 "alias": "FOXNSOX",
808- "links": [],
809- },
810- )
811+ "links": []})
812
813 def test_list_extensions_xml(self):
814 app = openstack.APIRouterV11()
815@@ -336,27 +334,18 @@
816
817 def test_serialize_extenstion(self):
818 serializer = extensions.ExtensionsXMLSerializer()
819- data = {
820- 'extension': {
821- 'name': 'ext1',
822- 'namespace': 'http://docs.rack.com/servers/api/ext/pie/v1.0',
823- 'alias': 'RS-PIE',
824- 'updated': '2011-01-22T13:25:27-06:00',
825- 'description': 'Adds the capability to share an image.',
826- 'links': [
827- {
828- 'rel': 'describedby',
829- 'type': 'application/pdf',
830- 'href': 'http://docs.rack.com/servers/api/ext/cs.pdf',
831- },
832- {
833- 'rel': 'describedby',
834- 'type': 'application/vnd.sun.wadl+xml',
835- 'href': 'http://docs.rack.com/servers/api/ext/cs.wadl',
836- },
837- ],
838- },
839- }
840+ data = {'extension': {
841+ 'name': 'ext1',
842+ 'namespace': 'http://docs.rack.com/servers/api/ext/pie/v1.0',
843+ 'alias': 'RS-PIE',
844+ 'updated': '2011-01-22T13:25:27-06:00',
845+ 'description': 'Adds the capability to share an image.',
846+ 'links': [{'rel': 'describedby',
847+ 'type': 'application/pdf',
848+ 'href': 'http://docs.rack.com/servers/api/ext/cs.pdf'},
849+ {'rel': 'describedby',
850+ 'type': 'application/vnd.sun.wadl+xml',
851+ 'href': 'http://docs.rack.com/servers/api/ext/cs.wadl'}]}}
852
853 xml = serializer.serialize(data, 'show')
854 print xml
855@@ -378,48 +367,30 @@
856
857 def test_serialize_extensions(self):
858 serializer = extensions.ExtensionsXMLSerializer()
859- data = {
860- "extensions": [
861- {
862- "name": "Public Image Extension",
863- "namespace": "http://foo.com/api/ext/pie/v1.0",
864- "alias": "RS-PIE",
865- "updated": "2011-01-22T13:25:27-06:00",
866- "description": "Adds the capability to share an image.",
867- "links": [
868- {
869- "rel": "describedby",
870- "type": "application/pdf",
871- "href": "http://foo.com/api/ext/cs-pie.pdf",
872- },
873- {
874- "rel": "describedby",
875- "type": "application/vnd.sun.wadl+xml",
876- "href": "http://foo.com/api/ext/cs-pie.wadl",
877- },
878- ],
879- },
880- {
881- "name": "Cloud Block Storage",
882- "namespace": "http://foo.com/api/ext/cbs/v1.0",
883- "alias": "RS-CBS",
884- "updated": "2011-01-12T11:22:33-06:00",
885- "description": "Allows mounting cloud block storage.",
886- "links": [
887- {
888- "rel": "describedby",
889- "type": "application/pdf",
890- "href": "http://foo.com/api/ext/cs-cbs.pdf",
891- },
892- {
893- "rel": "describedby",
894- "type": "application/vnd.sun.wadl+xml",
895- "href": "http://foo.com/api/ext/cs-cbs.wadl",
896- },
897- ],
898- },
899- ],
900- }
901+ data = {"extensions": [{
902+ "name": "Public Image Extension",
903+ "namespace": "http://foo.com/api/ext/pie/v1.0",
904+ "alias": "RS-PIE",
905+ "updated": "2011-01-22T13:25:27-06:00",
906+ "description": "Adds the capability to share an image.",
907+ "links": [{"rel": "describedby",
908+ "type": "application/pdf",
909+ "type": "application/vnd.sun.wadl+xml",
910+ "href": "http://foo.com/api/ext/cs-pie.pdf"},
911+ {"rel": "describedby",
912+ "type": "application/vnd.sun.wadl+xml",
913+ "href": "http://foo.com/api/ext/cs-pie.wadl"}]},
914+ {"name": "Cloud Block Storage",
915+ "namespace": "http://foo.com/api/ext/cbs/v1.0",
916+ "alias": "RS-CBS",
917+ "updated": "2011-01-12T11:22:33-06:00",
918+ "description": "Allows mounting cloud block storage.",
919+ "links": [{"rel": "describedby",
920+ "type": "application/pdf",
921+ "href": "http://foo.com/api/ext/cs-cbs.pdf"},
922+ {"rel": "describedby",
923+ "type": "application/vnd.sun.wadl+xml",
924+ "href": "http://foo.com/api/ext/cs-cbs.wadl"}]}]}
925
926 xml = serializer.serialize(data, 'index')
927 print xml
928
929=== modified file 'nova/tests/api/openstack/test_limits.py'
930--- nova/tests/api/openstack/test_limits.py 2011-08-04 00:26:37 +0000
931+++ nova/tests/api/openstack/test_limits.py 2011-08-12 19:02:28 +0000
932@@ -915,86 +915,56 @@
933
934 def setUp(self):
935 self.view_builder = views.limits.ViewBuilderV11()
936- self.rate_limits = [
937- {
938- "URI": "*",
939- "regex": ".*",
940- "value": 10,
941- "verb": "POST",
942- "remaining": 2,
943- "unit": "MINUTE",
944- "resetTime": 1311272226,
945- },
946- {
947- "URI": "*/servers",
948- "regex": "^/servers",
949- "value": 50,
950- "verb": "POST",
951- "remaining": 10,
952- "unit": "DAY",
953- "resetTime": 1311272226,
954- },
955- ]
956- self.absolute_limits = {
957- "metadata_items": 1,
958- "injected_files": 5,
959- "injected_file_content_bytes": 5,
960- }
961+ self.rate_limits = [{"URI": "*",
962+ "regex": ".*",
963+ "value": 10,
964+ "verb": "POST",
965+ "remaining": 2,
966+ "unit": "MINUTE",
967+ "resetTime": 1311272226},
968+ {"URI": "*/servers",
969+ "regex": "^/servers",
970+ "value": 50,
971+ "verb": "POST",
972+ "remaining": 10,
973+ "unit": "DAY",
974+ "resetTime": 1311272226}]
975+ self.absolute_limits = {"metadata_items": 1,
976+ "injected_files": 5,
977+ "injected_file_content_bytes": 5}
978
979 def tearDown(self):
980 pass
981
982 def test_build_limits(self):
983- expected_limits = {
984- "limits": {
985- "rate": [
986- {
987- "uri": "*",
988- "regex": ".*",
989- "limit": [
990- {
991- "value": 10,
992- "verb": "POST",
993- "remaining": 2,
994- "unit": "MINUTE",
995- "next-available": "2011-07-21T18:17:06Z",
996- },
997- ]
998- },
999- {
1000- "uri": "*/servers",
1001- "regex": "^/servers",
1002- "limit": [
1003- {
1004- "value": 50,
1005- "verb": "POST",
1006- "remaining": 10,
1007- "unit": "DAY",
1008- "next-available": "2011-07-21T18:17:06Z",
1009- },
1010- ]
1011- },
1012- ],
1013- "absolute": {
1014- "maxServerMeta": 1,
1015- "maxImageMeta": 1,
1016- "maxPersonality": 5,
1017- "maxPersonalitySize": 5
1018- }
1019- }
1020- }
1021+ expected_limits = {"limits": {
1022+ "rate": [{
1023+ "uri": "*",
1024+ "regex": ".*",
1025+ "limit": [{"value": 10,
1026+ "verb": "POST",
1027+ "remaining": 2,
1028+ "unit": "MINUTE",
1029+ "next-available": "2011-07-21T18:17:06Z"}]},
1030+ {"uri": "*/servers",
1031+ "regex": "^/servers",
1032+ "limit": [{"value": 50,
1033+ "verb": "POST",
1034+ "remaining": 10,
1035+ "unit": "DAY",
1036+ "next-available": "2011-07-21T18:17:06Z"}]}],
1037+ "absolute": {"maxServerMeta": 1,
1038+ "maxImageMeta": 1,
1039+ "maxPersonality": 5,
1040+ "maxPersonalitySize": 5}}}
1041
1042 output = self.view_builder.build(self.rate_limits,
1043 self.absolute_limits)
1044 self.assertDictMatch(output, expected_limits)
1045
1046 def test_build_limits_empty_limits(self):
1047- expected_limits = {
1048- "limits": {
1049- "rate": [],
1050- "absolute": {},
1051- }
1052- }
1053+ expected_limits = {"limits": {"rate": [],
1054+ "absolute": {}}}
1055
1056 abs_limits = {}
1057 rate_limits = []
1058@@ -1012,45 +982,28 @@
1059
1060 def test_index(self):
1061 serializer = limits.LimitsXMLSerializer()
1062-
1063- fixture = {
1064- "limits": {
1065- "rate": [
1066- {
1067- "uri": "*",
1068- "regex": ".*",
1069- "limit": [
1070- {
1071- "value": 10,
1072- "verb": "POST",
1073- "remaining": 2,
1074- "unit": "MINUTE",
1075- "next-available": "2011-12-15T22:42:45Z",
1076- },
1077- ]
1078- },
1079- {
1080- "uri": "*/servers",
1081- "regex": "^/servers",
1082- "limit": [
1083- {
1084- "value": 50,
1085- "verb": "POST",
1086- "remaining": 10,
1087- "unit": "DAY",
1088- "next-available": "2011-12-15T22:42:45Z"
1089- },
1090- ]
1091- },
1092- ],
1093- "absolute": {
1094- "maxServerMeta": 1,
1095- "maxImageMeta": 1,
1096- "maxPersonality": 5,
1097- "maxPersonalitySize": 10240
1098- }
1099- }
1100- }
1101+ fixture = {"limits": {
1102+ "rate": [{
1103+ "uri": "*",
1104+ "regex": ".*",
1105+ "limit": [{
1106+ "value": 10,
1107+ "verb": "POST",
1108+ "remaining": 2,
1109+ "unit": "MINUTE",
1110+ "next-available": "2011-12-15T22:42:45Z"}]},
1111+ {"uri": "*/servers",
1112+ "regex": "^/servers",
1113+ "limit": [{
1114+ "value": 50,
1115+ "verb": "POST",
1116+ "remaining": 10,
1117+ "unit": "DAY",
1118+ "next-available": "2011-12-15T22:42:45Z"}]}],
1119+ "absolute": {"maxServerMeta": 1,
1120+ "maxImageMeta": 1,
1121+ "maxPersonality": 5,
1122+ "maxPersonalitySize": 10240}}}
1123
1124 output = serializer.serialize(fixture, 'index')
1125 actual = minidom.parseString(output.replace(" ", ""))
1126@@ -1083,12 +1036,9 @@
1127 def test_index_no_limits(self):
1128 serializer = limits.LimitsXMLSerializer()
1129
1130- fixture = {
1131- "limits": {
1132- "rate": [],
1133- "absolute": {},
1134- }
1135- }
1136+ fixture = {"limits": {
1137+ "rate": [],
1138+ "absolute": {}}}
1139
1140 output = serializer.serialize(fixture, 'index')
1141 actual = minidom.parseString(output.replace(" ", ""))
1142
1143=== modified file 'nova/tests/api/openstack/test_servers.py'
1144--- nova/tests/api/openstack/test_servers.py 2011-08-11 19:30:43 +0000
1145+++ nova/tests/api/openstack/test_servers.py 2011-08-12 19:02:28 +0000
1146@@ -3049,8 +3049,7 @@
1147 address_builder,
1148 flavor_builder,
1149 image_builder,
1150- base_url,
1151- )
1152+ base_url)
1153 return view_builder
1154
1155 def test_build_server(self):
1156
1157=== modified file 'nova/tests/scheduler/test_scheduler.py'
1158--- nova/tests/scheduler/test_scheduler.py 2011-08-09 22:46:57 +0000
1159+++ nova/tests/scheduler/test_scheduler.py 2011-08-12 19:02:28 +0000
1160@@ -643,10 +643,13 @@
1161 self.mox.StubOutWithMock(driver_i, '_live_migration_dest_check')
1162 self.mox.StubOutWithMock(driver_i, '_live_migration_common_check')
1163 driver_i._live_migration_src_check(nocare, nocare)
1164- driver_i._live_migration_dest_check(nocare, nocare, i_ref['host'])
1165- driver_i._live_migration_common_check(nocare, nocare, i_ref['host'])
1166+ driver_i._live_migration_dest_check(nocare, nocare,
1167+ i_ref['host'], False)
1168+ driver_i._live_migration_common_check(nocare, nocare,
1169+ i_ref['host'], False)
1170 self.mox.StubOutWithMock(rpc, 'cast', use_mock_anything=True)
1171- kwargs = {'instance_id': instance_id, 'dest': i_ref['host']}
1172+ kwargs = {'instance_id': instance_id, 'dest': i_ref['host'],
1173+ 'block_migration': False}
1174 rpc.cast(self.context,
1175 db.queue_get_for(nocare, FLAGS.compute_topic, i_ref['host']),
1176 {"method": 'live_migration', "args": kwargs})
1177@@ -654,7 +657,8 @@
1178 self.mox.ReplayAll()
1179 self.scheduler.live_migration(self.context, FLAGS.compute_topic,
1180 instance_id=instance_id,
1181- dest=i_ref['host'])
1182+ dest=i_ref['host'],
1183+ block_migration=False)
1184
1185 i_ref = db.instance_get(self.context, instance_id)
1186 self.assertTrue(i_ref['state_description'] == 'migrating')
1187@@ -735,7 +739,7 @@
1188
1189 self.assertRaises(exception.ComputeServiceUnavailable,
1190 self.scheduler.driver._live_migration_dest_check,
1191- self.context, i_ref, i_ref['host'])
1192+ self.context, i_ref, i_ref['host'], False)
1193
1194 db.instance_destroy(self.context, instance_id)
1195 db.service_destroy(self.context, s_ref['id'])
1196@@ -748,7 +752,7 @@
1197
1198 self.assertRaises(exception.UnableToMigrateToSelf,
1199 self.scheduler.driver._live_migration_dest_check,
1200- self.context, i_ref, i_ref['host'])
1201+ self.context, i_ref, i_ref['host'], False)
1202
1203 db.instance_destroy(self.context, instance_id)
1204 db.service_destroy(self.context, s_ref['id'])
1205@@ -756,15 +760,33 @@
1206 def test_live_migration_dest_check_service_lack_memory(self):
1207 """Confirms exception raises when dest doesn't have enough memory."""
1208 instance_id = self._create_instance()
1209- i_ref = db.instance_get(self.context, instance_id)
1210- s_ref = self._create_compute_service(host='somewhere',
1211- memory_mb_used=12)
1212-
1213- self.assertRaises(exception.MigrationError,
1214- self.scheduler.driver._live_migration_dest_check,
1215- self.context, i_ref, 'somewhere')
1216-
1217- db.instance_destroy(self.context, instance_id)
1218+ instance_id2 = self._create_instance(host='somewhere',
1219+ memory_mb=12)
1220+ i_ref = db.instance_get(self.context, instance_id)
1221+ s_ref = self._create_compute_service(host='somewhere')
1222+
1223+ self.assertRaises(exception.MigrationError,
1224+ self.scheduler.driver._live_migration_dest_check,
1225+ self.context, i_ref, 'somewhere', False)
1226+
1227+ db.instance_destroy(self.context, instance_id)
1228+ db.instance_destroy(self.context, instance_id2)
1229+ db.service_destroy(self.context, s_ref['id'])
1230+
1231+ def test_block_migration_dest_check_service_lack_disk(self):
1232+ """Confirms exception raises when dest doesn't have enough disk."""
1233+ instance_id = self._create_instance()
1234+ instance_id2 = self._create_instance(host='somewhere',
1235+ local_gb=70)
1236+ i_ref = db.instance_get(self.context, instance_id)
1237+ s_ref = self._create_compute_service(host='somewhere')
1238+
1239+ self.assertRaises(exception.MigrationError,
1240+ self.scheduler.driver._live_migration_dest_check,
1241+ self.context, i_ref, 'somewhere', True)
1242+
1243+ db.instance_destroy(self.context, instance_id)
1244+ db.instance_destroy(self.context, instance_id2)
1245 db.service_destroy(self.context, s_ref['id'])
1246
1247 def test_live_migration_dest_check_service_works_correctly(self):
1248@@ -776,7 +798,8 @@
1249
1250 ret = self.scheduler.driver._live_migration_dest_check(self.context,
1251 i_ref,
1252- 'somewhere')
1253+ 'somewhere',
1254+ False)
1255 self.assertTrue(ret is None)
1256 db.instance_destroy(self.context, instance_id)
1257 db.service_destroy(self.context, s_ref['id'])
1258@@ -809,9 +832,10 @@
1259 "args": {'filename': fpath}})
1260
1261 self.mox.ReplayAll()
1262- self.assertRaises(exception.SourceHostUnavailable,
1263+ #self.assertRaises(exception.SourceHostUnavailable,
1264+ self.assertRaises(exception.FileNotFound,
1265 self.scheduler.driver._live_migration_common_check,
1266- self.context, i_ref, dest)
1267+ self.context, i_ref, dest, False)
1268
1269 db.instance_destroy(self.context, instance_id)
1270 db.service_destroy(self.context, s_ref['id'])
1271@@ -835,7 +859,7 @@
1272 self.mox.ReplayAll()
1273 self.assertRaises(exception.InvalidHypervisorType,
1274 self.scheduler.driver._live_migration_common_check,
1275- self.context, i_ref, dest)
1276+ self.context, i_ref, dest, False)
1277
1278 db.instance_destroy(self.context, instance_id)
1279 db.service_destroy(self.context, s_ref['id'])
1280@@ -861,7 +885,7 @@
1281 self.mox.ReplayAll()
1282 self.assertRaises(exception.DestinationHypervisorTooOld,
1283 self.scheduler.driver._live_migration_common_check,
1284- self.context, i_ref, dest)
1285+ self.context, i_ref, dest, False)
1286
1287 db.instance_destroy(self.context, instance_id)
1288 db.service_destroy(self.context, s_ref['id'])
1289@@ -893,7 +917,8 @@
1290 try:
1291 self.scheduler.driver._live_migration_common_check(self.context,
1292 i_ref,
1293- dest)
1294+ dest,
1295+ False)
1296 except rpc.RemoteError, e:
1297 c = (e.message.find(_("doesn't have compatibility to")) >= 0)
1298
1299
1300=== modified file 'nova/tests/test_compute.py'
1301--- nova/tests/test_compute.py 2011-08-09 22:46:57 +0000
1302+++ nova/tests/test_compute.py 2011-08-12 19:02:28 +0000
1303@@ -714,11 +714,15 @@
1304 dbmock.queue_get_for(c, FLAGS.compute_topic, i_ref['host']).\
1305 AndReturn(topic)
1306 rpc.call(c, topic, {"method": "pre_live_migration",
1307- "args": {'instance_id': i_ref['id']}})
1308+ "args": {'instance_id': i_ref['id'],
1309+ 'block_migration': False,
1310+ 'disk': None}})
1311+
1312 self.mox.StubOutWithMock(self.compute.driver, 'live_migration')
1313 self.compute.driver.live_migration(c, i_ref, i_ref['host'],
1314 self.compute.post_live_migration,
1315- self.compute.recover_live_migration)
1316+ self.compute.rollback_live_migration,
1317+ False)
1318
1319 self.compute.db = dbmock
1320 self.mox.ReplayAll()
1321@@ -739,13 +743,18 @@
1322 dbmock.queue_get_for(c, FLAGS.compute_topic, i_ref['host']).\
1323 AndReturn(topic)
1324 rpc.call(c, topic, {"method": "pre_live_migration",
1325- "args": {'instance_id': i_ref['id']}}).\
1326+ "args": {'instance_id': i_ref['id'],
1327+ 'block_migration': False,
1328+ 'disk': None}}).\
1329 AndRaise(rpc.RemoteError('', '', ''))
1330 dbmock.instance_update(c, i_ref['id'], {'state_description': 'running',
1331 'state': power_state.RUNNING,
1332 'host': i_ref['host']})
1333 for v in i_ref['volumes']:
1334 dbmock.volume_update(c, v['id'], {'status': 'in-use'})
1335+ # mock for volume_api.remove_from_compute
1336+ rpc.call(c, topic, {"method": "remove_volume",
1337+ "args": {'volume_id': v['id']}})
1338
1339 self.compute.db = dbmock
1340 self.mox.ReplayAll()
1341@@ -766,7 +775,9 @@
1342 AndReturn(topic)
1343 self.mox.StubOutWithMock(rpc, 'call')
1344 rpc.call(c, topic, {"method": "pre_live_migration",
1345- "args": {'instance_id': i_ref['id']}}).\
1346+ "args": {'instance_id': i_ref['id'],
1347+ 'block_migration': False,
1348+ 'disk': None}}).\
1349 AndRaise(rpc.RemoteError('', '', ''))
1350 dbmock.instance_update(c, i_ref['id'], {'state_description': 'running',
1351 'state': power_state.RUNNING,
1352@@ -791,11 +802,14 @@
1353 dbmock.queue_get_for(c, FLAGS.compute_topic, i_ref['host']).\
1354 AndReturn(topic)
1355 rpc.call(c, topic, {"method": "pre_live_migration",
1356- "args": {'instance_id': i_ref['id']}})
1357+ "args": {'instance_id': i_ref['id'],
1358+ 'block_migration': False,
1359+ 'disk': None}})
1360 self.mox.StubOutWithMock(self.compute.driver, 'live_migration')
1361 self.compute.driver.live_migration(c, i_ref, i_ref['host'],
1362 self.compute.post_live_migration,
1363- self.compute.recover_live_migration)
1364+ self.compute.rollback_live_migration,
1365+ False)
1366
1367 self.compute.db = dbmock
1368 self.mox.ReplayAll()
1369@@ -829,6 +843,10 @@
1370 self.compute.volume_manager.remove_compute_volume(c, v['id'])
1371 self.mox.StubOutWithMock(self.compute.driver, 'unfilter_instance')
1372 self.compute.driver.unfilter_instance(i_ref, [])
1373+ self.mox.StubOutWithMock(rpc, 'call')
1374+ rpc.call(c, db.queue_get_for(c, FLAGS.compute_topic, dest),
1375+ {"method": "post_live_migration_at_destination",
1376+ "args": {'instance_id': i_ref['id'], 'block_migration': False}})
1377
1378 # executing
1379 self.mox.ReplayAll()
1380
1381=== modified file 'nova/tests/test_libvirt.py'
1382--- nova/tests/test_libvirt.py 2011-08-11 12:34:04 +0000
1383+++ nova/tests/test_libvirt.py 2011-08-12 19:02:28 +0000
1384@@ -21,6 +21,7 @@
1385 import re
1386 import shutil
1387 import sys
1388+import tempfile
1389
1390 from xml.etree.ElementTree import fromstring as xml_to_tree
1391 from xml.dom.minidom import parseString as xml_to_dom
1392@@ -690,17 +691,20 @@
1393 return vdmock
1394
1395 self.create_fake_libvirt_mock(lookupByName=fake_lookup)
1396- self.mox.StubOutWithMock(self.compute, "recover_live_migration")
1397- self.compute.recover_live_migration(self.context, instance_ref,
1398- dest='dest')
1399+# self.mox.StubOutWithMock(self.compute, "recover_live_migration")
1400+ self.mox.StubOutWithMock(self.compute, "rollback_live_migration")
1401+# self.compute.recover_live_migration(self.context, instance_ref,
1402+# dest='dest')
1403+ self.compute.rollback_live_migration(self.context, instance_ref,
1404+ 'dest', False)
1405
1406- # Start test
1407+ #start test
1408 self.mox.ReplayAll()
1409 conn = connection.LibvirtConnection(False)
1410 self.assertRaises(libvirt.libvirtError,
1411 conn._live_migration,
1412- self.context, instance_ref, 'dest', '',
1413- self.compute.recover_live_migration)
1414+ self.context, instance_ref, 'dest', False,
1415+ self.compute.rollback_live_migration)
1416
1417 instance_ref = db.instance_get(self.context, instance_ref['id'])
1418 self.assertTrue(instance_ref['state_description'] == 'running')
1419@@ -711,6 +715,95 @@
1420 db.volume_destroy(self.context, volume_ref['id'])
1421 db.instance_destroy(self.context, instance_ref['id'])
1422
1423+ def test_pre_block_migration_works_correctly(self):
1424+ """Confirms pre_block_migration works correctly."""
1425+
1426+ # Skip if non-libvirt environment
1427+ if not self.lazy_load_library_exists():
1428+ return
1429+
1430+ # Replace instances_path since this testcase creates tmpfile
1431+ tmpdir = tempfile.mkdtemp()
1432+ store = FLAGS.instances_path
1433+ FLAGS.instances_path = tmpdir
1434+
1435+ # Test data
1436+ instance_ref = db.instance_create(self.context, self.test_instance)
1437+ dummyjson = '[{"path": "%s/disk", "local_gb": "10G", "type": "raw"}]'
1438+
1439+ # Preparing mocks
1440+ # qemu-img should be mockd since test environment might not have
1441+ # large disk space.
1442+ self.mox.StubOutWithMock(utils, "execute")
1443+ utils.execute('sudo', 'qemu-img', 'create', '-f', 'raw',
1444+ '%s/%s/disk' % (tmpdir, instance_ref.name), '10G')
1445+
1446+ self.mox.ReplayAll()
1447+ conn = connection.LibvirtConnection(False)
1448+ conn.pre_block_migration(self.context, instance_ref,
1449+ dummyjson % tmpdir)
1450+
1451+ self.assertTrue(os.path.exists('%s/%s/' %
1452+ (tmpdir, instance_ref.name)))
1453+
1454+ shutil.rmtree(tmpdir)
1455+ db.instance_destroy(self.context, instance_ref['id'])
1456+ # Restore FLAGS.instances_path
1457+ FLAGS.instances_path = store
1458+
1459+ def test_get_instance_disk_info_works_correctly(self):
1460+ """Confirms pre_block_migration works correctly."""
1461+ # Skip if non-libvirt environment
1462+ if not self.lazy_load_library_exists():
1463+ return
1464+
1465+ # Test data
1466+ instance_ref = db.instance_create(self.context, self.test_instance)
1467+ dummyxml = ("<domain type='kvm'><name>instance-0000000a</name>"
1468+ "<devices>"
1469+ "<disk type='file'><driver name='qemu' type='raw'/>"
1470+ "<source file='/test/disk'/>"
1471+ "<target dev='vda' bus='virtio'/></disk>"
1472+ "<disk type='file'><driver name='qemu' type='qcow2'/>"
1473+ "<source file='/test/disk.local'/>"
1474+ "<target dev='vdb' bus='virtio'/></disk>"
1475+ "</devices></domain>")
1476+
1477+ ret = ("image: /test/disk\nfile format: raw\n"
1478+ "virtual size: 20G (21474836480 bytes)\ndisk size: 3.1G\n")
1479+
1480+ # Preparing mocks
1481+ vdmock = self.mox.CreateMock(libvirt.virDomain)
1482+ self.mox.StubOutWithMock(vdmock, "XMLDesc")
1483+ vdmock.XMLDesc(0).AndReturn(dummyxml)
1484+
1485+ def fake_lookup(instance_name):
1486+ if instance_name == instance_ref.name:
1487+ return vdmock
1488+ self.create_fake_libvirt_mock(lookupByName=fake_lookup)
1489+
1490+ self.mox.StubOutWithMock(os.path, "getsize")
1491+ # based on above testdata, one is raw image, so getsize is mocked.
1492+ os.path.getsize("/test/disk").AndReturn(10 * 1024 * 1024 * 1024)
1493+ # another is qcow image, so qemu-img should be mocked.
1494+ self.mox.StubOutWithMock(utils, "execute")
1495+ utils.execute('sudo', 'qemu-img', 'info', '/test/disk.local').\
1496+ AndReturn((ret, ''))
1497+
1498+ self.mox.ReplayAll()
1499+ conn = connection.LibvirtConnection(False)
1500+ info = conn.get_instance_disk_info(self.context, instance_ref)
1501+ info = utils.loads(info)
1502+
1503+ self.assertTrue(info[0]['type'] == 'raw' and
1504+ info[1]['type'] == 'qcow2' and
1505+ info[0]['path'] == '/test/disk' and
1506+ info[1]['path'] == '/test/disk.local' and
1507+ info[0]['local_gb'] == '10G' and
1508+ info[1]['local_gb'] == '20G')
1509+
1510+ db.instance_destroy(self.context, instance_ref['id'])
1511+
1512 def test_spawn_with_network_info(self):
1513 # Skip if non-libvirt environment
1514 if not self.lazy_load_library_exists():
1515
1516=== modified file 'nova/virt/fake.py'
1517--- nova/virt/fake.py 2011-08-09 22:46:57 +0000
1518+++ nova/virt/fake.py 2011-08-12 19:02:28 +0000
1519@@ -492,7 +492,7 @@
1520 raise NotImplementedError('This method is supported only by libvirt.')
1521
1522 def live_migration(self, context, instance_ref, dest,
1523- post_method, recover_method):
1524+ post_method, recover_method, block_migration=False):
1525 """This method is supported only by libvirt."""
1526 return
1527
1528
1529=== modified file 'nova/virt/libvirt/connection.py'
1530--- nova/virt/libvirt/connection.py 2011-08-11 12:34:04 +0000
1531+++ nova/virt/libvirt/connection.py 2011-08-12 19:02:28 +0000
1532@@ -117,6 +117,10 @@
1533 flags.DEFINE_string('live_migration_flag',
1534 "VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER",
1535 'Define live migration behavior.')
1536+flags.DEFINE_string('block_migration_flag',
1537+ "VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER, "
1538+ "VIR_MIGRATE_NON_SHARED_INC",
1539+ 'Define block migration behavior.')
1540 flags.DEFINE_integer('live_migration_bandwidth', 0,
1541 'Define live migration behavior')
1542 flags.DEFINE_string('qemu_img', 'qemu-img',
1543@@ -727,6 +731,7 @@
1544
1545 If cow is True, it will make a CoW image instead of a copy.
1546 """
1547+
1548 if not os.path.exists(target):
1549 base_dir = os.path.join(FLAGS.instances_path, '_base')
1550 if not os.path.exists(base_dir):
1551@@ -1552,7 +1557,7 @@
1552 time.sleep(1)
1553
1554 def live_migration(self, ctxt, instance_ref, dest,
1555- post_method, recover_method):
1556+ post_method, recover_method, block_migration=False):
1557 """Spawning live_migration operation for distributing high-load.
1558
1559 :params ctxt: security context
1560@@ -1560,20 +1565,22 @@
1561 nova.db.sqlalchemy.models.Instance object
1562 instance object that is migrated.
1563 :params dest: destination host
1564+ :params block_migration: destination host
1565 :params post_method:
1566 post operation method.
1567 expected nova.compute.manager.post_live_migration.
1568 :params recover_method:
1569 recovery method when any exception occurs.
1570 expected nova.compute.manager.recover_live_migration.
1571+ :params block_migration: if true, do block migration.
1572
1573 """
1574
1575 greenthread.spawn(self._live_migration, ctxt, instance_ref, dest,
1576- post_method, recover_method)
1577+ post_method, recover_method, block_migration)
1578
1579- def _live_migration(self, ctxt, instance_ref, dest,
1580- post_method, recover_method):
1581+ def _live_migration(self, ctxt, instance_ref, dest, post_method,
1582+ recover_method, block_migration=False):
1583 """Do live migration.
1584
1585 :params ctxt: security context
1586@@ -1592,27 +1599,21 @@
1587
1588 # Do live migration.
1589 try:
1590- flaglist = FLAGS.live_migration_flag.split(',')
1591+ if block_migration:
1592+ flaglist = FLAGS.block_migration_flag.split(',')
1593+ else:
1594+ flaglist = FLAGS.live_migration_flag.split(',')
1595 flagvals = [getattr(libvirt, x.strip()) for x in flaglist]
1596 logical_sum = reduce(lambda x, y: x | y, flagvals)
1597
1598- if self.read_only:
1599- tmpconn = self._connect(self.libvirt_uri, False)
1600- dom = tmpconn.lookupByName(instance_ref.name)
1601- dom.migrateToURI(FLAGS.live_migration_uri % dest,
1602- logical_sum,
1603- None,
1604- FLAGS.live_migration_bandwidth)
1605- tmpconn.close()
1606- else:
1607- dom = self._conn.lookupByName(instance_ref.name)
1608- dom.migrateToURI(FLAGS.live_migration_uri % dest,
1609- logical_sum,
1610- None,
1611- FLAGS.live_migration_bandwidth)
1612+ dom = self._conn.lookupByName(instance_ref.name)
1613+ dom.migrateToURI(FLAGS.live_migration_uri % dest,
1614+ logical_sum,
1615+ None,
1616+ FLAGS.live_migration_bandwidth)
1617
1618 except Exception:
1619- recover_method(ctxt, instance_ref, dest=dest)
1620+ recover_method(ctxt, instance_ref, dest, block_migration)
1621 raise
1622
1623 # Waiting for completion of live_migration.
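To make the flag handling above concrete: the comma-separated flag string is resolved to libvirt constants with getattr() and OR-ed together before being passed to migrateToURI(). A self-contained sketch; FakeLibvirt stands in for the real libvirt module and its constant values are only illustrative:

    class FakeLibvirt(object):
        VIR_MIGRATE_PEER2PEER = 2
        VIR_MIGRATE_UNDEFINE_SOURCE = 16
        VIR_MIGRATE_NON_SHARED_INC = 128

    libvirt = FakeLibvirt()
    block_migration_flag = ("VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER, "
                            "VIR_MIGRATE_NON_SHARED_INC")

    # Same logic as _live_migration() above.
    flaglist = block_migration_flag.split(',')
    flagvals = [getattr(libvirt, x.strip()) for x in flaglist]
    logical_sum = reduce(lambda x, y: x | y, flagvals)
    print(logical_sum)  # 146 with these stub values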
1624@@ -1624,11 +1625,150 @@
1625 self.get_info(instance_ref.name)['state']
1626 except exception.NotFound:
1627 timer.stop()
1628- post_method(ctxt, instance_ref, dest)
1629+ post_method(ctxt, instance_ref, dest, block_migration)
1630
1631 timer.f = wait_for_live_migration
1632 timer.start(interval=0.5, now=True)
1633
1634+ def pre_block_migration(self, ctxt, instance_ref, disk_info_json):
1635+ """Prepare the destination host for block migration.
1636+
1637+ :params ctxt: security context
1638+ :params instance_ref:
1639+ nova.db.sqlalchemy.models.Instance object
1640+ instance object that is migrated.
1641+ :params disk_info_json:
1642+ json string returned by get_instance_disk_info
1643+
1644+ """
1645+ disk_info = utils.loads(disk_info_json)
1646+
1647+ # make instance directory
1648+ instance_dir = os.path.join(FLAGS.instances_path, instance_ref['name'])
1649+ if os.path.exists(instance_dir):
1650+ raise exception.DestinationDiskExists(path=instance_dir)
1651+ os.mkdir(instance_dir)
1652+
1653+ for info in disk_info:
1654+ base = os.path.basename(info['path'])
1655+ # Get image type and create empty disk image.
1656+ instance_disk = os.path.join(instance_dir, base)
1657+ utils.execute('sudo', 'qemu-img', 'create', '-f', info['type'],
1658+ instance_disk, info['local_gb'])
1659+
1660+ # if the image has a kernel and ramdisk, just download
1661+ # them in the normal way.
1662+ if instance_ref['kernel_id']:
1663+ user = manager.AuthManager().get_user(instance_ref['user_id'])
1664+ project = manager.AuthManager().get_project(
1665+ instance_ref['project_id'])
1666+ self._fetch_image(nova_context.get_admin_context(),
1667+ os.path.join(instance_dir, 'kernel'),
1668+ instance_ref['kernel_id'],
1669+ user,
1670+ project)
1671+ if instance_ref['ramdisk_id']:
1672+ self._fetch_image(nova_context.get_admin_context(),
1673+ os.path.join(instance_dir, 'ramdisk'),
1674+ instance_ref['ramdisk_id'],
1675+ user,
1676+ project)
1677+
1678+ def post_live_migration_at_destination(self, ctxt,
1679+ instance_ref,
1680+ network_info,
1681+ block_migration):
1682+ """Post operation of live migration at destination host.
1683+
1684+ :params ctxt: security context
1685+ :params instance_ref:
1686+ nova.db.sqlalchemy.models.Instance object
1687+ instance object that is migrated.
1688+ :params network_info: instance network information
1689+ :params block_migration: if true, do the post operation of block migration.
1690+ """
1691+ # Define the migrated instance, otherwise suspend/destroy does not work.
1692+ dom_list = self._conn.listDefinedDomains()
1693+ if instance_ref.name not in dom_list:
1694+ instance_dir = os.path.join(FLAGS.instances_path,
1695+ instance_ref.name)
1696+ xml_path = os.path.join(instance_dir, 'libvirt.xml')
1697+ # In case of block migration, destination does not have
1698+ # libvirt.xml
1699+ if not os.path.isfile(xml_path):
1700+ xml = self.to_xml(instance_ref, network_info=network_info)
1701+ f = open(os.path.join(instance_dir, 'libvirt.xml'), 'w+')
1702+ f.write(xml)
1703+ f.close()
1704+ # libvirt.xml should be generated by to_xml(), but libvirt
1705+ # does not accept the to_xml() result, since the uuid is
1706+ # not included in it.
1707+ dom = self._lookup_by_name(instance_ref.name)
1708+ self._conn.defineXML(dom.XMLDesc(0))
1709+
1710+ def get_instance_disk_info(self, ctxt, instance_ref):
1711+ """Get local disk information needed for block migration.
1712+
1713+ :params ctxt: security context
1714+ :params instance_ref:
1715+ nova.db.sqlalchemy.models.Instance object
1716+ instance object that is migrated.
1717+ :return:
1718+ json string in the following format:
1719+ "[{'path':'disk', 'type':'raw', 'local_gb':'10G'},...]"
1720+
1721+ """
1722+ disk_info = []
1723+
1724+ virt_dom = self._lookup_by_name(instance_ref.name)
1725+ xml = virt_dom.XMLDesc(0)
1726+ doc = libxml2.parseDoc(xml)
1727+ disk_nodes = doc.xpathEval('//devices/disk')
1728+ path_nodes = doc.xpathEval('//devices/disk/source')
1729+ driver_nodes = doc.xpathEval('//devices/disk/driver')
1730+
1731+ for cnt, path_node in enumerate(path_nodes):
1732+ disk_type = disk_nodes[cnt].get_properties().getContent()
1733+ path = path_node.get_properties().getContent()
1734+
1735+ if disk_type != 'file':
1736+ LOG.debug(_('skipping %(path)s since it looks like volume') %
1737+ locals())
1738+ continue
1739+
1740+ # In the case of libvirt.xml, the disk type could be obtained
1741+ # with the statement below:
1742+ # -> disk_type = driver_nodes[cnt].get_properties().getContent()
1743+ # but this xml is generated by kvm, so the format is slightly different.
1744+ disk_type = \
1745+ driver_nodes[cnt].get_properties().get_next().getContent()
1746+ if disk_type == 'raw':
1747+ size = int(os.path.getsize(path))
1748+ else:
1749+ out, err = utils.execute('sudo', 'qemu-img', 'info', path)
1750+ size = [i.split('(')[1].split()[0] for i in out.split('\n')
1751+ if i.strip().find('virtual size') >= 0]
1752+ size = int(size[0])
1753+
1754+ # block migration needs an empty image of the same or larger size on
1755+ # the destination host. since qemu-img creates a slightly smaller image
1756+ # depending on the original image size, the size is rounded up here.
1757+ for unit, divisor in [('G', 1024 ** 3), ('M', 1024 ** 2),
1758+ ('K', 1024), ('', 1)]:
1759+ if size / divisor == 0:
1760+ continue
1761+ if size % divisor != 0:
1762+ size = size / divisor + 1
1763+ else:
1764+ size = size / divisor
1765+ size = str(size) + unit
1766+ break
1767+
1768+ disk_info.append({'type': disk_type, 'path': path,
1769+ 'local_gb': size})
1770+
1771+ return utils.dumps(disk_info)
1772+
1773 def unfilter_instance(self, instance_ref, network_info):
1774 """See comments of same method in firewall_driver."""
1775 self.firewall_driver.unfilter_instance(instance_ref,
1776
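For review, the end-to-end handshake is: get_instance_disk_info() runs on the source and serializes path/type/size for each local disk, and pre_block_migration() runs on the destination, deserializes it, and creates empty images of at least that size with qemu-img. A self-contained sketch follows; the path, sizes and the qcow2 type are hypothetical, and _round_size reproduces the rounding loop above:

    import json
    import os

    def _round_size(size):
        # Round a byte count *up* to a whole G/M/K so the destination image
        # is never smaller than the source disk (same loop as above).
        for unit, divisor in [('G', 1024 ** 3), ('M', 1024 ** 2),
                              ('K', 1024), ('', 1)]:
            if size / divisor == 0:
                continue
            if size % divisor != 0:
                return str(size / divisor + 1) + unit
            return str(size / divisor) + unit

    # What the source host would report (cf. get_instance_disk_info):
    disk_info_json = json.dumps([
        {'path': '/var/lib/nova/instances/instance-00000001/disk',
         'type': 'qcow2',
         'local_gb': _round_size(10 * 1024 ** 3 + 1)}])   # -> '11G'

    # What the destination would run for each entry (cf. pre_block_migration):
    for info in json.loads(disk_info_json):
        base = os.path.basename(info['path'])
        print('sudo qemu-img create -f %s <instance_dir>/%s %s'
              % (info['type'], base, info['local_gb']))

Under Python 2 integer division, as in the driver code, an image one byte over 10G is created as 11G on the destination, which is the "same/larger size" guarantee the comment above describes.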
1777=== modified file 'nova/virt/xenapi_conn.py'
1778--- nova/virt/xenapi_conn.py 2011-08-09 22:46:57 +0000
1779+++ nova/virt/xenapi_conn.py 2011-08-12 19:02:28 +0000
1780@@ -314,7 +314,7 @@
1781 return
1782
1783 def live_migration(self, context, instance_ref, dest,
1784- post_method, recover_method):
1785+ post_method, recover_method, block_migration=False):
1786 """This method is supported only by libvirt."""
1787 return
1788