Merge lp:~nttdata/nova/live-migration into lp:~hudson-openstack/nova/trunk

Proposed by Kei Masumoto
Status: Merged
Approved by: Eric Day
Approved revision: 466
Merged at revision: 573
Proposed branch: lp:~nttdata/nova/live-migration
Merge into: lp:~hudson-openstack/nova/trunk
Diff against target: 1380 lines (+956/-17)
19 files modified
.mailmap (+2/-0)
Authors (+2/-0)
bin/nova-manage (+81/-1)
nova/api/ec2/cloud.py (+1/-1)
nova/compute/manager.py (+117/-1)
nova/db/api.py (+30/-0)
nova/db/sqlalchemy/api.py (+64/-0)
nova/db/sqlalchemy/models.py (+24/-2)
nova/network/manager.py (+8/-6)
nova/scheduler/driver.py (+183/-0)
nova/scheduler/manager.py (+48/-0)
nova/service.py (+4/-0)
nova/virt/cpuinfo.xml.template (+9/-0)
nova/virt/fake.py (+32/-0)
nova/virt/libvirt_conn.py (+287/-0)
nova/virt/xenapi_conn.py (+30/-0)
nova/volume/driver.py (+25/-5)
nova/volume/manager.py (+8/-1)
setup.py (+1/-0)
To merge this branch: bzr merge lp:~nttdata/nova/live-migration
Reviewer Review Type Date Requested Status
Soren Hansen (community) Approve
Masanori Itoh (community) Approve
Vish Ishaya (community) Approve
Thierry Carrez (community) ffe Approve
Review via email: mp+44940@code.launchpad.net

Commit message

Risk of regression: this patch doesn't modify existing functionality, but it adds some:
    1. nova.db.sqlalchemy.models.Service (adding columns to the database)
    2. nova.service (nova-compute needs to insert the information defined by 1 above)

So a db migration is necessary for existing users, but it only adds columns.

Description of the change

Adding live migration features.
Please refer to the detailed design at:
<http://wiki.openstack.org/LiveMigration?action=AttachFile&do=view&target=bexar-migration-live-update1.pdf>

Also, usage is described at:
<http://wiki.openstack.org/UsageOfLiveMigration>
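
In short, the commands look like this (instance ID and host name are illustrative; see the usage page above for authoritative details):

    nova-manage instance live_migration i-00000001 HostB
    nova-manage host list
    nova-manage host show HostB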

Changes have been made in:
nova-manage:
    this feature can be used only from nova-manage
nova/scheduler/driver.py
nova/scheduler/manager.py
    pre-checks in schedule_live_migration().
nova/compute/manager.py and nova/virt/libvirt_conn.py
    execute the live migration.
nova/db/sqlalchemy/*
    we added a Host table because live_migration needs to check which host has
    enough resources, so we have to record the total resources each physical
    server has (the capacity check is sketched below).
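
A condensed sketch of that capacity check, taken from has_enough_resource() in the preview diff at the bottom of this page (names as in the diff):

    total_cpu = int(service_ref['vcpus'])
    total_mem = int(service_ref['memory_mb'])
    total_hdd = int(service_ref['local_gb'])
    # Subtract what every instance already on the destination host uses.
    for i_ref in db.instance_get_all_by_host(context, dest):
        total_cpu -= int(i_ref['vcpus'])
        total_mem -= int(i_ref['memory_mb'])
        total_hdd -= int(i_ref['local_gb'])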

Revision history for this message
Soren Hansen (soren) wrote :

Hello.

Please find my comments inline.

2010/12/31 Kei Masumoto <email address hidden>:
> === modified file 'bin/nova-manage'
> --- bin/nova-manage     2010-12-16 22:52:08 +0000
> +++ bin/nova-manage     2010-12-31 04:08:57 +0000
> @@ -79,7 +79,10 @@
>  from nova import quota
>  from nova import utils
>  from nova.auth import manager
> +from nova import rpc
>  from nova.cloudpipe import pipelib
> +from nova.api.ec2 import cloud
> +
>
>
>  FLAGS = flags.FLAGS
> @@ -452,6 +455,86 @@
>                                     int(network_size), int(vlan_start),
>                                     int(vpn_start))
>
> +
> +class InstanceCommands(object):
> +    """Class for mangaging VM instances."""
> +
> +    def live_migration(self, ec2_id, dest):
> +        """live_migration"""
> +
> +        logging.basicConfig()

Todd's newlog branch landed very recently. This changed how we do
logging. Can you update your branch accordingly? Thanks.

> +        ctxt = context.get_admin_context()
> +
> +        try:
> +            internal_id = cloud.ec2_id_to_internal_id(ec2_id)
> +            instance_ref = db.instance_get_by_internal_id(ctxt, internal_id)
> +            instance_id = instance_ref['id']

There's no longer any difference between ec2_id and internal id. This
should simplify this bit of your patch somewhat.

> +        except exception.NotFound as e:
> +            msg = _('instance(%s) is not found')
> +            e.args += (msg % ec2_id,)
> +            raise e

I don't think it's a good idea to add elements to existing Exception
instances' args attribute this way. I'd prefer if you either just raised
a new NotFound exception or simply printed "No such instance: %s" % id
or something like that.
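
A minimal sketch of the suggested alternative (same names as the snippet above; the message text is illustrative):

    try:
        internal_id = cloud.ec2_id_to_internal_id(ec2_id)
        instance_ref = db.instance_get_by_internal_id(ctxt, internal_id)
        instance_id = instance_ref['id']
    except exception.NotFound:
        # Raise a fresh, descriptive exception rather than mutating e.args.
        raise exception.NotFound(_('No such instance: %s') % ec2_id)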

> +        ret = rpc.call(ctxt,
> +                       FLAGS.scheduler_topic,
> +                       {"method": "live_migration",
> +                        "args": {"instance_id": instance_id,
> +                                "dest": dest,
> +                                "topic": FLAGS.compute_topic}})

I don't understand why you pass the compute_topic in the rpc call rather
than letting the scheduler worry about that?

> +        if None != ret:
> +            raise ret

"if ret:" is better.

You can (or at least should) only raise Exceptions. rpc.call never
*returns* an Exception. It may *raise* one, but will never return one.
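
So the caller would wrap the call in try/except instead of testing the return value; a sketch under that assumption:

    try:
        rpc.call(ctxt,
                 FLAGS.scheduler_topic,
                 {"method": "live_migration",
                  "args": {"instance_id": instance_id,
                           "dest": dest,
                           "topic": FLAGS.compute_topic}})
    except rpc.RemoteError, e:
        # rpc.call raises on remote failure; it never returns the exception.
        print 'Live migration of %s failed: %s' % (ec2_id, e)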

> +
> +        print 'Finished all procedure. Check migrating finishes successfully'
> +        print 'check status by using euca-describe-instances.'

Perhaps something like this instead:
"Migration of %s initiated. Check its progress using euca-describe-instances."

> +class HostCommands(object):
> +    """Class for mangaging host(physical nodes)."""
> +
> +
> +    def list(self):
> +        """describe host list."""
> +
> +        # To supress msg: No handlers could be found for logger "amqplib"
> +        logging.basicConfig()
> +
> +        host_refs = db.host_get_all(context.get_admin_context())
> +        for host_ref in host_refs:
> +            print host_ref['name']
> +
> +
> +    def show(self, host):
> +        """describe cpu/memory/hdd info for host."""
> +
> +        # To supress msg: No handl...

Revision history for this message
Soren Hansen (soren) :
review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Soren,

Thanks for reviewing; I'm working on fixes based on your comments.
By the way, I have some questions - please help me work through them one by one.

Regarding the comment on nova/compute/manager.py:

>> self.db.instance_update(context,
>> instance_id,
>> - {'host': self.host})
>> + {'host': self.host, 'launched_on':self.host})
>
>Why pass the same value twice?

You mentioned "'launched_on':self.host" should be removed didn't you?
Before doing so, let me explain.

"host' column on Instance table is to record which host an instance is running on.
Therefore, values are updated if an instance is moved by live migration.
On the other hand, 'launched_on' column that I created is to record which host an instance was launched, and is not updated.
This information is necessary because cpuflag of launched host must have compatibility to the one of live migration destination host.
For this reason, I insert save value twice to different column when an instance is launched.

Please let me know if it does not make sense.

Regards,
Kei Masumoto
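
To make the distinction concrete, the spawn path records the same host name in both columns (taken from the compute manager hunk in the preview diff):

    self.db.instance_update(context,
                            instance_id,
                            {'host': self.host,          # updated on live migration
                             'launched_on': self.host})  # fixed at launch; used for
                                                         # CPU compatibility checks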

Revision history for this message
Soren Hansen (soren) wrote :

2011/1/11 Kei Masumoto <email address hidden>:
> You mentioned that "'launched_on': self.host" should be removed, didn't you?
> Before doing so, let me explain.
>
> The 'host' column on the Instance table records which host an instance is
> running on. Therefore, its value is updated if an instance is moved by
> live migration. On the other hand, the 'launched_on' column that I created
> records which host an instance was launched on, and is never updated. This
> information is necessary because the CPU flags of the launch host must be
> compatible with those of the live migration destination host. For this
> reason, I insert the same value into two different columns when an
> instance is launched.

That makes perfect sense. I somehow thought you were passing it twice in
an RPC call. Please ignore that part of my review, then :)

--
Soren Hansen <email address hidden>
Systems Architect, The Rackspace Cloud
Ubuntu Developer

Revision history for this message
Kei Masumoto (masumotok) wrote :

Soren,

Thank you for your reply.
I still have more questions and explanations.
Please let me know if it does not make sense.

1. Your comment at nova-manage

>> + ret = rpc.call(ctxt,
>> + FLAGS.scheduler_topic,
>> + {"method": "live_migration",
>> + "args": {"instance_id": instance_id,
>> + "dest": dest,
>> + "topic": FLAGS.compute_topic}})
>
> I don't understand why you pass the compute_topic in the rpc call rather than letting the scheduler worry about that?

Although I agree with you, the scheduler requires 4 arguments.
If I remove the "topic" argument, I get the following error:

> Traceback (most recent call last):
> File "bin/nova-manage", line 680, in <module>
> main()
> File "bin/nova-manage", line 672, in main
> fn(*argv)
> File "bin/nova-manage", line 488, in live_migration
> "dest": dest}})
> File "/opt/nova.20110112/nova/rpc.py", line 340, in call
> raise wait_msg.result
> nova.rpc.RemoteError: TypeError _schedule() takes at least 4 non-keyword arguments (3 given)
> [u'Traceback (most recent call last):\n', u' File "/opt/nova.20110112/nova/rpc.py", line 191, in receive
> rval = node_func(context=ctxt, **node_args)\n', u'TypeError: _schedule() takes at lea

I think _schedule() is a common method, and I should be very careful about changing it.
That's why I added the "topic" argument.
But your comment is true - I suggest writing this explanation briefly as a comment.
What do you think?

2. comment at nova.scheduler.driver.schedule_live_migration

>> + try:
>> + instance_ref = db.instance_get(context, instance_id)
>> + ec2_id = instance_ref['hostname']
>> + internal_id = instance_ref['internal_id']
>> + except exception.NotFound, e:
>> + msg = _('Unexpected error: instance is not found')
>> + e.args += ('\n' + msg, )
>> + raise e
>
> Same comment as above wrt exception mangling. I now see your comment above, but I don't understand how it helps to just not show the id at all?

This method (schedule_live_migration) is called when users execute nova-manage.
And users input IDs like i-xxxx. But in this case, users get nova.db.sqlalchemy.models.Instance.id from the resulting exception.
Users may misunderstand: when an id is 10, users get i-a.
(I made a mistake - what I wanted to do is: "Unexpected error: instance(%s) is not found" % ec2_id.)

Regards,
Kei Masumoto

Revision history for this message
Soren Hansen (soren) wrote :

> Thank you for your reply.
> I still have more questions and explanations.
> Please let me know if it does not make sense.

Ok.

By the way, you should merge with trunk again. There are many conflicts.

> 1. Your comment at nova-manage
>>> + ret = rpc.call(ctxt,
>>> + FLAGS.scheduler_topic,
>>> + {"method": "live_migration",
>>> + "args": {"instance_id": instance_id,
>>> + "dest": dest,
>>> + "topic": FLAGS.compute_topic}})
>> I don't understand why you pass the compute_topic in the rpc call rather
>> than letting the scheduler worry about that?
> Although I agree with you, the scheduler requires 4 arguments.

Yes, I see that now. Clearly not your fault. Ok.

> 2. comment at nova.scheduler.driver.schedule_live_migration
>
>>> + try:
>>> + instance_ref = db.instance_get(context, instance_id)
>>> + ec2_id = instance_ref['hostname']
>>> + internal_id = instance_ref['internal_id']
>>> + except exception.NotFound, e:
>>> + msg = _('Unexpected error: instance is not found')
>>> + e.args += ('\n' + msg, )
>>> + raise e
>> Same comment as above wrt exception mangling. I now see your comment
>> above, but I don't understand how it helps to just not show the id at all?
> This method (schedule_live_migration) is called when users execute nova-
> manage.
> And users input IDs like i-xxxx. But in this case, users get
> nova.db.sqlalchemy.models.Instance.id from the resulting exception.
> Users may misunderstand: when an id is 10, users get i-a.
> (I made a mistake - what I wanted to do is: "Unexpected error: instance(%s)
> is not found" % ec2_id.)

Perhaps you could add a more general method that outputs both of the IDs and use that in your exception message.

Nevertheless, mangling existing Exceptions seems error-prone and pointless. I'd raise a new Exception.

Revision history for this message
Kei Masumoto (masumotok) wrote :

Thank you for your reply.
I merged recent trunk yesterday and have almost resolved the conflicts.
I'll let you know in a few hours (I'm testing now).

Thanks again, your comments are very helpful.

Regards,
Kei Masumoto

Revision history for this message
Vish Ishaya (vishvananda) wrote :

On Jan 13, 2011, at 3:33 PM, Kei Masumoto wrote:
>
> --
> https://code.launchpad.net/~nttdata/nova/live-migration/+merge/44940
> You are requested to review the proposed merge of lp:~nttdata/nova/live-migration into lp:nova.
> === modified file 'Authors'
> --- Authors 2011-01-12 19:39:25 +0000
> +++ Authors 2011-01-13 23:32:42 +0000
> @@ -25,12 +25,14 @@
> Josh Kearney <email address hidden>
> Joshua McKenty <email address hidden>
> Justin Santa Barbara <email address hidden>
> +Kei Masumoto <email address hidden>
> Ken Pepple <email address hidden>
> Lorin Hochstein <email address hidden>
> Matt Dietz <email address hidden>
> Michael Gundlach <email address hidden>
> Monsyne Dragon <email address hidden>
> Monty Taylor <email address hidden>
> +Muneyuki Noguchi <email address hidden>
> Paul Voccio <email address hidden>
> Rick Clark <email address hidden>
> Rick Harris <email address hidden>
>
> === modified file 'bin/nova-manage'
> --- bin/nova-manage 2011-01-12 20:12:08 +0000
> +++ bin/nova-manage 2011-01-13 23:32:42 +0000
> @@ -81,8 +81,9 @@
> from nova import quota
> from nova import utils
> from nova.auth import manager
> +from nova import rpc
> from nova.cloudpipe import pipelib
> -
> +from nova.api.ec2 import cloud
>
> logging.basicConfig()
> FLAGS = flags.FLAGS
> @@ -461,6 +462,81 @@
> int(vpn_start))
>
>
> +class InstanceCommands(object):
> + """Class for mangaging VM instances."""
> +
> + def live_migration(self, ec2_id, dest):
> + """live_migration"""
> +
> + if FLAGS.connection_type != 'libvirt':
> + raise exception.Error('Only KVM is supported for now. '
> + 'Sorry.')
> +
> + if FLAGS.volume_driver != 'nova.volume.driver.AOEDriver':
> + raise exception.Error('Only AOEDriver is supported for now. '
> + 'Sorry.')

It seems like the ISCSIDriver would work fine as long as there are no volumes attached to the instance. Can you create a bug to move this check into the call and return an exception later? Also, to match the rest of nova, these strings should be surrounded in _()
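
The final revision implements this as a volume check at the call site (condensed from the nova-manage hunk in the preview diff):

    if FLAGS.volume_driver != 'nova.volume.driver.AOEDriver':
        instance_ref = db.instance_get(ctxt, instance_id)
        if len(instance_ref['volumes']) != 0:
            # ISCSI-attached volumes cannot be migrated yet.
            msg = _('Volumes attached by ISCSIDriver are not supported. Sorry!')
            raise exception.Error(msg)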

> +
> + logging.basicConfig()
> + ctxt = context.get_admin_context()
> + instance_id = cloud.ec2_id_to_id(ec2_id)
> +
> + rpc.call(ctxt,
> + FLAGS.scheduler_topic,
> + {"method": "live_migration",
> + "args": {"instance_id": instance_id,
> + "dest": dest,
> + "topic": FLAGS.compute_topic}})
> +
> + msg = 'Migration of %s initiated. ' % ec2_id
> + msg += 'Check its progress using euca-describe-instances.'
> + print msg
> +
> +
> +class HostCommands(object):
> + """Class for mangaging host(physical nodes)."""
> +
> + def list(self):
> + """describe host list."""
> +
> + # To supress msg: No handlers could be found for logger "amqplib"
> + logging.basicConfig()

No longer necessary as nova-manage calls this in main
> +
> + host_refs = db.host_get_all(context.get_admin_context())
> + for host_ref in host_refs:
> + print host...

Revision history for this message
Thierry Carrez (ttx) wrote :

FFe review: my main issue with late-merging this is that it adds a new table to the database schema, so it also delays stuff like db-migration. It's also a bit large... If the core reviewers can reach review consensus on this really soon, I'll accept it. Otherwise it's probably better to defer this to early Cactus (which is just three weeks away). We already have plenty of features to test in Bexar.

review: Needs Information (ffe)
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Vish,

Thanks for your comment.
I fixed all of them and submitted the merge proposal again here.
There are no conflicts (just Authors).

The point is, I told you before that we would still have a separate Host table in the database,
with a relationship Service -> Host.
But I have reconsidered this matter...
If we keep a separate Host table, although it might be good from a data optimization point of view, the source code becomes quite complex and may grow some bugs.
In addition, the effects spread across the entire source code; we should not do that just before the Bexar release.
Therefore, I removed the Host table.
I would appreciate hearing your opinion, just so we can stay in sync on this matter.

Anyway, please check the diffs.

Thanks in advance.

P.S.
I asked Thierry about the FFE; please let me know if there is anything special I need to do.
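
With the Host table removed, the per-node resources live directly on the Service model (from the models.py hunk in the preview diff):

    # Compute-node-only columns; -1 or None is inserted for other services.
    vcpus = Column(Integer, nullable=False, default=-1)
    memory_mb = Column(Integer, nullable=False, default=-1)
    local_gb = Column(Integer, nullable=False, default=-1)
    hypervisor_type = Column(String(128))
    hypervisor_version = Column(Integer, nullable=False, default=-1)
    cpu_info = Column(String(512))  # JSON: arch, model, topology, features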

Revision history for this message
Vish Ishaya (vishvananda) wrote :

Looking good. Three minor points:

143 - instance_id = floating_ip_ref['fixed_ip']['instance']['ec2_id']
144 + # modified by masumotok
145 + #instance_id = floating_ip_ref['fixed_ip']['instance']['ec2_id']
146 + instance_id = floating_ip_ref['fixed_ip']['instance']['id']

This is a bug and your change is correct, you can remove the comments.

1236
1237 - logging.info('ensuring static filters')
1238 self._ensure_static_filters()
1239

I assume you did this during debugging. You should probably add it back in.

Finally, two new drivers were added to volume/driver.py. Since discover_volume has been modified to take a context parameter, please add _context to the other two volume types as well.
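
A sketch of that change (signature assumed from the comment; the leading underscore marks the parameter as accepted but unused):

    def discover_volume(self, _context, volume):
        """Discover volume on a remote host."""
        ...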

Revision history for this message
Thierry Carrez (ttx) wrote :

You should also add yourself to the Authors file.

Revision history for this message
Thierry Carrez (ttx) wrote :

Looks like we are almost there, and the regression risk is contained. However, since this is blocking the db-migration branch, I'd like this branch merged before the meeting tomorrow, which probably means getting the last fixes on the branch soon.

Please ensure you cover Vish's latest concerns and any upcoming remarks from Soren, add yourself to the Authors file, merge with trunk if necessary, and pass tests and pep8 checks, to try to avoid a Hudson roundtrip.

FFe granted for merging before tomorrow weekly meeting.

review: Approve (ffe)
Revision history for this message
Soren Hansen (soren) wrote :

I get this error when running the test suite:

======================================================================
FAIL: test_authors_up_to_date (nova.tests.test_misc.ProjectTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/soren/src/openstack/nova/nova/nova/tests/test_misc.py", line 53, in test_authors_up_to_date
    '%r not listed in Authors' % missing)
AssertionError: set([u'<root@openstack2-api>', u'Masumoto<email address hidden>']) not listed in Authors

----------------------------------------------------------------------

Please add these lines to .mailmap:
<email address hidden> <root@openstack2-api>
<email address hidden> Masumoto<email address hidden>

Once fixed, I'll vote approve.

I don't think the remaining issues should block this.

review: Needs Fixing
Revision history for this message
Vish Ishaya (vishvananda) wrote :

lgtm

review: Approve
Revision history for this message
Masanori Itoh (itohm) wrote :

Please take care not to contaminate log messages with Japanese characters in the future.
Except that, looks good to me. :)

review: Approve
Revision history for this message
Soren Hansen (soren) wrote :

======================================================================
FAIL: test_authors_up_to_date (nova.tests.test_misc.ProjectTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/soren/src/openstack/nova/nova/nova/tests/test_misc.py", line 53, in test_authors_up_to_date
    '%r not listed in Authors' % missing)
AssertionError: set([u'<root@openstack2-api>', u'Masumoto<email address hidden>']) not listed in Authors

----------------------------------------------------------------------
Ran 286 tests in 94.682s

FAILED (failures=1)

I've already explained *exactly* how to fix this problem.

Before you submit things for review, please run the test suite.

review: Needs Fixing
Revision history for this message
Masanori Itoh (itohm) wrote :

Hello Soren,

It looks like masumotok cannot reproduce the failure above, and
he is working hard to fix the issue as his highest priority.

I suggested checking out his branch from Launchpad again
and trying to reproduce the issue.

Please give him some more time.

Thanks in advance,

Masanori (a.k.a thatsdone)

---
Masanori ITOH R&D Headquarters, NTT DATA CORPORATION
               e-mail: <email address hidden>
               phone : +81-50-5546-2301 (ext: 47-6278)

Revision history for this message
Soren Hansen (soren) wrote :

Alright, let's get this merged. :)

review: Approve
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

The attempt to merge lp:~nttdata/nova/live-migration into lp:nova failed. Below is the output from the failed tests.

nova/service.py:122:48: E231 missing whitespace after ':'
                                        {'host':self.host,
                                               ^
    JCR: Each comma, semicolon or colon should be followed by whitespace.

    Okay: [a, b]
    Okay: (3,)
    Okay: a[1:4]
    Okay: a[:4]
    Okay: a[1:]
    Okay: a[1:4:2]
    E231: ['a','b']
    E231: foo(bar,baz)
nova/scheduler/manager.py:82:20: E201 whitespace after '['
        compute = [ s for s in services if s['topic'] == 'compute']
                   ^
    Avoid extraneous whitespace in the following situations:

    - Immediately inside parentheses, brackets or braces.

    - Immediately before a comma, semicolon, or colon.

    Okay: spam(ham[1], {eggs: 2})
    E201: spam( ham[1], {eggs: 2})
    E201: spam(ham[ 1], {eggs: 2})
    E201: spam(ham[1], { eggs: 2})
    E202: spam(ham[1], {eggs: 2} )
    E202: spam(ham[1 ], {eggs: 2})
    E202: spam(ham[1], {eggs: 2 })

    E203: if x == 4: print x, y; x, y = y , x
    E203: if x == 4: print x, y ; x, y = y, x
    E203: if x == 4 : print x, y; x, y = y, x
nova/scheduler/manager.py:83:30: W291 trailing whitespace
        if 0 == len(compute):
                             ^
    JCR: Trailing whitespace is superfluous.
    FBM: Except when it occurs as part of a blank line (i.e. the line is
         nothing but whitespace). According to Python docs[1] a line with only
         whitespace is considered a blank line, and is to be ignored. However,
         matching a blank line to its indentation level avoids mistakenly
         terminating a multi-line statement (e.g. class declaration) when
         pasting code into the standard Python interpreter.

         [1] http://docs.python.org/reference/lexical_analysis.html#blank-lines

    The warning returned varies on whether the line itself is blank, for easier
    filtering for those who want to indent their blank lines.

    Okay: spam(1)
    W291: spam(1)\s
    W293: class Foo(object):\n \n bang = 12
nova/scheduler/manager.py:93:1: W293 blank line contains whitespace

^
    JCR: Trailing whitespace is superfluous.
    FBM: Except when it occurs as part of a blank line (i.e. the line is
         nothing but whitespace). According to Python docs[1] a line with only
         whitespace is considered a blank line, and is to be ignored. However,
         matching a blank line to its indentation level avoids mistakenly
         terminating a multi-line statement (e.g. class declaration) when
         pasting code into the standard Python interpreter.

         [1] http://docs.python.org/reference/lexical_analysis.html#blank-lines

    The warning returned varies on whether the line itself is blank, for easier
    filtering for those who want to indent their blank lines.

    Okay: spam(1)
    W291: spam(1)\s
    W293: class Foo(object):\n \n bang = 12
nova/scheduler/manager.py:95:9: E303 too many blank lines (2)
        u_resource = {}
        ^
    Separate top-level function and class definitions with two blank lines.

    Method definitions inside a c...


lp:~nttdata/nova/live-migration updated
466. By Kei Masumoto

fixed pep8 error

Preview Diff

1=== modified file '.mailmap'
2--- .mailmap 2011-01-05 23:04:51 +0000
3+++ .mailmap 2011-01-18 16:24:10 +0000
4@@ -16,6 +16,8 @@
5 <jmckenty@gmail.com> <jmckenty@joshua-mckentys-macbook-pro.local>
6 <jmckenty@gmail.com> <joshua.mckenty@nasa.gov>
7 <justin@fathomdb.com> <justinsb@justinsb-desktop>
8+<masumotok@nttdata.co.jp> <root@openstack2-api>
9+<masumotok@nttdata.co.jp> Masumoto<masumotok@nttdata.co.jp>
10 <mordred@inaugust.com> <mordred@hudson>
11 <paul@openstack.org> <pvoccio@castor.local>
12 <paul@openstack.org> <paul.voccio@rackspace.com>
13
14=== modified file 'Authors'
15--- Authors 2011-01-14 04:59:06 +0000
16+++ Authors 2011-01-18 16:24:10 +0000
17@@ -26,6 +26,7 @@
18 Josh Kearney <josh.kearney@rackspace.com>
19 Joshua McKenty <jmckenty@gmail.com>
20 Justin Santa Barbara <justin@fathomdb.com>
21+Kei Masumoto <masumotok@nttdata.co.jp>
22 Ken Pepple <ken.pepple@gmail.com>
23 Koji Iida <iida.koji@lab.ntt.co.jp>
24 Lorin Hochstein <lorin@isi.edu>
25@@ -34,6 +35,7 @@
26 Monsyne Dragon <mdragon@rackspace.com>
27 Monty Taylor <mordred@inaugust.com>
28 MORITA Kazutaka <morita.kazutaka@gmail.com>
29+Muneyuki Noguchi <noguchimn@nttdata.co.jp>
30 Nachi Ueno <ueno.nachi@lab.ntt.co.jp> <openstack@lab.ntt.co.jp> <nati.ueno@gmail.com> <nova@u4>
31 Paul Voccio <paul@openstack.org>
32 Rick Clark <rick@openstack.org>
33
34=== modified file 'bin/nova-manage'
35--- bin/nova-manage 2011-01-13 00:28:35 +0000
36+++ bin/nova-manage 2011-01-18 16:24:10 +0000
37@@ -62,6 +62,7 @@
38
39 import IPy
40
41+
42 # If ../nova/__init__.py exists, add ../ to Python search path, so that
43 # it will override what happens to be installed in /usr/(local/)lib/python...
44 possible_topdir = os.path.normpath(os.path.join(os.path.abspath(sys.argv[0]),
45@@ -81,8 +82,9 @@
46 from nova import quota
47 from nova import utils
48 from nova.auth import manager
49+from nova import rpc
50 from nova.cloudpipe import pipelib
51-
52+from nova.api.ec2 import cloud
53
54 logging.basicConfig()
55 FLAGS = flags.FLAGS
56@@ -465,6 +467,82 @@
57 int(vpn_start), fixed_range_v6)
58
59
60+class InstanceCommands(object):
61+ """Class for mangaging VM instances."""
62+
63+ def live_migration(self, ec2_id, dest):
64+ """live_migration"""
65+
66+ ctxt = context.get_admin_context()
67+ instance_id = cloud.ec2_id_to_id(ec2_id)
68+
69+ if FLAGS.connection_type != 'libvirt':
70+ msg = _('Only KVM is supported for now. Sorry!')
71+ raise exception.Error(msg)
72+
73+ if FLAGS.volume_driver != 'nova.volume.driver.AOEDriver':
74+ instance_ref = db.instance_get(ctxt, instance_id)
75+ if len(instance_ref['volumes']) != 0:
76+ msg = _(("""Volumes attached by ISCSIDriver"""
77+ """ are not supported. Sorry!"""))
78+ raise exception.Error(msg)
79+
80+ rpc.call(ctxt,
81+ FLAGS.scheduler_topic,
82+ {"method": "live_migration",
83+ "args": {"instance_id": instance_id,
84+ "dest": dest,
85+ "topic": FLAGS.compute_topic}})
86+
87+ msg = 'Migration of %s initiated. ' % ec2_id
88+ msg += 'Check its progress using euca-describe-instances.'
89+ print msg
90+
91+
92+class HostCommands(object):
93+ """Class for mangaging host(physical nodes)."""
94+
95+ def list(self):
96+ """describe host list."""
97+
98+ # To supress msg: No handlers could be found for logger "amqplib"
99+ logging.basicConfig()
100+
101+ service_refs = db.service_get_all(context.get_admin_context())
102+ hosts = [h['host'] for h in service_refs]
103+ hosts = list(set(hosts))
104+ for host in hosts:
105+ print host
106+
107+ def show(self, host):
108+ """describe cpu/memory/hdd info for host."""
109+
110+ result = rpc.call(context.get_admin_context(),
111+ FLAGS.scheduler_topic,
112+ {"method": "show_host_resource",
113+ "args": {"host": host}})
114+
115+ # Checking result msg format is necessary, that will have done
116+ # when this feture is included in API.
117+ if type(result) != dict:
118+ print 'Unexpected error occurs'
119+ elif not result['ret']:
120+ print '%s' % result['msg']
121+ else:
122+ cpu = result['phy_resource']['vcpus']
123+ mem = result['phy_resource']['memory_mb']
124+ hdd = result['phy_resource']['local_gb']
125+
126+ print 'HOST\t\tPROJECT\t\tcpu\tmem(mb)\tdisk(gb)'
127+ print '%s\t\t\t%s\t%s\t%s' % (host, cpu, mem, hdd)
128+ for p_id, val in result['usage'].items():
129+ print '%s\t%s\t\t%s\t%s\t%s' % (host,
130+ p_id,
131+ val['vcpus'],
132+ val['memory_mb'],
133+ val['local_gb'])
134+
135+
136 class ServiceCommands(object):
137 """Enable and disable running services"""
138
139@@ -527,6 +605,8 @@
140 ('vpn', VpnCommands),
141 ('floating', FloatingIpCommands),
142 ('network', NetworkCommands),
143+ ('instance', InstanceCommands),
144+ ('host', HostCommands),
145 ('service', ServiceCommands),
146 ('log', LogCommands)]
147
148
149=== modified file 'nova/api/ec2/cloud.py'
150--- nova/api/ec2/cloud.py 2011-01-17 18:05:26 +0000
151+++ nova/api/ec2/cloud.py 2011-01-18 16:24:10 +0000
152@@ -729,7 +729,7 @@
153 ec2_id = None
154 if (floating_ip_ref['fixed_ip']
155 and floating_ip_ref['fixed_ip']['instance']):
156- instance_id = floating_ip_ref['fixed_ip']['instance']['ec2_id']
157+ instance_id = floating_ip_ref['fixed_ip']['instance']['id']
158 ec2_id = id_to_ec2_id(instance_id)
159 address_rv = {'public_ip': address,
160 'instance_id': ec2_id}
161
162=== modified file 'nova/compute/manager.py'
163--- nova/compute/manager.py 2011-01-17 17:16:36 +0000
164+++ nova/compute/manager.py 2011-01-18 16:24:10 +0000
165@@ -41,6 +41,7 @@
166 import socket
167 import functools
168
169+from nova import db
170 from nova import exception
171 from nova import flags
172 from nova import log as logging
173@@ -120,6 +121,35 @@
174 """
175 self.driver.init_host()
176
177+ def update_service(self, ctxt, host, binary):
178+ """Insert compute node specific information to DB."""
179+
180+ try:
181+ service_ref = db.service_get_by_args(ctxt,
182+ host,
183+ binary)
184+ except exception.NotFound:
185+ msg = _(("""Cannot insert compute manager specific info"""
186+ """Because no service record found."""))
187+ raise exception.Invalid(msg)
188+
189+ # Updating host information
190+ vcpu = self.driver.get_vcpu_number()
191+ memory_mb = self.driver.get_memory_mb()
192+ local_gb = self.driver.get_local_gb()
193+ hypervisor = self.driver.get_hypervisor_type()
194+ version = self.driver.get_hypervisor_version()
195+ cpu_info = self.driver.get_cpu_info()
196+
197+ db.service_update(ctxt,
198+ service_ref['id'],
199+ {'vcpus': vcpu,
200+ 'memory_mb': memory_mb,
201+ 'local_gb': local_gb,
202+ 'hypervisor_type': hypervisor,
203+ 'hypervisor_version': version,
204+ 'cpu_info': cpu_info})
205+
206 def _update_state(self, context, instance_id):
207 """Update the state of an instance from the driver info."""
208 # FIXME(ja): include other fields from state?
209@@ -178,9 +208,10 @@
210 raise exception.Error(_("Instance has already been created"))
211 LOG.audit(_("instance %s: starting..."), instance_id,
212 context=context)
213+
214 self.db.instance_update(context,
215 instance_id,
216- {'host': self.host})
217+ {'host': self.host, 'launched_on': self.host})
218
219 self.db.instance_set_state(context,
220 instance_id,
221@@ -560,3 +591,88 @@
222 self.volume_manager.remove_compute_volume(context, volume_id)
223 self.db.volume_detached(context, volume_id)
224 return True
225+
226+ def compare_cpu(self, context, cpu_info):
227+ """ Check the host cpu is compatible to a cpu given by xml."""
228+ return self.driver.compare_cpu(cpu_info)
229+
230+ def pre_live_migration(self, context, instance_id, dest):
231+ """Any preparation for live migration at dst host."""
232+
233+ # Getting instance info
234+ instance_ref = db.instance_get(context, instance_id)
235+ ec2_id = instance_ref['hostname']
236+
237+ # Getting fixed ips
238+ fixed_ip = db.instance_get_fixed_address(context, instance_id)
239+ if not fixed_ip:
240+ msg = _('%s(%s) doesnt have fixed_ip') % (instance_id, ec2_id)
241+ raise exception.NotFound(msg)
242+
243+ # If any volume is mounted, prepare here.
244+ if len(instance_ref['volumes']) == 0:
245+ logging.info(_("%s has no volume.") % ec2_id)
246+ else:
247+ for v in instance_ref['volumes']:
248+ self.volume_manager.setup_compute_volume(context, v['id'])
249+
250+ # Bridge settings
251+ # call this method prior to ensure_filtering_rules_for_instance,
252+ # since bridge is not set up, ensure_filtering_rules_for instance
253+ # fails.
254+ self.network_manager.setup_compute_network(context, instance_id)
255+
256+ # Creating filters to hypervisors and firewalls.
257+ # An example is that nova-instance-instance-xxx,
258+ # which is written to libvirt.xml( check "virsh nwfilter-list )
259+ # On destination host, this nwfilter is necessary.
260+ # In addition, this method is creating filtering rule
261+ # onto destination host.
262+ self.driver.ensure_filtering_rules_for_instance(instance_ref)
263+
264+ def live_migration(self, context, instance_id, dest):
265+ """executes live migration."""
266+
267+ # Get instance for error handling.
268+ instance_ref = db.instance_get(context, instance_id)
269+ ec2_id = instance_ref['hostname']
270+
271+ try:
272+ # Checking volume node is working correctly when any volumes
273+ # are attached to instances.
274+ if len(instance_ref['volumes']) != 0:
275+ rpc.call(context,
276+ FLAGS.volume_topic,
277+ {"method": "check_for_export",
278+ "args": {'instance_id': instance_id}})
279+
280+ # Asking dest host to preparing live migration.
281+ compute_topic = db.queue_get_for(context,
282+ FLAGS.compute_topic,
283+ dest)
284+ rpc.call(context,
285+ compute_topic,
286+ {"method": "pre_live_migration",
287+ "args": {'instance_id': instance_id,
288+ 'dest': dest}})
289+
290+ except Exception, e:
291+ msg = _('Pre live migration for %s failed at %s')
292+ logging.error(msg, ec2_id, dest)
293+ db.instance_set_state(context,
294+ instance_id,
295+ power_state.RUNNING,
296+ 'running')
297+
298+ for v in instance_ref['volumes']:
299+ db.volume_update(context,
300+ v['id'],
301+ {'status': 'in-use'})
302+
303+ # e should be raised. just calling "raise" may raise NotFound.
304+ raise e
305+
306+ # Executing live migration
307+ # live_migration might raises exceptions, but
308+ # nothing must be recovered in this version.
309+ self.driver.live_migration(context, instance_ref, dest)
310
311=== modified file 'nova/db/api.py'
312--- nova/db/api.py 2011-01-14 07:49:41 +0000
313+++ nova/db/api.py 2011-01-18 16:24:10 +0000
314@@ -253,6 +253,10 @@
315 return IMPL.floating_ip_get_by_address(context, address)
316
317
318+def floating_ip_update(context, address, values):
319+ """update floating ip information."""
320+ return IMPL.floating_ip_update(context, address, values)
321+
322 ####################
323
324
325@@ -405,6 +409,32 @@
326 security_group_id)
327
328
329+def instance_get_all_by_host(context, hostname):
330+ """Get instances by host"""
331+ return IMPL.instance_get_all_by_host(context, hostname)
332+
333+
334+def instance_get_vcpu_sum_by_host_and_project(context, hostname, proj_id):
335+ """Get instances.vcpus by host and project"""
336+ return IMPL.instance_get_vcpu_sum_by_host_and_project(context,
337+ hostname,
338+ proj_id)
339+
340+
341+def instance_get_memory_sum_by_host_and_project(context, hostname, proj_id):
342+ """Get amount of memory by host and project """
343+ return IMPL.instance_get_memory_sum_by_host_and_project(context,
344+ hostname,
345+ proj_id)
346+
347+
348+def instance_get_disk_sum_by_host_and_project(context, hostname, proj_id):
349+ """Get total amount of disk by host and project """
350+ return IMPL.instance_get_disk_sum_by_host_and_project(context,
351+ hostname,
352+ proj_id)
353+
354+
355 def instance_action_create(context, values):
356 """Create an instance action from the values dictionary."""
357 return IMPL.instance_action_create(context, values)
358
359=== modified file 'nova/db/sqlalchemy/api.py'
360--- nova/db/sqlalchemy/api.py 2011-01-15 01:54:36 +0000
361+++ nova/db/sqlalchemy/api.py 2011-01-18 16:24:10 +0000
362@@ -495,6 +495,16 @@
363 return result
364
365
366+@require_context
367+def floating_ip_update(context, address, values):
368+ session = get_session()
369+ with session.begin():
370+ floating_ip_ref = floating_ip_get_by_address(context, address, session)
371+ for (key, value) in values.iteritems():
372+ floating_ip_ref[key] = value
373+ floating_ip_ref.save(session=session)
374+
375+
376 ###################
377
378
379@@ -858,6 +868,7 @@
380 return instance_ref
381
382
383+@require_context
384 def instance_add_security_group(context, instance_id, security_group_id):
385 """Associate the given security group with the given instance"""
386 session = get_session()
387@@ -871,6 +882,59 @@
388
389
390 @require_context
391+def instance_get_all_by_host(context, hostname):
392+ session = get_session()
393+ if not session:
394+ session = get_session()
395+
396+ result = session.query(models.Instance).\
397+ filter_by(host=hostname).\
398+ filter_by(deleted=can_read_deleted(context)).\
399+ all()
400+ if not result:
401+ return []
402+ return result
403+
404+
405+@require_context
406+def _instance_get_sum_by_host_and_project(context, column, hostname, proj_id):
407+ session = get_session()
408+
409+ result = session.query(models.Instance).\
410+ filter_by(host=hostname).\
411+ filter_by(project_id=proj_id).\
412+ filter_by(deleted=can_read_deleted(context)).\
413+ value(column)
414+ if not result:
415+ return 0
416+ return result
417+
418+
419+@require_context
420+def instance_get_vcpu_sum_by_host_and_project(context, hostname, proj_id):
421+ return _instance_get_sum_by_host_and_project(context,
422+ 'vcpus',
423+ hostname,
424+ proj_id)
425+
426+
427+@require_context
428+def instance_get_memory_sum_by_host_and_project(context, hostname, proj_id):
429+ return _instance_get_sum_by_host_and_project(context,
430+ 'memory_mb',
431+ hostname,
432+ proj_id)
433+
434+
435+@require_context
436+def instance_get_disk_sum_by_host_and_project(context, hostname, proj_id):
437+ return _instance_get_sum_by_host_and_project(context,
438+ 'local_gb',
439+ hostname,
440+ proj_id)
441+
442+
443+@require_context
444 def instance_action_create(context, values):
445 """Create an instance action from the values dictionary."""
446 action_ref = models.InstanceActions()
447
448=== modified file 'nova/db/sqlalchemy/models.py'
449--- nova/db/sqlalchemy/models.py 2011-01-15 01:48:48 +0000
450+++ nova/db/sqlalchemy/models.py 2011-01-18 16:24:10 +0000
451@@ -150,13 +150,32 @@
452
453 __tablename__ = 'services'
454 id = Column(Integer, primary_key=True)
455- host = Column(String(255)) # , ForeignKey('hosts.id'))
456+ #host_id = Column(Integer, ForeignKey('hosts.id'), nullable=True)
457+ #host = relationship(Host, backref=backref('services'))
458+ host = Column(String(255))
459 binary = Column(String(255))
460 topic = Column(String(255))
461 report_count = Column(Integer, nullable=False, default=0)
462 disabled = Column(Boolean, default=False)
463 availability_zone = Column(String(255), default='nova')
464
465+ # The below items are compute node only.
466+ # -1 or None is inserted for other service.
467+ vcpus = Column(Integer, nullable=False, default=-1)
468+ memory_mb = Column(Integer, nullable=False, default=-1)
469+ local_gb = Column(Integer, nullable=False, default=-1)
470+ hypervisor_type = Column(String(128))
471+ hypervisor_version = Column(Integer, nullable=False, default=-1)
472+ # Note(masumotok): Expected Strings example:
473+ #
474+ # '{"arch":"x86_64", "model":"Nehalem",
475+ # "topology":{"sockets":1, "threads":2, "cores":3},
476+ # features:[ "tdtscp", "xtpr"]}'
477+ #
478+ # Points are "json translatable" and it must have all
479+ # dictionary keys above.
480+ cpu_info = Column(String(512))
481+
482
483 class Certificate(BASE, NovaBase):
484 """Represents a an x509 certificate"""
485@@ -231,6 +250,9 @@
486 display_name = Column(String(255))
487 display_description = Column(String(255))
488
489+ # To remember on which host a instance booted.
490+ # An instance may moved to other host by live migraiton.
491+ launched_on = Column(String(255))
492 locked = Column(Boolean)
493
494 # TODO(vish): see Ewan's email about state improvements, probably
495@@ -588,7 +610,7 @@
496 Volume, ExportDevice, IscsiTarget, FixedIp, FloatingIp,
497 Network, SecurityGroup, SecurityGroupIngressRule,
498 SecurityGroupInstanceAssociation, AuthToken, User,
499- Project, Certificate, ConsolePool, Console) # , Image, Host
500+ Project, Certificate, ConsolePool, Console) # , Host, Image
501 engine = create_engine(FLAGS.sql_connection, echo=False)
502 for model in models:
503 model.metadata.create_all(engine)
504
505=== modified file 'nova/network/manager.py'
506--- nova/network/manager.py 2011-01-15 01:54:36 +0000
507+++ nova/network/manager.py 2011-01-18 16:24:10 +0000
508@@ -159,7 +159,7 @@
509 """Called when this host becomes the host for a network."""
510 raise NotImplementedError()
511
512- def setup_compute_network(self, context, instance_id):
513+ def setup_compute_network(self, context, instance_id, network_ref=None):
514 """Sets up matching network for compute hosts."""
515 raise NotImplementedError()
516
517@@ -320,7 +320,7 @@
518 self.db.fixed_ip_update(context, address, {'allocated': False})
519 self.db.fixed_ip_disassociate(context.elevated(), address)
520
521- def setup_compute_network(self, context, instance_id):
522+ def setup_compute_network(self, context, instance_id, network_ref=None):
523 """Network is created manually."""
524 pass
525
526@@ -395,9 +395,10 @@
527 super(FlatDHCPManager, self).init_host()
528 self.driver.metadata_forward()
529
530- def setup_compute_network(self, context, instance_id):
531+ def setup_compute_network(self, context, instance_id, network_ref=None):
532 """Sets up matching network for compute hosts."""
533- network_ref = db.network_get_by_instance(context, instance_id)
534+ if network_ref is None:
535+ network_ref = db.network_get_by_instance(context, instance_id)
536 self.driver.ensure_bridge(network_ref['bridge'],
537 FLAGS.flat_interface)
538
539@@ -487,9 +488,10 @@
540 """Returns a fixed ip to the pool."""
541 self.db.fixed_ip_update(context, address, {'allocated': False})
542
543- def setup_compute_network(self, context, instance_id):
544+ def setup_compute_network(self, context, instance_id, network_ref=None):
545 """Sets up matching network for compute hosts."""
546- network_ref = db.network_get_by_instance(context, instance_id)
547+ if network_ref is None:
548+ network_ref = db.network_get_by_instance(context, instance_id)
549 self.driver.ensure_vlan_bridge(network_ref['vlan'],
550 network_ref['bridge'])
551
552
553=== modified file 'nova/scheduler/driver.py'
554--- nova/scheduler/driver.py 2010-12-28 20:11:41 +0000
555+++ nova/scheduler/driver.py 2011-01-18 16:24:10 +0000
556@@ -26,6 +26,9 @@
557 from nova import db
558 from nova import exception
559 from nova import flags
560+from nova import log as logging
561+from nova import rpc
562+from nova.compute import power_state
563
564 FLAGS = flags.FLAGS
565 flags.DEFINE_integer('service_down_time', 60,
566@@ -64,3 +67,183 @@
567 def schedule(self, context, topic, *_args, **_kwargs):
568 """Must override at least this method for scheduler to work."""
569 raise NotImplementedError(_("Must implement a fallback schedule"))
570+
571+ def schedule_live_migration(self, context, instance_id, dest):
572+ """ live migration method """
573+
574+ # Whether instance exists and running
575+ instance_ref = db.instance_get(context, instance_id)
576+ ec2_id = instance_ref['hostname']
577+
578+ # Checking instance.
579+ self._live_migration_src_check(context, instance_ref)
580+
581+ # Checking destination host.
582+ self._live_migration_dest_check(context, instance_ref, dest)
583+
584+ # Common checking.
585+ self._live_migration_common_check(context, instance_ref, dest)
586+
587+ # Changing instance_state.
588+ db.instance_set_state(context,
589+ instance_id,
590+ power_state.PAUSED,
591+ 'migrating')
592+
593+ # Changing volume state
594+ for v in instance_ref['volumes']:
595+ db.volume_update(context,
596+ v['id'],
597+ {'status': 'migrating'})
598+
599+ # Return value is necessary to send request to src
600+ # Check _schedule() in detail.
601+ src = instance_ref['host']
602+ return src
603+
604+ def _live_migration_src_check(self, context, instance_ref):
605+ """Live migration check routine (for src host)"""
606+
607+ # Checking instance is running.
608+ if power_state.RUNNING != instance_ref['state'] or \
609+ 'running' != instance_ref['state_description']:
610+ msg = _('Instance(%s) is not running')
611+ ec2_id = instance_ref['hostname']
612+ raise exception.Invalid(msg % ec2_id)
613+
614+ # Checing volume node is running when any volumes are mounted
615+ # to the instance.
616+ if len(instance_ref['volumes']) != 0:
617+ services = db.service_get_all_by_topic(context, 'volume')
618+ if len(services) < 1 or not self.service_is_up(services[0]):
619+ msg = _('volume node is not alive(time synchronize problem?)')
620+ raise exception.Invalid(msg)
621+
622+ # Checking src host is alive.
623+ src = instance_ref['host']
624+ services = db.service_get_all_by_topic(context, 'compute')
625+ services = [service for service in services if service.host == src]
626+ if len(services) < 1 or not self.service_is_up(services[0]):
627+ msg = _('%s is not alive(time synchronize problem?)')
628+ raise exception.Invalid(msg % src)
629+
630+ def _live_migration_dest_check(self, context, instance_ref, dest):
631+ """Live migration check routine (for destination host)"""
632+
633+ # Checking dest exists and compute node.
634+ dservice_refs = db.service_get_all_by_host(context, dest)
635+ if len(dservice_refs) <= 0:
636+ msg = _('%s does not exists.')
637+ raise exception.Invalid(msg % dest)
638+
639+ dservice_ref = dservice_refs[0]
640+ if dservice_ref['topic'] != 'compute':
641+ msg = _('%s must be compute node')
642+ raise exception.Invalid(msg % dest)
643+
644+ # Checking dest host is alive.
645+ if not self.service_is_up(dservice_ref):
646+ msg = _('%s is not alive(time synchronize problem?)')
647+ raise exception.Invalid(msg % dest)
648+
649+ # Checking that dest is not the host where the instance
650+ # is currently running.
651+ src = instance_ref['host']
652+ if dest == src:
653+ ec2_id = instance_ref['hostname']
654+ msg = _('%s is where %s is running now. Choose another host.')
655+ raise exception.Invalid(msg % (dest, ec2_id))
656+
657+ # Checking dest host still has enough capacity.
658+ self.has_enough_resource(context, instance_ref, dest)
659+
660+ def _live_migration_common_check(self, context, instance_ref, dest):
661+ """
662+ Live migration check routine.
663+ The pre-checks below follow
664+ http://wiki.libvirt.org/page/TodoPreMigrationChecks
665+
666+ """
667+
668+ # Checking dest exists.
669+ dservice_refs = db.service_get_all_by_host(context, dest)
670+ if len(dservice_refs) <= 0:
671+ msg = _('%s does not exist.')
672+ raise exception.Invalid(msg % dest)
673+ dservice_ref = dservice_refs[0]
674+
675+ # Checking the original host (where the instance was launched) exists.
676+ orighost = instance_ref['launched_on']
677+ oservice_refs = db.service_get_all_by_host(context, orighost)
678+ if len(oservice_refs) <= 0:
679+ msg = _('%s (where the instance was launched) does not exist.')
680+ raise exception.Invalid(msg % orighost)
681+ oservice_ref = oservice_refs[0]
682+
683+ # Checking the hypervisor type is the same.
684+ otype = oservice_ref['hypervisor_type']
685+ dtype = dservice_ref['hypervisor_type']
686+ if otype != dtype:
687+ msg = _('Different hypervisor type(%s->%s)')
688+ raise exception.Invalid(msg % (otype, dtype))
689+
690+ # Checking hypervisor version.
691+ oversion = oservice_ref['hypervisor_version']
692+ dversion = dservice_ref['hypervisor_version']
693+ if oversion > dversion:
694+ msg = _('Older hypervisor version(%s->%s)')
695+ raise exception.Invalid(msg % (oversion, dversion))
696+
697+ # Checking cpuinfo.
698+ cpu_info = oservice_ref['cpu_info']
699+ try:
700+ rpc.call(context,
701+ db.queue_get_for(context, FLAGS.compute_topic, dest),
702+ {"method": 'compare_cpu',
703+ "args": {'cpu_info': cpu_info}})
704+
705+ except rpc.RemoteError, e:
706+ msg = _(("""%s is not compatible with %s"""
707+ """ (where %s was launched)"""))
708+ ec2_id = instance_ref['hostname']
709+ src = instance_ref['host']
710+ logging.error(msg % (dest, src, ec2_id))
711+ raise e
712+
713+ def has_enough_resource(self, context, instance_ref, dest):
714+ """ Check if destination host has enough resource for live migration"""
715+
716+ # Getting instance information
717+ ec2_id = instance_ref['hostname']
718+ vcpus = instance_ref['vcpus']
719+ mem = instance_ref['memory_mb']
720+ hdd = instance_ref['local_gb']
721+
722+ # Getting host information
723+ service_refs = db.service_get_all_by_host(context, dest)
724+ if len(service_refs) <= 0:
725+ msg = _('%s does not exist.')
726+ raise exception.Invalid(msg % dest)
727+ service_ref = service_refs[0]
728+
729+ total_cpu = int(service_ref['vcpus'])
730+ total_mem = int(service_ref['memory_mb'])
731+ total_hdd = int(service_ref['local_gb'])
732+
733+ instances_ref = db.instance_get_all_by_host(context, dest)
734+ for i_ref in instances_ref:
735+ total_cpu -= int(i_ref['vcpus'])
736+ total_mem -= int(i_ref['memory_mb'])
737+ total_hdd -= int(i_ref['local_gb'])
738+
739+ # Checking the host has enough resources
740+ logging.debug('host(%s) has vcpu:%s mem:%s hdd:%s remaining,' %
741+ (dest, total_cpu, total_mem, total_hdd))
742+ logging.debug('instance(%s) needs vcpu:%s mem:%s hdd:%s,' %
743+ (ec2_id, vcpus, mem, hdd))
744+
745+ if total_cpu <= vcpus or total_mem <= mem or total_hdd <= hdd:
746+ msg = _('%s does not have enough resources for %s') % (dest, ec2_id)
747+ raise exception.NotEmpty(msg)
748+
749+ logging.debug(_('%s has_enough_resource() for %s') % (dest, ec2_id))
750
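
A note on has_enough_resource(): it derives the free capacity on dest by subtracting every instance already recorded there from the host totals in the services table. A minimal standalone sketch of that accounting (the host and instance dicts below are hypothetical stand-ins for the service and instance records):

    # Hypothetical service record for dest and its current instances.
    host = {'vcpus': 16, 'memory_mb': 32768, 'local_gb': 500}
    instances = [{'vcpus': 4, 'memory_mb': 8192, 'local_gb': 80},
                 {'vcpus': 2, 'memory_mb': 4096, 'local_gb': 40}]

    # Remaining capacity = physical totals minus what is already allocated.
    remain = dict((k, int(host[k]) - sum(int(i[k]) for i in instances))
                  for k in ('vcpus', 'memory_mb', 'local_gb'))

    # The migrating instance must fit strictly below the remainder,
    # mirroring the <= comparisons in has_enough_resource().
    migrating = {'vcpus': 2, 'memory_mb': 2048, 'local_gb': 20}
    print(all(remain[k] > migrating[k] for k in remain))  # True here
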
751=== modified file 'nova/scheduler/manager.py'
752--- nova/scheduler/manager.py 2011-01-04 05:23:35 +0000
753+++ nova/scheduler/manager.py 2011-01-18 16:24:10 +0000
754@@ -29,6 +29,7 @@
755 from nova import manager
756 from nova import rpc
757 from nova import utils
758+from nova import exception
759
760 LOG = logging.getLogger('nova.scheduler.manager')
761 FLAGS = flags.FLAGS
762@@ -67,3 +68,50 @@
763 {"method": method,
764 "args": kwargs})
765 LOG.debug(_("Casting to %s %s for %s"), topic, host, method)
766+
767+ # NOTE (masumotok) : This method should be moved to nova.api.ec2.admin.
768+ # Based on the Bexar design summit discussion,
769+ # it is put here just for the Bexar release.
770+ def show_host_resource(self, context, host, *args):
771+ """ show the physical/usage resource given by hosts."""
772+
773+ services = db.service_get_all_by_host(context, host)
774+ if len(services) == 0:
775+ return {'ret': False, 'msg': 'No such Host'}
776+
777+ compute = [s for s in services if s['topic'] == 'compute']
778+ if 0 == len(compute):
779+ service_ref = services[0]
780+ else:
781+ service_ref = compute[0]
782+
783+ # Getting physical resource information
784+ h_resource = {'vcpus': service_ref['vcpus'],
785+ 'memory_mb': service_ref['memory_mb'],
786+ 'local_gb': service_ref['local_gb']}
787+
788+ # Getting usage resource information
789+ u_resource = {}
790+ instances_ref = db.instance_get_all_by_host(context,
791+ service_ref['host'])
792+
793+ if 0 == len(instances_ref):
794+ return {'ret': True, 'phy_resource': h_resource, 'usage': {}}
795+
796+ project_ids = [i['project_id'] for i in instances_ref]
797+ project_ids = list(set(project_ids))
798+ for p_id in project_ids:
799+ vcpus = db.instance_get_vcpu_sum_by_host_and_project(context,
800+ host,
801+ p_id)
802+ mem = db.instance_get_memory_sum_by_host_and_project(context,
803+ host,
804+ p_id)
805+ hdd = db.instance_get_disk_sum_by_host_and_project(context,
806+ host,
807+ p_id)
808+ u_resource[p_id] = {'vcpus': vcpus,
809+ 'memory_mb': mem,
810+ 'local_gb': hdd}
811+
812+ return {'ret': True, 'phy_resource': h_resource, 'usage': u_resource}
813
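
For anyone poking at this from nova-manage: show_host_resource() returns a plain dict over RPC. A hypothetical reply for a host running instances from a single project (the numbers and the 'admin' project id are invented) would look like:

    {'ret': True,
     'phy_resource': {'vcpus': 16, 'memory_mb': 32768, 'local_gb': 500},
     'usage': {'admin': {'vcpus': 4, 'memory_mb': 8192, 'local_gb': 80}}}

while an unknown host yields {'ret': False, 'msg': 'No such Host'}.
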
814=== modified file 'nova/service.py'
815--- nova/service.py 2011-01-11 22:27:36 +0000
816+++ nova/service.py 2011-01-18 16:24:10 +0000
817@@ -80,6 +80,7 @@
818 self.manager.init_host()
819 self.model_disconnected = False
820 ctxt = context.get_admin_context()
821+
822 try:
823 service_ref = db.service_get_by_args(ctxt,
824 self.host,
825@@ -88,6 +89,9 @@
826 except exception.NotFound:
827 self._create_service_ref(ctxt)
828
829+ if 'nova-compute' == self.binary:
830+ self.manager.update_service(ctxt, self.host, self.binary)
831+
832 conn1 = rpc.Connection.instance(new=True)
833 conn2 = rpc.Connection.instance(new=True)
834 if self.report_interval:
835
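
The update_service() call runs only for nova-compute, so the Service row can be populated with the hardware columns this branch adds. A rough sketch of what such a manager method could look like, built on the new virt-driver accessors (an illustration under that assumption, not the branch's exact implementation):

    def update_service(self, ctxt, host, binary):
        """Record compute node resources for the scheduler's pre-checks."""
        service_ref = db.service_get_by_args(ctxt, host, binary)
        db.service_update(ctxt, service_ref['id'],
                          {'vcpus': self.driver.get_vcpu_number(),
                           'memory_mb': self.driver.get_memory_mb(),
                           'local_gb': self.driver.get_local_gb(),
                           'hypervisor_type': self.driver.get_hypervisor_type(),
                           'hypervisor_version':
                               self.driver.get_hypervisor_version(),
                           'cpu_info': self.driver.get_cpu_info()})
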
836=== added file 'nova/virt/cpuinfo.xml.template'
837--- nova/virt/cpuinfo.xml.template 1970-01-01 00:00:00 +0000
838+++ nova/virt/cpuinfo.xml.template 2011-01-18 16:24:10 +0000
839@@ -0,0 +1,9 @@
840+<cpu>
841+ <arch>$arch</arch>
842+ <model>$model</model>
843+ <vendor>$vendor</vendor>
844+ <topology sockets="$topology.sockets" cores="$topology.cores" threads="$topology.threads"/>
845+#for $var in $features
846+ <features name="$var" />
847+#end for
848+</cpu>
849
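
This template is filled from the JSON string produced by get_cpu_info() in libvirt_conn.py. To see the XML that compare_cpu() ends up passing to libvirt, something like the following works (sample cpu values; the searchList usage mirrors compare_cpu()):

    import json
    from Cheetah.Template import Template

    cpu_info = json.loads(
        '{"arch": "x86_64", "model": "Nehalem", "vendor": "Intel", '
        '"topology": {"cores": "2", "threads": "1", "sockets": "4"}, '
        '"features": ["tpr", "xtpr"]}')
    xml = str(Template(open('nova/virt/cpuinfo.xml.template').read(),
                       searchList=cpu_info))
    print(xml)
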
850=== modified file 'nova/virt/fake.py'
851--- nova/virt/fake.py 2011-01-12 19:22:01 +0000
852+++ nova/virt/fake.py 2011-01-18 16:24:10 +0000
853@@ -310,6 +310,38 @@
854 'username': 'fakeuser',
855 'password': 'fakepassword'}
856
857+ def get_cpu_info(self):
858+ """This method is supported only libvirt. """
859+ return
860+
861+ def get_vcpu_number(self):
862+ """This method is supported only libvirt. """
863+ return -1
864+
865+ def get_memory_mb(self):
866+ """This method is supported only libvirt.."""
867+ return -1
868+
869+ def get_local_gb(self):
870+ """This method is supported only libvirt.."""
871+ return -1
872+
873+ def get_hypervisor_type(self):
874+ """This method is supported only libvirt.."""
875+ return
876+
877+ def get_hypervisor_version(self):
878+ """This method is supported only libvirt.."""
879+ return -1
880+
881+ def compare_cpu(self, xml):
882+ """This method is supported only libvirt.."""
883+ raise NotImplementedError('This method is supported only libvirt.')
884+
885+ def live_migration(self, context, instance_ref, dest):
886+ """This method is supported only libvirt.."""
887+ raise NotImplementedError('This method is supported only libvirt.')
888+
889
890 class FakeInstance(object):
891
892
893=== modified file 'nova/virt/libvirt_conn.py'
894--- nova/virt/libvirt_conn.py 2011-01-17 17:16:36 +0000
895+++ nova/virt/libvirt_conn.py 2011-01-18 16:24:10 +0000
896@@ -36,8 +36,11 @@
897
898 """
899
900+import json
901 import os
902 import shutil
903+import re
904+import time
905 import random
906 import subprocess
907 import uuid
908@@ -80,6 +83,9 @@
909 flags.DEFINE_string('libvirt_xml_template',
910 utils.abspath('virt/libvirt.xml.template'),
911 'Libvirt XML Template')
912+flags.DEFINE_string('cpuinfo_xml_template',
913+ utils.abspath('virt/cpuinfo.xml.template'),
914+ 'CpuInfo XML Template (currently used only for live migration)')
915 flags.DEFINE_string('libvirt_type',
916 'kvm',
917 'Libvirt domain type (valid options are: '
918@@ -88,6 +94,16 @@
919 '',
920 'Override the default libvirt URI (which is dependent'
921 ' on libvirt_type)')
922+flags.DEFINE_string('live_migration_uri',
923+ "qemu+tcp://%s/system",
924+ 'Define protocol used by live_migration feature')
925+flags.DEFINE_string('live_migration_flag',
926+ "VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER",
927+ 'Define live migration behavior.')
928+flags.DEFINE_integer('live_migration_bandwidth', 0,
929+ 'Bandwidth limit for live migration, in MiB/s (0 means default)')
930+flags.DEFINE_integer('live_migration_timeout_sec', 10,
931+ 'Timeout in seconds for pre_live_migration to complete.')
932 flags.DEFINE_bool('allow_project_net_traffic',
933 True,
934 'Whether to allow in project network traffic')
935@@ -146,6 +162,7 @@
936 self.libvirt_uri = self.get_uri()
937
938 self.libvirt_xml = open(FLAGS.libvirt_xml_template).read()
939+ self.cpuinfo_xml = open(FLAGS.cpuinfo_xml_template).read()
940 self._wrapped_conn = None
941 self.read_only = read_only
942
943@@ -818,6 +835,74 @@
944
945 return interfaces
946
947+ def get_vcpu_number(self):
948+ """ Get vcpu number of physical computer. """
949+ return self._conn.getMaxVcpus(None)
950+
951+ def get_memory_mb(self):
952+ """Get the memory size of physical computer ."""
953+ meminfo = open('/proc/meminfo').read().split()
954+ idx = meminfo.index('MemTotal:')
955+ # transforming kb to mb.
956+ return int(meminfo[idx + 1]) / 1024
957+
958+ def get_local_gb(self):
959+ """Get the hdd size of physical computer ."""
960+ hddinfo = os.statvfs(FLAGS.instances_path)
961+ return hddinfo.f_bsize * hddinfo.f_blocks / 1024 / 1024 / 1024
962+
963+ def get_hypervisor_type(self):
964+ """ Get hypervisor type """
965+ return self._conn.getType()
966+
967+ def get_hypervisor_version(self):
968+ """ Get hypervisor version """
969+ return self._conn.getVersion()
970+
971+ def get_cpu_info(self):
972+ """ Get cpuinfo information """
973+ xmlstr = self._conn.getCapabilities()
974+ xml = libxml2.parseDoc(xmlstr)
975+ nodes = xml.xpathEval('//cpu')
976+ if len(nodes) != 1:
977+ msg = 'Unexpected xml format: exactly one "cpu" tag expected, but got %d.' \
978+ % len(nodes)
979+ msg += '\n' + xml.serialize()
980+ raise exception.Invalid(_(msg))
981+
982+ arch = xml.xpathEval('//cpu/arch')[0].getContent()
983+ model = xml.xpathEval('//cpu/model')[0].getContent()
984+ vendor = xml.xpathEval('//cpu/vendor')[0].getContent()
985+
986+ topology_node = xml.xpathEval('//cpu/topology')[0].get_properties()
987+ topology = dict()
988+ while topology_node is not None:
989+ name = topology_node.get_name()
990+ topology[name] = topology_node.getContent()
991+ topology_node = topology_node.get_next()
992+
993+ keys = ['cores', 'sockets', 'threads']
994+ tkeys = topology.keys()
995+ if set(tkeys) != set(keys):
996+ msg = _('Invalid xml: topology(%s) must have %s')
997+ raise exception.Invalid(msg % (str(topology), ', '.join(keys)))
998+
999+ feature_nodes = xml.xpathEval('//cpu/feature')
1000+ features = list()
1001+ for feature_node in feature_nodes:
1002+ feature_name = feature_node.get_properties().getContent()
1003+ features.append(feature_name)
1004+
1005+ template = ("""{"arch":"%s", "model":"%s", "vendor":"%s", """
1006+ """"topology":{"cores":"%s", "threads":"%s", """
1007+ """"sockets":"%s"}, "features":[%s]}""")
1008+ c = topology['cores']
1009+ s = topology['sockets']
1010+ t = topology['threads']
1011+ f = ['"%s"' % x for x in features]
1012+ cpu_info = template % (arch, model, vendor, c, s, t, ', '.join(f))
1013+ return cpu_info
1014+
1015 def block_stats(self, instance_name, disk):
1016 """
1017 Note that this function takes an instance name, not an Instance, so
1018@@ -848,6 +933,208 @@
1019 def refresh_security_group_members(self, security_group_id):
1020 self.firewall_driver.refresh_security_group_members(security_group_id)
1021
1022+ def compare_cpu(self, cpu_info):
1023+ """
1024+ Check whether the host cpu is compatible with the cpu given by xml.
1025+ "xml" must be a part of libvirt.openReadonly().getCapabilities().
1026+ The return value follows virCPUCompareResult;
1027+ live migration may proceed only if it is greater than 0.
1028+
1029+ 'http://libvirt.org/html/libvirt-libvirt.html#virCPUCompareResult'
1030+ """
1031+ msg = _('Checking cpu_info: instance was launched on a host with this cpu.\n: %s ')
1032+ LOG.info(msg % cpu_info)
1033+ dic = json.loads(cpu_info)
1034+ xml = str(Template(self.cpuinfo_xml, searchList=dic))
1035+ msg = _('to xml...\n: %s ')
1036+ LOG.info(msg % xml)
1037+
1038+ url = 'http://libvirt.org/html/libvirt-libvirt.html'
1039+ url += '#virCPUCompareResult\n'
1040+ msg = 'CPU is not compatible.\n'
1041+ msg += 'result:%d \n'
1042+ msg += 'Refer to %s'
1043+ msg = _(msg)
1044+
1045+ # If an unknown character exists in xml, libvirt complains
1046+ try:
1047+ ret = self._conn.compareCPU(xml, 0)
1048+ except libvirt.libvirtError, e:
1049+ LOG.error(_('compareCPU() failed: %s') % e)
1050+ raise e
1051+
1052+ if ret <= 0:
1053+ raise exception.Invalid(msg % (ret, url))
1054+
1055+ return
1056+
1057+ def ensure_filtering_rules_for_instance(self, instance_ref):
1058+ """ Setting up inevitable filtering rules on compute node,
1059+ and waiting for its completion.
1060+ To migrate an instance, filtering rules to hypervisors
1061+ and firewalls are inevitable on destination host.
1062+ ( Waiting only for filterling rules to hypervisor,
1063+ since filtering rules to firewall rules can be set faster).
1064+
1065+ Concretely, the below method must be called.
1066+ - setup_basic_filtering (for nova-basic, etc.)
1067+ - prepare_instance_filter(for nova-instance-instance-xxx, etc.)
1068+
1069+ to_xml may have to be called since it defines PROJNET, PROJMASK.
1070+ but libvirt migrates those value through migrateToURI(),
1071+ so , no need to be called.
1072+
1073+ Don't use thread for this method since migration should
1074+ not be started when setting-up filtering rules operations
1075+ are not completed."""
1076+
1077+ # If no instance has ever launched on the destination host,
1078+ # basic filtering must be set up here.
1079+ self.nwfilter.setup_basic_filtering(instance_ref)
1080+ # Mainly setting up nova-instance-instance-xxx.
1081+ self.firewall_driver.prepare_instance_filter(instance_ref)
1082+
1083+ # wait for completion
1084+ timeout_count = range(FLAGS.live_migration_timeout_sec * 2)
1085+ while len(timeout_count) != 0:
1086+ try:
1087+ filter_name = 'nova-instance-%s' % instance_ref.name
1088+ self._conn.nwfilterLookupByName(filter_name)
1089+ break
1090+ except libvirt.libvirtError:
1091+ timeout_count.pop()
1092+ if len(timeout_count) == 0:
1093+ ec2_id = instance_ref['hostname']
1094+ msg = _('Timeout migrating for %s(%s)')
1095+ raise exception.Error(msg % (ec2_id, instance_ref.name))
1096+ time.sleep(0.5)
1097+
1098+ def live_migration(self, context, instance_ref, dest):
1099+ """
1100+ Spawn the live_migration operation in a green thread
1101+ to spread the load.
1102+ """
1103+ greenthread.spawn(self._live_migration, context, instance_ref, dest)
1104+
1105+ def _live_migration(self, context, instance_ref, dest):
1106+ """ Do live migration."""
1107+
1108+ # Do live migration.
1109+ try:
1110+ duri = FLAGS.live_migration_uri % dest
1111+
1112+ flaglist = FLAGS.live_migration_flag.split(',')
1113+ flagvals = [getattr(libvirt, x.strip()) for x in flaglist]
1114+ logical_sum = reduce(lambda x, y: x | y, flagvals)
1115+
1116+ bandwidth = FLAGS.live_migration_bandwidth
1117+
1118+ if self.read_only:
1119+ tmpconn = self._connect(self.libvirt_uri, False)
1120+ dom = tmpconn.lookupByName(instance_ref.name)
1121+ dom.migrateToURI(duri, logical_sum, None, bandwidth)
1122+ tmpconn.close()
1123+ else:
1124+ dom = self._conn.lookupByName(instance_ref.name)
1125+ dom.migrateToURI(duri, logical_sum, None, bandwidth)
1126+
1127+ except Exception, e:
1128+ id = instance_ref['id']
1129+ db.instance_set_state(context, id, power_state.RUNNING, 'running')
1130+ for v in instance_ref['volumes']:
1131+ db.volume_update(context,
1132+ v['id'],
1133+ {'status': 'in-use'})
1134+
1135+ raise e
1136+
1137+ # Waiting for completion of live_migration.
1138+ timer = utils.LoopingCall(f=None)
1139+
1140+ def wait_for_live_migration():
1141+
1142+ try:
1143+ state = self.get_info(instance_ref.name)['state']
1144+ except exception.NotFound:
1145+ timer.stop()
1146+ self._post_live_migration(context, instance_ref, dest)
1147+
1148+ timer.f = wait_for_live_migration
1149+ timer.start(interval=0.5, now=True)
1150+
1151+ def _post_live_migration(self, context, instance_ref, dest):
1152+ """
1153+ Post operations for live migration.
1154+ Mainly, database updating.
1155+ """
1156+ LOG.info(_('Post live migration operation started.'))
1157+ # Detaching volumes.
1158+ # (not necessary in the current version)
1159+
1160+ # Releasing vlan.
1161+ # (not necessary in current implementation?)
1162+
1163+ # Releasing security group ingress rule.
1164+ if FLAGS.firewall_driver == \
1165+ 'nova.virt.libvirt_conn.IptablesFirewallDriver':
1166+ try:
1167+ self.firewall_driver.unfilter_instance(instance_ref)
1168+ except KeyError:
1169+ pass
1170+
1171+ # Database updating.
1172+ ec2_id = instance_ref['hostname']
1173+
1174+ instance_id = instance_ref['id']
1175+ fixed_ip = db.instance_get_fixed_address(context, instance_id)
1176+ # Do not return if fixed_ip is not found; otherwise the
1177+ # instance would never be accessible.
1178+ if fixed_ip is None:
1179+ logging.warn(_('fixed_ip is not found for %s') % ec2_id)
1180+ db.fixed_ip_update(context, fixed_ip, {'host': dest})
1181+ network_ref = db.fixed_ip_get_network(context, fixed_ip)
1182+ db.network_update(context, network_ref['id'], {'host': dest})
1183+
1184+ try:
1185+ floating_ip \
1186+ = db.instance_get_floating_address(context, instance_id)
1187+ # Do not return if floating_ip is not found; otherwise the
1188+ # instance would never be accessible.
1189+ if floating_ip is None:
1190+ logging.error(_('floating_ip is not found for %s') % ec2_id)
1191+ else:
1192+ floating_ip_ref = db.floating_ip_get_by_address(context,
1193+ floating_ip)
1194+ db.floating_ip_update(context,
1195+ floating_ip_ref['address'],
1196+ {'host': dest})
1197+ except exception.NotFound:
1198+ logging.debug(_('%s does not have a floating_ip.') % ec2_id)
1199+ except:
1200+ msg = 'Live migration: Unexpected error: '
1201+ msg += '%s cannot inherit floating ip.' % ec2_id
1202+ logging.error(_(msg))
1203+
1204+ # Restore instance/volume state
1205+ db.instance_update(context,
1206+ instance_id,
1207+ {'state_description': 'running',
1208+ 'state': power_state.RUNNING,
1209+ 'host': dest})
1210+
1211+ for v in instance_ref['volumes']:
1212+ db.volume_update(context,
1213+ v['id'],
1214+ {'status': 'in-use'})
1215+
1216+ logging.info(_('Live migration of %s to %s finished successfully')
1217+ % (ec2_id, dest))
1218+ msg = _(("""Known error: the error below normally occurs.\n"""
1219+ """Just check that the instance was migrated successfully.\n"""
1220+ """libvir: QEMU error : Domain not found: no domain """
1221+ """with matching name."""))
1222+ logging.info(msg)
1223+
1224
1225 class FirewallDriver(object):
1226 def prepare_instance_filter(self, instance):
1227
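
One detail worth calling out in _live_migration(): live_migration_flag is a comma-separated list of libvirt constant names that is reduced into a single OR'ed integer for migrateToURI(). A standalone sketch of that lookup, with the two default constants inlined so it runs without libvirt installed (values taken from libvirt.h):

    # Stand-ins for the real libvirt module's VIR_MIGRATE_* constants.
    class libvirt(object):
        VIR_MIGRATE_PEER2PEER = 2
        VIR_MIGRATE_UNDEFINE_SOURCE = 16

    flag_string = 'VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER'
    flagvals = [getattr(libvirt, x.strip()) for x in flag_string.split(',')]
    logical_sum = reduce(lambda x, y: x | y, flagvals)
    print(logical_sum)  # 18: both behaviors enabled
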
1228=== modified file 'nova/virt/xenapi_conn.py'
1229--- nova/virt/xenapi_conn.py 2011-01-17 17:16:36 +0000
1230+++ nova/virt/xenapi_conn.py 2011-01-18 16:24:10 +0000
1231@@ -209,6 +209,36 @@
1232 'username': FLAGS.xenapi_connection_username,
1233 'password': FLAGS.xenapi_connection_password}
1234
1235+ def get_cpu_info(self):
1236+ """This method is supported only libvirt. """
1237+ return
1238+
1239+ def get_vcpu_number(self):
1240+ """This method is supported only libvirt. """
1241+ return -1
1242+
1243+ def get_memory_mb(self):
1244+ """This method is supported only libvirt.."""
1245+ return -1
1246+
1247+ def get_local_gb(self):
1248+ """This method is supported only libvirt.."""
1249+ return -1
1250+
1251+ def get_hypervisor_type(self):
1252+ """This method is supported only libvirt.."""
1253+ return
1254+
1255+ def get_hypervisor_version(self):
1256+ """This method is supported only libvirt.."""
1257+ return -1
1258+
1259+ def compare_cpu(self, xml):
1260+ raise NotImplementedError('This method is only supported by libvirt.')
1261+
1262+ def live_migration(self, context, instance_ref, dest):
1263+ raise NotImplementedError('This method is only supported by libvirt.')
1264+
1265
1266 class XenAPISession(object):
1267 """The session to invoke XenAPI SDK calls"""
1268
1269=== modified file 'nova/volume/driver.py'
1270--- nova/volume/driver.py 2011-01-13 12:02:14 +0000
1271+++ nova/volume/driver.py 2011-01-18 16:24:10 +0000
1272@@ -122,7 +122,7 @@
1273 """Removes an export for a logical volume."""
1274 raise NotImplementedError()
1275
1276- def discover_volume(self, volume):
1277+ def discover_volume(self, _context, volume):
1278 """Discover volume on a remote host."""
1279 raise NotImplementedError()
1280
1281@@ -184,15 +184,35 @@
1282 self._try_execute("sudo vblade-persist destroy %s %s" %
1283 (shelf_id, blade_id))
1284
1285- def discover_volume(self, _volume):
1286+ def discover_volume(self, context, volume):
1287 """Discover volume on a remote host."""
1288 self._execute("sudo aoe-discover")
1289 self._execute("sudo aoe-stat", check_exit_code=False)
1290+ shelf_id, blade_id = self.db.volume_get_shelf_and_blade(context,
1291+ volume['id'])
1292+ return "/dev/etherd/e%s.%s" % (shelf_id, blade_id)
1293
1294 def undiscover_volume(self, _volume):
1295 """Undiscover volume on a remote host."""
1296 pass
1297
1298+ def check_for_export(self, context, volume_id):
1299+ """Make sure whether volume is exported."""
1300+ (shelf_id,
1301+ blade_id) = self.db.volume_get_shelf_and_blade(context,
1302+ volume_id)
1303+ (out, _err) = self._execute("sudo vblade-persist ls --no-header")
1304+ exists = False
1305+ for line in out.split('\n'):
1306+ param = line.split(' ')
1307+ if len(param) == 6 and param[0] == str(shelf_id) \
1308+ and param[1] == str(blade_id) and param[-1] == "run":
1309+ exists = True
1310+ break
1311+ if not exists:
1312+ logging.warning(_("vblade process for e%s.%s isn't running.")
1313+ % (shelf_id, blade_id))
1314+
1315
1316 class FakeAOEDriver(AOEDriver):
1317 """Logs calls instead of executing."""
1318@@ -276,7 +296,7 @@
1319 iscsi_portal = location.split(",")[0]
1320 return (iscsi_name, iscsi_portal)
1321
1322- def discover_volume(self, volume):
1323+ def discover_volume(self, _context, volume):
1324 """Discover volume on a remote host."""
1325 iscsi_name, iscsi_portal = self._get_name_and_portal(volume['name'],
1326 volume['host'])
1327@@ -364,7 +384,7 @@
1328 """Removes an export for a logical volume"""
1329 pass
1330
1331- def discover_volume(self, volume):
1332+ def discover_volume(self, _context, volume):
1333 """Discover volume on a remote host"""
1334 return "rbd:%s/%s" % (FLAGS.rbd_pool, volume['name'])
1335
1336@@ -413,7 +433,7 @@
1337 """Removes an export for a logical volume"""
1338 pass
1339
1340- def discover_volume(self, volume):
1341+ def discover_volume(self, _context, volume):
1342 """Discover volume on a remote host"""
1343 return "sheepdog:%s" % volume['name']
1344
1345
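
check_for_export() scans the 'sudo vblade-persist ls --no-header' output for a running shelf/blade pair. Assuming each output line has six space-separated fields ending in the run state (the sample line below is invented to match the split logic, not captured from a real host), the parsing reduces to:

    out = '1 2 eth0 /dev/nova-volumes/volume-0000001 pid run\n'
    shelf_id, blade_id = 1, 2
    exists = False
    for line in out.split('\n'):
        param = line.split(' ')
        # Field 0 = shelf, field 1 = blade, last field = state.
        if len(param) == 6 and param[0] == str(shelf_id) \
                and param[1] == str(blade_id) and param[-1] == 'run':
            exists = True
            break
    print(exists)  # True
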
1346=== modified file 'nova/volume/manager.py'
1347--- nova/volume/manager.py 2011-01-04 05:23:35 +0000
1348+++ nova/volume/manager.py 2011-01-18 16:24:10 +0000
1349@@ -138,7 +138,7 @@
1350 if volume_ref['host'] == self.host and FLAGS.use_local_volumes:
1351 path = self.driver.local_path(volume_ref)
1352 else:
1353- path = self.driver.discover_volume(volume_ref)
1354+ path = self.driver.discover_volume(context, volume_ref)
1355 return path
1356
1357 def remove_compute_volume(self, context, volume_id):
1358@@ -149,3 +149,10 @@
1359 return True
1360 else:
1361 self.driver.undiscover_volume(volume_ref)
1362+
1363+ def check_for_export(self, context, instance_id):
1364+ """Make sure whether volume is exported."""
1365+ if FLAGS.volume_driver == 'nova.volume.driver.AOEDriver':
1366+ instance_ref = self.db.instance_get(context, instance_id)
1367+ for v in instance_ref['volumes']:
1368+ self.driver.check_for_export(context, v['id'])
1369
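
The volume manager's check_for_export() is presumably meant to be driven over RPC during the scheduler's pre-checks, since _live_migration_src_check() only verifies that the volume service is alive. A hypothetical call, following the same rpc pattern compare_cpu() uses (the topic and argument names here are assumptions):

    rpc.call(context,
             db.queue_get_for(context, FLAGS.volume_topic,
                              services[0]['host']),
             {"method": "check_for_export",
              "args": {"instance_id": instance_ref['id']}})
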
1370=== modified file 'setup.py'
1371--- setup.py 2011-01-10 19:26:38 +0000
1372+++ setup.py 2011-01-18 16:24:10 +0000
1373@@ -34,6 +34,7 @@
1374 version_file.write(vcsversion)
1375
1376
1377+
1378 class local_BuildDoc(BuildDoc):
1379 def run(self):
1380 for builder in ['html', 'man']: