Merge lp:~nttdata/nova/live-migration into lp:~hudson-openstack/nova/trunk

Proposed by Kei Masumoto
Status: Merged
Approved by: Eric Day
Approved revision: 466
Merged at revision: 573
Proposed branch: lp:~nttdata/nova/live-migration
Merge into: lp:~hudson-openstack/nova/trunk
Diff against target: 1380 lines (+956/-17)
19 files modified
.mailmap (+2/-0)
Authors (+2/-0)
bin/nova-manage (+81/-1)
nova/api/ec2/cloud.py (+1/-1)
nova/compute/manager.py (+117/-1)
nova/db/api.py (+30/-0)
nova/db/sqlalchemy/api.py (+64/-0)
nova/db/sqlalchemy/models.py (+24/-2)
nova/network/manager.py (+8/-6)
nova/scheduler/driver.py (+183/-0)
nova/scheduler/manager.py (+48/-0)
nova/service.py (+4/-0)
nova/virt/cpuinfo.xml.template (+9/-0)
nova/virt/fake.py (+32/-0)
nova/virt/libvirt_conn.py (+287/-0)
nova/virt/xenapi_conn.py (+30/-0)
nova/volume/driver.py (+25/-5)
nova/volume/manager.py (+8/-1)
setup.py (+1/-0)
To merge this branch: bzr merge lp:~nttdata/nova/live-migration
Reviewer Review Type Date Requested Status
Soren Hansen (community) Approve
Masanori Itoh (community) Approve
Vish Ishaya (community) Approve
Thierry Carrez (community) ffe Approve
Review via email: mp+44940@code.launchpad.net

Commit message

Risk of regression: this patch doesn't modify existing functionality, but adds some:
    1. nova.db.sqlalchemy.models.Service (adding columns to the database)
    2. nova.service (nova-compute needs to insert the information defined by 1 above)

So a DB migration is necessary for existing users, but it only adds columns.
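For illustration, a minimal sqlalchemy-migrate style sketch of that schema change (hypothetical: the db-migration branch mentioned below had not landed yet, so this is only what such a migration might look like; the column names are the ones in the preview diff):

    # Hypothetical migration sketch; nova had no migration framework yet.
    import migrate  # noqa: monkey-patches Column with create()/drop()
    from sqlalchemy import Column, Integer, MetaData, String, Table

    def upgrade(migrate_engine):
        meta = MetaData(bind=migrate_engine)
        services = Table('services', meta, autoload=True)
        # Compute-node resource columns; server_default='-1' keeps
        # existing rows valid under NOT NULL.
        for name in ('vcpus', 'memory_mb', 'local_gb', 'hypervisor_version'):
            Column(name, Integer, nullable=False,
                   server_default='-1').create(services)
        Column('hypervisor_type', String(128)).create(services)
        Column('cpu_info', String(512)).create(services)
        instances = Table('instances', meta, autoload=True)
        Column('launched_on', String(255)).create(instances)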

Description of the change

Adding live migration features.
Please refer to the detailed design at:
<http://wiki.openstack.org/LiveMigration?action=AttachFile&do=view&target=bexar-migration-live-update1.pdf>

Also, usage is described at:
<http://wiki.openstack.org/UsageOfLiveMigration>

Changes have been made in:
nova-manage:
    this feature can be used only from nova-manage
nova/scheduler/driver.py
nova/scheduler/manager.py
    pre-checking at schedule_live_migration().
nova/compute/manager.py and nova/virt/libvirt_conn.py
    executing live_migration
nova/db/sqlalchemy/*
    we added a Host table because live_migration needs to check which host has
    enough resources, so we have to record the total resources that each physical server has.
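At a glance, the control flow is (a condensed sketch of the code in the diff below, with error handling omitted):

    # Sketch: nova-manage -> scheduler -> compute (src) -> compute (dest).
    from nova import context, flags, rpc
    from nova.api.ec2 import cloud

    FLAGS = flags.FLAGS

    def live_migration(ec2_id, dest):
        ctxt = context.get_admin_context()
        instance_id = cloud.ec2_id_to_id(ec2_id)
        # The scheduler runs schedule_live_migration() (source, destination
        # and common checks), then forwards live_migration to the source
        # compute node; that node asks dest to run pre_live_migration()
        # before the driver actually moves the instance.
        rpc.call(ctxt,
                 FLAGS.scheduler_topic,
                 {"method": "live_migration",
                  "args": {"instance_id": instance_id,
                           "dest": dest,
                           "topic": FLAGS.compute_topic}})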

Revision history for this message
Soren Hansen (soren) wrote :

Hello.

Please find my comments inline.

2010/12/31 Kei Masumoto <email address hidden>:
> === modified file 'bin/nova-manage'
> --- bin/nova-manage     2010-12-16 22:52:08 +0000
> +++ bin/nova-manage     2010-12-31 04:08:57 +0000
> @@ -79,7 +79,10 @@
>  from nova import quota
>  from nova import utils
>  from nova.auth import manager
> +from nova import rpc
>  from nova.cloudpipe import pipelib
> +from nova.api.ec2 import cloud
> +
>
>
>  FLAGS = flags.FLAGS
> @@ -452,6 +455,86 @@
>                                     int(network_size), int(vlan_start),
>                                     int(vpn_start))
>
> +
> +class InstanceCommands(object):
> +    """Class for mangaging VM instances."""
> +
> +    def live_migration(self, ec2_id, dest):
> +        """live_migration"""
> +
> +        logging.basicConfig()

Todd's newlog branch landed very recently. This changed how we do
logging. Can you update your branch accordingly? Thanks.
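(For reference, the newlog-style pattern, visible later in this branch's updated diff, looks roughly like the sketch below; the module name is illustrative.)

    from nova import log as logging

    LOG = logging.getLogger('nova.compute.manager')

    def example(context, instance_id):
        # A per-module logger replaces logging.basicConfig() at call
        # sites; _() is installed globally by nova's gettext setup.
        LOG.audit(_("instance %s: starting..."), instance_id, context=context)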

> +        ctxt = context.get_admin_context()
> +
> +        try:
> +            internal_id = cloud.ec2_id_to_internal_id(ec2_id)
> +            instance_ref = db.instance_get_by_internal_id(ctxt, internal_id)
> +            instance_id = instance_ref['id']

There's no longer any difference between ec2_id and internal id. This
should simplify this bit of your patch somewhat.

> +        except exception.NotFound as e:
> +            msg = _('instance(%s) is not found')
> +            e.args += (msg % ec2_id,)
> +            raise e

I don't think it's a good idea to add elements to existing Exception
instances' args attribute this way. I'd prefer if you either just raised
a new NotFound exception or simply printed "No such instance: %s" % id
or something like that.
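(A small sketch of the suggested pattern, not the final code:)

    from nova import db
    from nova import exception

    def instance_get_or_complain(ctxt, ec2_id, instance_id):
        try:
            return db.instance_get(ctxt, instance_id)
        except exception.NotFound:
            # Raise a fresh, descriptive exception instead of appending
            # to the caught exception's args tuple.
            raise exception.NotFound(_('No such instance: %s') % ec2_id)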

> +        ret = rpc.call(ctxt,
> +                       FLAGS.scheduler_topic,
> +                       {"method": "live_migration",
> +                        "args": {"instance_id": instance_id,
> +                                "dest": dest,
> +                                "topic": FLAGS.compute_topic}})

I don't understand why you pass the compute_topic in the rpc call rather
than letting the scheduler worry about that?

> +        if None != ret:
> +            raise ret

"if ret:" is better.

You can (or at least should) only raise Exceptions. rpc.call never
*returns* an Exception. It may *raise* one, but will never return one.
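(So the caller can rely on the exception path alone; a sketch reusing the names from the quoted code above:)

    # rpc.call() raises nova.rpc.RemoteError if the remote side fails;
    # it never returns an Exception object, so no return-value check:
    rpc.call(ctxt,
             FLAGS.scheduler_topic,
             {"method": "live_migration",
              "args": {"instance_id": instance_id,
                       "dest": dest,
                       "topic": FLAGS.compute_topic}})
    print 'Migration of %s initiated.' % ec2_id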

> +
> +        print 'Finished all procedure. Check migrating finishes successfully'
> +        print 'check status by using euca-describe-instances.'

Perhaps something like this instead:
"Migration of %s initiated. Check its progress using euca-describe-instances."

> +class HostCommands(object):
> +    """Class for mangaging host(physical nodes)."""
> +
> +
> +    def list(self):
> +        """describe host list."""
> +
> +        # To supress msg: No handlers could be found for logger "amqplib"
> +        logging.basicConfig()
> +
> +        host_refs = db.host_get_all(context.get_admin_context())
> +        for host_ref in host_refs:
> +            print host_ref['name']
> +
> +
> +    def show(self, host):
> +        """describe cpu/memory/hdd info for host."""
> +
> +        # To supress msg: No handl...

Revision history for this message
Soren Hansen (soren) :
review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Soren,

Thanks for reviewing; I'm fixing things based on your comments.
By the way, I have some questions - please give me a hand solving them one by one.

Regarding the comment on nova/compute/manager.py:

>> self.db.instance_update(context,
>> instance_id,
>> - {'host': self.host})
>> + {'host': self.host, 'launched_on':self.host})
>
>Why pass the same value twice?

You meant that "'launched_on': self.host" should be removed, didn't you?
Before doing so, let me explain.

The 'host' column on the Instance table records which host an instance is running on.
Therefore, its value is updated if an instance is moved by live migration.
On the other hand, the 'launched_on' column that I created records which host an instance was launched on, and is never updated.
This information is necessary because the CPU flags of the launch host must be compatible with those of the live migration destination host.
For this reason, I insert the same value into two different columns when an instance is launched.

Please let me know if it does not make sense.

Regards,
Kei Masumoto
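(A sketch of how that immutable launch host can feed the compatibility check; compare_cpu, cpu_info and queue_get_for are names this branch adds, but the exact wiring here is illustrative:)

    from nova import db, flags, rpc

    FLAGS = flags.FLAGS

    def check_cpu_compat(ctxt, instance_ref, dest):
        # 'launched_on' never changes, unlike 'host', so the CPU features
        # the guest has seen since boot are looked up on the launch host.
        launch_host = instance_ref['launched_on']
        service_ref = db.service_get_by_args(ctxt, launch_host,
                                             'nova-compute')
        rpc.call(ctxt,
                 db.queue_get_for(ctxt, FLAGS.compute_topic, dest),
                 {"method": "compare_cpu",
                  "args": {"cpu_info": service_ref['cpu_info']}})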

-----Original Message-----
From: <email address hidden> [mailto:<email address hidden>] On Behalf Of Soren Hansen
Sent: Monday, January 10, 2011 9:47 PM
To: <email address hidden>
Subject: Re: [Merge] lp:~nttdata/nova/live-migration into lp:nova

Hello.

Please find my comments inline.

2010/12/31 Kei Masumoto <email address hidden>:
> === modified file 'bin/nova-manage'
> --- bin/nova-manage 2010-12-16 22:52:08 +0000
> +++ bin/nova-manage 2010-12-31 04:08:57 +0000
> @@ -79,7 +79,10 @@
> from nova import quota
> from nova import utils
> from nova.auth import manager
> +from nova import rpc
> from nova.cloudpipe import pipelib
> +from nova.api.ec2 import cloud
> +
>
>
> FLAGS = flags.FLAGS
> @@ -452,6 +455,86 @@
> int(network_size), int(vlan_start),
> int(vpn_start))
>
> +
> +class InstanceCommands(object):
> + """Class for mangaging VM instances."""
> +
> + def live_migration(self, ec2_id, dest):
> + """live_migration"""
> +
> + logging.basicConfig()

Todd's newlog branch landed very recently. This changed how we do
logging. Can you update your branch accordingly? Thanks.

> + ctxt = context.get_admin_context()
> +
> + try:
> + internal_id = cloud.ec2_id_to_internal_id(ec2_id)
> + instance_ref = db.instance_get_by_internal_id(ctxt, internal_id)
> + instance_id = instance_ref['id']

There's no longer any difference between ec2_id and internal id. This
should simplify this bit of your patch somewhat.

> + except exception.NotFound as e:
> + msg = _('instance(%s) is not found')
> + e.args += (msg % ec2_id,)
> + raise e

I don't think it's a good idea to add elements to existing Exception
instances' args attribute this way. I'd prefer if you either just raised
a new NotFound exception or simply printed "No such instance: %s" % ...

Revision history for this message
Soren Hansen (soren) wrote :

2011/1/11 Kei Masumoto <email address hidden>:
> You meant that "'launched_on': self.host" should be removed, didn't you?
> Before doing so, let me explain.
>
> The 'host' column on the Instance table records which host an instance
> is running on. Therefore, its value is updated if an instance is moved
> by live migration. On the other hand, the 'launched_on' column that I
> created records which host an instance was launched on, and is never
> updated. This information is necessary because the CPU flags of the
> launch host must be compatible with those of the live migration
> destination host. For this reason, I insert the same value into two
> different columns when an instance is launched.

That makes perfect sense. I somehow thought you were passing it twice in
an RPC call. Please ignore that part of my review, then :)

--
Soren Hansen <email address hidden>
Systems Architect, The Rackspace Cloud
Ubuntu Developer

Revision history for this message
Kei Masumoto (masumotok) wrote :

Soren,

Thank you for your reply.
I still have more questions and explanations.
Please let me know if it does not make sense.

1. Your comment at nova-manage

>> + ret = rpc.call(ctxt,
>> + FLAGS.scheduler_topic,
>> + {"method": "live_migration",
>> + "args": {"instance_id": instance_id,
>> + "dest": dest,
>> + "topic": FLAGS.compute_topic}})
>
> I don't understand why you pass the compute_topic in the rpc call rather than letting the scheduler worry about that?

Although I agree with you, the scheduler requires 4 arguments.
If I remove the argument "topic", I get the following error.

> Traceback (most recent call last):
> File "bin/nova-manage", line 680, in <module>
> main()
> File "bin/nova-manage", line 672, in main
> fn(*argv)
> File "bin/nova-manage", line 488, in live_migration
> "dest": dest}})
> File "/opt/nova.20110112/nova/rpc.py", line 340, in call
> raise wait_msg.result
> nova.rpc.RemoteError: TypeError _schedule() takes at least 4 non-keyword arguments (3 given)
> [u'Traceback (most recent call last):\n', u' File "/opt/nova.20110112/nova/rpc.py", line 191, in receive
> rval = node_func(context=ctxt, **node_args)\n', u'TypeError: _schedule() takes at lea

I think _schedule() is a common method, and I should be very careful about changing it.
That's why I added the argument "topic".
But your comment stands - I suggest writing this explanation briefly as a comment.
What do you think?
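(For context, a hedged reconstruction of the entry point from the traceback above; only the argument shape is certain, the body is illustrative:)

    import functools

    from nova import db, manager, rpc

    class SchedulerManager(manager.Manager):
        def __getattr__(self, key):
            # Unknown method names are proxied into _schedule(method, ...).
            return functools.partial(self._schedule, key)

        def _schedule(self, method, context, topic, *args, **kwargs):
            # self + method + context + topic = "at least 4 non-keyword
            # arguments"; a driver hook schedule_<method>() is preferred,
            # and the chosen host's queue is derived from "topic".
            driver_method = 'schedule_%s' % method
            try:
                host = getattr(self.driver, driver_method)(context, *args,
                                                           **kwargs)
            except AttributeError:
                host = self.driver.schedule(context, topic, *args, **kwargs)
            rpc.cast(context,
                     db.queue_get_for(context, topic, host),
                     {"method": method, "args": kwargs})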

2. comment at nova.scheduler.driver.schedule_live_migration

>> + try:
>> + instance_ref = db.instance_get(context, instance_id)
>> + ec2_id = instance_ref['hostname']
>> + internal_id = instance_ref['internal_id']
>> + except exception.NotFound, e:
>> + msg = _('Unexpected error: instance is not found')
>> + e.args += ('\n' + msg, )
>> + raise e
>
> Same comment as above wrt to exception mangling. I now see your comment above, but I don't understand how it helps to just not show the id at all?

This method (schedule_live_migration) is called when users execute nova-manage, and the user inputs something like i-xxxx.
But in this case, users would get nova.db.sqlalchemy.models.Instance.id from the upcoming exception.
Users may misunderstand: when the id is 10, they typed i-a.
(I made a mistake - what I wanted to do is: "Unexpected error: instance(%s) is not found" % ec2_id.)

Regards,
Kei Masumoto
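(The misreading comes from the hex-style EC2 id rendering; a self-contained sketch of the round trip, with the real helpers living in nova.api.ec2.cloud:)

    def id_to_ec2_id(instance_id):
        # Internal DB id 10 renders as 'i-a', so printing the raw DB id
        # in an error message does not match what the user typed.
        return 'i-%x' % instance_id

    def ec2_id_to_id(ec2_id):
        return int(ec2_id[2:], 16)

    assert id_to_ec2_id(10) == 'i-a'
    assert ec2_id_to_id('i-a') == 10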

-----Original Message-----
From: <email address hidden> [mailto:<email address hidden>] On Behalf Of Soren Hansen
Sent: Wednesday, January 12, 2011 6:20 AM
To: <email address hidden>
Subject: Re: [Merge] lp:~nttdata/nova/live-migration into lp:nova

2011/1/11 Kei Masumoto <email address hidden>:
> You mentioned "'launched_on':self.host" should be removed didn't you?
> Before doing so, let me explain.
>
> "host' column on Instance table is to record which host an instance is
> running on. Therefore, values are updated if an instance is moved by
> live migration. On the other hand, ...


Revision history for this message
Soren Hansen (soren) wrote :

> Thank you for your reply.
> I still have more questions and explanations.
> Please let me know if it does not make sense.

Ok.

By the way, you should merge with trunk again. There are many conflicts.

> 1. Your comment at nova-manage
>>> + ret = rpc.call(ctxt,
>>> + FLAGS.scheduler_topic,
>>> + {"method": "live_migration",
>>> + "args": {"instance_id": instance_id,
>>> + "dest": dest,
>>> + "topic": FLAGS.compute_topic}})
>> I don't understand why you pass the compute_topic in the rpc call rather
>> than letting the scheduler worry about that?
> Although I agree with you, the scheduler requires 4 arguments.

Yes, I see that now. Clearly not your fault. Ok.

> 2. comment at nova.scheduler.driver.schedule_live_migration
>
>>> + try:
>>> + instance_ref = db.instance_get(context, instance_id)
>>> + ec2_id = instance_ref['hostname']
>>> + internal_id = instance_ref['internal_id']
>>> + except exception.NotFound, e:
>>> + msg = _('Unexpected error: instance is not found')
>>> + e.args += ('\n' + msg, )
>>> + raise e
>> Same comment as above wrt to exception mangling. I now see your comment
>> above, but I don't understand how it helps to just not show the id at all?
> This method (schedule_live_migration) is called when users execute
> nova-manage, and the user inputs something like i-xxxx. But in this
> case, users would get nova.db.sqlalchemy.models.Instance.id from the
> upcoming exception. Users may misunderstand: when the id is 10, they
> typed i-a. (I made a mistake - what I wanted to do is: "Unexpected
> error: instance(%s) is not found" % ec2_id.)

Perhaps you could add a more general method that outputs both IDs and use that in your exception message.

Nevertheless, mangling existing Exceptions seems error-prone and pointless. I'd raise a new Exception.

Revision history for this message
Kei Masumoto (masumotok) wrote :

Thank you for your reply.
I merged recent trunk yesterday and have resolved almost all the conflicts.
I'll let you know in a few hours (I'm testing now).

Thanks again, your comments are very helpful.

Regards,
Kei Masumoto

-----Original Message-----
From: <email address hidden> [mailto:<email address hidden>] On Behalf Of Soren Hansen
Sent: Thursday, January 13, 2011 8:23 PM
To: <email address hidden>
Subject: Re: RE: [Merge] lp:~nttdata/nova/live-migration into lp:nova

> Thank you for your reply.
> I still have more questions and explanations.
> Please let me know if it does not make sense.

Ok.

By the way, you should merge with trunk again. There are many conflicts.

> 1. Your comment at nova-manage
>>> + ret = rpc.call(ctxt,
>>> + FLAGS.scheduler_topic,
>>> + {"method": "live_migration",
>>> + "args": {"instance_id": instance_id,
>>> + "dest": dest,
>>> + "topic": FLAGS.compute_topic}})
>> I don't understand why you pass the compute_topic in the rpc call rather
>> than letting the scheduler worry about that?
> Although I agree with you, the scheduler requires 4 arguments.

Yes, I see that now. Clearly not your fault. Ok.

> 2. comment at nova.scheduler.driver.schedule_live_migration
>
>>> + try:
>>> + instance_ref = db.instance_get(context, instance_id)
>>> + ec2_id = instance_ref['hostname']
>>> + internal_id = instance_ref['internal_id']
>>> + except exception.NotFound, e:
>>> + msg = _('Unexpected error: instance is not found')
>>> + e.args += ('\n' + msg, )
>>> + raise e
>> Same comment as above wrt to exception mangling. I now see your comment
>> above, but I don't understand how it helps to just not show the id at all?
> This method (schedule_live_migration) is called when users execute
> nova-manage, and the user inputs something like i-xxxx. But in this
> case, users would get nova.db.sqlalchemy.models.Instance.id from the
> upcoming exception. Users may misunderstand: when the id is 10, they
> typed i-a. (I made a mistake - what I wanted to do is: "Unexpected
> error: instance(%s) is not found" % ec2_id.)

Perhaps you could add a more general method that outputs both IDs and use that in your exception message.

Nevertheless, mangling existing Exceptions seems error-prone and pointless. I'd raise a new Exception.

--
https://code.launchpad.net/~nttdata/nova/live-migration/+merge/44940
Your team NTT DATA is subscribed to branch lp:~nttdata/nova/live-migration.

Revision history for this message
Vish Ishaya (vishvananda) wrote :

On Jan 13, 2011, at 3:33 PM, Kei Masumoto wrote:
>
> --
> https://code.launchpad.net/~nttdata/nova/live-migration/+merge/44940
> You are requested to review the proposed merge of lp:~nttdata/nova/live-migration into lp:nova.
> === modified file 'Authors'
> --- Authors 2011-01-12 19:39:25 +0000
> +++ Authors 2011-01-13 23:32:42 +0000
> @@ -25,12 +25,14 @@
> Josh Kearney <email address hidden>
> Joshua McKenty <email address hidden>
> Justin Santa Barbara <email address hidden>
> +Kei Masumoto <email address hidden>
> Ken Pepple <email address hidden>
> Lorin Hochstein <email address hidden>
> Matt Dietz <email address hidden>
> Michael Gundlach <email address hidden>
> Monsyne Dragon <email address hidden>
> Monty Taylor <email address hidden>
> +Muneyuki Noguchi <email address hidden>
> Paul Voccio <email address hidden>
> Rick Clark <email address hidden>
> Rick Harris <email address hidden>
>
> === modified file 'bin/nova-manage'
> --- bin/nova-manage 2011-01-12 20:12:08 +0000
> +++ bin/nova-manage 2011-01-13 23:32:42 +0000
> @@ -81,8 +81,9 @@
> from nova import quota
> from nova import utils
> from nova.auth import manager
> +from nova import rpc
> from nova.cloudpipe import pipelib
> -
> +from nova.api.ec2 import cloud
>
> logging.basicConfig()
> FLAGS = flags.FLAGS
> @@ -461,6 +462,81 @@
> int(vpn_start))
>
>
> +class InstanceCommands(object):
> + """Class for mangaging VM instances."""
> +
> + def live_migration(self, ec2_id, dest):
> + """live_migration"""
> +
> + if FLAGS.connection_type != 'libvirt':
> + raise exception.Error('Only KVM is supported for now. '
> + 'Sorry.')
> +
> + if FLAGS.volume_driver != 'nova.volume.driver.AOEDriver':
> + raise exception.Error('Only AOEDriver is supported for now. '
> + 'Sorry.')

It seems like the iSCSI driver would work fine as long as there are no volumes attached to the instance. Can you create a bug to fix this, so the check happens in the call and an exception is returned later? Also, to match the rest of nova, these strings should be surrounded in _().

> +
> + logging.basicConfig()
> + ctxt = context.get_admin_context()
> + instance_id = cloud.ec2_id_to_id(ec2_id)
> +
> + rpc.call(ctxt,
> + FLAGS.scheduler_topic,
> + {"method": "live_migration",
> + "args": {"instance_id": instance_id,
> + "dest": dest,
> + "topic": FLAGS.compute_topic}})
> +
> + msg = 'Migration of %s initiated. ' % ec2_id
> + msg += 'Check its progress using euca-describe-instances.'
> + print msg
> +
> +
> +class HostCommands(object):
> + """Class for mangaging host(physical nodes)."""
> +
> + def list(self):
> + """describe host list."""
> +
> + # To supress msg: No handlers could be found for logger "amqplib"
> + logging.basicConfig()

No longer necessary as nova-manage calls this in main
> +
> + host_refs = db.host_get_all(context.get_admin_context())
> + for host_ref in host_refs:
> + print host...

Revision history for this message
Thierry Carrez (ttx) wrote :

FFe review: my main issue with late-merging this is that it adds a new table to the database schema, so it also delays stuff like db-migration. It's also a bit large... If the core reviewers can reach review consensus on this really soon, I'll accept it. Otherwise it's probably better to defer this to early Cactus (which is just three weeks away). We already have plenty of features to test in Bexar.

review: Needs Information (ffe)
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Vish,

Thanks for your comment.
I fixed all of them and have submitted the merge proposal again here.
There are no conflicts (just Authors).

The point is, I told you we would still have a separate Host table in the database,
with a relationship Service -> Host.
But I have reconsidered this matter...
If we have a separate Host table, although it might be good from a data-optimization point of view, the source code becomes entirely too complex and may cause some bugs.
In addition, the effect spreads across the entire source code. We should not do that just before the Bexar release.
Therefore, I removed the Host table.
I would appreciate it if you told me your opinion, because I would like to synchronize our opinions on this matter.
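(The practical upshot appears in HostCommands.list in the final diff: with the Host table gone, unique hostnames are derived from service records, roughly:)

    def list_hosts(ctxt):
        # 'host' stays a plain string column on services; distinct
        # hostnames are computed on the fly instead of joined from a
        # dedicated Host table.
        service_refs = db.service_get_all(ctxt)
        return sorted(set(s['host'] for s in service_refs))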

Anyway, please check the diffs.

Thanks in advance.

P.S.
I asked Thierry about the FFE; please let me know if there is something special I need to do.

-----Original Message-----
From: <email address hidden> [mailto:<email address hidden>] On Behalf Of Vish Ishaya
Sent: Friday, January 14, 2011 9:26 AM
To: <email address hidden>
Subject: Re: [Merge] lp:~nttdata/nova/live-migration into lp:nova

On Jan 13, 2011, at 3:33 PM, Kei Masumoto wrote:
>
> --
> https://code.launchpad.net/~nttdata/nova/live-migration/+merge/44940
> You are requested to review the proposed merge of lp:~nttdata/nova/live-migration into lp:nova.
> === modified file 'Authors'
> --- Authors 2011-01-12 19:39:25 +0000
> +++ Authors 2011-01-13 23:32:42 +0000
> @@ -25,12 +25,14 @@
> Josh Kearney <email address hidden>
> Joshua McKenty <email address hidden>
> Justin Santa Barbara <email address hidden>
> +Kei Masumoto <email address hidden>
> Ken Pepple <email address hidden>
> Lorin Hochstein <email address hidden>
> Matt Dietz <email address hidden>
> Michael Gundlach <email address hidden>
> Monsyne Dragon <email address hidden>
> Monty Taylor <email address hidden>
> +Muneyuki Noguchi <email address hidden>
> Paul Voccio <email address hidden>
> Rick Clark <email address hidden>
> Rick Harris <email address hidden>
>
> === modified file 'bin/nova-manage'
> --- bin/nova-manage 2011-01-12 20:12:08 +0000
> +++ bin/nova-manage 2011-01-13 23:32:42 +0000
> @@ -81,8 +81,9 @@
> from nova import quota
> from nova import utils
> from nova.auth import manager
> +from nova import rpc
> from nova.cloudpipe import pipelib
> -
> +from nova.api.ec2 import cloud
>
> logging.basicConfig()
> FLAGS = flags.FLAGS
> @@ -461,6 +462,81 @@
> int(vpn_start))
>
>
> +class InstanceCommands(object):
> + """Class for mangaging VM instances."""
> +
> + def live_migration(self, ec2_id, dest):
> + """live_migration"""
> +
> + if FLAGS.connection_type != 'libvirt':
> + raise exception.Error('Only KVM is supported for now. '
> + 'Sorry.')
> +
> + if FLAGS.volume_driver != 'nova.volume.driver.AOEDriver':
> + raise exception.Error('Only AOEDriver is supported for now. '
> + 'Sorry.')

It seems like Iscsi driver would work fine as long as there are no volumes attached to the instance. Can you create a bug to fix
this to check for this in...

Revision history for this message
Vish Ishaya (vishvananda) wrote :

Looking good. Three minor points:

143 - instance_id = floating_ip_ref['fixed_ip']['instance']['ec2_id']
144 + # modified by masumotok
145 + #instance_id = floating_ip_ref['fixed_ip']['instance']['ec2_id']
146 + instance_id = floating_ip_ref['fixed_ip']['instance']['id']

This is a bug and your change is correct, you can remove the comments.

1236
1237 - logging.info('ensuring static filters')
1238 self._ensure_static_filters()
1239

I assume you did this during debugging. You should probably add it back in.

Finally, two new drivers were added to volume/driver.py. Since discover_volume has been modified to take a context parameter, please add _context to the other two volume types as well.
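(A sketch of the requested signature change; the class name here is illustrative, and only discover_volume is shown in this excerpt:)

    class ExampleVolumeDriver(object):
        # Before: def discover_volume(self, volume):
        # After: accept the context; the leading underscore on '_context'
        # signals the argument is part of the interface but unused.
        def discover_volume(self, _context, volume):
            """Discover volume on a remote host."""
            raise NotImplementedError()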

Revision history for this message
Thierry Carrez (ttx) wrote :

You should also add yourself to the Authors file.

Revision history for this message
Thierry Carrez (ttx) wrote :

Looks like we are almost there, and the regression risk is contained. However since this is blocking the db-migration branch, I'd like this branch merged before the meeting tomorrow, which probably means getting the last fixes on the branch soon.

Please ensure you cover Vish's latest concerns and any upcoming remarks from Soren, add yourself to the Authors file, merge with trunk if necessary, and pass the tests and pep8 checks, to try to avoid any Hudson roundtrip.

FFe granted for merging before tomorrow weekly meeting.

review: Approve (ffe)
Revision history for this message
Soren Hansen (soren) wrote :

I get this error when running the test suite:

======================================================================
FAIL: test_authors_up_to_date (nova.tests.test_misc.ProjectTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/soren/src/openstack/nova/nova/nova/tests/test_misc.py", line 53, in test_authors_up_to_date
    '%r not listed in Authors' % missing)
AssertionError: set([u'<root@openstack2-api>', u'Masumoto<email address hidden>']) not listed in Authors

----------------------------------------------------------------------

Please add these lines to .mailmap:
<email address hidden> <root@openstack2-api>
<email address hidden> Masumoto<email address hidden>

Once fixed, I'll vote approve.

I don't think the remaining issues should block this.

review: Needs Fixing
Revision history for this message
Vish Ishaya (vishvananda) wrote :

lgtm

review: Approve
Revision history for this message
Masanori Itoh (itohm) wrote :

Please take care not to contaminate log messages with Japanese characters in the future.
Except that, looks good to me. :)

review: Approve
Revision history for this message
Soren Hansen (soren) wrote :

======================================================================
FAIL: test_authors_up_to_date (nova.tests.test_misc.ProjectTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/soren/src/openstack/nova/nova/nova/tests/test_misc.py", line 53, in test_authors_up_to_date
    '%r not listed in Authors' % missing)
AssertionError: set([u'<root@openstack2-api>', u'Masumoto<email address hidden>']) not listed in Authors

----------------------------------------------------------------------
Ran 286 tests in 94.682s

FAILED (failures=1)

I've already explained *exactly* how to fix this problem.

Before you submit things for review, please run the test suite.

review: Needs Fixing
Revision history for this message
Masanori Itoh (itohm) wrote :

Hello Soren,

It looks like masumotok cannot reproduce the failure below, and
he is struggling to fix the issue as his highest priority.

I suggested checking out his branch on Launchpad again
and trying to reproduce the issue.

Please give him some more time.

Thanks in advance,

Masanori (a.k.a thatsdone)

---
Masanori ITOH R&D Headquarters, NTT DATA CORPORATION
               e-mail: <email address hidden>
               phone : +81-50-5546-2301 (ext: 47-6278)

From: Soren Hansen <email address hidden>
Subject: Re: [Merge] lp:~nttdata/nova/live-migration into lp:nova
Date: Tue, 18 Jan 2011 21:44:52 +0900

> Review: Needs Fixing
> ======================================================================
> FAIL: test_authors_up_to_date (nova.tests.test_misc.ProjectTestCase)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> File "/home/soren/src/openstack/nova/nova/nova/tests/test_misc.py", line 53, in test_authors_up_to_date
> '%r not listed in Authors' % missing)
> AssertionError: set([u'<root@openstack2-api>', u'Masumoto<email address hidden>']) not listed in Authors
>
> ----------------------------------------------------------------------
> Ran 286 tests in 94.682s
>
> FAILED (failures=1)
>
>
> I've already explained *exactly* how to fix this problem.
>
>
> Before you submit things for review, please run the test suite.
> --
> https://code.launchpad.net/~nttdata/nova/live-migration/+merge/44940
> You are reviewing the proposed merge of lp:~nttdata/nova/live-migration into lp:nova.

Revision history for this message
Soren Hansen (soren) wrote :

Alright, let's get this merged. :)

review: Approve
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

The attempt to merge lp:~nttdata/nova/live-migration into lp:nova failed. Below is the output from the failed tests.

nova/service.py:122:48: E231 missing whitespace after ':'
                                        {'host':self.host,
                                               ^
    JCR: Each comma, semicolon or colon should be followed by whitespace.

    Okay: [a, b]
    Okay: (3,)
    Okay: a[1:4]
    Okay: a[:4]
    Okay: a[1:]
    Okay: a[1:4:2]
    E231: ['a','b']
    E231: foo(bar,baz)
nova/scheduler/manager.py:82:20: E201 whitespace after '['
        compute = [ s for s in services if s['topic'] == 'compute']
                   ^
    Avoid extraneous whitespace in the following situations:

    - Immediately inside parentheses, brackets or braces.

    - Immediately before a comma, semicolon, or colon.

    Okay: spam(ham[1], {eggs: 2})
    E201: spam( ham[1], {eggs: 2})
    E201: spam(ham[ 1], {eggs: 2})
    E201: spam(ham[1], { eggs: 2})
    E202: spam(ham[1], {eggs: 2} )
    E202: spam(ham[1 ], {eggs: 2})
    E202: spam(ham[1], {eggs: 2 })

    E203: if x == 4: print x, y; x, y = y , x
    E203: if x == 4: print x, y ; x, y = y, x
    E203: if x == 4 : print x, y; x, y = y, x
nova/scheduler/manager.py:83:30: W291 trailing whitespace
        if 0 == len(compute):
                             ^
    JCR: Trailing whitespace is superfluous.
    FBM: Except when it occurs as part of a blank line (i.e. the line is
         nothing but whitespace). According to Python docs[1] a line with only
         whitespace is considered a blank line, and is to be ignored. However,
         matching a blank line to its indentation level avoids mistakenly
         terminating a multi-line statement (e.g. class declaration) when
         pasting code into the standard Python interpreter.

         [1] http://docs.python.org/reference/lexical_analysis.html#blank-lines

    The warning returned varies on whether the line itself is blank, for easier
    filtering for those who want to indent their blank lines.

    Okay: spam(1)
    W291: spam(1)\s
    W293: class Foo(object):\n \n bang = 12
nova/scheduler/manager.py:93:1: W293 blank line contains whitespace

^
    JCR: Trailing whitespace is superfluous.
    FBM: Except when it occurs as part of a blank line (i.e. the line is
         nothing but whitespace). According to Python docs[1] a line with only
         whitespace is considered a blank line, and is to be ignored. However,
         matching a blank line to its indentation level avoids mistakenly
         terminating a multi-line statement (e.g. class declaration) when
         pasting code into the standard Python interpreter.

         [1] http://docs.python.org/reference/lexical_analysis.html#blank-lines

    The warning returned varies on whether the line itself is blank, for easier
    filtering for those who want to indent their blank lines.

    Okay: spam(1)
    W291: spam(1)\s
    W293: class Foo(object):\n \n bang = 12
nova/scheduler/manager.py:95:9: E303 too many blank lines (2)
        u_resource = {}
        ^
    Separate top-level function and class definitions with two blank lines.

    Method definitions inside a c...


lp:~nttdata/nova/live-migration updated
466. By Kei Masumoto

fixed pep8 error
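The fixes themselves are mechanical; corrected forms of the flagged lines (surrounding code paraphrased):

    # nova/service.py - E231: whitespace after ':'
    values = {'host': self.host,
              'binary': self.binary}

    # nova/scheduler/manager.py - E201: no whitespace after '[',
    # W291/W293: strip trailing and blank-line whitespace
    compute = [s for s in services if s['topic'] == 'compute']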

Preview Diff

=== modified file '.mailmap'
--- .mailmap 2011-01-05 23:04:51 +0000
+++ .mailmap 2011-01-18 16:24:10 +0000
@@ -16,6 +16,8 @@
 <jmckenty@gmail.com> <jmckenty@joshua-mckentys-macbook-pro.local>
 <jmckenty@gmail.com> <joshua.mckenty@nasa.gov>
 <justin@fathomdb.com> <justinsb@justinsb-desktop>
+<masumotok@nttdata.co.jp> <root@openstack2-api>
+<masumotok@nttdata.co.jp> Masumoto<masumotok@nttdata.co.jp>
 <mordred@inaugust.com> <mordred@hudson>
 <paul@openstack.org> <pvoccio@castor.local>
 <paul@openstack.org> <paul.voccio@rackspace.com>
 
=== modified file 'Authors'
--- Authors 2011-01-14 04:59:06 +0000
+++ Authors 2011-01-18 16:24:10 +0000
@@ -26,6 +26,7 @@
 Josh Kearney <josh.kearney@rackspace.com>
 Joshua McKenty <jmckenty@gmail.com>
 Justin Santa Barbara <justin@fathomdb.com>
+Kei Masumoto <masumotok@nttdata.co.jp>
 Ken Pepple <ken.pepple@gmail.com>
 Koji Iida <iida.koji@lab.ntt.co.jp>
 Lorin Hochstein <lorin@isi.edu>
@@ -34,6 +35,7 @@
 Monsyne Dragon <mdragon@rackspace.com>
 Monty Taylor <mordred@inaugust.com>
 MORITA Kazutaka <morita.kazutaka@gmail.com>
+Muneyuki Noguchi <noguchimn@nttdata.co.jp>
 Nachi Ueno <ueno.nachi@lab.ntt.co.jp> <openstack@lab.ntt.co.jp> <nati.ueno@gmail.com> <nova@u4>
 Paul Voccio <paul@openstack.org>
 Rick Clark <rick@openstack.org>
 
=== modified file 'bin/nova-manage'
--- bin/nova-manage 2011-01-13 00:28:35 +0000
+++ bin/nova-manage 2011-01-18 16:24:10 +0000
@@ -62,6 +62,7 @@
 
 import IPy
 
+
 # If ../nova/__init__.py exists, add ../ to Python search path, so that
 # it will override what happens to be installed in /usr/(local/)lib/python...
 possible_topdir = os.path.normpath(os.path.join(os.path.abspath(sys.argv[0]),
@@ -81,8 +82,9 @@
 from nova import quota
 from nova import utils
 from nova.auth import manager
+from nova import rpc
 from nova.cloudpipe import pipelib
-
+from nova.api.ec2 import cloud
 
 logging.basicConfig()
 FLAGS = flags.FLAGS
@@ -465,6 +467,82 @@
                                int(vpn_start), fixed_range_v6)
 
 
+class InstanceCommands(object):
+    """Class for mangaging VM instances."""
+
+    def live_migration(self, ec2_id, dest):
+        """live_migration"""
+
+        ctxt = context.get_admin_context()
+        instance_id = cloud.ec2_id_to_id(ec2_id)
+
+        if FLAGS.connection_type != 'libvirt':
+            msg = _('Only KVM is supported for now. Sorry!')
+            raise exception.Error(msg)
+
+        if FLAGS.volume_driver != 'nova.volume.driver.AOEDriver':
+            instance_ref = db.instance_get(ctxt, instance_id)
+            if len(instance_ref['volumes']) != 0:
+                msg = _(("""Volumes attached by ISCSIDriver"""
+                         """ are not supported. Sorry!"""))
+                raise exception.Error(msg)
+
+        rpc.call(ctxt,
+                 FLAGS.scheduler_topic,
+                 {"method": "live_migration",
+                  "args": {"instance_id": instance_id,
+                           "dest": dest,
+                           "topic": FLAGS.compute_topic}})
+
+        msg = 'Migration of %s initiated. ' % ec2_id
+        msg += 'Check its progress using euca-describe-instances.'
+        print msg
+
+
+class HostCommands(object):
+    """Class for mangaging host(physical nodes)."""
+
+    def list(self):
+        """describe host list."""
+
+        # To supress msg: No handlers could be found for logger "amqplib"
+        logging.basicConfig()
+
+        service_refs = db.service_get_all(context.get_admin_context())
+        hosts = [h['host'] for h in service_refs]
+        hosts = list(set(hosts))
+        for host in hosts:
+            print host
+
+    def show(self, host):
+        """describe cpu/memory/hdd info for host."""
+
+        result = rpc.call(context.get_admin_context(),
+                          FLAGS.scheduler_topic,
+                          {"method": "show_host_resource",
+                           "args": {"host": host}})
+
+        # Checking result msg format is necessary, that will have done
+        # when this feture is included in API.
+        if type(result) != dict:
+            print 'Unexpected error occurs'
+        elif not result['ret']:
+            print '%s' % result['msg']
+        else:
+            cpu = result['phy_resource']['vcpus']
+            mem = result['phy_resource']['memory_mb']
+            hdd = result['phy_resource']['local_gb']
+
+            print 'HOST\t\tPROJECT\t\tcpu\tmem(mb)\tdisk(gb)'
+            print '%s\t\t\t%s\t%s\t%s' % (host, cpu, mem, hdd)
+            for p_id, val in result['usage'].items():
+                print '%s\t%s\t\t%s\t%s\t%s' % (host,
+                                                p_id,
+                                                val['vcpus'],
+                                                val['memory_mb'],
+                                                val['local_gb'])
+
+
 class ServiceCommands(object):
     """Enable and disable running services"""
 
@@ -527,6 +605,8 @@
         ('vpn', VpnCommands),
         ('floating', FloatingIpCommands),
         ('network', NetworkCommands),
+        ('instance', InstanceCommands),
+        ('host', HostCommands),
         ('service', ServiceCommands),
         ('log', LogCommands)]
 
 
=== modified file 'nova/api/ec2/cloud.py'
--- nova/api/ec2/cloud.py 2011-01-17 18:05:26 +0000
+++ nova/api/ec2/cloud.py 2011-01-18 16:24:10 +0000
@@ -729,7 +729,7 @@
             ec2_id = None
             if (floating_ip_ref['fixed_ip']
                 and floating_ip_ref['fixed_ip']['instance']):
-                instance_id = floating_ip_ref['fixed_ip']['instance']['ec2_id']
+                instance_id = floating_ip_ref['fixed_ip']['instance']['id']
                 ec2_id = id_to_ec2_id(instance_id)
             address_rv = {'public_ip': address,
                           'instance_id': ec2_id}
 
=== modified file 'nova/compute/manager.py'
--- nova/compute/manager.py 2011-01-17 17:16:36 +0000
+++ nova/compute/manager.py 2011-01-18 16:24:10 +0000
@@ -41,6 +41,7 @@
 import socket
 import functools
 
+from nova import db
 from nova import exception
 from nova import flags
 from nova import log as logging
@@ -120,6 +121,35 @@
         """
         self.driver.init_host()
 
+    def update_service(self, ctxt, host, binary):
+        """Insert compute node specific information to DB."""
+
+        try:
+            service_ref = db.service_get_by_args(ctxt,
+                                                 host,
+                                                 binary)
+        except exception.NotFound:
+            msg = _(("""Cannot insert compute manager specific info"""
+                     """Because no service record found."""))
+            raise exception.Invalid(msg)
+
+        # Updating host information
+        vcpu = self.driver.get_vcpu_number()
+        memory_mb = self.driver.get_memory_mb()
+        local_gb = self.driver.get_local_gb()
+        hypervisor = self.driver.get_hypervisor_type()
+        version = self.driver.get_hypervisor_version()
+        cpu_info = self.driver.get_cpu_info()
+
+        db.service_update(ctxt,
+                          service_ref['id'],
+                          {'vcpus': vcpu,
+                           'memory_mb': memory_mb,
+                           'local_gb': local_gb,
+                           'hypervisor_type': hypervisor,
+                           'hypervisor_version': version,
+                           'cpu_info': cpu_info})
+
     def _update_state(self, context, instance_id):
         """Update the state of an instance from the driver info."""
         # FIXME(ja): include other fields from state?
@@ -178,9 +208,10 @@
             raise exception.Error(_("Instance has already been created"))
         LOG.audit(_("instance %s: starting..."), instance_id,
                   context=context)
+
         self.db.instance_update(context,
                                 instance_id,
-                                {'host': self.host})
+                                {'host': self.host, 'launched_on': self.host})
 
         self.db.instance_set_state(context,
                                    instance_id,
@@ -560,3 +591,88 @@
         self.volume_manager.remove_compute_volume(context, volume_id)
         self.db.volume_detached(context, volume_id)
         return True
+
+    def compare_cpu(self, context, cpu_info):
+        """ Check the host cpu is compatible to a cpu given by xml."""
+        return self.driver.compare_cpu(cpu_info)
+
+    def pre_live_migration(self, context, instance_id, dest):
+        """Any preparation for live migration at dst host."""
+
+        # Getting instance info
+        instance_ref = db.instance_get(context, instance_id)
+        ec2_id = instance_ref['hostname']
+
+        # Getting fixed ips
+        fixed_ip = db.instance_get_fixed_address(context, instance_id)
+        if not fixed_ip:
+            msg = _('%s(%s) doesnt have fixed_ip') % (instance_id, ec2_id)
+            raise exception.NotFound(msg)
+
+        # If any volume is mounted, prepare here.
+        if len(instance_ref['volumes']) == 0:
+            logging.info(_("%s has no volume.") % ec2_id)
+        else:
+            for v in instance_ref['volumes']:
+                self.volume_manager.setup_compute_volume(context, v['id'])
+
+        # Bridge settings
+        # call this method prior to ensure_filtering_rules_for_instance,
+        # since bridge is not set up, ensure_filtering_rules_for instance
+        # fails.
+        self.network_manager.setup_compute_network(context, instance_id)
+
+        # Creating filters to hypervisors and firewalls.
+        # An example is that nova-instance-instance-xxx,
+        # which is written to libvirt.xml( check "virsh nwfilter-list )
+        # On destination host, this nwfilter is necessary.
+        # In addition, this method is creating filtering rule
+        # onto destination host.
+        self.driver.ensure_filtering_rules_for_instance(instance_ref)
+
+    def live_migration(self, context, instance_id, dest):
+        """executes live migration."""
+
+        # Get instance for error handling.
+        instance_ref = db.instance_get(context, instance_id)
+        ec2_id = instance_ref['hostname']
+
+        try:
+            # Checking volume node is working correctly when any volumes
+            # are attached to instances.
+            if len(instance_ref['volumes']) != 0:
+                rpc.call(context,
+                         FLAGS.volume_topic,
+                         {"method": "check_for_export",
+                          "args": {'instance_id': instance_id}})
+
+            # Asking dest host to preparing live migration.
+            compute_topic = db.queue_get_for(context,
+                                             FLAGS.compute_topic,
+                                             dest)
+            rpc.call(context,
+                     compute_topic,
+                     {"method": "pre_live_migration",
+                      "args": {'instance_id': instance_id,
+                               'dest': dest}})
+
+        except Exception, e:
+            msg = _('Pre live migration for %s failed at %s')
+            logging.error(msg, ec2_id, dest)
+            db.instance_set_state(context,
+                                  instance_id,
+                                  power_state.RUNNING,
+                                  'running')
+
+            for v in instance_ref['volumes']:
+                db.volume_update(context,
+                                 v['id'],
+                                 {'status': 'in-use'})
+
+            # e should be raised. just calling "raise" may raise NotFound.
+            raise e
+
+        # Executing live migration
+        # live_migration might raises exceptions, but
+        # nothing must be recovered in this version.
+        self.driver.live_migration(context, instance_ref, dest)
 
=== modified file 'nova/db/api.py'
--- nova/db/api.py 2011-01-14 07:49:41 +0000
+++ nova/db/api.py 2011-01-18 16:24:10 +0000
@@ -253,6 +253,10 @@
     return IMPL.floating_ip_get_by_address(context, address)
 
 
+def floating_ip_update(context, address, values):
+    """update floating ip information."""
+    return IMPL.floating_ip_update(context, address, values)
+
 ####################
 
 
@@ -405,6 +409,32 @@
                                        security_group_id)
 
 
+def instance_get_all_by_host(context, hostname):
+    """Get instances by host"""
+    return IMPL.instance_get_all_by_host(context, hostname)
+
+
+def instance_get_vcpu_sum_by_host_and_project(context, hostname, proj_id):
+    """Get instances.vcpus by host and project"""
+    return IMPL.instance_get_vcpu_sum_by_host_and_project(context,
+                                                          hostname,
+                                                          proj_id)
+
+
+def instance_get_memory_sum_by_host_and_project(context, hostname, proj_id):
+    """Get amount of memory by host and project """
+    return IMPL.instance_get_memory_sum_by_host_and_project(context,
+                                                            hostname,
+                                                            proj_id)
+
+
+def instance_get_disk_sum_by_host_and_project(context, hostname, proj_id):
+    """Get total amount of disk by host and project """
+    return IMPL.instance_get_disk_sum_by_host_and_project(context,
+                                                          hostname,
+                                                          proj_id)
+
+
 def instance_action_create(context, values):
     """Create an instance action from the values dictionary."""
     return IMPL.instance_action_create(context, values)
 
=== modified file 'nova/db/sqlalchemy/api.py'
--- nova/db/sqlalchemy/api.py 2011-01-15 01:54:36 +0000
+++ nova/db/sqlalchemy/api.py 2011-01-18 16:24:10 +0000
@@ -495,6 +495,16 @@
     return result
 
 
+@require_context
+def floating_ip_update(context, address, values):
+    session = get_session()
+    with session.begin():
+        floating_ip_ref = floating_ip_get_by_address(context, address, session)
+        for (key, value) in values.iteritems():
+            floating_ip_ref[key] = value
+        floating_ip_ref.save(session=session)
+
+
 ###################
 
 
@@ -858,6 +868,7 @@
     return instance_ref
 
 
+@require_context
 def instance_add_security_group(context, instance_id, security_group_id):
     """Associate the given security group with the given instance"""
     session = get_session()
@@ -871,6 +882,59 @@
 
 
 @require_context
+def instance_get_all_by_host(context, hostname):
+    session = get_session()
+    if not session:
+        session = get_session()
+
+    result = session.query(models.Instance).\
+                     filter_by(host=hostname).\
+                     filter_by(deleted=can_read_deleted(context)).\
+                     all()
+    if not result:
+        return []
+    return result
+
+
+@require_context
+def _instance_get_sum_by_host_and_project(context, column, hostname, proj_id):
+    session = get_session()
+
+    result = session.query(models.Instance).\
+                     filter_by(host=hostname).\
+                     filter_by(project_id=proj_id).\
+                     filter_by(deleted=can_read_deleted(context)).\
+                     value(column)
+    if not result:
+        return 0
+    return result
+
+
+@require_context
+def instance_get_vcpu_sum_by_host_and_project(context, hostname, proj_id):
+    return _instance_get_sum_by_host_and_project(context,
+                                                 'vcpus',
+                                                 hostname,
+                                                 proj_id)
+
+
+@require_context
+def instance_get_memory_sum_by_host_and_project(context, hostname, proj_id):
+    return _instance_get_sum_by_host_and_project(context,
+                                                 'memory_mb',
+                                                 hostname,
+                                                 proj_id)
+
+
+@require_context
+def instance_get_disk_sum_by_host_and_project(context, hostname, proj_id):
+    return _instance_get_sum_by_host_and_project(context,
+                                                 'local_gb',
+                                                 hostname,
+                                                 proj_id)
+
+
+@require_context
 def instance_action_create(context, values):
     """Create an instance action from the values dictionary."""
     action_ref = models.InstanceActions()
 
=== modified file 'nova/db/sqlalchemy/models.py'
--- nova/db/sqlalchemy/models.py 2011-01-15 01:48:48 +0000
+++ nova/db/sqlalchemy/models.py 2011-01-18 16:24:10 +0000
@@ -150,13 +150,32 @@
 
     __tablename__ = 'services'
     id = Column(Integer, primary_key=True)
-    host = Column(String(255))  # , ForeignKey('hosts.id'))
+    #host_id = Column(Integer, ForeignKey('hosts.id'), nullable=True)
+    #host = relationship(Host, backref=backref('services'))
+    host = Column(String(255))
     binary = Column(String(255))
     topic = Column(String(255))
     report_count = Column(Integer, nullable=False, default=0)
     disabled = Column(Boolean, default=False)
     availability_zone = Column(String(255), default='nova')
 
+    # The below items are compute node only.
+    # -1 or None is inserted for other service.
+    vcpus = Column(Integer, nullable=False, default=-1)
+    memory_mb = Column(Integer, nullable=False, default=-1)
+    local_gb = Column(Integer, nullable=False, default=-1)
+    hypervisor_type = Column(String(128))
+    hypervisor_version = Column(Integer, nullable=False, default=-1)
+    # Note(masumotok): Expected Strings example:
+    #
+    # '{"arch":"x86_64", "model":"Nehalem",
+    #  "topology":{"sockets":1, "threads":2, "cores":3},
+    #  features:[ "tdtscp", "xtpr"]}'
+    #
+    # Points are "json translatable" and it must have all
+    # dictionary keys above.
+    cpu_info = Column(String(512))
+
 
 class Certificate(BASE, NovaBase):
     """Represents a an x509 certificate"""
@@ -231,6 +250,9 @@
     display_name = Column(String(255))
     display_description = Column(String(255))
 
+    # To remember on which host a instance booted.
+    # An instance may moved to other host by live migraiton.
+    launched_on = Column(String(255))
     locked = Column(Boolean)
 
     # TODO(vish): see Ewan's email about state improvements, probably
@@ -588,7 +610,7 @@
               Volume, ExportDevice, IscsiTarget, FixedIp, FloatingIp,
               Network, SecurityGroup, SecurityGroupIngressRule,
               SecurityGroupInstanceAssociation, AuthToken, User,
-              Project, Certificate, ConsolePool, Console)  # , Image, Host
+              Project, Certificate, ConsolePool, Console)  # , Host, Image
     engine = create_engine(FLAGS.sql_connection, echo=False)
     for model in models:
         model.metadata.create_all(engine)
 
=== modified file 'nova/network/manager.py'
--- nova/network/manager.py 2011-01-15 01:54:36 +0000
+++ nova/network/manager.py 2011-01-18 16:24:10 +0000
@@ -159,7 +159,7 @@
         """Called when this host becomes the host for a network."""
         raise NotImplementedError()

-    def setup_compute_network(self, context, instance_id):
+    def setup_compute_network(self, context, instance_id, network_ref=None):
         """Sets up matching network for compute hosts."""
         raise NotImplementedError()

@@ -320,7 +320,7 @@
         self.db.fixed_ip_update(context, address, {'allocated': False})
         self.db.fixed_ip_disassociate(context.elevated(), address)

-    def setup_compute_network(self, context, instance_id):
+    def setup_compute_network(self, context, instance_id, network_ref=None):
         """Network is created manually."""
         pass

@@ -395,9 +395,10 @@
         super(FlatDHCPManager, self).init_host()
         self.driver.metadata_forward()

-    def setup_compute_network(self, context, instance_id):
+    def setup_compute_network(self, context, instance_id, network_ref=None):
         """Sets up matching network for compute hosts."""
-        network_ref = db.network_get_by_instance(context, instance_id)
+        if network_ref is None:
+            network_ref = db.network_get_by_instance(context, instance_id)
         self.driver.ensure_bridge(network_ref['bridge'],
                                   FLAGS.flat_interface)

@@ -487,9 +488,10 @@
         """Returns a fixed ip to the pool."""
         self.db.fixed_ip_update(context, address, {'allocated': False})

-    def setup_compute_network(self, context, instance_id):
+    def setup_compute_network(self, context, instance_id, network_ref=None):
         """Sets up matching network for compute hosts."""
-        network_ref = db.network_get_by_instance(context, instance_id)
+        if network_ref is None:
+            network_ref = db.network_get_by_instance(context, instance_id)
         self.driver.ensure_vlan_bridge(network_ref['vlan'],
                                        network_ref['bridge'])


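The new optional network_ref argument lets a caller that already holds the network row skip the per-instance lookup; this matters during migration, when the destination host sets up bridges before the instance's network records point at it. A hedged usage sketch (the net_manager variable is assumed):

    # Existing callers are unchanged; the lookup happens internally:
    net_manager.setup_compute_network(context, instance_id)

    # A migration-time caller can pass the row it already fetched:
    network_ref = db.network_get_by_instance(context, instance_id)
    net_manager.setup_compute_network(context, instance_id, network_ref)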
=== modified file 'nova/scheduler/driver.py'
--- nova/scheduler/driver.py 2010-12-28 20:11:41 +0000
+++ nova/scheduler/driver.py 2011-01-18 16:24:10 +0000
@@ -26,6 +26,9 @@
 from nova import db
 from nova import exception
 from nova import flags
+from nova import log as logging
+from nova import rpc
+from nova.compute import power_state

 FLAGS = flags.FLAGS
 flags.DEFINE_integer('service_down_time', 60,
@@ -64,3 +67,183 @@
     def schedule(self, context, topic, *_args, **_kwargs):
         """Must override at least this method for scheduler to work."""
         raise NotImplementedError(_("Must implement a fallback schedule"))
+
+    def schedule_live_migration(self, context, instance_id, dest):
+        """Live migration scheduling method."""
+
+        # Check whether the instance exists and is running.
+        instance_ref = db.instance_get(context, instance_id)
+        ec2_id = instance_ref['hostname']
+
+        # Checking instance.
+        self._live_migration_src_check(context, instance_ref)
+
+        # Checking destination host.
+        self._live_migration_dest_check(context, instance_ref, dest)
+
+        # Common checking.
+        self._live_migration_common_check(context, instance_ref, dest)
+
+        # Changing instance_state.
+        db.instance_set_state(context,
+                              instance_id,
+                              power_state.PAUSED,
+                              'migrating')
+
+        # Changing volume state.
+        for v in instance_ref['volumes']:
+            db.volume_update(context,
+                             v['id'],
+                             {'status': 'migrating'})
+
+        # The return value is necessary to send the request to the src host.
+        # See _schedule() for details.
+        src = instance_ref['host']
+        return src
+
+    def _live_migration_src_check(self, context, instance_ref):
+        """Live migration check routine (for the src host)."""
+
+        # Check that the instance is running.
+        if power_state.RUNNING != instance_ref['state'] or \
+           'running' != instance_ref['state_description']:
+            msg = _('Instance(%s) is not running')
+            ec2_id = instance_ref['hostname']
+            raise exception.Invalid(msg % ec2_id)
+
+        # Check that the volume node is running when any volumes are mounted
+        # to the instance.
+        if len(instance_ref['volumes']) != 0:
+            services = db.service_get_all_by_topic(context, 'volume')
+            if len(services) < 1 or not self.service_is_up(services[0]):
+                msg = _('volume node is not alive '
+                        '(time synchronization problem?)')
+                raise exception.Invalid(msg)
+
+        # Check that the src host is alive.
+        src = instance_ref['host']
+        services = db.service_get_all_by_topic(context, 'compute')
+        services = [service for service in services if service.host == src]
+        if len(services) < 1 or not self.service_is_up(services[0]):
+            msg = _('%s is not alive (time synchronization problem?)')
+            raise exception.Invalid(msg % src)
+
+    def _live_migration_dest_check(self, context, instance_ref, dest):
+        """Live migration check routine (for the destination host)."""
+
+        # Check that dest exists and is a compute node.
+        dservice_refs = db.service_get_all_by_host(context, dest)
+        if len(dservice_refs) <= 0:
+            msg = _('%s does not exist.')
+            raise exception.Invalid(msg % dest)
+
+        dservice_ref = dservice_refs[0]
+        if dservice_ref['topic'] != 'compute':
+            msg = _('%s must be a compute node')
+            raise exception.Invalid(msg % dest)
+
+        # Check that the dest host is alive.
+        if not self.service_is_up(dservice_ref):
+            msg = _('%s is not alive (time synchronization problem?)')
+            raise exception.Invalid(msg % dest)
+
+        # Check that dest is not the host where the instance
+        # is currently running.
+        src = instance_ref['host']
+        if dest == src:
+            ec2_id = instance_ref['hostname']
+            msg = _('%s is where %s is running now. Choose another host.')
+            raise exception.Invalid(msg % (dest, ec2_id))
+
+        # Check that the dest host still has enough capacity.
+        self.has_enough_resource(context, instance_ref, dest)
+
+    def _live_migration_common_check(self, context, instance_ref, dest):
+        """
+        Live migration common check routine.
+        The pre-checks below follow
+        http://wiki.libvirt.org/page/TodoPreMigrationChecks
+
+        """
+
+        # Check that dest exists.
+        dservice_refs = db.service_get_all_by_host(context, dest)
+        if len(dservice_refs) <= 0:
+            msg = _('%s does not exist.')
+            raise exception.Invalid(msg % dest)
+        dservice_ref = dservice_refs[0]
+
+        # Check that the original host (where the instance
+        # was launched) exists.
+        orighost = instance_ref['launched_on']
+        oservice_refs = db.service_get_all_by_host(context, orighost)
+        if len(oservice_refs) <= 0:
+            msg = _('%s (where the instance was launched) does not exist.')
+            raise exception.Invalid(msg % orighost)
+        oservice_ref = oservice_refs[0]
+
+        # Check that the hypervisor type is the same.
+        otype = oservice_ref['hypervisor_type']
+        dtype = dservice_ref['hypervisor_type']
+        if otype != dtype:
+            msg = _('Different hypervisor type (%s -> %s)')
+            raise exception.Invalid(msg % (otype, dtype))
+
+        # Check the hypervisor version.
+        oversion = oservice_ref['hypervisor_version']
+        dversion = dservice_ref['hypervisor_version']
+        if oversion > dversion:
+            msg = _('Older hypervisor version (%s -> %s)')
+            raise exception.Invalid(msg % (oversion, dversion))
+
+        # Check cpu compatibility.
+        cpu_info = oservice_ref['cpu_info']
+        try:
+            rpc.call(context,
+                     db.queue_get_for(context, FLAGS.compute_topic, dest),
+                     {"method": 'compare_cpu',
+                      "args": {'cpu_info': cpu_info}})
+
+        except rpc.RemoteError, e:
+            msg = _('%s is not compatible with %s (where %s was launched)')
+            ec2_id = instance_ref['hostname']
+            src = instance_ref['host']
+            logging.error(msg % (dest, src, ec2_id))
+            raise e
+
+    def has_enough_resource(self, context, instance_ref, dest):
+        """Check that the destination host has enough resources
+        for live migration."""
+
+        # Getting instance information.
+        ec2_id = instance_ref['hostname']
+        vcpus = instance_ref['vcpus']
+        mem = instance_ref['memory_mb']
+        hdd = instance_ref['local_gb']
+
+        # Getting host information.
+        service_refs = db.service_get_all_by_host(context, dest)
+        if len(service_refs) <= 0:
+            msg = _('%s does not exist.')
+            raise exception.Invalid(msg % dest)
+        service_ref = service_refs[0]
+
+        total_cpu = int(service_ref['vcpus'])
+        total_mem = int(service_ref['memory_mb'])
+        total_hdd = int(service_ref['local_gb'])
+
+        instances_ref = db.instance_get_all_by_host(context, dest)
+        for i_ref in instances_ref:
+            total_cpu -= int(i_ref['vcpus'])
+            total_mem -= int(i_ref['memory_mb'])
+            total_hdd -= int(i_ref['local_gb'])
+
+        # Check that the host has enough resources.
+        logging.debug('host(%s) remains vcpu:%s mem:%s hdd:%s,' %
+                      (dest, total_cpu, total_mem, total_hdd))
+        logging.debug('instance(%s) has vcpu:%s mem:%s hdd:%s,' %
+                      (ec2_id, vcpus, mem, hdd))
+
+        if total_cpu <= vcpus or total_mem <= mem or total_hdd <= hdd:
+            msg = '%s does not have enough resources for %s' % (dest, ec2_id)
+            raise exception.NotEmpty(msg)
+
+        logging.debug(_('%s has_enough_resource() for %s') % (dest, ec2_id))

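The capacity check in has_enough_resource() is plain subtraction over the new Service columns; a standalone sketch of the same arithmetic with invented numbers:

    host = {'vcpus': 16, 'memory_mb': 32768, 'local_gb': 500}
    running = [{'vcpus': 4, 'memory_mb': 8192, 'local_gb': 80},
               {'vcpus': 2, 'memory_mb': 4096, 'local_gb': 40}]
    instance = {'vcpus': 4, 'memory_mb': 8192, 'local_gb': 80}

    free = dict((k, host[k] - sum(i[k] for i in running)) for k in host)
    # The patch requires strictly more than the instance needs:
    ok = all(free[k] > instance[k] for k in host)  # True here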
=== modified file 'nova/scheduler/manager.py'
--- nova/scheduler/manager.py 2011-01-04 05:23:35 +0000
+++ nova/scheduler/manager.py 2011-01-18 16:24:10 +0000
@@ -29,6 +29,7 @@
 from nova import manager
 from nova import rpc
 from nova import utils
+from nova import exception

 LOG = logging.getLogger('nova.scheduler.manager')
 FLAGS = flags.FLAGS
@@ -67,3 +68,50 @@
                      {"method": method,
                       "args": kwargs})
         LOG.debug(_("Casting to %s %s for %s"), topic, host, method)
+
+    # NOTE(masumotok): This method should be moved to nova.api.ec2.admin.
+    #                  Based on the Bexar design summit discussion, it is
+    #                  just placed here for the Bexar release.
+    def show_host_resource(self, context, host, *args):
+        """Show the physical/usage resources of the given host."""
+
+        services = db.service_get_all_by_host(context, host)
+        if len(services) == 0:
+            return {'ret': False, 'msg': 'No such Host'}
+
+        compute = [s for s in services if s['topic'] == 'compute']
+        if 0 == len(compute):
+            service_ref = services[0]
+        else:
+            service_ref = compute[0]
+
+        # Getting physical resource information.
+        h_resource = {'vcpus': service_ref['vcpus'],
+                      'memory_mb': service_ref['memory_mb'],
+                      'local_gb': service_ref['local_gb']}

+        # Getting usage resource information.
+        u_resource = {}
+        instances_ref = db.instance_get_all_by_host(context,
+                                                    service_ref['host'])
+
+        if 0 == len(instances_ref):
+            return {'ret': True, 'phy_resource': h_resource, 'usage': {}}
+
+        project_ids = [i['project_id'] for i in instances_ref]
+        project_ids = list(set(project_ids))
+        for p_id in project_ids:
+            vcpus = db.instance_get_vcpu_sum_by_host_and_project(context,
+                                                                 host,
+                                                                 p_id)
+            mem = db.instance_get_memory_sum_by_host_and_project(context,
+                                                                 host,
+                                                                 p_id)
+            hdd = db.instance_get_disk_sum_by_host_and_project(context,
+                                                               host,
+                                                               p_id)
+            u_resource[p_id] = {'vcpus': vcpus,
+                                'memory_mb': mem,
+                                'local_gb': hdd}
+
+        return {'ret': True, 'phy_resource': h_resource, 'usage': u_resource}

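For illustration, a successful reply takes this shape (numbers invented):

    {'ret': True,
     'phy_resource': {'vcpus': 16, 'memory_mb': 32768, 'local_gb': 500},
     'usage': {'project-a': {'vcpus': 6, 'memory_mb': 12288,
                             'local_gb': 120}}}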
=== modified file 'nova/service.py'
--- nova/service.py 2011-01-11 22:27:36 +0000
+++ nova/service.py 2011-01-18 16:24:10 +0000
@@ -80,6 +80,7 @@
         self.manager.init_host()
         self.model_disconnected = False
         ctxt = context.get_admin_context()
+
         try:
             service_ref = db.service_get_by_args(ctxt,
                                                  self.host,
@@ -88,6 +89,9 @@
         except exception.NotFound:
             self._create_service_ref(ctxt)

+        if 'nova-compute' == self.binary:
+            self.manager.update_service(ctxt, self.host, self.binary)
+
         conn1 = rpc.Connection.instance(new=True)
         conn2 = rpc.Connection.instance(new=True)
         if self.report_interval:

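The update_service() method itself is implemented in nova/compute/manager.py, which is outside this excerpt; the following is only a hedged sketch of what it presumably records, filling the new Service columns from the virt driver getters added in this branch:

    # Sketch only; the real method lives in nova.compute.manager.
    def update_service(self, context, host, binary):
        service_ref = db.service_get_by_args(context, host, binary)
        db.service_update(
            context, service_ref['id'],
            {'vcpus': self.driver.get_vcpu_number(),
             'memory_mb': self.driver.get_memory_mb(),
             'local_gb': self.driver.get_local_gb(),
             'hypervisor_type': self.driver.get_hypervisor_type(),
             'hypervisor_version': self.driver.get_hypervisor_version(),
             'cpu_info': self.driver.get_cpu_info()})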
=== added file 'nova/virt/cpuinfo.xml.template'
--- nova/virt/cpuinfo.xml.template 1970-01-01 00:00:00 +0000
+++ nova/virt/cpuinfo.xml.template 2011-01-18 16:24:10 +0000
@@ -0,0 +1,9 @@
+<cpu>
+    <arch>$arch</arch>
+    <model>$model</model>
+    <vendor>$vendor</vendor>
+    <topology sockets="$topology.sockets" cores="$topology.cores" threads="$topology.threads"/>
+#for $var in $features
+    <features name="$var" />
+#end for
+</cpu>

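This is a Cheetah template; compare_cpu() in libvirt_conn.py below renders it with the parsed cpu_info JSON as the search list. A minimal sketch (the cpu_info string is as in the models.py comment above; the file path is illustrative):

    import json
    from Cheetah.Template import Template

    dic = json.loads(cpu_info)  # the stored cpu_info string
    xml = str(Template(open('cpuinfo.xml.template').read(),
                       searchList=dic))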
=== modified file 'nova/virt/fake.py'
--- nova/virt/fake.py 2011-01-12 19:22:01 +0000
+++ nova/virt/fake.py 2011-01-18 16:24:10 +0000
@@ -310,6 +310,38 @@
                'username': 'fakeuser',
                'password': 'fakepassword'}

+    def get_cpu_info(self):
+        """This method is supported only by libvirt."""
+        return
+
+    def get_vcpu_number(self):
+        """This method is supported only by libvirt."""
+        return -1
+
+    def get_memory_mb(self):
+        """This method is supported only by libvirt."""
+        return -1
+
+    def get_local_gb(self):
+        """This method is supported only by libvirt."""
+        return -1
+
+    def get_hypervisor_type(self):
+        """This method is supported only by libvirt."""
+        return
+
+    def get_hypervisor_version(self):
+        """This method is supported only by libvirt."""
+        return -1
+
+    def compare_cpu(self, xml):
+        """This method is supported only by libvirt."""
+        raise NotImplementedError('This method is supported only by libvirt.')
+
+    def live_migration(self, context, instance_ref, dest):
+        """This method is supported only by libvirt."""
+        raise NotImplementedError('This method is supported only by libvirt.')
+

 class FakeInstance(object):


=== modified file 'nova/virt/libvirt_conn.py'
--- nova/virt/libvirt_conn.py 2011-01-17 17:16:36 +0000
+++ nova/virt/libvirt_conn.py 2011-01-18 16:24:10 +0000
@@ -36,8 +36,11 @@

 """

+import json
 import os
 import shutil
+import re
+import time
 import random
 import subprocess
 import uuid
@@ -80,6 +83,9 @@
 flags.DEFINE_string('libvirt_xml_template',
                     utils.abspath('virt/libvirt.xml.template'),
                     'Libvirt XML Template')
+flags.DEFINE_string('cpuinfo_xml_template',
+                    utils.abspath('virt/cpuinfo.xml.template'),
+                    'CpuInfo XML Template (currently used only by live migration)')
 flags.DEFINE_string('libvirt_type',
                     'kvm',
                     'Libvirt domain type (valid options are: '
@@ -88,6 +94,16 @@
                     '',
                     'Override the default libvirt URI (which is dependent'
                     ' on libvirt_type)')
+flags.DEFINE_string('live_migration_uri',
+                    "qemu+tcp://%s/system",
+                    'Define the protocol used by the live_migration feature')
+flags.DEFINE_string('live_migration_flag',
+                    "VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER",
+                    'Define live migration behavior.')
+flags.DEFINE_integer('live_migration_bandwidth', 0,
+                     'Bandwidth limit for live migration (0 means no limit)')
+flags.DEFINE_integer('live_migration_timeout_sec', 10,
+                     'Timeout in seconds for pre_live_migration to complete.')
 flags.DEFINE_bool('allow_project_net_traffic',
                   True,
                   'Whether to allow in project network traffic')
@@ -146,6 +162,7 @@
         self.libvirt_uri = self.get_uri()

         self.libvirt_xml = open(FLAGS.libvirt_xml_template).read()
+        self.cpuinfo_xml = open(FLAGS.cpuinfo_xml_template).read()
         self._wrapped_conn = None
         self.read_only = read_only

@@ -818,6 +835,74 @@

         return interfaces

+    def get_vcpu_number(self):
+        """Get the max vcpu number of the physical computer."""
+        return self._conn.getMaxVcpus(None)
+
+    def get_memory_mb(self):
+        """Get the memory size of the physical computer."""
+        meminfo = open('/proc/meminfo').read().split()
+        idx = meminfo.index('MemTotal:')
+        # Transforming KB to MB.
+        return int(meminfo[idx + 1]) / 1024
+
+    def get_local_gb(self):
+        """Get the hdd size of the physical computer."""
+        hddinfo = os.statvfs(FLAGS.instances_path)
+        return hddinfo.f_bsize * hddinfo.f_blocks / 1024 / 1024 / 1024
+
+    def get_hypervisor_type(self):
+        """Get the hypervisor type."""
+        return self._conn.getType()
+
+    def get_hypervisor_version(self):
+        """Get the hypervisor version."""
+        return self._conn.getVersion()
+
+    def get_cpu_info(self):
+        """Get cpu information."""
+        xmlstr = self._conn.getCapabilities()
+        xml = libxml2.parseDoc(xmlstr)
+        nodes = xml.xpathEval('//cpu')
+        if len(nodes) != 1:
+            msg = 'Unexpected xml format: the "cpu" tag must appear ' \
+                  'exactly once, but appears %d times.' % len(nodes)
+            msg += '\n' + xml.serialize()
+            raise exception.Invalid(_(msg))
+
+        arch = xml.xpathEval('//cpu/arch')[0].getContent()
+        model = xml.xpathEval('//cpu/model')[0].getContent()
+        vendor = xml.xpathEval('//cpu/vendor')[0].getContent()
+
+        topology_node = xml.xpathEval('//cpu/topology')[0].get_properties()
+        topology = dict()
+        while topology_node is not None:
+            name = topology_node.get_name()
+            topology[name] = topology_node.getContent()
+            topology_node = topology_node.get_next()
+
+        keys = ['cores', 'sockets', 'threads']
+        tkeys = topology.keys()
+        if set(tkeys) != set(keys):
+            msg = _('Invalid xml: topology (%s) must have %s')
+            raise exception.Invalid(msg % (str(topology), ', '.join(keys)))
+
+        feature_nodes = xml.xpathEval('//cpu/feature')
+        features = list()
+        for node in feature_nodes:
+            feature_name = node.get_properties().getContent()
+            features.append(feature_name)
+
+        template = ("""{"arch":"%s", "model":"%s", "vendor":"%s", """
+                    """"topology":{"cores":"%s", "threads":"%s", """
+                    """"sockets":"%s"}, "features":[%s]}""")
+        c = topology['cores']
+        s = topology['sockets']
+        t = topology['threads']
+        f = ['"%s"' % x for x in features]
+        cpu_info = template % (arch, model, vendor, c, t, s, ', '.join(f))
+        return cpu_info
+
     def block_stats(self, instance_name, disk):
         """
         Note that this function takes an instance name, not an Instance, so
@@ -848,6 +933,208 @@
     def refresh_security_group_members(self, security_group_id):
         self.firewall_driver.refresh_security_group_members(security_group_id)

+    def compare_cpu(self, cpu_info):
+        """
+        Check whether the host cpu is compatible with the cpu given by xml.
+        "xml" must be a part of libvirt.openReadonly().getCapabilities().
+        Return values follow virCPUCompareResult;
+        live migration is possible only if the return value is > 0.
+
+        'http://libvirt.org/html/libvirt-libvirt.html#virCPUCompareResult'
+        """
+        msg = _('Checking cpu_info: instance was launched on this cpu.\n%s')
+        LOG.info(msg % cpu_info)
+        dic = json.loads(cpu_info)
+        xml = str(Template(self.cpuinfo_xml, searchList=dic))
+        msg = _('to xml...\n%s')
+        LOG.info(msg % xml)
+
+        url = 'http://libvirt.org/html/libvirt-libvirt.html'
+        url += '#virCPUCompareResult\n'
+        msg = 'CPUs are not compatible.\n'
+        msg += 'result: %d\n'
+        msg += 'Refer to %s'
+        msg = _(msg)
+
+        # If an unknown character exists in the xml, libvirt complains.
+        try:
+            ret = self._conn.compareCPU(xml, 0)
+        except libvirt.libvirtError, e:
+            LOG.error(_('compareCPU() failed: %s') % e)
+            raise e
+
+        if ret <= 0:
+            raise exception.Invalid(msg % (ret, url))
+
+        return
+
+    def ensure_filtering_rules_for_instance(self, instance_ref):
+        """Set up the required filtering rules on the compute node,
+        and wait for their completion.
+        To migrate an instance, filtering rules for the hypervisor
+        and the firewall are required on the destination host.
+        (We wait only for the hypervisor filtering rules, since
+        firewall rules can be set up faster.)
+
+        Concretely, the methods below must be called.
+        - setup_basic_filtering (for nova-basic, etc.)
+        - prepare_instance_filter (for nova-instance-instance-xxx, etc.)
+
+        to_xml might seem necessary since it defines PROJNET and PROJMASK,
+        but libvirt migrates those values through migrateToURI(),
+        so it does not need to be called here.
+
+        Don't use a thread for this method, since migration should
+        not be started while the filtering rule setup operations
+        are not completed."""
+
+        # If no instance has ever launched on the destination host,
+        # basic filtering must be set up here.
+        self.nwfilter.setup_basic_filtering(instance_ref)
+        # Setting up nova-instance-instance-xx mainly.
+        self.firewall_driver.prepare_instance_filter(instance_ref)
+
+        # Wait for completion.
+        timeout_count = range(FLAGS.live_migration_timeout_sec * 2)
+        while len(timeout_count) != 0:
+            try:
+                filter_name = 'nova-instance-%s' % instance_ref.name
+                self._conn.nwfilterLookupByName(filter_name)
+                break
+            except libvirt.libvirtError:
+                timeout_count.pop()
+                if len(timeout_count) == 0:
+                    ec2_id = instance_ref['hostname']
+                    msg = _('Timeout migrating for %s (%s)')
+                    raise exception.Error(msg % (ec2_id, instance_ref.name))
+                time.sleep(0.5)
+
+    def live_migration(self, context, instance_ref, dest):
+        """
+        Spawn the live_migration operation in a green thread
+        to distribute the load.
+        """
+        greenthread.spawn(self._live_migration, context, instance_ref, dest)
+
+    def _live_migration(self, context, instance_ref, dest):
+        """Do live migration."""
+
+        # Do live migration.
+        try:
+            duri = FLAGS.live_migration_uri % dest
+
+            flaglist = FLAGS.live_migration_flag.split(',')
+            flagvals = [getattr(libvirt, x.strip()) for x in flaglist]
+            logical_sum = reduce(lambda x, y: x | y, flagvals)
+
+            bandwidth = FLAGS.live_migration_bandwidth
+
+            if self.read_only:
+                tmpconn = self._connect(self.libvirt_uri, False)
+                dom = tmpconn.lookupByName(instance_ref.name)
+                dom.migrateToURI(duri, logical_sum, None, bandwidth)
+                tmpconn.close()
+            else:
+                dom = self._conn.lookupByName(instance_ref.name)
+                dom.migrateToURI(duri, logical_sum, None, bandwidth)
+
+        except Exception, e:
+            id = instance_ref['id']
+            db.instance_set_state(context, id, power_state.RUNNING, 'running')
+            for v in instance_ref['volumes']:
+                db.volume_update(context,
+                                 v['id'],
+                                 {'status': 'in-use'})
+
+            raise e
+
+        # Waiting for completion of live_migration.
+        timer = utils.LoopingCall(f=None)
+
+        def wait_for_live_migration():
+            try:
+                state = self.get_info(instance_ref.name)['state']
+            except exception.NotFound:
+                timer.stop()
+                self._post_live_migration(context, instance_ref, dest)
+
+        timer.f = wait_for_live_migration
+        timer.start(interval=0.5, now=True)
+
+    def _post_live_migration(self, context, instance_ref, dest):
+        """
+        Post operations for live migration.
+        Mainly, database updating.
+        """
+        LOG.info(_('Post live migration operation started.'))
+        # Detaching volumes.
+        # (not necessary in the current version)
+
+        # Releasing vlan.
+        # (not necessary in the current implementation?)
+
+        # Releasing security group ingress rule.
+        if FLAGS.firewall_driver == \
+                'nova.virt.libvirt_conn.IptablesFirewallDriver':
+            try:
+                self.firewall_driver.unfilter_instance(instance_ref)
+            except KeyError:
+                pass
+
+        # Database updating.
+        ec2_id = instance_ref['hostname']
+
+        instance_id = instance_ref['id']
+        fixed_ip = db.instance_get_fixed_address(context, instance_id)
+        # Do not return even if fixed_ip is not found; otherwise
+        # the instance will never be accessible.
+        if fixed_ip is None:
+            logging.warn(_('fixed_ip is not found for %s') % ec2_id)
+        else:
+            db.fixed_ip_update(context, fixed_ip, {'host': dest})
+            network_ref = db.fixed_ip_get_network(context, fixed_ip)
+            db.network_update(context, network_ref['id'], {'host': dest})
+
+        try:
+            floating_ip \
+                = db.instance_get_floating_address(context, instance_id)
+            # Do not return even if floating_ip is not found; otherwise
+            # the instance will never be accessible.
+            if floating_ip is None:
+                logging.error(_('floating_ip is not found for %s') % ec2_id)
+            else:
+                floating_ip_ref = db.floating_ip_get_by_address(context,
+                                                                floating_ip)
+                db.floating_ip_update(context,
+                                      floating_ip_ref['address'],
+                                      {'host': dest})
+        except exception.NotFound:
+            logging.debug(_('%s does not have a floating_ip.') % ec2_id)
+        except:
+            msg = _('Live migration: unexpected error: '
+                    '%s cannot inherit the floating ip.') % ec2_id
+            logging.error(msg)
+
+        # Restore instance/volume state.
+        db.instance_update(context,
+                           instance_id,
+                           {'state_description': 'running',
+                            'state': power_state.RUNNING,
+                            'host': dest})
+
+        for v in instance_ref['volumes']:
+            db.volume_update(context,
+                             v['id'],
+                             {'status': 'in-use'})
+
+        logging.info(_('Live migration of %s to %s finished successfully.')
+                     % (ec2_id, dest))
+        msg = _('Known error: the error below normally occurs.\n'
+                'Just check whether the instance was migrated successfully.\n'
+                'libvir: QEMU error: Domain not found: no domain '
+                'with matching name.')
+        logging.info(msg)
+

 class FirewallDriver(object):
     def prepare_instance_filter(self, instance):

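The migration flags are composed at runtime from the flag string, as in _live_migration() above; a standalone sketch (reduce is a builtin on the Python 2.x this code targets, and both constants are real libvirt ones):

    import libvirt

    flag_str = "VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER"
    flagvals = [getattr(libvirt, f.strip()) for f in flag_str.split(',')]
    logical_sum = reduce(lambda x, y: x | y, flagvals)
    # dom.migrateToURI("qemu+tcp://dest/system", logical_sum, None, 0)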
=== modified file 'nova/virt/xenapi_conn.py'
--- nova/virt/xenapi_conn.py 2011-01-17 17:16:36 +0000
+++ nova/virt/xenapi_conn.py 2011-01-18 16:24:10 +0000
@@ -209,6 +209,36 @@
                 'username': FLAGS.xenapi_connection_username,
                 'password': FLAGS.xenapi_connection_password}

+    def get_cpu_info(self):
+        """This method is supported only by libvirt."""
+        return
+
+    def get_vcpu_number(self):
+        """This method is supported only by libvirt."""
+        return -1
+
+    def get_memory_mb(self):
+        """This method is supported only by libvirt."""
+        return -1
+
+    def get_local_gb(self):
+        """This method is supported only by libvirt."""
+        return -1
+
+    def get_hypervisor_type(self):
+        """This method is supported only by libvirt."""
+        return
+
+    def get_hypervisor_version(self):
+        """This method is supported only by libvirt."""
+        return -1
+
+    def compare_cpu(self, xml):
+        raise NotImplementedError('This method is supported only by libvirt.')
+
+    def live_migration(self, context, instance_ref, dest):
+        raise NotImplementedError('This method is supported only by libvirt.')
+

 class XenAPISession(object):
     """The session to invoke XenAPI SDK calls"""

=== modified file 'nova/volume/driver.py'
--- nova/volume/driver.py 2011-01-13 12:02:14 +0000
+++ nova/volume/driver.py 2011-01-18 16:24:10 +0000
@@ -122,7 +122,7 @@
         """Removes an export for a logical volume."""
         raise NotImplementedError()

-    def discover_volume(self, volume):
+    def discover_volume(self, _context, volume):
         """Discover volume on a remote host."""
         raise NotImplementedError()

@@ -184,15 +184,35 @@
         self._try_execute("sudo vblade-persist destroy %s %s" %
                           (shelf_id, blade_id))

-    def discover_volume(self, _volume):
+    def discover_volume(self, context, volume):
         """Discover volume on a remote host."""
         self._execute("sudo aoe-discover")
         self._execute("sudo aoe-stat", check_exit_code=False)
+        shelf_id, blade_id = self.db.volume_get_shelf_and_blade(context,
+                                                                volume['id'])
+        return "/dev/etherd/e%s.%s" % (shelf_id, blade_id)

     def undiscover_volume(self, _volume):
         """Undiscover volume on a remote host."""
         pass

+    def check_for_export(self, context, volume_id):
+        """Make sure the volume is exported."""
+        (shelf_id,
+         blade_id) = self.db.volume_get_shelf_and_blade(context,
+                                                        volume_id)
+        (out, _err) = self._execute("sudo vblade-persist ls --no-header")
+        exists = False
+        for line in out.split('\n'):
+            param = line.split(' ')
+            if len(param) == 6 and param[0] == str(shelf_id) \
+                    and param[1] == str(blade_id) and param[-1] == "run":
+                exists = True
+                break
+        if not exists:
+            logging.warning(_("vblade process for e%s.%s isn't running.")
+                            % (shelf_id, blade_id))
+

 class FakeAOEDriver(AOEDriver):
     """Logs calls instead of executing."""
@@ -276,7 +296,7 @@
         iscsi_portal = location.split(",")[0]
         return (iscsi_name, iscsi_portal)

-    def discover_volume(self, volume):
+    def discover_volume(self, _context, volume):
         """Discover volume on a remote host."""
         iscsi_name, iscsi_portal = self._get_name_and_portal(volume['name'],
                                                              volume['host'])
@@ -364,7 +384,7 @@
         """Removes an export for a logical volume"""
         pass

-    def discover_volume(self, volume):
+    def discover_volume(self, _context, volume):
         """Discover volume on a remote host"""
         return "rbd:%s/%s" % (FLAGS.rbd_pool, volume['name'])

@@ -413,7 +433,7 @@
         """Removes an export for a logical volume"""
         pass

-    def discover_volume(self, volume):
+    def discover_volume(self, _context, volume):
         """Discover volume on a remote host"""
         return "sheepdog:%s" % volume['name']


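check_for_export() above expects each line of "vblade-persist ls --no-header" to split into exactly six space-separated fields, with the shelf and blade ids first and the process state last; a sketch against an invented sample line:

    line = "1 2 eth0 /dev/nova-volumes/volume-00000001 1073741824 run"
    param = line.split(' ')
    exported = (len(param) == 6 and param[0] == '1'
                and param[1] == '2' and param[-1] == 'run')  # True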
=== modified file 'nova/volume/manager.py'
--- nova/volume/manager.py 2011-01-04 05:23:35 +0000
+++ nova/volume/manager.py 2011-01-18 16:24:10 +0000
@@ -138,7 +138,7 @@
         if volume_ref['host'] == self.host and FLAGS.use_local_volumes:
             path = self.driver.local_path(volume_ref)
         else:
-            path = self.driver.discover_volume(volume_ref)
+            path = self.driver.discover_volume(context, volume_ref)
         return path

     def remove_compute_volume(self, context, volume_id):
@@ -149,3 +149,10 @@
             return True
         else:
             self.driver.undiscover_volume(volume_ref)
+
+    def check_for_export(self, context, instance_id):
+        """Make sure the volumes attached to the instance are exported."""
+        if FLAGS.volume_driver == 'nova.volume.driver.AOEDriver':
+            instance_ref = self.db.instance_get(context, instance_id)
+            for v in instance_ref['volumes']:
+                self.driver.check_for_export(context, v['id'])

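check_for_export() is meant to be driven remotely before migration starts; a hedged sketch of the call shape, following the compare_cpu rpc pattern in the scheduler driver above (the caller and the exact arguments are assumptions, not shown in this diff):

    rpc.call(context,
             db.queue_get_for(context, FLAGS.volume_topic,
                              volume_ref['host']),
             {"method": "check_for_export",
              "args": {"instance_id": instance_id}})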
=== modified file 'setup.py'
--- setup.py 2011-01-10 19:26:38 +0000
+++ setup.py 2011-01-18 16:24:10 +0000
@@ -34,6 +34,7 @@
         version_file.write(vcsversion)


+
 class local_BuildDoc(BuildDoc):
     def run(self):
         for builder in ['html', 'man']: