Merge lp:~nttdata/nova/live-migration into lp:~hudson-openstack/nova/trunk

Proposed by Kei Masumoto
Status: Merged
Merged at revision: 799
Proposed branch: lp:~nttdata/nova/live-migration
Merge into: lp:~hudson-openstack/nova/trunk
Diff against target: 3335 lines (+2788/-10)
21 files modified
bin/nova-manage (+88/-0)
contrib/nova.sh (+1/-0)
nova/compute/manager.py (+252/-1)
nova/db/api.py (+59/-0)
nova/db/sqlalchemy/api.py (+121/-0)
nova/db/sqlalchemy/migrate_repo/versions/010_add_live_migration.py (+83/-0)
nova/db/sqlalchemy/models.py (+38/-0)
nova/scheduler/driver.py (+237/-0)
nova/scheduler/manager.py (+52/-0)
nova/service.py (+3/-0)
nova/tests/test_compute.py (+294/-0)
nova/tests/test_scheduler.py (+622/-1)
nova/tests/test_service.py (+41/-0)
nova/tests/test_virt.py (+223/-3)
nova/tests/test_volume.py (+195/-0)
nova/virt/cpuinfo.xml.template (+9/-0)
nova/virt/fake.py (+21/-0)
nova/virt/libvirt_conn.py (+369/-0)
nova/virt/xenapi_conn.py (+21/-0)
nova/volume/driver.py (+52/-4)
nova/volume/manager.py (+7/-1)
To merge this branch: bzr merge lp:~nttdata/nova/live-migration
Reviewer Review Type Date Requested Status
Ken Pepple (community) Approve
Thierry Carrez (community) Approve
Jay Pipes (community) Approve
Rick Harris (community) Approve
Brian Schott (community) Approve
termie (community) Needs Fixing
Review via email: mp+49699@code.launchpad.net

Description of the change

Main changes from previous merge request:

1. Added test code.
2. Bug fixes:
   - Improper resource checking
     (memory checking is sufficient for the current version).
   - Retrying when live migration requests arrive in quick succession;
     in this case iptables complains, so we retry.
3. iSCSI EBS volume checking:
   - Added nova.volume.driver.ISCSIDriver.check_for_export.
   - Changed nova.compute.post_live_migration to log out from the iSCSI server.

Please feel free to give us comments.
Thanks in advance.

Revision history for this message
Rick Harris (rconradharris) wrote :

Just a few nits :)

> + def describeresource(self, host):
> + def updateresource(self, host):

These should probably be `describe_resource` and `update_resource` respectively.

3083 +def mktmpfile(dir):
3084 + """create tmpfile under dir, and return filename."""
3085 + filename = datetime.datetime.utcnow().strftime('%Y%m%d%H%M%S')
3086 + fpath = os.path.join(dir, filename)
3087 + open(fpath, 'a+').write(fpath + '\n')
3088 + return fpath

It would probably be better to use the `tempfile` module in the Python stdlib.
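For illustration, a minimal sketch of such a replacement, assuming the same `dir` argument and path-returning convention as the original helper:

    import os
    import tempfile

    def mktmpfile(dir):
        """Create a uniquely named file under dir and return its path."""
        fd, fpath = tempfile.mkstemp(dir=dir)
        os.write(fd, fpath + '\n')  # keep the same file contents as the original helper
        os.close(fd)
        return fpath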

3091 +def exists(filename):
3092 + """check file path existence."""
3093 + return os.path.exists(filename)
3094 +
3095 +
3096 +def remove(filename):
3097 + """remove file."""
3098 + return os.remove(filename)

These wrapper functions seem unnecessary; it would probably be better to just use os.path.exists and os.remove directly in the code.

If you need a stub-point for testing, you can stub out `os.path` and `os` directly.

+ LOG.info('post_live_migration() is started..')

Needs i18n _('post_live...') treatment.
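For example, the gettext-wrapped form would be:

    LOG.info(_('post_live_migration() is started..'))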

533 + #services.create_column(services_vcpus)
534 + #services.create_column(services_memory_mb)
535 + #services.create_column(services_local_gb)
536 + #services.create_column(services_vcpus_used)
537 + #services.create_column(services_memory_mb_used)
538 + #services.create_column(services_local_gb_used)
539 + #services.create_column(services_hypervisor_type)
540 + #services.create_column(services_hypervisor_version)
541 + #services.create_column(services_cpu_info)

Was this left in by mistake?

902 + print 'manager.attrerr', e

Probably should be logging here, rather than printing to stdout.

review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Rick,

Thanks for review!
I think I have addressed all of your comments.

Additional changes were made to nova.compute.api.create and nova.image.s3.
They are not related to live migration, but instances that have a kernel and ramdisk cannot launch without them. I never changed these files, and before raising this merge request I not only ran run_tests.sh but also confirmed that instances migrated successfully on a real server, so I have no idea when these changes were included...
Anyway, I think this change is necessary. Could you also please review it?

Kind regards,
Kei


Revision history for this message
Brian Schott (bfschott) wrote :

We're very interested in this capability, so we're looking forward to it. A few comments.

1. Current branch conflicts with lp:nova trunk.

+N nova/db/sqlalchemy/migrate_repo/versions/003_add_label_to_networks.py
+N nova/db/sqlalchemy/migrate_repo/versions/004_add_zone_tables.py

fix: bschott@island100:~/source/nova/live-migration/nova/db/sqlalchemy/migrate_repo/versions$ bzr rename 003_cactus.py 005_add_instance_migration.py

2. Should these be in their own table? That is a lot of fields to add to the Service table directly, since this is a table that has entries for every service type. I was thinking about adding a ComputeService (compute_services) table for our heterogeneous compute cluster.

627 + # The below items are compute node only.
628 + # None is inserted for other service.
629 + vcpus = Column(Integer, nullable=True)
630 + memory_mb = Column(Integer, nullable=True)
631 + local_gb = Column(Integer, nullable=True)
632 + vcpus_used = Column(Integer, nullable=True)
633 + memory_mb_used = Column(Integer, nullable=True)
634 + local_gb_used = Column(Integer, nullable=True)
635 + hypervisor_type = Column(Text, nullable=True)
636 + hypervisor_version = Column(Integer, nullable=True)

3. We can use the "arch" sub-field below for our project. Can we talk about adding accelerator_info (for GPUs, FPGAs, or other co-procesors) and possibly network_info for details on the physical network interface?

    # Note(masumotok): Expected Strings example:
    #
    # '{"arch":"x86_64", "model":"Nehalem",
    # "topology":{"sockets":1, "threads":2, "cores":3},
    # features:[ "tdtscp", "xtpr"]}'
    #
    # Points are "json translatable" and it must have all
    # dictionary keys above.
    cpu_info = Column(Text, nullable=True)

bschott@island100:~/source/nova/live-migration$ bzr merge lp:nova
+N nova/api/openstack/zones.py
+N nova/db/sqlalchemy/migrate_repo/versions/003_add_label_to_networks.py
+N nova/db/sqlalchemy/migrate_repo/versions/004_add_zone_tables.py
+N nova/tests/api/openstack/test_common.py
+N nova/tests/api/openstack/test_zones.py
 M .mailmap
 M Authors
 M HACKING
 M MANIFEST.in
 M bin/nova-manage
 M locale/nova.pot
 M nova/api/ec2/cloud.py
 M nova/api/openstack/__init__.py
 M nova/api/openstack/auth.py
 M nova/api/openstack/common.py
 M nova/api/openstack/servers.py
 M nova/auth/ldapdriver.py
 M nova/auth/novarc.template
 M nova/compute/api.py
 M nova/compute/manager.py
 M nova/compute/power_state.py
 M nova/context.py
 M nova/db/api.py
 M nova/db/sqlalchemy/api.py
 M nova/db/sqlalchemy/migrate_repo/versions/001_austin.py
 M nova/db/sqlalchemy/migrate_repo/versions/002_bexar.py
 M nova/db/sqlalchemy/migration.py
 M nova/db/sqlalchemy/models.py
 M nova/flags.py
 M nova/log.py
 M nova/network/linux_net.py
 M nova/network/manager.py
 M nova/rpc.py
 M nova/tests/api/openstack/__init__.py
 M nova/tests/api/openstack/test_servers.py
 M nova/tests/test_api.py
 M nova/tests/test_compute.py
 M nova/tests/test_log.py
 M nova/tests/test_xenapi.py
 M nova/twistd.py
 M nova/u...


review: Needs Fixing
Revision history for this message
termie (termie) wrote :

Hello :) I think the code looks very good (the tests especially appear thorough); however, there are many places for style cleanup. You may want to read the part of the HACKING file about docstrings before going on:

in bin/nova-api:

looks like utils.default_flagfile() should be in the __main__ function rather than at the top of the file.

in bin/nova-dhcpbridge:

looks like there is a leftover debugging statement ('open...')

in bin/nova-manage:

please update the docstring for 'live_migration' to describe what it will do (something like "Migrates a running instance to a new machine." is fine)

for the long "if FLAGS.volume_driver..." line, please instead put the line in parens like so:

if (FLAGS.volume_driver != 'nova.volume.driver.AOEDriver' and
    FLAGS.volume_driver != 'nova.volume.driver.ISCSIDriver'):

When generating the "msg" you can do something similar:

msg = ('Migration of %s initiated. Checking its progress'
       ' using euca-describe-instances.') % ec2_id

in the docstring for describe_resource, please capitalize the first word (Describe...)

the comment at line 83 ("Checking result msg format is necessary...") is a little unclear, are you saying:

It will be necessary to check the result msg format when this feature is included in the API.

if so, you could say:

TODO(masumotok): It will be necessary to check the result msg...

Please capitalize the first letter of the docstring for update_resource

in nova/compute/manager.py:

the triple quotes are not necessary around the description of the 'flags.DEFINE_string' line, single quotes are fine.

flags.DEFINE_string looks like it should be flags.DEFINE_integer

the docstring for compare_cpu has an extra space at the beginning that is not necessary.

please capitalize the first letter of the docstring for mktmpfile

if you are only writing to the tmpfile for debugging purposes, perhaps that should be a logging.debug call?

please add a period to the end of the docstring for update_available_resource

in the pre_live_migration method, there should be an apostrophe in the word "doesnt" (doesn't)

may as well capitalize the first letter in the Bridge settings comment ('Call this method...')

in the message about failing a retry you can remove the 'th.' part, and change 'fail' to 'failed', it still doesn't read perfectly but pluralization isn't really necessary for log messages.

in the live_migration method, you can delete the line about #@exception.wrap_exception if you are going to comment it out.

also, please capitalize the first letter of the docstring.

in post_live_migration, please move the first line to the first line of the docstring... and reformat the string a bit like so:

"""Post operations for live migration.

Mainly, database updating.

"""

also in post_live_migration you check 'None == some_variable' a couple of times; in Python we don't usually do this, because it is impossible to write 'if some_variable = None' (the assignment operation is not an expression), which means you don't need to be extra safe about which side the variable is on, and having the variable first is easier to read (at least in English).
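For illustration (the variable and function names are just placeholders), the usual style is the variable first, and `is` for None checks:

    if some_variable is None:     # preferred
        handle_missing_value()

    if None == some_variable:     # works, but harder to read
        handle_missing_value()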

also a bit further down you don't need to use triple ...


review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Brian,
Thanks for the review!

I think I have fixed everything based on your comments
(the branch will be updated soon).
Also, regarding point 3 below,

> 3. We can use the "arch" sub-field below for our project. Can we talk about adding
> accelerator_info (for GPUs, FPGAs, or other co-procesors) and possibly network_info
> for details on the physical network interface?

We use the cpu_info column to store the argument to compareCPU() in virConnect.
You can get an example by following the procedure below.

# python
>>> import libvirt
>>> conn = libvirt.openReadOnly()
>>> conn.getCapabilities()

Once you follow the above, you get XML. We cut out the <cpu>...</cpu> element and store it in the db.
I think the result will look different in your hardware environment.
FPGA/GPU info may then also be included, I suppose.
If libvirt doesn't find any FPGA/GPU info, please let me know.
I will have to give it more thought..


Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi termie,

Thank you very much for reviewing my branch!
I have fixed everything based on your comments.
I hope I haven't missed anything...

Please let me know if you have further comments.

Regards,
Kei


Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Rick,

Thanks again for reviewing my branch.
Since your review, I have been through two other reviewers' comments and things have improved.
Hopefully we can now proceed with merging.
Also, please let me know if I have missed any of your points.

Regards,
Kei


Revision history for this message
Brian Schott (bfschott) wrote :

I've been reviewing your changes. Thank you for doing the compute service table changes. I plan to do some integration testing next week with our hpc-trunk branch.

Brian Schott
<email address hidden>


Revision history for this message
Kei Masumoto (masumotok) wrote :

To All reviewers:

It has been a week since I last did "bzr merge lp:nova" on this branch, and I think it's better to update again to avoid any conflicts.
If you are in the middle of reviewing again, please wait for a while. I will e-mail again once I have finished. A few hours should be enough, hopefully.

Regards,
Kei Masumoto

Revision history for this message
Kei Masumoto (masumotok) wrote :

To All reviewers:

I have finished merging trunk rev 752.
A few changes were necessary. Please check the comments below.

> 1. merged trunk rev749
> 2. rpc.call returns '/' as '\/', so nova.compute.manager.mktmpfile, nova.compute.manager.confirm_tmpfile, nova.scheduler.driver.Scheduler.mounted_on_same_shared_storage are modified to follow this change.
> 3. nova.tests.test_virt.py is modified so that other teams' modifications are easily detected, since another team is using nova.db.sqlalchemy.models.ComputeService.

If you have further comments, or if I missed your point, please let me know.


Revision history for this message
Rick Harris (rconradharris) wrote :

Hi Kei! Improvements look good, thanks for the updates. Here are my round-two review notes:

> def mktmpfile(self, context):

It might be a good idea to rename these functions. Right now, the name
confirm_tmpfile contains implementation details but doesn't provide a good
hint as to what it's used for.

Might be better as:

    create_shared_storage_test_file

Also, what happens if the destination isn't on shared storage? We've deposited
the test file; will that ever be cleaned up?

Perhaps in pseudo-code it should be something like:

    def mounted_on_same_shared_storage(source, dest):

        create_shared_storage_test_file(dest)
        try:
            # Unlike confirm_tmpfile, this doesn't delete the test file; that
            # is left to cleanup_shared_storage_test_file
            test_file_exists = check_shared_storage_test_file(source)
        finally:
            # Regardless of whether we find it, we always delete it
            cleanup_shared_storage_test_file(dest)
        return test_file_exists

> ======================================================================
> ERROR: Failure: ImportError (No module named libvirt)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> File "/Library/Python/2.6/site-packages/nose-0.11.3-py2.6.egg/nose/loader.py", line 382, in loadTestsFromName
> addr.filename, addr.module)
> File "/Library/Python/2.6/site-packages/nose-0.11.3-py2.6.egg/nose/importer.py", line 39, in importFromPath
> return self.importFromDir(dir_path, fqname)
> File "/Library/Python/2.6/site-packages/nose-0.11.3-py2.6.egg/nose/importer.py", line 86, in importFromDir
> mod = load_module(part_fqname, fh, filename, desc)
> File "/Users/rick/Documents/code/openstack/nova/live-migration/nova/tests/test_virt.py", line 17, in <module>
> import libvirt
> ImportError: No module named libvirt
>
> ----------------------------------------------------------------------

Libvirt is required for the tests to run.

Since not everyone is going to have libvirt on their machine, we should probably use
the 'LazyImport pattern' here so that we only import libvirt if it's actually
going to be used -- in the case of unit tests, only the FakeLibvirt should be
used.
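A rough sketch of that lazy-import idea (this is not the actual patch; the module-level name and the existing LibvirtConnection class are assumed):

    libvirt = None

    def get_connection(read_only=False):
        """Import libvirt only when a real connection is requested."""
        global libvirt
        if libvirt is None:
            libvirt = __import__('libvirt')
        return LibvirtConnection(read_only)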

> 232 + for v in instance_ref['volumes']:

In general, it's better to use variable names with more than one letter -- it aids
readability and makes it a little easier to 'grep' around the code. In this
case, 'volume' seems like the right choice. There are a few other instances
throughout the code where I think single-letter variable names should
probably be expanded:

    + p = os.path.join(FLAGS.instances_path, filename)

On the other hand, it's fine (and idiomatic) for exception handler blocks to use `e` as the
variable for their exception. Like:

+ except exception.ProcessExecutionError, e:

> 697 +compute_services = Table('compute_services', meta,

'compute_services' sounds a little too much like the 'compute worker service'
that we already have. This might be clearer if renamed

'compute_nodes' or 'compute_hosts'.

The 'compute_node' would represent the physical machine, while the
'compute-service' would represent the logical endpoin...


review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Rick!
Thanks for review!

I agree with all of your comments.
I will fix them soon...

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi, Rick!

Fixed based on your comments.
One thing to note: I removed the libvirt-dependent tests in test_virt.py for developers without a libvirt environment.
(At first I was trying to mock/use FakeLibvirt as much as I could, but eventually the tests became not very meaningful :) )

Looking forward to your feedback..

Kei

Revision history for this message
Rick Harris (rconradharris) wrote :

Hi Kei.

I ran the tests on my Mac OS X machine and received 1 failure. Looks like we might need to mock out the get_cpu_info portion of the driver.

> but eventually tests become not so meaningful

Agreed, in terms of "does this really work?", the unit tests aren't a substitute for real functional/integration testing.

However, even with lots of code faked out, we can still get some value from the tests in terms of catching small issues: passing the wrong number of arguments, syntax errors, variables being of the wrong type. These are things that unit tests are really good at catching. And since we don't have the benefit of a compiler pass, these unit tests really help cut down on the number of these problems that make it into trunk.

======================================================================
ERROR: test_update_available_resource_works_correctly (nova.tests.test_virt.LibvirtConnTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/rick/Documents/code/openstack/nova/live-migration/nova/tests/test_virt.py", line 290, in test_update_available_resource_works_correctly
    conn.update_available_resource(self.context, 'dummy')
  File "/Users/rick/Documents/code/openstack/nova/live-migration/nova/virt/libvirt_conn.py", line 1042, in update_available_resource
    dic = {'vcpus': self.get_vcpu_total(),
  File "/Users/rick/Documents/code/openstack/nova/live-migration/nova/virt/libvirt_conn.py", line 861, in get_vcpu_total
    return open('/proc/cpuinfo').read().count('processor')
IOError: [Errno 2] No such file or directory: '/proc/cpuinfo'
-------------------- >> begin captured logging << --------------------
2011-03-03 12:58:20,747 AUDIT nova.auth.manager [-] Created user fake (admin: True)
2011-03-03 12:58:20,750 AUDIT nova.auth.manager [-] Created project fake with manager fake
--------------------- >> end captured logging << ---------------------

======================================================================
ERROR: test_update_available_resource_works_correctly (nova.tests.test_virt.LibvirtConnTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/rick/Documents/code/openstack/nova/live-migration/nova/tests/test_virt.py", line 325, in tearDown
    super(LibvirtConnTestCase, self).tearDown()
  File "/Users/rick/Documents/code/openstack/nova/live-migration/nova/test.py", line 91, in tearDown
    self.mox.VerifyAll()
  File "build/bdist.macosx-10.6-universal/egg/mox.py", line 286, in VerifyAll
    mock_obj._Verify()
  File "build/bdist.macosx-10.6-universal/egg/mox.py", line 506, in _Verify
    raise ExpectedMethodCallsError(self._expected_calls_queue)
ExpectedMethodCallsError: Verify: Expected methods never called:
  0. get_cpu_info.__call__() -> 'cpuinfo'
-------------------- >> begin captured logging << --------------------
2011-03-03 12:58:20,747 AUDIT nova.auth.manager [-] Created user fake (admin: True)
2011-03-03 12:58:20,750 AUDIT nova.auth.manager [-] Created project fake with manager fake
2011-03-03 12:58:20,795 AUDIT nova.auth.manager [-] Deleting project fake
2011-03-03 ...


Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Rick.

Thanks for your response.

> I ran the tests on my Mac OS X machine and received 1 failure. Looks like we might
> need to mock out the get_cpu_info portion of the driver.

Thanks for this information! That is very helpful. OK, perhaps enhancing the exception handling in get_cpu_info is better than mocking it out, since get_cpu_info() is called whenever nova-compute launches.

if sys.platform.upper() != 'LINUX2':
    return 0
else:
    return open('/proc/cpuinfo').read().count('processor')

Please let me know if you have any comments at this point.

> However, even with lots of code faked out, we can still get some value
> from the tests in terms of catching small issues: passing wrong number arguments,
> syntax errors, variables being of the wrong type. These are things that unit
> tests are really good at catching. And since we don't have the benefit of
> a compiler-pass, these unit tests really help cut down on the number of
> these problems that make it into trunk.

Understood. Actually, I was a bit confused about whether I could include unit-test-like tests in trunk.
Let me bring the deleted test code back into this branch.

I will fix this soon...

Thanks again!
Kei

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Rick,

I have fixed my branch based on your comments.
I think I already explained the main changes in my previous e-mail - please review it.

Thanks,
Kei


Revision history for this message
Brian Schott (bfschott) wrote :

In case reviewers are hitting this:

---
bzr rename nova/db/sqlalchemy/migrate_repo/versions/009_add_instance_migrations.py nova/db/sqlalchemy/migrate_repo/versions/010_add_instance_migrations.py
--

I suggest you create flags:
compute_vcpus_total
compute_memory_mb_total
compute_local_gb_total

They could be used to specify less than the total resources (like suppose you only want to dedicate 1 core to VMs or half your host memory). Also, some Linux distros don't have /proc, or at least I think the /proc fs is still optional in the kernel.

if FLAGS.compute_vcpus_total:
    return FLAGS.compute_vcpus_total
else:
    try:
        return open('/proc/cpuinfo').read().count('processor')
    except IOError:  # (i forget what goes here :-)
        return 1

review: Approve
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Brian,

Thanks for approval.

> bzr rename nova/db/sqlalchemy/migrate_repo/versions/009_add_instance_migrations.py
> nova/db/sqlalchemy/migrate_repo/versions/010_add_instance_migrations.py
OK. I will fix this soon.

> They can be used to specify resources less than total
> (like suppose you only want to dedicate 1 core to VMs or half your host memory)?
> Also, some Linux distros don't have /proc, or at least
> I think /proc fs is still optional in the kernel.
>
> if FLAGS.compute_vcpus_total>
> return FLAGS.compute_vcpus_total
> else:
> try
> open('/proc/cpuinfo').read().count('processors')
> except ... (i forget what goes here :-)
> return 1

I think I understand your point.
Currently, I use a multi-platform library to calculate the number of CPUs and the amount of disk, so there is no problem on that point.
Regarding memory, I tried a multi-platform library like psutil (http://code.google.com/p/psutil/), but only an older version is available on Ubuntu Maverick. So I have to wait for a newer version, and I agree with your suggestion for the memory calculation.

If "like suppose you only want to dedicate 1 core to VMs or half your host memory?" is your point, please specify which point you mean. I intend neither "only 1 core for VMs" nor "half of host memory".

Again, thanks for approval.

Kei

Revision history for this message
Brian Schott (bfschott) wrote :

Sorry, I meant it would be good for cloud server administrators to be able to specify how many cores and how much disk and memory are dedicated to nova.

If someone has a cloud on a laptop or office computers, they might want to reserve some capacity for the host operating system.

Or an admin might reserve one core and a gigabyte of memory for a Swift storage server. Not all configs are dedicated compute blades.

Looking forward to seeing this merged soon!

Sent from my iPhone


Revision history for this message
Kei Masumoto (masumotok) wrote :

Brian, thanks for the explanation! I understand your comment; "reserving some cpu/memory/disk" sounds like a good idea. I personally agree with implementing it in nova, but unfortunately I didn't mention "reserving" when I got blueprint approval. In addition, it is not only a live-migration topic but a nova-wide one: an admin would also want "reserving" when just launching instances or creating volumes, wouldn't they?
Therefore, I think it is better to discuss it at the next design summit. (Actually, I've heard from someone that the feature is necessary and that it should be implemented in the scheduler. I am sure there will be many discussions :)
Thanks again!
Kei

Revision history for this message
Brian Schott (bfschott) wrote :

No problem. I'll propose a follow-on blueprint and link it to this one with a more detailed approach.

Brian Schott
<email address hidden>


Revision history for this message
Rick Harris (rconradharris) wrote :

Nice work, Kei.

Some small nits:

> 194 + os.fdopen(fd, 'w+').close()

`os.close` should suffice since the fd that mkstemp returns is already open.
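That is, something along these lines (the directory is illustrative):

    fd, tmp_file = tempfile.mkstemp(dir=FLAGS.instances_path)
    os.close(fd)  # the fd returned by mkstemp() is already open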

> 1007 + ec2_id = instance_ref['hostname']

Doesn't appear to be used.

review: Approve
Revision history for this message
Jay Pipes (jaypipes) wrote :

Hi Kei!

All tests are passing locally for me, which is great, and the code looks very solid. Very good use of mox in your test cases.

Just a few suggestions, mostly small style stuff...

1)

There are a number of places you use __setitem__, like so:

1223 + instance_ref.__setitem__('id', 1)

it's easier to just write:

instance_ref['id'] = 1

2) Tiny style/i18n/English stuff

Please do not take offence at me correcting your English phrases! :)

62 + print 'Unexpected error occurs'

Please i18n that. Also, the English saying would be "An unexpected error has occurred."

29 + raise exception.Error(msg)

There are 2 spaces after raise. Only 1 needed :)

92 + raise exception.Invalid(_('%s does not exists.') % host)

English saying would be "%s does not exist" (without the s on exist)

146 +flags.DEFINE_string('live_migration_retry_count', 30,

Might want to use DEFINE_integer to ensure an integer is used as the flag value...

192 + LOG.debug(_("Creating tmpfile %s to notify to other "
193 + "compute node that they mounts same storage.") % tmp_file)

s/node that they mounts same storage/nodes that they should mount the same storage/

248 + msg = _("%(instance_id)s(%(ec2_id)s) does'nt have fixed_ip")

s/does'nt have/does not have/

365 + LOG.info(_('floating_ip is not found for %s'), i_name)
73 + LOG.info(_('Floating_ip is not found for %s'), i_name)

s/floating_ip is not found for/No floating IP was found for/

381 + LOG.info(_('Migrating %(i_name)s to %(dest)s finishes successfully.')

s/finishes successfully/finished successfully/

383 + LOG.info(_("The below error is normally occurs. "
384 + "Just check if instance is successfully migrated.\n"
385 + "libvir: QEMU error : Domain not found: no domain "
386 + "with matching name.."))

I would say this, instead:

LOG.info(_("You may see the error \"libvirt: QEMU error: "
           "Domain not found: no domain with matching name.\" "
           "This error can be safely ignored.")

547 + raise exception.NotFound(_("%s does not exist or not "
548 + "compute node.") % host)

s/or not compute node/or is not a compute node/

1040 + raise exception.NotEmpty(_("%(ec2_id)s is not capable to "
1041 + "migrate %(dest)s (host:%(mem_avail)s "

I'd rewrite that as "Unable to migrate %(ec2_id)s to destination: %(dest)s ..."

1073 + logging.error(_("Cannot comfirm tmpfile at %(ipath)s is on "

s/comfirm/confirm/ :)

You use this line:

global ghost, gbinary, gmox

in 2 places in the nova/tests/test_service.py file:

2198 + global ghost, gbinary, gmox
2238 + global ghost, gbinary, gmox

However, the actual variable names are:

2187 +# temporary variable to store host/binary/self.mox
2188 +# from each method to fake class.
2189 +global_host = None
2190 +global_binary = None
2191 +global_mox = None

You will want to make those consistent I believe, otherwise I'm not sure what gbinary, ghost, and gmox are going to refer to ;)

2385 + def tes1t_update_available_resource_works_correctly(self):

s/tes1t/test :) The misspelling is causing this test case not to be run. (It passes, BTW, when you fix the typo...I checked. :) )

2658 + msg = _("""Cannot confirm exported volume id:%(volume_id)s."""
2659 + """vblade process...


review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Thanks for the review, Rick! I'll fix it soon..

Kei

Revision history for this message
Kei Masumoto (masumotok) wrote :

Thanks Jay! Your statement at the last IRC meeting was very helpful for me. Everyone reviewed my branch again and it got Approves!
I'll fix my branch based on your comments soon...

One note:
> You use this line:
>
> global ghost, gbinary, gmox
>
> in 2 places in the nova/tests/test_service.py file:
>
>2198 + global ghost, gbinary, gmox
>2238 + global ghost, gbinary, gmox
>
>However, the actual variable names are:
>
>2187 +# temporary variable to store host/binary/self.mox
>2188 +# from each method to fake class.
>2189 +global_host = None
>2190 +global_binary = None
>2191 +global_mox = None
>
>You will want to make those consistent I believe, otherwise I'm not sure what gbinary, ghost, and gmox are going to >refer to ;)
I completely forgot to update this testcase. I'll rewrite this. Sorry..

Thanks again!
Kei

Revision history for this message
Jay Pipes (jaypipes) wrote :

Awesome job, Kei :)

review: Approve
Revision history for this message
Brian Schott (bfschott) wrote :

+1
This branch adds a lot of new capabilities.

Brian Schott
<email address hidden>


Revision history for this message
Kei Masumoto (masumotok) wrote :

Jay, Brian, thank you! I appreciate your help!

Kei

Revision history for this message
Ken Pepple (ken-pepple) wrote :

This is a nit but will drive nova admins crazy -- in nova-manage, I think we should verify the destination host name and service aliveness before we send the rpc call off to the scheduler. I think this is as easy to implement as wrapping nova-manage:31-39 in an if..else statement with a call to db.service_get_by_host_and_topic(context, dest, "compute").

This will save us from waiting on the migration (that will never happen) and cleaning out the queue later.
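A minimal sketch of the suggested pre-check (the exact return/raise behaviour of the db call, and how a missing host should be reported, are assumptions):

    ctxt = context.get_admin_context()
    service = db.service_get_by_host_and_topic(ctxt, dest, "compute")
    if not service:
        msg = _('%s is not a running compute host.') % dest
        raise exception.Error(msg)
    # otherwise fall through to the existing rpc call to the scheduler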

review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Ken, thanks for your comments!
I will fix it soon..

Kei

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Ken,

I would appreciate it if I could ask you a few questions.

1. In the currently proposed code, the scheduler checks many things: src/dest alive check, lack-of-memory check, hypervisor check... Does your comment imply that all those checks must be done in nova-manage, or just the alive check? I was thinking it is safer for those checks to be done in the scheduler, for example in case the scheduler is busy. Another example is a user sending a request directly to the scheduler without going through nova-manage (I am worrying whether some security issue arises here, or is there no need to think about it?).

2. In the currently proposed code, if the destination host is not alive (this check is done in the scheduler), an exception is raised and returned to nova-manage. Do we then have to clean up rabbitmq?

I'm a bit confused; please shed some light on this..

Kei

Revision history for this message
Ken Pepple (ken-pepple) wrote :

> 1. In current proposed code, scheduler checks many things, src/dest alive check, lack of memory check, hypervisor check... Your comment implies all those checks must be done in nova-manage? or just alive check? I was thinking it is safer that those checks is done at scheduler, for example, scheduler is busy. Other example is that an use directly send a request to scheduler not using nova-manage(what I am worrying about some security issues occurs here or no need to think?).

Hi masumoto-san -

Sorry, I meant for only basic checks to be done in nova-manage. My concern is that admins will start a live migration to a non-existent host (or a disabled host), wait for a minute or two, then check euca-describe-instances and see that nothing happened because recover_live_migration has already set it back to the "running" state.

I agree that most checks should be done in scheduler, so that later we might be able to add API support for live-migration.

Will nova/compute/manager.py:879 check to make sure that the service is alive? I thought this just looked for a queue (and queues aren't destroyed on alive checks), so this may be my misunderstanding.

> 2. In current proposed code, if destination host is not alive(this check is done at scheduler), an exception raised and returned to nova-manage. then, we have to cleanup rabbitmq?

No, we are never putting it in the queue.

Thanks for all the work on this -- looking forward to live-migration.

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Ken-san, Thanks for your answer!

> My concern is that admins will start a live migration to a non-existant host
> (or disabled host), wait for a minute or two, then check euca-describe-instances
> and see that nothing happened because the recover_live_migration has
> already set it back to "running" state.

In our environment, the admin gets an error message such as "destination host is not alive" within a few seconds, because the scheduler checks it and raises an exception (see below for the other scheduler checks).

> Will nova/compute/manager.py:879 check to make sure that the service is alive ?
pre_live_migration (meaning nova/compute/manager.py:879) performs the checks that can only be done on the compute node, such as: has the security group ingress rule (iptables rule) been successfully taken over? Can the destination host recognize the volumes (is the iSCSI daemon alive? is the AoE kernel module inserted?), etc.
On the other hand, the scheduler takes care of the other checks, which can be done anywhere (see below).

[Examples of scheduler checks]
- instance is running
- src/dest host exists (and is alive)
- nova-compute runs on src/dest host
- nova-volume is alive when the instance mounts a volume
- hypervisor_type, hypervisor_version and cpu compatibility
- dest host has enough memory
- src/dest hosts mount the same shared storage
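A rough sketch of how those checks could be ordered on the scheduler side (the helper method names are illustrative, not necessarily the ones in the patch):

    def schedule_live_migration(self, context, instance_id, dest):
        """Order the live-migration checks before casting to compute."""
        instance_ref = db.instance_get(context, instance_id)

        # instance is running
        if instance_ref['state'] != power_state.RUNNING:
            raise exception.Invalid(_('Instance is not running.'))

        # src/dest hosts exist and are alive, nova-compute runs on both,
        # and nova-volume is alive if the instance mounts a volume
        self._check_hosts_alive(context, instance_ref, dest)

        # hypervisor type/version and cpu compatibility, enough free memory
        # on dest, and src/dest mounting the same shared storage
        self._check_migration_feasibility(context, instance_ref, dest)

        # only after all checks pass is the request cast to the compute node
        return dest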

Please let me know if it does not make sense to you. (I always worry that my English mistakes might confuse you :)) I think I have explained here why your concerns do not apply to the current implementation....

Thanks again!
Kei


Revision history for this message
Ken Pepple (ken-pepple) wrote :

On Mar 10, 2011, at 8:01 PM, Kei Masumoto wrote:
> Please let me know if it does not make sense to you.(I always am worrying about my english mistake confuse you :)) . I think I explained here your concerns does not matter in current implementation….

Okay, I think I understand... it can be a bit difficult following the scheduler code sometimes.

Last question: don't we need this patch (see below)? My install fails when I do this:

root@shuttle:~/src/live-migration/contrib/nova# bin/nova-manage vm live_migration i0023 badhost
2011-03-10 21:53:20,653 CRITICAL nova [-] global name 'ec2_id_to_id' is not defined
(nova): TRACE: Traceback (most recent call last):
(nova): TRACE: File "bin/nova-manage", line 1074, in <module>
(nova): TRACE: main()
(nova): TRACE: File "bin/nova-manage", line 1066, in main
(nova): TRACE: fn(*argv)
(nova): TRACE: File "bin/nova-manage", line 573, in live_migration
(nova): TRACE: instance_id = ec2_id_to_id(ec2_id)
(nova): TRACE: NameError: global name 'ec2_id_to_id' is not defined
(nova): TRACE:

Or did I not install this correctly?
Thanks again
/k

===== PATCH ======

=== modified file 'bin/nova-manage'
--- bin/nova-manage 2011-03-10 06:23:13 +0000
+++ bin/nova-manage 2011-03-11 05:56:38 +0000
@@ -570,7 +570,7 @@
         """

         ctxt = context.get_admin_context()
-        instance_id = ec2_id_to_id(ec2_id)
+        instance_id = ec2utils.ec2_id_to_id(ec2_id)

         if FLAGS.connection_type != 'libvirt':
             msg = _('Only KVM is supported for now. Sorry!')

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi ken!

Hold on -- a quite big earthquake has just hit Japan. The TV channels are not working, so I am not sure what is going on..
Probably almost all companies have stopped business, and employees are trying to escape.
Actually, I helped a pregnant woman and her baby and finally got back to my apartment.

After I tidy my room, I'm going to check this. My new TV didn't fall over and isn't broken, although some glasses broke; I don't know whether I am still lucky or not...

By the way, I am not sure whether it is OK to be writing e-mail or whether I should prepare to escape? :)

Kei

Revision history for this message
Ken Pepple (ken-pepple) wrote :

Masumoto-san -- I'm talking with my friends in Aoyama (8.8M!). You should definitely escape :)

Revision history for this message
Thierry Carrez (ttx) wrote :

I think this should be merged *now*. The feature part was approved already. Given the situation in Japan I don't expect Kei to have lots of time to add the additional pre-checks that Ken mentioned.

Someone can propose a branch that adds the checks afterwards.

review: Approve
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Attempt to merge into lp:nova failed due to conflicts:

text conflict in nova/tests/test_virt.py

Revision history for this message
Ken Pepple (ken-pepple) wrote :

Agreeing with ttx; I will file bugs/patches for my objections.

review: Approve
Revision history for this message
Jay Pipes (jaypipes) wrote :

I'm fixing the merge conflict locally for Kei and will push shortly.

Revision history for this message
Brian Schott (bfschott) wrote :

Jay,

Not that you need a reference, but I may have fixed those conflicts in:
lp:~usc-isi/nova/hpc-trunk
Don't pull the whole branch, as it has our cpu-arch extensions.

Brian Schott
<email address hidden>


Revision history for this message
Jay Pipes (jaypipes) wrote :

Thanks Brian! It was a simple little import order thingie, though :)
Not a big deal!


Revision history for this message
Brian Schott (bfschott) wrote :

Lorin,

Good catch. That's going to hit trunk soon. I'm going to submit a bug to nova trunk. Jay, are you able to confirm this?

Brian

---
Brian Schott
USC Information Sciences Institute
http://www.east.isi.edu/~bschott
ph: 703-812-3722 fx: 703-812-3712

On Mar 14, 2011, at 3:34 PM, Lorin Hochstein wrote:

> I was running hpc-trunk with Ubuntu packages and saw this error in nova-compute
>
> 2011-03-14 12:12:02,764 ERROR nova [-] in Service.create()
> (nova): TRACE: Traceback (most recent call last):
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 264, in serve
> (nova): TRACE: services = [Service.create()]
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 167, in create
> (nova): TRACE: report_interval, periodic_interval)
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 73, in __init__
> (nova): TRACE: self.manager = manager_class(host=self.host, *args, **kwargs)
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/compute/manager.py", line 118, in __init__
> (nova): TRACE: self.driver = utils.import_object(compute_driver)
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/utils.py", line 75, in import_object
> (nova): TRACE: return cls()
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/virt/connection.py", line 64, in get_connection
> (nova): TRACE: conn = libvirt_conn.get_connection(read_only)
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/virt/libvirt_conn.py", line 131, in get_connection
> (nova): TRACE: return LibvirtConnection(read_only)
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/virt/libvirt_conn.py", line 163, in __init__
> (nova): TRACE: self.cpuinfo_xml = open(FLAGS.cpuinfo_xml_template).read()
> (nova): TRACE: IOError: [Errno 2] No such file or directory: '/usr/lib/pymodules/python2.6/nova/virt/cpuinfo.xml.template'
> (nova): TRACE:
>
> The NTT guys seem to have added a new file (cpuinfo.xml.template), but that file isn't currently being packaged. I'm not sure why, but I'm trying to figure it out.
>
> Lorin
>
> --
> Lorin Hochstein, Computer Scientist
> USC Information Sciences Institute
> 703.812.3710
> http://www.east.isi.edu/~lorin
>

Revision history for this message
Jay Pipes (jaypipes) wrote :

Hmm, this should not have gotten through the distribution/packaging
tests... I'll see what I can discover.

-jay


Revision history for this message
Soren Hansen (soren) wrote :

I have a Jenkins job that would have alerted us about this sooner. It
triggers when files are added to bzr, but don't end up in the tarball.
I've made a note to get that added tomorrow.

--
Soren Hansen        | http://linux2go.dk/
Ubuntu Developer    | http://www.ubuntu.com/
OpenStack Developer | http://www.openstack.org/

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi,

I would like to thank everyone involved in getting the live-migration branch merged. In particular, the reviewers, both from the core developers and from the community, helped me improve the quality of our branch. I don't think this is the end of our work; I have already recognized that I need to catch up with some recent changes in nova, and my team is planning to submit further patches.

By the way, regarding the earthquake: it hit the northern part of Japan, not Tokyo (where I live and work). Although we can keep our business running, the effects reach Tokyo as well, for example heavy traffic jams, shortages of gasoline and food, and power cuts, so I spend the nights with candles instead of lights.

Please forgive us if we need some more time to get back to normal. I am looking forward to thanking all of you face to face at the next design summit.

Regards,
Kei

Revision history for this message
Jay Pipes (jaypipes) wrote :

I think I can safely say that all of us in the contributor community
wish you and all our colleagues and friends in Japan our best. It's a
horrific event and many of us feel powerless to do anything about it.
We hope you can find some normality in the coming weeks as, hopefully,
Japan recovers from the earthquake.

All our best,

jay

On Wed, Mar 16, 2011 at 2:42 AM, <email address hidden> wrote:
> Hi,
>
> I would like to say thank you for everyone regarding to live-migration branch was merged. Especially, reviewers both from core dev and from community, help me to improve quality of our branch. I don’t think this is the end of our work, at least I've already recognized that I have to follow some recent nova's changes. My team is planning to submit some patches.
>
> By the way, regarding to the earthquake, it happened the north part of Japan, not Tokyo(where I live and I work at). Although we can engaging our business, the effect also comes to Tokyo. For example, crazy traffic jam, lack of gasoline, lack of food, stopping electricity and I spend with no lights but with candles in the night... etc.
>
> Please forgive us we need some more time to get back, and I am looking forward to sending all of you many thanks f2f at next design summit.
>
> Regards,
> Kei
>

Revision history for this message
Brian Schott (bfschott) wrote :

+100

Brian Schott
<email address hidden>

On Mar 16, 2011, at 11:12 AM, Jay Pipes wrote:

> I think I can safely say that all of us in the contributor community
> wish you and all our colleagues and friends in Japan our best. It's a
> horrific event and many of us feel powerless to do anything about it.
> We hope you can find some normality in the coming weeks as, hopefully,
> Japan recovers from the earthquake.
>
> All our best,
>
> jay
>
> --
> https://code.launchpad.net/~nttdata/nova/live-migration/+merge/49699
> You are reviewing the proposed merge of lp:~nttdata/nova/live-migration into lp:nova.

Preview Diff

1=== modified file 'bin/nova-manage'
2--- bin/nova-manage 2011-03-10 04:42:11 +0000
3+++ bin/nova-manage 2011-03-10 06:27:59 +0000
4@@ -558,6 +558,40 @@
5 db.network_delete_safe(context.get_admin_context(), network.id)
6
7
8+class VmCommands(object):
9+ """Class for mangaging VM instances."""
10+
11+ def live_migration(self, ec2_id, dest):
12+ """Migrates a running instance to a new machine.
13+
14+ :param ec2_id: instance id which comes from euca-describe-instance.
15+ :param dest: destination host name.
16+
17+ """
18+
19+ ctxt = context.get_admin_context()
20+ instance_id = ec2_id_to_id(ec2_id)
21+
22+ if FLAGS.connection_type != 'libvirt':
23+ msg = _('Only KVM is supported for now. Sorry!')
24+ raise exception.Error(msg)
25+
26+ if (FLAGS.volume_driver != 'nova.volume.driver.AOEDriver' and \
27+ FLAGS.volume_driver != 'nova.volume.driver.ISCSIDriver'):
28+ msg = _("Support only AOEDriver and ISCSIDriver. Sorry!")
29+ raise exception.Error(msg)
30+
31+ rpc.call(ctxt,
32+ FLAGS.scheduler_topic,
33+ {"method": "live_migration",
34+ "args": {"instance_id": instance_id,
35+ "dest": dest,
36+ "topic": FLAGS.compute_topic}})
37+
38+ print _('Migration of %s initiated. '
39+ 'Check its progress using euca-describe-instances.') % ec2_id
40+
41+
42 class ServiceCommands(object):
43 """Enable and disable running services"""
44
45@@ -602,6 +636,59 @@
46 return
47 db.service_update(ctxt, svc['id'], {'disabled': True})
48
49+ def describe_resource(self, host):
50+ """Describes cpu/memory/hdd info for host.
51+
52+ :param host: hostname.
53+
54+ """
55+
56+ result = rpc.call(context.get_admin_context(),
57+ FLAGS.scheduler_topic,
58+ {"method": "show_host_resources",
59+ "args": {"host": host}})
60+
61+ if type(result) != dict:
62+ print _('An unexpected error has occurred.')
63+ print _('[Result]'), result
64+ else:
65+ cpu = result['resource']['vcpus']
66+ mem = result['resource']['memory_mb']
67+ hdd = result['resource']['local_gb']
68+ cpu_u = result['resource']['vcpus_used']
69+ mem_u = result['resource']['memory_mb_used']
70+ hdd_u = result['resource']['local_gb_used']
71+
72+ print 'HOST\t\t\tPROJECT\t\tcpu\tmem(mb)\tdisk(gb)'
73+ print '%s(total)\t\t\t%s\t%s\t%s' % (host, cpu, mem, hdd)
74+ print '%s(used)\t\t\t%s\t%s\t%s' % (host, cpu_u, mem_u, hdd_u)
75+ for p_id, val in result['usage'].items():
76+ print '%s\t\t%s\t\t%s\t%s\t%s' % (host,
77+ p_id,
78+ val['vcpus'],
79+ val['memory_mb'],
80+ val['local_gb'])
81+
82+ def update_resource(self, host):
83+ """Updates available vcpu/memory/disk info for host.
84+
85+ :param host: hostname.
86+
87+ """
88+
89+ ctxt = context.get_admin_context()
90+ service_refs = db.service_get_all_by_host(ctxt, host)
91+ if len(service_refs) <= 0:
92+ raise exception.Invalid(_('%s does not exist.') % host)
93+
94+ service_refs = [s for s in service_refs if s['topic'] == 'compute']
95+ if len(service_refs) <= 0:
96+ raise exception.Invalid(_('%s is not a compute node.') % host)
97+
98+ rpc.call(ctxt,
99+ db.queue_get_for(ctxt, FLAGS.compute_topic, host),
100+ {"method": "update_available_resource"})
101+
102
103 class LogCommands(object):
104 def request(self, request_id, logfile='/var/log/nova.log'):
105@@ -905,6 +992,7 @@
106 ('fixed', FixedIpCommands),
107 ('floating', FloatingIpCommands),
108 ('network', NetworkCommands),
109+ ('vm', VmCommands),
110 ('service', ServiceCommands),
111 ('log', LogCommands),
112 ('db', DbCommands),
113
114=== modified file 'contrib/nova.sh'
115--- contrib/nova.sh 2011-03-08 00:01:43 +0000
116+++ contrib/nova.sh 2011-03-10 06:27:59 +0000
117@@ -76,6 +76,7 @@
118 sudo apt-get install -y python-migrate python-eventlet python-gflags python-ipy python-tempita
119 sudo apt-get install -y python-libvirt python-libxml2 python-routes python-cheetah
120 sudo apt-get install -y python-netaddr python-paste python-pastedeploy python-glance
121+ sudo apt-get install -y python-multiprocessing
122
123 if [ "$USE_IPV6" == 1 ]; then
124 sudo apt-get install -y radvd
125
126=== modified file 'nova/compute/manager.py'
127--- nova/compute/manager.py 2011-03-07 22:40:19 +0000
128+++ nova/compute/manager.py 2011-03-10 06:27:59 +0000
129@@ -36,9 +36,12 @@
130
131 import base64
132 import datetime
133+import os
134 import random
135 import string
136 import socket
137+import tempfile
138+import time
139 import functools
140
141 from nova import exception
142@@ -61,6 +64,9 @@
143 flags.DEFINE_string('console_host', socket.gethostname(),
144 'Console proxy host to use to connect to instances on'
145 'this host.')
146+flags.DEFINE_integer('live_migration_retry_count', 30,
147+ ("Retry count needed in live_migration."
148+ " sleep 1 sec for each count"))
149
150 LOG = logging.getLogger('nova.compute.manager')
151
152@@ -181,7 +187,7 @@
153 context=context)
154 self.db.instance_update(context,
155 instance_id,
156- {'host': self.host})
157+ {'host': self.host, 'launched_on': self.host})
158
159 self.db.instance_set_state(context,
160 instance_id,
161@@ -723,3 +729,248 @@
162 self.volume_manager.remove_compute_volume(context, volume_id)
163 self.db.volume_detached(context, volume_id)
164 return True
165+
166+ @exception.wrap_exception
167+ def compare_cpu(self, context, cpu_info):
168+ """Checks the host cpu is compatible to a cpu given by xml.
169+
170+ :param context: security context
171+ :param cpu_info: json string obtained from virConnect.getCapabilities
172+ :returns: See driver.compare_cpu
173+
174+ """
175+ return self.driver.compare_cpu(cpu_info)
176+
177+ @exception.wrap_exception
178+ def create_shared_storage_test_file(self, context):
179+ """Makes tmpfile under FLAGS.instance_path.
180+
181+ This method enables compute nodes to recognize that they mount the
182+ same shared storage. (create|check|cleanup)_shared_storage_test_file()
183+ are meant to be used together.
184+
185+ :param context: security context
186+ :returns: tmpfile name(basename)
187+
188+ """
189+
190+ dirpath = FLAGS.instances_path
191+ fd, tmp_file = tempfile.mkstemp(dir=dirpath)
192+ LOG.debug(_("Creating tmpfile %s to notify to other "
193+ "compute nodes that they should mount "
194+ "the same storage.") % tmp_file)
195+ os.close(fd)
196+ return os.path.basename(tmp_file)
197+
198+ @exception.wrap_exception
199+ def check_shared_storage_test_file(self, context, filename):
200+ """Confirms existence of the tmpfile under FLAGS.instances_path.
201+
202+ :param context: security context
203+ :param filename: confirm existence of FLAGS.instances_path/thisfile
204+
205+ """
206+
207+ tmp_file = os.path.join(FLAGS.instances_path, filename)
208+ if not os.path.exists(tmp_file):
209+ raise exception.NotFound(_('%s not found') % tmp_file)
210+
211+ @exception.wrap_exception
212+ def cleanup_shared_storage_test_file(self, context, filename):
213+ """Removes existence of the tmpfile under FLAGS.instances_path.
214+
215+ :param context: security context
216+ :param filename: remove existence of FLAGS.instances_path/thisfile
217+
218+ """
219+
220+ tmp_file = os.path.join(FLAGS.instances_path, filename)
221+ os.remove(tmp_file)
222+
223+ @exception.wrap_exception
224+ def update_available_resource(self, context):
225+ """See comments update_resource_info.
226+
227+ :param context: security context
228+ :returns: See driver.update_available_resource()
229+
230+ """
231+
232+ return self.driver.update_available_resource(context, self.host)
233+
234+ def pre_live_migration(self, context, instance_id):
235+ """Preparations for live migration at dest host.
236+
237+ :param context: security context
238+ :param instance_id: nova.db.sqlalchemy.models.Instance.Id
239+
240+ """
241+
242+ # Getting instance info
243+ instance_ref = self.db.instance_get(context, instance_id)
244+ ec2_id = instance_ref['hostname']
245+
246+ # Getting fixed ips
247+ fixed_ip = self.db.instance_get_fixed_address(context, instance_id)
248+ if not fixed_ip:
249+ msg = _("%(instance_id)s(%(ec2_id)s) does not have fixed_ip.")
250+ raise exception.NotFound(msg % locals())
251+
252+ # If any volume is mounted, prepare here.
253+ if not instance_ref['volumes']:
254+ LOG.info(_("%s has no volume."), ec2_id)
255+ else:
256+ for v in instance_ref['volumes']:
257+ self.volume_manager.setup_compute_volume(context, v['id'])
258+
259+ # Bridge settings.
260+ # Call this method prior to ensure_filtering_rules_for_instance,
261+ # since ensure_filtering_rules_for_instance fails if the bridge
262+ # is not set up yet.
263+ #
264+ # Retrying is necessary because when requests arrive continuously,
265+ # concurrent requests hit iptables and it complains.
266+ max_retry = FLAGS.live_migration_retry_count
267+ for cnt in range(max_retry):
268+ try:
269+ self.network_manager.setup_compute_network(context,
270+ instance_id)
271+ break
272+ except exception.ProcessExecutionError:
273+ if cnt == max_retry - 1:
274+ raise
275+ else:
276+ LOG.warn(_("setup_compute_network() failed %(cnt)d."
277+ "Retry up to %(max_retry)d for %(ec2_id)s.")
278+ % locals())
279+ time.sleep(1)
280+
281+ # Creating filters to hypervisors and firewalls.
282+ # An example is that nova-instance-instance-xxx,
283+ # which is written to libvirt.xml(Check "virsh nwfilter-list")
284+ # This nwfilter is necessary on the destination host.
285+ # In addition, this method is creating filtering rule
286+ # onto destination host.
287+ self.driver.ensure_filtering_rules_for_instance(instance_ref)
288+
289+ def live_migration(self, context, instance_id, dest):
290+ """Executing live migration.
291+
292+ :param context: security context
293+ :param instance_id: nova.db.sqlalchemy.models.Instance.Id
294+ :param dest: destination host
295+
296+ """
297+
298+ # Get instance for error handling.
299+ instance_ref = self.db.instance_get(context, instance_id)
300+ i_name = instance_ref.name
301+
302+ try:
303+ # Checking volume node is working correctly when any volumes
304+ # are attached to instances.
305+ if instance_ref['volumes']:
306+ rpc.call(context,
307+ FLAGS.volume_topic,
308+ {"method": "check_for_export",
309+ "args": {'instance_id': instance_id}})
310+
311+ # Asking dest host to prepare for live migration.
312+ rpc.call(context,
313+ self.db.queue_get_for(context, FLAGS.compute_topic, dest),
314+ {"method": "pre_live_migration",
315+ "args": {'instance_id': instance_id}})
316+
317+ except Exception:
318+ msg = _("Pre live migration for %(i_name)s failed at %(dest)s")
319+ LOG.error(msg % locals())
320+ self.recover_live_migration(context, instance_ref)
321+ raise
322+
323+ # Executing live migration
324+ # live_migration might raise exceptions, but
325+ # nothing needs to be recovered in this version.
326+ self.driver.live_migration(context, instance_ref, dest,
327+ self.post_live_migration,
328+ self.recover_live_migration)
329+
330+ def post_live_migration(self, ctxt, instance_ref, dest):
331+ """Post operations for live migration.
332+
333+ This method is called from live_migration
334+ and mainly updates database records.
335+
336+ :param ctxt: security context
337+ :param instance_ref: nova.db.sqlalchemy.models.Instance object
338+ :param dest: destination host
339+
340+ """
341+
342+ LOG.info(_('post_live_migration() started.'))
343+ instance_id = instance_ref['id']
344+
345+ # Detaching volumes.
346+ try:
347+ for vol in self.db.volume_get_all_by_instance(ctxt, instance_id):
348+ self.volume_manager.remove_compute_volume(ctxt, vol['id'])
349+ except exception.NotFound:
350+ pass
351+
352+ # Releasing vlan.
353+ # (not necessary in current implementation?)
354+
355+ # Releasing security group ingress rule.
356+ self.driver.unfilter_instance(instance_ref)
357+
358+ # Database updating.
359+ i_name = instance_ref.name
360+ try:
361+ # Do not return if floating_ip is not found; otherwise,
362+ # the instance will never be accessible.
363+ floating_ip = self.db.instance_get_floating_address(ctxt,
364+ instance_id)
365+ if not floating_ip:
366+ LOG.info(_('No floating_ip is found for %s.'), i_name)
367+ else:
368+ floating_ip_ref = self.db.floating_ip_get_by_address(ctxt,
369+ floating_ip)
370+ self.db.floating_ip_update(ctxt,
371+ floating_ip_ref['address'],
372+ {'host': dest})
373+ except exception.NotFound:
374+ LOG.info(_('No floating_ip is found for %s.'), i_name)
375+ except:
376+ LOG.error(_("Live migration: Unexpected error:"
377+ "%s cannot inherit floating ip..") % i_name)
378+
379+ # Restore instance/volume state
380+ self.recover_live_migration(ctxt, instance_ref, dest)
381+
382+ LOG.info(_('Migrating %(i_name)s to %(dest)s finished successfully.')
383+ % locals())
384+ LOG.info(_("You may see the error \"libvirt: QEMU error: "
385+ "Domain not found: no domain with matching name.\" "
386+ "This error can be safely ignored."))
387+
388+ def recover_live_migration(self, ctxt, instance_ref, host=None):
389+ """Recovers Instance/volume state from migrating -> running.
390+
391+ :param ctxt: security context
392+ :param instance_ref: nova.db.sqlalchemy.models.Instance object
393+ :param host:
394+ DB column value is updated by this hostname.
395+ if none, the host instance currently running is selected.
396+
397+ """
398+
399+ if not host:
400+ host = instance_ref['host']
401+
402+ self.db.instance_update(ctxt,
403+ instance_ref['id'],
404+ {'state_description': 'running',
405+ 'state': power_state.RUNNING,
406+ 'host': host})
407+
408+ for volume in instance_ref['volumes']:
409+ self.db.volume_update(ctxt, volume['id'], {'status': 'in-use'})
410
411=== modified file 'nova/db/api.py'
412--- nova/db/api.py 2011-03-09 21:27:38 +0000
413+++ nova/db/api.py 2011-03-10 06:27:59 +0000
414@@ -104,6 +104,11 @@
415 return IMPL.service_get_all_by_host(context, host)
416
417
418+def service_get_all_compute_by_host(context, host):
419+ """Get all compute services for a given host."""
420+ return IMPL.service_get_all_compute_by_host(context, host)
421+
422+
423 def service_get_all_compute_sorted(context):
424 """Get all compute services sorted by instance count.
425
426@@ -153,6 +158,29 @@
427 ###################
428
429
430+def compute_node_get(context, compute_id, session=None):
431+ """Get an computeNode or raise if it does not exist."""
432+ return IMPL.compute_node_get(context, compute_id)
433+
434+
435+def compute_node_create(context, values):
436+ """Create a computeNode from the values dictionary."""
437+ return IMPL.compute_node_create(context, values)
438+
439+
440+def compute_node_update(context, compute_id, values):
441+ """Set the given properties on an computeNode and update it.
442+
443+ Raises NotFound if computeNode does not exist.
444+
445+ """
446+
447+ return IMPL.compute_node_update(context, compute_id, values)
448+
449+
450+###################
451+
452+
453 def certificate_create(context, values):
454 """Create a certificate from the values dictionary."""
455 return IMPL.certificate_create(context, values)
456@@ -257,6 +285,11 @@
457 return IMPL.floating_ip_get_by_address(context, address)
458
459
460+def floating_ip_update(context, address, values):
461+ """Update a floating ip by address or raise if it doesn't exist."""
462+ return IMPL.floating_ip_update(context, address, values)
463+
464+
465 ####################
466
467 def migration_update(context, id, values):
468@@ -441,6 +474,27 @@
469 security_group_id)
470
471
472+def instance_get_vcpu_sum_by_host_and_project(context, hostname, proj_id):
473+ """Get instances.vcpus by host and project."""
474+ return IMPL.instance_get_vcpu_sum_by_host_and_project(context,
475+ hostname,
476+ proj_id)
477+
478+
479+def instance_get_memory_sum_by_host_and_project(context, hostname, proj_id):
480+ """Get amount of memory by host and project."""
481+ return IMPL.instance_get_memory_sum_by_host_and_project(context,
482+ hostname,
483+ proj_id)
484+
485+
486+def instance_get_disk_sum_by_host_and_project(context, hostname, proj_id):
487+ """Get total amount of disk by host and project."""
488+ return IMPL.instance_get_disk_sum_by_host_and_project(context,
489+ hostname,
490+ proj_id)
491+
492+
493 def instance_action_create(context, values):
494 """Create an instance action from the values dictionary."""
495 return IMPL.instance_action_create(context, values)
496@@ -765,6 +819,11 @@
497 return IMPL.volume_get_all_by_host(context, host)
498
499
500+def volume_get_all_by_instance(context, instance_id):
501+ """Get all volumes belonging to a instance."""
502+ return IMPL.volume_get_all_by_instance(context, instance_id)
503+
504+
505 def volume_get_all_by_project(context, project_id):
506 """Get all volumes belonging to a project."""
507 return IMPL.volume_get_all_by_project(context, project_id)
508
509=== modified file 'nova/db/sqlalchemy/api.py'
510--- nova/db/sqlalchemy/api.py 2011-03-09 21:27:38 +0000
511+++ nova/db/sqlalchemy/api.py 2011-03-10 06:27:59 +0000
512@@ -118,6 +118,11 @@
513 service_ref = service_get(context, service_id, session=session)
514 service_ref.delete(session=session)
515
516+ if service_ref.topic == 'compute' and \
517+ len(service_ref.compute_node) != 0:
518+ for c in service_ref.compute_node:
519+ c.delete(session=session)
520+
521
522 @require_admin_context
523 def service_get(context, service_id, session=None):
524@@ -125,6 +130,7 @@
525 session = get_session()
526
527 result = session.query(models.Service).\
528+ options(joinedload('compute_node')).\
529 filter_by(id=service_id).\
530 filter_by(deleted=can_read_deleted(context)).\
531 first()
532@@ -175,6 +181,24 @@
533
534
535 @require_admin_context
536+def service_get_all_compute_by_host(context, host):
537+ topic = 'compute'
538+ session = get_session()
539+ result = session.query(models.Service).\
540+ options(joinedload('compute_node')).\
541+ filter_by(deleted=False).\
542+ filter_by(host=host).\
543+ filter_by(topic=topic).\
544+ all()
545+
546+ if not result:
547+ raise exception.NotFound(_("%s does not exist or is not "
548+ "a compute node.") % host)
549+
550+ return result
551+
552+
553+@require_admin_context
554 def _service_get_all_topic_subquery(context, session, topic, subq, label):
555 sort_value = getattr(subq.c, label)
556 return session.query(models.Service, func.coalesce(sort_value, 0)).\
557@@ -285,6 +309,42 @@
558
559
560 @require_admin_context
561+def compute_node_get(context, compute_id, session=None):
562+ if not session:
563+ session = get_session()
564+
565+ result = session.query(models.ComputeNode).\
566+ filter_by(id=compute_id).\
567+ filter_by(deleted=can_read_deleted(context)).\
568+ first()
569+
570+ if not result:
571+ raise exception.NotFound(_('No computeNode for id %s') % compute_id)
572+
573+ return result
574+
575+
576+@require_admin_context
577+def compute_node_create(context, values):
578+ compute_node_ref = models.ComputeNode()
579+ compute_node_ref.update(values)
580+ compute_node_ref.save()
581+ return compute_node_ref
582+
583+
584+@require_admin_context
585+def compute_node_update(context, compute_id, values):
586+ session = get_session()
587+ with session.begin():
588+ compute_ref = compute_node_get(context, compute_id, session=session)
589+ compute_ref.update(values)
590+ compute_ref.save(session=session)
591+
592+
593+###################
594+
595+
596+@require_admin_context
597 def certificate_get(context, certificate_id, session=None):
598 if not session:
599 session = get_session()
600@@ -505,6 +565,16 @@
601 return result
602
603
604+@require_context
605+def floating_ip_update(context, address, values):
606+ session = get_session()
607+ with session.begin():
608+ floating_ip_ref = floating_ip_get_by_address(context, address, session)
609+ for (key, value) in values.iteritems():
610+ floating_ip_ref[key] = value
611+ floating_ip_ref.save(session=session)
612+
613+
614 ###################
615
616
617@@ -905,6 +975,45 @@
618
619
620 @require_context
621+def instance_get_vcpu_sum_by_host_and_project(context, hostname, proj_id):
622+ session = get_session()
623+ result = session.query(models.Instance).\
624+ filter_by(host=hostname).\
625+ filter_by(project_id=proj_id).\
626+ filter_by(deleted=False).\
627+ value(func.sum(models.Instance.vcpus))
628+ if not result:
629+ return 0
630+ return result
631+
632+
633+@require_context
634+def instance_get_memory_sum_by_host_and_project(context, hostname, proj_id):
635+ session = get_session()
636+ result = session.query(models.Instance).\
637+ filter_by(host=hostname).\
638+ filter_by(project_id=proj_id).\
639+ filter_by(deleted=False).\
640+ value(func.sum(models.Instance.memory_mb))
641+ if not result:
642+ return 0
643+ return result
644+
645+
646+@require_context
647+def instance_get_disk_sum_by_host_and_project(context, hostname, proj_id):
648+ session = get_session()
649+ result = session.query(models.Instance).\
650+ filter_by(host=hostname).\
651+ filter_by(project_id=proj_id).\
652+ filter_by(deleted=False).\
653+ value(func.sum(models.Instance.local_gb))
654+ if not result:
655+ return 0
656+ return result
657+
658+
659+@require_context
660 def instance_action_create(context, values):
661 """Create an instance action from the values dictionary."""
662 action_ref = models.InstanceActions()
663@@ -1522,6 +1631,18 @@
664 all()
665
666
667+@require_admin_context
668+def volume_get_all_by_instance(context, instance_id):
669+ session = get_session()
670+ result = session.query(models.Volume).\
671+ filter_by(instance_id=instance_id).\
672+ filter_by(deleted=False).\
673+ all()
674+ if not result:
675+ raise exception.NotFound(_('No volume for instance %s') % instance_id)
676+ return result
677+
678+
679 @require_context
680 def volume_get_all_by_project(context, project_id):
681 authorize_project_context(context, project_id)
682
683=== added file 'nova/db/sqlalchemy/migrate_repo/versions/010_add_live_migration.py'
684--- nova/db/sqlalchemy/migrate_repo/versions/010_add_live_migration.py 1970-01-01 00:00:00 +0000
685+++ nova/db/sqlalchemy/migrate_repo/versions/010_add_live_migration.py 2011-03-10 06:27:59 +0000
686@@ -0,0 +1,83 @@
687+# vim: tabstop=4 shiftwidth=4 softtabstop=4
688+
689+# Copyright 2010 United States Government as represented by the
690+# Administrator of the National Aeronautics and Space Administration.
691+# All Rights Reserved.
692+#
693+# Licensed under the Apache License, Version 2.0 (the "License"); you may
694+# not use this file except in compliance with the License. You may obtain
695+# a copy of the License at
696+#
697+# http://www.apache.org/licenses/LICENSE-2.0
698+#
699+# Unless required by applicable law or agreed to in writing, software
700+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
701+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
702+# License for the specific language governing permissions and limitations
703+# under the License.
704+
705+from migrate import *
706+from nova import log as logging
707+from sqlalchemy import *
708+
709+
710+meta = MetaData()
711+
712+instances = Table('instances', meta,
713+ Column('id', Integer(), primary_key=True, nullable=False),
714+ )
715+
716+#
717+# New Tables
718+#
719+
720+compute_nodes = Table('compute_nodes', meta,
721+ Column('created_at', DateTime(timezone=False)),
722+ Column('updated_at', DateTime(timezone=False)),
723+ Column('deleted_at', DateTime(timezone=False)),
724+ Column('deleted', Boolean(create_constraint=True, name=None)),
725+ Column('id', Integer(), primary_key=True, nullable=False),
726+ Column('service_id', Integer(), nullable=False),
727+
728+ Column('vcpus', Integer(), nullable=False),
729+ Column('memory_mb', Integer(), nullable=False),
730+ Column('local_gb', Integer(), nullable=False),
731+ Column('vcpus_used', Integer(), nullable=False),
732+ Column('memory_mb_used', Integer(), nullable=False),
733+ Column('local_gb_used', Integer(), nullable=False),
734+ Column('hypervisor_type',
735+ Text(convert_unicode=False, assert_unicode=None,
736+ unicode_error=None, _warn_on_bytestring=False),
737+ nullable=False),
738+ Column('hypervisor_version', Integer(), nullable=False),
739+ Column('cpu_info',
740+ Text(convert_unicode=False, assert_unicode=None,
741+ unicode_error=None, _warn_on_bytestring=False),
742+ nullable=False),
743+ )
744+
745+
746+#
747+# Tables to alter
748+#
749+instances_launched_on = Column(
750+ 'launched_on',
751+ Text(convert_unicode=False, assert_unicode=None,
752+ unicode_error=None, _warn_on_bytestring=False),
753+ nullable=True)
754+
755+
756+def upgrade(migrate_engine):
757+ # Upgrade operations go here. Don't create your own engine;
758+ # bind migrate_engine to your metadata
759+ meta.bind = migrate_engine
760+
761+ try:
762+ compute_nodes.create()
763+ except Exception:
764+ logging.info(repr(compute_nodes))
765+ logging.exception('Exception while creating table')
766+ meta.drop_all(tables=[compute_nodes])
767+ raise
768+
769+ instances.create_column(instances_launched_on)
770
771=== modified file 'nova/db/sqlalchemy/models.py'
772--- nova/db/sqlalchemy/models.py 2011-03-03 19:13:15 +0000
773+++ nova/db/sqlalchemy/models.py 2011-03-10 06:27:59 +0000
774@@ -113,6 +113,41 @@
775 availability_zone = Column(String(255), default='nova')
776
777
778+class ComputeNode(BASE, NovaBase):
779+ """Represents a running compute service on a host."""
780+
781+ __tablename__ = 'compute_nodes'
782+ id = Column(Integer, primary_key=True)
783+ service_id = Column(Integer, ForeignKey('services.id'), nullable=True)
784+ service = relationship(Service,
785+ backref=backref('compute_node'),
786+ foreign_keys=service_id,
787+ primaryjoin='and_('
788+ 'ComputeNode.service_id == Service.id,'
789+ 'ComputeNode.deleted == False)')
790+
791+ vcpus = Column(Integer, nullable=True)
792+ memory_mb = Column(Integer, nullable=True)
793+ local_gb = Column(Integer, nullable=True)
794+ vcpus_used = Column(Integer, nullable=True)
795+ memory_mb_used = Column(Integer, nullable=True)
796+ local_gb_used = Column(Integer, nullable=True)
797+ hypervisor_type = Column(Text, nullable=True)
798+ hypervisor_version = Column(Integer, nullable=True)
799+
800+ # Note(masumotok): Expected Strings example:
801+ #
802+ # '{"arch":"x86_64",
803+ # "model":"Nehalem",
804+ # "topology":{"sockets":1, "threads":2, "cores":3},
805+ # "features":["tdtscp", "xtpr"]}'
806+ #
807+ # The points are that this is "json translatable" and that it must
808+ # have all the dictionary keys above, since it is copied from the
809+ # <cpu> tag of getCapabilities() (See libvirt.virtConnection).
810+ cpu_info = Column(Text, nullable=True)
811+
812+
813 class Certificate(BASE, NovaBase):
814 """Represents a an x509 certificate"""
815 __tablename__ = 'certificates'
816@@ -191,6 +226,9 @@
817 display_name = Column(String(255))
818 display_description = Column(String(255))
819
820+ # To remember on which host an instance booted.
821+ # An instance may have moved to another host by live migration.
822+ launched_on = Column(Text)
823 locked = Column(Boolean)
824
825 # TODO(vish): see Ewan's email about state improvements, probably
826
827=== modified file 'nova/scheduler/driver.py'
828--- nova/scheduler/driver.py 2011-01-18 19:01:16 +0000
829+++ nova/scheduler/driver.py 2011-03-10 06:27:59 +0000
830@@ -26,10 +26,14 @@
831 from nova import db
832 from nova import exception
833 from nova import flags
834+from nova import log as logging
835+from nova import rpc
836+from nova.compute import power_state
837
838 FLAGS = flags.FLAGS
839 flags.DEFINE_integer('service_down_time', 60,
840 'maximum time since last checkin for up service')
841+flags.DECLARE('instances_path', 'nova.compute.manager')
842
843
844 class NoValidHost(exception.Error):
845@@ -64,3 +68,236 @@
846 def schedule(self, context, topic, *_args, **_kwargs):
847 """Must override at least this method for scheduler to work."""
848 raise NotImplementedError(_("Must implement a fallback schedule"))
849+
850+ def schedule_live_migration(self, context, instance_id, dest):
851+ """Live migration scheduling method.
852+
853+ :param context:
854+ :param instance_id:
855+ :param dest: destination host
856+ :return:
857+ The host where the instance is currently running.
858+ The scheduler then sends the request to that host.
859+
860+ """
861+
862+ # Whether instance exists and is running.
863+ instance_ref = db.instance_get(context, instance_id)
864+
865+ # Checking instance.
866+ self._live_migration_src_check(context, instance_ref)
867+
868+ # Checking destination host.
869+ self._live_migration_dest_check(context, instance_ref, dest)
870+
871+ # Common checking.
872+ self._live_migration_common_check(context, instance_ref, dest)
873+
874+ # Changing instance_state.
875+ db.instance_set_state(context,
876+ instance_id,
877+ power_state.PAUSED,
878+ 'migrating')
879+
880+ # Changing volume state
881+ for volume_ref in instance_ref['volumes']:
882+ db.volume_update(context,
883+ volume_ref['id'],
884+ {'status': 'migrating'})
885+
886+ # The return value is needed to send the request to src.
887+ # See _schedule() for details.
888+ src = instance_ref['host']
889+ return src
890+
891+ def _live_migration_src_check(self, context, instance_ref):
892+ """Live migration check routine (for src host).
893+
894+ :param context: security context
895+ :param instance_ref: nova.db.sqlalchemy.models.Instance object
896+
897+ """
898+
899+ # Checking instance is running.
900+ if (power_state.RUNNING != instance_ref['state'] or \
901+ 'running' != instance_ref['state_description']):
902+ ec2_id = instance_ref['hostname']
903+ raise exception.Invalid(_('Instance(%s) is not running') % ec2_id)
904+
905+ # Checking volume node is running when any volumes are mounted
906+ # to the instance.
907+ if len(instance_ref['volumes']) != 0:
908+ services = db.service_get_all_by_topic(context, 'volume')
909+ if len(services) < 1 or not self.service_is_up(services[0]):
910+ raise exception.Invalid(_("volume node is not alive"
911+ "(time synchronize problem?)"))
912+
913+ # Checking src host exists and is a compute node.
914+ src = instance_ref['host']
915+ services = db.service_get_all_compute_by_host(context, src)
916+
917+ # Checking src host is alive.
918+ if not self.service_is_up(services[0]):
919+ raise exception.Invalid(_("%s is not alive(time "
920+ "synchronize problem?)") % src)
921+
922+ def _live_migration_dest_check(self, context, instance_ref, dest):
923+ """Live migration check routine (for destination host).
924+
925+ :param context: security context
926+ :param instance_ref: nova.db.sqlalchemy.models.Instance object
927+ :param dest: destination host
928+
929+ """
930+
931+ # Checking dest exists and is a compute node.
932+ dservice_refs = db.service_get_all_compute_by_host(context, dest)
933+ dservice_ref = dservice_refs[0]
934+
935+ # Checking dest host is alive.
936+ if not self.service_is_up(dservice_ref):
937+ raise exception.Invalid(_("%s is not alive (time "
938+ "synchronization problem?)") % dest)
939+
940+ # Checking whether the host where the instance is running
941+ # and dest are not the same.
942+ src = instance_ref['host']
943+ if dest == src:
944+ ec2_id = instance_ref['hostname']
945+ raise exception.Invalid(_("%(dest)s is where %(ec2_id)s is "
946+ "running now. choose other host.")
947+ % locals())
948+
949+ # Checking dst host still has enough capacity.
950+ self.assert_compute_node_has_enough_resources(context,
951+ instance_ref,
952+ dest)
953+
954+ def _live_migration_common_check(self, context, instance_ref, dest):
955+ """Live migration common check routine.
956+
957+ The checks below follow
958+ http://wiki.libvirt.org/page/TodoPreMigrationChecks
959+
960+ :param context: security context
961+ :param instance_ref: nova.db.sqlalchemy.models.Instance object
962+ :param dest: destination host
963+
964+ """
965+
966+ # Checking shared storage connectivity
967+ self.mounted_on_same_shared_storage(context, instance_ref, dest)
968+
969+ # Checking dest exists.
970+ dservice_refs = db.service_get_all_compute_by_host(context, dest)
971+ dservice_ref = dservice_refs[0]['compute_node'][0]
972+
973+ # Checking original host (where instance was launched) exists.
974+ try:
975+ oservice_refs = db.service_get_all_compute_by_host(context,
976+ instance_ref['launched_on'])
977+ except exception.NotFound:
978+ raise exception.Invalid(_("host %s where instance was launched "
979+ "does not exist.")
980+ % instance_ref['launched_on'])
981+ oservice_ref = oservice_refs[0]['compute_node'][0]
982+
983+ # Checking hypervisor is same.
984+ orig_hypervisor = oservice_ref['hypervisor_type']
985+ dest_hypervisor = dservice_ref['hypervisor_type']
986+ if orig_hypervisor != dest_hypervisor:
987+ raise exception.Invalid(_("Different hypervisor type"
988+ "(%(orig_hypervisor)s->"
989+ "%(dest_hypervisor)s)')" % locals()))
990+
991+ # Checking hypervisor version.
992+ orig_hypervisor = oservice_ref['hypervisor_version']
993+ dest_hypervisor = dservice_ref['hypervisor_version']
994+ if orig_hypervisor > dest_hypervisor:
995+ raise exception.Invalid(_("Older hypervisor version"
996+ "(%(orig_hypervisor)s->"
997+ "%(dest_hypervisor)s)") % locals())
998+
999+ # Checking cpuinfo.
1000+ try:
1001+ rpc.call(context,
1002+ db.queue_get_for(context, FLAGS.compute_topic, dest),
1003+ {"method": 'compare_cpu',
1004+ "args": {'cpu_info': oservice_ref['cpu_info']}})
1005+
1006+ except rpc.RemoteError:
1007+ src = instance_ref['host']
1008+ logging.exception(_("host %(dest)s is not compatible with "
1009+ "original host %(src)s.") % locals())
1010+ raise
1011+
1012+ def assert_compute_node_has_enough_resources(self, context,
1013+ instance_ref, dest):
1014+ """Checks if destination host has enough resource for live migration.
1015+
1016+ Currently, only memory is checked.
1017+ If storage migration (block migration, meaning live migration
1018+ without any shared storage) becomes available, local storage
1019+ checking will also be necessary.
1020+
1021+ :param context: security context
1022+ :param instance_ref: nova.db.sqlalchemy.models.Instance object
1023+ :param dest: destination host
1024+
1025+ """
1026+
1027+ # Getting instance information
1028+ ec2_id = instance_ref['hostname']
1029+
1030+ # Getting host information
1031+ service_refs = db.service_get_all_compute_by_host(context, dest)
1032+ compute_node_ref = service_refs[0]['compute_node'][0]
1033+
1034+ mem_total = int(compute_node_ref['memory_mb'])
1035+ mem_used = int(compute_node_ref['memory_mb_used'])
1036+ mem_avail = mem_total - mem_used
1037+ mem_inst = instance_ref['memory_mb']
1038+ if mem_avail <= mem_inst:
1039+ raise exception.NotEmpty(_("Unable to migrate %(ec2_id)s "
1040+ "to destination: %(dest)s "
1041+ "(host:%(mem_avail)s "
1042+ "<= instance:%(mem_inst)s)")
1043+ % locals())
1044+
1045+ def mounted_on_same_shared_storage(self, context, instance_ref, dest):
1046+ """Check if the src and dest host mount same shared storage.
1047+
1048+ First, the dest host creates a temp file, and the src host can see
1049+ it if they mount the same shared storage. Then the src host erases it.
1050+
1051+ :param context: security context
1052+ :param instance_ref: nova.db.sqlalchemy.models.Instance object
1053+ :param dest: destination host
1054+
1055+ """
1056+
1057+ src = instance_ref['host']
1058+ dst_t = db.queue_get_for(context, FLAGS.compute_topic, dest)
1059+ src_t = db.queue_get_for(context, FLAGS.compute_topic, src)
1060+
1061+ try:
1062+ # create tmpfile at dest host
1063+ filename = rpc.call(context, dst_t,
1064+ {"method": 'create_shared_storage_test_file'})
1065+
1066+ # make sure it exists at the src host.
1067+ rpc.call(context, src_t,
1068+ {"method": 'check_shared_storage_test_file',
1069+ "args": {'filename': filename}})
1070+
1071+ except rpc.RemoteError:
1072+ ipath = FLAGS.instances_path
1073+ logging.error(_("Cannot confirm tmpfile at %(ipath)s is on "
1074+ "same shared storage between %(src)s "
1075+ "and %(dest)s.") % locals())
1076+ raise
1077+
1078+ finally:
1079+ rpc.call(context, dst_t,
1080+ {"method": 'cleanup_shared_storage_test_file',
1081+ "args": {'filename': filename}})
1082
1083=== modified file 'nova/scheduler/manager.py'
1084--- nova/scheduler/manager.py 2011-01-19 15:41:30 +0000
1085+++ nova/scheduler/manager.py 2011-03-10 06:27:59 +0000
1086@@ -67,3 +67,55 @@
1087 {"method": method,
1088 "args": kwargs})
1089 LOG.debug(_("Casting to %(topic)s %(host)s for %(method)s") % locals())
1090+
1091+ # NOTE (masumotok) : This method should be moved to nova.api.ec2.admin.
1092+ # Based on bexar design summit discussion,
1093+ # just put this here for bexar release.
1094+ def show_host_resources(self, context, host, *args):
1095+ """Shows the physical/usage resource given by hosts.
1096+
1097+ :param context: security context
1098+ :param host: hostname
1099+ :returns:
1100+ example format is below.
1101+ {'resource':D, 'usage':{proj_id1:D, proj_id2:D}}
1102+ D: {'vcpus':3, 'memory_mb':2048, 'local_gb':2048}
1103+
1104+ """
1105+
1106+ compute_ref = db.service_get_all_compute_by_host(context, host)
1107+ compute_ref = compute_ref[0]
1108+
1109+ # Getting physical resource information
1110+ compute_node_ref = compute_ref['compute_node'][0]
1111+ resource = {'vcpus': compute_node_ref['vcpus'],
1112+ 'memory_mb': compute_node_ref['memory_mb'],
1113+ 'local_gb': compute_node_ref['local_gb'],
1114+ 'vcpus_used': compute_node_ref['vcpus_used'],
1115+ 'memory_mb_used': compute_node_ref['memory_mb_used'],
1116+ 'local_gb_used': compute_node_ref['local_gb_used']}
1117+
1118+ # Getting usage resource information
1119+ usage = {}
1120+ instance_refs = db.instance_get_all_by_host(context,
1121+ compute_ref['host'])
1122+ if not instance_refs:
1123+ return {'resource': resource, 'usage': usage}
1124+
1125+ project_ids = [i['project_id'] for i in instance_refs]
1126+ project_ids = list(set(project_ids))
1127+ for project_id in project_ids:
1128+ vcpus = db.instance_get_vcpu_sum_by_host_and_project(context,
1129+ host,
1130+ project_id)
1131+ mem = db.instance_get_memory_sum_by_host_and_project(context,
1132+ host,
1133+ project_id)
1134+ hdd = db.instance_get_disk_sum_by_host_and_project(context,
1135+ host,
1136+ project_id)
1137+ usage[project_id] = {'vcpus': int(vcpus),
1138+ 'memory_mb': int(mem),
1139+ 'local_gb': int(hdd)}
1140+
1141+ return {'resource': resource, 'usage': usage}
1142
1143=== modified file 'nova/service.py'
1144--- nova/service.py 2011-03-09 00:51:05 +0000
1145+++ nova/service.py 2011-03-10 06:27:59 +0000
1146@@ -92,6 +92,9 @@
1147 except exception.NotFound:
1148 self._create_service_ref(ctxt)
1149
1150+ if 'nova-compute' == self.binary:
1151+ self.manager.update_available_resource(ctxt)
1152+
1153 conn1 = rpc.Connection.instance(new=True)
1154 conn2 = rpc.Connection.instance(new=True)
1155 if self.report_interval:
1156
1157=== modified file 'nova/tests/test_compute.py'
1158--- nova/tests/test_compute.py 2011-03-10 04:42:11 +0000
1159+++ nova/tests/test_compute.py 2011-03-10 06:27:59 +0000
1160@@ -20,6 +20,7 @@
1161 """
1162
1163 import datetime
1164+import mox
1165
1166 from nova import compute
1167 from nova import context
1168@@ -27,15 +28,20 @@
1169 from nova import exception
1170 from nova import flags
1171 from nova import log as logging
1172+from nova import rpc
1173 from nova import test
1174 from nova import utils
1175 from nova.auth import manager
1176 from nova.compute import instance_types
1177+from nova.compute import manager as compute_manager
1178+from nova.compute import power_state
1179+from nova.db.sqlalchemy import models
1180 from nova.image import local
1181
1182 LOG = logging.getLogger('nova.tests.compute')
1183 FLAGS = flags.FLAGS
1184 flags.DECLARE('stub_network', 'nova.compute.manager')
1185+flags.DECLARE('live_migration_retry_count', 'nova.compute.manager')
1186
1187
1188 class ComputeTestCase(test.TestCase):
1189@@ -83,6 +89,41 @@
1190 'project_id': self.project.id}
1191 return db.security_group_create(self.context, values)
1192
1193+ def _get_dummy_instance(self):
1194+ """Get mock-return-value instance object
1195+ Use this when any testcase executed later than test_run_terminate
1196+ """
1197+ vol1 = models.Volume()
1198+ vol1['id'] = 1
1199+ vol2 = models.Volume()
1200+ vol2['id'] = 2
1201+ instance_ref = models.Instance()
1202+ instance_ref['id'] = 1
1203+ instance_ref['volumes'] = [vol1, vol2]
1204+ instance_ref['hostname'] = 'i-00000001'
1205+ instance_ref['host'] = 'dummy'
1206+ return instance_ref
1207+
1208+ def test_create_instance_defaults_display_name(self):
1209+ """Verify that an instance cannot be created without a display_name."""
1210+ cases = [dict(), dict(display_name=None)]
1211+ for instance in cases:
1212+ ref = self.compute_api.create(self.context,
1213+ FLAGS.default_instance_type, None, **instance)
1214+ try:
1215+ self.assertNotEqual(ref[0]['display_name'], None)
1216+ finally:
1217+ db.instance_destroy(self.context, ref[0]['id'])
1218+
1219+ def test_create_instance_associates_security_groups(self):
1220+ """Make sure create associates security groups"""
1221+ group = self._create_group()
1222+ instance_ref = models.Instance()
1223+ instance_ref['id'] = 1
1224+ instance_ref['volumes'] = [{'id': 1}, {'id': 2}]
1225+ instance_ref['hostname'] = 'i-00000001'
1226+ return instance_ref
1227+
1228 def test_create_instance_defaults_display_name(self):
1229 """Verify that an instance cannot be created without a display_name."""
1230 cases = [dict(), dict(display_name=None)]
1231@@ -301,3 +342,256 @@
1232 self.compute.terminate_instance(self.context, instance_id)
1233 type = instance_types.get_by_flavor_id("1")
1234 self.assertEqual(type, 'm1.tiny')
1235+
1236+ def _setup_other_managers(self):
1237+ self.volume_manager = utils.import_object(FLAGS.volume_manager)
1238+ self.network_manager = utils.import_object(FLAGS.network_manager)
1239+ self.compute_driver = utils.import_object(FLAGS.compute_driver)
1240+
1241+ def test_pre_live_migration_instance_has_no_fixed_ip(self):
1242+ """Confirm raising exception if instance doesn't have fixed_ip."""
1243+ instance_ref = self._get_dummy_instance()
1244+ c = context.get_admin_context()
1245+ i_id = instance_ref['id']
1246+
1247+ dbmock = self.mox.CreateMock(db)
1248+ dbmock.instance_get(c, i_id).AndReturn(instance_ref)
1249+ dbmock.instance_get_fixed_address(c, i_id).AndReturn(None)
1250+
1251+ self.compute.db = dbmock
1252+ self.mox.ReplayAll()
1253+ self.assertRaises(exception.NotFound,
1254+ self.compute.pre_live_migration,
1255+ c, instance_ref['id'])
1256+
1257+ def test_pre_live_migration_instance_has_volume(self):
1258+ """Confirm setup_compute_volume is called when volume is mounted."""
1259+ i_ref = self._get_dummy_instance()
1260+ c = context.get_admin_context()
1261+
1262+ self._setup_other_managers()
1263+ dbmock = self.mox.CreateMock(db)
1264+ volmock = self.mox.CreateMock(self.volume_manager)
1265+ netmock = self.mox.CreateMock(self.network_manager)
1266+ drivermock = self.mox.CreateMock(self.compute_driver)
1267+
1268+ dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
1269+ dbmock.instance_get_fixed_address(c, i_ref['id']).AndReturn('dummy')
1270+ for i in range(len(i_ref['volumes'])):
1271+ vid = i_ref['volumes'][i]['id']
1272+ volmock.setup_compute_volume(c, vid).InAnyOrder('g1')
1273+ netmock.setup_compute_network(c, i_ref['id'])
1274+ drivermock.ensure_filtering_rules_for_instance(i_ref)
1275+
1276+ self.compute.db = dbmock
1277+ self.compute.volume_manager = volmock
1278+ self.compute.network_manager = netmock
1279+ self.compute.driver = drivermock
1280+
1281+ self.mox.ReplayAll()
1282+ ret = self.compute.pre_live_migration(c, i_ref['id'])
1283+ self.assertEqual(ret, None)
1284+
1285+ def test_pre_live_migration_instance_has_no_volume(self):
1286+ """Confirm log meg when instance doesn't mount any volumes."""
1287+ i_ref = self._get_dummy_instance()
1288+ i_ref['volumes'] = []
1289+ c = context.get_admin_context()
1290+
1291+ self._setup_other_managers()
1292+ dbmock = self.mox.CreateMock(db)
1293+ netmock = self.mox.CreateMock(self.network_manager)
1294+ drivermock = self.mox.CreateMock(self.compute_driver)
1295+
1296+ dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
1297+ dbmock.instance_get_fixed_address(c, i_ref['id']).AndReturn('dummy')
1298+ self.mox.StubOutWithMock(compute_manager.LOG, 'info')
1299+ compute_manager.LOG.info(_("%s has no volume."), i_ref['hostname'])
1300+ netmock.setup_compute_network(c, i_ref['id'])
1301+ drivermock.ensure_filtering_rules_for_instance(i_ref)
1302+
1303+ self.compute.db = dbmock
1304+ self.compute.network_manager = netmock
1305+ self.compute.driver = drivermock
1306+
1307+ self.mox.ReplayAll()
1308+ ret = self.compute.pre_live_migration(c, i_ref['id'])
1309+ self.assertEqual(ret, None)
1310+
1311+ def test_pre_live_migration_setup_compute_node_fail(self):
1312+ """Confirm operation setup_compute_network() fails.
1313+
1314+ It retries and raises an exception when the retry count is exceeded.
1315+
1316+ """
1317+
1318+ i_ref = self._get_dummy_instance()
1319+ c = context.get_admin_context()
1320+
1321+ self._setup_other_managers()
1322+ dbmock = self.mox.CreateMock(db)
1323+ netmock = self.mox.CreateMock(self.network_manager)
1324+ volmock = self.mox.CreateMock(self.volume_manager)
1325+
1326+ dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
1327+ dbmock.instance_get_fixed_address(c, i_ref['id']).AndReturn('dummy')
1328+ for i in range(len(i_ref['volumes'])):
1329+ volmock.setup_compute_volume(c, i_ref['volumes'][i]['id'])
1330+ for i in range(FLAGS.live_migration_retry_count):
1331+ netmock.setup_compute_network(c, i_ref['id']).\
1332+ AndRaise(exception.ProcessExecutionError())
1333+
1334+ self.compute.db = dbmock
1335+ self.compute.network_manager = netmock
1336+ self.compute.volume_manager = volmock
1337+
1338+ self.mox.ReplayAll()
1339+ self.assertRaises(exception.ProcessExecutionError,
1340+ self.compute.pre_live_migration,
1341+ c, i_ref['id'])
1342+
1343+ def test_live_migration_works_correctly_with_volume(self):
1344+ """Confirm check_for_export to confirm volume health check."""
1345+ i_ref = self._get_dummy_instance()
1346+ c = context.get_admin_context()
1347+ topic = db.queue_get_for(c, FLAGS.compute_topic, i_ref['host'])
1348+
1349+ dbmock = self.mox.CreateMock(db)
1350+ dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
1351+ self.mox.StubOutWithMock(rpc, 'call')
1352+ rpc.call(c, FLAGS.volume_topic, {"method": "check_for_export",
1353+ "args": {'instance_id': i_ref['id']}})
1354+ dbmock.queue_get_for(c, FLAGS.compute_topic, i_ref['host']).\
1355+ AndReturn(topic)
1356+ rpc.call(c, topic, {"method": "pre_live_migration",
1357+ "args": {'instance_id': i_ref['id']}})
1358+ self.mox.StubOutWithMock(self.compute.driver, 'live_migration')
1359+ self.compute.driver.live_migration(c, i_ref, i_ref['host'],
1360+ self.compute.post_live_migration,
1361+ self.compute.recover_live_migration)
1362+
1363+ self.compute.db = dbmock
1364+ self.mox.ReplayAll()
1365+ ret = self.compute.live_migration(c, i_ref['id'], i_ref['host'])
1366+ self.assertEqual(ret, None)
1367+
1368+ def test_live_migration_dest_raises_exception(self):
1369+ """Confirm exception when pre_live_migration fails."""
1370+ i_ref = self._get_dummy_instance()
1371+ c = context.get_admin_context()
1372+ topic = db.queue_get_for(c, FLAGS.compute_topic, i_ref['host'])
1373+
1374+ dbmock = self.mox.CreateMock(db)
1375+ dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
1376+ self.mox.StubOutWithMock(rpc, 'call')
1377+ rpc.call(c, FLAGS.volume_topic, {"method": "check_for_export",
1378+ "args": {'instance_id': i_ref['id']}})
1379+ dbmock.queue_get_for(c, FLAGS.compute_topic, i_ref['host']).\
1380+ AndReturn(topic)
1381+ rpc.call(c, topic, {"method": "pre_live_migration",
1382+ "args": {'instance_id': i_ref['id']}}).\
1383+ AndRaise(rpc.RemoteError('', '', ''))
1384+ dbmock.instance_update(c, i_ref['id'], {'state_description': 'running',
1385+ 'state': power_state.RUNNING,
1386+ 'host': i_ref['host']})
1387+ for v in i_ref['volumes']:
1388+ dbmock.volume_update(c, v['id'], {'status': 'in-use'})
1389+
1390+ self.compute.db = dbmock
1391+ self.mox.ReplayAll()
1392+ self.assertRaises(rpc.RemoteError,
1393+ self.compute.live_migration,
1394+ c, i_ref['id'], i_ref['host'])
1395+
1396+ def test_live_migration_dest_raises_exception_no_volume(self):
1397+ """Same as above test(input pattern is different) """
1398+ i_ref = self._get_dummy_instance()
1399+ i_ref['volumes'] = []
1400+ c = context.get_admin_context()
1401+ topic = db.queue_get_for(c, FLAGS.compute_topic, i_ref['host'])
1402+
1403+ dbmock = self.mox.CreateMock(db)
1404+ dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
1405+ dbmock.queue_get_for(c, FLAGS.compute_topic, i_ref['host']).\
1406+ AndReturn(topic)
1407+ self.mox.StubOutWithMock(rpc, 'call')
1408+ rpc.call(c, topic, {"method": "pre_live_migration",
1409+ "args": {'instance_id': i_ref['id']}}).\
1410+ AndRaise(rpc.RemoteError('', '', ''))
1411+ dbmock.instance_update(c, i_ref['id'], {'state_description': 'running',
1412+ 'state': power_state.RUNNING,
1413+ 'host': i_ref['host']})
1414+
1415+ self.compute.db = dbmock
1416+ self.mox.ReplayAll()
1417+ self.assertRaises(rpc.RemoteError,
1418+ self.compute.live_migration,
1419+ c, i_ref['id'], i_ref['host'])
1420+
1421+ def test_live_migration_works_correctly_no_volume(self):
1422+ """Confirm live_migration() works as expected correctly."""
1423+ i_ref = self._get_dummy_instance()
1424+ i_ref['volumes'] = []
1425+ c = context.get_admin_context()
1426+ topic = db.queue_get_for(c, FLAGS.compute_topic, i_ref['host'])
1427+
1428+ dbmock = self.mox.CreateMock(db)
1429+ dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
1430+ self.mox.StubOutWithMock(rpc, 'call')
1431+ dbmock.queue_get_for(c, FLAGS.compute_topic, i_ref['host']).\
1432+ AndReturn(topic)
1433+ rpc.call(c, topic, {"method": "pre_live_migration",
1434+ "args": {'instance_id': i_ref['id']}})
1435+ self.mox.StubOutWithMock(self.compute.driver, 'live_migration')
1436+ self.compute.driver.live_migration(c, i_ref, i_ref['host'],
1437+ self.compute.post_live_migration,
1438+ self.compute.recover_live_migration)
1439+
1440+ self.compute.db = dbmock
1441+ self.mox.ReplayAll()
1442+ ret = self.compute.live_migration(c, i_ref['id'], i_ref['host'])
1443+ self.assertEqual(ret, None)
1444+
1445+ def test_post_live_migration_working_correctly(self):
1446+ """Confirm post_live_migration() works as expected correctly."""
1447+ dest = 'desthost'
1448+ flo_addr = '1.2.1.2'
1449+
1450+ # Preparing data
1451+ c = context.get_admin_context()
1452+ instance_id = self._create_instance()
1453+ i_ref = db.instance_get(c, instance_id)
1454+ db.instance_update(c, i_ref['id'], {'state_description': 'migrating',
1455+ 'state': power_state.PAUSED})
1456+ v_ref = db.volume_create(c, {'size': 1, 'instance_id': instance_id})
1457+ fix_addr = db.fixed_ip_create(c, {'address': '1.1.1.1',
1458+ 'instance_id': instance_id})
1459+ fix_ref = db.fixed_ip_get_by_address(c, fix_addr)
1460+ flo_ref = db.floating_ip_create(c, {'address': flo_addr,
1461+ 'fixed_ip_id': fix_ref['id']})
1462+ # reload is necessary before setting mocks
1463+ i_ref = db.instance_get(c, instance_id)
1464+
1465+ # Preparing mocks
1466+ self.mox.StubOutWithMock(self.compute.volume_manager,
1467+ 'remove_compute_volume')
1468+ for v in i_ref['volumes']:
1469+ self.compute.volume_manager.remove_compute_volume(c, v['id'])
1470+ self.mox.StubOutWithMock(self.compute.driver, 'unfilter_instance')
1471+ self.compute.driver.unfilter_instance(i_ref)
1472+
1473+ # executing
1474+ self.mox.ReplayAll()
1475+ ret = self.compute.post_live_migration(c, i_ref, dest)
1476+
1477+ # make sure all data is rewritten to dest
1478+ i_ref = db.instance_get(c, i_ref['id'])
1479+ c1 = (i_ref['host'] == dest)
1480+ flo_refs = db.floating_ip_get_all_by_host(c, dest)
1481+ c2 = (len(flo_refs) != 0 and flo_refs[0]['address'] == flo_addr)
1482+
1483+ # post operation
1484+ self.assertTrue(c1 and c2)
1485+ db.instance_destroy(c, instance_id)
1486+ db.volume_destroy(c, v_ref['id'])
1487+ db.floating_ip_destroy(c, flo_addr)
1488
1489=== modified file 'nova/tests/test_scheduler.py'
1490--- nova/tests/test_scheduler.py 2011-03-07 01:25:01 +0000
1491+++ nova/tests/test_scheduler.py 2011-03-10 06:27:59 +0000
1492@@ -20,10 +20,12 @@
1493 """
1494
1495 import datetime
1496+import mox
1497
1498 from mox import IgnoreArg
1499 from nova import context
1500 from nova import db
1501+from nova import exception
1502 from nova import flags
1503 from nova import service
1504 from nova import test
1505@@ -32,11 +34,14 @@
1506 from nova.auth import manager as auth_manager
1507 from nova.scheduler import manager
1508 from nova.scheduler import driver
1509+from nova.compute import power_state
1510+from nova.db.sqlalchemy import models
1511
1512
1513 FLAGS = flags.FLAGS
1514 flags.DECLARE('max_cores', 'nova.scheduler.simple')
1515 flags.DECLARE('stub_network', 'nova.compute.manager')
1516+flags.DECLARE('instances_path', 'nova.compute.manager')
1517
1518
1519 class TestDriver(driver.Scheduler):
1520@@ -54,6 +59,34 @@
1521 super(SchedulerTestCase, self).setUp()
1522 self.flags(scheduler_driver='nova.tests.test_scheduler.TestDriver')
1523
1524+ def _create_compute_service(self):
1525+ """Create compute-manager(ComputeNode and Service record)."""
1526+ ctxt = context.get_admin_context()
1527+ dic = {'host': 'dummy', 'binary': 'nova-compute', 'topic': 'compute',
1528+ 'report_count': 0, 'availability_zone': 'dummyzone'}
1529+ s_ref = db.service_create(ctxt, dic)
1530+
1531+ dic = {'service_id': s_ref['id'],
1532+ 'vcpus': 16, 'memory_mb': 32, 'local_gb': 100,
1533+ 'vcpus_used': 16, 'memory_mb_used': 32, 'local_gb_used': 10,
1534+ 'hypervisor_type': 'qemu', 'hypervisor_version': 12003,
1535+ 'cpu_info': ''}
1536+ db.compute_node_create(ctxt, dic)
1537+
1538+ return db.service_get(ctxt, s_ref['id'])
1539+
1540+ def _create_instance(self, **kwargs):
1541+ """Create a test instance"""
1542+ ctxt = context.get_admin_context()
1543+ inst = {}
1544+ inst['user_id'] = 'admin'
1545+ inst['project_id'] = kwargs.get('project_id', 'fake')
1546+ inst['host'] = kwargs.get('host', 'dummy')
1547+ inst['vcpus'] = kwargs.get('vcpus', 1)
1548+ inst['memory_mb'] = kwargs.get('memory_mb', 10)
1549+ inst['local_gb'] = kwargs.get('local_gb', 20)
1550+ return db.instance_create(ctxt, inst)
1551+
1552 def test_fallback(self):
1553 scheduler = manager.SchedulerManager()
1554 self.mox.StubOutWithMock(rpc, 'cast', use_mock_anything=True)
1555@@ -76,6 +109,73 @@
1556 self.mox.ReplayAll()
1557 scheduler.named_method(ctxt, 'topic', num=7)
1558
1559+ def test_show_host_resources_host_not_exist(self):
1560+ """A host given as an argument does not exist."""
1561+
1562+ scheduler = manager.SchedulerManager()
1563+ dest = 'dummydest'
1564+ ctxt = context.get_admin_context()
1565+
1566+ try:
1567+ scheduler.show_host_resources(ctxt, dest)
1568+ except exception.NotFound, e:
1569+ c1 = (e.message.find(_("does not exist or is not a "
1570+ "compute node.")) >= 0)
1571+ self.assertTrue(c1)
1572+
1573+ def _dic_is_equal(self, dic1, dic2, keys=None):
1574+ """Compares 2 dictionary contents(Helper method)"""
1575+ if not keys:
1576+ keys = ['vcpus', 'memory_mb', 'local_gb',
1577+ 'vcpus_used', 'memory_mb_used', 'local_gb_used']
1578+
1579+ for key in keys:
1580+ if not (dic1[key] == dic2[key]):
1581+ return False
1582+ return True
1583+
1584+ def test_show_host_resources_no_project(self):
1585+ """No instance are running on the given host."""
1586+
1587+ scheduler = manager.SchedulerManager()
1588+ ctxt = context.get_admin_context()
1589+ s_ref = self._create_compute_service()
1590+
1591+ result = scheduler.show_host_resources(ctxt, s_ref['host'])
1592+
1593+ # result checking
1594+ c1 = ('resource' in result and 'usage' in result)
1595+ compute_node = s_ref['compute_node'][0]
1596+ c2 = self._dic_is_equal(result['resource'], compute_node)
1597+ c3 = result['usage'] == {}
1598+ self.assertTrue(c1 and c2 and c3)
1599+ db.service_destroy(ctxt, s_ref['id'])
1600+
1601+ def test_show_host_resources_works_correctly(self):
1602+ """Show_host_resources() works correctly as expected."""
1603+
1604+ scheduler = manager.SchedulerManager()
1605+ ctxt = context.get_admin_context()
1606+ s_ref = self._create_compute_service()
1607+ i_ref1 = self._create_instance(project_id='p-01', host=s_ref['host'])
1608+ i_ref2 = self._create_instance(project_id='p-02', vcpus=3,
1609+ host=s_ref['host'])
1610+
1611+ result = scheduler.show_host_resources(ctxt, s_ref['host'])
1612+
1613+ c1 = ('resource' in result and 'usage' in result)
1614+ compute_node = s_ref['compute_node'][0]
1615+ c2 = self._dic_is_equal(result['resource'], compute_node)
1616+ c3 = result['usage'].keys() == ['p-01', 'p-02']
1617+ keys = ['vcpus', 'memory_mb', 'local_gb']
1618+ c4 = self._dic_is_equal(result['usage']['p-01'], i_ref1, keys)
1619+ c5 = self._dic_is_equal(result['usage']['p-02'], i_ref2, keys)
1620+ self.assertTrue(c1 and c2 and c3 and c4 and c5)
1621+
1622+ db.service_destroy(ctxt, s_ref['id'])
1623+ db.instance_destroy(ctxt, i_ref1['id'])
1624+ db.instance_destroy(ctxt, i_ref2['id'])
1625+
1626
1627 class ZoneSchedulerTestCase(test.TestCase):
1628 """Test case for zone scheduler"""
1629@@ -161,9 +261,15 @@
1630 inst['project_id'] = self.project.id
1631 inst['instance_type'] = 'm1.tiny'
1632 inst['mac_address'] = utils.generate_mac()
1633+ inst['vcpus'] = kwargs.get('vcpus', 1)
1634 inst['ami_launch_index'] = 0
1635- inst['vcpus'] = 1
1636 inst['availability_zone'] = kwargs.get('availability_zone', None)
1637+ inst['host'] = kwargs.get('host', 'dummy')
1638+ inst['memory_mb'] = kwargs.get('memory_mb', 20)
1639+ inst['local_gb'] = kwargs.get('local_gb', 30)
1640+ inst['launched_on'] = kwargs.get('launched_on', 'dummy')
1641+ inst['state_description'] = kwargs.get('state_description', 'running')
1642+ inst['state'] = kwargs.get('state', power_state.RUNNING)
1643 return db.instance_create(self.context, inst)['id']
1644
1645 def _create_volume(self):
1646@@ -173,6 +279,211 @@
1647 vol['availability_zone'] = 'test'
1648 return db.volume_create(self.context, vol)['id']
1649
1650+ def _create_compute_service(self, **kwargs):
1651+ """Create a compute service."""
1652+
1653+ dic = {'binary': 'nova-compute', 'topic': 'compute',
1654+ 'report_count': 0, 'availability_zone': 'dummyzone'}
1655+ dic['host'] = kwargs.get('host', 'dummy')
1656+ s_ref = db.service_create(self.context, dic)
1657+ if 'created_at' in kwargs.keys() or 'updated_at' in kwargs.keys():
1658+ t = datetime.datetime.utcnow() - datetime.timedelta(0)
1659+ dic['created_at'] = kwargs.get('created_at', t)
1660+ dic['updated_at'] = kwargs.get('updated_at', t)
1661+ db.service_update(self.context, s_ref['id'], dic)
1662+
1663+ dic = {'service_id': s_ref['id'],
1664+ 'vcpus': 16, 'memory_mb': 32, 'local_gb': 100,
1665+ 'vcpus_used': 16, 'local_gb_used': 10,
1666+ 'hypervisor_type': 'qemu', 'hypervisor_version': 12003,
1667+ 'cpu_info': ''}
1668+ dic['memory_mb_used'] = kwargs.get('memory_mb_used', 32)
1669+ dic['hypervisor_type'] = kwargs.get('hypervisor_type', 'qemu')
1670+ dic['hypervisor_version'] = kwargs.get('hypervisor_version', 12003)
1671+ db.compute_node_create(self.context, dic)
1672+ return db.service_get(self.context, s_ref['id'])
1673+
1674+ def test_doesnt_report_disabled_hosts_as_up(self):
1675+ """Ensures driver doesn't find hosts before they are enabled"""
1676+ # NOTE(vish): constructing service without create method
1677+ # because we are going to use it without queue
1678+ compute1 = service.Service('host1',
1679+ 'nova-compute',
1680+ 'compute',
1681+ FLAGS.compute_manager)
1682+ compute1.start()
1683+ compute2 = service.Service('host2',
1684+ 'nova-compute',
1685+ 'compute',
1686+ FLAGS.compute_manager)
1687+ compute2.start()
1688+ s1 = db.service_get_by_args(self.context, 'host1', 'nova-compute')
1689+ s2 = db.service_get_by_args(self.context, 'host2', 'nova-compute')
1690+ db.service_update(self.context, s1['id'], {'disabled': True})
1691+ db.service_update(self.context, s2['id'], {'disabled': True})
1692+ hosts = self.scheduler.driver.hosts_up(self.context, 'compute')
1693+ self.assertEqual(0, len(hosts))
1694+ compute1.kill()
1695+ compute2.kill()
1696+
1697+ def test_reports_enabled_hosts_as_up(self):
1698+ """Ensures driver can find the hosts that are up"""
1699+ # NOTE(vish): constructing service without create method
1700+ # because we are going to use it without queue
1701+ compute1 = service.Service('host1',
1702+ 'nova-compute',
1703+ 'compute',
1704+ FLAGS.compute_manager)
1705+ compute1.start()
1706+ compute2 = service.Service('host2',
1707+ 'nova-compute',
1708+ 'compute',
1709+ FLAGS.compute_manager)
1710+ compute2.start()
1711+ hosts = self.scheduler.driver.hosts_up(self.context, 'compute')
1712+ self.assertEqual(2, len(hosts))
1713+ compute1.kill()
1714+ compute2.kill()
1715+
1716+ def test_least_busy_host_gets_instance(self):
1717+ """Ensures the host with less cores gets the next one"""
1718+ compute1 = service.Service('host1',
1719+ 'nova-compute',
1720+ 'compute',
1721+ FLAGS.compute_manager)
1722+ compute1.start()
1723+ compute2 = service.Service('host2',
1724+ 'nova-compute',
1725+ 'compute',
1726+ FLAGS.compute_manager)
1727+ compute2.start()
1728+ instance_id1 = self._create_instance()
1729+ compute1.run_instance(self.context, instance_id1)
1730+ instance_id2 = self._create_instance()
1731+ host = self.scheduler.driver.schedule_run_instance(self.context,
1732+ instance_id2)
1733+ self.assertEqual(host, 'host2')
1734+ compute1.terminate_instance(self.context, instance_id1)
1735+ db.instance_destroy(self.context, instance_id2)
1736+ compute1.kill()
1737+ compute2.kill()
1738+
1739+ def test_specific_host_gets_instance(self):
1740+ """Ensures if you set availability_zone it launches on that zone"""
1741+ compute1 = service.Service('host1',
1742+ 'nova-compute',
1743+ 'compute',
1744+ FLAGS.compute_manager)
1745+ compute1.start()
1746+ compute2 = service.Service('host2',
1747+ 'nova-compute',
1748+ 'compute',
1749+ FLAGS.compute_manager)
1750+ compute2.start()
1751+ instance_id1 = self._create_instance()
1752+ compute1.run_instance(self.context, instance_id1)
1753+ instance_id2 = self._create_instance(availability_zone='nova:host1')
1754+ host = self.scheduler.driver.schedule_run_instance(self.context,
1755+ instance_id2)
1756+ self.assertEqual('host1', host)
1757+ compute1.terminate_instance(self.context, instance_id1)
1758+ db.instance_destroy(self.context, instance_id2)
1759+ compute1.kill()
1760+ compute2.kill()
1761+
1762+ def test_wont_schedule_if_specified_host_is_down(self):
1763+ compute1 = service.Service('host1',
1764+ 'nova-compute',
1765+ 'compute',
1766+ FLAGS.compute_manager)
1767+ compute1.start()
1768+ s1 = db.service_get_by_args(self.context, 'host1', 'nova-compute')
1769+ now = datetime.datetime.utcnow()
1770+ delta = datetime.timedelta(seconds=FLAGS.service_down_time * 2)
1771+ past = now - delta
1772+ db.service_update(self.context, s1['id'], {'updated_at': past})
1773+ instance_id2 = self._create_instance(availability_zone='nova:host1')
1774+ self.assertRaises(driver.WillNotSchedule,
1775+ self.scheduler.driver.schedule_run_instance,
1776+ self.context,
1777+ instance_id2)
1778+ db.instance_destroy(self.context, instance_id2)
1779+ compute1.kill()
1780+
1781+ def test_will_schedule_on_disabled_host_if_specified(self):
1782+ compute1 = service.Service('host1',
1783+ 'nova-compute',
1784+ 'compute',
1785+ FLAGS.compute_manager)
1786+ compute1.start()
1787+ s1 = db.service_get_by_args(self.context, 'host1', 'nova-compute')
1788+ db.service_update(self.context, s1['id'], {'disabled': True})
1789+ instance_id2 = self._create_instance(availability_zone='nova:host1')
1790+ host = self.scheduler.driver.schedule_run_instance(self.context,
1791+ instance_id2)
1792+ self.assertEqual('host1', host)
1793+ db.instance_destroy(self.context, instance_id2)
1794+ compute1.kill()
1795+
1796+ def test_too_many_cores(self):
1797+ """Ensures we don't go over max cores"""
1798+ compute1 = service.Service('host1',
1799+ 'nova-compute',
1800+ 'compute',
1801+ FLAGS.compute_manager)
1802+ compute1.start()
1803+ compute2 = service.Service('host2',
1804+ 'nova-compute',
1805+ 'compute',
1806+ FLAGS.compute_manager)
1807+ compute2.start()
1808+ instance_ids1 = []
1809+ instance_ids2 = []
1810+ for index in xrange(FLAGS.max_cores):
1811+ instance_id = self._create_instance()
1812+ compute1.run_instance(self.context, instance_id)
1813+ instance_ids1.append(instance_id)
1814+ instance_id = self._create_instance()
1815+ compute2.run_instance(self.context, instance_id)
1816+ instance_ids2.append(instance_id)
1817+ instance_id = self._create_instance()
1818+ self.assertRaises(driver.NoValidHost,
1819+ self.scheduler.driver.schedule_run_instance,
1820+ self.context,
1821+ instance_id)
1822+ for instance_id in instance_ids1:
1823+ compute1.terminate_instance(self.context, instance_id)
1824+ for instance_id in instance_ids2:
1825+ compute2.terminate_instance(self.context, instance_id)
1826+ compute1.kill()
1827+ compute2.kill()
1828+
1829+ def test_least_busy_host_gets_volume(self):
1830+ """Ensures the host with less gigabytes gets the next one"""
1831+ volume1 = service.Service('host1',
1832+ 'nova-volume',
1833+ 'volume',
1834+ FLAGS.volume_manager)
1835+ volume1.start()
1836+ volume2 = service.Service('host2',
1837+ 'nova-volume',
1838+ 'volume',
1839+ FLAGS.volume_manager)
1840+ volume2.start()
1841+ volume_id1 = self._create_volume()
1842+ volume1.create_volume(self.context, volume_id1)
1843+ volume_id2 = self._create_volume()
1844+ host = self.scheduler.driver.schedule_create_volume(self.context,
1845+ volume_id2)
1846+ self.assertEqual(host, 'host2')
1847+ volume1.delete_volume(self.context, volume_id1)
1848+ db.volume_destroy(self.context, volume_id2)
1854+
1855 def test_doesnt_report_disabled_hosts_as_up(self):
1856 """Ensures driver doesn't find hosts before they are enabled"""
1857 compute1 = self.start_service('compute', host='host1')
1858@@ -316,3 +627,313 @@
1859 volume2.delete_volume(self.context, volume_id)
1860 volume1.kill()
1861 volume2.kill()
1862+
1863+ def test_scheduler_live_migration_with_volume(self):
1864+ """scheduler_live_migration() works correctly as expected.
1865+
1866+ Also, checks instance state is changed from 'running' -> 'migrating'.
1867+
1868+ """
1869+
1870+ instance_id = self._create_instance()
1871+ i_ref = db.instance_get(self.context, instance_id)
1872+ dic = {'instance_id': instance_id, 'size': 1}
1873+ v_ref = db.volume_create(self.context, dic)
1874+
1875+ # cannot check the 2nd argument because the address of the
1876+ # instance object differs between calls.
1877+ driver_i = self.scheduler.driver
1878+ nocare = mox.IgnoreArg()
1879+ self.mox.StubOutWithMock(driver_i, '_live_migration_src_check')
1880+ self.mox.StubOutWithMock(driver_i, '_live_migration_dest_check')
1881+ self.mox.StubOutWithMock(driver_i, '_live_migration_common_check')
1882+ driver_i._live_migration_src_check(nocare, nocare)
1883+ driver_i._live_migration_dest_check(nocare, nocare, i_ref['host'])
1884+ driver_i._live_migration_common_check(nocare, nocare, i_ref['host'])
1885+ self.mox.StubOutWithMock(rpc, 'cast', use_mock_anything=True)
1886+ kwargs = {'instance_id': instance_id, 'dest': i_ref['host']}
1887+ rpc.cast(self.context,
1888+ db.queue_get_for(nocare, FLAGS.compute_topic, i_ref['host']),
1889+ {"method": 'live_migration', "args": kwargs})
1890+
1891+ self.mox.ReplayAll()
1892+ self.scheduler.live_migration(self.context, FLAGS.compute_topic,
1893+ instance_id=instance_id,
1894+ dest=i_ref['host'])
1895+
1896+ i_ref = db.instance_get(self.context, instance_id)
1897+ self.assertTrue(i_ref['state_description'] == 'migrating')
1898+ db.instance_destroy(self.context, instance_id)
1899+ db.volume_destroy(self.context, v_ref['id'])
1900+
1901+ def test_live_migration_src_check_instance_not_running(self):
1902+ """The instance given by instance_id is not running."""
1903+
1904+ instance_id = self._create_instance(state_description='migrating')
1905+ i_ref = db.instance_get(self.context, instance_id)
1906+
1907+ try:
1908+ self.scheduler.driver._live_migration_src_check(self.context,
1909+ i_ref)
1910+ except exception.Invalid, e:
1911+ c = (e.message.find('is not running') > 0)
1912+
1913+ self.assertTrue(c)
1914+ db.instance_destroy(self.context, instance_id)
1915+
1916+ def test_live_migration_src_check_volume_node_not_alive(self):
1917+ """Raise exception when volume node is not alive."""
1918+
1919+ instance_id = self._create_instance()
1920+ i_ref = db.instance_get(self.context, instance_id)
1921+ dic = {'instance_id': instance_id, 'size': 1}
1922+ v_ref = db.volume_create(self.context, {'instance_id': instance_id,
1923+ 'size': 1})
1924+ t1 = datetime.datetime.utcnow() - datetime.timedelta(1)
1925+ dic = {'created_at': t1, 'updated_at': t1, 'binary': 'nova-volume',
1926+ 'topic': 'volume', 'report_count': 0}
1927+ s_ref = db.service_create(self.context, dic)
1928+
1929+ try:
1930+ self.scheduler.driver.schedule_live_migration(self.context,
1931+ instance_id,
1932+ i_ref['host'])
1933+ except exception.Invalid, e:
1934+ c = (e.message.find('volume node is not alive') >= 0)
1935+
1936+ self.assertTrue(c)
1937+ db.instance_destroy(self.context, instance_id)
1938+ db.service_destroy(self.context, s_ref['id'])
1939+ db.volume_destroy(self.context, v_ref['id'])
1940+
1941+ def test_live_migration_src_check_compute_node_not_alive(self):
1942+ """Confirms src-compute node is alive."""
1943+ instance_id = self._create_instance()
1944+ i_ref = db.instance_get(self.context, instance_id)
1945+ t = datetime.datetime.utcnow() - datetime.timedelta(10)
1946+ s_ref = self._create_compute_service(created_at=t, updated_at=t,
1947+ host=i_ref['host'])
1948+
1949+ try:
1950+ self.scheduler.driver._live_migration_src_check(self.context,
1951+ i_ref)
1952+ except exception.Invalid, e:
1953+ c = (e.message.find('is not alive') >= 0)
1954+
1955+ self.assertTrue(c)
1956+ db.instance_destroy(self.context, instance_id)
1957+ db.service_destroy(self.context, s_ref['id'])
1958+
1959+ def test_live_migration_src_check_works_correctly(self):
1960+ """Confirms this method finishes with no error."""
1961+ instance_id = self._create_instance()
1962+ i_ref = db.instance_get(self.context, instance_id)
1963+ s_ref = self._create_compute_service(host=i_ref['host'])
1964+
1965+ ret = self.scheduler.driver._live_migration_src_check(self.context,
1966+ i_ref)
1967+
1968+ self.assertTrue(ret == None)
1969+ db.instance_destroy(self.context, instance_id)
1970+ db.service_destroy(self.context, s_ref['id'])
1971+
1972+ def test_live_migration_dest_check_not_alive(self):
1973+ """Confirms exception raises in case dest host does not exist."""
1974+ instance_id = self._create_instance()
1975+ i_ref = db.instance_get(self.context, instance_id)
1976+ t = datetime.datetime.utcnow() - datetime.timedelta(10)
1977+ s_ref = self._create_compute_service(created_at=t, updated_at=t,
1978+ host=i_ref['host'])
1979+
1980+ try:
1981+ self.scheduler.driver._live_migration_dest_check(self.context,
1982+ i_ref,
1983+ i_ref['host'])
1984+ except exception.Invalid, e:
1985+ c = (e.message.find('is not alive') >= 0)
1986+
1987+ self.assertTrue(c)
1988+ db.instance_destroy(self.context, instance_id)
1989+ db.service_destroy(self.context, s_ref['id'])
1990+
1991+ def test_live_migration_dest_check_service_same_host(self):
1992+ """Confirms exceptioin raises in case dest and src is same host."""
1993+ instance_id = self._create_instance()
1994+ i_ref = db.instance_get(self.context, instance_id)
1995+ s_ref = self._create_compute_service(host=i_ref['host'])
1996+
1997+ try:
1998+ self.scheduler.driver._live_migration_dest_check(self.context,
1999+ i_ref,
2000+ i_ref['host'])
2001+ except exception.Invalid, e:
2002+ c = (e.message.find('choose other host') >= 0)
2003+
2004+ self.assertTrue(c)
2005+ db.instance_destroy(self.context, instance_id)
2006+ db.service_destroy(self.context, s_ref['id'])
2007+
2008+ def test_live_migration_dest_check_service_lack_memory(self):
2009+ """Confirms exception raises when dest doesn't have enough memory."""
2010+ instance_id = self._create_instance()
2011+ i_ref = db.instance_get(self.context, instance_id)
2012+ s_ref = self._create_compute_service(host='somewhere',
2013+ memory_mb_used=12)
2014+
2015+ try:
2016+ self.scheduler.driver._live_migration_dest_check(self.context,
2017+ i_ref,
2018+ 'somewhere')
2019+ except exception.NotEmpty, e:
2020+ c = (e.message.find('Unable to migrate') >= 0)
2021+
2022+ self.assertTrue(c)
2023+ db.instance_destroy(self.context, instance_id)
2024+ db.service_destroy(self.context, s_ref['id'])
2025+
2026+ def test_live_migration_dest_check_service_works_correctly(self):
2027+ """Confirms method finishes with no error."""
2028+ instance_id = self._create_instance()
2029+ i_ref = db.instance_get(self.context, instance_id)
2030+ s_ref = self._create_compute_service(host='somewhere',
2031+ memory_mb_used=5)
2032+
2033+ ret = self.scheduler.driver._live_migration_dest_check(self.context,
2034+ i_ref,
2035+ 'somewhere')
2036+ self.assertTrue(ret == None)
2037+ db.instance_destroy(self.context, instance_id)
2038+ db.service_destroy(self.context, s_ref['id'])
2039+
2040+ def test_live_migration_common_check_service_orig_not_exists(self):
2041+ """Destination host does not exist."""
2042+
2043+ dest = 'dummydest'
2044+ # mocks for live_migration_common_check()
2045+ instance_id = self._create_instance()
2046+ i_ref = db.instance_get(self.context, instance_id)
2047+ t1 = datetime.datetime.utcnow() - datetime.timedelta(10)
2048+ s_ref = self._create_compute_service(created_at=t1, updated_at=t1,
2049+ host=dest)
2050+
2051+ # mocks for mounted_on_same_shared_storage()
2052+ fpath = '/test/20110127120000'
2053+ self.mox.StubOutWithMock(driver, 'rpc', use_mock_anything=True)
2054+ topic = FLAGS.compute_topic
2055+ driver.rpc.call(mox.IgnoreArg(),
2056+ db.queue_get_for(self.context, topic, dest),
2057+ {"method": 'create_shared_storage_test_file'}).AndReturn(fpath)
2058+ driver.rpc.call(mox.IgnoreArg(),
2059+ db.queue_get_for(mox.IgnoreArg(), topic, i_ref['host']),
2060+ {"method": 'check_shared_storage_test_file',
2061+ "args": {'filename': fpath}})
2062+ driver.rpc.call(mox.IgnoreArg(),
2063+ db.queue_get_for(mox.IgnoreArg(), topic, dest),
2064+ {"method": 'cleanup_shared_storage_test_file',
2065+ "args": {'filename': fpath}})
2066+
2067+ self.mox.ReplayAll()
2068+ try:
2069+ self.scheduler.driver._live_migration_common_check(self.context,
2070+ i_ref,
2071+ dest)
2072+ except exception.Invalid, e:
2073+ c = (e.message.find('does not exist') >= 0)
2074+
2075+ self.assertTrue(c)
2076+ db.instance_destroy(self.context, instance_id)
2077+ db.service_destroy(self.context, s_ref['id'])
2078+
2079+ def test_live_migration_common_check_service_different_hypervisor(self):
2080+ """Original host and dest host has different hypervisor type."""
2081+ dest = 'dummydest'
2082+ instance_id = self._create_instance()
2083+ i_ref = db.instance_get(self.context, instance_id)
2084+
2085+ # compute service for destination
2086+ s_ref = self._create_compute_service(host=i_ref['host'])
2087+ # compute service for original host
2088+ s_ref2 = self._create_compute_service(host=dest, hypervisor_type='xen')
2089+
2090+ # mocks
2091+ driver = self.scheduler.driver
2092+ self.mox.StubOutWithMock(driver, 'mounted_on_same_shared_storage')
2093+ driver.mounted_on_same_shared_storage(mox.IgnoreArg(), i_ref, dest)
2094+
2095+ self.mox.ReplayAll()
2096+ try:
2097+ self.scheduler.driver._live_migration_common_check(self.context,
2098+ i_ref,
2099+ dest)
2100+ except exception.Invalid, e:
2101+ c = (e.message.find(_('Different hypervisor type')) >= 0)
2102+
2103+ self.assertTrue(c)
2104+ db.instance_destroy(self.context, instance_id)
2105+ db.service_destroy(self.context, s_ref['id'])
2106+ db.service_destroy(self.context, s_ref2['id'])
2107+
2108+ def test_live_migration_common_check_service_different_version(self):
2109+ """Original host and dest host has different hypervisor version."""
2110+ dest = 'dummydest'
2111+ instance_id = self._create_instance()
2112+ i_ref = db.instance_get(self.context, instance_id)
2113+
2114+ # compute service for destination
2115+ s_ref = self._create_compute_service(host=i_ref['host'])
2116+ # compute service for original host
2117+ s_ref2 = self._create_compute_service(host=dest,
2118+ hypervisor_version=12002)
2119+
2120+ # mocks
2121+ driver = self.scheduler.driver
2122+ self.mox.StubOutWithMock(driver, 'mounted_on_same_shared_storage')
2123+ driver.mounted_on_same_shared_storage(mox.IgnoreArg(), i_ref, dest)
2124+
2125+ self.mox.ReplayAll()
2126+ try:
2127+ self.scheduler.driver._live_migration_common_check(self.context,
2128+ i_ref,
2129+ dest)
2130+ except exception.Invalid, e:
2131+ c = (e.message.find(_('Older hypervisor version')) >= 0)
2132+
2133+ self.assertTrue(c)
2134+ db.instance_destroy(self.context, instance_id)
2135+ db.service_destroy(self.context, s_ref['id'])
2136+ db.service_destroy(self.context, s_ref2['id'])
2137+
2138+ def test_live_migration_common_check_checking_cpuinfo_fail(self):
2139+ """Raise excetion when original host doen't have compatible cpu."""
2140+
2141+ dest = 'dummydest'
2142+ instance_id = self._create_instance()
2143+ i_ref = db.instance_get(self.context, instance_id)
2144+
2145+ # compute service for the original (source) host
2146+ s_ref = self._create_compute_service(host=i_ref['host'])
2147+ # compute service for the destination host
2148+ s_ref2 = self._create_compute_service(host=dest)
2149+
2150+ # mocks
2151+ driver = self.scheduler.driver
2152+ self.mox.StubOutWithMock(driver, 'mounted_on_same_shared_storage')
2153+ driver.mounted_on_same_shared_storage(mox.IgnoreArg(), i_ref, dest)
2154+ self.mox.StubOutWithMock(rpc, 'call', use_mock_anything=True)
2155+ rpc.call(mox.IgnoreArg(), mox.IgnoreArg(),
2156+ {"method": 'compare_cpu',
2157+ "args": {'cpu_info': s_ref2['compute_node'][0]['cpu_info']}}).\
2158+ AndRaise(rpc.RemoteError("doesn't have compatibility to", "", ""))
2159+
2160+ self.mox.ReplayAll()
2161+ try:
2162+ self.scheduler.driver._live_migration_common_check(self.context,
2163+ i_ref,
2164+ dest)
2165+ except rpc.RemoteError, e:
2166+ c = (e.message.find(_("doesn't have compatibility to")) >= 0)
2167+
2168+ self.assertTrue(c)
2169+ db.instance_destroy(self.context, instance_id)
2170+ db.service_destroy(self.context, s_ref['id'])
2171+ db.service_destroy(self.context, s_ref2['id'])
2172
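Taken together, the scheduler tests above stub three private checks and then an rpc cast, which suggests the ordering sketched below. This is only an illustration built from the names the tests use; the real schedule_live_migration() in nova/scheduler/driver.py is part of this branch and may differ in detail.

    from nova import db
    from nova import flags
    from nova import rpc

    FLAGS = flags.FLAGS


    def schedule_live_migration_sketch(driver, context, instance_id, dest):
        """Illustrative ordering only, based on the mocks in the tests above."""
        i_ref = db.instance_get(context, instance_id)

        # 1. The instance must be running and its source host alive.
        driver._live_migration_src_check(context, i_ref)
        # 2. The destination must be alive, different from the source,
        #    and have enough free memory.
        driver._live_migration_dest_check(context, i_ref, dest)
        # 3. Shared storage plus compatible hypervisor type/version and CPU.
        driver._live_migration_common_check(context, i_ref, dest)

        # The compute manager on the instance's current host drives the
        # migration; the scheduler only casts the request to it.
        topic = db.queue_get_for(context, FLAGS.compute_topic, i_ref['host'])
        rpc.cast(context, topic,
                 {'method': 'live_migration',
                  'args': {'instance_id': instance_id, 'dest': dest}})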
2173=== modified file 'nova/tests/test_service.py'
2174--- nova/tests/test_service.py 2011-02-23 23:14:16 +0000
2175+++ nova/tests/test_service.py 2011-03-10 06:27:59 +0000
2176@@ -30,6 +30,7 @@
2177 from nova import test
2178 from nova import service
2179 from nova import manager
2180+from nova.compute import manager as compute_manager
2181
2182 FLAGS = flags.FLAGS
2183 flags.DEFINE_string("fake_manager", "nova.tests.test_service.FakeManager",
2184@@ -251,3 +252,43 @@
2185 serv.report_state()
2186
2187 self.assert_(not serv.model_disconnected)
2188+
2189+ def test_compute_can_update_available_resource(self):
2190+ """Confirm compute updates their record of compute-service table."""
2191+ host = 'foo'
2192+ binary = 'nova-compute'
2193+ topic = 'compute'
2194+
2195+ # Mocks do not work without UnsetStubs() here.
2196+ self.mox.UnsetStubs()
2197+ ctxt = context.get_admin_context()
2198+ service_ref = db.service_create(ctxt, {'host': host,
2199+ 'binary': binary,
2200+ 'topic': topic})
2201+ serv = service.Service(host,
2202+ binary,
2203+ topic,
2204+ 'nova.compute.manager.ComputeManager')
2205+
2206+ # This test case verifies that update_available_resource() is called.
2207+ # Periodic tasks are not needed, so the intervals below are set to 0.
2208+ serv.report_interval = 0
2209+ serv.periodic_interval = 0
2210+
2211+ # Creating mocks
2212+ self.mox.StubOutWithMock(service.rpc.Connection, 'instance')
2213+ service.rpc.Connection.instance(new=mox.IgnoreArg())
2214+ service.rpc.Connection.instance(new=mox.IgnoreArg())
2215+ self.mox.StubOutWithMock(serv.manager.driver,
2216+ 'update_available_resource')
2217+ serv.manager.driver.update_available_resource(mox.IgnoreArg(), host)
2218+
2219+ # Just doing start()/stop() without confirming that a new db record is
2220+ # created, because update_available_resource() only works in a
2221+ # libvirt environment. This test case confirms that
2222+ # update_available_resource() is called; otherwise mox complains.
2223+ self.mox.ReplayAll()
2224+ serv.start()
2225+ serv.stop()
2226+
2227+ db.service_destroy(ctxt, service_ref['id'])
2228
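The test above only proves the hook is invoked on service start-up. A minimal manual invocation of the same hook might look roughly like this (the admin context and use of FLAGS.host are assumptions; on a host with no matching 'nova-compute' service record the driver raises exception.Invalid):

    from nova import context
    from nova import flags
    from nova import utils

    FLAGS = flags.FLAGS

    # Build the compute manager the same way the tests do and poke the
    # virt driver hook directly.
    compute = utils.import_object(FLAGS.compute_manager)
    compute.driver.update_available_resource(context.get_admin_context(),
                                             FLAGS.host)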
2229=== modified file 'nova/tests/test_virt.py'
2230--- nova/tests/test_virt.py 2011-03-09 23:45:00 +0000
2231+++ nova/tests/test_virt.py 2011-03-10 06:27:59 +0000
2232@@ -14,21 +14,28 @@
2233 # License for the specific language governing permissions and limitations
2234 # under the License.
2235
2236+import eventlet
2237+import mox
2238 import os
2239+import sys
2240
2241-import eventlet
2242 from xml.etree.ElementTree import fromstring as xml_to_tree
2243 from xml.dom.minidom import parseString as xml_to_dom
2244
2245 from nova import context
2246 from nova import db
2247+from nova import exception
2248 from nova import flags
2249 from nova import test
2250 from nova import utils
2251 from nova.api.ec2 import cloud
2252 from nova.auth import manager
2253+from nova.compute import manager as compute_manager
2254+from nova.compute import power_state
2255+from nova.db.sqlalchemy import models
2256 from nova.virt import libvirt_conn
2257
2258+libvirt = None
2259 FLAGS = flags.FLAGS
2260 flags.DECLARE('instances_path', 'nova.compute.manager')
2261
2262@@ -103,11 +110,28 @@
2263 libvirt_conn._late_load_cheetah()
2264 self.flags(fake_call=True)
2265 self.manager = manager.AuthManager()
2266+
2267+ try:
2268+ pjs = self.manager.get_projects()
2269+ pjs = [p for p in pjs if p.name == 'fake']
2270+ if 0 != len(pjs):
2271+ self.manager.delete_project(pjs[0])
2272+
2273+ users = self.manager.get_users()
2274+ users = [u for u in users if u.name == 'fake']
2275+ if 0 != len(users):
2276+ self.manager.delete_user(users[0])
2277+ except Exception, e:
2278+ pass
2279+
2280+ users = self.manager.get_users()
2281 self.user = self.manager.create_user('fake', 'fake', 'fake',
2282 admin=True)
2283 self.project = self.manager.create_project('fake', 'fake', 'fake')
2284 self.network = utils.import_object(FLAGS.network_manager)
2285+ self.context = context.get_admin_context()
2286 FLAGS.instances_path = ''
2287+ self.call_libvirt_dependant_setup = False
2288
2289 test_ip = '10.11.12.13'
2290 test_instance = {'memory_kb': '1024000',
2291@@ -119,6 +143,58 @@
2292 'bridge': 'br101',
2293 'instance_type': 'm1.small'}
2294
2295+ def lazy_load_library_exists(self):
2296+ """check if libvirt is available."""
2297+ # try to connect libvirt. if fail, skip test.
2298+ try:
2299+ import libvirt
2300+ import libxml2
2301+ except ImportError:
2302+ return False
2303+ global libvirt
2304+ libvirt = __import__('libvirt')
2305+ libvirt_conn.libvirt = __import__('libvirt')
2306+ libvirt_conn.libxml2 = __import__('libxml2')
2307+ return True
2308+
2309+ def create_fake_libvirt_mock(self, **kwargs):
2310+ """Defining mocks for LibvirtConnection(libvirt is not used)."""
2311+
2312+ # A fake libvirt.virConnect
2313+ class FakeLibvirtConnection(object):
2314+ pass
2315+
2316+ # A fake libvirt_conn.IptablesFirewallDriver
2317+ class FakeIptablesFirewallDriver(object):
2318+
2319+ def __init__(self, **kwargs):
2320+ pass
2321+
2322+ def setattr(self, key, val):
2323+ self.__setattr__(key, val)
2324+
2325+ # Creating mocks
2326+ fake = FakeLibvirtConnection()
2327+ fakeip = FakeIptablesFirewallDriver
2328+ # Customizing above fake if necessary
2329+ for key, val in kwargs.items():
2330+ fake.__setattr__(key, val)
2331+
2332+ # Inevitable mocks for libvirt_conn.LibvirtConnection
2333+ self.mox.StubOutWithMock(libvirt_conn.utils, 'import_class')
2334+ libvirt_conn.utils.import_class(mox.IgnoreArg()).AndReturn(fakeip)
2335+ self.mox.StubOutWithMock(libvirt_conn.LibvirtConnection, '_conn')
2336+ libvirt_conn.LibvirtConnection._conn = fake
2337+
2338+ def create_service(self, **kwargs):
2339+ service_ref = {'host': kwargs.get('host', 'dummy'),
2340+ 'binary': 'nova-compute',
2341+ 'topic': 'compute',
2342+ 'report_count': 0,
2343+ 'availability_zone': 'zone'}
2344+
2345+ return db.service_create(context.get_admin_context(), service_ref)
2346+
2347 def test_xml_and_uri_no_ramdisk_no_kernel(self):
2348 instance_data = dict(self.test_instance)
2349 self._check_xml_and_uri(instance_data,
2350@@ -258,8 +334,8 @@
2351 expected_result,
2352 '%s failed common check %d' % (xml, i))
2353
2354- # This test is supposed to make sure we don't override a specifically
2355- # set uri
2356+ # This test is supposed to make sure we don't
2357+ # override a specifically set uri
2358 #
2359 # Deliberately not just assigning this string to FLAGS.libvirt_uri and
2360 # checking against that later on. This way we make sure the
2361@@ -273,6 +349,150 @@
2362 self.assertEquals(uri, testuri)
2363 db.instance_destroy(user_context, instance_ref['id'])
2364
2365+ def test_update_available_resource_works_correctly(self):
2366+ """Confirm compute_node table is updated successfully."""
2367+ org_path = FLAGS.instances_path
2368+ FLAGS.instances_path = '.'
2369+
2370+ # Prepare mocks
2371+ def getVersion():
2372+ return 12003
2373+
2374+ def getType():
2375+ return 'qemu'
2376+
2377+ def listDomainsID():
2378+ return []
2379+
2380+ service_ref = self.create_service(host='dummy')
2381+ self.create_fake_libvirt_mock(getVersion=getVersion,
2382+ getType=getType,
2383+ listDomainsID=listDomainsID)
2384+ self.mox.StubOutWithMock(libvirt_conn.LibvirtConnection,
2385+ 'get_cpu_info')
2386+ libvirt_conn.LibvirtConnection.get_cpu_info().AndReturn('cpuinfo')
2387+
2388+ # Start test
2389+ self.mox.ReplayAll()
2390+ conn = libvirt_conn.LibvirtConnection(False)
2391+ conn.update_available_resource(self.context, 'dummy')
2392+ service_ref = db.service_get(self.context, service_ref['id'])
2393+ compute_node = service_ref['compute_node'][0]
2394+
2395+ if sys.platform.upper() == 'LINUX2':
2396+ self.assertTrue(compute_node['vcpus'] >= 0)
2397+ self.assertTrue(compute_node['memory_mb'] > 0)
2398+ self.assertTrue(compute_node['local_gb'] > 0)
2399+ self.assertTrue(compute_node['vcpus_used'] == 0)
2400+ self.assertTrue(compute_node['memory_mb_used'] > 0)
2401+ self.assertTrue(compute_node['local_gb_used'] > 0)
2402+ self.assertTrue(len(compute_node['hypervisor_type']) > 0)
2403+ self.assertTrue(compute_node['hypervisor_version'] > 0)
2404+ else:
2405+ self.assertTrue(compute_node['vcpus'] >= 0)
2406+ self.assertTrue(compute_node['memory_mb'] == 0)
2407+ self.assertTrue(compute_node['local_gb'] > 0)
2408+ self.assertTrue(compute_node['vcpus_used'] == 0)
2409+ self.assertTrue(compute_node['memory_mb_used'] == 0)
2410+ self.assertTrue(compute_node['local_gb_used'] > 0)
2411+ self.assertTrue(len(compute_node['hypervisor_type']) > 0)
2412+ self.assertTrue(compute_node['hypervisor_version'] > 0)
2413+
2414+ db.service_destroy(self.context, service_ref['id'])
2415+ FLAGS.instances_path = org_path
2416+
2417+ def test_update_resource_info_no_compute_record_found(self):
2418+ """Raise exception if no recorde found on services table."""
2419+ org_path = FLAGS.instances_path = ''
2420+ FLAGS.instances_path = '.'
2421+ self.create_fake_libvirt_mock()
2422+
2423+ self.mox.ReplayAll()
2424+ conn = libvirt_conn.LibvirtConnection(False)
2425+ self.assertRaises(exception.Invalid,
2426+ conn.update_available_resource,
2427+ self.context, 'dummy')
2428+
2429+ FLAGS.instances_path = org_path
2430+
2431+ def test_ensure_filtering_rules_for_instance_timeout(self):
2432+ """ensure_filtering_fules_for_instance() finishes with timeout."""
2433+ # Skip if non-libvirt environment
2434+ if not self.lazy_load_library_exists():
2435+ return
2436+
2437+ # Preparing mocks
2438+ def fake_none(self):
2439+ return
2440+
2441+ def fake_raise(self):
2442+ raise libvirt.libvirtError('ERR')
2443+
2444+ self.create_fake_libvirt_mock(nwfilterLookupByName=fake_raise)
2445+ instance_ref = db.instance_create(self.context, self.test_instance)
2446+
2447+ # Start test
2448+ self.mox.ReplayAll()
2449+ try:
2450+ conn = libvirt_conn.LibvirtConnection(False)
2451+ conn.firewall_driver.setattr('setup_basic_filtering', fake_none)
2452+ conn.firewall_driver.setattr('prepare_instance_filter', fake_none)
2453+ conn.ensure_filtering_rules_for_instance(instance_ref)
2454+ except exception.Error, e:
2455+ c1 = (0 <= e.message.find('Timeout migrating for'))
2456+ self.assertTrue(c1)
2457+
2458+ db.instance_destroy(self.context, instance_ref['id'])
2459+
2460+ def test_live_migration_raises_exception(self):
2461+ """Confirms recover method is called when exceptions are raised."""
2462+ # Skip if non-libvirt environment
2463+ if not self.lazy_load_library_exists():
2464+ return
2465+
2466+ # Preparing data
2467+ self.compute = utils.import_object(FLAGS.compute_manager)
2468+ instance_dict = {'host': 'fake', 'state': power_state.RUNNING,
2469+ 'state_description': 'running'}
2470+ instance_ref = db.instance_create(self.context, self.test_instance)
2471+ instance_ref = db.instance_update(self.context, instance_ref['id'],
2472+ instance_dict)
2473+ vol_dict = {'status': 'migrating', 'size': 1}
2474+ volume_ref = db.volume_create(self.context, vol_dict)
2475+ db.volume_attached(self.context, volume_ref['id'], instance_ref['id'],
2476+ '/dev/fake')
2477+
2478+ # Preparing mocks
2479+ vdmock = self.mox.CreateMock(libvirt.virDomain)
2480+ self.mox.StubOutWithMock(vdmock, "migrateToURI")
2481+ vdmock.migrateToURI(FLAGS.live_migration_uri % 'dest',
2482+ mox.IgnoreArg(),
2483+ None, FLAGS.live_migration_bandwidth).\
2484+ AndRaise(libvirt.libvirtError('ERR'))
2485+
2486+ def fake_lookup(instance_name):
2487+ if instance_name == instance_ref.name:
2488+ return vdmock
2489+
2490+ self.create_fake_libvirt_mock(lookupByName=fake_lookup)
2491+
2492+ # Start test
2493+ self.mox.ReplayAll()
2494+ conn = libvirt_conn.LibvirtConnection(False)
2495+ self.assertRaises(libvirt.libvirtError,
2496+ conn._live_migration,
2497+ self.context, instance_ref, 'dest', '',
2498+ self.compute.recover_live_migration)
2499+
2500+ instance_ref = db.instance_get(self.context, instance_ref['id'])
2501+ self.assertTrue(instance_ref['state_description'] == 'running')
2502+ self.assertTrue(instance_ref['state'] == power_state.RUNNING)
2503+ volume_ref = db.volume_get(self.context, volume_ref['id'])
2504+ self.assertTrue(volume_ref['status'] == 'in-use')
2505+
2506+ db.volume_destroy(self.context, volume_ref['id'])
2507+ db.instance_destroy(self.context, instance_ref['id'])
2508+
2509 def tearDown(self):
2510 self.manager.delete_project(self.project)
2511 self.manager.delete_user(self.user)
2512
2513=== modified file 'nova/tests/test_volume.py'
2514--- nova/tests/test_volume.py 2011-03-07 01:25:01 +0000
2515+++ nova/tests/test_volume.py 2011-03-10 06:27:59 +0000
2516@@ -20,6 +20,8 @@
2517
2518 """
2519
2520+import cStringIO
2521+
2522 from nova import context
2523 from nova import exception
2524 from nova import db
2525@@ -173,3 +175,196 @@
2526 # each of them having a different FLAG for storage_node
2527 # This will allow us to test cross-node interactions
2528 pass
2529+
2530+
2531+class DriverTestCase(test.TestCase):
2532+ """Base Test class for Drivers."""
2533+ driver_name = "nova.volume.driver.FakeAOEDriver"
2534+
2535+ def setUp(self):
2536+ super(DriverTestCase, self).setUp()
2537+ self.flags(volume_driver=self.driver_name,
2538+ logging_default_format_string="%(message)s")
2539+ self.volume = utils.import_object(FLAGS.volume_manager)
2540+ self.context = context.get_admin_context()
2541+ self.output = ""
2542+
2543+ def _fake_execute(_command, *_args, **_kwargs):
2544+ """Fake _execute."""
2545+ return self.output, None
2546+ self.volume.driver._execute = _fake_execute
2547+ self.volume.driver._sync_execute = _fake_execute
2548+
2549+ log = logging.getLogger()
2550+ self.stream = cStringIO.StringIO()
2551+ log.addHandler(logging.StreamHandler(self.stream))
2552+
2553+ inst = {}
2554+ self.instance_id = db.instance_create(self.context, inst)['id']
2555+
2556+ def tearDown(self):
2557+ super(DriverTestCase, self).tearDown()
2558+
2559+ def _attach_volume(self):
2560+ """Attach volumes to an instance. This function also sets
2561+ a fake log message."""
2562+ return []
2563+
2564+ def _detach_volume(self, volume_id_list):
2565+ """Detach volumes from an instance."""
2566+ for volume_id in volume_id_list:
2567+ db.volume_detached(self.context, volume_id)
2568+ self.volume.delete_volume(self.context, volume_id)
2569+
2570+
2571+class AOETestCase(DriverTestCase):
2572+ """Test Case for AOEDriver"""
2573+ driver_name = "nova.volume.driver.AOEDriver"
2574+
2575+ def setUp(self):
2576+ super(AOETestCase, self).setUp()
2577+
2578+ def tearDown(self):
2579+ super(AOETestCase, self).tearDown()
2580+
2581+ def _attach_volume(self):
2582+ """Attach volumes to an instance. This function also sets
2583+ a fake log message."""
2584+ volume_id_list = []
2585+ for index in xrange(3):
2586+ vol = {}
2587+ vol['size'] = 0
2588+ volume_id = db.volume_create(self.context,
2589+ vol)['id']
2590+ self.volume.create_volume(self.context, volume_id)
2591+
2592+ # each volume has a different mountpoint
2593+ mountpoint = "/dev/sd" + chr((ord('b') + index))
2594+ db.volume_attached(self.context, volume_id, self.instance_id,
2595+ mountpoint)
2596+
2597+ (shelf_id, blade_id) = db.volume_get_shelf_and_blade(self.context,
2598+ volume_id)
2599+ self.output += "%s %s eth0 /dev/nova-volumes/vol-foo auto run\n" \
2600+ % (shelf_id, blade_id)
2601+
2602+ volume_id_list.append(volume_id)
2603+
2604+ return volume_id_list
2605+
2606+ def test_check_for_export_with_no_volume(self):
2607+ """No log message when no volume is attached to an instance."""
2608+ self.stream.truncate(0)
2609+ self.volume.check_for_export(self.context, self.instance_id)
2610+ self.assertEqual(self.stream.getvalue(), '')
2611+
2612+ def test_check_for_export_with_all_vblade_processes(self):
2613+ """No log message when all the vblade processes are running."""
2614+ volume_id_list = self._attach_volume()
2615+
2616+ self.stream.truncate(0)
2617+ self.volume.check_for_export(self.context, self.instance_id)
2618+ self.assertEqual(self.stream.getvalue(), '')
2619+
2620+ self._detach_volume(volume_id_list)
2621+
2622+ def test_check_for_export_with_vblade_process_missing(self):
2623+ """Output a warning message when some vblade processes aren't
2624+ running."""
2625+ volume_id_list = self._attach_volume()
2626+
2627+ # the first vblade process isn't running
2628+ self.output = self.output.replace("run", "down", 1)
2629+ (shelf_id, blade_id) = db.volume_get_shelf_and_blade(self.context,
2630+ volume_id_list[0])
2631+
2632+ msg_is_match = False
2633+ self.stream.truncate(0)
2634+ try:
2635+ self.volume.check_for_export(self.context, self.instance_id)
2636+ except exception.ProcessExecutionError, e:
2637+ volume_id = volume_id_list[0]
2638+ msg = _("Cannot confirm exported volume id:%(volume_id)s. "
2639+ "vblade process for e%(shelf_id)s.%(blade_id)s "
2640+ "isn't running.") % locals()
2641+
2642+ msg_is_match = (0 <= e.message.find(msg))
2643+
2644+ self.assertTrue(msg_is_match)
2645+ self._detach_volume(volume_id_list)
2646+
2647+
2648+class ISCSITestCase(DriverTestCase):
2649+ """Test Case for ISCSIDriver"""
2650+ driver_name = "nova.volume.driver.ISCSIDriver"
2651+
2652+ def setUp(self):
2653+ super(ISCSITestCase, self).setUp()
2654+
2655+ def tearDown(self):
2656+ super(ISCSITestCase, self).tearDown()
2657+
2658+ def _attach_volume(self):
2659+ """Attach volumes to an instance. This function also sets
2660+ a fake log message."""
2661+ volume_id_list = []
2662+ for index in xrange(3):
2663+ vol = {}
2664+ vol['size'] = 0
2665+ vol_ref = db.volume_create(self.context, vol)
2666+ self.volume.create_volume(self.context, vol_ref['id'])
2667+ vol_ref = db.volume_get(self.context, vol_ref['id'])
2668+
2669+ # each volume has a different mountpoint
2670+ mountpoint = "/dev/sd" + chr((ord('b') + index))
2671+ db.volume_attached(self.context, vol_ref['id'], self.instance_id,
2672+ mountpoint)
2673+ volume_id_list.append(vol_ref['id'])
2674+
2675+ return volume_id_list
2676+
2677+ def test_check_for_export_with_no_volume(self):
2678+ """No log message when no volume is attached to an instance."""
2679+ self.stream.truncate(0)
2680+ self.volume.check_for_export(self.context, self.instance_id)
2681+ self.assertEqual(self.stream.getvalue(), '')
2682+
2683+ def test_check_for_export_with_all_volume_exported(self):
2684+ """No log message when all the vblade processes are running."""
2685+ volume_id_list = self._attach_volume()
2686+
2687+ self.mox.StubOutWithMock(self.volume.driver, '_execute')
2688+ for i in volume_id_list:
2689+ tid = db.volume_get_iscsi_target_num(self.context, i)
2690+ self.volume.driver._execute("sudo ietadm --op show --tid=%(tid)d"
2691+ % locals())
2692+
2693+ self.stream.truncate(0)
2694+ self.mox.ReplayAll()
2695+ self.volume.check_for_export(self.context, self.instance_id)
2696+ self.assertEqual(self.stream.getvalue(), '')
2697+ self.mox.UnsetStubs()
2698+
2699+ self._detach_volume(volume_id_list)
2700+
2701+ def test_check_for_export_with_some_volume_missing(self):
2702+ """Output a warning message when some volumes are not recognied
2703+ by ietd."""
2704+ volume_id_list = self._attach_volume()
2705+
2706+ # the first volume is not exported by ietd
2707+ tid = db.volume_get_iscsi_target_num(self.context, volume_id_list[0])
2708+ self.mox.StubOutWithMock(self.volume.driver, '_execute')
2709+ self.volume.driver._execute("sudo ietadm --op show --tid=%(tid)d"
2710+ % locals()).AndRaise(exception.ProcessExecutionError())
2711+
2712+ self.mox.ReplayAll()
2713+ self.assertRaises(exception.ProcessExecutionError,
2714+ self.volume.check_for_export,
2715+ self.context,
2716+ self.instance_id)
2717+ msg = _("Cannot confirm exported volume id:%s.") % volume_id_list[0]
2718+ self.assertTrue(0 <= self.stream.getvalue().find(msg))
2719+ self.mox.UnsetStubs()
2720+
2721+ self._detach_volume(volume_id_list)
2722
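From the ISCSI tests above, check_for_export() asks ietd about each attached volume's target and re-raises ProcessExecutionError after logging which volume could not be confirmed. A sketch of that behaviour, reconstructed only from what the tests exercise (not the branch's exact driver code; the function name is invented):

    from nova import db
    from nova import exception
    from nova import log as logging

    LOG = logging.getLogger('nova.volume.manager')


    def check_for_export_sketch(driver, context, instance_id):
        """Illustrative only: confirm every attached volume is still exported."""
        instance_ref = db.instance_get(context, instance_id)
        for volume in instance_ref['volumes']:
            tid = db.volume_get_iscsi_target_num(context, volume['id'])
            try:
                # ietd only reports targets it is actually serving.
                driver._execute("sudo ietadm --op show --tid=%d" % tid)
            except exception.ProcessExecutionError:
                LOG.warn("Cannot confirm exported volume id:%s." % volume['id'])
                raise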
2723=== added file 'nova/virt/cpuinfo.xml.template'
2724--- nova/virt/cpuinfo.xml.template 1970-01-01 00:00:00 +0000
2725+++ nova/virt/cpuinfo.xml.template 2011-03-10 06:27:59 +0000
2726@@ -0,0 +1,9 @@
2727+<cpu>
2728+ <arch>$arch</arch>
2729+ <model>$model</model>
2730+ <vendor>$vendor</vendor>
2731+ <topology sockets="$topology.sockets" cores="$topology.cores" threads="$topology.threads"/>
2732+#for $var in $features
2733+ <features name="$var" />
2734+#end for
2735+</cpu>
2736
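The template above is rendered from the dict built by get_cpu_info() further down in this diff. A hypothetical rendering for illustration (the CPU values are invented; searchList is passed as a list here, which is the documented Cheetah form):

    from Cheetah.Template import Template

    cpu_info = {'arch': 'x86_64', 'model': 'Nehalem', 'vendor': 'Intel',
                'topology': {'sockets': '1', 'cores': '2', 'threads': '2'},
                'features': ['vmx', 'ssse3']}
    template = open('nova/virt/cpuinfo.xml.template').read()
    print Template(template, searchList=[cpu_info])

    # Expected output (illustrative):
    # <cpu>
    #   <arch>x86_64</arch>
    #   <model>Nehalem</model>
    #   <vendor>Intel</vendor>
    #   <topology sockets="1" cores="2" threads="2"/>
    #   <features name="vmx" />
    #   <features name="ssse3" />
    # </cpu>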
2737=== modified file 'nova/virt/fake.py'
2738--- nova/virt/fake.py 2011-02-28 17:39:23 +0000
2739+++ nova/virt/fake.py 2011-03-10 06:27:59 +0000
2740@@ -407,6 +407,27 @@
2741 """
2742 return True
2743
2744+ def update_available_resource(self, ctxt, host):
2745+ """This method is supported only by libvirt."""
2746+ return
2747+
2748+ def compare_cpu(self, xml):
2749+ """This method is supported only by libvirt."""
2750+ raise NotImplementedError('This method is supported only by libvirt.')
2751+
2752+ def ensure_filtering_rules_for_instance(self, instance_ref):
2753+ """This method is supported only by libvirt."""
2754+ raise NotImplementedError('This method is supported only by libvirt.')
2755+
2756+ def live_migration(self, context, instance_ref, dest,
2757+ post_method, recover_method):
2758+ """This method is supported only by libvirt."""
2759+ return
2760+
2761+ def unfilter_instance(self, instance_ref):
2762+ """This method is supported only by libvirt."""
2763+ raise NotImplementedError('This method is supported only by libvirt.')
2764+
2765
2766 class FakeInstance(object):
2767
2768
2769=== modified file 'nova/virt/libvirt_conn.py'
2770--- nova/virt/libvirt_conn.py 2011-03-10 04:42:11 +0000
2771+++ nova/virt/libvirt_conn.py 2011-03-10 06:27:59 +0000
2772@@ -36,10 +36,13 @@
2773
2774 """
2775
2776+import multiprocessing
2777 import os
2778 import shutil
2779+import sys
2780 import random
2781 import subprocess
2782+import time
2783 import uuid
2784 from xml.dom import minidom
2785
2786@@ -70,6 +73,7 @@
2787 LOG = logging.getLogger('nova.virt.libvirt_conn')
2788
2789 FLAGS = flags.FLAGS
2790+flags.DECLARE('live_migration_retry_count', 'nova.compute.manager')
2791 # TODO(vish): These flags should probably go into a shared location
2792 flags.DEFINE_string('rescue_image_id', 'ami-rescue', 'Rescue ami image')
2793 flags.DEFINE_string('rescue_kernel_id', 'aki-rescue', 'Rescue aki image')
2794@@ -100,6 +104,17 @@
2795 flags.DEFINE_string('firewall_driver',
2796 'nova.virt.libvirt_conn.IptablesFirewallDriver',
2797 'Firewall driver (defaults to iptables)')
2798+flags.DEFINE_string('cpuinfo_xml_template',
2799+ utils.abspath('virt/cpuinfo.xml.template'),
2800+ 'CpuInfo XML Template (used only by live migration for now)')
2801+flags.DEFINE_string('live_migration_uri',
2802+ "qemu+tcp://%s/system",
2803+ 'Define protocol used by live_migration feature')
2804+flags.DEFINE_string('live_migration_flag',
2805+ "VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER",
2806+ 'Define live migration behavior.')
2807+flags.DEFINE_integer('live_migration_bandwidth', 0,
2808+ 'Maximum bandwidth to be used during live migration (0 means no limit)')
2809
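The live_migration_flag string above is a comma-separated list of libvirt flag names; _live_migration() (near the end of this diff) resolves each name on the libvirt module and ORs them into a single value for migrateToURI(). A standalone illustration of that parsing, assuming the libvirt python bindings are importable:

    import libvirt

    flag_string = "VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER"
    flag_names = [name.strip() for name in flag_string.split(',')]
    # Look each symbolic name up on the libvirt module and OR them together.
    logical_sum = reduce(lambda x, y: x | y,
                         [getattr(libvirt, name) for name in flag_names])
    # With these two flags the value is typically 18
    # (VIR_MIGRATE_PEER2PEER=2 | VIR_MIGRATE_UNDEFINE_SOURCE=16).
    print logical_sum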
2810
2811 def get_connection(read_only):
2812@@ -146,6 +161,7 @@
2813 self.libvirt_uri = self.get_uri()
2814
2815 self.libvirt_xml = open(FLAGS.libvirt_xml_template).read()
2816+ self.cpuinfo_xml = open(FLAGS.cpuinfo_xml_template).read()
2817 self._wrapped_conn = None
2818 self.read_only = read_only
2819
2820@@ -851,6 +867,158 @@
2821
2822 return interfaces
2823
2824+ def get_vcpu_total(self):
2825+ """Get vcpu number of physical computer.
2826+
2827+ :returns: the number of cpu cores.
2828+
2829+ """
2830+
2831+ # On certain platforms, this will raise a NotImplementedError.
2832+ try:
2833+ return multiprocessing.cpu_count()
2834+ except NotImplementedError:
2835+ LOG.warn(_("Cannot get the number of cpu, because this "
2836+ "function is not implemented for this platform. "
2837+ "This error can be safely ignored for now."))
2838+ return 0
2839+
2840+ def get_memory_mb_total(self):
2841+ """Get the total memory size(MB) of physical computer.
2842+
2843+ :returns: the total amount of memory(MB).
2844+
2845+ """
2846+
2847+ if sys.platform.upper() != 'LINUX2':
2848+ return 0
2849+
2850+ meminfo = open('/proc/meminfo').read().split()
2851+ idx = meminfo.index('MemTotal:')
2852+ # transforming kb to mb.
2853+ return int(meminfo[idx + 1]) / 1024
2854+
2855+ def get_local_gb_total(self):
2856+ """Get the total hdd size(GB) of physical computer.
2857+
2858+ :returns:
2859+ The total amount of HDD(GB).
2860+ Note that this value shows a partition where
2861+ NOVA-INST-DIR/instances mounts.
2862+
2863+ """
2864+
2865+ hddinfo = os.statvfs(FLAGS.instances_path)
2866+ return hddinfo.f_frsize * hddinfo.f_blocks / 1024 / 1024 / 1024
2867+
2868+ def get_vcpu_used(self):
2869+ """ Get vcpu usage number of physical computer.
2870+
2871+ :returns: The total number of vcpu that currently used.
2872+
2873+ """
2874+
2875+ total = 0
2876+ for dom_id in self._conn.listDomainsID():
2877+ dom = self._conn.lookupByID(dom_id)
2878+ total += len(dom.vcpus()[1])
2879+ return total
2880+
2881+ def get_memory_mb_used(self):
2882+ """Get the free memory size(MB) of physical computer.
2883+
2884+ :returns: the total usage of memory(MB).
2885+
2886+ """
2887+
2888+ if sys.platform.upper() != 'LINUX2':
2889+ return 0
2890+
2891+ m = open('/proc/meminfo').read().split()
2892+ idx1 = m.index('MemFree:')
2893+ idx2 = m.index('Buffers:')
2894+ idx3 = m.index('Cached:')
2895+ avail = (int(m[idx1 + 1]) + int(m[idx2 + 1]) + int(m[idx3 + 1])) / 1024
2896+ return self.get_memory_mb_total() - avail
2897+
2898+ def get_local_gb_used(self):
2899+ """Get the free hdd size(GB) of physical computer.
2900+
2901+ :returns:
2902+ The total usage of HDD(GB).
2903+ Note that this value shows a partition where
2904+ NOVA-INST-DIR/instances mounts.
2905+
2906+ """
2907+
2908+ hddinfo = os.statvfs(FLAGS.instances_path)
2909+ avail = hddinfo.f_frsize * hddinfo.f_bavail / 1024 / 1024 / 1024
2910+ return self.get_local_gb_total() - avail
2911+
2912+ def get_hypervisor_type(self):
2913+ """Get hypervisor type.
2914+
2915+ :returns: hypervisor type (ex. qemu)
2916+
2917+ """
2918+
2919+ return self._conn.getType()
2920+
2921+ def get_hypervisor_version(self):
2922+ """Get hypervisor version.
2923+
2924+ :returns: hypervisor version (ex. 12003)
2925+
2926+ """
2927+
2928+ return self._conn.getVersion()
2929+
2930+ def get_cpu_info(self):
2931+ """Get cpuinfo information.
2932+
2933+ Obtains cpu feature from virConnect.getCapabilities,
2934+ and returns as a json string.
2935+
2936+ :return: see above description
2937+
2938+ """
2939+
2940+ xml = self._conn.getCapabilities()
2941+ xml = libxml2.parseDoc(xml)
2942+ nodes = xml.xpathEval('//cpu')
2943+ if len(nodes) != 1:
2944+ raise exception.Invalid(_("Invalid xml. '<cpu>' must be 1,"
2945+ "but %d\n") % len(nodes)
2946+ + xml.serialize())
2947+
2948+ cpu_info = dict()
2949+ cpu_info['arch'] = xml.xpathEval('//cpu/arch')[0].getContent()
2950+ cpu_info['model'] = xml.xpathEval('//cpu/model')[0].getContent()
2951+ cpu_info['vendor'] = xml.xpathEval('//cpu/vendor')[0].getContent()
2952+
2953+ topology_node = xml.xpathEval('//cpu/topology')[0].get_properties()
2954+ topology = dict()
2955+ while topology_node is not None:
2956+ name = topology_node.get_name()
2957+ topology[name] = topology_node.getContent()
2958+ topology_node = topology_node.get_next()
2959+
2960+ keys = ['cores', 'sockets', 'threads']
2961+ tkeys = topology.keys()
2962+ if list(set(tkeys)) != list(set(keys)):
2963+ ks = ', '.join(keys)
2964+ raise exception.Invalid(_("Invalid xml: topology(%(topology)s) "
2965+ "must have %(ks)s") % locals())
2966+
2967+ feature_nodes = xml.xpathEval('//cpu/feature')
2968+ features = list()
2969+ for nodes in feature_nodes:
2970+ features.append(nodes.get_properties().getContent())
2971+
2972+ cpu_info['topology'] = topology
2973+ cpu_info['features'] = features
2974+ return utils.dumps(cpu_info)
2975+
2976 def block_stats(self, instance_name, disk):
2977 """
2978 Note that this function takes an instance name, not an Instance, so
2979@@ -881,6 +1049,207 @@
2980 def refresh_security_group_members(self, security_group_id):
2981 self.firewall_driver.refresh_security_group_members(security_group_id)
2982
2983+ def update_available_resource(self, ctxt, host):
2984+ """Updates compute manager resource info on ComputeNode table.
2985+
2986+ This method is called when nova-compute launches, and
2987+ whenever the admin executes "nova-manage service update_resource".
2988+
2989+ :param ctxt: security context
2990+ :param host: hostname that compute manager is currently running
2991+
2992+ """
2993+
2994+ try:
2995+ service_ref = db.service_get_all_compute_by_host(ctxt, host)[0]
2996+ except exception.NotFound:
2997+ raise exception.Invalid(_("Cannot update compute manager "
2998+ "specific info, because no service "
2999+ "record was found."))
3000+
3001+ # Updating host information
3002+ dic = {'vcpus': self.get_vcpu_total(),
3003+ 'memory_mb': self.get_memory_mb_total(),
3004+ 'local_gb': self.get_local_gb_total(),
3005+ 'vcpus_used': self.get_vcpu_used(),
3006+ 'memory_mb_used': self.get_memory_mb_used(),
3007+ 'local_gb_used': self.get_local_gb_used(),
3008+ 'hypervisor_type': self.get_hypervisor_type(),
3009+ 'hypervisor_version': self.get_hypervisor_version(),
3010+ 'cpu_info': self.get_cpu_info()}
3011+
3012+ compute_node_ref = service_ref['compute_node']
3013+ if not compute_node_ref:
3014+ LOG.info(_('Compute_service record created for %s ') % host)
3015+ dic['service_id'] = service_ref['id']
3016+ db.compute_node_create(ctxt, dic)
3017+ else:
3018+ LOG.info(_('Compute_service record updated for %s ') % host)
3019+ db.compute_node_update(ctxt, compute_node_ref[0]['id'], dic)
3020+
3021+ def compare_cpu(self, cpu_info):
3022+ """Checks the host cpu is compatible to a cpu given by xml.
3023+
3024+ "xml" must be a part of libvirt.openReadonly().getCapabilities().
3025+ return values follows by virCPUCompareResult.
3026+ if 0 > return value, do live migration.
3027+ 'http://libvirt.org/html/libvirt-libvirt.html#virCPUCompareResult'
3028+
3029+ :param cpu_info: json string that shows cpu feature(see get_cpu_info())
3030+ :returns:
3031+ None. if given cpu info is not compatible to this server,
3032+ raise exception.
3033+
3034+ """
3035+
3036+ LOG.info(_('Instance launched has CPU info:\n%s') % cpu_info)
3037+ dic = utils.loads(cpu_info)
3038+ xml = str(Template(self.cpuinfo_xml, searchList=dic))
3039+ LOG.info(_('to xml...\n:%s ') % xml)
3040+
3041+ u = "http://libvirt.org/html/libvirt-libvirt.html#virCPUCompareResult"
3042+ m = _("CPU doesn't have compatibility.\n\n%(ret)s\n\nRefer to %(u)s")
3043+ # unknown character exists in xml, then libvirt complains
3044+ try:
3045+ ret = self._conn.compareCPU(xml, 0)
3046+ except libvirt.libvirtError, e:
3047+ ret = e.message
3048+ LOG.error(m % locals())
3049+ raise
3050+
3051+ if ret <= 0:
3052+ raise exception.Invalid(m % locals())
3053+
3054+ return
3055+
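As the scheduler test earlier in this diff shows, the common check ships the source host's recorded cpu_info to the destination over rpc, where it ends up in compare_cpu(). A minimal local usage sketch (comparing a connection against its own cpu_info is only a smoke test; the connection is built via get_connection() as defined earlier in this file):

    from nova import exception
    from nova.virt import libvirt_conn

    conn = libvirt_conn.get_connection(read_only=False)
    cpu_info = conn.get_cpu_info()
    try:
        # Raises (or re-raises libvirtError) when the host cannot run guests
        # with the given cpu feature set; returns None when it can.
        conn.compare_cpu(cpu_info)
    except exception.Invalid:
        print "destination CPU is not compatible, refusing to migrate"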
3056+ def ensure_filtering_rules_for_instance(self, instance_ref):
3057+ """Setting up filtering rules and waiting for its completion.
3058+
3059+ To migrate an instance, filtering rules to hypervisors
3060+ and firewalls are inevitable on destination host.
3061+ (We wait only for the hypervisor filtering rules, since the
3062+ firewall rules can be set faster).
3063+
3064+ Concretely, the methods below must be called:
3065+ - setup_basic_filtering (for nova-basic, etc.)
3066+ - prepare_instance_filter(for nova-instance-instance-xxx, etc.)
3067+
3068+ to_xml may have to be called since it defines PROJNET, PROJMASK.
3069+ but libvirt migrates those value through migrateToURI(),
3070+ so , no need to be called.
3071+
3072+ Don't use thread for this method since migration should
3073+ not be started when setting-up filtering rules operations
3074+ are not completed.
3075+
3076+ :params instance_ref: nova.db.sqlalchemy.models.Instance object
3077+
3078+ """
3079+
3080+ # If no instance has ever launched on the destination host,
3081+ # basic filtering may not be set up yet, so set it here.
3082+ self.firewall_driver.setup_basic_filtering(instance_ref)
3083+ # Setting up nova-instance-instance-xx filters mainly.
3084+ self.firewall_driver.prepare_instance_filter(instance_ref)
3085+
3086+ # wait for completion
3087+ timeout_count = range(FLAGS.live_migration_retry_count)
3088+ while timeout_count:
3089+ try:
3090+ filter_name = 'nova-instance-%s' % instance_ref.name
3091+ self._conn.nwfilterLookupByName(filter_name)
3092+ break
3093+ except libvirt.libvirtError:
3094+ timeout_count.pop()
3095+ if len(timeout_count) == 0:
3096+ ec2_id = instance_ref['hostname']
3097+ iname = instance_ref.name
3098+ msg = _('Timeout migrating for %(ec2_id)s(%(iname)s)')
3099+ raise exception.Error(msg % locals())
3100+ time.sleep(1)
3101+
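The wait loop above simply polls libvirt until the per-instance nwfilter appears. The standalone equivalent below may help when reviewing; the filter name is made up and the connection URI (None, i.e. the local default) is an assumption:

    import libvirt

    conn = libvirt.openReadOnly(None)
    try:
        conn.nwfilterLookupByName('nova-instance-instance-00000001')
        print 'filter is defined'
    except libvirt.libvirtError:
        print 'filter not defined yet, keep waiting'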
3102+ def live_migration(self, ctxt, instance_ref, dest,
3103+ post_method, recover_method):
3104+ """Spawning live_migration operation for distributing high-load.
3105+
3106+ :params ctxt: security context
3107+ :params instance_ref:
3108+ nova.db.sqlalchemy.models.Instance object
3109+ instance object that is migrated.
3110+ :params dest: destination host
3111+ :params post_method:
3112+ post operation method.
3113+ expected nova.compute.manager.post_live_migration.
3114+ :params recover_method:
3115+ recovery method when any exception occurs.
3116+ expected nova.compute.manager.recover_live_migration.
3117+
3118+ """
3119+
3120+ greenthread.spawn(self._live_migration, ctxt, instance_ref, dest,
3121+ post_method, recover_method)
3122+
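As the docstring notes, the post and recover callbacks are expected to come from the compute manager. A hedged sketch of the expected call, with the manager-side method names taken from the docstring rather than from this hunk:

    # Sketch only: assumed caller in nova/compute/manager.py.
    self.driver.live_migration(context, instance_ref, dest,
                               self.post_live_migration,
                               self.recover_live_migration)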
3123+ def _live_migration(self, ctxt, instance_ref, dest,
3124+ post_method, recover_method):
3125+ """Do live migration.
3126+
3127+ :params ctxt: security context
3128+ :params instance_ref:
3129+ nova.db.sqlalchemy.models.Instance object
3130+ instance object that is migrated.
3131+ :params dest: destination host
3132+ :params post_method:
3133+ post operation method.
3134+ expected nova.compute.manager.post_live_migration.
3135+ :params recover_method:
3136+ recovery method when any exception occurs.
3137+ expected nova.compute.manager.recover_live_migration.
3138+
3139+ """
3140+
3141+ # Do live migration.
3142+ try:
3143+ flaglist = FLAGS.live_migration_flag.split(',')
3144+ flagvals = [getattr(libvirt, x.strip()) for x in flaglist]
3145+ logical_sum = reduce(lambda x, y: x | y, flagvals)
3146+
3147+ if self.read_only:
3148+ tmpconn = self._connect(self.libvirt_uri, False)
3149+ dom = tmpconn.lookupByName(instance_ref.name)
3150+ dom.migrateToURI(FLAGS.live_migration_uri % dest,
3151+ logical_sum,
3152+ None,
3153+ FLAGS.live_migration_bandwidth)
3154+ tmpconn.close()
3155+ else:
3156+ dom = self._conn.lookupByName(instance_ref.name)
3157+ dom.migrateToURI(FLAGS.live_migration_uri % dest,
3158+ logical_sum,
3159+ None,
3160+ FLAGS.live_migration_bandwidth)
3161+
3162+ except Exception:
3163+ recover_method(ctxt, instance_ref)
3164+ raise
3165+
3166+ # Waiting for completion of live_migration.
3167+ timer = utils.LoopingCall(f=None)
3168+
3169+ def wait_for_live_migration():
3170+ """waiting for live migration completion"""
3171+ try:
3172+ self.get_info(instance_ref.name)['state']
3173+ except exception.NotFound:
3174+ timer.stop()
3175+ post_method(ctxt, instance_ref, dest)
3176+
3177+ timer.f = wait_for_live_migration
3178+ timer.start(interval=0.5, now=True)
3179+
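To illustrate how live_migration_flag is consumed above: the comma-separated flag names are resolved against the libvirt module and OR-ed into a single bitmask. The flag value shown is an assumption; the actual default is defined elsewhere in the branch:

    import libvirt

    # Assumed flag value, illustrating the parsing done in _live_migration().
    flaglist = 'VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER'.split(',')
    flagvals = [getattr(libvirt, x.strip()) for x in flaglist]
    logical_sum = reduce(lambda x, y: x | y, flagvals)
    # logical_sum is then handed to dom.migrateToURI() along with
    # FLAGS.live_migration_uri % dest and FLAGS.live_migration_bandwidth.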
3180+ def unfilter_instance(self, instance_ref):
3181+ """See comments of same method in firewall_driver."""
3182+ self.firewall_driver.unfilter_instance(instance_ref)
3183+
3184
3185 class FirewallDriver(object):
3186 def prepare_instance_filter(self, instance):
3187
3188=== modified file 'nova/virt/xenapi_conn.py'
3189--- nova/virt/xenapi_conn.py 2011-03-07 23:51:20 +0000
3190+++ nova/virt/xenapi_conn.py 2011-03-10 06:27:59 +0000
3191@@ -263,6 +263,27 @@
3192 'username': FLAGS.xenapi_connection_username,
3193 'password': FLAGS.xenapi_connection_password}
3194
3195+ def update_available_resource(self, ctxt, host):
3196+ """This method is supported only by libvirt."""
3197+ return
3198+
3199+ def compare_cpu(self, xml):
3200+ """This method is supported only by libvirt."""
3201+ raise NotImplementedError('This method is supported only by libvirt.')
3202+
3203+ def ensure_filtering_rules_for_instance(self, instance_ref):
3204+ """This method is supported only libvirt."""
3205+ return
3206+
3207+ def live_migration(self, context, instance_ref, dest,
3208+ post_method, recover_method):
3209+ """This method is supported only by libvirt."""
3210+ return
3211+
3212+ def unfilter_instance(self, instance_ref):
3213+ """This method is supported only by libvirt."""
3214+ raise NotImplementedError('This method is supported only by libvirt.')
3215+
3216
3217 class XenAPISession(object):
3218 """The session to invoke XenAPI SDK calls"""
3219
3220=== modified file 'nova/volume/driver.py'
3221--- nova/volume/driver.py 2011-03-09 20:33:20 +0000
3222+++ nova/volume/driver.py 2011-03-10 06:27:59 +0000
3223@@ -143,6 +143,10 @@
3224 """Undiscover volume on a remote host."""
3225 raise NotImplementedError()
3226
3227+ def check_for_export(self, context, volume_id):
3228+ """Make sure volume is exported."""
3229+ raise NotImplementedError()
3230+
3231
3232 class AOEDriver(VolumeDriver):
3233 """Implements AOE specific volume commands."""
3234@@ -198,15 +202,45 @@
3235 self._try_execute('sudo', 'vblade-persist', 'destroy',
3236 shelf_id, blade_id)
3237
3238- def discover_volume(self, _volume):
3239+ def discover_volume(self, context, _volume):
3240 """Discover volume on a remote host."""
3241- self._execute('sudo', 'aoe-discover')
3242- self._execute('sudo', 'aoe-stat', check_exit_code=False)
3243+ (shelf_id,
3244+ blade_id) = self.db.volume_get_shelf_and_blade(context,
3245+ _volume['id'])
3246+ self._execute("sudo aoe-discover")
3247+ out, err = self._execute("sudo aoe-stat", check_exit_code=False)
3248+ device_path = 'e%(shelf_id)d.%(blade_id)d' % locals()
3249+ if out.find(device_path) >= 0:
3250+ return "/dev/etherd/%s" % device_path
3251+ else:
3252+ return
3253
3254 def undiscover_volume(self, _volume):
3255 """Undiscover volume on a remote host."""
3256 pass
3257
3258+ def check_for_export(self, context, volume_id):
3259+ """Make sure volume is exported."""
3260+ (shelf_id,
3261+ blade_id) = self.db.volume_get_shelf_and_blade(context,
3262+ volume_id)
3263+ cmd = "sudo vblade-persist ls --no-header"
3264+ out, _err = self._execute(cmd)
3265+ exported = False
3266+ for line in out.split('\n'):
3267+ param = line.split(' ')
3268+ if len(param) == 6 and param[0] == str(shelf_id) \
3269+ and param[1] == str(blade_id) and param[-1] == "run":
3270+ exported = True
3271+ break
3272+ if not exported:
3273+ # Instance will be terminated in this case.
3274+ desc = _("Cannot confirm exported volume id:%(volume_id)s. "
3275+ "vblade process for e%(shelf_id)s.%(blade_id)s "
3276+ "isn't running.") % locals()
3277+ raise exception.ProcessExecutionError(out, _err, cmd=cmd,
3278+ description=desc)
3279+
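To make the parsing above easier to review, here is a made-up vblade-persist line in the shape the loop expects; only the fields the code actually checks (shelf, blade, and the trailing "run") are meaningful, and the real output format may differ on actual hosts:

    # Illustrative only: a fabricated "vblade-persist ls --no-header" line.
    line = '1 2 eth0 /dev/nova-volumes/volume-00000001 persist run'
    param = line.split(' ')
    exported = (len(param) == 6 and param[0] == '1'
                and param[1] == '2' and param[-1] == 'run')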
3280
3281 class FakeAOEDriver(AOEDriver):
3282 """Logs calls instead of executing."""
3283@@ -402,7 +436,7 @@
3284 (property_key, property_value))
3285 return self._run_iscsiadm(iscsi_properties, iscsi_command)
3286
3287- def discover_volume(self, volume):
3288+ def discover_volume(self, context, volume):
3289 """Discover volume on a remote host."""
3290 iscsi_properties = self._get_iscsi_properties(volume)
3291
3292@@ -461,6 +495,20 @@
3293 self._run_iscsiadm(iscsi_properties, "--logout")
3294 self._run_iscsiadm(iscsi_properties, "--op delete")
3295
3296+ def check_for_export(self, context, volume_id):
3297+ """Make sure volume is exported."""
3298+
3299+ tid = self.db.volume_get_iscsi_target_num(context, volume_id)
3300+ try:
3301+ self._execute("sudo ietadm --op show --tid=%(tid)d" % locals())
3302+ except exception.ProcessExecutionError, e:
3303+ # Instances remount read-only in this case.
3304+ # Restarting /etc/init.d/iscsitarget and rebooting nova-volume
3305+ # is preferable, since ensure_export() runs at boot time.
3306+ logging.error(_("Cannot confirm exported volume "
3307+ "id:%(volume_id)s.") % locals())
3308+ raise
3309+
3310
3311 class FakeISCSIDriver(ISCSIDriver):
3312 """Logs calls instead of executing."""
3313
3314=== modified file 'nova/volume/manager.py'
3315--- nova/volume/manager.py 2011-02-21 23:52:41 +0000
3316+++ nova/volume/manager.py 2011-03-10 06:27:59 +0000
3317@@ -160,7 +160,7 @@
3318 if volume_ref['host'] == self.host and FLAGS.use_local_volumes:
3319 path = self.driver.local_path(volume_ref)
3320 else:
3321- path = self.driver.discover_volume(volume_ref)
3322+ path = self.driver.discover_volume(context, volume_ref)
3323 return path
3324
3325 def remove_compute_volume(self, context, volume_id):
3326@@ -171,3 +171,9 @@
3327 return True
3328 else:
3329 self.driver.undiscover_volume(volume_ref)
3330+
3331+ def check_for_export(self, context, instance_id):
3332+ """Make sure whether volume is exported."""
3333+ instance_ref = self.db.instance_get(context, instance_id)
3334+ for volume in instance_ref['volumes']:
3335+ self.driver.check_for_export(context, volume['id'])
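Finally, a hypothetical sketch of how the scheduler side is expected to invoke this check before allowing a migration; the actual RPC plumbing lives in nova/scheduler/driver.py in this branch and may differ in detail (volume_host, context, and instance_ref here are placeholders):

    # Hypothetical sketch: topic and queue names follow the usual nova
    # conventions rather than anything shown in this hunk.
    from nova import db
    from nova import flags
    from nova import rpc

    FLAGS = flags.FLAGS
    rpc.call(context,
             db.queue_get_for(context, FLAGS.volume_topic, volume_host),
             {"method": "check_for_export",
              "args": {"instance_id": instance_ref['id']}})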