Merge lp:~nttdata/nova/live-migration into lp:~hudson-openstack/nova/trunk

Proposed by Kei Masumoto
Status: Merged
Merged at revision: 799
Proposed branch: lp:~nttdata/nova/live-migration
Merge into: lp:~hudson-openstack/nova/trunk
Diff against target: 3335 lines (+2788/-10)
21 files modified
bin/nova-manage (+88/-0)
contrib/nova.sh (+1/-0)
nova/compute/manager.py (+252/-1)
nova/db/api.py (+59/-0)
nova/db/sqlalchemy/api.py (+121/-0)
nova/db/sqlalchemy/migrate_repo/versions/010_add_live_migration.py (+83/-0)
nova/db/sqlalchemy/models.py (+38/-0)
nova/scheduler/driver.py (+237/-0)
nova/scheduler/manager.py (+52/-0)
nova/service.py (+3/-0)
nova/tests/test_compute.py (+294/-0)
nova/tests/test_scheduler.py (+622/-1)
nova/tests/test_service.py (+41/-0)
nova/tests/test_virt.py (+223/-3)
nova/tests/test_volume.py (+195/-0)
nova/virt/cpuinfo.xml.template (+9/-0)
nova/virt/fake.py (+21/-0)
nova/virt/libvirt_conn.py (+369/-0)
nova/virt/xenapi_conn.py (+21/-0)
nova/volume/driver.py (+52/-4)
nova/volume/manager.py (+7/-1)
To merge this branch: bzr merge lp:~nttdata/nova/live-migration
Reviewer Review Type Date Requested Status
Ken Pepple (community) Approve
Thierry Carrez (community) Approve
Jay Pipes (community) Approve
Rick Harris (community) Approve
Brian Schott (community) Approve
termie (community) Needs Fixing
Review via email: mp+49699@code.launchpad.net

Description of the change

Main changes from previous merge request:

1. Adding test code
2. Bug fixing
   - improper resource checking
     (memory checking is enough for current version)
   - retrying when live migration requests arrive in quick succession
     (in this case iptables complains, so we retry)
3. ISCSI EBS volume checking
   - adding nova.volume.driver.ISCSIDriver.check_for_export
   - changing nova.compute.post_live_migration to log out from the iSCSI server.

Please feel free to give us comments.
Thanks in advance.

Revision history for this message
Rick Harris (rconradharris) wrote :

Just a few nits :)

> + def describeresource(self, host):
> + def updateresource(self, host):

These should probably be `describe_resource` and `update_resource` respectively.

3083 +def mktmpfile(dir):
3084 + """create tmpfile under dir, and return filename."""
3085 + filename = datetime.datetime.utcnow().strftime('%Y%m%d%H%M%S')
3086 + fpath = os.path.join(dir, filename)
3087 + open(fpath, 'a+').write(fpath + '\n')
3088 + return fpath

It would probably be better to use the `tempfile` module in the Python stdlib.
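
For illustration only (not the branch's actual code), a minimal sketch using tempfile.mkstemp:

    import os
    import tempfile


    def mktmpfile(dir):
        """Create a uniquely named file under dir and return its path."""
        # mkstemp creates the file and returns an already-open descriptor,
        # so closing the fd is all that is needed here.
        fd, fpath = tempfile.mkstemp(dir=dir)
        os.close(fd)
        return fpath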

3091 +def exists(filename):
3092 + """check file path existence."""
3093 + return os.path.exists(filename)
3094 +
3095 +
3096 +def remove(filename):
3097 + """remove file."""
3098 + return os.remove(filename)

These wrapper functions seem unnecessary; it would probably be better to just use os.path.exists and os.remove directly in the code.

If you need a stub-point for testing, you can stub out `os.path` and `os` directly.
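
For example, a self-contained sketch of stubbing os.path.exists directly (the test name is made up; nova's test base class provides mox/stubout helpers for the same purpose):

    import os
    import unittest


    class SharedStorageCheckTestCase(unittest.TestCase):
        """Illustrative only: stub out os.path.exists instead of a wrapper."""

        def setUp(self):
            self._real_exists = os.path.exists
            os.path.exists = lambda path: True  # pretend every path exists

        def tearDown(self):
            os.path.exists = self._real_exists  # always restore the real one

        def test_tmpfile_found(self):
            self.assertTrue(os.path.exists('/tmp/does-not-matter'))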

+ LOG.info('post_live_migration() is started..')

Needs i18n _('post_live...') treatment.
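
That is, wrap the message in the gettext helper, e.g.:

    LOG.info(_('post_live_migration() is started..'))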

533 + #services.create_column(services_vcpus)
534 + #services.create_column(services_memory_mb)
535 + #services.create_column(services_local_gb)
536 + #services.create_column(services_vcpus_used)
537 + #services.create_column(services_memory_mb_used)
538 + #services.create_column(services_local_gb_used)
539 + #services.create_column(services_hypervisor_type)
540 + #services.create_column(services_hypervisor_version)
541 + #services.create_column(services_cpu_info)

Was this left in by mistake?

902 + print 'manager.attrerr', e

Probably should be logging here, rather than printing to stdout.
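
For example, something along these lines (illustrative only):

    LOG.error(_('manager.attrerr: %s'), e)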

review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Rick,

Thanks for review!
I think I fixed all of your comments.

Additional changes were made in nova.compute.api.create and nova.image.s3.
They are not related to live migration, but instances that have a kernel and ramdisk cannot launch without these changes. I never touched these files, and before raising the merge request I not only ran run_tests.sh but also confirmed that instances were successfully migrated on a real server, so I honestly have no idea when these changes got included...
Anyway, I think this change is necessary. Could you also please review it?

Kindly Regards,
Kei

Revision history for this message
Brian Schott (bfschott) wrote :

We're very interested in this capability, so we're looking forward to it. A few comments.

1. Current branch conflicts with lp:nova trunk.

+N nova/db/sqlalchemy/migrate_repo/versions/003_add_label_to_networks.py
+N nova/db/sqlalchemy/migrate_repo/versions/004_add_zone_tables.py

fix: bschott@island100:~/source/nova/live-migration/nova/db/sqlalchemy/migrate_repo/versions$ bzr rename 003_cactus.py 005_add_instance_migration.py

2. Should these be in their own table? That is a lot of fields to add to the Service table directly, since this is a table that has entries for every service type. I was thinking about adding a ComputeService (compute_services) table for our heterogeneous compute cluster.

627 + # The below items are compute node only.
628 + # None is inserted for other service.
629 + vcpus = Column(Integer, nullable=True)
630 + memory_mb = Column(Integer, nullable=True)
631 + local_gb = Column(Integer, nullable=True)
632 + vcpus_used = Column(Integer, nullable=True)
633 + memory_mb_used = Column(Integer, nullable=True)
634 + local_gb_used = Column(Integer, nullable=True)
635 + hypervisor_type = Column(Text, nullable=True)
636 + hypervisor_version = Column(Integer, nullable=True)

3. We can use the "arch" sub-field below for our project. Can we talk about adding accelerator_info (for GPUs, FPGAs, or other co-processors) and possibly network_info for details on the physical network interface?

    # Note(masumotok): Expected Strings example:
    #
    # '{"arch":"x86_64", "model":"Nehalem",
    # "topology":{"sockets":1, "threads":2, "cores":3},
    # features:[ "tdtscp", "xtpr"]}'
    #
    # Points are "json translatable" and it must have all
    # dictionary keys above.
    cpu_info = Column(Text, nullable=True)

bschott@island100:~/source/nova/live-migration$ bzr merge lp:nova
+N nova/api/openstack/zones.py
+N nova/db/sqlalchemy/migrate_repo/versions/003_add_label_to_networks.py
+N nova/db/sqlalchemy/migrate_repo/versions/004_add_zone_tables.py
+N nova/tests/api/openstack/test_common.py
+N nova/tests/api/openstack/test_zones.py
 M .mailmap
 M Authors
 M HACKING
 M MANIFEST.in
 M bin/nova-manage
 M locale/nova.pot
 M nova/api/ec2/cloud.py
 M nova/api/openstack/__init__.py
 M nova/api/openstack/auth.py
 M nova/api/openstack/common.py
 M nova/api/openstack/servers.py
 M nova/auth/ldapdriver.py
 M nova/auth/novarc.template
 M nova/compute/api.py
 M nova/compute/manager.py
 M nova/compute/power_state.py
 M nova/context.py
 M nova/db/api.py
 M nova/db/sqlalchemy/api.py
 M nova/db/sqlalchemy/migrate_repo/versions/001_austin.py
 M nova/db/sqlalchemy/migrate_repo/versions/002_bexar.py
 M nova/db/sqlalchemy/migration.py
 M nova/db/sqlalchemy/models.py
 M nova/flags.py
 M nova/log.py
 M nova/network/linux_net.py
 M nova/network/manager.py
 M nova/rpc.py
 M nova/tests/api/openstack/__init__.py
 M nova/tests/api/openstack/test_servers.py
 M nova/tests/test_api.py
 M nova/tests/test_compute.py
 M nova/tests/test_log.py
 M nova/tests/test_xenapi.py
 M nova/twistd.py
 M nova/u...

review: Needs Fixing
Revision history for this message
termie (termie) wrote :

Hello :) I think the code looks very good (the tests especially appear thorough); however, there are many places for style cleanup. You may want to read the part of the HACKING file about docstrings before going on:

in bin/nova-api:

looks like utils.default_flagfile() should be in the __main__ function rather than at the top of the file.

in bin/nova-dhcpbridge:

looks like there is a leftover debugging statement ('open...')

in bin/nova-manage:

please update the docstring for 'live_migration' to describe what it will do (something like "Migrates a running instance to a new machine." is fine)

for the long "if FLAGS.volume_driver..." line, please instead put the line in parens like so:

if (FLAGS.volume_driver != 'nova.volume.driver.AOEDriver' and
    FLAGS.volume_driver != 'nova.volume.driver.ISCSIDriver'):

When generating the "msg" you can do something similar:

msg = ('Migration of %s initiated. Checking its progress'
       ' using euca-describe-instances.') % ec2_id

in the docstring for describe_resource, please capitalize the first word (Describe...)

the comment at line 83 ("Checking result msg format is necessary...") is a little unclear, are you saying:

It will be necessary to check the result msg format when this feature is included in the API.

if so, you could say:

TODO(masumotok): It will be necessary to check the result msg...

Please capitalize the first letter of the docstring for update_resource

in nova/compute/manager.py:

the triple quotes are not necessary around the description of the 'flags.DEFINE_string' line, single quotes are fine.

flags.DEFINE_string looks like it should be flags.DEFINE_integer

the docstring for compare_cpu has an extra space at the beginning that is not necessary.

please capitalize the first letter of the docstring for mktmpfile

if you are only writing to the tmpfile for debugging purposes, perhaps that should be a logging.debug call?

please add a period to the end of the docstring for update_available_resource

in the pre_live_migration method, there should be an apostrophe in the word "doesnt" (doesn't)

may as well capitalize the first letter in the Bridge settings comment ('Call this method...')

in the message about failing a retry you can remove the 'th.' part, and change 'fail' to 'failed', it still doesn't read perfectly but pluralization isn't really necessary for log messages.

in the live_migration method, you can delete the line about #@exception.wrap_exception if you are going to comment it out.

also, please capitalize the first letter of the docstring.

in post_live_migration, please move the first line to the first line of the docstring... and reformat the string a bit like so:

"""Post operations for live migration.

Mainly, database updating.

"""

also in post_live_migration you check 'None == some_variable' a couple times, in python we don't usually do this because it is impossible to write 'if some_variable = None' because the assignment operation is not an expression... which means you don't need to be extra safe with which side the variable is on and having the variable first is easier to read (at least in english).

also a bit further down you don't need to use triple ...

review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Brian,
Thanks for review!

I think I have fixed everything based on your comments.
(The branch will be updated soon.)
Also, regarding point 3 below,

> 3. We can use the "arch" sub-field below for our project. Can we talk about adding
> accelerator_info (for GPUs, FPGAs, or other co-procesors) and possibly network_info
> for details on the physical network interface?

We use the cpu_info column to store the argument for compareCPU() in virConnect.
You can get an example by following the procedure below.

# python
# import libvirt
# conn = libvirt.openReadOnly()
# conn.getCapabilities()

Once you follow the above, you get XML; we cut out the <cpu>...</cpu> element and store it in the db.
I think a different result will be shown for your hardware environment.
Then FPGA/GPU info would also be included, I suppose.
If libvirt doesn't find any FPGA/GPU info, please let me know.
I will have to give it more thought..
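
In sketch form, the extraction could look like this (illustrative only; it assumes the libvirt Python bindings are installed and is not the branch's exact code):

    import libvirt
    from xml.etree import ElementTree

    conn = libvirt.openReadOnly(None)
    caps_xml = conn.getCapabilities()

    # Keep only the host <cpu>...</cpu> element, which is what ends up
    # in the cpu_info column.
    root = ElementTree.fromstring(caps_xml)
    cpu_elem = root.find('host/cpu')
    print ElementTree.tostring(cpu_elem)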

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi termie,

Thank you very much for reviewing my branch!
I have made fixes based on all of your comments.
I hope I didn't miss anything...

Please let me know if you have further comments.

Regards,
Kei

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Rick,

Thanks again for reviewing my branch.
After your review, I went through two other reviewers' comments and things have improved.
Hopefully we can now proceed with merging.
Also, please tell me if I missed any of your points.

Regards,
Kei

Revision history for this message
Brian Schott (bfschott) wrote :

I've been reviewing your changes. Thank you for doing the compute service table changes. I plan to do some integration testing next week with our hpc-trunk branch.

Brian Schott
<email address hidden>

Revision history for this message
Kei Masumoto (masumotok) wrote :

To All reviewers:

It has been a week since I last did "bzr merge lp:nova" on this branch, and I think it's better to merge again to avoid any conflicts.
If you are in the middle of reviewing again, please wait a little while; I will e-mail again once I have finished. A few hours should be enough, hopefully.

Regards,
Kei Masumoto

Revision history for this message
Kei Masumoto (masumotok) wrote :

To All reviewers:

I have completed merging trunk rev 752.
A few changes were necessary; please check the comments below.

> 1. Merged trunk rev 749.
> 2. rpc.call returns '/' as '\/', so nova.compute.manager.mktmpfile, nova.compute.manager.confirm_tmpfile and nova.scheduler.driver.Scheduler.mounted_on_same_shared_storage were modified to follow this change.
> 3. nova.tests.test_virt.py was modified so that changes from other teams are easily detected, since another team is using nova.db.sqlalchemy.models.ComputeService.

If you have further comments, or if I missed your point, please let me know.

Revision history for this message
Rick Harris (rconradharris) wrote :

Hi Kei! Improvements look good, thanks for the updates. Here are my round-two review notes:

> def mktmpfile(self, context):

It might be a good idea to rename these functions. Right now, the name
confirm_tmpfile contains implementation details but doesn't provide a good
hint as to what it's used for.

Might be better as:

    create_shared_storage_test_file

Also, what happens if the destination isn't on shared storage? We've deposited
the test file; will that ever be cleaned up?

Perhaps in pseudo-code it should be something like:

    def mounted_on_same_shared_storage(source, dest):
        create_shared_storage_test_file(dest)
        try:
            # Unlike confirm_tmpfile, this doesn't delete the test file;
            # that is left to cleanup_shared_storage_test_file below.
            test_file_exists = check_shared_storage_test_file(source)
        finally:
            # Regardless of whether we find it, we always delete it.
            cleanup_shared_storage_test_file(dest)
        return test_file_exists

> ======================================================================
> ERROR: Failure: ImportError (No module named libvirt)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> File "/Library/Python/2.6/site-packages/nose-0.11.3-py2.6.egg/nose/loader.py", line 382, in loadTestsFromName
> addr.filename, addr.module)
> File "/Library/Python/2.6/site-packages/nose-0.11.3-py2.6.egg/nose/importer.py", line 39, in importFromPath
> return self.importFromDir(dir_path, fqname)
> File "/Library/Python/2.6/site-packages/nose-0.11.3-py2.6.egg/nose/importer.py", line 86, in importFromDir
> mod = load_module(part_fqname, fh, filename, desc)
> File "/Users/rick/Documents/code/openstack/nova/live-migration/nova/tests/test_virt.py", line 17, in <module>
> import libvirt
> ImportError: No module named libvirt
>
> ----------------------------------------------------------------------

Libvirt is required for the tests to run.

Since not everyone is going to have libvirt on their machine, we should probably use
the 'LazyImport pattern' here so that we only import libvirt if it's actually
going to be used-- in the case of unit-tests only the FakeLibvirt should be
used.
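
A minimal sketch of that pattern (the names here are illustrative, not the branch's final code):

    libvirt = None  # resolved lazily so unit tests never need the real binding


    def get_connection(read_only=False):
        """Import libvirt only when a real connection is actually requested."""
        global libvirt
        if libvirt is None:
            libvirt = __import__('libvirt')
        if read_only:
            return libvirt.openReadOnly(None)
        return libvirt.open(None)

In the unit tests, the module-level name can then be pointed at FakeLibvirt before get_connection() is ever called, so the real library is never imported.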

> 232 + for v in instance_ref['volumes']:

In general, it's better to use variable names with more than one letter -- it aids
readability and makes it a little easier to 'grep' around the code. In this
case, 'volume' seems like the right choice. There are a few other instances
throughout the code where I think single-letter variable names should
probably be expanded:

    + p = os.path.join(FLAGS.instances_path, filename)

On the other hand, it's fine (and idiomatic) for exception handler blocks to use `e` as the
variable for their exception. Like:

+ except exception.ProcessExecutionError, e:

> 697 +compute_services = Table('compute_services', meta,

'compute_services' sounds a little too much like the 'compute worker service'
that we already have. This might be clearer if renamed to

'compute_nodes' or 'compute_hosts'.

The 'compute_node' would represent the physical machine, while the
'compute-service' would represent the logical endpoin...

review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Rick!
Thanks for review!

I agree with all of your comments.
I will fix them soon...

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi, Rick!

Fixed based on your comments.
One thing to note: I removed the libvirt-dependent tests in test_virt.py for developers without a libvirt environment.
(At first I was trying to mock/fake libvirt as much as I could, but eventually the tests became not so meaningful :) )

I hope to get your feedback..

Kei

Revision history for this message
Rick Harris (rconradharris) wrote :

Hi Kei.

I ran the tests on my Mac OS X machine and received 1 failure. Looks like we might need to mock out the get_cpu_info portion of the driver.

> but eventually tests become not so meaningful

Agreed, in terms of "does this really work?", the unit tests aren't a substitute for real functional/integration testing.

However, even with lots of code faked out, we can still get some value from the tests in terms of catching small issues: passing wrong number arguments, syntax errors, variables being of the wrong type. These are things that unit tests are really good at catching. And since we don't have the benefit of a compiler-pass, these unit tests really help cut down on the number of these problems that make it into trunk.

======================================================================
ERROR: test_update_available_resource_works_correctly (nova.tests.test_virt.LibvirtConnTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/rick/Documents/code/openstack/nova/live-migration/nova/tests/test_virt.py", line 290, in test_update_available_resource_works_correctly
    conn.update_available_resource(self.context, 'dummy')
  File "/Users/rick/Documents/code/openstack/nova/live-migration/nova/virt/libvirt_conn.py", line 1042, in update_available_resource
    dic = {'vcpus': self.get_vcpu_total(),
  File "/Users/rick/Documents/code/openstack/nova/live-migration/nova/virt/libvirt_conn.py", line 861, in get_vcpu_total
    return open('/proc/cpuinfo').read().count('processor')
IOError: [Errno 2] No such file or directory: '/proc/cpuinfo'
-------------------- >> begin captured logging << --------------------
2011-03-03 12:58:20,747 AUDIT nova.auth.manager [-] Created user fake (admin: True)
2011-03-03 12:58:20,750 AUDIT nova.auth.manager [-] Created project fake with manager fake
--------------------- >> end captured logging << ---------------------

======================================================================
ERROR: test_update_available_resource_works_correctly (nova.tests.test_virt.LibvirtConnTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/rick/Documents/code/openstack/nova/live-migration/nova/tests/test_virt.py", line 325, in tearDown
    super(LibvirtConnTestCase, self).tearDown()
  File "/Users/rick/Documents/code/openstack/nova/live-migration/nova/test.py", line 91, in tearDown
    self.mox.VerifyAll()
  File "build/bdist.macosx-10.6-universal/egg/mox.py", line 286, in VerifyAll
    mock_obj._Verify()
  File "build/bdist.macosx-10.6-universal/egg/mox.py", line 506, in _Verify
    raise ExpectedMethodCallsError(self._expected_calls_queue)
ExpectedMethodCallsError: Verify: Expected methods never called:
  0. get_cpu_info.__call__() -> 'cpuinfo'
-------------------- >> begin captured logging << --------------------
2011-03-03 12:58:20,747 AUDIT nova.auth.manager [-] Created user fake (admin: True)
2011-03-03 12:58:20,750 AUDIT nova.auth.manager [-] Created project fake with manager fake
2011-03-03 12:58:20,795 AUDIT nova.auth.manager [-] Deleting project fake
2011-03-03 ...

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Rick.

Thanks for your response.

> I ran the tests on my Mac OS X machine and received 1 failure. Looks like we might
> need to mock out the get_cpu_info portion of the driver.

Thanks for this information! That is very helpful. OK, perhaps enhancing the exception handling in get_cpu_info is better than mocking it out, since get_cpu_info() is called whenever nova-compute launches.

if sys.platform.upper() != 'LINUX2':
    return 0
else:
    return open('/proc/cpuinfo').read().count('processor')

Please let me know if you have any comments at this point.

> However, even with lots of code faked out, we can still get some value
> from the tests in terms of catching small issues: passing wrong number arguments,
> syntax errors, variables being of the wrong type. These are things that unit
> tests are really good at catching. And since we don't have the benefit of
> a compiler-pass, these unit tests really help cut down on the number of
> these problems that make it into trunk.

Understood. Actually, I was a bit confused about whether I could include unit-test-like tests in trunk.
Let me get the deleted test code back into this branch.

Fix them soon...

Thanks again!
Kei

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Rick,

I fixed my branch based on your comments.
I think I already explained the main changes in my previous e-mail; please review it.

Thanks,
Kei

Revision history for this message
Brian Schott (bfschott) wrote :

In case reviewers are hitting this:

---
bzr rename nova/db/sqlalchemy/migrate_repo/versions/009_add_instance_migrations.py nova/db/sqlalchemy/migrate_repo/versions/010_add_instance_migrations.py
--

I suggest you create flags:
compute_vcpus_total
compute_memory_mb_total
compute_local_gb_total

These could be used to specify fewer resources than the total (for example, suppose you only want to dedicate 1 core to VMs, or half your host memory). Also, some Linux distros don't have /proc, or at least I think the /proc fs is still optional in the kernel.

if FLAGS.compute_vcpus_total:
    return FLAGS.compute_vcpus_total
else:
    try:
        return open('/proc/cpuinfo').read().count('processor')
    except IOError:  # (i forget what goes here :-)
        return 1

review: Approve
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Brian,

Thanks for approval.

> bzr rename nova/db/sqlalchemy/migrate_repo/versions/009_add_instance_migrations.py
> nova/db/sqlalchemy/migrate_repo/versions/010_add_instance_migrations.py
OK, I will fix this soon.

> They can be used to specify resources less than total
> (like suppose you only want to dedicate 1 core to VMs or half your host memory)?
> Also, some Linux distros don't have /proc, or at least
> I think /proc fs is still optional in the kernel.
>
> if FLAGS.compute_vcpus_total>
> return FLAGS.compute_vcpus_total
> else:
> try
> open('/proc/cpuinfo').read().count('processors')
> except ... (i forget what goes here :-)
> return 1

I think I understand your point.
Currently, I use a multi-platform library to calculate the number of CPUs and the amount of disk, so there is no problem on that point.
Regarding memory, I was trying a multi-platform library like psutil (http://code.google.com/p/psutil/), but only an older version is available on Ubuntu Maverick. So I have to wait for a newer version, and I agree with your suggestion for the memory calculation.

If "like suppose you only want to dedicate 1 core to VMs or half your host memory?" is your point, please specify which part you are referring to; I intend neither "only 1 core to VM" nor "half of host memory".

Again, thanks for approval.

Kei

Revision history for this message
Brian Schott (bfschott) wrote :

Sorry, I meant it would be good for cloud server administrators to be able to specify how many cores, and how much disk and memory, are dedicated to nova.

If someone has a cloud on a laptop or office computers, they might want to reserve some capacity for the host operating system.

Or an admin might reserve one core and a gigabyte of memory for a swift storage server. Not all configs are dedicated compute blades.

Looking forward to seeing this merged soon!

Sent from my iPhone

Revision history for this message
Kei Masumoto (masumotok) wrote :

Brian, thanks for the explanation! I understand your comment; "reserving some cpu/memory/disk" sounds like a good idea. I personally agree with implementing it in nova, but unfortunately I didn't mention "reserving" when I got blueprint approval. In addition, it is not only a live-migration topic but a topic for all of nova; an admin would also want to use "reserving" when just launching instances or creating volumes, wouldn't he?
Therefore, I think it is better to discuss this at the next design summit. (Actually, I've heard from someone that this feature is necessary and that it should be implemented in the scheduler; I am sure there will be many discussions :)
Thanks again!
Kei

Revision history for this message
Brian Schott (bfschott) wrote :

No problem. I'll propose a follow-on blueprint and link it to this one with a more detailed approach.

Brian Schott
<email address hidden>

Revision history for this message
Rick Harris (rconradharris) wrote :

Nice work, Kei.

Some small nits:

> 194 + os.fdopen(fd, 'w+').close()

`os.close` should suffice since the fd that mkstemp returns is already open.

> 1007 + ec2_id = instance_ref['hostname']

Doesn't appear to be used.

review: Approve
Revision history for this message
Jay Pipes (jaypipes) wrote :

Hi Kei!

All tests are passing locally for me, which is great, and the code looks very solid. Very good use of mox in your test cases.

Just a few suggestions, mostly small style stuff...

1)

There are a number of places you use __setitem__, like so:

1223 + instance_ref.__setitem__('id', 1)

it's easier to just write:

instance_ref['id'] = 1

2) Tiny style/i18n/English stuff

Please do not take offence to me correcting your English phrases! :)

62 + print 'Unexpected error occurs'

Please i18n that. Also, the English saying would be "An unexpected error has occurred."

29 + raise exception.Error(msg)

There are 2 spaces after raise. Only 1 needed :)

92 + raise exception.Invalid(_('%s does not exists.') % host)

English saying would be "%s does not exist" (without the s on exist)

146 +flags.DEFINE_string('live_migration_retry_count', 30,

Might want to use DEFINE_integer to ensure an integer is used as the flag value...

192 + LOG.debug(_("Creating tmpfile %s to notify to other "
193 + "compute node that they mounts same storage.") % tmp_file)

s/node that they mounts same storage/nodes that they should mount the same storage/

248 + msg = _("%(instance_id)s(%(ec2_id)s) does'nt have fixed_ip")

s/does'nt have/does not have/

365 + LOG.info(_('floating_ip is not found for %s'), i_name)
73 + LOG.info(_('Floating_ip is not found for %s'), i_name)

s/floating_ip is not found for/No floating IP was found for/

381 + LOG.info(_('Migrating %(i_name)s to %(dest)s finishes successfully.')

s/finishes successfully/finished successfully/

383 + LOG.info(_("The below error is normally occurs. "
384 + "Just check if instance is successfully migrated.\n"
385 + "libvir: QEMU error : Domain not found: no domain "
386 + "with matching name.."))

I would say this, instead:

LOG.info(_("You may see the error \"libvirt: QEMU error: "
           "Domain not found: no domain with matching name.\" "
           "This error can be safely ignored.")

547 + raise exception.NotFound(_("%s does not exist or not "
548 + "compute node.") % host)

s/or not compute node/or is not a compute node/

1040 + raise exception.NotEmpty(_("%(ec2_id)s is not capable to "
1041 + "migrate %(dest)s (host:%(mem_avail)s "

I'd rewrite that as "Unable to migrate %(ec2_id)s to destination: %(dest)s ..."

1073 + logging.error(_("Cannot comfirm tmpfile at %(ipath)s is on "

s/comfirm/confirm/ :)

You use this line:

global ghost, gbinary, gmox

in 2 places in the nova/tests/test_service.py file:

2198 + global ghost, gbinary, gmox
2238 + global ghost, gbinary, gmox

However, the actual variable names are:

2187 +# temporary variable to store host/binary/self.mox
2188 +# from each method to fake class.
2189 +global_host = None
2190 +global_binary = None
2191 +global_mox = None

You will want to make those consistent I believe, otherwise I'm not sure what gbinary, ghost, and gmox are going to refer to ;)

2385 + def tes1t_update_available_resource_works_correctly(self):

s/tes1t/test :) The misspelling is causing this test case not to be run. (It passes, BTW, when you fix the typo...I checked. :) )

2658 + msg = _("""Cannot confirm exported volume id:%(volume_id)s."""
2659 + """vblade process...


review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Thanks for review, Rick! I'll fix it soon..

Kei

Revision history for this message
Kei Masumoto (masumotok) wrote :

Thanks Jay! Your statement at the last IRC meeting was very helpful for me. Everyone reviewed again and I got Approves!
I'll fix my branch based on your comments soon...

One note:
> You use this line:
>
> global ghost, gbinary, gmox
>
> in 2 places in the nova/tests/test_service.py file:
>
>2198 + global ghost, gbinary, gmox
>2238 + global ghost, gbinary, gmox
>
>However, the actual variable names are:
>
>2187 +# temporary variable to store host/binary/self.mox
>2188 +# from each method to fake class.
>2189 +global_host = None
>2190 +global_binary = None
>2191 +global_mox = None
>
>You will want to make those consistent I believe, otherwise I'm not sure what gbinary, ghost, and gmox are going to >refer to ;)
I completely forgot to update this testcase. I'll rewrite this. Sorry..

Thanks again!
Kei

Revision history for this message
Jay Pipes (jaypipes) wrote :

Awesome job, Kei :)

review: Approve
Revision history for this message
Brian Schott (bfschott) wrote :

+1
This branch adds a lot of new capabilities.

Brian Schott
<email address hidden>

Revision history for this message
Kei Masumoto (masumotok) wrote :

Jay, Brian, thank you! I appreciate your help!

Kei

Revision history for this message
Ken Pepple (ken-pepple) wrote :

This is a nit but will drive nova admins crazy -- in nova-manage, I think we should verify the destination host name and service aliveness before we send the rpc call off to the scheduler. I think this is as easy to implement as wrapping nova-manage:31-39 with an if..else statement with a call to db.service_get_by_host_and_topic(context, dest, "compute").

This will save us from waiting on the migration (that will never happen) and cleaning out the queue later.
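
Roughly, something like the following in nova-manage's live_migration (a hedged sketch: the exact return behaviour of db.service_get_by_host_and_topic is assumed, and _cast_to_scheduler just stands in for the existing dispatch to the scheduler):

    ctxt = context.get_admin_context()
    service = db.service_get_by_host_and_topic(ctxt, dest, 'compute')
    if service:
        # A liveness check (e.g. comparing updated_at against
        # FLAGS.service_down_time) could also go here.
        _cast_to_scheduler(ctxt, instance_id, dest)  # existing code path
    else:
        print _('%s is not a registered compute host; aborting.') % dest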

review: Needs Fixing
Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Ken, thanks for your comments!
I will fix it soon..

Kei

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Ken,

I would appreciate it if I could ask you some questions.

1. In the currently proposed code, the scheduler checks many things: src/dest aliveness, lack of memory, hypervisor compatibility... Does your comment imply that all of those checks must be done in nova-manage, or just the aliveness check? I was thinking it is safer for those checks to be done in the scheduler, for example in case the scheduler is busy. Another example is a user sending a request directly to the scheduler without using nova-manage (I am worrying about whether some security issue could occur here, or is there no need to think about it?).

2. In the currently proposed code, if the destination host is not alive (this check is done in the scheduler), an exception is raised and returned to nova-manage. Then do we have to clean up rabbitmq?

I'm a bit confused; please shed some light on this..

Kei

Revision history for this message
Ken Pepple (ken-pepple) wrote :

> 1. In the currently proposed code, the scheduler checks many things: src/dest aliveness, lack of memory, hypervisor compatibility... Does your comment imply that all of those checks must be done in nova-manage, or just the aliveness check? I was thinking it is safer for those checks to be done in the scheduler, for example in case the scheduler is busy. Another example is a user sending a request directly to the scheduler without using nova-manage (I am worrying about whether some security issue could occur here, or is there no need to think about it?).

Hi masumoto-san -

Sorry, I meant for only basic checks to be done in nova-manage. My concern is that admins will start a live migration to a non-existent host (or disabled host), wait for a minute or two, then check euca-describe-instances and see that nothing happened because recover_live_migration has already set it back to the "running" state.

I agree that most checks should be done in scheduler, so that later we might be able to add API support for live-migration.

Will nova/compute/manager.py:879 check to make sure that the service is alive ? I thought this just looked for a queue (and queues aren't destroyed on alive checks), so this may be my misunderstanding.

> 2. In the currently proposed code, if the destination host is not alive (this check is done in the scheduler), an exception is raised and returned to nova-manage. Then do we have to clean up rabbitmq?

No, because we never put it in the queue.

Thanks for all the work on this -- looking forward to live migration.

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi Ken-san, Thanks for your answer!

> My concern is that admins will start a live migration to a non-existent host
> (or disabled host), wait for a minute or two, then check euca-describe-instances
> and see that nothing happened because recover_live_migration has
> already set it back to the "running" state.

In our environment, the admin gets an error message such as "destination host is not alive" within a few seconds, because the scheduler checks this and raises an exception (see below for the other scheduler checks).

> Will nova/compute/manager.py:879 check to make sure that the service is alive ?
pre_live_migration (meaning nova/compute/manager.py:879) performs the checks that can only be done on the compute node, such as: is the security group ingress rule (iptables rule) successfully taken over? can the destination host recognize the volumes (is the iscsi daemon alive? is the aoe kernel module inserted?), ..etc.
On the other hand, the scheduler takes care of the other checks, which can be done anywhere (see below).

[Examples of scheduler checks]
- the instance is running
- the src/dest hosts exist (and are alive)
- nova-compute runs on the src/dest hosts
- nova-volume is alive when the instance mounts a volume
- hypervisor_type, hypervisor_version and cpu compatibility
- the dest host has enough memory
- the src/dest hosts mount the same shared storage

Please let me know if it does not make sense to you. (I am always worrying that my English mistakes might confuse you :)) I think I have explained here why your concerns do not apply to the current implementation....

Thanks again!
Kei

Revision history for this message
Ken Pepple (ken-pepple) wrote :

On Mar 10, 2011, at 8:01 PM, Kei Masumoto wrote:
> Please let me know if it does not make sense to you. (I am always worrying that my English mistakes might confuse you :)) I think I have explained here why your concerns do not apply to the current implementation….

Okay, I think I understand … it can be a bit difficult following the scheduler code sometimes.

Last question: don't we need this patch (see below)? My install fails when I do this:

root@shuttle:~/src/live-migration/contrib/nova# bin/nova-manage vm live_migration i0023 badhost
2011-03-10 21:53:20,653 CRITICAL nova [-] global name 'ec2_id_to_id' is not defined
(nova): TRACE: Traceback (most recent call last):
(nova): TRACE: File "bin/nova-manage", line 1074, in <module>
(nova): TRACE: main()
(nova): TRACE: File "bin/nova-manage", line 1066, in main
(nova): TRACE: fn(*argv)
(nova): TRACE: File "bin/nova-manage", line 573, in live_migration
(nova): TRACE: instance_id = ec2_id_to_id(ec2_id)
(nova): TRACE: NameError: global name 'ec2_id_to_id' is not defined
(nova): TRACE:

Or did I not install this correctly?
Thanks again
/k

===== PATCH ======

=== modified file 'bin/nova-manage'
--- bin/nova-manage 2011-03-10 06:23:13 +0000
+++ bin/nova-manage 2011-03-11 05:56:38 +0000
@@ -570,7 +570,7 @@
         """

         ctxt = context.get_admin_context()
-        instance_id = ec2_id_to_id(ec2_id)
+        instance_id = ec2utils.ec2_id_to_id(ec2_id)

         if FLAGS.connection_type != 'libvirt':
             msg = _('Only KVM is supported for now. Sorry!')

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi ken!

Hold on, a quite big earthquake has just hit Japan. The TV channels do not work, so I am not sure what is going on..
Probably almost all companies have stopped their business, and employees are trying to escape.
Actually, I helped a pregnant woman and her baby and finally got back to my apartment.

After I tidy my room, I'm going to check this. My new TV didn't fall down and didn't break although some glasses broke; I don't know whether I am still lucky or not...

By the way, I am not sure whether it is OK to write e-mail or whether I should prepare to escape? :)

Kei

Revision history for this message
Ken Pepple (ken-pepple) wrote :

Masumoto-san -- I'm talking with my friends in Aoyama (8.8M!). You should definitely escape :)

Revision history for this message
Thierry Carrez (ttx) wrote :

I think this should be merged *now*. The feature part was approved already. Given the situation in Japan I don't expect Kei to have lots of time to add the additional pre-checks that Ken mentioned.

Someone can propose a branch that adds the checks afterwards.

review: Approve
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Attempt to merge into lp:nova failed due to conflicts:

text conflict in nova/tests/test_virt.py

Revision history for this message
Ken Pepple (ken-pepple) wrote :

Agreeing with ttx; I will file bugs/patches for my objections.

review: Approve
Revision history for this message
Jay Pipes (jaypipes) wrote :

I'm fixing the merge conflict locally for Kei and will push shortly.

Revision history for this message
Brian Schott (bfschott) wrote :

Jay,

Not that you need a reference, but I may have fixed those conflicts in:
lp:~usc-isi/nova/hpc-trunk
Don't pull the whole branch, as it has our cpu-arch extensions.

Brian Schott
<email address hidden>

Revision history for this message
Jay Pipes (jaypipes) wrote :

Thanks Brian! It was a simple little import order thingie, though :)
Not a big deal!

Revision history for this message
Brian Schott (bfschott) wrote :

Lorin,

Good catch. That's going to hit trunk soon. I'm going to submit a bug to nova trunk. Jay, are you able to confirm this?

Brian

---
Brian Schott
USC Information Sciences Institute
http://www.east.isi.edu/~bschott
ph: 703-812-3722 fx: 703-812-3712

On Mar 14, 2011, at 3:34 PM, Lorin Hochstein wrote:

> I was running hpc-trunk with Ubuntu packages and saw this error in nova-compute
>
> 2011-03-14 12:12:02,764 ERROR nova [-] in Service.create()
> (nova): TRACE: Traceback (most recent call last):
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 264, in serve
> (nova): TRACE: services = [Service.create()]
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 167, in create
> (nova): TRACE: report_interval, periodic_interval)
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 73, in __init__
> (nova): TRACE: self.manager = manager_class(host=self.host, *args, **kwargs)
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/compute/manager.py", line 118, in __init__
> (nova): TRACE: self.driver = utils.import_object(compute_driver)
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/utils.py", line 75, in import_object
> (nova): TRACE: return cls()
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/virt/connection.py", line 64, in get_connection
> (nova): TRACE: conn = libvirt_conn.get_connection(read_only)
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/virt/libvirt_conn.py", line 131, in get_connection
> (nova): TRACE: return LibvirtConnection(read_only)
> (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/virt/libvirt_conn.py", line 163, in __init__
> (nova): TRACE: self.cpuinfo_xml = open(FLAGS.cpuinfo_xml_template).read()
> (nova): TRACE: IOError: [Errno 2] No such file or directory: '/usr/lib/pymodules/python2.6/nova/virt/cpuinfo.xml.template'
> (nova): TRACE:
>
> The NTT guys seem to have added a new file (cpuinfo.xml.template), but that file isn't currently being packaged. I'm not sure why, but I'm trying to figure it out.
>
> Lorin
>
> --
> Lorin Hochstein, Computer Scientist
> USC Information Sciences Institute
> 703.812.3710
> http://www.east.isi.edu/~lorin

Revision history for this message
Jay Pipes (jaypipes) wrote :

Hmm, this should not have gotten through the distribution/packaging
tests... I'll see what I can discover.

-jay

Revision history for this message
Soren Hansen (soren) wrote :

I have a Jenkins job that would have alerted us about this sooner. It
triggers when files are added to bzr, but don't end up in the tarball.
I've made a note to get that added tomorrow.

--
Soren Hansen        | http://linux2go.dk/
Ubuntu Developer    | http://www.ubuntu.com/
OpenStack Developer | http://www.openstack.org/

Revision history for this message
Kei Masumoto (masumotok) wrote :

Hi,

I would like to say thank you to everyone involved in getting the live-migration branch merged. In particular, the reviewers, both from the core developers and from the community, helped me improve the quality of our branch. I don't think this is the end of our work; I have already recognized that I need to catch up with some recent changes in nova. My team is planning to submit some patches.

By the way, regarding the earthquake: it hit the northern part of Japan, not Tokyo (where I live and work). Although we can keep our business running, the effects reach Tokyo as well, for example heavy traffic jams, shortages of gasoline and food, and power outages that leave us spending the nights with candles instead of lights.

Please forgive us if we need some more time to get back on track. I am looking forward to thanking all of you face to face at the next design summit.

Regards,
Kei

Revision history for this message
Jay Pipes (jaypipes) wrote :

I think I can safely say that all of us in the contributor community
wish you and all our colleagues and friends in Japan our best. It's a
horrific event and many of us feel powerless to do anything about it.
We hope you can find some normality in the coming weeks as, hopefully,
Japan recovers from the earthquake.

All our best,

jay

On Wed, Mar 16, 2011 at 2:42 AM, <email address hidden> wrote:
> Hi,
>
> I would like to say thank you for everyone regarding to live-migration branch was merged. Especially, reviewers both from core dev and from community, help me to improve quality of our branch. I don’t think this is the end of our work, at least I've already recognized that I have to follow some recent nova's changes. My team is planning to submit some patches.
>
> By the way, regarding to the earthquake, it happened the north part of Japan, not Tokyo(where I live and I work at). Although we can engaging our business, the effect also comes to Tokyo. For example, crazy traffic jam, lack of gasoline, lack of food, stopping electricity and I spend with no lights but with candles in the night... etc.
>
> Please forgive us we need some more time to get back, and I am looking forward to sending all of you many thanks f2f at next design summit.
>
> Regards,
> Kei
>

Revision history for this message
Brian Schott (bfschott) wrote :

+100

Brian Schott
<email address hidden>

On Mar 16, 2011, at 11:12 AM, Jay Pipes wrote:

> I think I can safely say that all of us in the contributor community
> wish you and all our colleagues and friends in Japan our best. It's a
> horrific event and many of us feel powerless to do anything about it.
> We hope you can find some normality in the coming weeks as, hopefully,
> Japan recovers from the earthquake.
>
> All our best,
>
> jay
>
> On Wed, Mar 16, 2011 at 2:42 AM, <email address hidden> wrote:
>> Hi,
>>
>> I would like to say thank you for everyone regarding to live-migration branch was merged. Especially, reviewers both from core dev and from community, help me to improve quality of our branch. I don’t think this is the end of our work, at least I've already recognized that I have to follow some recent nova's changes. My team is planning to submit some patches.
>>
>> By the way, regarding to the earthquake, it happened the north part of Japan, not Tokyo(where I live and I work at). Although we can engaging our business, the effect also comes to Tokyo. For example, crazy traffic jam, lack of gasoline, lack of food, stopping electricity and I spend with no lights but with candles in the night... etc.
>>
>> Please forgive us we need some more time to get back, and I am looking forward to sending all of you many thanks f2f at next design summit.
>>
>> Regards,
>> Kei
>>
>
> --
> https://code.launchpad.net/~nttdata/nova/live-migration/+merge/49699
> You are reviewing the proposed merge of lp:~nttdata/nova/live-migration into lp:nova.

Preview Diff

=== modified file 'bin/nova-manage'
--- bin/nova-manage 2011-03-10 04:42:11 +0000
+++ bin/nova-manage 2011-03-10 06:27:59 +0000
@@ -558,6 +558,40 @@
558 db.network_delete_safe(context.get_admin_context(), network.id)
559
560
561class VmCommands(object):
562 """Class for mangaging VM instances."""
563
564 def live_migration(self, ec2_id, dest):
565 """Migrates a running instance to a new machine.
566
567 :param ec2_id: instance id which comes from euca-describe-instance.
568 :param dest: destination host name.
569
570 """
571
572 ctxt = context.get_admin_context()
573 instance_id = ec2_id_to_id(ec2_id)
574
575 if FLAGS.connection_type != 'libvirt':
576 msg = _('Only KVM is supported for now. Sorry!')
577 raise exception.Error(msg)
578
579 if (FLAGS.volume_driver != 'nova.volume.driver.AOEDriver' and \
580 FLAGS.volume_driver != 'nova.volume.driver.ISCSIDriver'):
581 msg = _("Support only AOEDriver and ISCSIDriver. Sorry!")
582 raise exception.Error(msg)
583
584 rpc.call(ctxt,
585 FLAGS.scheduler_topic,
586 {"method": "live_migration",
587 "args": {"instance_id": instance_id,
588 "dest": dest,
589 "topic": FLAGS.compute_topic}})
590
591 print _('Migration of %s initiated.'
592 'Check its progress using euca-describe-instances.') % ec2_id
593
594
595class ServiceCommands(object):
596 """Enable and disable running services"""
597
@@ -602,6 +636,59 @@
636 return
637 db.service_update(ctxt, svc['id'], {'disabled': True})
638
639 def describe_resource(self, host):
640 """Describes cpu/memory/hdd info for host.
641
642 :param host: hostname.
643
644 """
645
646 result = rpc.call(context.get_admin_context(),
647 FLAGS.scheduler_topic,
648 {"method": "show_host_resources",
649 "args": {"host": host}})
650
651 if type(result) != dict:
652 print _('An unexpected error has occurred.')
653 print _('[Result]'), result
654 else:
655 cpu = result['resource']['vcpus']
656 mem = result['resource']['memory_mb']
657 hdd = result['resource']['local_gb']
658 cpu_u = result['resource']['vcpus_used']
659 mem_u = result['resource']['memory_mb_used']
660 hdd_u = result['resource']['local_gb_used']
661
662 print 'HOST\t\t\tPROJECT\t\tcpu\tmem(mb)\tdisk(gb)'
663 print '%s(total)\t\t\t%s\t%s\t%s' % (host, cpu, mem, hdd)
664 print '%s(used)\t\t\t%s\t%s\t%s' % (host, cpu_u, mem_u, hdd_u)
665 for p_id, val in result['usage'].items():
666 print '%s\t\t%s\t\t%s\t%s\t%s' % (host,
667 p_id,
668 val['vcpus'],
669 val['memory_mb'],
670 val['local_gb'])
671
672 def update_resource(self, host):
673 """Updates available vcpu/memory/disk info for host.
674
675 :param host: hostname.
676
677 """
678
679 ctxt = context.get_admin_context()
680 service_refs = db.service_get_all_by_host(ctxt, host)
681 if len(service_refs) <= 0:
682 raise exception.Invalid(_('%s does not exist.') % host)
683
684 service_refs = [s for s in service_refs if s['topic'] == 'compute']
685 if len(service_refs) <= 0:
686 raise exception.Invalid(_('%s is not compute node.') % host)
687
688 rpc.call(ctxt,
689 db.queue_get_for(ctxt, FLAGS.compute_topic, host),
690 {"method": "update_available_resource"})
691
692
693class LogCommands(object):
694 def request(self, request_id, logfile='/var/log/nova.log'):
@@ -905,6 +992,7 @@
992 ('fixed', FixedIpCommands),
993 ('floating', FloatingIpCommands),
994 ('network', NetworkCommands),
995 ('vm', VmCommands),
996 ('service', ServiceCommands),
997 ('log', LogCommands),
998 ('db', DbCommands),
999
=== modified file 'contrib/nova.sh'
--- contrib/nova.sh 2011-03-08 00:01:43 +0000
+++ contrib/nova.sh 2011-03-10 06:27:59 +0000
@@ -76,6 +76,7 @@
76 sudo apt-get install -y python-migrate python-eventlet python-gflags python-ipy python-tempita
77 sudo apt-get install -y python-libvirt python-libxml2 python-routes python-cheetah
78 sudo apt-get install -y python-netaddr python-paste python-pastedeploy python-glance
79 sudo apt-get install -y python-multiprocessing
80
81 if [ "$USE_IPV6" == 1 ]; then
82 sudo apt-get install -y radvd
83
=== modified file 'nova/compute/manager.py'
--- nova/compute/manager.py 2011-03-07 22:40:19 +0000
+++ nova/compute/manager.py 2011-03-10 06:27:59 +0000
@@ -36,9 +36,12 @@
36
37import base64
38import datetime
39import os
40import random
41import string
42import socket
43import tempfile
44import time
45import functools
46
47from nova import exception
@@ -61,6 +64,9 @@
64flags.DEFINE_string('console_host', socket.gethostname(),
65 'Console proxy host to use to connect to instances on'
66 'this host.')
67flags.DEFINE_integer('live_migration_retry_count', 30,
68 ("Retry count needed in live_migration."
69 " sleep 1 sec for each count"))
70
71LOG = logging.getLogger('nova.compute.manager')
72
@@ -181,7 +187,7 @@
187 context=context)
188 self.db.instance_update(context,
189 instance_id,
184 {'host': self.host})
190 {'host': self.host, 'launched_on': self.host})
191
192 self.db.instance_set_state(context,
193 instance_id,
@@ -723,3 +729,248 @@
729 self.volume_manager.remove_compute_volume(context, volume_id)
730 self.db.volume_detached(context, volume_id)
731 return True
732
733 @exception.wrap_exception
734 def compare_cpu(self, context, cpu_info):
735 """Checks the host cpu is compatible to a cpu given by xml.
736
737 :param context: security context
738 :param cpu_info: json string obtained from virConnect.getCapabilities
739 :returns: See driver.compare_cpu
740
741 """
742 return self.driver.compare_cpu(cpu_info)
743
744 @exception.wrap_exception
745 def create_shared_storage_test_file(self, context):
746 """Makes tmpfile under FLAGS.instance_path.
747
748 This method enables compute nodes to recognize that they mounts
749 same shared storage. (create|check|creanup)_shared_storage_test_file()
750 is a pair.
751
752 :param context: security context
753 :returns: tmpfile name(basename)
754
755 """
756
757 dirpath = FLAGS.instances_path
758 fd, tmp_file = tempfile.mkstemp(dir=dirpath)
759 LOG.debug(_("Creating tmpfile %s to notify to other "
760 "compute nodes that they should mount "
761 "the same storage.") % tmp_file)
762 os.close(fd)
763 return os.path.basename(tmp_file)
764
765 @exception.wrap_exception
766 def check_shared_storage_test_file(self, context, filename):
767 """Confirms existence of the tmpfile under FLAGS.instances_path.
768
769 :param context: security context
770 :param filename: confirm existence of FLAGS.instances_path/thisfile
771
772 """
773
774 tmp_file = os.path.join(FLAGS.instances_path, filename)
775 if not os.path.exists(tmp_file):
776 raise exception.NotFound(_('%s not found') % tmp_file)
777
778 @exception.wrap_exception
779 def cleanup_shared_storage_test_file(self, context, filename):
780 """Removes existence of the tmpfile under FLAGS.instances_path.
781
782 :param context: security context
783 :param filename: remove existence of FLAGS.instances_path/thisfile
784
785 """
786
787 tmp_file = os.path.join(FLAGS.instances_path, filename)
788 os.remove(tmp_file)
789
790 @exception.wrap_exception
791 def update_available_resource(self, context):
792 """See comments update_resource_info.
793
794 :param context: security context
795 :returns: See driver.update_available_resource()
796
797 """
798
799 return self.driver.update_available_resource(context, self.host)
800
801 def pre_live_migration(self, context, instance_id):
802 """Preparations for live migration at dest host.
803
804 :param context: security context
805 :param instance_id: nova.db.sqlalchemy.models.Instance.Id
806
807 """
808
809 # Getting instance info
810 instance_ref = self.db.instance_get(context, instance_id)
811 ec2_id = instance_ref['hostname']
812
813 # Getting fixed ips
814 fixed_ip = self.db.instance_get_fixed_address(context, instance_id)
815 if not fixed_ip:
816 msg = _("%(instance_id)s(%(ec2_id)s) does not have fixed_ip.")
817 raise exception.NotFound(msg % locals())
818
819 # If any volume is mounted, prepare here.
820 if not instance_ref['volumes']:
821 LOG.info(_("%s has no volume."), ec2_id)
822 else:
823 for v in instance_ref['volumes']:
824 self.volume_manager.setup_compute_volume(context, v['id'])
825
826 # Bridge settings.
827 # Call this method prior to ensure_filtering_rules_for_instance,
828 # since bridge is not set up, ensure_filtering_rules_for instance
829 # fails.
830 #
831 # Retry operation is necessary because continuously request comes,
832 # concorrent request occurs to iptables, then it complains.
833 max_retry = FLAGS.live_migration_retry_count
834 for cnt in range(max_retry):
835 try:
836 self.network_manager.setup_compute_network(context,
837 instance_id)
838 break
839 except exception.ProcessExecutionError:
840 if cnt == max_retry - 1:
841 raise
842 else:
843 LOG.warn(_("setup_compute_network() failed %(cnt)d."
844 "Retry up to %(max_retry)d for %(ec2_id)s.")
845 % locals())
846 time.sleep(1)
847
848 # Creating filters to hypervisors and firewalls.
849 # An example is that nova-instance-instance-xxx,
850 # which is written to libvirt.xml(Check "virsh nwfilter-list")
851 # This nwfilter is necessary on the destination host.
852 # In addition, this method is creating filtering rule
853 # onto destination host.
854 self.driver.ensure_filtering_rules_for_instance(instance_ref)
855
856 def live_migration(self, context, instance_id, dest):
857 """Executing live migration.
858
859 :param context: security context
860 :param instance_id: nova.db.sqlalchemy.models.Instance.Id
861 :param dest: destination host
862
863 """
864
865 # Get instance for error handling.
866 instance_ref = self.db.instance_get(context, instance_id)
867 i_name = instance_ref.name
868
869 try:
870 # Checking volume node is working correctly when any volumes
871 # are attached to instances.
872 if instance_ref['volumes']:
873 rpc.call(context,
874 FLAGS.volume_topic,
875 {"method": "check_for_export",
876 "args": {'instance_id': instance_id}})
877
878 # Asking dest host to preparing live migration.
879 rpc.call(context,
880 self.db.queue_get_for(context, FLAGS.compute_topic, dest),
881 {"method": "pre_live_migration",
882 "args": {'instance_id': instance_id}})
883
884 except Exception:
885 msg = _("Pre live migration for %(i_name)s failed at %(dest)s")
886 LOG.error(msg % locals())
887 self.recover_live_migration(context, instance_ref)
888 raise
889
890 # Executing live migration
891 # live_migration might raises exceptions, but
892 # nothing must be recovered in this version.
893 self.driver.live_migration(context, instance_ref, dest,
894 self.post_live_migration,
895 self.recover_live_migration)
896
897 def post_live_migration(self, ctxt, instance_ref, dest):
898 """Post operations for live migration.
899
900 This method is called from live_migration
901 and mainly updating database record.
902
903 :param ctxt: security context
904 :param instance_id: nova.db.sqlalchemy.models.Instance.Id
905 :param dest: destination host
906
907 """
908
909 LOG.info(_('post_live_migration() is started..'))
910 instance_id = instance_ref['id']
911
912 # Detaching volumes.
913 try:
914 for vol in self.db.volume_get_all_by_instance(ctxt, instance_id):
915 self.volume_manager.remove_compute_volume(ctxt, vol['id'])
916 except exception.NotFound:
917 pass
918
919 # Releasing vlan.
920 # (not necessary in current implementation?)
921
922 # Releasing security group ingress rule.
923 self.driver.unfilter_instance(instance_ref)
924
925 # Database updating.
926 i_name = instance_ref.name
927 try:
928 # Not return if floating_ip is not found, otherwise,
929 # instance never be accessible..
930 floating_ip = self.db.instance_get_floating_address(ctxt,
931 instance_id)
932 if not floating_ip:
933 LOG.info(_('No floating_ip is found for %s.'), i_name)
934 else:
935 floating_ip_ref = self.db.floating_ip_get_by_address(ctxt,
936 floating_ip)
937 self.db.floating_ip_update(ctxt,
938 floating_ip_ref['address'],
939 {'host': dest})
940 except exception.NotFound:
941 LOG.info(_('No floating_ip is found for %s.'), i_name)
942 except:
943 LOG.error(_("Live migration: Unexpected error:"
944 "%s cannot inherit floating ip..") % i_name)
945
946 # Restore instance/volume state
947 self.recover_live_migration(ctxt, instance_ref, dest)
948
949 LOG.info(_('Migrating %(i_name)s to %(dest)s finished successfully.')
950 % locals())
951 LOG.info(_("You may see the error \"libvirt: QEMU error: "
952 "Domain not found: no domain with matching name.\" "
953 "This error can be safely ignored."))
954
955 def recover_live_migration(self, ctxt, instance_ref, host=None):
956 """Recovers Instance/volume state from migrating -> running.
957
958 :param ctxt: security context
959 :param instance_id: nova.db.sqlalchemy.models.Instance.Id
960 :param host:
961 DB column value is updated by this hostname.
962 if none, the host instance currently running is selected.
963
964 """
965
966 if not host:
967 host = instance_ref['host']
968
969 self.db.instance_update(ctxt,
970 instance_ref['id'],
971 {'state_description': 'running',
972 'state': power_state.RUNNING,
973 'host': host})
974
975 for volume in instance_ref['volumes']:
976 self.db.volume_update(ctxt, volume['id'], {'status': 'in-use'})
977
=== modified file 'nova/db/api.py'
--- nova/db/api.py 2011-03-09 21:27:38 +0000
+++ nova/db/api.py 2011-03-10 06:27:59 +0000
@@ -104,6 +104,11 @@
104 return IMPL.service_get_all_by_host(context, host)
105
106
107def service_get_all_compute_by_host(context, host):
108 """Get all compute services for a given host."""
109 return IMPL.service_get_all_compute_by_host(context, host)
110
111
112def service_get_all_compute_sorted(context):
113 """Get all compute services sorted by instance count.
114
@@ -153,6 +158,29 @@
158###################
159
160
161def compute_node_get(context, compute_id, session=None):
162 """Get an computeNode or raise if it does not exist."""
163 return IMPL.compute_node_get(context, compute_id)
164
165
166def compute_node_create(context, values):
167 """Create a computeNode from the values dictionary."""
168 return IMPL.compute_node_create(context, values)
169
170
171def compute_node_update(context, compute_id, values):
172 """Set the given properties on an computeNode and update it.
173
174 Raises NotFound if computeNode does not exist.
175
176 """
177
178 return IMPL.compute_node_update(context, compute_id, values)
179
180
181###################
182
183
184def certificate_create(context, values):
185 """Create a certificate from the values dictionary."""
186 return IMPL.certificate_create(context, values)
@@ -257,6 +285,11 @@
285 return IMPL.floating_ip_get_by_address(context, address)
286
287
288def floating_ip_update(context, address, values):
289 """Update a floating ip by address or raise if it doesn't exist."""
290 return IMPL.floating_ip_update(context, address, values)
291
292
293####################
294
295def migration_update(context, id, values):
@@ -441,6 +474,27 @@
474 security_group_id)
475
476
477def instance_get_vcpu_sum_by_host_and_project(context, hostname, proj_id):
478 """Get instances.vcpus by host and project."""
479 return IMPL.instance_get_vcpu_sum_by_host_and_project(context,
480 hostname,
481 proj_id)
482
483
484def instance_get_memory_sum_by_host_and_project(context, hostname, proj_id):
485 """Get amount of memory by host and project."""
486 return IMPL.instance_get_memory_sum_by_host_and_project(context,
487 hostname,
488 proj_id)
489
490
491def instance_get_disk_sum_by_host_and_project(context, hostname, proj_id):
492 """Get total amount of disk by host and project."""
493 return IMPL.instance_get_disk_sum_by_host_and_project(context,
494 hostname,
495 proj_id)
496
497
498def instance_action_create(context, values):
499 """Create an instance action from the values dictionary."""
500 return IMPL.instance_action_create(context, values)
@@ -765,6 +819,11 @@
819 return IMPL.volume_get_all_by_host(context, host)
820
821
822def volume_get_all_by_instance(context, instance_id):
823 """Get all volumes belonging to a instance."""
824 return IMPL.volume_get_all_by_instance(context, instance_id)
825
826
827def volume_get_all_by_project(context, project_id):
828 """Get all volumes belonging to a project."""
829 return IMPL.volume_get_all_by_project(context, project_id)
830
=== modified file 'nova/db/sqlalchemy/api.py'
--- nova/db/sqlalchemy/api.py 2011-03-09 21:27:38 +0000
+++ nova/db/sqlalchemy/api.py 2011-03-10 06:27:59 +0000
@@ -118,6 +118,11 @@
118 service_ref = service_get(context, service_id, session=session)
119 service_ref.delete(session=session)
120
121 if service_ref.topic == 'compute' and \
122 len(service_ref.compute_node) != 0:
123 for c in service_ref.compute_node:
124 c.delete(session=session)
125
126
127@require_admin_context
128def service_get(context, service_id, session=None):
@@ -125,6 +130,7 @@
130 session = get_session()
131
132 result = session.query(models.Service).\
133 options(joinedload('compute_node')).\
134 filter_by(id=service_id).\
135 filter_by(deleted=can_read_deleted(context)).\
136 first()
@@ -175,6 +181,24 @@
181
182
183@require_admin_context
184def service_get_all_compute_by_host(context, host):
185 topic = 'compute'
186 session = get_session()
187 result = session.query(models.Service).\
188 options(joinedload('compute_node')).\
189 filter_by(deleted=False).\
190 filter_by(host=host).\
191 filter_by(topic=topic).\
192 all()
193
194 if not result:
195 raise exception.NotFound(_("%s does not exist or is not "
196 "a compute node.") % host)
197
198 return result
199
200
201@require_admin_context
202def _service_get_all_topic_subquery(context, session, topic, subq, label):
203 sort_value = getattr(subq.c, label)
204 return session.query(models.Service, func.coalesce(sort_value, 0)).\
@@ -285,6 +309,42 @@
309
310
311@require_admin_context
312def compute_node_get(context, compute_id, session=None):
313 if not session:
314 session = get_session()
315
316 result = session.query(models.ComputeNode).\
317 filter_by(id=compute_id).\
318 filter_by(deleted=can_read_deleted(context)).\
319 first()
320
321 if not result:
322 raise exception.NotFound(_('No computeNode for id %s') % compute_id)
323
324 return result
325
326
327@require_admin_context
328def compute_node_create(context, values):
329 compute_node_ref = models.ComputeNode()
330 compute_node_ref.update(values)
331 compute_node_ref.save()
332 return compute_node_ref
333
334
335@require_admin_context
336def compute_node_update(context, compute_id, values):
337 session = get_session()
338 with session.begin():
339 compute_ref = compute_node_get(context, compute_id, session=session)
340 compute_ref.update(values)
341 compute_ref.save(session=session)
342
343
344###################
345
346
347@require_admin_context
348def certificate_get(context, certificate_id, session=None):
349 if not session:
350 session = get_session()
@@ -505,6 +565,16 @@
565 return result
566
567
568@require_context
569def floating_ip_update(context, address, values):
570 session = get_session()
571 with session.begin():
572 floating_ip_ref = floating_ip_get_by_address(context, address, session)
573 for (key, value) in values.iteritems():
574 floating_ip_ref[key] = value
575 floating_ip_ref.save(session=session)
576
577
578###################
579
580
@@ -905,6 +975,45 @@
975
976
977@require_context
978def instance_get_vcpu_sum_by_host_and_project(context, hostname, proj_id):
979 session = get_session()
980 result = session.query(models.Instance).\
981 filter_by(host=hostname).\
982 filter_by(project_id=proj_id).\
983 filter_by(deleted=False).\
984 value(func.sum(models.Instance.vcpus))
985 if not result:
986 return 0
987 return result
988
989
990@require_context
991def instance_get_memory_sum_by_host_and_project(context, hostname, proj_id):
992 session = get_session()
993 result = session.query(models.Instance).\
994 filter_by(host=hostname).\
995 filter_by(project_id=proj_id).\
996 filter_by(deleted=False).\
997 value(func.sum(models.Instance.memory_mb))
998 if not result:
999 return 0
1000 return result
1001
1002
1003@require_context
1004def instance_get_disk_sum_by_host_and_project(context, hostname, proj_id):
1005 session = get_session()
1006 result = session.query(models.Instance).\
1007 filter_by(host=hostname).\
1008 filter_by(project_id=proj_id).\
1009 filter_by(deleted=False).\
1010 value(func.sum(models.Instance.local_gb))
1011 if not result:
1012 return 0
1013 return result
1014
1015
1016@require_context
1017def instance_action_create(context, values):
1018 """Create an instance action from the values dictionary."""
1019 action_ref = models.InstanceActions()
@@ -1522,6 +1631,18 @@
1631 all()
1632
1633
1634@require_admin_context
1635def volume_get_all_by_instance(context, instance_id):
1636 session = get_session()
1637 result = session.query(models.Volume).\
1638 filter_by(instance_id=instance_id).\
1639 filter_by(deleted=False).\
1640 all()
1641 if not result:
1642 raise exception.NotFound(_('No volume for instance %s') % instance_id)
1643 return result
1644
1645
1646@require_context
1647def volume_get_all_by_project(context, project_id):
1648 authorize_project_context(context, project_id)
1649
=== added file 'nova/db/sqlalchemy/migrate_repo/versions/010_add_live_migration.py'
--- nova/db/sqlalchemy/migrate_repo/versions/010_add_live_migration.py 1970-01-01 00:00:00 +0000
+++ nova/db/sqlalchemy/migrate_repo/versions/010_add_live_migration.py 2011-03-10 06:27:59 +0000
@@ -0,0 +1,83 @@
1# vim: tabstop=4 shiftwidth=4 softtabstop=4
2
3# Copyright 2010 United States Government as represented by the
4# Administrator of the National Aeronautics and Space Administration.
5# All Rights Reserved.
6#
7# Licensed under the Apache License, Version 2.0 (the "License"); you may
8# not use this file except in compliance with the License. You may obtain
9# a copy of the License at
10#
11# http://www.apache.org/licenses/LICENSE-2.0
12#
13# Unless required by applicable law or agreed to in writing, software
14# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
15# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
16# License for the specific language governing permissions and limitations
17# under the License.
18
19from migrate import *
20from nova import log as logging
21from sqlalchemy import *
22
23
24meta = MetaData()
25
26instances = Table('instances', meta,
27 Column('id', Integer(), primary_key=True, nullable=False),
28 )
29
30#
31# New Tables
32#
33
34compute_nodes = Table('compute_nodes', meta,
35 Column('created_at', DateTime(timezone=False)),
36 Column('updated_at', DateTime(timezone=False)),
37 Column('deleted_at', DateTime(timezone=False)),
38 Column('deleted', Boolean(create_constraint=True, name=None)),
39 Column('id', Integer(), primary_key=True, nullable=False),
40 Column('service_id', Integer(), nullable=False),
41
42 Column('vcpus', Integer(), nullable=False),
43 Column('memory_mb', Integer(), nullable=False),
44 Column('local_gb', Integer(), nullable=False),
45 Column('vcpus_used', Integer(), nullable=False),
46 Column('memory_mb_used', Integer(), nullable=False),
47 Column('local_gb_used', Integer(), nullable=False),
48 Column('hypervisor_type',
49 Text(convert_unicode=False, assert_unicode=None,
50 unicode_error=None, _warn_on_bytestring=False),
51 nullable=False),
52 Column('hypervisor_version', Integer(), nullable=False),
53 Column('cpu_info',
54 Text(convert_unicode=False, assert_unicode=None,
55 unicode_error=None, _warn_on_bytestring=False),
56 nullable=False),
57 )
58
59
60#
61# Tables to alter
62#
63instances_launched_on = Column(
64 'launched_on',
65 Text(convert_unicode=False, assert_unicode=None,
66 unicode_error=None, _warn_on_bytestring=False),
67 nullable=True)
68
69
70def upgrade(migrate_engine):
71 # Upgrade operations go here. Don't create your own engine;
72 # bind migrate_engine to your metadata
73 meta.bind = migrate_engine
74
75 try:
76 compute_nodes.create()
77 except Exception:
78 logging.info(repr(compute_nodes))
79 logging.exception('Exception while creating table')
80 meta.drop_all(tables=[compute_nodes])
81 raise
82
83 instances.create_column(instances_launched_on)
84
=== modified file 'nova/db/sqlalchemy/models.py'
--- nova/db/sqlalchemy/models.py 2011-03-03 19:13:15 +0000
+++ nova/db/sqlalchemy/models.py 2011-03-10 06:27:59 +0000
@@ -113,6 +113,41 @@
113 availability_zone = Column(String(255), default='nova')
114
115
116class ComputeNode(BASE, NovaBase):
117 """Represents a running compute service on a host."""
118
119 __tablename__ = 'compute_nodes'
120 id = Column(Integer, primary_key=True)
121 service_id = Column(Integer, ForeignKey('services.id'), nullable=True)
122 service = relationship(Service,
123 backref=backref('compute_node'),
124 foreign_keys=service_id,
125 primaryjoin='and_('
126 'ComputeNode.service_id == Service.id,'
127 'ComputeNode.deleted == False)')
128
129 vcpus = Column(Integer, nullable=True)
130 memory_mb = Column(Integer, nullable=True)
131 local_gb = Column(Integer, nullable=True)
132 vcpus_used = Column(Integer, nullable=True)
133 memory_mb_used = Column(Integer, nullable=True)
134 local_gb_used = Column(Integer, nullable=True)
135 hypervisor_type = Column(Text, nullable=True)
136 hypervisor_version = Column(Integer, nullable=True)
137
138 # Note(masumotok): Expected Strings example:
139 #
140 # '{"arch":"x86_64",
141 # "model":"Nehalem",
142 # "topology":{"sockets":1, "threads":2, "cores":3},
143 # "features":["tdtscp", "xtpr"]}'
144 #
145 # Points are "json translatable" and it must have all dictionary keys
146 # above, since it is copied from <cpu> tag of getCapabilities()
147 # (See libvirt.virtConnection).
148 cpu_info = Column(Text, nullable=True)
149
150
151class Certificate(BASE, NovaBase):
152 """Represents a an x509 certificate"""
153 __tablename__ = 'certificates'
@@ -191,6 +226,9 @@
226 display_name = Column(String(255))
227 display_description = Column(String(255))
228
229 # To remember on which host a instance booted.
230 # An instance may have moved to another host by live migraiton.
231 launched_on = Column(Text)
232 locked = Column(Boolean)
233
234 # TODO(vish): see Ewan's email about state improvements, probably
235
=== modified file 'nova/scheduler/driver.py'
--- nova/scheduler/driver.py 2011-01-18 19:01:16 +0000
+++ nova/scheduler/driver.py 2011-03-10 06:27:59 +0000
@@ -26,10 +26,14 @@
26from nova import db
27from nova import exception
28from nova import flags
29from nova import log as logging
30from nova import rpc
31from nova.compute import power_state
32
33FLAGS = flags.FLAGS
34flags.DEFINE_integer('service_down_time', 60,
35 'maximum time since last checkin for up service')
36flags.DECLARE('instances_path', 'nova.compute.manager')
37
38
39class NoValidHost(exception.Error):
@@ -64,3 +68,236 @@
68 def schedule(self, context, topic, *_args, **_kwargs):
69 """Must override at least this method for scheduler to work."""
70 raise NotImplementedError(_("Must implement a fallback schedule"))
71
72 def schedule_live_migration(self, context, instance_id, dest):
73 """Live migration scheduling method.
74
75 :param context:
76 :param instance_id:
77 :param dest: destination host
78 :return:
79 The host where instance is running currently.
80 Then scheduler send request that host.
81
82 """
83
84 # Whether instance exists and is running.
85 instance_ref = db.instance_get(context, instance_id)
86
87 # Checking instance.
88 self._live_migration_src_check(context, instance_ref)
89
90 # Checking destination host.
91 self._live_migration_dest_check(context, instance_ref, dest)
92
93 # Common checking.
94 self._live_migration_common_check(context, instance_ref, dest)
95
96 # Changing instance_state.
97 db.instance_set_state(context,
98 instance_id,
99 power_state.PAUSED,
100 'migrating')
101
102 # Changing volume state
103 for volume_ref in instance_ref['volumes']:
104 db.volume_update(context,
105 volume_ref['id'],
106 {'status': 'migrating'})
107
108 # Return value is necessary to send request to src
109 # Check _schedule() in detail.
110 src = instance_ref['host']
111 return src
112
113 def _live_migration_src_check(self, context, instance_ref):
114 """Live migration check routine (for src host).
115
116 :param context: security context
117 :param instance_ref: nova.db.sqlalchemy.models.Instance object
118
119 """
120
121 # Checking instance is running.
122 if (power_state.RUNNING != instance_ref['state'] or \
123 'running' != instance_ref['state_description']):
124 ec2_id = instance_ref['hostname']
125 raise exception.Invalid(_('Instance(%s) is not running') % ec2_id)
126
127 # Checing volume node is running when any volumes are mounted
128 # to the instance.
129 if len(instance_ref['volumes']) != 0:
130 services = db.service_get_all_by_topic(context, 'volume')
131 if len(services) < 1 or not self.service_is_up(services[0]):
132 raise exception.Invalid(_("volume node is not alive"
133 "(time synchronize problem?)"))
134
135 # Checking src host exists and compute node
136 src = instance_ref['host']
137 services = db.service_get_all_compute_by_host(context, src)
138
139 # Checking src host is alive.
140 if not self.service_is_up(services[0]):
141 raise exception.Invalid(_("%s is not alive(time "
142 "synchronize problem?)") % src)
143
144 def _live_migration_dest_check(self, context, instance_ref, dest):
145 """Live migration check routine (for destination host).
146
147 :param context: security context
148 :param instance_ref: nova.db.sqlalchemy.models.Instance object
149 :param dest: destination host
150
151 """
152
153 # Checking dest exists and compute node.
154 dservice_refs = db.service_get_all_compute_by_host(context, dest)
155 dservice_ref = dservice_refs[0]
156
157 # Checking dest host is alive.
158 if not self.service_is_up(dservice_ref):
159 raise exception.Invalid(_("%s is not alive(time "
160 "synchronize problem?)") % dest)
161
162 # Checking whether The host where instance is running
163 # and dest is not same.
164 src = instance_ref['host']
165 if dest == src:
166 ec2_id = instance_ref['hostname']
167 raise exception.Invalid(_("%(dest)s is where %(ec2_id)s is "
168 "running now. choose other host.")
169 % locals())
170
171 # Checking dst host still has enough capacities.
172 self.assert_compute_node_has_enough_resources(context,
173 instance_ref,
174 dest)
175
176 def _live_migration_common_check(self, context, instance_ref, dest):
177 """Live migration common check routine.
178
179 Below checkings are followed by
180 http://wiki.libvirt.org/page/TodoPreMigrationChecks
181
182 :param context: security context
183 :param instance_ref: nova.db.sqlalchemy.models.Instance object
184 :param dest: destination host
185
186 """
187
188 # Checking shared storage connectivity
189 self.mounted_on_same_shared_storage(context, instance_ref, dest)
190
191 # Checking dest exists.
192 dservice_refs = db.service_get_all_compute_by_host(context, dest)
193 dservice_ref = dservice_refs[0]['compute_node'][0]
194
195 # Checking original host( where instance was launched at) exists.
196 try:
197 oservice_refs = db.service_get_all_compute_by_host(context,
198 instance_ref['launched_on'])
199 except exception.NotFound:
200 raise exception.Invalid(_("host %s where instance was launched "
201 "does not exist.")
202 % instance_ref['launched_on'])
203 oservice_ref = oservice_refs[0]['compute_node'][0]
204
205 # Checking hypervisor is same.
206 orig_hypervisor = oservice_ref['hypervisor_type']
207 dest_hypervisor = dservice_ref['hypervisor_type']
208 if orig_hypervisor != dest_hypervisor:
209 raise exception.Invalid(_("Different hypervisor type"
210 "(%(orig_hypervisor)s->"
211 "%(dest_hypervisor)s)')" % locals()))
212
213 # Checkng hypervisor version.
214 orig_hypervisor = oservice_ref['hypervisor_version']
215 dest_hypervisor = dservice_ref['hypervisor_version']
216 if orig_hypervisor > dest_hypervisor:
217 raise exception.Invalid(_("Older hypervisor version"
218 "(%(orig_hypervisor)s->"
219 "%(dest_hypervisor)s)") % locals())
220
221 # Checking cpuinfo.
222 try:
223 rpc.call(context,
224 db.queue_get_for(context, FLAGS.compute_topic, dest),
225 {"method": 'compare_cpu',
226 "args": {'cpu_info': oservice_ref['cpu_info']}})
227
228 except rpc.RemoteError:
229 src = instance_ref['host']
230 logging.exception(_("host %(dest)s is not compatible with "
231 "original host %(src)s.") % locals())
232 raise
233
234 def assert_compute_node_has_enough_resources(self, context,
235 instance_ref, dest):
236 """Checks if destination host has enough resource for live migration.
237
238 Currently, only memory checking has been done.
239 If storage migration(block migration, meaning live-migration
240 without any shared storage) will be available, local storage
241 checking is also necessary.
242
243 :param context: security context
244 :param instance_ref: nova.db.sqlalchemy.models.Instance object
245 :param dest: destination host
246
247 """
248
249 # Getting instance information
250 ec2_id = instance_ref['hostname']
251
252 # Getting host information
253 service_refs = db.service_get_all_compute_by_host(context, dest)
254 compute_node_ref = service_refs[0]['compute_node'][0]
255
256 mem_total = int(compute_node_ref['memory_mb'])
257 mem_used = int(compute_node_ref['memory_mb_used'])
258 mem_avail = mem_total - mem_used
259 mem_inst = instance_ref['memory_mb']
260 if mem_avail <= mem_inst:
261 raise exception.NotEmpty(_("Unable to migrate %(ec2_id)s "
262 "to destination: %(dest)s "
263 "(host:%(mem_avail)s "
264 "<= instance:%(mem_inst)s)")
265 % locals())
266
267 def mounted_on_same_shared_storage(self, context, instance_ref, dest):
268 """Check if the src and dest host mount same shared storage.
269
270 At first, dest host creates temp file, and src host can see
271 it if they mounts same shared storage. Then src host erase it.
272
273 :param context: security context
274 :param instance_ref: nova.db.sqlalchemy.models.Instance object
275 :param dest: destination host
276
277 """
278
279 src = instance_ref['host']
280 dst_t = db.queue_get_for(context, FLAGS.compute_topic, dest)
281 src_t = db.queue_get_for(context, FLAGS.compute_topic, src)
282
283 try:
284 # create tmpfile at dest host
285 filename = rpc.call(context, dst_t,
286 {"method": 'create_shared_storage_test_file'})
287
288 # make sure existence at src host.
289 rpc.call(context, src_t,
290 {"method": 'check_shared_storage_test_file',
291 "args": {'filename': filename}})
292
293 except rpc.RemoteError:
294 ipath = FLAGS.instances_path
295 logging.error(_("Cannot confirm tmpfile at %(ipath)s is on "
296 "same shared storage between %(src)s "
297 "and %(dest)s.") % locals())
298 raise
299
300 finally:
301 rpc.call(context, dst_t,
302 {"method": 'cleanup_shared_storage_test_file',
303 "args": {'filename': filename}})
304
=== modified file 'nova/scheduler/manager.py'
--- nova/scheduler/manager.py 2011-01-19 15:41:30 +0000
+++ nova/scheduler/manager.py 2011-03-10 06:27:59 +0000
@@ -67,3 +67,55 @@
67 {"method": method,67 {"method": method,
68 "args": kwargs})68 "args": kwargs})
69 LOG.debug(_("Casting to %(topic)s %(host)s for %(method)s") % locals())69 LOG.debug(_("Casting to %(topic)s %(host)s for %(method)s") % locals())
70
71 # NOTE (masumotok) : This method should be moved to nova.api.ec2.admin.
72 # Based on bexar design summit discussion,
73 # just put this here for bexar release.
74 def show_host_resources(self, context, host, *args):
75 """Shows the physical/usage resource given by hosts.
76
77 :param context: security context
78 :param host: hostname
79 :returns:
80 example format is below.
81 {'resource':D, 'usage':{proj_id1:D, proj_id2:D}}
82 D: {'vcpus':3, 'memory_mb':2048, 'local_gb':2048}
83
84 """
85
86 compute_ref = db.service_get_all_compute_by_host(context, host)
87 compute_ref = compute_ref[0]
88
89 # Getting physical resource information
90 compute_node_ref = compute_ref['compute_node'][0]
91 resource = {'vcpus': compute_node_ref['vcpus'],
92 'memory_mb': compute_node_ref['memory_mb'],
93 'local_gb': compute_node_ref['local_gb'],
94 'vcpus_used': compute_node_ref['vcpus_used'],
95 'memory_mb_used': compute_node_ref['memory_mb_used'],
96 'local_gb_used': compute_node_ref['local_gb_used']}
97
98 # Getting usage resource information
99 usage = {}
100 instance_refs = db.instance_get_all_by_host(context,
101 compute_ref['host'])
102 if not instance_refs:
103 return {'resource': resource, 'usage': usage}
104
105 project_ids = [i['project_id'] for i in instance_refs]
106 project_ids = list(set(project_ids))
107 for project_id in project_ids:
108 vcpus = db.instance_get_vcpu_sum_by_host_and_project(context,
109 host,
110 project_id)
111 mem = db.instance_get_memory_sum_by_host_and_project(context,
112 host,
113 project_id)
114 hdd = db.instance_get_disk_sum_by_host_and_project(context,
115 host,
116 project_id)
117 usage[project_id] = {'vcpus': int(vcpus),
118 'memory_mb': int(mem),
119 'local_gb': int(hdd)}
120
121 return {'resource': resource, 'usage': usage}
122
=== modified file 'nova/service.py'
--- nova/service.py 2011-03-09 00:51:05 +0000
+++ nova/service.py 2011-03-10 06:27:59 +0000
@@ -92,6 +92,9 @@
92 except exception.NotFound:
93 self._create_service_ref(ctxt)
94
95 if 'nova-compute' == self.binary:
96 self.manager.update_available_resource(ctxt)
97
98 conn1 = rpc.Connection.instance(new=True)
99 conn2 = rpc.Connection.instance(new=True)
100 if self.report_interval:
101
=== modified file 'nova/tests/test_compute.py'
--- nova/tests/test_compute.py 2011-03-10 04:42:11 +0000
+++ nova/tests/test_compute.py 2011-03-10 06:27:59 +0000
@@ -20,6 +20,7 @@
20"""20"""
2121
22import datetime22import datetime
23import mox
24
25from nova import compute
26from nova import context
@@ -27,15 +28,20 @@
28from nova import exception
29from nova import flags
30from nova import log as logging
31from nova import rpc
32from nova import test
33from nova import utils
34from nova.auth import manager
35from nova.compute import instance_types
36from nova.compute import manager as compute_manager
37from nova.compute import power_state
38from nova.db.sqlalchemy import models
39from nova.image import local
40
41LOG = logging.getLogger('nova.tests.compute')
42FLAGS = flags.FLAGS
43flags.DECLARE('stub_network', 'nova.compute.manager')
44flags.DECLARE('live_migration_retry_count', 'nova.compute.manager')
45
46
47class ComputeTestCase(test.TestCase):
@@ -83,6 +89,41 @@
89 'project_id': self.project.id}
90 return db.security_group_create(self.context, values)
91
92 def _get_dummy_instance(self):
93 """Get mock-return-value instance object
94 Use this when any testcase executed later than test_run_terminate
95 """
96 vol1 = models.Volume()
97 vol1['id'] = 1
98 vol2 = models.Volume()
99 vol2['id'] = 2
100 instance_ref = models.Instance()
101 instance_ref['id'] = 1
102 instance_ref['volumes'] = [vol1, vol2]
103 instance_ref['hostname'] = 'i-00000001'
104 instance_ref['host'] = 'dummy'
105 return instance_ref
106
107 def test_create_instance_defaults_display_name(self):
108 """Verify that an instance cannot be created without a display_name."""
109 cases = [dict(), dict(display_name=None)]
110 for instance in cases:
111 ref = self.compute_api.create(self.context,
112 FLAGS.default_instance_type, None, **instance)
113 try:
114 self.assertNotEqual(ref[0]['display_name'], None)
115 finally:
116 db.instance_destroy(self.context, ref[0]['id'])
117
118 def test_create_instance_associates_security_groups(self):
119 """Make sure create associates security groups"""
120 group = self._create_group()
121 instance_ref = models.Instance()
122 instance_ref['id'] = 1
123 instance_ref['volumes'] = [{'id': 1}, {'id': 2}]
124 instance_ref['hostname'] = 'i-00000001'
125 return instance_ref
126
127 def test_create_instance_defaults_display_name(self):
128 """Verify that an instance cannot be created without a display_name."""
129 cases = [dict(), dict(display_name=None)]
@@ -301,3 +342,256 @@
342 self.compute.terminate_instance(self.context, instance_id)
343 type = instance_types.get_by_flavor_id("1")
344 self.assertEqual(type, 'm1.tiny')
345
346 def _setup_other_managers(self):
347 self.volume_manager = utils.import_object(FLAGS.volume_manager)
348 self.network_manager = utils.import_object(FLAGS.network_manager)
349 self.compute_driver = utils.import_object(FLAGS.compute_driver)
350
351 def test_pre_live_migration_instance_has_no_fixed_ip(self):
352 """Confirm raising exception if instance doesn't have fixed_ip."""
353 instance_ref = self._get_dummy_instance()
354 c = context.get_admin_context()
355 i_id = instance_ref['id']
356
357 dbmock = self.mox.CreateMock(db)
358 dbmock.instance_get(c, i_id).AndReturn(instance_ref)
359 dbmock.instance_get_fixed_address(c, i_id).AndReturn(None)
360
361 self.compute.db = dbmock
362 self.mox.ReplayAll()
363 self.assertRaises(exception.NotFound,
364 self.compute.pre_live_migration,
365 c, instance_ref['id'])
366
367 def test_pre_live_migration_instance_has_volume(self):
368 """Confirm setup_compute_volume is called when volume is mounted."""
369 i_ref = self._get_dummy_instance()
370 c = context.get_admin_context()
371
372 self._setup_other_managers()
373 dbmock = self.mox.CreateMock(db)
374 volmock = self.mox.CreateMock(self.volume_manager)
375 netmock = self.mox.CreateMock(self.network_manager)
376 drivermock = self.mox.CreateMock(self.compute_driver)
377
378 dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
379 dbmock.instance_get_fixed_address(c, i_ref['id']).AndReturn('dummy')
380 for i in range(len(i_ref['volumes'])):
381 vid = i_ref['volumes'][i]['id']
382 volmock.setup_compute_volume(c, vid).InAnyOrder('g1')
383 netmock.setup_compute_network(c, i_ref['id'])
384 drivermock.ensure_filtering_rules_for_instance(i_ref)
385
386 self.compute.db = dbmock
387 self.compute.volume_manager = volmock
388 self.compute.network_manager = netmock
389 self.compute.driver = drivermock
390
391 self.mox.ReplayAll()
392 ret = self.compute.pre_live_migration(c, i_ref['id'])
393 self.assertEqual(ret, None)
394
395 def test_pre_live_migration_instance_has_no_volume(self):
396 """Confirm log meg when instance doesn't mount any volumes."""
397 i_ref = self._get_dummy_instance()
398 i_ref['volumes'] = []
399 c = context.get_admin_context()
400
401 self._setup_other_managers()
402 dbmock = self.mox.CreateMock(db)
403 netmock = self.mox.CreateMock(self.network_manager)
404 drivermock = self.mox.CreateMock(self.compute_driver)
405
406 dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
407 dbmock.instance_get_fixed_address(c, i_ref['id']).AndReturn('dummy')
408 self.mox.StubOutWithMock(compute_manager.LOG, 'info')
409 compute_manager.LOG.info(_("%s has no volume."), i_ref['hostname'])
410 netmock.setup_compute_network(c, i_ref['id'])
411 drivermock.ensure_filtering_rules_for_instance(i_ref)
412
413 self.compute.db = dbmock
414 self.compute.network_manager = netmock
415 self.compute.driver = drivermock
416
417 self.mox.ReplayAll()
418 ret = self.compute.pre_live_migration(c, i_ref['id'])
419 self.assertEqual(ret, None)
420
421 def test_pre_live_migration_setup_compute_node_fail(self):
422 """Confirm operation setup_compute_network() fails.
423
424 It retries and raise exception when timeout exceeded.
425
426 """
427
428 i_ref = self._get_dummy_instance()
429 c = context.get_admin_context()
430
431 self._setup_other_managers()
432 dbmock = self.mox.CreateMock(db)
433 netmock = self.mox.CreateMock(self.network_manager)
434 volmock = self.mox.CreateMock(self.volume_manager)
435
436 dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
437 dbmock.instance_get_fixed_address(c, i_ref['id']).AndReturn('dummy')
438 for i in range(len(i_ref['volumes'])):
439 volmock.setup_compute_volume(c, i_ref['volumes'][i]['id'])
440 for i in range(FLAGS.live_migration_retry_count):
441 netmock.setup_compute_network(c, i_ref['id']).\
442 AndRaise(exception.ProcessExecutionError())
443
444 self.compute.db = dbmock
445 self.compute.network_manager = netmock
446 self.compute.volume_manager = volmock
447
448 self.mox.ReplayAll()
449 self.assertRaises(exception.ProcessExecutionError,
450 self.compute.pre_live_migration,
451 c, i_ref['id'])
452
453 def test_live_migration_works_correctly_with_volume(self):
454 """Confirm check_for_export to confirm volume health check."""
455 i_ref = self._get_dummy_instance()
456 c = context.get_admin_context()
457 topic = db.queue_get_for(c, FLAGS.compute_topic, i_ref['host'])
458
459 dbmock = self.mox.CreateMock(db)
460 dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
461 self.mox.StubOutWithMock(rpc, 'call')
462 rpc.call(c, FLAGS.volume_topic, {"method": "check_for_export",
463 "args": {'instance_id': i_ref['id']}})
464 dbmock.queue_get_for(c, FLAGS.compute_topic, i_ref['host']).\
465 AndReturn(topic)
466 rpc.call(c, topic, {"method": "pre_live_migration",
467 "args": {'instance_id': i_ref['id']}})
468 self.mox.StubOutWithMock(self.compute.driver, 'live_migration')
469 self.compute.driver.live_migration(c, i_ref, i_ref['host'],
470 self.compute.post_live_migration,
471 self.compute.recover_live_migration)
472
473 self.compute.db = dbmock
474 self.mox.ReplayAll()
475 ret = self.compute.live_migration(c, i_ref['id'], i_ref['host'])
476 self.assertEqual(ret, None)
477
478 def test_live_migration_dest_raises_exception(self):
479 """Confirm exception when pre_live_migration fails."""
480 i_ref = self._get_dummy_instance()
481 c = context.get_admin_context()
482 topic = db.queue_get_for(c, FLAGS.compute_topic, i_ref['host'])
483
484 dbmock = self.mox.CreateMock(db)
485 dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
486 self.mox.StubOutWithMock(rpc, 'call')
487 rpc.call(c, FLAGS.volume_topic, {"method": "check_for_export",
488 "args": {'instance_id': i_ref['id']}})
489 dbmock.queue_get_for(c, FLAGS.compute_topic, i_ref['host']).\
490 AndReturn(topic)
491 rpc.call(c, topic, {"method": "pre_live_migration",
492 "args": {'instance_id': i_ref['id']}}).\
493 AndRaise(rpc.RemoteError('', '', ''))
494 dbmock.instance_update(c, i_ref['id'], {'state_description': 'running',
495 'state': power_state.RUNNING,
496 'host': i_ref['host']})
497 for v in i_ref['volumes']:
498 dbmock.volume_update(c, v['id'], {'status': 'in-use'})
499
500 self.compute.db = dbmock
501 self.mox.ReplayAll()
502 self.assertRaises(rpc.RemoteError,
503 self.compute.live_migration,
504 c, i_ref['id'], i_ref['host'])
505
506 def test_live_migration_dest_raises_exception_no_volume(self):
507 """Same as above test(input pattern is different) """
508 i_ref = self._get_dummy_instance()
509 i_ref['volumes'] = []
510 c = context.get_admin_context()
511 topic = db.queue_get_for(c, FLAGS.compute_topic, i_ref['host'])
512
513 dbmock = self.mox.CreateMock(db)
514 dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
515 dbmock.queue_get_for(c, FLAGS.compute_topic, i_ref['host']).\
516 AndReturn(topic)
517 self.mox.StubOutWithMock(rpc, 'call')
518 rpc.call(c, topic, {"method": "pre_live_migration",
519 "args": {'instance_id': i_ref['id']}}).\
520 AndRaise(rpc.RemoteError('', '', ''))
521 dbmock.instance_update(c, i_ref['id'], {'state_description': 'running',
522 'state': power_state.RUNNING,
523 'host': i_ref['host']})
524
525 self.compute.db = dbmock
526 self.mox.ReplayAll()
527 self.assertRaises(rpc.RemoteError,
528 self.compute.live_migration,
529 c, i_ref['id'], i_ref['host'])
530
531 def test_live_migration_works_correctly_no_volume(self):
532 """Confirm live_migration() works as expected correctly."""
533 i_ref = self._get_dummy_instance()
534 i_ref['volumes'] = []
535 c = context.get_admin_context()
536 topic = db.queue_get_for(c, FLAGS.compute_topic, i_ref['host'])
537
538 dbmock = self.mox.CreateMock(db)
539 dbmock.instance_get(c, i_ref['id']).AndReturn(i_ref)
540 self.mox.StubOutWithMock(rpc, 'call')
541 dbmock.queue_get_for(c, FLAGS.compute_topic, i_ref['host']).\
542 AndReturn(topic)
543 rpc.call(c, topic, {"method": "pre_live_migration",
544 "args": {'instance_id': i_ref['id']}})
545 self.mox.StubOutWithMock(self.compute.driver, 'live_migration')
546 self.compute.driver.live_migration(c, i_ref, i_ref['host'],
547 self.compute.post_live_migration,
548 self.compute.recover_live_migration)
549
550 self.compute.db = dbmock
551 self.mox.ReplayAll()
552 ret = self.compute.live_migration(c, i_ref['id'], i_ref['host'])
553 self.assertEqual(ret, None)
554
555 def test_post_live_migration_working_correctly(self):
556 """Confirm post_live_migration() works as expected correctly."""
557 dest = 'desthost'
558 flo_addr = '1.2.1.2'
559
560 # Preparing data
561 c = context.get_admin_context()
562 instance_id = self._create_instance()
563 i_ref = db.instance_get(c, instance_id)
564 db.instance_update(c, i_ref['id'], {'state_description': 'migrating',
565 'state': power_state.PAUSED})
566 v_ref = db.volume_create(c, {'size': 1, 'instance_id': instance_id})
567 fix_addr = db.fixed_ip_create(c, {'address': '1.1.1.1',
568 'instance_id': instance_id})
569 fix_ref = db.fixed_ip_get_by_address(c, fix_addr)
570 flo_ref = db.floating_ip_create(c, {'address': flo_addr,
571 'fixed_ip_id': fix_ref['id']})
572 # reload is necessary before setting mocks
573 i_ref = db.instance_get(c, instance_id)
574
575 # Preparing mocks
576 self.mox.StubOutWithMock(self.compute.volume_manager,
577 'remove_compute_volume')
578 for v in i_ref['volumes']:
579 self.compute.volume_manager.remove_compute_volume(c, v['id'])
580 self.mox.StubOutWithMock(self.compute.driver, 'unfilter_instance')
581 self.compute.driver.unfilter_instance(i_ref)
582
583 # executing
584 self.mox.ReplayAll()
585 ret = self.compute.post_live_migration(c, i_ref, dest)
586
587 # make sure all data is rewritten to dest
588 i_ref = db.instance_get(c, i_ref['id'])
589 c1 = (i_ref['host'] == dest)
590 flo_refs = db.floating_ip_get_all_by_host(c, dest)
591 c2 = (len(flo_refs) != 0 and flo_refs[0]['address'] == flo_addr)
592
593 # post operation
594 self.assertTrue(c1 and c2)
595 db.instance_destroy(c, instance_id)
596 db.volume_destroy(c, v_ref['id'])
597 db.floating_ip_destroy(c, flo_addr)
304598
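The compute-manager tests above all follow the same mox record/replay pattern: expectations are recorded against a mocked `db` module and against `rpc.call`, `ReplayAll()` switches the mocks into replay mode, and the code under test must then issue exactly the recorded calls. A minimal, self-contained sketch of that pattern (the `FakeDB` class and the values used here are illustrative only, not part of this branch):

    import mox

    class FakeDB(object):
        def instance_get(self, context, instance_id):
            raise NotImplementedError()

    m = mox.Mox()
    dbmock = m.CreateMock(FakeDB)
    # record the expected call and its canned return value
    dbmock.instance_get('ctxt', 1).AndReturn({'id': 1, 'host': 'host1'})
    m.ReplayAll()
    # the code under test would normally make this exact call
    assert dbmock.instance_get('ctxt', 1)['host'] == 'host1'
    m.VerifyAll()   # fails if any recorded expectation was not satisfied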
=== modified file 'nova/tests/test_scheduler.py'
--- nova/tests/test_scheduler.py 2011-03-07 01:25:01 +0000
+++ nova/tests/test_scheduler.py 2011-03-10 06:27:59 +0000
@@ -20,10 +20,12 @@
20"""20"""
2121
22import datetime22import datetime
23import mox
2324
24from mox import IgnoreArg25from mox import IgnoreArg
25from nova import context26from nova import context
26from nova import db27from nova import db
28from nova import exception
27from nova import flags29from nova import flags
28from nova import service30from nova import service
29from nova import test31from nova import test
@@ -32,11 +34,14 @@
32from nova.auth import manager as auth_manager34from nova.auth import manager as auth_manager
33from nova.scheduler import manager35from nova.scheduler import manager
34from nova.scheduler import driver36from nova.scheduler import driver
37from nova.compute import power_state
38from nova.db.sqlalchemy import models
3539
3640
37FLAGS = flags.FLAGS41FLAGS = flags.FLAGS
38flags.DECLARE('max_cores', 'nova.scheduler.simple')42flags.DECLARE('max_cores', 'nova.scheduler.simple')
39flags.DECLARE('stub_network', 'nova.compute.manager')43flags.DECLARE('stub_network', 'nova.compute.manager')
44flags.DECLARE('instances_path', 'nova.compute.manager')
4045
4146
42class TestDriver(driver.Scheduler):47class TestDriver(driver.Scheduler):
@@ -54,6 +59,34 @@
54 super(SchedulerTestCase, self).setUp()59 super(SchedulerTestCase, self).setUp()
55 self.flags(scheduler_driver='nova.tests.test_scheduler.TestDriver')60 self.flags(scheduler_driver='nova.tests.test_scheduler.TestDriver')
5661
62 def _create_compute_service(self):
63 """Create compute-manager(ComputeNode and Service record)."""
64 ctxt = context.get_admin_context()
65 dic = {'host': 'dummy', 'binary': 'nova-compute', 'topic': 'compute',
66 'report_count': 0, 'availability_zone': 'dummyzone'}
67 s_ref = db.service_create(ctxt, dic)
68
69 dic = {'service_id': s_ref['id'],
70 'vcpus': 16, 'memory_mb': 32, 'local_gb': 100,
71 'vcpus_used': 16, 'memory_mb_used': 32, 'local_gb_used': 10,
72 'hypervisor_type': 'qemu', 'hypervisor_version': 12003,
73 'cpu_info': ''}
74 db.compute_node_create(ctxt, dic)
75
76 return db.service_get(ctxt, s_ref['id'])
77
78 def _create_instance(self, **kwargs):
79 """Create a test instance"""
80 ctxt = context.get_admin_context()
81 inst = {}
82 inst['user_id'] = 'admin'
83 inst['project_id'] = kwargs.get('project_id', 'fake')
84 inst['host'] = kwargs.get('host', 'dummy')
85 inst['vcpus'] = kwargs.get('vcpus', 1)
86 inst['memory_mb'] = kwargs.get('memory_mb', 10)
87 inst['local_gb'] = kwargs.get('local_gb', 20)
88 return db.instance_create(ctxt, inst)
89
90 def test_fallback(self):
91 scheduler = manager.SchedulerManager()
92 self.mox.StubOutWithMock(rpc, 'cast', use_mock_anything=True)
@@ -76,6 +109,73 @@
109 self.mox.ReplayAll()
110 scheduler.named_method(ctxt, 'topic', num=7)
111
112 def test_show_host_resources_host_not_exit(self):
113 """A host given as an argument does not exists."""
114
115 scheduler = manager.SchedulerManager()
116 dest = 'dummydest'
117 ctxt = context.get_admin_context()
118
119 try:
120 scheduler.show_host_resources(ctxt, dest)
121 except exception.NotFound, e:
122 c1 = (e.message.find(_("does not exist or is not a "
123 "compute node.")) >= 0)
124 self.assertTrue(c1)
125
126 def _dic_is_equal(self, dic1, dic2, keys=None):
127 """Compares 2 dictionary contents(Helper method)"""
128 if not keys:
129 keys = ['vcpus', 'memory_mb', 'local_gb',
130 'vcpus_used', 'memory_mb_used', 'local_gb_used']
131
132 for key in keys:
133 if not (dic1[key] == dic2[key]):
134 return False
135 return True
136
137 def test_show_host_resources_no_project(self):
138 """No instance are running on the given host."""
139
140 scheduler = manager.SchedulerManager()
141 ctxt = context.get_admin_context()
142 s_ref = self._create_compute_service()
143
144 result = scheduler.show_host_resources(ctxt, s_ref['host'])
145
146 # result checking
147 c1 = ('resource' in result and 'usage' in result)
148 compute_node = s_ref['compute_node'][0]
149 c2 = self._dic_is_equal(result['resource'], compute_node)
150 c3 = result['usage'] == {}
151 self.assertTrue(c1 and c2 and c3)
152 db.service_destroy(ctxt, s_ref['id'])
153
154 def test_show_host_resources_works_correctly(self):
155 """Show_host_resources() works correctly as expected."""
156
157 scheduler = manager.SchedulerManager()
158 ctxt = context.get_admin_context()
159 s_ref = self._create_compute_service()
160 i_ref1 = self._create_instance(project_id='p-01', host=s_ref['host'])
161 i_ref2 = self._create_instance(project_id='p-02', vcpus=3,
162 host=s_ref['host'])
163
164 result = scheduler.show_host_resources(ctxt, s_ref['host'])
165
166 c1 = ('resource' in result and 'usage' in result)
167 compute_node = s_ref['compute_node'][0]
168 c2 = self._dic_is_equal(result['resource'], compute_node)
169 c3 = result['usage'].keys() == ['p-01', 'p-02']
170 keys = ['vcpus', 'memory_mb', 'local_gb']
171 c4 = self._dic_is_equal(result['usage']['p-01'], i_ref1, keys)
172 c5 = self._dic_is_equal(result['usage']['p-02'], i_ref2, keys)
173 self.assertTrue(c1 and c2 and c3 and c4 and c5)
174
175 db.service_destroy(ctxt, s_ref['id'])
176 db.instance_destroy(ctxt, i_ref1['id'])
177 db.instance_destroy(ctxt, i_ref2['id'])
178
179
180class ZoneSchedulerTestCase(test.TestCase):
181 """Test case for zone scheduler"""
@@ -161,9 +261,15 @@
261 inst['project_id'] = self.project.id
262 inst['instance_type'] = 'm1.tiny'
263 inst['mac_address'] = utils.generate_mac()
264 inst['vcpus'] = kwargs.get('vcpus', 1)
265 inst['ami_launch_index'] = 0
165 inst['vcpus'] = 1
266 inst['availability_zone'] = kwargs.get('availability_zone', None)
267 inst['host'] = kwargs.get('host', 'dummy')
268 inst['memory_mb'] = kwargs.get('memory_mb', 20)
269 inst['local_gb'] = kwargs.get('local_gb', 30)
270 inst['launched_on'] = kwargs.get('launched_on', 'dummy')
271 inst['state_description'] = kwargs.get('state_description', 'running')
272 inst['state'] = kwargs.get('state', power_state.RUNNING)
273 return db.instance_create(self.context, inst)['id']
274
275 def _create_volume(self):
@@ -173,6 +279,211 @@
279 vol['availability_zone'] = 'test'
280 return db.volume_create(self.context, vol)['id']
281
282 def _create_compute_service(self, **kwargs):
283 """Create a compute service."""
284
285 dic = {'binary': 'nova-compute', 'topic': 'compute',
286 'report_count': 0, 'availability_zone': 'dummyzone'}
287 dic['host'] = kwargs.get('host', 'dummy')
288 s_ref = db.service_create(self.context, dic)
289 if 'created_at' in kwargs.keys() or 'updated_at' in kwargs.keys():
290 t = datetime.datetime.utcnow() - datetime.timedelta(0)
291 dic['created_at'] = kwargs.get('created_at', t)
292 dic['updated_at'] = kwargs.get('updated_at', t)
293 db.service_update(self.context, s_ref['id'], dic)
294
295 dic = {'service_id': s_ref['id'],
296 'vcpus': 16, 'memory_mb': 32, 'local_gb': 100,
297 'vcpus_used': 16, 'local_gb_used': 10,
298 'hypervisor_type': 'qemu', 'hypervisor_version': 12003,
299 'cpu_info': ''}
300 dic['memory_mb_used'] = kwargs.get('memory_mb_used', 32)
301 dic['hypervisor_type'] = kwargs.get('hypervisor_type', 'qemu')
302 dic['hypervisor_version'] = kwargs.get('hypervisor_version', 12003)
303 db.compute_node_create(self.context, dic)
304 return db.service_get(self.context, s_ref['id'])
305
306 def test_doesnt_report_disabled_hosts_as_up(self):
307 """Ensures driver doesn't find hosts before they are enabled"""
308 # NOTE(vish): constructing service without create method
309 # because we are going to use it without queue
310 compute1 = service.Service('host1',
311 'nova-compute',
312 'compute',
313 FLAGS.compute_manager)
314 compute1.start()
315 compute2 = service.Service('host2',
316 'nova-compute',
317 'compute',
318 FLAGS.compute_manager)
319 compute2.start()
320 s1 = db.service_get_by_args(self.context, 'host1', 'nova-compute')
321 s2 = db.service_get_by_args(self.context, 'host2', 'nova-compute')
322 db.service_update(self.context, s1['id'], {'disabled': True})
323 db.service_update(self.context, s2['id'], {'disabled': True})
324 hosts = self.scheduler.driver.hosts_up(self.context, 'compute')
325 self.assertEqual(0, len(hosts))
326 compute1.kill()
327 compute2.kill()
328
329 def test_reports_enabled_hosts_as_up(self):
330 """Ensures driver can find the hosts that are up"""
331 # NOTE(vish): constructing service without create method
332 # because we are going to use it without queue
333 compute1 = service.Service('host1',
334 'nova-compute',
335 'compute',
336 FLAGS.compute_manager)
337 compute1.start()
338 compute2 = service.Service('host2',
339 'nova-compute',
340 'compute',
341 FLAGS.compute_manager)
342 compute2.start()
343 hosts = self.scheduler.driver.hosts_up(self.context, 'compute')
344 self.assertEqual(2, len(hosts))
345 compute1.kill()
346 compute2.kill()
347
348 def test_least_busy_host_gets_instance(self):
349 """Ensures the host with less cores gets the next one"""
350 compute1 = service.Service('host1',
351 'nova-compute',
352 'compute',
353 FLAGS.compute_manager)
354 compute1.start()
355 compute2 = service.Service('host2',
356 'nova-compute',
357 'compute',
358 FLAGS.compute_manager)
359 compute2.start()
360 instance_id1 = self._create_instance()
361 compute1.run_instance(self.context, instance_id1)
362 instance_id2 = self._create_instance()
363 host = self.scheduler.driver.schedule_run_instance(self.context,
364 instance_id2)
365 self.assertEqual(host, 'host2')
366 compute1.terminate_instance(self.context, instance_id1)
367 db.instance_destroy(self.context, instance_id2)
368 compute1.kill()
369 compute2.kill()
370
371 def test_specific_host_gets_instance(self):
372 """Ensures if you set availability_zone it launches on that zone"""
373 compute1 = service.Service('host1',
374 'nova-compute',
375 'compute',
376 FLAGS.compute_manager)
377 compute1.start()
378 compute2 = service.Service('host2',
379 'nova-compute',
380 'compute',
381 FLAGS.compute_manager)
382 compute2.start()
383 instance_id1 = self._create_instance()
384 compute1.run_instance(self.context, instance_id1)
385 instance_id2 = self._create_instance(availability_zone='nova:host1')
386 host = self.scheduler.driver.schedule_run_instance(self.context,
387 instance_id2)
388 self.assertEqual('host1', host)
389 compute1.terminate_instance(self.context, instance_id1)
390 db.instance_destroy(self.context, instance_id2)
391 compute1.kill()
392 compute2.kill()
393
394 def test_wont_sechedule_if_specified_host_is_down(self):
395 compute1 = service.Service('host1',
396 'nova-compute',
397 'compute',
398 FLAGS.compute_manager)
399 compute1.start()
400 s1 = db.service_get_by_args(self.context, 'host1', 'nova-compute')
401 now = datetime.datetime.utcnow()
402 delta = datetime.timedelta(seconds=FLAGS.service_down_time * 2)
403 past = now - delta
404 db.service_update(self.context, s1['id'], {'updated_at': past})
405 instance_id2 = self._create_instance(availability_zone='nova:host1')
406 self.assertRaises(driver.WillNotSchedule,
407 self.scheduler.driver.schedule_run_instance,
408 self.context,
409 instance_id2)
410 db.instance_destroy(self.context, instance_id2)
411 compute1.kill()
412
413 def test_will_schedule_on_disabled_host_if_specified(self):
414 compute1 = service.Service('host1',
415 'nova-compute',
416 'compute',
417 FLAGS.compute_manager)
418 compute1.start()
419 s1 = db.service_get_by_args(self.context, 'host1', 'nova-compute')
420 db.service_update(self.context, s1['id'], {'disabled': True})
421 instance_id2 = self._create_instance(availability_zone='nova:host1')
422 host = self.scheduler.driver.schedule_run_instance(self.context,
423 instance_id2)
424 self.assertEqual('host1', host)
425 db.instance_destroy(self.context, instance_id2)
426 compute1.kill()
427
428 def test_too_many_cores(self):
429 """Ensures we don't go over max cores"""
430 compute1 = service.Service('host1',
431 'nova-compute',
432 'compute',
433 FLAGS.compute_manager)
434 compute1.start()
435 compute2 = service.Service('host2',
436 'nova-compute',
437 'compute',
438 FLAGS.compute_manager)
439 compute2.start()
440 instance_ids1 = []
441 instance_ids2 = []
442 for index in xrange(FLAGS.max_cores):
443 instance_id = self._create_instance()
444 compute1.run_instance(self.context, instance_id)
445 instance_ids1.append(instance_id)
446 instance_id = self._create_instance()
447 compute2.run_instance(self.context, instance_id)
448 instance_ids2.append(instance_id)
449 instance_id = self._create_instance()
450 self.assertRaises(driver.NoValidHost,
451 self.scheduler.driver.schedule_run_instance,
452 self.context,
453 instance_id)
454 for instance_id in instance_ids1:
455 compute1.terminate_instance(self.context, instance_id)
456 for instance_id in instance_ids2:
457 compute2.terminate_instance(self.context, instance_id)
458 compute1.kill()
459 compute2.kill()
460
461 def test_least_busy_host_gets_volume(self):
462 """Ensures the host with less gigabytes gets the next one"""
463 volume1 = service.Service('host1',
464 'nova-volume',
465 'volume',
466 FLAGS.volume_manager)
467 volume1.start()
468 volume2 = service.Service('host2',
469 'nova-volume',
470 'volume',
471 FLAGS.volume_manager)
472 volume2.start()
473 volume_id1 = self._create_volume()
474 volume1.create_volume(self.context, volume_id1)
475 volume_id2 = self._create_volume()
476 host = self.scheduler.driver.schedule_create_volume(self.context,
477 volume_id2)
478 self.assertEqual(host, 'host2')
479 volume1.delete_volume(self.context, volume_id1)
480 db.volume_destroy(self.context, volume_id2)
481 dic = {'service_id': s_ref['id'],
482 'vcpus': 16, 'memory_mb': 32, 'local_gb': 100,
483 'vcpus_used': 16, 'memory_mb_used': 12, 'local_gb_used': 10,
484 'hypervisor_type': 'qemu', 'hypervisor_version': 12003,
485 'cpu_info': ''}
486
487 def test_doesnt_report_disabled_hosts_as_up(self):
488 """Ensures driver doesn't find hosts before they are enabled"""
489 compute1 = self.start_service('compute', host='host1')
@@ -316,3 +627,313 @@
627 volume2.delete_volume(self.context, volume_id)
628 volume1.kill()
629 volume2.kill()
630
631 def test_scheduler_live_migration_with_volume(self):
632 """scheduler_live_migration() works correctly as expected.
633
634 Also, checks instance state is changed from 'running' -> 'migrating'.
635
636 """
637
638 instance_id = self._create_instance()
639 i_ref = db.instance_get(self.context, instance_id)
640 dic = {'instance_id': instance_id, 'size': 1}
641 v_ref = db.volume_create(self.context, dic)
642
643 # cannot check the 2nd argument because the address of the instance
644 # object is different each time.
645 driver_i = self.scheduler.driver
646 nocare = mox.IgnoreArg()
647 self.mox.StubOutWithMock(driver_i, '_live_migration_src_check')
648 self.mox.StubOutWithMock(driver_i, '_live_migration_dest_check')
649 self.mox.StubOutWithMock(driver_i, '_live_migration_common_check')
650 driver_i._live_migration_src_check(nocare, nocare)
651 driver_i._live_migration_dest_check(nocare, nocare, i_ref['host'])
652 driver_i._live_migration_common_check(nocare, nocare, i_ref['host'])
653 self.mox.StubOutWithMock(rpc, 'cast', use_mock_anything=True)
654 kwargs = {'instance_id': instance_id, 'dest': i_ref['host']}
655 rpc.cast(self.context,
656 db.queue_get_for(nocare, FLAGS.compute_topic, i_ref['host']),
657 {"method": 'live_migration', "args": kwargs})
658
659 self.mox.ReplayAll()
660 self.scheduler.live_migration(self.context, FLAGS.compute_topic,
661 instance_id=instance_id,
662 dest=i_ref['host'])
663
664 i_ref = db.instance_get(self.context, instance_id)
665 self.assertTrue(i_ref['state_description'] == 'migrating')
666 db.instance_destroy(self.context, instance_id)
667 db.volume_destroy(self.context, v_ref['id'])
668
669 def test_live_migration_src_check_instance_not_running(self):
670 """The instance given by instance_id is not running."""
671
672 instance_id = self._create_instance(state_description='migrating')
673 i_ref = db.instance_get(self.context, instance_id)
674
675 try:
676 self.scheduler.driver._live_migration_src_check(self.context,
677 i_ref)
678 except exception.Invalid, e:
679 c = (e.message.find('is not running') > 0)
680
681 self.assertTrue(c)
682 db.instance_destroy(self.context, instance_id)
683
684 def test_live_migration_src_check_volume_node_not_alive(self):
685 """Raise exception when volume node is not alive."""
686
687 instance_id = self._create_instance()
688 i_ref = db.instance_get(self.context, instance_id)
689 dic = {'instance_id': instance_id, 'size': 1}
690 v_ref = db.volume_create(self.context, {'instance_id': instance_id,
691 'size': 1})
692 t1 = datetime.datetime.utcnow() - datetime.timedelta(1)
693 dic = {'created_at': t1, 'updated_at': t1, 'binary': 'nova-volume',
694 'topic': 'volume', 'report_count': 0}
695 s_ref = db.service_create(self.context, dic)
696
697 try:
698 self.scheduler.driver.schedule_live_migration(self.context,
699 instance_id,
700 i_ref['host'])
701 except exception.Invalid, e:
702 c = (e.message.find('volume node is not alive') >= 0)
703
704 self.assertTrue(c)
705 db.instance_destroy(self.context, instance_id)
706 db.service_destroy(self.context, s_ref['id'])
707 db.volume_destroy(self.context, v_ref['id'])
708
709 def test_live_migration_src_check_compute_node_not_alive(self):
710 """Confirms src-compute node is alive."""
711 instance_id = self._create_instance()
712 i_ref = db.instance_get(self.context, instance_id)
713 t = datetime.datetime.utcnow() - datetime.timedelta(10)
714 s_ref = self._create_compute_service(created_at=t, updated_at=t,
715 host=i_ref['host'])
716
717 try:
718 self.scheduler.driver._live_migration_src_check(self.context,
719 i_ref)
720 except exception.Invalid, e:
721 c = (e.message.find('is not alive') >= 0)
722
723 self.assertTrue(c)
724 db.instance_destroy(self.context, instance_id)
725 db.service_destroy(self.context, s_ref['id'])
726
727 def test_live_migration_src_check_works_correctly(self):
728 """Confirms this method finishes with no error."""
729 instance_id = self._create_instance()
730 i_ref = db.instance_get(self.context, instance_id)
731 s_ref = self._create_compute_service(host=i_ref['host'])
732
733 ret = self.scheduler.driver._live_migration_src_check(self.context,
734 i_ref)
735
736 self.assertTrue(ret == None)
737 db.instance_destroy(self.context, instance_id)
738 db.service_destroy(self.context, s_ref['id'])
739
740 def test_live_migration_dest_check_not_alive(self):
741 """Confirms exception raises in case dest host does not exist."""
742 instance_id = self._create_instance()
743 i_ref = db.instance_get(self.context, instance_id)
744 t = datetime.datetime.utcnow() - datetime.timedelta(10)
745 s_ref = self._create_compute_service(created_at=t, updated_at=t,
746 host=i_ref['host'])
747
748 try:
749 self.scheduler.driver._live_migration_dest_check(self.context,
750 i_ref,
751 i_ref['host'])
752 except exception.Invalid, e:
753 c = (e.message.find('is not alive') >= 0)
754
755 self.assertTrue(c)
756 db.instance_destroy(self.context, instance_id)
757 db.service_destroy(self.context, s_ref['id'])
758
759 def test_live_migration_dest_check_service_same_host(self):
760 """Confirms exceptioin raises in case dest and src is same host."""
761 instance_id = self._create_instance()
762 i_ref = db.instance_get(self.context, instance_id)
763 s_ref = self._create_compute_service(host=i_ref['host'])
764
765 try:
766 self.scheduler.driver._live_migration_dest_check(self.context,
767 i_ref,
768 i_ref['host'])
769 except exception.Invalid, e:
770 c = (e.message.find('choose other host') >= 0)
771
772 self.assertTrue(c)
773 db.instance_destroy(self.context, instance_id)
774 db.service_destroy(self.context, s_ref['id'])
775
776 def test_live_migration_dest_check_service_lack_memory(self):
777 """Confirms exception raises when dest doesn't have enough memory."""
778 instance_id = self._create_instance()
779 i_ref = db.instance_get(self.context, instance_id)
780 s_ref = self._create_compute_service(host='somewhere',
781 memory_mb_used=12)
782
783 try:
784 self.scheduler.driver._live_migration_dest_check(self.context,
785 i_ref,
786 'somewhere')
787 except exception.NotEmpty, e:
788 c = (e.message.find('Unable to migrate') >= 0)
789
790 self.assertTrue(c)
791 db.instance_destroy(self.context, instance_id)
792 db.service_destroy(self.context, s_ref['id'])
793
794 def test_live_migration_dest_check_service_works_correctly(self):
795 """Confirms method finishes with no error."""
796 instance_id = self._create_instance()
797 i_ref = db.instance_get(self.context, instance_id)
798 s_ref = self._create_compute_service(host='somewhere',
799 memory_mb_used=5)
800
801 ret = self.scheduler.driver._live_migration_dest_check(self.context,
802 i_ref,
803 'somewhere')
804 self.assertTrue(ret == None)
805 db.instance_destroy(self.context, instance_id)
806 db.service_destroy(self.context, s_ref['id'])
807
808 def test_live_migration_common_check_service_orig_not_exists(self):
809 """Destination host does not exist."""
810
811 dest = 'dummydest'
812 # mocks for live_migration_common_check()
813 instance_id = self._create_instance()
814 i_ref = db.instance_get(self.context, instance_id)
815 t1 = datetime.datetime.utcnow() - datetime.timedelta(10)
816 s_ref = self._create_compute_service(created_at=t1, updated_at=t1,
817 host=dest)
818
819 # mocks for mounted_on_same_shared_storage()
820 fpath = '/test/20110127120000'
821 self.mox.StubOutWithMock(driver, 'rpc', use_mock_anything=True)
822 topic = FLAGS.compute_topic
823 driver.rpc.call(mox.IgnoreArg(),
824 db.queue_get_for(self.context, topic, dest),
825 {"method": 'create_shared_storage_test_file'}).AndReturn(fpath)
826 driver.rpc.call(mox.IgnoreArg(),
827 db.queue_get_for(mox.IgnoreArg(), topic, i_ref['host']),
828 {"method": 'check_shared_storage_test_file',
829 "args": {'filename': fpath}})
830 driver.rpc.call(mox.IgnoreArg(),
831 db.queue_get_for(mox.IgnoreArg(), topic, dest),
832 {"method": 'cleanup_shared_storage_test_file',
833 "args": {'filename': fpath}})
834
835 self.mox.ReplayAll()
836 try:
837 self.scheduler.driver._live_migration_common_check(self.context,
838 i_ref,
839 dest)
840 except exception.Invalid, e:
841 c = (e.message.find('does not exist') >= 0)
842
843 self.assertTrue(c)
844 db.instance_destroy(self.context, instance_id)
845 db.service_destroy(self.context, s_ref['id'])
846
847 def test_live_migration_common_check_service_different_hypervisor(self):
848 """Original host and dest host has different hypervisor type."""
849 dest = 'dummydest'
850 instance_id = self._create_instance()
851 i_ref = db.instance_get(self.context, instance_id)
852
853 # compute service for the original host
854 s_ref = self._create_compute_service(host=i_ref['host'])
855 # compute service for the destination
856 s_ref2 = self._create_compute_service(host=dest, hypervisor_type='xen')
857
858 # mocks
859 driver = self.scheduler.driver
860 self.mox.StubOutWithMock(driver, 'mounted_on_same_shared_storage')
861 driver.mounted_on_same_shared_storage(mox.IgnoreArg(), i_ref, dest)
862
863 self.mox.ReplayAll()
864 try:
865 self.scheduler.driver._live_migration_common_check(self.context,
866 i_ref,
867 dest)
868 except exception.Invalid, e:
869 c = (e.message.find(_('Different hypervisor type')) >= 0)
870
871 self.assertTrue(c)
872 db.instance_destroy(self.context, instance_id)
873 db.service_destroy(self.context, s_ref['id'])
874 db.service_destroy(self.context, s_ref2['id'])
875
876 def test_live_migration_common_check_service_different_version(self):
877 """Original host and dest host has different hypervisor version."""
878 dest = 'dummydest'
879 instance_id = self._create_instance()
880 i_ref = db.instance_get(self.context, instance_id)
881
882 # compute service for the original host
883 s_ref = self._create_compute_service(host=i_ref['host'])
884 # compute service for the destination
885 s_ref2 = self._create_compute_service(host=dest,
886 hypervisor_version=12002)
887
888 # mocks
889 driver = self.scheduler.driver
890 self.mox.StubOutWithMock(driver, 'mounted_on_same_shared_storage')
891 driver.mounted_on_same_shared_storage(mox.IgnoreArg(), i_ref, dest)
892
893 self.mox.ReplayAll()
894 try:
895 self.scheduler.driver._live_migration_common_check(self.context,
896 i_ref,
897 dest)
898 except exception.Invalid, e:
899 c = (e.message.find(_('Older hypervisor version')) >= 0)
900
901 self.assertTrue(c)
902 db.instance_destroy(self.context, instance_id)
903 db.service_destroy(self.context, s_ref['id'])
904 db.service_destroy(self.context, s_ref2['id'])
905
906 def test_live_migration_common_check_checking_cpuinfo_fail(self):
907 """Raise excetion when original host doen't have compatible cpu."""
908
909 dest = 'dummydest'
910 instance_id = self._create_instance()
911 i_ref = db.instance_get(self.context, instance_id)
912
913 # compute service for the original host
914 s_ref = self._create_compute_service(host=i_ref['host'])
915 # compute service for the destination
916 s_ref2 = self._create_compute_service(host=dest)
917
918 # mocks
919 driver = self.scheduler.driver
920 self.mox.StubOutWithMock(driver, 'mounted_on_same_shared_storage')
921 driver.mounted_on_same_shared_storage(mox.IgnoreArg(), i_ref, dest)
922 self.mox.StubOutWithMock(rpc, 'call', use_mock_anything=True)
923 rpc.call(mox.IgnoreArg(), mox.IgnoreArg(),
924 {"method": 'compare_cpu',
925 "args": {'cpu_info': s_ref2['compute_node'][0]['cpu_info']}}).\
926 AndRaise(rpc.RemoteError("doesn't have compatibility to", "", ""))
927
928 self.mox.ReplayAll()
929 try:
930 self.scheduler.driver._live_migration_common_check(self.context,
931 i_ref,
932 dest)
933 except rpc.RemoteError, e:
934 c = (e.message.find(_("doesn't have compatibility to")) >= 0)
935
936 self.assertTrue(c)
937 db.instance_destroy(self.context, instance_id)
938 db.service_destroy(self.context, s_ref['id'])
939 db.service_destroy(self.context, s_ref2['id'])
319940
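The common-check tests above mock the three rpc calls that together implement the shared-storage check used by the scheduler driver: the destination creates a test file under the instances directory, the source host checks whether it can see that file, and the destination finally removes it. A condensed sketch of that flow, inferred from the mocked calls (the helper name and exact signature are assumptions, not the branch's literal code):

    def mounted_on_same_shared_storage(context, i_ref, dest):
        """Ask dest to create a file and check the source host can see it."""
        topic = FLAGS.compute_topic
        filename = rpc.call(context,
                            db.queue_get_for(context, topic, dest),
                            {"method": 'create_shared_storage_test_file'})
        try:
            # Visible on the source only if both hosts mount the same storage.
            rpc.call(context,
                     db.queue_get_for(context, topic, i_ref['host']),
                     {"method": 'check_shared_storage_test_file',
                      "args": {'filename': filename}})
        finally:
            # Always clean up on the destination.
            rpc.call(context,
                     db.queue_get_for(context, topic, dest),
                     {"method": 'cleanup_shared_storage_test_file',
                      "args": {'filename': filename}})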
=== modified file 'nova/tests/test_service.py'
--- nova/tests/test_service.py 2011-02-23 23:14:16 +0000
+++ nova/tests/test_service.py 2011-03-10 06:27:59 +0000
@@ -30,6 +30,7 @@
30from nova import test
31from nova import service
32from nova import manager
33from nova.compute import manager as compute_manager
34
35FLAGS = flags.FLAGS
36flags.DEFINE_string("fake_manager", "nova.tests.test_service.FakeManager",
@@ -251,3 +252,43 @@
252 serv.report_state()
253
254 self.assert_(not serv.model_disconnected)
255
256 def test_compute_can_update_available_resource(self):
257 """Confirm compute updates their record of compute-service table."""
258 host = 'foo'
259 binary = 'nova-compute'
260 topic = 'compute'
261
262 # Mocks do not work here without UnsetStubs().
263 self.mox.UnsetStubs()
264 ctxt = context.get_admin_context()
265 service_ref = db.service_create(ctxt, {'host': host,
266 'binary': binary,
267 'topic': topic})
268 serv = service.Service(host,
269 binary,
270 topic,
271 'nova.compute.manager.ComputeManager')
272
273 # This test case verifies that update_available_resource() is called.
274 # Periodic calls are not needed, so the intervals below are set to 0.
275 serv.report_interval = 0
276 serv.periodic_interval = 0
277
278 # Creating mocks
279 self.mox.StubOutWithMock(service.rpc.Connection, 'instance')
280 service.rpc.Connection.instance(new=mox.IgnoreArg())
281 service.rpc.Connection.instance(new=mox.IgnoreArg())
282 self.mox.StubOutWithMock(serv.manager.driver,
283 'update_available_resource')
284 serv.manager.driver.update_available_resource(mox.IgnoreArg(), host)
285
286 # Just do start()/stop(); do not confirm that a new db record is
287 # created, because update_available_resource() only works in a
288 # libvirt environment. This test case confirms that
289 # update_available_resource() is called; otherwise, mox complains.
290 self.mox.ReplayAll()
291 serv.start()
292 serv.stop()
293
294 db.service_destroy(ctxt, service_ref['id'])
254295
=== modified file 'nova/tests/test_virt.py'
--- nova/tests/test_virt.py 2011-03-09 23:45:00 +0000
+++ nova/tests/test_virt.py 2011-03-10 06:27:59 +0000
@@ -14,21 +14,28 @@
14# License for the specific language governing permissions and limitations14# License for the specific language governing permissions and limitations
15# under the License.15# under the License.
1616
17import eventlet
18import mox
17import os19import os
20import sys
1821
19import eventlet
20from xml.etree.ElementTree import fromstring as xml_to_tree22from xml.etree.ElementTree import fromstring as xml_to_tree
21from xml.dom.minidom import parseString as xml_to_dom23from xml.dom.minidom import parseString as xml_to_dom
2224
23from nova import context25from nova import context
24from nova import db26from nova import db
27from nova import exception
25from nova import flags28from nova import flags
26from nova import test29from nova import test
27from nova import utils30from nova import utils
28from nova.api.ec2 import cloud31from nova.api.ec2 import cloud
29from nova.auth import manager32from nova.auth import manager
33from nova.compute import manager as compute_manager
34from nova.compute import power_state
35from nova.db.sqlalchemy import models
30from nova.virt import libvirt_conn36from nova.virt import libvirt_conn
3137
38libvirt = None
32FLAGS = flags.FLAGS39FLAGS = flags.FLAGS
33flags.DECLARE('instances_path', 'nova.compute.manager')40flags.DECLARE('instances_path', 'nova.compute.manager')
3441
@@ -103,11 +110,28 @@
103 libvirt_conn._late_load_cheetah()110 libvirt_conn._late_load_cheetah()
104 self.flags(fake_call=True)111 self.flags(fake_call=True)
105 self.manager = manager.AuthManager()112 self.manager = manager.AuthManager()
113
114 try:
115 pjs = self.manager.get_projects()
116 pjs = [p for p in pjs if p.name == 'fake']
117 if 0 != len(pjs):
118 self.manager.delete_project(pjs[0])
119
120 users = self.manager.get_users()
121 users = [u for u in users if u.name == 'fake']
122 if 0 != len(users):
123 self.manager.delete_user(users[0])
124 except Exception, e:
125 pass
126
127 users = self.manager.get_users()
128 self.user = self.manager.create_user('fake', 'fake', 'fake',
129 admin=True)
130 self.project = self.manager.create_project('fake', 'fake', 'fake')
131 self.network = utils.import_object(FLAGS.network_manager)
132 self.context = context.get_admin_context()
133 FLAGS.instances_path = ''
134 self.call_libvirt_dependant_setup = False
135
136 test_ip = '10.11.12.13'
137 test_instance = {'memory_kb': '1024000',
@@ -119,6 +143,58 @@
143 'bridge': 'br101',
144 'instance_type': 'm1.small'}
145
146 def lazy_load_library_exists(self):
147 """check if libvirt is available."""
148 # try to connect libvirt. if fail, skip test.
149 try:
150 import libvirt
151 import libxml2
152 except ImportError:
153 return False
154 global libvirt
155 libvirt = __import__('libvirt')
156 libvirt_conn.libvirt = __import__('libvirt')
157 libvirt_conn.libxml2 = __import__('libxml2')
158 return True
159
160 def create_fake_libvirt_mock(self, **kwargs):
161 """Defining mocks for LibvirtConnection(libvirt is not used)."""
162
163 # A fake libvirt.virConnect
164 class FakeLibvirtConnection(object):
165 pass
166
167 # A fake libvirt_conn.IptablesFirewallDriver
168 class FakeIptablesFirewallDriver(object):
169
170 def __init__(self, **kwargs):
171 pass
172
173 def setattr(self, key, val):
174 self.__setattr__(key, val)
175
176 # Creating mocks
177 fake = FakeLibvirtConnection()
178 fakeip = FakeIptablesFirewallDriver
179 # Customizing above fake if necessary
180 for key, val in kwargs.items():
181 fake.__setattr__(key, val)
182
183 # Inevitable mocks for libvirt_conn.LibvirtConnection
184 self.mox.StubOutWithMock(libvirt_conn.utils, 'import_class')
185 libvirt_conn.utils.import_class(mox.IgnoreArg()).AndReturn(fakeip)
186 self.mox.StubOutWithMock(libvirt_conn.LibvirtConnection, '_conn')
187 libvirt_conn.LibvirtConnection._conn = fake
188
189 def create_service(self, **kwargs):
190 service_ref = {'host': kwargs.get('host', 'dummy'),
191 'binary': 'nova-compute',
192 'topic': 'compute',
193 'report_count': 0,
194 'availability_zone': 'zone'}
195
196 return db.service_create(context.get_admin_context(), service_ref)
197
122 def test_xml_and_uri_no_ramdisk_no_kernel(self):198 def test_xml_and_uri_no_ramdisk_no_kernel(self):
123 instance_data = dict(self.test_instance)199 instance_data = dict(self.test_instance)
124 self._check_xml_and_uri(instance_data,200 self._check_xml_and_uri(instance_data,
@@ -258,8 +334,8 @@
334 expected_result,
335 '%s failed common check %d' % (xml, i))
336
261 # This test is supposed to make sure we don't override a specifically
262 # set uri
337 # This test is supposed to make sure we don't
338 # override a specifically set uri
339 #
340 # Deliberately not just assigning this string to FLAGS.libvirt_uri and
341 # checking against that later on. This way we make sure the
@@ -273,6 +349,150 @@
349 self.assertEquals(uri, testuri)
350 db.instance_destroy(user_context, instance_ref['id'])
351
352 def test_update_available_resource_works_correctly(self):
353 """Confirm compute_node table is updated successfully."""
354 org_path = FLAGS.instances_path
355 FLAGS.instances_path = '.'
356
357 # Prepare mocks
358 def getVersion():
359 return 12003
360
361 def getType():
362 return 'qemu'
363
364 def listDomainsID():
365 return []
366
367 service_ref = self.create_service(host='dummy')
368 self.create_fake_libvirt_mock(getVersion=getVersion,
369 getType=getType,
370 listDomainsID=listDomainsID)
371 self.mox.StubOutWithMock(libvirt_conn.LibvirtConnection,
372 'get_cpu_info')
373 libvirt_conn.LibvirtConnection.get_cpu_info().AndReturn('cpuinfo')
374
375 # Start test
376 self.mox.ReplayAll()
377 conn = libvirt_conn.LibvirtConnection(False)
378 conn.update_available_resource(self.context, 'dummy')
379 service_ref = db.service_get(self.context, service_ref['id'])
380 compute_node = service_ref['compute_node'][0]
381
382 if sys.platform.upper() == 'LINUX2':
383 self.assertTrue(compute_node['vcpus'] >= 0)
384 self.assertTrue(compute_node['memory_mb'] > 0)
385 self.assertTrue(compute_node['local_gb'] > 0)
386 self.assertTrue(compute_node['vcpus_used'] == 0)
387 self.assertTrue(compute_node['memory_mb_used'] > 0)
388 self.assertTrue(compute_node['local_gb_used'] > 0)
389 self.assertTrue(len(compute_node['hypervisor_type']) > 0)
390 self.assertTrue(compute_node['hypervisor_version'] > 0)
391 else:
392 self.assertTrue(compute_node['vcpus'] >= 0)
393 self.assertTrue(compute_node['memory_mb'] == 0)
394 self.assertTrue(compute_node['local_gb'] > 0)
395 self.assertTrue(compute_node['vcpus_used'] == 0)
396 self.assertTrue(compute_node['memory_mb_used'] == 0)
397 self.assertTrue(compute_node['local_gb_used'] > 0)
398 self.assertTrue(len(compute_node['hypervisor_type']) > 0)
399 self.assertTrue(compute_node['hypervisor_version'] > 0)
400
401 db.service_destroy(self.context, service_ref['id'])
402 FLAGS.instances_path = org_path
403
404 def test_update_resource_info_no_compute_record_found(self):
405 """Raise exception if no recorde found on services table."""
406 org_path = FLAGS.instances_path
407 FLAGS.instances_path = '.'
408 self.create_fake_libvirt_mock()
409
410 self.mox.ReplayAll()
411 conn = libvirt_conn.LibvirtConnection(False)
412 self.assertRaises(exception.Invalid,
413 conn.update_available_resource,
414 self.context, 'dummy')
415
416 FLAGS.instances_path = org_path
417
418 def test_ensure_filtering_rules_for_instance_timeout(self):
419 """ensure_filtering_fules_for_instance() finishes with timeout."""
420 # Skip if non-libvirt environment
421 if not self.lazy_load_library_exists():
422 return
423
424 # Preparing mocks
425 def fake_none(self):
426 return
427
428 def fake_raise(self):
429 raise libvirt.libvirtError('ERR')
430
431 self.create_fake_libvirt_mock(nwfilterLookupByName=fake_raise)
432 instance_ref = db.instance_create(self.context, self.test_instance)
433
434 # Start test
435 self.mox.ReplayAll()
436 try:
437 conn = libvirt_conn.LibvirtConnection(False)
438 conn.firewall_driver.setattr('setup_basic_filtering', fake_none)
439 conn.firewall_driver.setattr('prepare_instance_filter', fake_none)
440 conn.ensure_filtering_rules_for_instance(instance_ref)
441 except exception.Error, e:
442 c1 = (0 <= e.message.find('Timeout migrating for'))
443 self.assertTrue(c1)
444
445 db.instance_destroy(self.context, instance_ref['id'])
446
447 def test_live_migration_raises_exception(self):
448 """Confirms recover method is called when exceptions are raised."""
449 # Skip if non-libvirt environment
450 if not self.lazy_load_library_exists():
451 return
452
453 # Preparing data
454 self.compute = utils.import_object(FLAGS.compute_manager)
455 instance_dict = {'host': 'fake', 'state': power_state.RUNNING,
456 'state_description': 'running'}
457 instance_ref = db.instance_create(self.context, self.test_instance)
458 instance_ref = db.instance_update(self.context, instance_ref['id'],
459 instance_dict)
460 vol_dict = {'status': 'migrating', 'size': 1}
461 volume_ref = db.volume_create(self.context, vol_dict)
462 db.volume_attached(self.context, volume_ref['id'], instance_ref['id'],
463 '/dev/fake')
464
465 # Preparing mocks
466 vdmock = self.mox.CreateMock(libvirt.virDomain)
467 self.mox.StubOutWithMock(vdmock, "migrateToURI")
468 vdmock.migrateToURI(FLAGS.live_migration_uri % 'dest',
469 mox.IgnoreArg(),
470 None, FLAGS.live_migration_bandwidth).\
471 AndRaise(libvirt.libvirtError('ERR'))
472
473 def fake_lookup(instance_name):
474 if instance_name == instance_ref.name:
475 return vdmock
476
477 self.create_fake_libvirt_mock(lookupByName=fake_lookup)
478
479 # Start test
480 self.mox.ReplayAll()
481 conn = libvirt_conn.LibvirtConnection(False)
482 self.assertRaises(libvirt.libvirtError,
483 conn._live_migration,
484 self.context, instance_ref, 'dest', '',
485 self.compute.recover_live_migration)
486
487 instance_ref = db.instance_get(self.context, instance_ref['id'])
488 self.assertTrue(instance_ref['state_description'] == 'running')
489 self.assertTrue(instance_ref['state'] == power_state.RUNNING)
490 volume_ref = db.volume_get(self.context, volume_ref['id'])
491 self.assertTrue(volume_ref['status'] == 'in-use')
492
493 db.volume_destroy(self.context, volume_ref['id'])
494 db.instance_destroy(self.context, instance_ref['id'])
495
496 def tearDown(self):
497 self.manager.delete_project(self.project)
498 self.manager.delete_user(self.user)
499
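test_live_migration_raises_exception above expects migrateToURI to be called with FLAGS.live_migration_uri % 'dest'. With the default value of that flag (added to libvirt_conn.py further down in this diff), the target URI expands as follows (a tiny illustrative snippet):

    live_migration_uri = "qemu+tcp://%s/system"   # flag default in libvirt_conn.py
    print live_migration_uri % 'dest'             # -> qemu+tcp://dest/system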
=== modified file 'nova/tests/test_volume.py'
--- nova/tests/test_volume.py 2011-03-07 01:25:01 +0000
+++ nova/tests/test_volume.py 2011-03-10 06:27:59 +0000
@@ -20,6 +20,8 @@
20
21"""
22
23import cStringIO
24
25from nova import context
26from nova import exception
27from nova import db
@@ -173,3 +175,196 @@
175 # each of them having a different FLAG for storage_node
176 # This will allow us to test cross-node interactions
177 pass
178
179
180class DriverTestCase(test.TestCase):
181 """Base Test class for Drivers."""
182 driver_name = "nova.volume.driver.FakeAOEDriver"
183
184 def setUp(self):
185 super(DriverTestCase, self).setUp()
186 self.flags(volume_driver=self.driver_name,
187 logging_default_format_string="%(message)s")
188 self.volume = utils.import_object(FLAGS.volume_manager)
189 self.context = context.get_admin_context()
190 self.output = ""
191
192 def _fake_execute(_command, *_args, **_kwargs):
193 """Fake _execute."""
194 return self.output, None
195 self.volume.driver._execute = _fake_execute
196 self.volume.driver._sync_execute = _fake_execute
197
198 log = logging.getLogger()
199 self.stream = cStringIO.StringIO()
200 log.addHandler(logging.StreamHandler(self.stream))
201
202 inst = {}
203 self.instance_id = db.instance_create(self.context, inst)['id']
204
205 def tearDown(self):
206 super(DriverTestCase, self).tearDown()
207
208 def _attach_volume(self):
209 """Attach volumes to an instance. This function also sets
210 a fake log message."""
211 return []
212
213 def _detach_volume(self, volume_id_list):
214 """Detach volumes from an instance."""
215 for volume_id in volume_id_list:
216 db.volume_detached(self.context, volume_id)
217 self.volume.delete_volume(self.context, volume_id)
218
219
220class AOETestCase(DriverTestCase):
221 """Test Case for AOEDriver"""
222 driver_name = "nova.volume.driver.AOEDriver"
223
224 def setUp(self):
225 super(AOETestCase, self).setUp()
226
227 def tearDown(self):
228 super(AOETestCase, self).tearDown()
229
230 def _attach_volume(self):
231 """Attach volumes to an instance. This function also sets
232 a fake log message."""
233 volume_id_list = []
234 for index in xrange(3):
235 vol = {}
236 vol['size'] = 0
237 volume_id = db.volume_create(self.context,
238 vol)['id']
239 self.volume.create_volume(self.context, volume_id)
240
241 # each volume has a different mountpoint
242 mountpoint = "/dev/sd" + chr((ord('b') + index))
243 db.volume_attached(self.context, volume_id, self.instance_id,
244 mountpoint)
245
246 (shelf_id, blade_id) = db.volume_get_shelf_and_blade(self.context,
247 volume_id)
248 self.output += "%s %s eth0 /dev/nova-volumes/vol-foo auto run\n" \
249 % (shelf_id, blade_id)
250
251 volume_id_list.append(volume_id)
252
253 return volume_id_list
254
255 def test_check_for_export_with_no_volume(self):
256 """No log message when no volume is attached to an instance."""
257 self.stream.truncate(0)
258 self.volume.check_for_export(self.context, self.instance_id)
259 self.assertEqual(self.stream.getvalue(), '')
260
261 def test_check_for_export_with_all_vblade_processes(self):
262 """No log message when all the vblade processes are running."""
263 volume_id_list = self._attach_volume()
264
265 self.stream.truncate(0)
266 self.volume.check_for_export(self.context, self.instance_id)
267 self.assertEqual(self.stream.getvalue(), '')
268
269 self._detach_volume(volume_id_list)
270
271 def test_check_for_export_with_vblade_process_missing(self):
272 """Output a warning message when some vblade processes aren't
273 running."""
274 volume_id_list = self._attach_volume()
275
276 # the first vblade process isn't running
277 self.output = self.output.replace("run", "down", 1)
278 (shelf_id, blade_id) = db.volume_get_shelf_and_blade(self.context,
279 volume_id_list[0])
280
281 msg_is_match = False
282 self.stream.truncate(0)
283 try:
284 self.volume.check_for_export(self.context, self.instance_id)
285 except exception.ProcessExecutionError, e:
286 volume_id = volume_id_list[0]
287 msg = _("Cannot confirm exported volume id:%(volume_id)s. "
288 "vblade process for e%(shelf_id)s.%(blade_id)s "
289 "isn't running.") % locals()
290
291 msg_is_match = (0 <= e.message.find(msg))
292
293 self.assertTrue(msg_is_match)
294 self._detach_volume(volume_id_list)
295
296
297class ISCSITestCase(DriverTestCase):
298 """Test Case for ISCSIDriver"""
299 driver_name = "nova.volume.driver.ISCSIDriver"
300
301 def setUp(self):
302 super(ISCSITestCase, self).setUp()
303
304 def tearDown(self):
305 super(ISCSITestCase, self).tearDown()
306
307 def _attach_volume(self):
308 """Attach volumes to an instance. This function also sets
309 a fake log message."""
310 volume_id_list = []
311 for index in xrange(3):
312 vol = {}
313 vol['size'] = 0
314 vol_ref = db.volume_create(self.context, vol)
315 self.volume.create_volume(self.context, vol_ref['id'])
316 vol_ref = db.volume_get(self.context, vol_ref['id'])
317
318 # each volume has a different mountpoint
319 mountpoint = "/dev/sd" + chr((ord('b') + index))
320 db.volume_attached(self.context, vol_ref['id'], self.instance_id,
321 mountpoint)
322 volume_id_list.append(vol_ref['id'])
323
324 return volume_id_list
325
326 def test_check_for_export_with_no_volume(self):
327 """No log message when no volume is attached to an instance."""
328 self.stream.truncate(0)
329 self.volume.check_for_export(self.context, self.instance_id)
330 self.assertEqual(self.stream.getvalue(), '')
331
332 def test_check_for_export_with_all_volume_exported(self):
333 """No log message when all the vblade processes are running."""
334 volume_id_list = self._attach_volume()
335
336 self.mox.StubOutWithMock(self.volume.driver, '_execute')
337 for i in volume_id_list:
338 tid = db.volume_get_iscsi_target_num(self.context, i)
339 self.volume.driver._execute("sudo ietadm --op show --tid=%(tid)d"
340 % locals())
341
342 self.stream.truncate(0)
343 self.mox.ReplayAll()
344 self.volume.check_for_export(self.context, self.instance_id)
345 self.assertEqual(self.stream.getvalue(), '')
346 self.mox.UnsetStubs()
347
348 self._detach_volume(volume_id_list)
349
350 def test_check_for_export_with_some_volume_missing(self):
351 """Output a warning message when some volumes are not recognied
352 by ietd."""
353 volume_id_list = self._attach_volume()
354
355 # the first volume is not exported by ietd
356 tid = db.volume_get_iscsi_target_num(self.context, volume_id_list[0])
357 self.mox.StubOutWithMock(self.volume.driver, '_execute')
358 self.volume.driver._execute("sudo ietadm --op show --tid=%(tid)d"
359 % locals()).AndRaise(exception.ProcessExecutionError())
360
361 self.mox.ReplayAll()
362 self.assertRaises(exception.ProcessExecutionError,
363 self.volume.check_for_export,
364 self.context,
365 self.instance_id)
366 msg = _("Cannot confirm exported volume id:%s.") % volume_id_list[0]
367 self.assertTrue(0 <= self.stream.getvalue().find(msg))
368 self.mox.UnsetStubs()
369
370 self._detach_volume(volume_id_list)
176371
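The ISCSITestCase above expects check_for_export to run `ietadm --op show --tid=<tid>` once per volume attached to the instance, and to log the volume id before letting ProcessExecutionError propagate. A rough sketch of driver logic consistent with those expectations (inferred from the tests; the attribute and helper names are assumptions, not necessarily the branch's exact code):

    def check_for_export(self, context, instance_id):
        """Make sure each attached volume is still exported by ietd."""
        instance_ref = self.db.instance_get(context, instance_id)
        for volume in instance_ref['volumes']:
            tid = self.db.volume_get_iscsi_target_num(context, volume['id'])
            try:
                self._execute("sudo ietadm --op show --tid=%(tid)d" % locals())
            except exception.ProcessExecutionError:
                # The test checks for this message in the log stream.
                LOG.error(_("Cannot confirm exported volume id:%s.")
                          % volume['id'])
                raise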
=== added file 'nova/virt/cpuinfo.xml.template'
--- nova/virt/cpuinfo.xml.template 1970-01-01 00:00:00 +0000
+++ nova/virt/cpuinfo.xml.template 2011-03-10 06:27:59 +0000
@@ -0,0 +1,9 @@
1<cpu>
2 <arch>$arch</arch>
3 <model>$model</model>
4 <vendor>$vendor</vendor>
5 <topology sockets="$topology.sockets" cores="$topology.cores" threads="$topology.threads"/>
6#for $var in $features
7 <features name="$var" />
8#end for
9</cpu>
010
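The file above is a Cheetah template; its placeholders ($arch, $topology.sockets, $features, ...) correspond to the keys of the dict that get_cpu_info() (in libvirt_conn.py below) serializes to JSON. A minimal rendering sketch with made-up example data, not real host output:

    from Cheetah.Template import Template

    cpu_info = {'arch': 'x86_64', 'model': 'Nehalem', 'vendor': 'Intel',
                'topology': {'sockets': '1', 'cores': '2', 'threads': '2'},
                'features': ['sse4.1', 'sse4.2']}

    # Cheetah's dotted notation ($topology.sockets) resolves dict keys.
    template = open('nova/virt/cpuinfo.xml.template').read()
    print str(Template(template, searchList=[cpu_info]))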
=== modified file 'nova/virt/fake.py'
--- nova/virt/fake.py 2011-02-28 17:39:23 +0000
+++ nova/virt/fake.py 2011-03-10 06:27:59 +0000
@@ -407,6 +407,27 @@
407 """407 """
408 return True408 return True
409409
410 def update_available_resource(self, ctxt, host):
411 """This method is supported only by libvirt."""
412 return
413
414 def compare_cpu(self, xml):
415 """This method is supported only by libvirt."""
416 raise NotImplementedError('This method is supported only by libvirt.')
417
418 def ensure_filtering_rules_for_instance(self, instance_ref):
419 """This method is supported only by libvirt."""
420 raise NotImplementedError('This method is supported only by libvirt.')
421
422 def live_migration(self, context, instance_ref, dest,
423 post_method, recover_method):
424 """This method is supported only by libvirt."""
425 return
426
427 def unfilter_instance(self, instance_ref):
428 """This method is supported only by libvirt."""
429 raise NotImplementedError('This method is supported only by libvirt.')
430
431
432class FakeInstance(object):
433
434
=== modified file 'nova/virt/libvirt_conn.py'
--- nova/virt/libvirt_conn.py 2011-03-10 04:42:11 +0000
+++ nova/virt/libvirt_conn.py 2011-03-10 06:27:59 +0000
@@ -36,10 +36,13 @@
36
37"""
38
39import multiprocessing
40import os
41import shutil
42import sys
43import random
44import subprocess
45import time
46import uuid
47from xml.dom import minidom
48
@@ -70,6 +73,7 @@
73LOG = logging.getLogger('nova.virt.libvirt_conn')
74
75FLAGS = flags.FLAGS
76flags.DECLARE('live_migration_retry_count', 'nova.compute.manager')
77# TODO(vish): These flags should probably go into a shared location
78flags.DEFINE_string('rescue_image_id', 'ami-rescue', 'Rescue ami image')
79flags.DEFINE_string('rescue_kernel_id', 'aki-rescue', 'Rescue aki image')
@@ -100,6 +104,17 @@
104flags.DEFINE_string('firewall_driver',
105 'nova.virt.libvirt_conn.IptablesFirewallDriver',
106 'Firewall driver (defaults to iptables)')
107flags.DEFINE_string('cpuinfo_xml_template',
108 utils.abspath('virt/cpuinfo.xml.template'),
109 'CpuInfo XML Template (used only by live migration for now)')
110flags.DEFINE_string('live_migration_uri',
111 "qemu+tcp://%s/system",
112 'Define protocol used by live_migration feature')
113flags.DEFINE_string('live_migration_flag',
114 "VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER",
115 'Define live migration behavior.')
116flags.DEFINE_integer('live_migration_bandwidth', 0,
117 'Define live migration bandwidth')
118
119
120def get_connection(read_only):
@@ -146,6 +161,7 @@
161 self.libvirt_uri = self.get_uri()
162
163 self.libvirt_xml = open(FLAGS.libvirt_xml_template).read()
164 self.cpuinfo_xml = open(FLAGS.cpuinfo_xml_template).read()
165 self._wrapped_conn = None
166 self.read_only = read_only
167
@@ -851,6 +867,158 @@
867
868 return interfaces
869
870 def get_vcpu_total(self):
871 """Get vcpu number of physical computer.
872
873 :returns: the number of cpu core.
874
875 """
876
877 # On certain platforms, this will raise a NotImplementedError.
878 try:
879 return multiprocessing.cpu_count()
880 except NotImplementedError:
881 LOG.warn(_("Cannot get the number of cpu, because this "
882 "function is not implemented for this platform. "
883 "This error can be safely ignored for now."))
884 return 0
885
886 def get_memory_mb_total(self):
887 """Get the total memory size(MB) of physical computer.
888
889 :returns: the total amount of memory(MB).
890
891 """
892
893 if sys.platform.upper() != 'LINUX2':
894 return 0
895
896 meminfo = open('/proc/meminfo').read().split()
897 idx = meminfo.index('MemTotal:')
898 # transforming kb to mb.
899 return int(meminfo[idx + 1]) / 1024
900
901 def get_local_gb_total(self):
902 """Get the total hdd size(GB) of physical computer.
903
904 :returns:
905 The total amount of HDD(GB).
906 Note that this value shows a partition where
907 NOVA-INST-DIR/instances mounts.
908
909 """
910
911 hddinfo = os.statvfs(FLAGS.instances_path)
912 return hddinfo.f_frsize * hddinfo.f_blocks / 1024 / 1024 / 1024
913
914 def get_vcpu_used(self):
915 """ Get vcpu usage number of physical computer.
916
917 :returns: The total number of vcpu that currently used.
918
919 """
920
921 total = 0
922 for dom_id in self._conn.listDomainsID():
923 dom = self._conn.lookupByID(dom_id)
924 total += len(dom.vcpus()[1])
925 return total
926
927 def get_memory_mb_used(self):
928 """Get the free memory size(MB) of physical computer.
929
930 :returns: the total usage of memory(MB).
931
932 """
933
934 if sys.platform.upper() != 'LINUX2':
935 return 0
936
937 m = open('/proc/meminfo').read().split()
938 idx1 = m.index('MemFree:')
939 idx2 = m.index('Buffers:')
940 idx3 = m.index('Cached:')
941 avail = (int(m[idx1 + 1]) + int(m[idx2 + 1]) + int(m[idx3 + 1])) / 1024
942 return self.get_memory_mb_total() - avail
943
944 def get_local_gb_used(self):
945 """Get the free hdd size(GB) of physical computer.
946
947 :returns:
948 The total usage of HDD(GB).
949 Note that this value shows a partition where
950 NOVA-INST-DIR/instances mounts.
951
952 """
953
954 hddinfo = os.statvfs(FLAGS.instances_path)
955 avail = hddinfo.f_frsize * hddinfo.f_bavail / 1024 / 1024 / 1024
956 return self.get_local_gb_total() - avail
957
958 def get_hypervisor_type(self):
959 """Get hypervisor type.
960
961 :returns: hypervisor type (ex. qemu)
962
963 """
964
965 return self._conn.getType()
966
967 def get_hypervisor_version(self):
968 """Get hypervisor version.
969
970 :returns: hypervisor version (ex. 12003)
971
972 """
973
974 return self._conn.getVersion()
975
976 def get_cpu_info(self):
977 """Get cpuinfo information.
978
979 Obtains cpu feature from virConnect.getCapabilities,
980 and returns as a json string.
981
982 :return: see above description
983
984 """
985
986 xml = self._conn.getCapabilities()
987 xml = libxml2.parseDoc(xml)
988 nodes = xml.xpathEval('//cpu')
989 if len(nodes) != 1:
990 raise exception.Invalid(_("Invalid xml. '<cpu>' must be 1,"
991 "but %d\n") % len(nodes)
992 + xml.serialize())
993
994 cpu_info = dict()
995 cpu_info['arch'] = xml.xpathEval('//cpu/arch')[0].getContent()
996 cpu_info['model'] = xml.xpathEval('//cpu/model')[0].getContent()
997 cpu_info['vendor'] = xml.xpathEval('//cpu/vendor')[0].getContent()
998
999 topology_node = xml.xpathEval('//cpu/topology')[0].get_properties()
1000 topology = dict()
1001 while topology_node != None:
1002 name = topology_node.get_name()
1003 topology[name] = topology_node.getContent()
1004 topology_node = topology_node.get_next()
1005
1006 keys = ['cores', 'sockets', 'threads']
1007 tkeys = topology.keys()
1008 if list(set(tkeys)) != list(set(keys)):
1009 ks = ', '.join(keys)
1010 raise exception.Invalid(_("Invalid xml: topology(%(topology)s) "
1011 "must have %(ks)s") % locals())
1012
1013 feature_nodes = xml.xpathEval('//cpu/feature')
1014 features = list()
1015 for nodes in feature_nodes:
1016 features.append(nodes.get_properties().getContent())
1017
1018 cpu_info['topology'] = topology
1019 cpu_info['features'] = features
1020 return utils.dumps(cpu_info)
1021
1022 def block_stats(self, instance_name, disk):
1023 """
1024 Note that this function takes an instance name, not an Instance, so
@@ -881,6 +1049,207 @@
1049 def refresh_security_group_members(self, security_group_id):
1050 self.firewall_driver.refresh_security_group_members(security_group_id)
1051
1052 def update_available_resource(self, ctxt, host):
1053 """Updates compute manager resource info on ComputeNode table.
1054
1055 This method is called when nova-compute launches, and
1056 whenever an admin executes "nova-manage service update_resource".
1057
1058 :param ctxt: security context
1059 :param host: hostname that the compute manager is currently running on
1060
1061 """
1062
1063 try:
1064 service_ref = db.service_get_all_compute_by_host(ctxt, host)[0]
1065 except exception.NotFound:
1066 raise exception.Invalid(_("Cannot update compute manager "
1067 "specific info, because no service "
1068 "record was found."))
1069
1070 # Updating host information
1071 dic = {'vcpus': self.get_vcpu_total(),
1072 'memory_mb': self.get_memory_mb_total(),
1073 'local_gb': self.get_local_gb_total(),
1074 'vcpus_used': self.get_vcpu_used(),
1075 'memory_mb_used': self.get_memory_mb_used(),
1076 'local_gb_used': self.get_local_gb_used(),
1077 'hypervisor_type': self.get_hypervisor_type(),
1078 'hypervisor_version': self.get_hypervisor_version(),
1079 'cpu_info': self.get_cpu_info()}
1080
1081 compute_node_ref = service_ref['compute_node']
1082 if not compute_node_ref:
1083 LOG.info(_('Compute_service record created for %s ') % host)
1084 dic['service_id'] = service_ref['id']
1085 db.compute_node_create(ctxt, dic)
1086 else:
1087 LOG.info(_('Compute_service record updated for %s ') % host)
1088 db.compute_node_update(ctxt, compute_node_ref[0]['id'], dic)
1089
1090 def compare_cpu(self, cpu_info):
1091 """Checks the host cpu is compatible to a cpu given by xml.
1092
1093 "xml" must be a part of libvirt.openReadonly().getCapabilities().
1094 return values follows by virCPUCompareResult.
1095 if 0 > return value, do live migration.
1096 'http://libvirt.org/html/libvirt-libvirt.html#virCPUCompareResult'
1097
1098 :param cpu_info: json string that shows cpu feature(see get_cpu_info())
1099 :returns:
1100 None. if given cpu info is not compatible to this server,
1101 raise exception.
1102
1103 """
1104
1105 LOG.info(_('Instance launched has CPU info:\n%s') % cpu_info)
1106 dic = utils.loads(cpu_info)
1107 xml = str(Template(self.cpuinfo_xml, searchList=dic))
1108 LOG.info(_('to xml...\n:%s ') % xml)
1109
1110 u = "http://libvirt.org/html/libvirt-libvirt.html#virCPUCompareResult"
1111 m = _("CPU doesn't have compatibility.\n\n%(ret)s\n\nRefer to %(u)s")
1112 # unknown character exists in xml, then libvirt complains
1113 try:
1114 ret = self._conn.compareCPU(xml, 0)
1115 except libvirt.libvirtError, e:
1116 ret = e.message
1117 LOG.error(m % locals())
1118 raise
1119
1120 if ret <= 0:
1121 raise exception.Invalid(m % locals())
1122
1123 return
1124
1125 def ensure_filtering_rules_for_instance(self, instance_ref):
1126 """Setting up filtering rules and waiting for its completion.
1127
1128 To migrate an instance, filtering rules to hypervisors
1129 and firewalls are inevitable on destination host.
1130 ( Waiting only for filterling rules to hypervisor,
1131 since filtering rules to firewall rules can be set faster).
1132
1133 Concretely, the below method must be called.
1134 - setup_basic_filtering (for nova-basic, etc.)
1135 - prepare_instance_filter(for nova-instance-instance-xxx, etc.)
1136
1137 to_xml may have to be called since it defines PROJNET, PROJMASK.
1138 but libvirt migrates those value through migrateToURI(),
1139 so , no need to be called.
1140
1141 Don't use thread for this method since migration should
1142 not be started when setting-up filtering rules operations
1143 are not completed.
1144
1145 :params instance_ref: nova.db.sqlalchemy.models.Instance object
1146
1147 """
1148
1149 # If any instances never launch at destination host,
1150 # basic-filtering must be set here.
1151 self.firewall_driver.setup_basic_filtering(instance_ref)
1152 # setting up n)ova-instance-instance-xx mainly.
1153 self.firewall_driver.prepare_instance_filter(instance_ref)
1154
1155 # wait for completion
1156 timeout_count = range(FLAGS.live_migration_retry_count)
1157 while timeout_count:
1158 try:
1159 filter_name = 'nova-instance-%s' % instance_ref.name
1160 self._conn.nwfilterLookupByName(filter_name)
1161 break
1162 except libvirt.libvirtError:
1163 timeout_count.pop()
1164 if len(timeout_count) == 0:
1165 ec2_id = instance_ref['hostname']
1166 iname = instance_ref.name
1167 msg = _('Timeout migrating for %(ec2_id)s(%(iname)s)')
1168 raise exception.Error(msg % locals())
1169 time.sleep(1)
1170
1171 def live_migration(self, ctxt, instance_ref, dest,
1172 post_method, recover_method):
1173 """Spawning live_migration operation for distributing high-load.
1174
1175 :params ctxt: security context
1176 :params instance_ref:
1177 nova.db.sqlalchemy.models.Instance object
1178 instance object that is migrated.
1179 :params dest: destination host
1180 :params post_method:
1181 post operation method.
1182 expected nova.compute.manager.post_live_migration.
1183 :params recover_method:
1184 recovery method when any exception occurs.
1185 expected nova.compute.manager.recover_live_migration.
1186
1187 """
1188
1189 greenthread.spawn(self._live_migration, ctxt, instance_ref, dest,
1190 post_method, recover_method)
1191
1192 def _live_migration(self, ctxt, instance_ref, dest,
1193 post_method, recover_method):
1194 """Do live migration.
1195
1196 :params ctxt: security context
1197 :params instance_ref:
1198 nova.db.sqlalchemy.models.Instance object
1199 instance object that is migrated.
1200 :params dest: destination host
1201 :params post_method:
1202 post operation method.
1203 expected nova.compute.manager.post_live_migration.
1204 :params recover_method:
1205 recovery method when any exception occurs.
1206 expected nova.compute.manager.recover_live_migration.
1207
1208 """
1209
1210 # Do live migration.
1211 try:
1212 flaglist = FLAGS.live_migration_flag.split(',')
1213 flagvals = [getattr(libvirt, x.strip()) for x in flaglist]
1214 logical_sum = reduce(lambda x, y: x | y, flagvals)
1215
1216 if self.read_only:
1217 tmpconn = self._connect(self.libvirt_uri, False)
1218 dom = tmpconn.lookupByName(instance_ref.name)
1219 dom.migrateToURI(FLAGS.live_migration_uri % dest,
1220 logical_sum,
1221 None,
1222 FLAGS.live_migration_bandwidth)
1223 tmpconn.close()
1224 else:
1225 dom = self._conn.lookupByName(instance_ref.name)
1226 dom.migrateToURI(FLAGS.live_migration_uri % dest,
1227 logical_sum,
1228 None,
1229 FLAGS.live_migration_bandwidth)
1230
1231 except Exception:
1232 recover_method(ctxt, instance_ref)
1233 raise
1234
1235 # Waiting for completion of live_migration.
1236 timer = utils.LoopingCall(f=None)
1237
1238 def wait_for_live_migration():
1239 """waiting for live migration completion"""
1240 try:
1241 self.get_info(instance_ref.name)['state']
1242 except exception.NotFound:
1243 timer.stop()
1244 post_method(ctxt, instance_ref, dest)
1245
1246 timer.f = wait_for_live_migration
1247 timer.start(interval=0.5, now=True)
1248
1249 def unfilter_instance(self, instance_ref):
1250 """See comments of same method in firewall_driver."""
1251 self.firewall_driver.unfilter_instance(instance_ref)
1252
1253
1254 class FirewallDriver(object):
1255 def prepare_instance_filter(self, instance):
1256
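For reference, the json string produced by get_cpu_info() and later handed to compare_cpu() carries roughly the following structure (a sketch; the values are illustrative and not taken from this branch):

# Illustrative only: built from the //cpu/* nodes of getCapabilities().
cpu_info = {
    'arch': 'x86_64',
    'model': 'Nehalem',
    'vendor': 'Intel',
    'topology': {'cores': '2', 'sockets': '1', 'threads': '2'},
    'features': ['vmx', 'ssse3', 'sse4.1'],
}
# compare_cpu() on the destination receives utils.dumps(cpu_info),
# renders it through cpuinfo.xml.template, and passes the resulting
# XML to libvirt's compareCPU().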
=== modified file 'nova/virt/xenapi_conn.py'
--- nova/virt/xenapi_conn.py 2011-03-07 23:51:20 +0000
+++ nova/virt/xenapi_conn.py 2011-03-10 06:27:59 +0000
@@ -263,6 +263,27 @@
263 'username': FLAGS.xenapi_connection_username,
264 'password': FLAGS.xenapi_connection_password}
265
266 def update_available_resource(self, ctxt, host):
267 """This method is supported only by libvirt."""
268 return
269
270 def compare_cpu(self, xml):
271 """This method is supported only by libvirt."""
272 raise NotImplementedError('This method is supported only by libvirt.')
273
274 def ensure_filtering_rules_for_instance(self, instance_ref):
275 """This method is supported only libvirt."""
276 return
277
278 def live_migration(self, context, instance_ref, dest,
279 post_method, recover_method):
280 """This method is supported only by libvirt."""
281 return
282
283 def unfilter_instance(self, instance_ref):
284 """This method is supported only by libvirt."""
285 raise NotImplementedError('This method is supported only by libvirt.')
286
287
288 class XenAPISession(object):
289 """The session to invoke XenAPI SDK calls"""
290
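The XenAPI stubs above mirror the live-migration hooks that the libvirt driver implements. A minimal sketch of that driver-side interface (the class name below is illustrative only, not part of this branch):

class LiveMigrationHooks(object):
    """Sketch of the hooks a virt driver needs for live migration."""

    def update_available_resource(self, ctxt, host):
        """Publish vcpu/memory/disk/cpu_info to the ComputeNode table."""
        raise NotImplementedError()

    def compare_cpu(self, cpu_info):
        """Raise if this host's cpu cannot run the given cpu_info."""
        raise NotImplementedError()

    def ensure_filtering_rules_for_instance(self, instance_ref):
        """Set up nwfilter/firewall rules before migration starts."""
        raise NotImplementedError()

    def live_migration(self, ctxt, instance_ref, dest,
                       post_method, recover_method):
        """Migrate; call post_method on success, recover_method on error."""
        raise NotImplementedError()

    def unfilter_instance(self, instance_ref):
        """Tear down filtering rules once the instance has left."""
        raise NotImplementedError()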
=== modified file 'nova/volume/driver.py'
--- nova/volume/driver.py 2011-03-09 20:33:20 +0000
+++ nova/volume/driver.py 2011-03-10 06:27:59 +0000
@@ -143,6 +143,10 @@
143 """Undiscover volume on a remote host."""143 """Undiscover volume on a remote host."""
144 raise NotImplementedError()144 raise NotImplementedError()
145145
146 def check_for_export(self, context, volume_id):
147 """Make sure volume is exported."""
148 raise NotImplementedError()
149
150
151 class AOEDriver(VolumeDriver):
152 """Implements AOE specific volume commands."""
@@ -198,15 +202,45 @@
202 self._try_execute('sudo', 'vblade-persist', 'destroy',
203 shelf_id, blade_id)
204
201 - def discover_volume(self, _volume):
202 - """Discover volume on a remote host."""
203 - self._execute('sudo', 'aoe-discover')
204 - self._execute('sudo', 'aoe-stat', check_exit_code=False)
205 + def discover_volume(self, context, _volume):
206 + """Discover volume on a remote host."""
207 + (shelf_id,
208 + blade_id) = self.db.volume_get_shelf_and_blade(context,
209 + _volume['id'])
210 + self._execute("sudo aoe-discover")
211 + out, err = self._execute("sudo aoe-stat", check_exit_code=False)
212 + device_path = 'e%(shelf_id)d.%(blade_id)d' % locals()
213 + if out.find(device_path) >= 0:
214 + return "/dev/etherd/%s" % device_path
215 + else:
216 + return
217
218 def undiscover_volume(self, _volume):
219 """Undiscover volume on a remote host."""
220 pass
221
222 def check_for_export(self, context, volume_id):
223 """Make sure volume is exported."""
224 (shelf_id,
225 blade_id) = self.db.volume_get_shelf_and_blade(context,
226 volume_id)
227 cmd = "sudo vblade-persist ls --no-header"
228 out, _err = self._execute(cmd)
229 exported = False
230 for line in out.split('\n'):
231 param = line.split(' ')
232 if len(param) == 6 and param[0] == str(shelf_id) \
233 and param[1] == str(blade_id) and param[-1] == "run":
234 exported = True
235 break
236 if not exported:
237 # Instance will be terminated in this case.
238 desc = _("Cannot confirm exported volume id:%(volume_id)s. "
239 "vblade process for e%(shelf_id)s.%(blade_id)s "
240 "isn't running.") % locals()
241 raise exception.ProcessExecutionError(out, _err, cmd=cmd,
242 description=desc)
243
244
245 class FakeAOEDriver(AOEDriver):
246 """Logs calls instead of executing."""
@@ -402,7 +436,7 @@
436 (property_key, property_value))
437 return self._run_iscsiadm(iscsi_properties, iscsi_command)
438
405 - def discover_volume(self, volume):
439 + def discover_volume(self, context, volume):
440 """Discover volume on a remote host."""
441 iscsi_properties = self._get_iscsi_properties(volume)
442
@@ -461,6 +495,20 @@
495 self._run_iscsiadm(iscsi_properties, "--logout")
496 self._run_iscsiadm(iscsi_properties, "--op delete")
497
498 def check_for_export(self, context, volume_id):
499 """Make sure volume is exported."""
500
501 tid = self.db.volume_get_iscsi_target_num(context, volume_id)
502 try:
503 self._execute("sudo ietadm --op show --tid=%(tid)d" % locals())
504 except exception.ProcessExecutionError, e:
505 # Instances will remount read-only in this case.
506 # Restarting /etc/init.d/iscsitarget and rebooting nova-volume
507 # is better, since ensure_export() runs at boot time.
508 logging.error(_("Cannot confirm exported volume "
509 "id:%(volume_id)s.") % locals())
510 raise
511
512
513 class FakeISCSIDriver(ISCSIDriver):
514 """Logs calls instead of executing."""
515
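As a usage sketch (the caller below is hypothetical, not part of this branch), the new check_for_export() methods are meant to be driven once per attached volume before a migration is allowed to proceed; a ProcessExecutionError means the export could not be confirmed:

def confirm_exports(volume_driver, context, volume_ids):
    # volume_driver is an AOEDriver or ISCSIDriver instance.
    for volume_id in volume_ids:
        # Raises exception.ProcessExecutionError when the vblade process
        # (AOE) or the ietadm target (iSCSI) cannot be confirmed.
        volume_driver.check_for_export(context, volume_id)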
=== modified file 'nova/volume/manager.py'
--- nova/volume/manager.py 2011-02-21 23:52:41 +0000
+++ nova/volume/manager.py 2011-03-10 06:27:59 +0000
@@ -160,7 +160,7 @@
160 if volume_ref['host'] == self.host and FLAGS.use_local_volumes:
161 path = self.driver.local_path(volume_ref)
162 else:
163 - path = self.driver.discover_volume(volume_ref)
163 + path = self.driver.discover_volume(context, volume_ref)
164 return path
165
166 def remove_compute_volume(self, context, volume_id):166 def remove_compute_volume(self, context, volume_id):
@@ -171,3 +171,9 @@
171 return True
172 else:
173 self.driver.undiscover_volume(volume_ref)
174
175 def check_for_export(self, context, instance_id):
176 """Make sure whether volume is exported."""
177 instance_ref = self.db.instance_get(context, instance_id)
178 for volume in instance_ref['volumes']:
179 self.driver.check_for_export(context, volume['id'])
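Putting the pieces in this branch together, the docstrings above imply roughly the following calling order (a sketch only; error handling and the scheduler plumbing are omitted):

# 1. At nova-compute startup: publish host resources.
conn.update_available_resource(ctxt, host)
# 2. Before migration: the destination checks cpu compatibility, sets up
#    filtering rules, and the volume service confirms exported volumes.
conn.compare_cpu(cpu_info)                      # raises if incompatible
conn.ensure_filtering_rules_for_instance(instance_ref)
volume_manager.check_for_export(ctxt, instance_id)
# 3. The source host starts the migration; post_method runs on success,
#    recover_method on any failure.
conn.live_migration(ctxt, instance_ref, dest, post_method, recover_method)
# 4. After the instance has left the source: drop its filtering rules.
conn.unfilter_instance(instance_ref)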