Merge lp:~cbehrens/nova/rpc-kombu into lp:~hudson-openstack/nova/trunk

Proposed by Chris Behrens
Status: Merged
Approved by: Soren Hansen
Approved revision: 1529
Merged at revision: 1513
Proposed branch: lp:~cbehrens/nova/rpc-kombu
Merge into: lp:~hudson-openstack/nova/trunk
Diff against target: 1853 lines (+1262/-330)
17 files modified
bin/nova-ajax-console-proxy (+6/-6)
contrib/nova.sh (+1/-1)
nova/flags.py (+6/-2)
nova/rpc/__init__.py (+17/-26)
nova/rpc/common.py (+6/-0)
nova/rpc/impl_carrot.py (+81/-21)
nova/rpc/impl_kombu.py (+781/-0)
nova/service.py (+11/-21)
nova/tests/test_adminapi.py (+0/-2)
nova/tests/test_cloud.py (+0/-2)
nova/tests/test_rpc.py (+6/-158)
nova/tests/test_rpc_amqp.py (+0/-88)
nova/tests/test_rpc_carrot.py (+45/-0)
nova/tests/test_rpc_common.py (+189/-0)
nova/tests/test_rpc_kombu.py (+110/-0)
nova/tests/test_test.py (+2/-3)
tools/pip-requires (+1/-0)
To merge this branch: bzr merge lp:~cbehrens/nova/rpc-kombu
Reviewer Review Type Date Requested Status
Dan Prince (community) Approve
Zed A. Shaw (community) Approve
Brian Lamar (community) Needs Information
Vish Ishaya (community) Approve
Joseph Heck (community) Approve
Dave Walker Pending
Review via email: mp+73096@code.launchpad.net

Description of the change

Implements lp:798876 which is 'switch carrot to kombu'. Leaves carrot as the default for now... decision will be made later to switch the default to kombu after further testing. There's a lot of code duplication between carrot and kombu, but I left it that way in preparation for ripping carrot out later and to keep minimal changes to carrot.

This also fixes bug: lp:794627 (re-establish connections to carrot when it restarts), but only fixes it in kombu.
This also fixes bug: lp:803168 (msg-id response queues being left around), but also only fixes it in kombu.

See those bugs for comments.

To post a comment you must log in.
Revision history for this message
Chris Behrens (cbehrens) wrote :

One question I have is... carrot would call sys.exit() when it couldn't connect to rabbit after so many attempts. I left this in when handling lp:794627, but it seems I should probably make it try to re-connect forever. Thoughts?

Revision history for this message
Joseph Heck (heckj) wrote :

+1 for reconnecting forever, but with some fail-off on retry interval so that a failure doesn't end up spamming the network with attempts. Some exponential try interval that caps at a retry interval maximum of roughly every 5 minutes is what I would think would be reasonable.

Revision history for this message
Chris Behrens (cbehrens) wrote :

yeah, there's some backoff built into the ensure_connection() call I'm making to kombu. I'll look at reconnecting forever.

Also appears I broke what Zed was doing. :( Going to restore the old interfaces and make this work.

Revision history for this message
Chris Behrens (cbehrens) wrote :

Restores most of the original interfaces with only slight changes to satisfy work Zed is doing.

This keeps carrot as the default for now. There were slight modifications to carrot done to support an interface change... as well as to handle changes to FLAG values regarding rabbit retries.

To use kombu, use --rpc_backend=nova.rpc.impl_kombu

Moved rpc tests into a common class, so that tests for the default module as well as tests for carrot and kombu can all be shared. There's a test for fanout for kombu that is SKIP'd because the 'memory' transport built into kombu that I'm using when FLAGS.fake_rabbit=True seems to be buggy. When I run the test against a real rabbit server (FLAGS.fake_rabbit=False), that test passes fine.

Please give the tests a run. They all pass for me. zedas said he might have seen a failure somewhere in his scrollback, but I'm unable to see one.

Revision history for this message
Chris Behrens (cbehrens) wrote :

heckj: I changed FLAGS.max_retries to default to 0, which means reconnect-forever now. There's also a FLAGS.rabbit_retry_backoff.. which is # of seconds to add to the delay on each cycle. FLAGS.rabbit_retry_interval is now the delay before reconnecting the 1st time.

Revision history for this message
Joseph Heck (heckj) wrote :

Noticed the kombu memory/fanout test skip - ran into that myself

lgtm

review: Approve
Revision history for this message
Vish Ishaya (vishvananda) wrote :

Still runs fine with carrot, but not with kombu:

(nova): TRACE: File "/tmp/bzr/rpc-kombu/nova/rpc/impl_kombu.py", line 325, in reconnect
(nova): TRACE: self.connection = kombu.connection.Connection(**self.params)
(nova): TRACE: AttributeError: 'module' object has no attribute 'Connection'
(nova): TRACE:
root@nova:/tmp/bzr# pip freeze | grep kombu
kombu==1.0.4

is a specific version of kombu required?

review: Needs Information
Revision history for this message
Vish Ishaya (vishvananda) wrote :

So it works fine with 1.2.1

If this isn't going to work with 1.0.4 we should:
a) update pip requires to require >=1.2
b) provide a kombu package for the ppa

Vish

Revision history for this message
Chris Behrens (cbehrens) wrote :

Should be fixed for new and old versions of kombu.

I was using kombu.connection.Connection which in at least 1.1.2 is == connection.BrokerConnection, but 1.0.4 is missing that. Reverted to using kombu.connection.BrokerConnection and it seems to work in 1.0.4 and 1.2.1 now.

Revision history for this message
Vish Ishaya (vishvananda) wrote :

Works for me too. I'm actually happy making this the default, so we make sure and get lots of people banging on it for rbp. If there is an issue we can always revert the default.

review: Approve
Revision history for this message
Chris Behrens (cbehrens) wrote :

Agree. Switched default to kombu, then.

Revision history for this message
Brian Lamar (blamar) wrote :

Ubuntu only has a package for kombu in 11.10 (oneric), unless I'm missing something. I don't know really what the policies are, but has this been taken into account?

review: Needs Information
Revision history for this message
Vish Ishaya (vishvananda) wrote :

There is a version in the ppa (since it is a requirement of glance now), so I don't know that it is a huge deal that packages aren't provided in the older distros. We have a lot of dependencies that aren't in older distros, so I don't think this needs to be a blocker.

On Aug 31, 2011, at 10:55 AM, Brian Lamar wrote:

> Review: Needs Information
> Ubuntu only has a package for kombu in 11.10 (oneric), unless I'm missing something. I don't know really what the policies are, but has this been taken into account?
> --
> https://code.launchpad.net/~cbehrens/nova/rpc-kombu/+merge/73096
> You are reviewing the proposed merge of lp:~cbehrens/nova/rpc-kombu into lp:nova.

Revision history for this message
Dan Prince (dan-prince) wrote :

Chris.

This runs great for me functionall with kombu and carrot with 1.0.4. (I'm able to boot instances, etc.)

Using python-kombu-1.1.3-1.fc15.noarch on my Fedora box I'm getting errors when running unit tests:

http://paste.openstack.org/show/2322/ (KeyError: 'ae.undeliver')

Is it possible to support both kombu 1.0 and 1.1 versions?

review: Needs Fixing
Revision history for this message
Chuck Short (zulcss) wrote :

Ubuntu has 1.0.4

Revision history for this message
Zed A. Shaw (zedshaw) wrote :

I have no problems with this patch as I worked with Chris so we could coordinate my other pending patch. The tests all pass on my side, and the sooner this is in the sooner I can finish my work.

review: Approve
Revision history for this message
Chris Behrens (cbehrens) wrote :

Dan: Hm. I just installed 1.1.3. test_rpc_kombu passes... I'm starting the run of all tests now. What's a test that was failing for you? I can't tell from the paste.

Revision history for this message
Chris Behrens (cbehrens) wrote :

Nevermind... that was quick:

FAIL: test_associate (nova.tests.api.openstack.contrib.test_security_groups.TestSecurityGroups)

Looking at it.

Revision history for this message
Chris Behrens (cbehrens) wrote :

Dan: It appears to be a bug in 1.1.3's 'memory' transport that is used for tests only. I've kludged a fix in when we're using that transport. 1.1.3 tests pass now. And 1.2.1 tests still pass with this. I'm running the 1.0.4 tests right now. Will put this back into 'needs review' if that succeeds.

Revision history for this message
Chris Behrens (cbehrens) wrote :

Ok, passes on 1.0.4, 1.1.3, and 1.2.1.

Revision history for this message
Dan Prince (dan-prince) wrote :

> Ok, passes on 1.0.4, 1.1.3, and 1.2.1.

Running tests now...

Revision history for this message
Dan Prince (dan-prince) wrote :

Looks great.

review: Approve
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :
Download full text (912.2 KiB)

The attempt to merge lp:~cbehrens/nova/rpc-kombu into lp:nova failed. Below is the output from the failed tests.

CreateserverextTest
    test_create_instance_with_duplicate_networks OK 0.30
    test_create_instance_with_duplicate_networks_xml OK 0.42
    test_create_instance_with_network_empty_fixed_ip OK 0.28
    test_create_instance_with_network_empty_fixed_ip_xml OK 0.28
    test_create_instance_with_network_invalid_id OK 0.44
    test_create_instance_with_network_invalid_id_xml OK 0.28
    test_create_instance_with_network_no_fixed_ip OK 0.29
    test_create_instance_with_network_no_fixed_ip_xml OK 0.47
    test_create_instance_with_network_no_id OK 0.28
    test_create_instance_with_network_no_id_xml OK 0.28
    test_create_instance_with_network_non_string_fixed_ip OK 0.45
    test_create_instance_with_no_networks OK 0.29
    test_create_instance_with_no_networks_xml OK 0.29
    test_create_instance_with_one_network OK 0.45
    test_create_instance_with_one_network_xml OK 0.30
    test_create_instance_with_two_networks OK 0.29
    test_create_instance_with_two_networks_xml OK 0.47
FloatingIpTest
    test_add_floating_ip_to_instance OK 0.31
    test_bad_address_param_in_add_floating_ip OK 0.30
    test_bad_address_param_in_remove_floating_ip OK 0.47
    test_floating_ip_allocate OK 0.31
    test_floating_ip_release OK 0.31
    test_floating_ip_show OK 0.47
    test_floating_ips_list OK 0.31
    test_missing_dict_param_in_add_floating_ip OK 0.45
    test_missing_dict_param_in_remove_floating_ip OK 0.31
    test_remove_floating_ip_from_instance OK 0.30
    test_translate_floating_ip_view OK 0.04
    test_translate_floating_ip_view_dict OK 0.02
KeypairsTest
    test_keypair_create OK 0.64
    test_keypair_delete OK 0.32
    test_keypair_import OK 0.37
    test_keypair_list OK 0.55
FixedIpTest
    test_add_fixed_ip OK 0.29
    test_add_fixed_ip_no_network OK 0.78
    test_remove_fixed_ip OK 0.46
    test_remove_fixed_ip_no_address OK 0.28
QuotaSetsTest
    test_format_quota_set OK 0.00
    test_quotas_defaults OK 0.29
    test_quotas_show_as_admin OK 0.45
    test_quotas_show_as_unauthorized_user OK 0.28
    test_quotas_upd...

Revision history for this message
Chris Behrens (cbehrens) wrote :

Well shoot. I'm gonna need some help to figure out why kombu couldn't be loaded. It should be on the server because of glance's dependency... and I've tested this branch using the version from the ppa.

Revision history for this message
Chris Behrens (cbehrens) wrote :

hrm. soren added python-kombu... i'm a little confused why it was not there, since nova depends on glance, which already depends on kombu.. let's try this again.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'bin/nova-ajax-console-proxy'
--- bin/nova-ajax-console-proxy 2011-08-18 17:55:39 +0000
+++ bin/nova-ajax-console-proxy 2011-08-31 18:56:22 +0000
@@ -113,11 +113,10 @@
113 AjaxConsoleProxy.tokens[kwargs['token']] = \113 AjaxConsoleProxy.tokens[kwargs['token']] = \
114 {'args': kwargs, 'last_activity': time.time()}114 {'args': kwargs, 'last_activity': time.time()}
115115
116 conn = rpc.create_connection(new=True)116 self.conn = rpc.create_connection(new=True)
117 consumer = rpc.create_consumer(117 self.conn.create_consumer(
118 conn,118 FLAGS.ajax_console_proxy_topic,
119 FLAGS.ajax_console_proxy_topic,119 TopicProxy)
120 TopicProxy)
121120
122 def delete_expired_tokens():121 def delete_expired_tokens():
123 now = time.time()122 now = time.time()
@@ -129,7 +128,7 @@
129 for k in to_delete:128 for k in to_delete:
130 del AjaxConsoleProxy.tokens[k]129 del AjaxConsoleProxy.tokens[k]
131130
132 utils.LoopingCall(consumer.fetch, enable_callbacks=True).start(0.1)131 self.conn.consume_in_thread()
133 utils.LoopingCall(delete_expired_tokens).start(1)132 utils.LoopingCall(delete_expired_tokens).start(1)
134133
135if __name__ == '__main__':134if __name__ == '__main__':
@@ -142,3 +141,4 @@
142 server = wsgi.Server("AJAX Console Proxy", acp, port=acp_port)141 server = wsgi.Server("AJAX Console Proxy", acp, port=acp_port)
143 service.serve(server)142 service.serve(server)
144 service.wait()143 service.wait()
144 self.conn.close()
145145
=== modified file 'contrib/nova.sh'
--- contrib/nova.sh 2011-08-05 16:59:28 +0000
+++ contrib/nova.sh 2011-08-31 18:56:22 +0000
@@ -81,7 +81,7 @@
81 sudo apt-get install -y python-netaddr python-pastedeploy python-eventlet81 sudo apt-get install -y python-netaddr python-pastedeploy python-eventlet
82 sudo apt-get install -y python-novaclient python-glance python-cheetah82 sudo apt-get install -y python-novaclient python-glance python-cheetah
83 sudo apt-get install -y python-carrot python-tempita python-sqlalchemy83 sudo apt-get install -y python-carrot python-tempita python-sqlalchemy
84 sudo apt-get install -y python-suds84 sudo apt-get install -y python-suds python-kombu
8585
8686
87 if [ "$USE_IPV6" == 1 ]; then87 if [ "$USE_IPV6" == 1 ]; then
8888
=== modified file 'nova/flags.py'
--- nova/flags.py 2011-08-24 23:48:04 +0000
+++ nova/flags.py 2011-08-31 18:56:22 +0000
@@ -303,8 +303,12 @@
303DEFINE_string('rabbit_userid', 'guest', 'rabbit userid')303DEFINE_string('rabbit_userid', 'guest', 'rabbit userid')
304DEFINE_string('rabbit_password', 'guest', 'rabbit password')304DEFINE_string('rabbit_password', 'guest', 'rabbit password')
305DEFINE_string('rabbit_virtual_host', '/', 'rabbit virtual host')305DEFINE_string('rabbit_virtual_host', '/', 'rabbit virtual host')
306DEFINE_integer('rabbit_retry_interval', 10, 'rabbit connection retry interval')306DEFINE_integer('rabbit_retry_interval', 1,
307DEFINE_integer('rabbit_max_retries', 12, 'rabbit connection attempts')307 &#x