Merge lp:~allenap/maas/rpc-rapid-connections into lp:maas/trunk

Proposed by Gavin Panella on 2014-05-09
Status: Merged
Approved by: Gavin Panella on 2014-05-09
Approved revision: 2311
Merged at revision: 2309
Proposed branch: lp:~allenap/maas/rpc-rapid-connections
Merge into: lp:maas/trunk
Diff against target: 189 lines (+89/-46)
2 files modified
src/provisioningserver/rpc/clusterservice.py (+42/-12)
src/provisioningserver/rpc/tests/test_clusterservice.py (+47/-34)
To merge this branch: bzr merge lp:~allenap/maas/rpc-rapid-connections
Reviewer Review Type Date Requested Status
Raphaël Badin (community) 2014-05-09 Approve on 2014-05-09
Review via email: mp+219030@code.launchpad.net

Commit message

Shorten the time taken for a cluster to initially connect to the region via RPC to around 2 seconds.

Previously it could take between 30 and 90 seconds.

Description of the change

Changes the RPC client service on the cluster to more rapidly connect to the region.

To post a comment you must log in.
Raphaël Badin (rvb) wrote :

Looks good. Nice job documenting all the different code paths.

review: Approve
MAAS Lander (maas-lander) wrote :
Download full text (16.2 KiB)

The attempt to merge lp:~allenap/maas/rpc-rapid-connections into lp:maas failed. Below is the output from the failed tests.

Ign http://security.ubuntu.com trusty-security InRelease
Get:1 http://security.ubuntu.com trusty-security Release.gpg [933 B]
Get:2 http://security.ubuntu.com trusty-security Release [58.5 kB]
Ign http://nova.clouds.archive.ubuntu.com trusty InRelease
Ign http://nova.clouds.archive.ubuntu.com trusty-updates InRelease
Get:3 http://nova.clouds.archive.ubuntu.com trusty Release.gpg [933 B]
Get:4 http://nova.clouds.archive.ubuntu.com trusty-updates Release.gpg [933 B]
Get:5 http://nova.clouds.archive.ubuntu.com trusty Release [58.5 kB]
Get:6 http://nova.clouds.archive.ubuntu.com trusty-updates Release [58.5 kB]
Get:7 http://security.ubuntu.com trusty-security/main Sources [12.0 kB]
Get:8 http://security.ubuntu.com trusty-security/universe Sources [2,873 B]
Get:9 http://nova.clouds.archive.ubuntu.com trusty/main Sources [1,064 kB]
Get:10 http://security.ubuntu.com trusty-security/main amd64 Packages [37.9 kB]
Get:11 http://security.ubuntu.com trusty-security/universe amd64 Packages [13.6 kB]
Hit http://security.ubuntu.com trusty-security/main Translation-en
Get:12 http://security.ubuntu.com trusty-security/universe Translation-en [6,799 B]
Ign http://security.ubuntu.com trusty-security/main Translation-en_US
Ign http://security.ubuntu.com trusty-security/universe Translation-en_US
Get:13 http://nova.clouds.archive.ubuntu.com trusty/universe Sources [6,399 kB]
Get:14 http://nova.clouds.archive.ubuntu.com trusty/main amd64 Packages [1,350 kB]
Get:15 http://nova.clouds.archive.ubuntu.com trusty/universe amd64 Packages [5,859 kB]
Hit http://nova.clouds.archive.ubuntu.com trusty/main Translation-en
Hit http://nova.clouds.archive.ubuntu.com trusty/universe Translation-en
Get:16 http://nova.clouds.archive.ubuntu.com trusty-updates/main Sources [26.5 kB]
Get:17 http://nova.clouds.archive.ubuntu.com trusty-updates/universe Sources [16.2 kB]
Get:18 http://nova.clouds.archive.ubuntu.com trusty-updates/main amd64 Packages [66.1 kB]
Get:19 http://nova.clouds.archive.ubuntu.com trusty-updates/universe amd64 Packages [39.3 kB]
Hit http://nova.clouds.archive.ubuntu.com trusty-updates/main Translation-en
Hit http://nova.clouds.archive.ubuntu.com trusty-updates/universe Translation-en
Ign http://nova.clouds.archive.ubuntu.com trusty/main Translation-en_US
Ign http://nova.clouds.archive.ubuntu.com trusty/universe Translation-en_US
Ign http://nova.clouds.archive.ubuntu.com trusty-updates/main Translation-en_US
Ign http://nova.clouds.archive.ubuntu.com trusty-updates/universe Translation-en_US
Fetched 15.1 MB in 7s (2,046 kB/s)
Reading package lists...
sudo DEBIAN_FRONTEND=noninteractive apt-get -y \
     --no-install-recommends install apache2 bind9 bind9utils build-essential bzr-builddeb curl daemontools debhelper dh-apport distro-info dnsutils firefox freeipmi-tools ipython isc-dhcp-common libjs-raphael libjs-yui3-full libjs-yui3-min libpq-dev make postgresql python-amqplib python-bzrlib python-celery python-convoy python-crochet python-cssselect python-curtin python-dev python-distro-info python-django python-django-piston python-django-sout...

Julian Edwards (julian-edwards) wrote :

On 10/05/14 04:54, <email address hidden> wrote:
> The proposal to merge lp:~allenap/maas/rpc-rapid-connections into lp:maas has been updated.
>
> Status: Approved => Merged
>
> For more details, see:
> https://code.launchpad.net/~allenap/maas/rpc-rapid-connections/+merge/219030
>

I think this is worthy of a backport to the 1.5 branch.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/provisioningserver/rpc/clusterservice.py'
2--- src/provisioningserver/rpc/clusterservice.py 2014-03-28 15:17:34 +0000
3+++ src/provisioningserver/rpc/clusterservice.py 2014-05-09 17:01:43 +0000
4@@ -221,9 +221,13 @@
5 instances connected to it.
6 """
7
8+ INTERVAL_LOW = 2 # seconds.
9+ INTERVAL_MID = 10 # seconds.
10+ INTERVAL_HIGH = 30 # seconds.
11+
12 def __init__(self, reactor):
13 super(ClusterClientService, self).__init__(
14- self._get_random_interval(), self.update)
15+ self._calculate_interval(None, None), self.update)
16 self.connections = {}
17 self.clock = reactor
18
19@@ -248,9 +252,6 @@
20 This obtains a list of endpoints from the region then connects
21 to new ones and drops connections to those no longer used.
22 """
23- # 0. Update interval.
24- self._update_interval()
25- # 1. Obtain RPC endpoints.
26 try:
27 info_url = self._get_rpc_info_url()
28 info_page = yield getPage(info_url)
29@@ -258,9 +259,13 @@
30 eventloops = info["eventloops"]
31 yield self._update_connections(eventloops)
32 except ConnectError as error:
33+ self._update_interval(None, len(self.connections))
34 log.msg("Region not available: %s" % (error,))
35 except:
36+ self._update_interval(None, len(self.connections))
37 log.err()
38+ else:
39+ self._update_interval(len(eventloops), len(self.connections))
40
41 @staticmethod
42 def _get_rpc_info_url():
43@@ -270,14 +275,39 @@
44 url = url.geturl()
45 return ascii_url(url)
46
47- @staticmethod
48- def _get_random_interval():
49- """Return a random interval between 30 and 90 seconds."""
50- return random.randint(30, 90)
51-
52- def _update_interval(self):
53- """Change the interval randomly to avoid stampedes of clusters."""
54- self._loop.interval = self.step = self._get_random_interval()
55+ def _calculate_interval(self, num_eventloops, num_connections):
56+ """Calculate the update interval.
57+
58+ The interval is `INTERVAL_LOW` seconds when there are no
59+ connections, so that this can quickly obtain its first
60+ connection.
61+
62+ The interval changes to `INTERVAL_MID` seconds when there are
63+ some connections, but fewer than there are event-loops.
64+
65+ After that it drops back to `INTERVAL_HIGH` seconds.
66+ """
67+ if num_eventloops is None:
68+ # The region is not available; keep trying regularly.
69+ return self.INTERVAL_LOW
70+ elif num_eventloops == 0:
71+ # The region is coming up; keep trying regularly.
72+ return self.INTERVAL_LOW
73+ elif num_connections == 0:
74+ # No connections to the region; keep trying regularly.
75+ return self.INTERVAL_LOW
76+ elif num_connections < num_eventloops:
77+ # Some connections to the region, but not to all event
78+ # loops; keep updating reasonably frequently.
79+ return self.INTERVAL_MID
80+ else:
81+ # Fully connected to the region; update every so often.
82+ return self.INTERVAL_HIGH
83+
84+ def _update_interval(self, num_eventloops, num_connections):
85+ """Change the update interval."""
86+ self._loop.interval = self.step = self._calculate_interval(
87+ num_eventloops, num_connections)
88
89 @inlineCallbacks
90 def _update_connections(self, eventloops):
91
92=== modified file 'src/provisioningserver/rpc/tests/test_clusterservice.py'
93--- src/provisioningserver/rpc/tests/test_clusterservice.py 2014-03-28 04:31:32 +0000
94+++ src/provisioningserver/rpc/tests/test_clusterservice.py 2014-05-09 17:01:43 +0000
95@@ -254,40 +254,6 @@
96 observed_rpc_info_url = ClusterClientService._get_rpc_info_url()
97 self.assertThat(observed_rpc_info_url, Equals(expected_rpc_info_url))
98
99- def test__get_random_interval(self):
100- # _get_random_interval() returns a random number between 30 and
101- # 90 inclusive.
102- is_between_30_and_90_inclusive = MatchesAll(
103- MatchesAny(GreaterThan(30), Equals(30)),
104- MatchesAny(LessThan(90), Equals(90)))
105- for _ in range(100):
106- self.assertThat(
107- ClusterClientService._get_random_interval(),
108- is_between_30_and_90_inclusive)
109-
110- def test__get_random_interval_calls_into_standard_library(self):
111- # _get_random_interval() depends entirely on the standard library.
112- random = self.patch(clusterservice, "random")
113- random.randint.return_value = sentinel.randint
114- self.assertIs(
115- sentinel.randint,
116- ClusterClientService._get_random_interval())
117- self.assertThat(random.randint, MockCalledOnceWith(30, 90))
118-
119- def test__update_interval(self):
120- service = ClusterClientService(Clock())
121- # ClusterClientService's superclass, TimerService, creates a
122- # LoopingCall with now=True. We neuter it here because we only
123- # want to observe the behaviour of _update_interval().
124- service.call = (lambda: None, (), {})
125- service.startService()
126- self.assertThat(service.step, MatchesAll(
127- Equals(service._loop.interval), IsInstance(int)))
128- service.step = service._loop.interval = sentinel.undefined
129- service._update_interval()
130- self.assertThat(service.step, MatchesAll(
131- Equals(service._loop.interval), IsInstance(int)))
132-
133 def test_update_connect_error_is_logged_tersely(self):
134 getPage = self.patch(clusterservice, "getPage")
135 getPage.side_effect = error.ConnectionRefusedError()
136@@ -512,6 +478,53 @@
137 service.getClient)
138
139
140+class TestClusterClientServiceIntervals(MAASTestCase):
141+
142+ scenarios = (
143+ ("initial", {
144+ "num_eventloops": None,
145+ "num_connections": None,
146+ "expected": ClusterClientService.INTERVAL_LOW,
147+ }),
148+ ("no-event-loops", {
149+ "num_eventloops": 0,
150+ "num_connections": sentinel.undefined,
151+ "expected": ClusterClientService.INTERVAL_LOW,
152+ }),
153+ ("no-connections", {
154+ "num_eventloops": 1, # anything > 1.
155+ "num_connections": 0,
156+ "expected": ClusterClientService.INTERVAL_LOW,
157+ }),
158+ ("fewer-connections-than-event-loops", {
159+ "num_eventloops": 2, # anything > num_connections.
160+ "num_connections": 1, # anything > 0.
161+ "expected": ClusterClientService.INTERVAL_MID,
162+ }),
163+ ("default", {
164+ "num_eventloops": 3, # same as num_connections.
165+ "num_connections": 3, # same as num_eventloops.
166+ "expected": ClusterClientService.INTERVAL_HIGH,
167+ }),
168+ )
169+
170+ def make_inert_client_service(self):
171+ service = ClusterClientService(Clock())
172+ # ClusterClientService's superclass, TimerService, creates a
173+ # LoopingCall with now=True. We neuter it here to allow
174+ # observation of the behaviour of _update_interval() for
175+ # example.
176+ service.call = (lambda: None, (), {})
177+ return service
178+
179+ def test__calculate_interval(self):
180+ service = self.make_inert_client_service()
181+ service.startService()
182+ self.assertEqual(
183+ self.expected, service._calculate_interval(
184+ self.num_eventloops, self.num_connections))
185+
186+
187 class TestClusterClient(MAASTestCase):
188
189 run_tests_with = AsynchronousDeferredRunTest.make_factory(timeout=5)