Merge lp:~edouardb/cloud-init/scaleway-datasource into lp:~cloud-init-dev/cloud-init/trunk
Status: | Rejected |
---|---|
Rejected by: | Scott Moser |
Proposed branch: | lp:~edouardb/cloud-init/scaleway-datasource |
Merge into: | lp:~cloud-init-dev/cloud-init/trunk |
Diff against target: |
443 lines (+414/-2) 3 files modified
cloudinit/sources/DataSourceScaleway.py (+216/-0) cloudinit/url_helper.py (+5/-2) tests/unittests/test_datasource/test_scaleway.py (+193/-0) |
To merge this branch: | bzr merge lp:~edouardb/cloud-init/scaleway-datasource |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
cloud-init Commiters | Pending | ||
Review via email: mp+274861@code.launchpad.net |
Commit message
Description of the change
Add Datasource for Scaleway's metadata service
Julien Castets (jcastets) wrote : | # |
Scott Moser (smoser) wrote : | # |
Hey,
This looks well done, thanks.
A couple comments
a.) we'll need some unit tests to ensure we don't inadvertently break this.
b.) is there some way (any way) we can detect if we're on scaleway? As it is right now, it looks like we're just going to block and retry for the availability of the MD. That is much less than ideal, and the only current "on by default" datasource that does that is EC2 (which only gets that privilege from being first). Other vendors provide some dmi data or another quick local test.
c.) you'll need to sign the Canonical Contributors License Agreement (http://
d.) vendor-data would be nice (and helpful to you as the operator of the cloud).
again, though. Thanks, it looks really good.
Feel free to ping in #cloud-init if you have questions.
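For illustration, the kind of quick local test mentioned in (b) could look like the sketch below. It reads a DMI field from the standard Linux sysfs directory without touching the network. The helper names and the vendor string passed to `looks_like_vendor` are hypothetical; nothing in this thread says Scaleway exposes such a DMI value.

```python
import os

DMI_DIR = "/sys/class/dmi/id"  # standard Linux sysfs location for DMI fields


def read_dmi_field(name, dmi_dir=DMI_DIR):
    """Return the stripped contents of a DMI field, or None if unreadable."""
    try:
        with open(os.path.join(dmi_dir, name)) as fh:
            return fh.read().strip()
    except OSError:
        # Field missing or not readable: treat as "no information".
        return None


def looks_like_vendor(expected, dmi_dir=DMI_DIR):
    """Case-insensitive substring match against the sys_vendor field.

    `expected` is a hypothetical vendor string supplied by the caller.
    """
    value = read_dmi_field("sys_vendor", dmi_dir)
    return value is not None and expected.lower() in value.lower()
```

A check of this shape returns immediately on any platform, which is why it is preferable to blocking on an HTTP endpoint.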
Julien Castets (jcastets) wrote : | # |
Hi,
a) Done
b) Done. Unfortunately, there's no way to ensure you're running on Scaleway without hitting a network resource
c) Done
d) Indeed. Can we consider adding them later?
Scott Moser (smoser) wrote : | # |
Julien,
Thanks.
for 'd', sure you can add vendor-data later.
if it's not in the cloud provider anyway, not much use for cloud-init to support it.
The big issue is just 'b'. We can't enable by default without a solid way of knowing that we should hit an http source that might hang indefinitely.
I'll review shortly.
Julien Castets (jcastets) wrote : | # |
Great, thanks :) Waiting for your review then.
Scott Moser (smoser) wrote : | # |
two nitpicks. but this looks good other than failing tests.
b
Manfred Touron (moul) wrote : | # |
To check if you are on scaleway:
$ test -f /run/oc-
0
This file is populated when something fetches the api metadata.
Our initrd (https:/
Scott Moser (smoser) wrote : | # |
Hi Eduardo,
Looking at this again.
Could you please sign the contributors agreement? Please feel free to contact me if you have any questions (freenode 'smoser'). http://
Second, i think the check for /run/oc-
Scott Moser (smoser) wrote : | # |
Hello,
Thank you for taking the time to contribute to cloud-init. Cloud-init has moved its revision control system to git. As a result, we are marking all bzr merge proposals as 'rejected'. If you would like to re-submit this proposal for review, please do so by following the current HACKING documentation at http://
Unmerged revisions
- 1154. By Edouard Bonlieu <email address hidden>
-
Merge jcastets PR
- 1153. By Edouard Bonlieu <email address hidden>
-
Pass userdata url and retries as parameters
- 1152. By Edouard Bonlieu <email address hidden>
-
Add a check to ensure we are on Scaleway
- 1151. By Edouard Bonlieu <email address hidden>
-
Add Datasource for Scaleway's metadata service (https://www.scaleway.com)
- 1150. By Edouard Bonlieu <email address hidden>
-
Add optional session parameter to readurl
Preview Diff
1 | === added file 'cloudinit/sources/DataSourceScaleway.py' |
2 | --- cloudinit/sources/DataSourceScaleway.py 1970-01-01 00:00:00 +0000 |
3 | +++ cloudinit/sources/DataSourceScaleway.py 2015-10-28 09:51:17 +0000 |
4 | @@ -0,0 +1,216 @@ |
5 | +# vi: ts=4 expandtab |
6 | +# |
7 | +# Author: Edouard Bonlieu <ebonlieu@ocs.online.net> |
8 | +# |
9 | +# This program is free software: you can redistribute it and/or modify |
10 | +# it under the terms of the GNU General Public License version 3, as |
11 | +# published by the Free Software Foundation. |
12 | +# |
13 | +# This program is distributed in the hope that it will be useful, |
14 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
15 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
16 | +# GNU General Public License for more details. |
17 | +# |
18 | +# You should have received a copy of the GNU General Public License |
19 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
20 | + |
21 | +import functools |
22 | +import errno |
23 | +import json |
24 | +import time |
25 | + |
26 | +from requests.packages.urllib3.poolmanager import PoolManager |
27 | +import requests |
28 | + |
29 | +from cloudinit import log as logging |
30 | +from cloudinit import sources |
31 | +from cloudinit import url_helper |
32 | +from cloudinit import util |
33 | + |
34 | + |
35 | +LOG = logging.getLogger(__name__) |
36 | + |
37 | +BUILTIN_DS_CONFIG = { |
38 | + 'metadata_url': 'http://169.254.42.42/conf?format=json', |
39 | + 'userdata_url': 'http://169.254.42.42/user_data/cloud-init' |
40 | +} |
41 | + |
42 | +DEF_MD_RETRIES = 5 |
43 | +DEF_MD_TIMEOUT = 10 |
44 | + |
45 | + |
46 | +def on_scaleway(user_data_url, retries=5): |
47 | + """ Check if we are on Scaleway. |
48 | + |
49 | + If Scaleway's user-data API isn't queried from a privileged source port |
50 | + (ie. below 1024), it returns HTTP/403. |
51 | + """ |
52 | + for _ in range(retries): |
53 | + try: |
54 | + code = requests.head(user_data_url).status_code |
55 | + if code not in (403, 429) and code < 500: |
56 | + return False |
57 | + if code == 403: |
58 | + return True |
59 | + except (requests.exceptions.ConnectionError, |
60 | + requests.exceptions.Timeout): |
61 | + return False |
62 | + |
63 | + time.sleep(1) # be nice, and wait a bit before retrying |
64 | + return False |
65 | + |
66 | + |
67 | +class SourceAddressAdapter(requests.adapters.HTTPAdapter): |
68 | + """ Adapter for requests to choose the local address to bind to. |
69 | + """ |
70 | + |
71 | + def __init__(self, source_address, **kwargs): |
72 | + self.source_address = source_address |
73 | + super(SourceAddressAdapter, self).__init__(**kwargs) |
74 | + |
75 | + def init_poolmanager(self, connections, maxsize, block=False): |
76 | + self.poolmanager = PoolManager(num_pools=connections, |
77 | + maxsize=maxsize, |
78 | + block=block, |
79 | + source_address=self.source_address) |
80 | + |
81 | + |
82 | +def _get_user_data(userdata_address, timeout, retries, session): |
83 | + """ Retrieve user data. |
84 | + |
85 | + Scaleway userdata API returns HTTP/404 if user data is not set. |
86 | + |
87 | + This function wraps `url_helper.readurl` but instead of considering |
88 | + HTTP/404 as an error that requires a retry, it considers it as empty user |
89 | + data. |
90 | + |
91 | + Also, the user data API requires the source port to be below 1024. If |
92 | + requests raises ConnectionError (EADDRINUSE), we raise immediately instead |
93 | + of retrying. This way, the caller can retry this function on another |
94 | + port. |
95 | + """ |
96 | + try: |
97 | + # exception_cb is used to re-raise the exception if the API responds |
98 | + # HTTP/404. |
99 | + resp = url_helper.readurl( |
100 | + userdata_address, |
101 | + data=None, |
102 | + timeout=timeout, |
103 | + retries=retries, |
104 | + session=session, |
105 | + exception_cb=lambda _, exc: exc.code == 404 or isinstance( |
106 | + exc.cause, requests.exceptions.ConnectionError |
107 | + ) |
108 | + ) |
109 | + return util.decode_binary(resp.contents) |
110 | + except url_helper.UrlError as exc: |
111 | + # Empty user data |
112 | + if exc.code == 404: |
113 | + return None |
114 | + |
115 | + # `retries` is reached, re-raise |
116 | + raise |
117 | + |
118 | + |
119 | +class DataSourceScaleway(sources.DataSource): |
120 | + |
121 | + def __init__(self, sys_cfg, distro, paths): |
122 | + LOG.debug('Init scw') |
123 | + sources.DataSource.__init__(self, sys_cfg, distro, paths) |
124 | + |
125 | + self.metadata = {} |
126 | + self.ds_cfg = util.mergemanydict([ |
127 | + util.get_cfg_by_path(sys_cfg, ["datasource", "Scaleway"], {}), |
128 | + BUILTIN_DS_CONFIG |
129 | + ]) |
130 | + |
131 | + self.metadata_address = self.ds_cfg['metadata_url'] |
132 | + self.userdata_address = self.ds_cfg['userdata_url'] |
133 | + |
134 | + self.retries = self.ds_cfg.get('retries', DEF_MD_RETRIES) |
135 | + self.timeout = self.ds_cfg.get('timeout', DEF_MD_TIMEOUT) |
136 | + |
137 | + def _get_metadata(self): |
138 | + resp = url_helper.readurl( |
139 | + self.metadata_address, |
140 | + timeout=self.timeout, |
141 | + retries=self.retries |
142 | + ) |
143 | + metadata = json.loads(util.decode_binary(resp.contents)) |
144 | + LOG.debug('metadata downloaded') |
145 | + |
146 | + # try to make a request on the first privileged port available |
147 | + for port in range(1, 1024): |
148 | + try: |
149 | + LOG.debug( |
150 | + 'Trying to get user data (bind on port %d)...' % port |
151 | + ) |
152 | + session = requests.Session() |
153 | + session.mount( |
154 | + 'http://', |
155 | + SourceAddressAdapter(source_address=('0.0.0.0', port)) |
156 | + ) |
157 | + user_data = _get_user_data( |
158 | + self.userdata_address, |
159 | + timeout=self.timeout, |
160 | + retries=self.retries, |
161 | + session=session |
162 | + ) |
163 | + LOG.debug('user-data downloaded') |
164 | + return metadata, user_data |
165 | + |
166 | + except url_helper.UrlError as exc: |
167 | + # local port already in use, try next port |
168 | + if isinstance(exc.cause, requests.exceptions.ConnectionError): |
169 | + continue |
170 | + |
171 | + # unexpected exception |
172 | + raise |
173 | + |
174 | + def get_data(self): |
175 | + if on_scaleway(self.ds_cfg['userdata_url'], self.retries) is False: |
176 | + return False |
177 | + |
178 | + metadata, metadata['user-data'] = self._get_metadata() |
179 | + self.metadata = { |
180 | + 'id': metadata['id'], |
181 | + 'hostname': metadata['hostname'], |
182 | + 'user-data': metadata['user-data'], |
183 | + 'ssh_public_keys': [ |
184 | + key['key'] for key in metadata['ssh_public_keys'] |
185 | + ] |
186 | + } |
187 | + return True |
188 | + |
189 | + @property |
190 | + def launch_index(self): |
191 | + return None |
192 | + |
193 | + def get_instance_id(self): |
194 | + return self.metadata['id'] |
195 | + |
196 | + def get_public_ssh_keys(self): |
197 | + return self.metadata['ssh_public_keys'] |
198 | + |
199 | + def get_hostname(self, fqdn=False, resolve_ip=False): |
200 | + return self.metadata['hostname'] |
201 | + |
202 | + def get_userdata_raw(self): |
203 | + return self.metadata['user-data'] |
204 | + |
205 | + @property |
206 | + def availability_zone(self): |
207 | + return None |
208 | + |
209 | + @property |
210 | + def region(self): |
211 | + return None |
212 | + |
213 | + |
214 | +datasources = [ |
215 | + (DataSourceScaleway, (sources.DEP_FILESYSTEM, sources.DEP_NETWORK)), |
216 | +] |
217 | + |
218 | + |
219 | +def get_datasource_list(depends): |
220 | + return sources.list_from_depends(depends, datasources) |
221 | |
222 | === modified file 'cloudinit/url_helper.py' |
223 | --- cloudinit/url_helper.py 2015-09-29 21:17:49 +0000 |
224 | +++ cloudinit/url_helper.py 2015-10-28 09:51:17 +0000 |
225 | @@ -183,7 +183,8 @@ |
226 | |
227 | def readurl(url, data=None, timeout=None, retries=0, sec_between=1, |
228 | headers=None, headers_cb=None, ssl_details=None, |
229 | - check_status=True, allow_redirects=True, exception_cb=None): |
230 | + check_status=True, allow_redirects=True, exception_cb=None, |
231 | + session=None): |
232 | url = _cleanurl(url) |
233 | req_args = { |
234 | 'url': url, |
235 | @@ -242,7 +243,9 @@ |
236 | LOG.debug("[%s/%s] open '%s' with %s configuration", i, |
237 | manual_tries, url, filtered_req_args) |
238 | |
239 | - r = requests.request(**req_args) |
240 | + if session is None: |
241 | + session = requests.Session() |
242 | + r = session.request(**req_args) |
243 | if check_status: |
244 | r.raise_for_status() |
245 | LOG.debug("Read from %s (%s, %sb) after %s attempts", url, |
246 | |
247 | === added file 'tests/unittests/test_datasource/test_scaleway.py' |
248 | --- tests/unittests/test_datasource/test_scaleway.py 1970-01-01 00:00:00 +0000 |
249 | +++ tests/unittests/test_datasource/test_scaleway.py 2015-10-28 09:51:17 +0000 |
250 | @@ -0,0 +1,193 @@ |
251 | +# |
252 | +# Copyright (C) 2015 Julien Castets |
253 | +# |
254 | +# Author: Julien Castets <castets.j@gmail.com> |
255 | +# |
256 | +# This program is free software: you can redistribute it and/or modify |
257 | +# it under the terms of the GNU General Public License version 3, as |
258 | +# published by the Free Software Foundation. |
259 | +# |
260 | +# This program is distributed in the hope that it will be useful, |
261 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
262 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
263 | +# GNU General Public License for more details. |
264 | +# |
265 | +# You should have received a copy of the GNU General Public License |
266 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
267 | + |
268 | +import json |
269 | + |
270 | +import requests |
271 | + |
272 | +from cloudinit import settings |
273 | +from cloudinit import helpers |
274 | +from cloudinit.sources import DataSourceScaleway |
275 | + |
276 | +from .. import helpers as test_helpers |
277 | + |
278 | + |
279 | +httpretty = test_helpers.import_httpretty() |
280 | + |
281 | + |
282 | +class UserDataResponses(object): |
283 | + """ Possible responses of the API endpoint |
284 | + 169.254.42.42/user_data/cloud-init. |
285 | + |
286 | + HEAD requests are made to check if the server is on Scaleway. |
287 | + GET requests are made to get user data. |
288 | + """ |
289 | + |
290 | + FAKE_USER_DATA = '#!/bin/bash\necho "user-data"' |
291 | + |
292 | + @staticmethod |
293 | + def head_ok(method, uri, headers): |
294 | + """ To ensure it's running on Scaleway, the datasource makes a HEAD |
295 | + request to 169.254.42.42/user_data/cloud-init and expects a HTTP/403 |
296 | + response, because this endpoint needs to be queried with a privileged |
297 | + source port (below 1024). |
298 | + """ |
299 | + return 403, headers, '' |
300 | + |
301 | + @staticmethod |
302 | + def connection_error(method, uri, headers): |
303 | + """ Unable to connect to the user data API. |
304 | + """ |
305 | + raise requests.exceptions.ConnectionError() |
306 | + |
307 | + @staticmethod |
308 | + def rate_limited(method, uri, headers): |
309 | + return 429, headers, '' |
310 | + |
311 | + @staticmethod |
312 | + def api_error(method, uri, headers): |
313 | + return 500, headers, '' |
314 | + |
315 | + @classmethod |
316 | + def get_ok(cls, method, uri, headers): |
317 | + return 200, headers, cls.FAKE_USER_DATA |
318 | + |
319 | + @staticmethod |
320 | + def empty(method, uri, headers): |
321 | + """ No user data for this server. |
322 | + """ |
323 | + return 404, headers, '' |
324 | + |
325 | + |
326 | +class MetadataResponses(object): |
327 | + """ Possible responses of the metadata API. |
328 | + """ |
329 | + |
330 | + FAKE_METADATA = { |
331 | + 'id': '00000000-0000-0000-0000-000000000000', |
332 | + 'hostname': 'scaleway.host', |
333 | + 'ssh_public_keys': [{ |
334 | + 'key': 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABA', |
335 | + 'fingerprint': '2048 06:ae:... login (RSA)' |
336 | + }, { |
337 | + 'key': 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABCCCCC', |
338 | + 'fingerprint': '2048 06:ff:... login2 (RSA)' |
339 | + }] |
340 | + } |
341 | + |
342 | + @classmethod |
343 | + def get_ok(cls, method, uri, headers): |
344 | + return 200, headers, json.dumps(cls.FAKE_METADATA) |
345 | + |
346 | + |
347 | +class TestDataSourceScaleway(test_helpers.HttprettyTestCase): |
348 | + |
349 | + def setUp(self): |
350 | + self.datasource = DataSourceScaleway.DataSourceScaleway( |
351 | + settings.CFG_BUILTIN, None, helpers.Paths({}) |
352 | + ) |
353 | + super(TestDataSourceScaleway, self).setUp() |
354 | + |
355 | + self.metadata_url = \ |
356 | + DataSourceScaleway.BUILTIN_DS_CONFIG['metadata_url'] |
357 | + self.userdata_url = \ |
358 | + DataSourceScaleway.BUILTIN_DS_CONFIG['userdata_url'] |
359 | + |
360 | + @httpretty.activate |
361 | + @test_helpers.mock.patch('time.sleep', return_value=None) |
362 | + def test_on_scaleway(self, sleep): |
363 | + # Test API ok |
364 | + httpretty.register_uri(httpretty.HEAD, self.userdata_url, |
365 | + body=UserDataResponses.head_ok) |
366 | + self.assertTrue(DataSourceScaleway.on_scaleway(self.userdata_url)) |
367 | + |
368 | + # API returns something else than 403: we're not on scaleway |
369 | + httpretty.register_uri(httpretty.HEAD, self.userdata_url, |
370 | + body='ok') |
371 | + self.assertFalse(DataSourceScaleway.on_scaleway(self.userdata_url)) |
372 | + |
373 | + # Connection error |
374 | + httpretty.register_uri(httpretty.HEAD, self.userdata_url, |
375 | + body=UserDataResponses.connection_error) |
376 | + self.assertFalse(DataSourceScaleway.on_scaleway(self.userdata_url)) |
377 | + |
378 | + # Rate limited 2 times, then API error, then ok |
379 | + httpretty.register_uri( |
380 | + httpretty.HEAD, self.userdata_url, |
381 | + responses=[ |
382 | + httpretty.Response(body=UserDataResponses.rate_limited), |
383 | + httpretty.Response(body=UserDataResponses.rate_limited), |
384 | + httpretty.Response(body=UserDataResponses.api_error), |
385 | + httpretty.Response(body=UserDataResponses.head_ok), |
386 | + ] |
387 | + ) |
388 | + self.assertTrue(DataSourceScaleway.on_scaleway(self.userdata_url)) |
389 | + self.assertEqual(sleep.call_count, 3) |
390 | + |
391 | + @httpretty.activate |
392 | + @test_helpers.mock.patch('time.sleep', return_value=None) |
393 | + def test_metadata(self, sleep): |
394 | + # Not on scaleway |
395 | + httpretty.register_uri(httpretty.HEAD, self.userdata_url, |
396 | + body=UserDataResponses.connection_error) |
397 | + self.assertFalse(self.datasource.get_data()) |
398 | + |
399 | + # Make on_scaleway return true |
400 | + httpretty.register_uri(httpretty.HEAD, self.userdata_url, |
401 | + body=UserDataResponses.head_ok) |
402 | + |
403 | + # Make user data API return a valid response |
404 | + httpretty.register_uri(httpretty.GET, self.metadata_url, |
405 | + body=MetadataResponses.get_ok) |
406 | + httpretty.register_uri(httpretty.GET, self.userdata_url, |
407 | + body=UserDataResponses.get_ok) |
408 | + self.datasource.get_data() |
409 | + |
410 | + self.assertEqual(self.datasource.get_instance_id(), |
411 | + MetadataResponses.FAKE_METADATA['id']) |
412 | + self.assertEqual(self.datasource.get_public_ssh_keys(), [ |
413 | + elem['key'] for elem in |
414 | + MetadataResponses.FAKE_METADATA['ssh_public_keys'] |
415 | + ]) |
416 | + self.assertEqual(self.datasource.get_hostname(), |
417 | + MetadataResponses.FAKE_METADATA['hostname']) |
418 | + self.assertEqual(self.datasource.get_userdata_raw(), |
419 | + UserDataResponses.FAKE_USER_DATA) |
420 | + self.assertIsNone(self.datasource.availability_zone) |
421 | + self.assertIsNone(self.datasource.region) |
422 | + |
423 | + # Make user data API return HTTP/404, which means there is no user data |
424 | + # for the server. |
425 | + httpretty.register_uri(httpretty.GET, self.userdata_url, |
426 | + body=UserDataResponses.empty) |
427 | + self.datasource.get_data() |
428 | + self.assertIsNone(self.datasource.get_userdata_raw()) |
429 | + |
430 | + # Make user data API rate limit 2 times, then ConnectionError (ie. |
431 | + # local port is used), then API ok |
432 | + httpretty.register_uri( |
433 | + httpretty.GET, self.userdata_url, |
434 | + responses=[ |
435 | + httpretty.Response(body=UserDataResponses.rate_limited), |
436 | + httpretty.Response(body=UserDataResponses.rate_limited), |
437 | + httpretty.Response(body=UserDataResponses.connection_error), |
438 | + httpretty.Response(body=UserDataResponses.get_ok), |
439 | + ] |
440 | + ) |
441 | + self.datasource.get_data() |
442 | + self.assertEqual(self.datasource.get_userdata_raw(), |
443 | + UserDataResponses.FAKE_USER_DATA) |
Unlike other providers, the Scaleway user-data API is restricted to privileged source ports (< 1024) to prevent non-root users from accessing it.
We added a new parameter to readurl to pass in the requests session object, so the request can be bound to a specific source port.
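The source-port idea can be sketched with the standard library alone. This is a minimal illustration, not the code in the branch (which mounts a requests adapter on a session): it tries each candidate local port in turn, skips ports that are already in use, and returns the first connection that succeeds. The function name `connect_from_first_free_port` is hypothetical.

```python
import errno
import socket


def connect_from_first_free_port(dest, ports, timeout=5.0):
    """Connect to `dest`, binding the local end to the first usable port.

    Ports already in use (EADDRINUSE) are skipped; any other error is raised.
    """
    for port in ports:
        try:
            # source_address is (host, port); "" leaves the local host
            # address up to the OS while fixing the local port.
            return socket.create_connection(
                dest, timeout=timeout, source_address=("", port))
        except OSError as exc:
            if exc.errno != errno.EADDRINUSE:
                raise
    raise OSError(errno.EADDRINUSE, "no free source port in the given range")
```

On an instance this would be called with `range(1, 1024)` as `ports`. Binding below 1024 typically requires root on Linux, so an unprivileged caller would normally see a permission error from the first attempt, which this sketch re-raises rather than retrying.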