Merge lp:~chad.smith/landscape-client/avoid-metadata-pycurl-traceback into lp:~landscape/landscape-client/trunk
- avoid-metadata-pycurl-traceback
- Merge into trunk
Status: | Merged |
---|---|
Approved by: | Chad Smith |
Approved revision: | 728 |
Merged at revision: | 719 |
Proposed branch: | lp:~chad.smith/landscape-client/avoid-metadata-pycurl-traceback |
Merge into: | lp:~landscape/landscape-client/trunk |
Diff against target: |
446 lines (+258/-92) 4 files modified
landscape/lib/cloud.py (+45/-0) landscape/lib/tests/test_cloud.py (+107/-0) landscape/monitor/computerinfo.py (+28/-51) landscape/monitor/tests/test_computerinfo.py (+78/-41) |
To merge this branch: | bzr merge lp:~chad.smith/landscape-client/avoid-metadata-pycurl-traceback |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Chris Glass (community) | Approve | ||
Free Ekanayaka (community) | Approve | ||
Review via email: mp+183279@code.launchpad.net |
Commit message
Avoid pycurl tracebacks from missing EC2 API meta-data calls. Metadata queries allow 3 retries before giving up and logging warnings.
Description of the change
This branch reduces the noise of a metadata logs in a couple of ways:
1. catch pycurl "host can't be contacted" error so that it doesn't read like a traceback.
2. Remove the discrete "Queueing metadata at url ..../(instance-
3. keep internal bool flag _check_cloud to prevent repetitive calls to _fetch_
-- is not a cloud instance (e.g. LXC or physical server)
-- ls-client has already obtained the EC2 metadata (because instance-id, ami-id and instance-type don't change except across restarts)
Chad Smith (chad.smith) wrote : | # |
Free, I'm still working the refactor of some of the fetch_cloud_
Free Ekanayaka (free.ekanayaka) wrote : | # |
I'm not totally sure encapsulating retries is a good idea, since you'll want some delay between subsequent attempts and that would be a bottle neck for calling code (the computer info plugin). I believe the current retry logic in the computer info plugin is good enough (we don't have any other use case and don't foresee any).
Chad Smith (chad.smith) wrote : | # |
Hi Free,
I think I have addressed your review comments. I have now separated the fetch_ec2_meta_data functions out into their own very simple cloud.py library with local unit tests.
I have left a number of computerinfo meta-data tests intact as they represented integration tests using the results of fetch_ec2_
Free Ekanayaka (free.ekanayaka) wrote : | # |
Nice work Chad, a few other minor points, but looks good. +1!
[4]
+def fetch_ec2_
Since this is an internal helper, not relevant to cloud.py consumers would you please make it private? And move it to the bottom of the file (you can still keep the unit tests and just mark them as wb, e.g. test_wb_
See also:
https:/
[5]
+def fetch_ec2_
Please document the fetch parameter and mention that it's there for testing purposes.
[6]
+ def _unicode_
+ if value is None:
+ return None
+ else:
+ return value.decode(
This could be simply:
def _unicode_
if value is not None:
I'd also rename it to unicode_or_none (no need to make it private since it's local scope).
[7]
+ logging.
It's generally better to keep logging functionality outside of library functions, this grants a bit more flexibility to the calling code, that can decide what, when and how to log (and indeed computerinfo.py is where the rest of the logging is done). This is minor, so feel free to address it or not.
[8]
+METADATA_RETRY_MAX = 3 # Number of retries to get EC2 meta-data
This should probably be moved to computerinfo.py, since it's where it's used (while cloud.py ignores it).
[9]
+ else:
+ logging.warning(
+ "Temporary failure accessing cloud meta-data, retrying.")
There's no need to log retries I think, especially because non-cloud clients will always get this message.
[10]
+ log_failure(
+ error, msg=(
+ "Max retries reached querying meta-data. %s" %
+ error.getErrorM
I don't think it should be logged as an error, as this just fine for non-cloud clients. I'd simply use logging.info with a message like "No cloud meta-data available.".
- 726. By Chad Smith
-
address free's remaining comments:
- fetch_ec2_item made private
- METADATA_RETRY_MAX moved from lib/cloud into computerinfo.py
- drop retry logging message and handle all logging on computerinfo
- metadata query failure is not a real failure, just info message for non-cloud computers
Chris Glass (tribaal) wrote : | # |
Looks good! +1 with a couple of refactoring/
[1]
I feel the term "cloud" is a little overused everywhere, making things less obvious that they could be. In the future I suppose we will add meta data for more than just EC2, so how about:
- Renaming _fetch_
- Changing the logs to say "EC2" instead of "Cloud" in the aforementioned method
[2]
I would like to keep the message content gathering and creation logic in _create_
+ if (self._
+ self._cloud_retries < METADATA_
+ self._cloud_
to _create_
If this is done, I'm not certain it's useful to keep self._cloud_
Then in the future we can then easily extend the functionality by adding _fetch_
Chad Smith (chad.smith) wrote : | # |
thanks Chris
[1] addressed
agree with point [2] addressed, though we still need the persistent self._cloud_
- 727. By Chad Smith
-
fetch_cloud_
meta_data -> fetch_ec2_meta_data - 728. By Chad Smith
-
shuffle METADATA_RETRY_MAX logic and fetch_ec2_metadata from send_computer_
message and into _create_ computer_ message
Preview Diff
1 | === added file 'landscape/lib/cloud.py' |
2 | --- landscape/lib/cloud.py 1970-01-01 00:00:00 +0000 |
3 | +++ landscape/lib/cloud.py 2013-09-08 12:39:26 +0000 |
4 | @@ -0,0 +1,45 @@ |
5 | +from landscape.lib.fetch import fetch_async |
6 | + |
7 | +EC2_HOST = "169.254.169.254" |
8 | +EC2_API = "http://%s/latest" % (EC2_HOST,) |
9 | + |
10 | + |
11 | +def fetch_ec2_meta_data(fetch=None): |
12 | + """Fetch EC2 information about the cloud instance. |
13 | + |
14 | + The C{fetch} parameter provided above is non-mocker testing purposes. |
15 | + """ |
16 | + cloud_data = [] |
17 | + # We're not using a DeferredList here because we want to keep the |
18 | + # number of connections to the backend minimal. See lp:567515. |
19 | + deferred = _fetch_ec2_item("instance-id", cloud_data, fetch) |
20 | + deferred.addCallback( |
21 | + lambda ignore: _fetch_ec2_item("instance-type", cloud_data, fetch)) |
22 | + deferred.addCallback( |
23 | + lambda ignore: _fetch_ec2_item("ami-id", cloud_data, fetch)) |
24 | + |
25 | + def return_result(ignore): |
26 | + """Record the instance data returned by the EC2 API.""" |
27 | + |
28 | + def _unicode_or_none(value): |
29 | + if value is not None: |
30 | + return value.decode("utf-8") |
31 | + |
32 | + (instance_id, instance_type, ami_id) = cloud_data |
33 | + return { |
34 | + "instance-id": _unicode_or_none(instance_id), |
35 | + "ami-id": _unicode_or_none(ami_id), |
36 | + "instance-type": _unicode_or_none(instance_type)} |
37 | + deferred.addCallback(return_result) |
38 | + return deferred |
39 | + |
40 | + |
41 | +def _fetch_ec2_item(path, accumulate, fetch=None): |
42 | + """ |
43 | + Get data at C{path} on the EC2 API endpoint, and add the result to the |
44 | + C{accumulate} list. The C{fetch} parameter is provided for testing only. |
45 | + """ |
46 | + url = EC2_API + "/meta-data/" + path |
47 | + if fetch is None: |
48 | + fetch = fetch_async |
49 | + return fetch(url).addCallback(accumulate.append) |
50 | |
51 | === added file 'landscape/lib/tests/test_cloud.py' |
52 | --- landscape/lib/tests/test_cloud.py 1970-01-01 00:00:00 +0000 |
53 | +++ landscape/lib/tests/test_cloud.py 2013-09-08 12:39:26 +0000 |
54 | @@ -0,0 +1,107 @@ |
55 | +from landscape.lib.cloud import (EC2_API, _fetch_ec2_item, fetch_ec2_meta_data) |
56 | +from landscape.lib.fetch import HTTPCodeError, PyCurlError |
57 | +from landscape.tests.helpers import LandscapeTest |
58 | +from twisted.internet.defer import succeed, fail |
59 | + |
60 | + |
61 | +class CloudTest(LandscapeTest): |
62 | + |
63 | + def setUp(self): |
64 | + LandscapeTest.setUp(self) |
65 | + self.query_results = {} |
66 | + |
67 | + def fetch_stub(url): |
68 | + value = self.query_results[url] |
69 | + if isinstance(value, Exception): |
70 | + return fail(value) |
71 | + else: |
72 | + return succeed(value) |
73 | + |
74 | + self.fetch_func = fetch_stub |
75 | + self.add_query_result("instance-id", "i00001") |
76 | + self.add_query_result("ami-id", "ami-00002") |
77 | + self.add_query_result("instance-type", "hs1.8xlarge") |
78 | + |
79 | + def add_query_result(self, name, value): |
80 | + """ |
81 | + Add a url to self.query_results that is then available through |
82 | + self.fetch_func. |
83 | + """ |
84 | + url = "%s/meta-data/%s" % (EC2_API, name) |
85 | + self.query_results[url] = value |
86 | + |
87 | + def test_fetch_ec2_meta_data_error_on_any_item_error(self): |
88 | + """ |
89 | + L{_fetch_ec2_meta_data} returns a deferred C{Failure} containing the |
90 | + error message when an error occurs on any of the queried meta-data |
91 | + items C{instance-id}, C{ami-id} or C{instance-type}. |
92 | + """ |
93 | + self.log_helper.ignore_errors(HTTPCodeError) |
94 | + error = HTTPCodeError(404, "notfound") |
95 | + metadata_items = ["instance-id", "ami-id", "instance-type"] |
96 | + for item in metadata_items: |
97 | + # reset all item data adding the error to only 1 item per iteration |
98 | + for setup_item in metadata_items: |
99 | + if setup_item == item: |
100 | + self.add_query_result(item, error) |
101 | + else: |
102 | + self.add_query_result(setup_item, "value%s" % setup_item) |
103 | + |
104 | + deferred = fetch_ec2_meta_data(fetch=self.fetch_func) |
105 | + failure = self.failureResultOf(deferred) |
106 | + self.assertEqual( |
107 | + "Server returned HTTP code 404", |
108 | + failure.getErrorMessage()) |
109 | + |
110 | + def test_fetch_ec2_meta_data(self): |
111 | + """ |
112 | + L{_fetch_ec2_meta_data} returns a C{dict} containing meta-data for |
113 | + C{instance-id}, C{ami-id} and C{instance-type}. |
114 | + """ |
115 | + deferred = fetch_ec2_meta_data(fetch=self.fetch_func) |
116 | + result = self.successResultOf(deferred) |
117 | + self.assertEqual( |
118 | + {"ami-id": u"ami-00002", |
119 | + "instance-id": u"i00001", |
120 | + "instance-type": u"hs1.8xlarge"}, |
121 | + result) |
122 | + |
123 | + def test_fetch_ec2_meta_data_utf8(self): |
124 | + """ |
125 | + L{_fetch_ec2_meta_data} decodes utf-8 strings returned from the |
126 | + external service. |
127 | + """ |
128 | + self.add_query_result("ami-id", "asdf\xe1\x88\xb4") |
129 | + deferred = fetch_ec2_meta_data(fetch=self.fetch_func) |
130 | + result = self.successResultOf(deferred) |
131 | + self.assertEqual({"instance-id": u"i00001", |
132 | + "ami-id": u"asdf\u1234", |
133 | + "instance-type": u"hs1.8xlarge"}, |
134 | + result) |
135 | + |
136 | + def test_wb_fetch_ec2_item_multiple_items_appends_accumulate_list(self): |
137 | + """ |
138 | + L{_fetch_ec2_item} retrieves individual meta-data items from the |
139 | + EC2 api and appends them to the C{list} provided by the C{accumulate} |
140 | + parameter. |
141 | + """ |
142 | + accumulate = [] |
143 | + self.successResultOf( |
144 | + _fetch_ec2_item("instance-id", accumulate, fetch=self.fetch_func)) |
145 | + self.successResultOf( |
146 | + _fetch_ec2_item( |
147 | + "instance-type", accumulate, fetch=self.fetch_func)) |
148 | + self.assertEqual(["i00001", "hs1.8xlarge"], accumulate) |
149 | + |
150 | + def test_wb_fetch_ec2_item_error_returns_failure(self): |
151 | + """ |
152 | + L{_fetch_ec2_item} returns a deferred C{Failure} containing the error |
153 | + message when faced with no EC2 cloud API service. |
154 | + """ |
155 | + self.log_helper.ignore_errors(PyCurlError) |
156 | + self.add_query_result("other-id", PyCurlError(60, "pycurl error")) |
157 | + accumulate = [] |
158 | + deferred = _fetch_ec2_item( |
159 | + "other-id", accumulate, fetch=self.fetch_func) |
160 | + failure = self.failureResultOf(deferred) |
161 | + self.assertEqual("Error 60: pycurl error", failure.getErrorMessage()) |
162 | |
163 | === modified file 'landscape/monitor/computerinfo.py' |
164 | --- landscape/monitor/computerinfo.py 2013-08-23 21:06:26 +0000 |
165 | +++ landscape/monitor/computerinfo.py 2013-09-08 12:39:26 +0000 |
166 | @@ -1,16 +1,15 @@ |
167 | import os |
168 | import logging |
169 | -from twisted.internet.defer import inlineCallbacks |
170 | +from twisted.internet.defer import inlineCallbacks, returnValue |
171 | |
172 | from landscape.lib.fetch import fetch_async |
173 | from landscape.lib.fs import read_file |
174 | -from landscape.lib.log import log_failure |
175 | from landscape.lib.lsb_release import LSB_RELEASE_FILENAME, parse_lsb_release |
176 | +from landscape.lib.cloud import fetch_ec2_meta_data |
177 | from landscape.lib.network import get_fqdn |
178 | from landscape.monitor.plugin import MonitorPlugin |
179 | |
180 | -EC2_HOST = "169.254.169.254" |
181 | -EC2_API = "http://%s/latest" % (EC2_HOST,) |
182 | +METADATA_RETRY_MAX = 3 # Number of retries to get EC2 meta-data |
183 | |
184 | |
185 | class DistributionInfoError(Exception): |
186 | @@ -32,6 +31,7 @@ |
187 | self._lsb_release_filename = lsb_release_filename |
188 | self._root_path = root_path |
189 | self._cloud_meta_data = None |
190 | + self._cloud_retries = 0 |
191 | self._fetch_async = fetch_async |
192 | |
193 | def register(self, registry): |
194 | @@ -44,10 +44,7 @@ |
195 | |
196 | @inlineCallbacks |
197 | def send_computer_message(self, urgent=False): |
198 | - if self._cloud_meta_data is None: |
199 | - self._cloud_meta_data = yield self._fetch_cloud_meta_data() |
200 | - |
201 | - message = self._create_computer_info_message() |
202 | + message = yield self._create_computer_info_message() |
203 | if message: |
204 | message["type"] = "computer-info" |
205 | logging.info("Queueing message with updated computer info.") |
206 | @@ -69,6 +66,7 @@ |
207 | broker.call_if_accepted("distribution-info", |
208 | self.send_distribution_message, urgent) |
209 | |
210 | + @inlineCallbacks |
211 | def _create_computer_info_message(self): |
212 | message = {} |
213 | self._add_if_new(message, "hostname", |
214 | @@ -83,12 +81,16 @@ |
215 | meta_data[key] = read_file( |
216 | os.path.join(self._meta_data_path, key)) |
217 | |
218 | + if (self._cloud_meta_data is None and |
219 | + self._cloud_retries < METADATA_RETRY_MAX): |
220 | + self._cloud_meta_data = yield self._fetch_ec2_meta_data() |
221 | + |
222 | if self._cloud_meta_data: |
223 | meta_data = dict( |
224 | meta_data.items() + self._cloud_meta_data.items()) |
225 | if meta_data: |
226 | self._add_if_new(message, "meta-data", meta_data) |
227 | - return message |
228 | + returnValue(message) |
229 | |
230 | def _add_if_new(self, message, key, value): |
231 | if value != self._persist.get(key): |
232 | @@ -122,47 +124,22 @@ |
233 | message.update(parse_lsb_release(self._lsb_release_filename)) |
234 | return message |
235 | |
236 | - def _fetch_data(self, path, accumulate): |
237 | - """ |
238 | - Get data at C{path} on the EC2 API endpoint, and add the result to the |
239 | - C{accumulate} list. |
240 | - """ |
241 | - url = EC2_API + "/meta-data/" + path |
242 | - logging.info("Queueing url fetch %s." % url) |
243 | - return self._fetch_async(url).addCallback(accumulate.append) |
244 | - |
245 | - def _fetch_cloud_meta_data(self): |
246 | + def _fetch_ec2_meta_data(self): |
247 | """Fetch information about the cloud instance.""" |
248 | - cloud_data = [] |
249 | - # We're not using a DeferredList here because we want to keep the |
250 | - # number of connections to the backend minimal. See lp:567515. |
251 | - deferred = self._fetch_data("instance-id", cloud_data) |
252 | - deferred.addCallback( |
253 | - lambda ignore: |
254 | - self._fetch_data("instance-type", cloud_data)) |
255 | - deferred.addCallback( |
256 | - lambda ignore: |
257 | - self._fetch_data("ami-id", cloud_data)) |
258 | - |
259 | - def store_data(ignore): |
260 | - """Record the instance data returned by the EC2 API.""" |
261 | - |
262 | - def _unicode_none(value): |
263 | - if value is None: |
264 | - return None |
265 | - else: |
266 | - return value.decode("utf-8") |
267 | - |
268 | - (instance_id, instance_type, ami_id) = cloud_data |
269 | - return { |
270 | - "instance-id": _unicode_none(instance_id), |
271 | - "instance-type": _unicode_none(instance_type), |
272 | - "ami-id": _unicode_none(ami_id)} |
273 | - |
274 | - def log_error(error): |
275 | - log_failure(error, msg="Got error while fetching meta-data: %r" |
276 | - % (error.value,)) |
277 | - |
278 | - deferred.addCallback(store_data) |
279 | - deferred.addErrback(log_error) |
280 | + if self._cloud_retries == 0: |
281 | + logging.info("Querying cloud meta-data.") |
282 | + deferred = fetch_ec2_meta_data(self._fetch_async) |
283 | + |
284 | + def log_no_meta_data_found(error): |
285 | + self._cloud_retries += 1 |
286 | + if self._cloud_retries >= METADATA_RETRY_MAX: |
287 | + logging.info("No cloud meta-data available. %s" % |
288 | + error.getErrorMessage()) |
289 | + |
290 | + def log_success(result): |
291 | + logging.info("Acquired cloud meta-data.") |
292 | + return result |
293 | + |
294 | + deferred.addCallback(log_success) |
295 | + deferred.addErrback(log_no_meta_data_found) |
296 | return deferred |
297 | |
298 | === modified file 'landscape/monitor/tests/test_computerinfo.py' |
299 | --- landscape/monitor/tests/test_computerinfo.py 2013-08-23 21:06:26 +0000 |
300 | +++ landscape/monitor/tests/test_computerinfo.py 2013-09-08 12:39:26 +0000 |
301 | @@ -3,9 +3,9 @@ |
302 | |
303 | from twisted.internet.defer import succeed, fail, inlineCallbacks |
304 | |
305 | -from landscape.lib.fetch import HTTPCodeError |
306 | +from landscape.lib.fetch import HTTPCodeError, PyCurlError |
307 | from landscape.lib.fs import create_file |
308 | -from landscape.monitor.computerinfo import ComputerInfo |
309 | +from landscape.monitor.computerinfo import ComputerInfo, METADATA_RETRY_MAX |
310 | from landscape.tests.helpers import LandscapeTest, MonitorHelper |
311 | from landscape.tests.mocker import ANY |
312 | |
313 | @@ -405,9 +405,6 @@ |
314 | def test_with_cloud_info(self): |
315 | """Fetch cloud information""" |
316 | self.config.cloud = True |
317 | - self.add_query_result("instance-id", "i00001") |
318 | - self.add_query_result("ami-id", "ami-00002") |
319 | - self.add_query_result("instance-type", "hs1.8xlarge") |
320 | self.mstore.set_accepted_types(["computer-info"]) |
321 | |
322 | plugin = ComputerInfo(fetch_async=self.fetch_func) |
323 | @@ -421,47 +418,87 @@ |
324 | "instance-type": u"hs1.8xlarge"}, |
325 | messages[0]["meta-data"]) |
326 | |
327 | + def test_no_fetch_ec2_meta_data_when_cloud_retries_is_max(self): |
328 | + """ |
329 | + Do not fetch EC2 info when C{_cloud_retries} is C{METADATA_RETRY_MAX} |
330 | + """ |
331 | + self.config.cloud = True |
332 | + self.mstore.set_accepted_types(["computer-info"]) |
333 | + |
334 | + plugin = ComputerInfo(fetch_async=self.fetch_func) |
335 | + plugin._cloud_retries = METADATA_RETRY_MAX |
336 | + self.monitor.add(plugin) |
337 | + plugin.exchange() |
338 | + messages = self.mstore.get_pending_messages() |
339 | + self.assertEqual(1, len(messages)) |
340 | + self.assertNotIn("meta-data", messages[0]) |
341 | + |
342 | @inlineCallbacks |
343 | - def test_fetch_cloud_meta_data(self): |
344 | + def test_fetch_ec2_meta_data(self): |
345 | """ |
346 | - L{_fetch_cloud_meta_data} retrieves instance information from the |
347 | + L{_fetch_ec2_meta_data} retrieves instance information from the |
348 | EC2 api. |
349 | """ |
350 | - self.add_query_result("instance-id", "i00001") |
351 | - self.add_query_result("ami-id", "ami-00002") |
352 | - self.add_query_result("instance-type", "hs1.8xlarge") |
353 | - |
354 | plugin = ComputerInfo(fetch_async=self.fetch_func) |
355 | - result = yield plugin._fetch_cloud_meta_data() |
356 | + result = yield plugin._fetch_ec2_meta_data() |
357 | self.assertEqual({"instance-id": u"i00001", "ami-id": u"ami-00002", |
358 | "instance-type": u"hs1.8xlarge"}, result) |
359 | - |
360 | - @inlineCallbacks |
361 | - def test_fetch_cloud_meta_data_bad_result(self): |
362 | - """ |
363 | - L{_fetch_cloud_meta_data} returns C{None} when faced with errors from |
364 | - the EC2 api. |
365 | - """ |
366 | - self.log_helper.ignore_errors(HTTPCodeError) |
367 | - self.add_query_result("instance-id", "i7337") |
368 | - self.add_query_result("ami-id", HTTPCodeError(404, "notfound")) |
369 | - self.add_query_result("instance-type", "hs1.8xlarge") |
370 | - plugin = ComputerInfo(fetch_async=self.fetch_func) |
371 | - result = yield plugin._fetch_cloud_meta_data() |
372 | - self.assertEqual(None, result) |
373 | - |
374 | - @inlineCallbacks |
375 | - def test_fetch_cloud_meta_data_utf8(self): |
376 | - """ |
377 | - L{_fetch_cloud_meta_data} decodes utf-8 strings returned from the |
378 | - external service. |
379 | - """ |
380 | - self.add_query_result("instance-id", "i00001") |
381 | - self.add_query_result("ami-id", "asdf\xe1\x88\xb4") |
382 | - self.add_query_result("instance-type", "m1.large") |
383 | - plugin = ComputerInfo(fetch_async=self.fetch_func) |
384 | - result = yield plugin._fetch_cloud_meta_data() |
385 | - self.assertEqual({"instance-id": u"i00001", |
386 | - "ami-id": u"asdf\u1234", |
387 | - "instance-type": u"m1.large"}, |
388 | + self.assertEqual( |
389 | + " INFO: Querying cloud meta-data.\n" |
390 | + " INFO: Acquired cloud meta-data.\n", |
391 | + self.logfile.getvalue()) |
392 | + |
393 | + @inlineCallbacks |
394 | + def test_fetch_ec2_meta_data_no_cloud_api_max_retry(self): |
395 | + """ |
396 | + L{_fetch_ec2_meta_data} returns C{None} when faced with no EC2 cloud |
397 | + API service and reports the specific C{PyCurlError} upon message |
398 | + exchange when L{_cloud_retries} equals C{METADATA_RETRY_MAX}. |
399 | + """ |
400 | + self.log_helper.ignore_errors(PyCurlError) |
401 | + self.add_query_result("instance-id", PyCurlError(60, "pycurl error")) |
402 | + plugin = ComputerInfo(fetch_async=self.fetch_func) |
403 | + plugin._cloud_retries = METADATA_RETRY_MAX |
404 | + result = yield plugin._fetch_ec2_meta_data() |
405 | + self.assertIn( |
406 | + "INFO: No cloud meta-data available. " |
407 | + "Error 60: pycurl error\n", self.logfile.getvalue()) |
408 | + self.assertEqual(None, result) |
409 | + |
410 | + @inlineCallbacks |
411 | + def test_fetch_ec2_meta_data_bad_result_max_retry(self): |
412 | + """ |
413 | + L{_fetch_ec2_meta_data} returns C{None} and logs an error when |
414 | + crossing the retry threshold C{METADATA_RETRY_MAX}. |
415 | + """ |
416 | + self.log_helper.ignore_errors(HTTPCodeError) |
417 | + self.add_query_result("ami-id", HTTPCodeError(404, "notfound")) |
418 | + plugin = ComputerInfo(fetch_async=self.fetch_func) |
419 | + plugin._cloud_retries = METADATA_RETRY_MAX |
420 | + result = yield plugin._fetch_ec2_meta_data() |
421 | + self.assertIn( |
422 | + "INFO: No cloud meta-data available. Server returned " |
423 | + "HTTP code 404", |
424 | + self.logfile.getvalue()) |
425 | + self.assertEqual(None, result) |
426 | + |
427 | + @inlineCallbacks |
428 | + def test_fetch_ec2_meta_data_bad_result_retry(self): |
429 | + """ |
430 | + L{_fetch_ec2_meta_data} returns C{None} when faced with spurious |
431 | + errors from the EC2 api. The method increments L{_cloud_retries} |
432 | + counter which allows L{_fetch_ec2_meta_data} to run again next |
433 | + message exchange. |
434 | + """ |
435 | + self.log_helper.ignore_errors(HTTPCodeError) |
436 | + self.add_query_result("ami-id", HTTPCodeError(404, "notfound")) |
437 | + plugin = ComputerInfo(fetch_async=self.fetch_func) |
438 | + result = yield plugin._fetch_ec2_meta_data() |
439 | + self.assertEqual(1, plugin._cloud_retries) |
440 | + self.assertEqual(None, result) |
441 | + # Fix the error condition for the retry. |
442 | + self.add_query_result("ami-id", "ami-00002") |
443 | + result = yield plugin._fetch_ec2_meta_data() |
444 | + self.assertEqual({"instance-id": u"i00001", "ami-id": u"ami-00002", |
445 | + "instance-type": u"hs1.8xlarge"}, |
446 | result) |
Marking as needs fixing because of [1], thanks.
[0]
landscape/ monitor/ computerinfo. py:145: 17: E126 continuation line over-indented for hanging indent monitor/ computerinfo. py:148: 17: E126 continuation line over-indented for hanging indent
landscape/
[1]
+ if error.check( PyCurlError) :
I don't think there's a truly reliable way to discern if a fetch error is due to the client not being run in a cloud instance or to some other transient network error.
I propose that the code retries up to 3 times no matter what, then gives up.
[2]
+ self._check_cloud = True
+ self._cloud_retries = 0
Assuming that [1] gets fixed, we can drop self._check_cloud
def __init__(self, ...)
self._ cloud_retries = 0
...
...
def send_computer_ message( self, urgent=False): meta_data is None and self._cloud_retries < 3:
self. _cloud_ retries += 1
self. _cloud_ meta_data = yield self._fetch_ cloud_meta_ data()
if self._cloud_
fetch_cloud_ meta_data should just return None in case any failure occurs.
[3]
Not related to this branch, but I re-raise a point I had made when this code was first put up for review. It'd be good to factor out ComputerInfo. _fetch_ cloud_meta_ data into a standalone fetch_cloud_ meta_data function (perhaps living in landscape. lib.cloud) . This has the following advantages:
- it makes it possible to test fetch_cloud_ meta_data in isolation
- it makes ComputerInfo info simpler, and it avoids the temptation of spreading stateful information around, like this branch is doing by updating self._check_cloud and self._cloud_retries inside ComputerInfo. _fetch_ cloud_meta_ data
As a general note, keeping state generally makes code more complicated, so prefer passing data around, which is more explicit (you know exactly what the context is), and if keeping state is absolutely necessary (like with the retries count), try to keep all uses of such state close to each others (like checking and updating in the same spot).