Merge lp:~julian-edwards/maas/import-fail-bug-1386914 into lp:~maas-committers/maas/trunk
- import-fail-bug-1386914
- Merge into trunk
Status: | Merged |
---|---|
Approved by: | Julian Edwards |
Approved revision: | no longer in the source branch. |
Merged at revision: | 3329 |
Proposed branch: | lp:~julian-edwards/maas/import-fail-bug-1386914 |
Merge into: | lp:~maas-committers/maas/trunk |
Diff against target: |
427 lines (+108/-83) 9 files modified
src/maasserver/bootresources.py (+4/-0) src/maasserver/bootsources.py (+28/-40) src/maasserver/enum.py (+1/-0) src/maasserver/eventloop.py (+0/-6) src/maasserver/models/bootsource.py (+2/-2) src/maasserver/models/tests/test_bootsource.py (+3/-2) src/maasserver/tests/test_bootresources.py (+20/-16) src/maasserver/tests/test_bootsources.py (+50/-7) src/maasserver/tests/test_eventloop.py (+0/-10) |
To merge this branch: | bzr merge lp:~julian-edwards/maas/import-fail-bug-1386914 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Blake Rouse (community) | Approve | ||
Review via email: mp+240091@code.launchpad.net |
Commit message
Put a persistent error message on the web pages if the image importer is unable to connect to any of the defined boot sources. Previously, the import service just crashed and the user was unaware.
Description of the change
I've deliberately only caught errors from when descriptions are downloaded as this is basically where URLs are verified, which is the cause of the original bug.
We may decide to later extend this validation to more parts of the code, but this is a 1.7 backport candidate so keeping it short.
Blake Rouse (blake-rouse) wrote : | # |
Julian Edwards (julian-edwards) wrote : | # |
On Thursday 30 Oct 2014 15:46:23 you wrote:
> There is an issue with this.
>
> The cache is only updated every hour. If an update occurs and it fails it
> will show this error for a whole hour, until it tries again. The import
> process does not update the cache its completely seperate, it doesn't even
> us the cache, the cache is for the UI only.
Ok this is useful info, thanks.
> Only way I can think around this is to update the cache before each import,
> if not then this error message can exist while it has imported and
> completed successfully.
That seems like something I'd expect to happen anyway, so great. Can you
point me at the relevant area that needs changing?
Also would you please consider a technical overview write-up of this whole
import process? It seems that a few of us don't fully understand the changes
you made.
J
Blake Rouse (blake-rouse) wrote : | # |
> On Thursday 30 Oct 2014 15:46:23 you wrote:
> > There is an issue with this.
> >
> > The cache is only updated every hour. If an update occurs and it fails it
> > will show this error for a whole hour, until it tries again. The import
> > process does not update the cache its completely seperate, it doesn't even
> > us the cache, the cache is for the UI only.
>
> Ok this is useful info, thanks.
>
> > Only way I can think around this is to update the cache before each import,
> > if not then this error message can exist while it has imported and
> > completed successfully.
>
> That seems like something I'd expect to happen anyway, so great. Can you
> point me at the relevant area that needs changing?
You would need to start the update in maasserver/
>
> Also would you please consider a technical overview write-up of this whole
> import process? It seems that a few of us don't fully understand the changes
> you made.
I will work on adding a doc in the dev section to give an overview of how it all works.
>
> J
Julian Edwards (julian-edwards) wrote : | # |
I've made changes as suggested and it all seems to work well on my test rig. I'm also catching IOError — Simplestreams has zero documentation about what exceptions it raises and I didn't see that it raises this one as well, depending on the error.
Please have another look.
Julian Edwards (julian-edwards) wrote : | # |
Forgot to say, I renamed:
_cache_
cache_
The latter is still required because a post-save signal causes it to run, and it cannot run synchronously in that case.
Blake Rouse (blake-rouse) wrote : | # |
Other then the error message, that I commented on inline. Looks good.
Julian Edwards (julian-edwards) wrote : | # |
Thanks for the (re-)review!
On Monday 03 November 2014 18:47:14 you wrote:
> > + sources)
> > + except (IOError, ConnectionError) as e:
> > + source_
> > + "Failed to import images from boot source %s: %s"
> > % ( + source.url, unicode(e)))
>
> This isn't importing images, its just downloading metadata.
That used to be the case, but it's now a prerequisite to the actual download,
so the error message is pretty accurate in that it's failed to download the
image.
It would look a little weird to say "failed to download image descriptions" or
the like, most users have no idea what those are unless they're familiar with
simplestreams.
Preview Diff
1 | === modified file 'src/maasserver/bootresources.py' |
2 | --- src/maasserver/bootresources.py 2014-10-30 18:24:50 +0000 |
3 | +++ src/maasserver/bootresources.py 2014-11-03 03:01:27 +0000 |
4 | @@ -39,6 +39,7 @@ |
5 | from django.shortcuts import get_object_or_404 |
6 | from maasserver import locks |
7 | from maasserver.bootsources import ( |
8 | + cache_boot_sources, |
9 | ensure_boot_source_definition, |
10 | get_boot_sources, |
11 | ) |
12 | @@ -883,6 +884,9 @@ |
13 | maaslog.debug("Skipping import as another import is already running.") |
14 | return |
15 | |
16 | + # Keep the descriptions cache up-to-date. |
17 | + cache_boot_sources() |
18 | + |
19 | # If we're not being forced, don't sync unless we've already done it once |
20 | # before, i.e. we've been asked to explicitly sync by a user. |
21 | if not force and not has_synced_resources(): |
22 | |
23 | === modified file 'src/maasserver/bootsources.py' |
24 | --- src/maasserver/bootsources.py 2014-09-30 20:22:58 +0000 |
25 | +++ src/maasserver/bootsources.py 2014-11-03 03:01:27 +0000 |
26 | @@ -17,11 +17,15 @@ |
27 | "get_boot_sources", |
28 | "get_os_info_from_boot_sources", |
29 | "cache_boot_sources", |
30 | - "BootSourceCacheService", |
31 | + "cache_boot_sources_in_thread", |
32 | ] |
33 | |
34 | - |
35 | from maasserver import locks |
36 | +from maasserver.components import ( |
37 | + discard_persistent_error, |
38 | + register_persistent_error, |
39 | + ) |
40 | +from maasserver.enum import COMPONENT |
41 | from maasserver.models import ( |
42 | BootSource, |
43 | BootSourceCache, |
44 | @@ -37,10 +41,9 @@ |
45 | from provisioningserver.logger import get_maas_logger |
46 | from provisioningserver.utils.env import environment_variables |
47 | from provisioningserver.utils.fs import tempdir |
48 | -from twisted.application.internet import TimerService |
49 | +from requests.exceptions import ConnectionError |
50 | from twisted.internet import reactor |
51 | from twisted.internet.threads import deferToThread |
52 | -from twisted.python import log |
53 | |
54 | |
55 | maaslog = get_maas_logger("bootsources") |
56 | @@ -101,12 +104,14 @@ |
57 | |
58 | |
59 | @transactional |
60 | -def _cache_boot_sources(): |
61 | +def cache_boot_sources(): |
62 | """Cache all image information in boot sources.""" |
63 | # If the lock is already held, then cache is already running. |
64 | if locks.cache_sources.is_locked(): |
65 | return |
66 | |
67 | + source_errors = [] |
68 | + |
69 | # Hold the lock while performing the cache |
70 | with locks.cache_sources: |
71 | env = get_simplestreams_env() |
72 | @@ -114,7 +119,14 @@ |
73 | for source in BootSource.objects.all(): |
74 | sources = write_all_keyrings( |
75 | keyrings_path, [source.to_dict_without_selections()]) |
76 | - image_descriptions = download_all_image_descriptions(sources) |
77 | + try: |
78 | + image_descriptions = download_all_image_descriptions( |
79 | + sources) |
80 | + except (IOError, ConnectionError) as e: |
81 | + source_errors.append( |
82 | + "Failed to import images from boot source %s: %s" % ( |
83 | + source.url, unicode(e))) |
84 | + continue |
85 | |
86 | # We clear the cache once the information has been retrieved, |
87 | # because if an error occurs getting the information then the |
88 | @@ -133,8 +145,15 @@ |
89 | ) |
90 | maaslog.info("Updated boot sources cache.") |
91 | |
92 | - |
93 | -def cache_boot_sources(): |
94 | + # Update the component errors while still holding the lock. |
95 | + if len(source_errors) > 0: |
96 | + register_persistent_error( |
97 | + COMPONENT.REGION_IMAGE_IMPORT, "\n".join(source_errors)) |
98 | + else: |
99 | + discard_persistent_error(COMPONENT.REGION_IMAGE_IMPORT) |
100 | + |
101 | + |
102 | +def cache_boot_sources_in_thread(): |
103 | """Starts the caching of image information in boot sources. |
104 | |
105 | Note: This function returns immediately. It only starts the process, it |
106 | @@ -144,35 +163,4 @@ |
107 | # "OperationalError: could not serialize access due to concurrent update" |
108 | # The wait will make sure the transaction that started the update will |
109 | # have finished and been committed. |
110 | - reactor.callLater(1, deferToThread, _cache_boot_sources) |
111 | - |
112 | - |
113 | -class BootSourceCacheService(TimerService, object): |
114 | - """Service to periodically cache boot source information. |
115 | - |
116 | - This will run immediately when it's started, then once again every hour, |
117 | - though the interval can be overridden by passing it to the constructor. |
118 | - """ |
119 | - |
120 | - def __init__(self, interval=(60 * 60)): |
121 | - super(BootSourceCacheService, self).__init__( |
122 | - interval, self.try_cache_boot_sources) |
123 | - |
124 | - def try_cache_boot_sources(self): |
125 | - """Attempt to cache boot sources. |
126 | - |
127 | - Log errors on failure, but do not propagate them up; that will |
128 | - stop the timed loop from running. |
129 | - """ |
130 | - |
131 | - def cache_boot_sources_failed(failure): |
132 | - # Log the error in full to the Twisted log. |
133 | - log.err(failure) |
134 | - # Log something concise to the MAAS log. |
135 | - maaslog.error( |
136 | - "Failed to update boot source cache: %s", |
137 | - failure.getErrorMessage()) |
138 | - |
139 | - d = deferToThread(_cache_boot_sources) |
140 | - d.addErrback(cache_boot_sources_failed) |
141 | - return d |
142 | + reactor.callLater(1, deferToThread, cache_boot_sources) |
143 | |
144 | === modified file 'src/maasserver/enum.py' |
145 | --- src/maasserver/enum.py 2014-10-03 10:42:35 +0000 |
146 | +++ src/maasserver/enum.py 2014-11-03 03:01:27 +0000 |
147 | @@ -42,6 +42,7 @@ |
148 | PSERV = 'provisioning server' |
149 | IMPORT_PXE_FILES = 'maas-import-pxe-files script' |
150 | CLUSTERS = 'clusters' |
151 | + REGION_IMAGE_IMPORT = 'Image importer' |
152 | |
153 | |
154 | class NODE_STATUS: |
155 | |
156 | === modified file 'src/maasserver/eventloop.py' |
157 | --- src/maasserver/eventloop.py 2014-09-26 20:19:01 +0000 |
158 | +++ src/maasserver/eventloop.py 2014-11-03 03:01:27 +0000 |
159 | @@ -164,11 +164,6 @@ |
160 | return bootresources.ImportResourcesService() |
161 | |
162 | |
163 | -def make_BootSourceCacheService(): |
164 | - from maasserver import bootsources |
165 | - return bootsources.BootSourceCacheService() |
166 | - |
167 | - |
168 | class RegionEventLoop: |
169 | """An event loop running in a region controller process. |
170 | |
171 | @@ -194,7 +189,6 @@ |
172 | ("rpc-advertise", make_RegionAdvertisingService), |
173 | ("nonce-cleanup", make_NonceCleanupService), |
174 | ("import-resources", make_ImportResourcesService), |
175 | - ("cache-sources", make_BootSourceCacheService), |
176 | ) |
177 | |
178 | def __init__(self): |
179 | |
180 | === modified file 'src/maasserver/models/bootsource.py' |
181 | --- src/maasserver/models/bootsource.py 2014-09-30 19:48:33 +0000 |
182 | +++ src/maasserver/models/bootsource.py 2014-11-03 03:01:27 +0000 |
183 | @@ -106,7 +106,7 @@ |
184 | """Update the `BootSourceCache` with information for the updated |
185 | source.""" |
186 | # Avoid circular import |
187 | - from maasserver.bootsources import cache_boot_sources |
188 | - cache_boot_sources() |
189 | + from maasserver.bootsources import cache_boot_sources_in_thread |
190 | + cache_boot_sources_in_thread() |
191 | |
192 | post_save.connect(update_boot_source_cache, BootSource) |
193 | |
194 | === modified file 'src/maasserver/models/tests/test_bootsource.py' |
195 | --- src/maasserver/models/tests/test_bootsource.py 2014-09-30 19:48:33 +0000 |
196 | +++ src/maasserver/models/tests/test_bootsource.py 2014-11-03 03:01:27 +0000 |
197 | @@ -17,7 +17,7 @@ |
198 | import os |
199 | |
200 | from django.core.exceptions import ValidationError |
201 | -from maasserver.bootsources import _cache_boot_sources |
202 | +from maasserver.bootsources import cache_boot_sources |
203 | from maasserver.models import BootSource |
204 | from maasserver.testing.factory import factory |
205 | from maasserver.testing.testcase import MAASServerTestCase |
206 | @@ -104,4 +104,5 @@ |
207 | url="http://test.test/", keyring_data=b"1234") |
208 | self.assertThat( |
209 | mock_callLater, |
210 | - MockCalledOnceWith(1, deferToThread, _cache_boot_sources)) |
211 | + MockCalledOnceWith( |
212 | + 1, deferToThread, cache_boot_sources)) |
213 | |
214 | === modified file 'src/maasserver/tests/test_bootresources.py' |
215 | --- src/maasserver/tests/test_bootresources.py 2014-10-30 18:24:50 +0000 |
216 | +++ src/maasserver/tests/test_bootresources.py 2014-11-03 03:01:27 +0000 |
217 | @@ -1030,7 +1030,7 @@ |
218 | # lock is already held. |
219 | self.assertThat(has_synced_resources, MockNotCalled()) |
220 | |
221 | - def test__import_resources_exists_early_without_force(self): |
222 | + def test__import_resources_exits_early_without_force(self): |
223 | has_synced_resources = self.patch( |
224 | bootresources, "has_synced_resources") |
225 | bootresources._import_resources(force=False) |
226 | @@ -1059,33 +1059,37 @@ |
227 | self.assertFalse(bootresources.locks.import_images.is_locked()) |
228 | |
229 | def test__import_resources_calls_functions_with_correct_parameters(self): |
230 | - fake_write_all_keyrings = self.patch( |
231 | + cache_boot_sources = self.patch( |
232 | + bootresources, 'cache_boot_sources') |
233 | + write_all_keyrings = self.patch( |
234 | bootresources, 'write_all_keyrings') |
235 | - fake_write_all_keyrings.return_value = [sentinel.source] |
236 | - fake_image_descriptions = self.patch( |
237 | + write_all_keyrings.return_value = [sentinel.source] |
238 | + image_descriptions = self.patch( |
239 | bootresources, 'download_all_image_descriptions') |
240 | descriptions = Mock() |
241 | descriptions.is_empty.return_value = False |
242 | - fake_image_descriptions.return_value = descriptions |
243 | - fake_map_products = self.patch( |
244 | + image_descriptions.return_value = descriptions |
245 | + map_products = self.patch( |
246 | bootresources, 'map_products') |
247 | - fake_map_products.return_value = sentinel.mapping |
248 | - fake_download_all_boot_resources = self.patch( |
249 | + map_products.return_value = sentinel.mapping |
250 | + download_all_boot_resources = self.patch( |
251 | bootresources, 'download_all_boot_resources') |
252 | |
253 | bootresources._import_resources(force=True) |
254 | |
255 | - self.assertThat( |
256 | - fake_write_all_keyrings, |
257 | + self.expectThat( |
258 | + cache_boot_sources, MockCalledOnceWith()) |
259 | + self.expectThat( |
260 | + write_all_keyrings, |
261 | MockCalledOnceWith(ANY, [])) |
262 | - self.assertThat( |
263 | - fake_image_descriptions, |
264 | + self.expectThat( |
265 | + image_descriptions, |
266 | MockCalledOnceWith([sentinel.source])) |
267 | - self.assertThat( |
268 | - fake_map_products, |
269 | + self.expectThat( |
270 | + map_products, |
271 | MockCalledOnceWith(descriptions)) |
272 | - self.assertThat( |
273 | - fake_download_all_boot_resources, |
274 | + self.expectThat( |
275 | + download_all_boot_resources, |
276 | MockCalledOnceWith([sentinel.source], sentinel.mapping)) |
277 | |
278 | def test__import_resources_has_env_GNUPGHOME_set(self): |
279 | |
280 | === modified file 'src/maasserver/tests/test_bootsources.py' |
281 | --- src/maasserver/tests/test_bootsources.py 2014-09-30 19:48:33 +0000 |
282 | +++ src/maasserver/tests/test_bootsources.py 2014-11-03 03:01:27 +0000 |
283 | @@ -18,12 +18,17 @@ |
284 | |
285 | from maasserver import bootsources |
286 | from maasserver.bootsources import ( |
287 | - _cache_boot_sources, |
288 | cache_boot_sources, |
289 | + cache_boot_sources_in_thread, |
290 | ensure_boot_source_definition, |
291 | get_boot_sources, |
292 | get_os_info_from_boot_sources, |
293 | ) |
294 | +from maasserver.components import ( |
295 | + get_persistent_error, |
296 | + register_persistent_error, |
297 | + ) |
298 | +from maasserver.enum import COMPONENT |
299 | from maasserver.models import ( |
300 | BootSource, |
301 | BootSourceCache, |
302 | @@ -34,10 +39,14 @@ |
303 | from maasserver.testing.testcase import MAASServerTestCase |
304 | from maastesting.matchers import MockCalledOnceWith |
305 | from mock import MagicMock |
306 | +from provisioningserver.import_images import ( |
307 | + download_descriptions as download_descriptions_module, |
308 | + ) |
309 | from provisioningserver.import_images.boot_image_mapping import ( |
310 | BootImageMapping, |
311 | ) |
312 | from provisioningserver.import_images.helpers import ImageSpec |
313 | +from requests.exceptions import ConnectionError |
314 | from testtools.matchers import HasLength |
315 | from twisted.internet import reactor |
316 | from twisted.internet.threads import deferToThread |
317 | @@ -170,7 +179,7 @@ |
318 | capture = ( |
319 | patch_and_capture_env_for_download_all_image_descriptions(self)) |
320 | factory.make_BootSource(keyring_data='1234') |
321 | - _cache_boot_sources() |
322 | + cache_boot_sources() |
323 | self.assertEqual( |
324 | bootsources.get_maas_user_gpghome(), |
325 | capture.env['GNUPGHOME']) |
326 | @@ -181,7 +190,7 @@ |
327 | capture = ( |
328 | patch_and_capture_env_for_download_all_image_descriptions(self)) |
329 | factory.make_BootSource(keyring_data='1234') |
330 | - _cache_boot_sources() |
331 | + cache_boot_sources() |
332 | self.assertEqual( |
333 | (proxy_address, proxy_address), |
334 | (capture.env['http_proxy'], capture.env['http_proxy'])) |
335 | @@ -192,7 +201,7 @@ |
336 | mock_download = self.patch( |
337 | bootsources, 'download_all_image_descriptions') |
338 | mock_download.return_value = make_boot_image_mapping() |
339 | - _cache_boot_sources() |
340 | + cache_boot_sources() |
341 | self.assertEqual(0, BootSourceCache.objects.all().count()) |
342 | |
343 | def test__returns_adds_entries_to_cache_for_source(self): |
344 | @@ -204,7 +213,7 @@ |
345 | mock_download = self.patch( |
346 | bootsources, 'download_all_image_descriptions') |
347 | mock_download.return_value = make_boot_image_mapping(image_specs) |
348 | - _cache_boot_sources() |
349 | + cache_boot_sources() |
350 | cached_releases = [ |
351 | cache.release |
352 | for cache in BootSourceCache.objects.filter(boot_source=source) |
353 | @@ -213,11 +222,45 @@ |
354 | self.assertItemsEqual(releases, cached_releases) |
355 | |
356 | |
357 | +class TestBadConnectionHandling(MAASServerTestCase): |
358 | + |
359 | + def test__catches_connection_errors_and_sets_component_error(self): |
360 | + sources = [ |
361 | + factory.make_BootSource(keyring_data='1234') for _ in range(3)] |
362 | + download_image_descriptions = self.patch( |
363 | + download_descriptions_module, 'download_image_descriptions') |
364 | + error_text = factory.make_name("error_text") |
365 | + # Make two of the downloads fail. |
366 | + download_image_descriptions.side_effect = [ |
367 | + ConnectionError(error_text), |
368 | + BootImageMapping(), |
369 | + IOError(error_text), |
370 | + ] |
371 | + cache_boot_sources() |
372 | + base_error = "Failed to import images from boot source {url}: {err}" |
373 | + error_part_one = base_error.format(url=sources[0].url, err=error_text) |
374 | + error_part_two = base_error.format(url=sources[2].url, err=error_text) |
375 | + expected_error = error_part_one + '\n' + error_part_two |
376 | + actual_error = get_persistent_error(COMPONENT.REGION_IMAGE_IMPORT) |
377 | + self.assertEqual(expected_error, actual_error) |
378 | + |
379 | + def test__clears_component_error_when_successful(self): |
380 | + register_persistent_error( |
381 | + COMPONENT.REGION_IMAGE_IMPORT, factory.make_string()) |
382 | + [factory.make_BootSource(keyring_data='1234') for _ in range(3)] |
383 | + download_image_descriptions = self.patch( |
384 | + download_descriptions_module, 'download_image_descriptions') |
385 | + # Make all of the downloads successful. |
386 | + download_image_descriptions.return_value = BootImageMapping() |
387 | + cache_boot_sources() |
388 | + self.assertIsNone(get_persistent_error(COMPONENT.REGION_IMAGE_IMPORT)) |
389 | + |
390 | + |
391 | class TestCacheBootSources(MAASServerTestCase): |
392 | |
393 | def test__calls_callLater_in_reactor(self): |
394 | mock_callLater = self.patch(reactor, 'callLater') |
395 | - cache_boot_sources() |
396 | + cache_boot_sources_in_thread() |
397 | self.assertThat( |
398 | mock_callLater, |
399 | - MockCalledOnceWith(1, deferToThread, _cache_boot_sources)) |
400 | + MockCalledOnceWith(1, deferToThread, cache_boot_sources)) |
401 | |
402 | === modified file 'src/maasserver/tests/test_eventloop.py' |
403 | --- src/maasserver/tests/test_eventloop.py 2014-09-26 20:19:01 +0000 |
404 | +++ src/maasserver/tests/test_eventloop.py 2014-11-03 03:01:27 +0000 |
405 | @@ -18,7 +18,6 @@ |
406 | from django.db import connections |
407 | from maasserver import ( |
408 | bootresources, |
409 | - bootsources, |
410 | eventloop, |
411 | nonces_cleanup, |
412 | ) |
413 | @@ -147,15 +146,6 @@ |
414 | eventloop.make_ImportResourcesService, |
415 | {factory for _, factory in eventloop.loop.factories}) |
416 | |
417 | - def test_make_BootSourceCacheService(self): |
418 | - service = eventloop.make_BootSourceCacheService() |
419 | - self.assertThat(service, IsInstance( |
420 | - bootsources.BootSourceCacheService)) |
421 | - # It is registered as a factory in RegionEventLoop. |
422 | - self.assertIn( |
423 | - eventloop.make_BootSourceCacheService, |
424 | - {factory for _, factory in eventloop.loop.factories}) |
425 | - |
426 | |
427 | class TestDisablingDatabaseConnections(MAASTestCase): |
428 |
There is an issue with this.
The cache is only updated every hour. If an update occurs and it fails it will show this error for a whole hour, until it tries again. The import process does not update the cache its completely seperate, it doesn't even us the cache, the cache is for the UI only.
Only way I can think around this is to update the cache before each import, if not then this error message can exist while it has imported and completed successfully.