Merge lp:~julian-edwards/maas/import-fail-bug-1386914 into lp:~maas-committers/maas/trunk

Proposed by Julian Edwards
Status: Merged
Approved by: Julian Edwards
Approved revision: no longer in the source branch.
Merged at revision: 3329
Proposed branch: lp:~julian-edwards/maas/import-fail-bug-1386914
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 427 lines (+108/-83)
9 files modified
src/maasserver/bootresources.py (+4/-0)
src/maasserver/bootsources.py (+28/-40)
src/maasserver/enum.py (+1/-0)
src/maasserver/eventloop.py (+0/-6)
src/maasserver/models/bootsource.py (+2/-2)
src/maasserver/models/tests/test_bootsource.py (+3/-2)
src/maasserver/tests/test_bootresources.py (+20/-16)
src/maasserver/tests/test_bootsources.py (+50/-7)
src/maasserver/tests/test_eventloop.py (+0/-10)
To merge this branch: bzr merge lp:~julian-edwards/maas/import-fail-bug-1386914
Reviewer Review Type Date Requested Status
Blake Rouse (community) Approve
Review via email: mp+240091@code.launchpad.net

Commit message

Put a persistent error message on the web pages if the image importer is unable to connect to any of the defined boot sources. Previously, the import service just crashed and the user was unaware.

Description of the change

I've deliberately only caught errors from when descriptions are downloaded as this is basically where URLs are verified, which is the cause of the original bug.

We may decide to later extend this validation to more parts of the code, but this is a 1.7 backport candidate so keeping it short.

To post a comment you must log in.
Revision history for this message
Blake Rouse (blake-rouse) wrote :

There is an issue with this.

The cache is only updated every hour. If an update occurs and it fails it will show this error for a whole hour, until it tries again. The import process does not update the cache its completely seperate, it doesn't even us the cache, the cache is for the UI only.

Only way I can think around this is to update the cache before each import, if not then this error message can exist while it has imported and completed successfully.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Thursday 30 Oct 2014 15:46:23 you wrote:
> There is an issue with this.
>
> The cache is only updated every hour. If an update occurs and it fails it
> will show this error for a whole hour, until it tries again. The import
> process does not update the cache its completely seperate, it doesn't even
> us the cache, the cache is for the UI only.

Ok this is useful info, thanks.

> Only way I can think around this is to update the cache before each import,
> if not then this error message can exist while it has imported and
> completed successfully.

That seems like something I'd expect to happen anyway, so great. Can you
point me at the relevant area that needs changing?

Also would you please consider a technical overview write-up of this whole
import process? It seems that a few of us don't fully understand the changes
you made.

J

Revision history for this message
Blake Rouse (blake-rouse) wrote :

> On Thursday 30 Oct 2014 15:46:23 you wrote:
> > There is an issue with this.
> >
> > The cache is only updated every hour. If an update occurs and it fails it
> > will show this error for a whole hour, until it tries again. The import
> > process does not update the cache its completely seperate, it doesn't even
> > us the cache, the cache is for the UI only.
>
> Ok this is useful info, thanks.
>
> > Only way I can think around this is to update the cache before each import,
> > if not then this error message can exist while it has imported and
> > completed successfully.
>
> That seems like something I'd expect to happen anyway, so great. Can you
> point me at the relevant area that needs changing?

You would need to start the update in maasserver/bootresources.py:_import_resources. If you do add this, then you probably could remove the cache updating service, since the importing is a service as well.

>
> Also would you please consider a technical overview write-up of this whole
> import process? It seems that a few of us don't fully understand the changes
> you made.

I will work on adding a doc in the dev section to give an overview of how it all works.

>
> J

Revision history for this message
Julian Edwards (julian-edwards) wrote :

I've made changes as suggested and it all seems to work well on my test rig. I'm also catching IOError — Simplestreams has zero documentation about what exceptions it raises and I didn't see that it raises this one as well, depending on the error.

Please have another look.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

Forgot to say, I renamed:

 _cache_boot_sources() → cache_boot_sources()
 cache_boot_source() → cache_boot_sources_in_thread()

The latter is still required because a post-save signal causes it to run, and it cannot run synchronously in that case.

Revision history for this message
Blake Rouse (blake-rouse) wrote :

Other then the error message, that I commented on inline. Looks good.

review: Approve
Revision history for this message
Julian Edwards (julian-edwards) wrote :

Thanks for the (re-)review!

On Monday 03 November 2014 18:47:14 you wrote:
> > + sources)
> > + except (IOError, ConnectionError) as e:
> > + source_errors.append(
> > + "Failed to import images from boot source %s: %s"
> > % ( + source.url, unicode(e)))
>
> This isn't importing images, its just downloading metadata.

That used to be the case, but it's now a prerequisite to the actual download,
so the error message is pretty accurate in that it's failed to download the
image.

It would look a little weird to say "failed to download image descriptions" or
the like, most users have no idea what those are unless they're familiar with
simplestreams.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/maasserver/bootresources.py'
2--- src/maasserver/bootresources.py 2014-10-30 18:24:50 +0000
3+++ src/maasserver/bootresources.py 2014-11-03 03:01:27 +0000
4@@ -39,6 +39,7 @@
5 from django.shortcuts import get_object_or_404
6 from maasserver import locks
7 from maasserver.bootsources import (
8+ cache_boot_sources,
9 ensure_boot_source_definition,
10 get_boot_sources,
11 )
12@@ -883,6 +884,9 @@
13 maaslog.debug("Skipping import as another import is already running.")
14 return
15
16+ # Keep the descriptions cache up-to-date.
17+ cache_boot_sources()
18+
19 # If we're not being forced, don't sync unless we've already done it once
20 # before, i.e. we've been asked to explicitly sync by a user.
21 if not force and not has_synced_resources():
22
23=== modified file 'src/maasserver/bootsources.py'
24--- src/maasserver/bootsources.py 2014-09-30 20:22:58 +0000
25+++ src/maasserver/bootsources.py 2014-11-03 03:01:27 +0000
26@@ -17,11 +17,15 @@
27 "get_boot_sources",
28 "get_os_info_from_boot_sources",
29 "cache_boot_sources",
30- "BootSourceCacheService",
31+ "cache_boot_sources_in_thread",
32 ]
33
34-
35 from maasserver import locks
36+from maasserver.components import (
37+ discard_persistent_error,
38+ register_persistent_error,
39+ )
40+from maasserver.enum import COMPONENT
41 from maasserver.models import (
42 BootSource,
43 BootSourceCache,
44@@ -37,10 +41,9 @@
45 from provisioningserver.logger import get_maas_logger
46 from provisioningserver.utils.env import environment_variables
47 from provisioningserver.utils.fs import tempdir
48-from twisted.application.internet import TimerService
49+from requests.exceptions import ConnectionError
50 from twisted.internet import reactor
51 from twisted.internet.threads import deferToThread
52-from twisted.python import log
53
54
55 maaslog = get_maas_logger("bootsources")
56@@ -101,12 +104,14 @@
57
58
59 @transactional
60-def _cache_boot_sources():
61+def cache_boot_sources():
62 """Cache all image information in boot sources."""
63 # If the lock is already held, then cache is already running.
64 if locks.cache_sources.is_locked():
65 return
66
67+ source_errors = []
68+
69 # Hold the lock while performing the cache
70 with locks.cache_sources:
71 env = get_simplestreams_env()
72@@ -114,7 +119,14 @@
73 for source in BootSource.objects.all():
74 sources = write_all_keyrings(
75 keyrings_path, [source.to_dict_without_selections()])
76- image_descriptions = download_all_image_descriptions(sources)
77+ try:
78+ image_descriptions = download_all_image_descriptions(
79+ sources)
80+ except (IOError, ConnectionError) as e:
81+ source_errors.append(
82+ "Failed to import images from boot source %s: %s" % (
83+ source.url, unicode(e)))
84+ continue
85
86 # We clear the cache once the information has been retrieved,
87 # because if an error occurs getting the information then the
88@@ -133,8 +145,15 @@
89 )
90 maaslog.info("Updated boot sources cache.")
91
92-
93-def cache_boot_sources():
94+ # Update the component errors while still holding the lock.
95+ if len(source_errors) > 0:
96+ register_persistent_error(
97+ COMPONENT.REGION_IMAGE_IMPORT, "\n".join(source_errors))
98+ else:
99+ discard_persistent_error(COMPONENT.REGION_IMAGE_IMPORT)
100+
101+
102+def cache_boot_sources_in_thread():
103 """Starts the caching of image information in boot sources.
104
105 Note: This function returns immediately. It only starts the process, it
106@@ -144,35 +163,4 @@
107 # "OperationalError: could not serialize access due to concurrent update"
108 # The wait will make sure the transaction that started the update will
109 # have finished and been committed.
110- reactor.callLater(1, deferToThread, _cache_boot_sources)
111-
112-
113-class BootSourceCacheService(TimerService, object):
114- """Service to periodically cache boot source information.
115-
116- This will run immediately when it's started, then once again every hour,
117- though the interval can be overridden by passing it to the constructor.
118- """
119-
120- def __init__(self, interval=(60 * 60)):
121- super(BootSourceCacheService, self).__init__(
122- interval, self.try_cache_boot_sources)
123-
124- def try_cache_boot_sources(self):
125- """Attempt to cache boot sources.
126-
127- Log errors on failure, but do not propagate them up; that will
128- stop the timed loop from running.
129- """
130-
131- def cache_boot_sources_failed(failure):
132- # Log the error in full to the Twisted log.
133- log.err(failure)
134- # Log something concise to the MAAS log.
135- maaslog.error(
136- "Failed to update boot source cache: %s",
137- failure.getErrorMessage())
138-
139- d = deferToThread(_cache_boot_sources)
140- d.addErrback(cache_boot_sources_failed)
141- return d
142+ reactor.callLater(1, deferToThread, cache_boot_sources)
143
144=== modified file 'src/maasserver/enum.py'
145--- src/maasserver/enum.py 2014-10-03 10:42:35 +0000
146+++ src/maasserver/enum.py 2014-11-03 03:01:27 +0000
147@@ -42,6 +42,7 @@
148 PSERV = 'provisioning server'
149 IMPORT_PXE_FILES = 'maas-import-pxe-files script'
150 CLUSTERS = 'clusters'
151+ REGION_IMAGE_IMPORT = 'Image importer'
152
153
154 class NODE_STATUS:
155
156=== modified file 'src/maasserver/eventloop.py'
157--- src/maasserver/eventloop.py 2014-09-26 20:19:01 +0000
158+++ src/maasserver/eventloop.py 2014-11-03 03:01:27 +0000
159@@ -164,11 +164,6 @@
160 return bootresources.ImportResourcesService()
161
162
163-def make_BootSourceCacheService():
164- from maasserver import bootsources
165- return bootsources.BootSourceCacheService()
166-
167-
168 class RegionEventLoop:
169 """An event loop running in a region controller process.
170
171@@ -194,7 +189,6 @@
172 ("rpc-advertise", make_RegionAdvertisingService),
173 ("nonce-cleanup", make_NonceCleanupService),
174 ("import-resources", make_ImportResourcesService),
175- ("cache-sources", make_BootSourceCacheService),
176 )
177
178 def __init__(self):
179
180=== modified file 'src/maasserver/models/bootsource.py'
181--- src/maasserver/models/bootsource.py 2014-09-30 19:48:33 +0000
182+++ src/maasserver/models/bootsource.py 2014-11-03 03:01:27 +0000
183@@ -106,7 +106,7 @@
184 """Update the `BootSourceCache` with information for the updated
185 source."""
186 # Avoid circular import
187- from maasserver.bootsources import cache_boot_sources
188- cache_boot_sources()
189+ from maasserver.bootsources import cache_boot_sources_in_thread
190+ cache_boot_sources_in_thread()
191
192 post_save.connect(update_boot_source_cache, BootSource)
193
194=== modified file 'src/maasserver/models/tests/test_bootsource.py'
195--- src/maasserver/models/tests/test_bootsource.py 2014-09-30 19:48:33 +0000
196+++ src/maasserver/models/tests/test_bootsource.py 2014-11-03 03:01:27 +0000
197@@ -17,7 +17,7 @@
198 import os
199
200 from django.core.exceptions import ValidationError
201-from maasserver.bootsources import _cache_boot_sources
202+from maasserver.bootsources import cache_boot_sources
203 from maasserver.models import BootSource
204 from maasserver.testing.factory import factory
205 from maasserver.testing.testcase import MAASServerTestCase
206@@ -104,4 +104,5 @@
207 url="http://test.test/", keyring_data=b"1234")
208 self.assertThat(
209 mock_callLater,
210- MockCalledOnceWith(1, deferToThread, _cache_boot_sources))
211+ MockCalledOnceWith(
212+ 1, deferToThread, cache_boot_sources))
213
214=== modified file 'src/maasserver/tests/test_bootresources.py'
215--- src/maasserver/tests/test_bootresources.py 2014-10-30 18:24:50 +0000
216+++ src/maasserver/tests/test_bootresources.py 2014-11-03 03:01:27 +0000
217@@ -1030,7 +1030,7 @@
218 # lock is already held.
219 self.assertThat(has_synced_resources, MockNotCalled())
220
221- def test__import_resources_exists_early_without_force(self):
222+ def test__import_resources_exits_early_without_force(self):
223 has_synced_resources = self.patch(
224 bootresources, "has_synced_resources")
225 bootresources._import_resources(force=False)
226@@ -1059,33 +1059,37 @@
227 self.assertFalse(bootresources.locks.import_images.is_locked())
228
229 def test__import_resources_calls_functions_with_correct_parameters(self):
230- fake_write_all_keyrings = self.patch(
231+ cache_boot_sources = self.patch(
232+ bootresources, 'cache_boot_sources')
233+ write_all_keyrings = self.patch(
234 bootresources, 'write_all_keyrings')
235- fake_write_all_keyrings.return_value = [sentinel.source]
236- fake_image_descriptions = self.patch(
237+ write_all_keyrings.return_value = [sentinel.source]
238+ image_descriptions = self.patch(
239 bootresources, 'download_all_image_descriptions')
240 descriptions = Mock()
241 descriptions.is_empty.return_value = False
242- fake_image_descriptions.return_value = descriptions
243- fake_map_products = self.patch(
244+ image_descriptions.return_value = descriptions
245+ map_products = self.patch(
246 bootresources, 'map_products')
247- fake_map_products.return_value = sentinel.mapping
248- fake_download_all_boot_resources = self.patch(
249+ map_products.return_value = sentinel.mapping
250+ download_all_boot_resources = self.patch(
251 bootresources, 'download_all_boot_resources')
252
253 bootresources._import_resources(force=True)
254
255- self.assertThat(
256- fake_write_all_keyrings,
257+ self.expectThat(
258+ cache_boot_sources, MockCalledOnceWith())
259+ self.expectThat(
260+ write_all_keyrings,
261 MockCalledOnceWith(ANY, []))
262- self.assertThat(
263- fake_image_descriptions,
264+ self.expectThat(
265+ image_descriptions,
266 MockCalledOnceWith([sentinel.source]))
267- self.assertThat(
268- fake_map_products,
269+ self.expectThat(
270+ map_products,
271 MockCalledOnceWith(descriptions))
272- self.assertThat(
273- fake_download_all_boot_resources,
274+ self.expectThat(
275+ download_all_boot_resources,
276 MockCalledOnceWith([sentinel.source], sentinel.mapping))
277
278 def test__import_resources_has_env_GNUPGHOME_set(self):
279
280=== modified file 'src/maasserver/tests/test_bootsources.py'
281--- src/maasserver/tests/test_bootsources.py 2014-09-30 19:48:33 +0000
282+++ src/maasserver/tests/test_bootsources.py 2014-11-03 03:01:27 +0000
283@@ -18,12 +18,17 @@
284
285 from maasserver import bootsources
286 from maasserver.bootsources import (
287- _cache_boot_sources,
288 cache_boot_sources,
289+ cache_boot_sources_in_thread,
290 ensure_boot_source_definition,
291 get_boot_sources,
292 get_os_info_from_boot_sources,
293 )
294+from maasserver.components import (
295+ get_persistent_error,
296+ register_persistent_error,
297+ )
298+from maasserver.enum import COMPONENT
299 from maasserver.models import (
300 BootSource,
301 BootSourceCache,
302@@ -34,10 +39,14 @@
303 from maasserver.testing.testcase import MAASServerTestCase
304 from maastesting.matchers import MockCalledOnceWith
305 from mock import MagicMock
306+from provisioningserver.import_images import (
307+ download_descriptions as download_descriptions_module,
308+ )
309 from provisioningserver.import_images.boot_image_mapping import (
310 BootImageMapping,
311 )
312 from provisioningserver.import_images.helpers import ImageSpec
313+from requests.exceptions import ConnectionError
314 from testtools.matchers import HasLength
315 from twisted.internet import reactor
316 from twisted.internet.threads import deferToThread
317@@ -170,7 +179,7 @@
318 capture = (
319 patch_and_capture_env_for_download_all_image_descriptions(self))
320 factory.make_BootSource(keyring_data='1234')
321- _cache_boot_sources()
322+ cache_boot_sources()
323 self.assertEqual(
324 bootsources.get_maas_user_gpghome(),
325 capture.env['GNUPGHOME'])
326@@ -181,7 +190,7 @@
327 capture = (
328 patch_and_capture_env_for_download_all_image_descriptions(self))
329 factory.make_BootSource(keyring_data='1234')
330- _cache_boot_sources()
331+ cache_boot_sources()
332 self.assertEqual(
333 (proxy_address, proxy_address),
334 (capture.env['http_proxy'], capture.env['http_proxy']))
335@@ -192,7 +201,7 @@
336 mock_download = self.patch(
337 bootsources, 'download_all_image_descriptions')
338 mock_download.return_value = make_boot_image_mapping()
339- _cache_boot_sources()
340+ cache_boot_sources()
341 self.assertEqual(0, BootSourceCache.objects.all().count())
342
343 def test__returns_adds_entries_to_cache_for_source(self):
344@@ -204,7 +213,7 @@
345 mock_download = self.patch(
346 bootsources, 'download_all_image_descriptions')
347 mock_download.return_value = make_boot_image_mapping(image_specs)
348- _cache_boot_sources()
349+ cache_boot_sources()
350 cached_releases = [
351 cache.release
352 for cache in BootSourceCache.objects.filter(boot_source=source)
353@@ -213,11 +222,45 @@
354 self.assertItemsEqual(releases, cached_releases)
355
356
357+class TestBadConnectionHandling(MAASServerTestCase):
358+
359+ def test__catches_connection_errors_and_sets_component_error(self):
360+ sources = [
361+ factory.make_BootSource(keyring_data='1234') for _ in range(3)]
362+ download_image_descriptions = self.patch(
363+ download_descriptions_module, 'download_image_descriptions')
364+ error_text = factory.make_name("error_text")
365+ # Make two of the downloads fail.
366+ download_image_descriptions.side_effect = [
367+ ConnectionError(error_text),
368+ BootImageMapping(),
369+ IOError(error_text),
370+ ]
371+ cache_boot_sources()
372+ base_error = "Failed to import images from boot source {url}: {err}"
373+ error_part_one = base_error.format(url=sources[0].url, err=error_text)
374+ error_part_two = base_error.format(url=sources[2].url, err=error_text)
375+ expected_error = error_part_one + '\n' + error_part_two
376+ actual_error = get_persistent_error(COMPONENT.REGION_IMAGE_IMPORT)
377+ self.assertEqual(expected_error, actual_error)
378+
379+ def test__clears_component_error_when_successful(self):
380+ register_persistent_error(
381+ COMPONENT.REGION_IMAGE_IMPORT, factory.make_string())
382+ [factory.make_BootSource(keyring_data='1234') for _ in range(3)]
383+ download_image_descriptions = self.patch(
384+ download_descriptions_module, 'download_image_descriptions')
385+ # Make all of the downloads successful.
386+ download_image_descriptions.return_value = BootImageMapping()
387+ cache_boot_sources()
388+ self.assertIsNone(get_persistent_error(COMPONENT.REGION_IMAGE_IMPORT))
389+
390+
391 class TestCacheBootSources(MAASServerTestCase):
392
393 def test__calls_callLater_in_reactor(self):
394 mock_callLater = self.patch(reactor, 'callLater')
395- cache_boot_sources()
396+ cache_boot_sources_in_thread()
397 self.assertThat(
398 mock_callLater,
399- MockCalledOnceWith(1, deferToThread, _cache_boot_sources))
400+ MockCalledOnceWith(1, deferToThread, cache_boot_sources))
401
402=== modified file 'src/maasserver/tests/test_eventloop.py'
403--- src/maasserver/tests/test_eventloop.py 2014-09-26 20:19:01 +0000
404+++ src/maasserver/tests/test_eventloop.py 2014-11-03 03:01:27 +0000
405@@ -18,7 +18,6 @@
406 from django.db import connections
407 from maasserver import (
408 bootresources,
409- bootsources,
410 eventloop,
411 nonces_cleanup,
412 )
413@@ -147,15 +146,6 @@
414 eventloop.make_ImportResourcesService,
415 {factory for _, factory in eventloop.loop.factories})
416
417- def test_make_BootSourceCacheService(self):
418- service = eventloop.make_BootSourceCacheService()
419- self.assertThat(service, IsInstance(
420- bootsources.BootSourceCacheService))
421- # It is registered as a factory in RegionEventLoop.
422- self.assertIn(
423- eventloop.make_BootSourceCacheService,
424- {factory for _, factory in eventloop.loop.factories})
425-
426
427 class TestDisablingDatabaseConnections(MAASTestCase):
428