Merge lp:~jtv/maas/boot-image-dict into lp:~maas-committers/maas/trunk

Proposed by Jeroen T. Vermeulen
Status: Merged
Approved by: Jeroen T. Vermeulen
Approved revision: no longer in the source branch.
Merged at revision: 2237
Proposed branch: lp:~jtv/maas/boot-image-dict
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 411 lines (+178/-110)
2 files modified
src/provisioningserver/import_images/boot_resources.py (+98/-74)
src/provisioningserver/import_images/tests/test_boot_resources.py (+80/-36)
To merge this branch: bzr merge lp:~jtv/maas/boot-image-dict
Reviewer Review Type Date Requested Status
Graham Binns (community) Approve
Review via email: mp+214210@code.launchpad.net

Commit message

Turn the “hierarchical dicts” in the new import code into a proper class, BootImageDict.

Description of the change

The import code is a bewildering constellation of dict-based data structures. In order to be able to test, maintain, and debug this code, we need a bit more structure. This came up in my ongoing attempt to write unit tests for the new code.

Much of the dict-ey code is about dicts that index by architecture, subarchitecture, release, and label. Another, very similar kind of dict uses the same constructor but contains something else. I'm prying the two apart, turning that first kind into a class with limited methods, so we have a clear idea of what does and doesn't happen to its contents. In the process I'm conquering the fear of hidden interactions with Simplestreams code.

One thing the encapsulation allowed me to do is to simplify the dict's structure. It now maps ImageSpec, a tuple type, to boot resource — another kind of dict that I'd like to turn into a class. And boot_reverse turns a BootImageDict inside out to return that other kind of dict that uses the same constructor. You guessed it: that too I would like to turn into a class.

The one complication was in dumping the dict to JSON: you can't have tuples as keys in a JSON object. Moreover, you can't just change the format of the maas.meta file, which is where the JSON goes, because we have no idea what its purpose is or where it is used. And so inside the new dump_json method, I do convert back to the old format. This turns out to be a lot easier than continually managing the full deeply-nested structure, as I did in an earlier stage in the evolution of this branch.

The testing for dump_json overlaps with test code that I previously added to the end-to-end test. I can probably take that back out again now, in favour of the new unit test. But I wanted to have a solid anchor for the refactoring, so that I didn't accidentally change the format.

Jeroen

To post a comment you must log in.
Revision history for this message
Graham Binns (gmb) wrote :

This is great work; thanks Jeroen. A couple of quibbles:

[1]

19 +class BootImageDict:
20 + """Mapping of boot-image data.
21 +
22 + Maps `ImageSpec` tuples to metadata for Simplestreams products.
23 +
24 + This class is deliberately a bit more restrictive and less ad-hoc than a
25 + dict. It helps keep a clear view of the data structures in this module.

You've said that it's "a bit more restrictive and less ad-hoc than a
dict" but it's still called a BootImageDict and it implements items(),
which makes it look a bit like it *might* be a dict. But if you try to
use it like a dict, it'll go boom:

>>> from provisioningserver.import_images.boot_resources import BootImageDict
>>> image_dict = BootImageDict()
>>> image_dict['spam'] = 'eggs'
TypeError: 'BootImageDict' object does not support item assignment

If it's not going to behave like a dict we shouldn't call it a dict. If
it's meant to behave like a dict it should at least implement the
requisite methods, or raise a sensible error when those methods are
called on it (as shouldn't happen).

[2]

86 + """Complement `BootImageDict` with entries from another.

Missing "one" after "Complement".

review: Approve
Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

Renamed to BootImageMapping, and “one” added. Thanks for an insightful review.

Revision history for this message
MAAS Lander (maas-lander) wrote :

There are additional revisions which have not been approved in review. Please seek review and approval of these new revisions.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/provisioningserver/import_images/boot_resources.py'
2--- src/provisioningserver/import_images/boot_resources.py 2014-04-04 06:51:23 +0000
3+++ src/provisioningserver/import_images/boot_resources.py 2014-04-07 15:13:16 +0000
4@@ -92,23 +92,44 @@
5 ])
6
7
8-def iterate_boot_resources(boot_dict):
9- """Iterate a multi-level dict of boot images.
10-
11- Yields each combination of architecture, subarchitecture, release, and
12- label for which `boot` has an entry, as an `ImageSpec`.
13-
14- :param boot: Four-level dict of dicts representing boot images: the top
15- level maps the architectures to sub-dicts, each of which maps
16- subarchitectures to further dicts, each of which in turn maps
17- releases to yet more dicts, each of which maps release labels to any
18- kind of item it likes.
19+class BootImageMapping:
20+ """Mapping of boot-image data.
21+
22+ Maps `ImageSpec` tuples to metadata for Simplestreams products.
23+
24+ This class is deliberately a bit more restrictive and less ad-hoc than a
25+ dict. It helps keep a clear view of the data structures in this module.
26 """
27- for arch, subarches in sorted(boot_dict.items()):
28- for subarch, releases in sorted(subarches.items()):
29- for release, labels in sorted(releases.items()):
30- for label in sorted(labels.keys()):
31- yield ImageSpec(arch, subarch, release, label)
32+
33+ def __init__(self):
34+ self.mapping = {}
35+
36+ def items(self):
37+ """Iterate over `ImageSpec` keys, and their stored values."""
38+ for image_spec, item in sorted(self.mapping.items()):
39+ yield image_spec, item
40+
41+ def setdefault(self, image_spec, item):
42+ """Set metadata for `image_spec` to item, if not already set."""
43+ assert isinstance(image_spec, ImageSpec)
44+ self.mapping.setdefault(image_spec, item)
45+
46+ def dump_json(self):
47+ """Produce JSON representing the mapped boot images.
48+
49+ Tries to keep the output deterministic, so that identical data is
50+ likely to produce identical JSON.
51+ """
52+ # The meta files represent the mapping as a nested hierarchy of dicts.
53+ # Keep that format.
54+ data = {}
55+ for image, resource in self.items():
56+ arch, subarch, release, label = image
57+ data.setdefault(arch, {})
58+ data[arch].setdefault(subarch, {})
59+ data[arch][subarch].setdefault(release, {})
60+ data[arch][subarch][release][label] = resource
61+ return json.dumps(data, sort_keys=True)
62
63
64 def value_passes_filter_list(filter_list, property_value):
65@@ -156,67 +177,70 @@
66 return False
67
68
69-def boot_merge(boot1, boot2, filters=None):
70- """Add entries from the second multi-level dict to the first one.
71-
72- Function copies d[arch][subarch][release]=value chains from the second
73- dictionary to the first one if they don't exist there and pass optional
74- check done by filters.
75-
76- :param boot1: first dict which will be extended in-place.
77- :param boot2: second dict which will be used as a source of new entries.
78- :param filters: list of dicts each of which contains 'arch', 'subarch',
79- 'release' keys; function takes d[arch][subarch][release] chain to the
80- first dict only if filters contain at least one dict with
81- arch in d['arches'], subarch in d['subarch'], d['release'] == release;
82- dict may have '*' as a value for 'arch' and 'release' keys and as a
83- member of 'subarch' list -- in that case key-specific check always
84- passes.
85+def boot_merge(destination, additions, filters=None):
86+ """Complement one `BootImageMapping` with entries from another.
87+
88+ This adds entries from `additions` (that match `filters`, if given) to
89+ `destination`, but only for those image specs for which `destination` does
90+ not have entries yet.
91+
92+ :param destination: `BootImageMapping` to be updated. It will be extended
93+ in-place.
94+ :param additions: A second `BootImageMapping`, which will be used as a
95+ source of additional entries.
96+ :param filters: List of dicts, each of which contains 'arch', 'subarch',
97+ and 'release' keys. If given, entries are only considered for copying
98+ from `additions` to `destination` if they match at least one of the
99+ filters. Entries in the filter may be the string `*` (or for entries
100+ that are lists, may contain the string `*`) to make them match any
101+ value.
102 """
103- for arch, subarch, release, label in iterate_boot_resources(boot2):
104+ for image, resource in additions.items():
105+ arch, subarch, release, label = image
106 if image_passes_filter(filters, arch, subarch, release, label):
107 logger.debug(
108 "Merging boot resource for %s/%s/%s/%s.",
109 arch, subarch, release, label)
110- boot_resource = boot2[arch][subarch][release][label]
111 # Do not override an existing entry with the same
112 # arch/subarch/release/label: the first entry found takes
113 # precedence.
114- if not boot1[arch][subarch][release][label]:
115- boot1[arch][subarch][release][label] = boot_resource
116-
117-
118-def boot_reverse(boot):
119- """Determine a set of subarches which should be deployed by boot resource.
120-
121- Function reverses h[arch][subarch][release]=boot_resource hierarchy to form
122- boot resource to subarch relation. Many subarches may be deployed by a
123- single boot resource (in which case boot_resource=[subarch1, subarch2]
124- relation will be created). We note only subarchitectures and ignore
125- architectures because boot resource is tightly coupled with architecture
126- it can deploy according to metadata format. We can figure out for which
127- architecture we need to use a specific boot resource by looking at its
128- description in metadata. We can't do the same with subarch because we may
129- want to use boot resource only for a specific subset of subarches it can be
130- used for. To represent boot resource to subarch relation we generate the
131- following multi-level dictionary: d[content_id][product_name]=[subarches]
132- where 'content_id' and 'product_name' values come from metadata information
133- and allow us to uniquely identify a specific boot resource.
134-
135- :param boot: Hierarchy of dicts d[arch][subarch][release]=boot_resource
136+ destination.setdefault(image, resource)
137+
138+
139+def boot_reverse(boot_images_dict):
140+ """Determine the subarches supported by each boot resource.
141+
142+ Many subarches may be deployed by a single boot resource. We note only
143+ subarchitectures here and ignore architectures because the metadata format
144+ tightly couples a boot resource to its architecture.
145+
146+ We can figure out for which architecture we need to use a specific boot
147+ resource by looking at its description in the metadata. We can't do the
148+ same with subarch, because we may want to use a boot resource only for a
149+ specific subset of subarches.
150+
151+ This function represents the relationship between boot resources and
152+ subarchitectures as a multi-level dictionary::
153+
154+ d[content_id][product_name] = [subarches]
155+
156+ Here, `content_id` and `product_name` come from the metadata information
157+ and uniquely identify a specific boot resource.
158+
159+ :param boot: A `BootImageMapping` containing the images' metadata.
160 :return Hierarchy of dictionaries d[content_id][product_name]=[subarches]
161 which describes boot resource to subarches relation for all available
162 boot resources (products).
163 """
164 reverse = create_empty_hierarchy()
165
166- for arch, subarch, release, label in iterate_boot_resources(boot):
167- boot_resource = boot[arch][subarch][release][label]
168+ for image, boot_resource in boot_images_dict.items():
169 content_id = boot_resource['content_id']
170 product_name = boot_resource['product_name']
171 version_name = boot_resource['version_name']
172 existent = list(reverse[content_id][product_name][version_name])
173- reverse[content_id][product_name][version_name] = [subarch] + existent
174+ reverse[content_id][product_name][version_name] = (
175+ [image.subarch] + existent)
176
177 return reverse
178
179@@ -293,15 +317,15 @@
180 """Gather metadata about boot images available in a Simplestreams repo.
181
182 Used inside `download_image_descriptions`. Stores basic metadata about
183- each image it finds upstream in a given dict, indexed by arch, subarch,
184- release, and label. Each item is a dict containing the basic metadata
185- for retrieving a boot image.
186-
187- :ivar boot_images_dict: A nested `defaultdict` for boot images.
188- Image metadata will be stored here as it is discovered. Simplestreams
189- does not interact with this variable, and `BasicMirrorWriter` in itself
190- is stateless. It relies on a subclass (such as this one) to store
191- data.
192+ each image it finds upstream in a given `BootImageMapping`. Each stored
193+ item is a dict containing the basic metadata for retrieving a boot image.
194+
195+ Simplestreams' `BasicMirrorWriter` in itself is stateless. It relies on
196+ a subclass (such as this one) to store data.
197+
198+ :ivar boot_images_dict: A `BootImageMapping`. Image metadata will be
199+ stored here as it is discovered. Simplestreams does not interact with
200+ this variable.
201 """
202
203 def __init__(self, boot_images_dict):
204@@ -321,11 +345,11 @@
205 arch, subarches = item['arch'], item['subarches']
206 release = item['release']
207 label = item['label']
208+ base_image = ImageSpec(arch, None, release, label)
209 compact_item = clean_up_repo_item(item)
210 for subarch in subarches.split(','):
211- if not self.boot_images_dict[arch][subarch][release][label]:
212- self.boot_images_dict[arch][subarch][release][label] = (
213- compact_item)
214+ self.boot_images_dict.setdefault(
215+ base_image._replace(subarch=subarch), compact_item)
216
217
218 def download_image_descriptions(path, keyring=None):
219@@ -339,7 +363,7 @@
220 mirror, rpath = path_from_mirror_url(path, None)
221 policy = get_signing_policy(rpath, keyring)
222 reader = UrlMirrorReader(mirror, policy=policy)
223- boot_images_dict = create_empty_hierarchy()
224+ boot_images_dict = BootImageMapping()
225 dumper = RepoDumper(boot_images_dict)
226 dumper.sync(reader, rpath)
227 return boot_images_dict
228@@ -347,7 +371,7 @@
229
230 def download_all_image_descriptions(config):
231 """Download image metadata for all sources in `config`."""
232- boot = create_empty_hierarchy()
233+ boot = BootImageMapping()
234 for source in config['boot']['sources']:
235 repo_boot = download_image_descriptions(
236 source['path'], keyring=source['keyring'])
237@@ -548,7 +572,7 @@
238 storage = config['boot']['storage']
239
240 boot = download_all_image_descriptions(config)
241- meta_file_content = json.dumps(boot, sort_keys=True)
242+ meta_file_content = boot.dump_json()
243 if meta_contains(storage, meta_file_content):
244 # The current maas.meta already contains the new config. No need to
245 # rewrite anything.
246
247=== modified file 'src/provisioningserver/import_images/tests/test_boot_resources.py'
248--- src/provisioningserver/import_images/tests/test_boot_resources.py 2014-04-07 08:09:47 +0000
249+++ src/provisioningserver/import_images/tests/test_boot_resources.py 2014-04-07 15:13:16 +0000
250@@ -46,22 +46,75 @@
251 )
252
253
254-class TestIterateBootResources(MAASTestCase):
255- """Tests for `iterate_boot_resources`."""
256-
257- def test_empty_hierarchy_yields_nothing(self):
258- self.assertItemsEqual(
259- [],
260- boot_resources.iterate_boot_resources(
261- boot_resources.create_empty_hierarchy()))
262-
263- def test_finds_boot_resource(self):
264- image_spec = make_image_spec()
265- arch, subarch, release, label = image_spec
266- self.assertItemsEqual(
267- [image_spec],
268- boot_resources.iterate_boot_resources(
269- {arch: {subarch: {release: {label: factory.make_name()}}}}))
270+class TestBootImageMapping(MAASTestCase):
271+ """Tests for `BootImageMapping`."""
272+
273+ def test_initially_empty(self):
274+ self.assertItemsEqual([], boot_resources.BootImageMapping().items())
275+
276+ def test_items_returns_items(self):
277+ image = make_image_spec()
278+ resource = factory.make_name('resource')
279+ image_dict = set_resource(image_spec=image, resource=resource)
280+ self.assertItemsEqual([(image, resource)], image_dict.items())
281+
282+ def test_setdefault_sets_unset_item(self):
283+ image_dict = boot_resources.BootImageMapping()
284+ image = make_image_spec()
285+ resource = factory.make_name('resource')
286+ image_dict.setdefault(image, resource)
287+ self.assertItemsEqual([(image, resource)], image_dict.items())
288+
289+ def test_setdefault_leaves_set_item_unchanged(self):
290+ image = make_image_spec()
291+ old_resource = factory.make_name('resource')
292+ image_dict = set_resource(image_spec=image, resource=old_resource)
293+ image_dict.setdefault(image, factory.make_name('newresource'))
294+ self.assertItemsEqual([(image, old_resource)], image_dict.items())
295+
296+ def test_dump_json_is_consistent(self):
297+ image = make_image_spec()
298+ resource = factory.make_name('resource')
299+ image_dict_1 = set_resource(image_spec=image, resource=resource)
300+ image_dict_2 = set_resource(image_spec=image, resource=resource)
301+ self.assertEqual(image_dict_1.dump_json(), image_dict_2.dump_json())
302+
303+ def test_dump_json_represents_empty_dict_as_empty_object(self):
304+ self.assertEqual('{}', boot_resources.BootImageMapping().dump_json())
305+
306+ def test_dump_json_represents_entry(self):
307+ image = make_image_spec()
308+ resource = factory.make_name('resource')
309+ image_dict = set_resource(image_spec=image, resource=resource)
310+ self.assertEqual(
311+ {
312+ image.arch: {
313+ image.subarch: {
314+ image.release: {image.label: resource},
315+ },
316+ },
317+ },
318+ json.loads(image_dict.dump_json()))
319+
320+ def test_dump_json_combines_similar_entries(self):
321+ image = make_image_spec()
322+ other_release = factory.make_name('other-release')
323+ resource1 = factory.make_name('resource')
324+ resource2 = factory.make_name('other-resource')
325+ image_dict = boot_resources.BootImageMapping()
326+ set_resource(image_dict, image, resource1)
327+ set_resource(
328+ image_dict, image._replace(release=other_release), resource2)
329+ self.assertEqual(
330+ {
331+ image.arch: {
332+ image.subarch: {
333+ image.release: {image.label: resource1},
334+ other_release: {image.label: resource2},
335+ },
336+ },
337+ },
338+ json.loads(image_dict.dump_json()))
339
340
341 class TestValuePassesFilterList(MAASTestCase):
342@@ -151,7 +204,7 @@
343 image1 = make_image_spec()
344 image2 = make_image_spec()
345 resource = self.make_boot_resource()
346- boot_dict = {}
347+ boot_dict = boot_resources.BootImageMapping()
348 # Create two images in boot_dict, both containing the same resource.
349 for image in [image1, image2]:
350 set_resource(
351@@ -262,23 +315,14 @@
352
353
354 def set_resource(boot_dict=None, image_spec=None, resource=None):
355- """Add a boot resource to `boot_dict`, creating it if necessary."""
356+ """Add boot resource to a `BootImageMapping`, creating it if necessary."""
357 if boot_dict is None:
358- boot_dict = {}
359+ boot_dict = boot_resources.BootImageMapping()
360 if image_spec is None:
361 image_spec = make_image_spec()
362 if resource is None:
363 resource = factory.make_name('boot-resource')
364- arch, subarch, release, label = image_spec
365- # Drill down into the dict; along the way, create any missing levels of
366- # nested dicts.
367- nested_dict = boot_dict
368- for level in (arch, subarch, release):
369- nested_dict.setdefault(level, {})
370- nested_dict = nested_dict[level]
371- # At the bottom level, indexed by "label," insert "resource" as the
372- # value.
373- nested_dict[label] = resource
374+ boot_dict.mapping[image_spec] = resource
375 return boot_dict
376
377
378@@ -288,12 +332,12 @@
379 def test_integrates(self):
380 # End-to-end scenario for boot_merge: start with an empty boot
381 # resources dict, and receive one resource from Simplestreams.
382- total_resources = boot_resources.create_empty_hierarchy()
383+ total_resources = boot_resources.BootImageMapping()
384 resources_from_repo = set_resource()
385- boot_resources.boot_merge(total_resources, resources_from_repo.copy())
386+ boot_resources.boot_merge(total_resources, resources_from_repo)
387 # Since we started with an empty dict, the result contains the same
388 # item that we got from Simplestreams, and nothing else.
389- self.assertEqual(resources_from_repo, total_resources)
390+ self.assertEqual(resources_from_repo.mapping, total_resources.mapping)
391
392 def test_obeys_filters(self):
393 filters = [
394@@ -312,13 +356,13 @@
395
396 def test_does_not_overwrite_existing_entry(self):
397 image = make_image_spec()
398- original_resources = set_resource(
399+ total_resources = set_resource(
400 resource="Original resource", image_spec=image)
401- total_resources = original_resources.copy()
402+ original_resources = total_resources.mapping.copy()
403 resources_from_repo = set_resource(
404 resource="New resource", image_spec=image)
405- boot_resources.boot_merge(total_resources, resources_from_repo.copy())
406- self.assertEqual(original_resources, total_resources)
407+ boot_resources.boot_merge(total_resources, resources_from_repo)
408+ self.assertEqual(original_resources, total_resources.mapping)
409
410
411 class TestGetSigningPolicy(MAASTestCase):