Merge lp:~waveform/meta-release/ubuntu into lp:~ubuntu-core-dev/meta-release/ubuntu

Proposed by Dave Jones
Status: Merged
Merged at revision: 288
Proposed branch: lp:~waveform/meta-release/ubuntu
Merge into: lp:~ubuntu-core-dev/meta-release/ubuntu
Diff against target: 976 lines (+714/-84)
2 files modified
raspi/os_list_imagingutility_ubuntu.json (+24/-24)
refresh_os_list (+690/-60)
To merge this branch: bzr merge lp:~waveform/meta-release/ubuntu
Reviewer: Brian Murray
Status: Needs Information
Review via email: mp+408826@code.launchpad.net

Commit message

Impish release changes and tests added to the refresh_os_list script.

Add tests via the built-in doctest module. Simply run the script with TEST=1 in the environment (e.g. TEST=1 ./refresh_os_list) to invoke the test-suite. If all tests pass, no output is produced and the exit-code is 0. You can add "-v" to run tests verbosely, but this tends to produce a lot of (quite confusing) output.
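
For reference, the dispatch that enables this is tiny; abridged from the diff below, main() checks the environment before doing any argument parsing:

    if int(os.environ.get('TEST', '0')):
        # Hand over to the doctest-based suite instead of the normal CLI
        return _test_main()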

Instructions are included (in comments under main()) for running the test suite under the coverage tool. Coverage is currently about 96%, which is as good as I can manage without going completely overboard.
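
The test entry point itself is just a thin wrapper around doctest (again, abridged from the diff):

    def _test_main():
        import doctest
        # Undecorate get_index so its lru_cache can't leak state between tests
        global get_index
        get_index = get_index.__wrapped__
        failures, total = doctest.testmod()
        return bool(failures)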

Description of the change

I'm in two minds as to whether this is the right approach. On the plus side:

* adding tests is good
* this keeps everything in a single file, which is good given this is a standalone utility only really applicable to this repo
* it adds no dependencies; everything the script and its test-suite require is built into a base Python 3 installation
* as the tests use doctest, they mostly enhance the documentation by providing useful examples (see the toy sketch after this list); those that exist largely for coverage rather than usefulness are pushed to the end of the file, away from the doc-strings
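
To illustrate the examples-as-documentation point: a doctest is just a doc-string whose interactive snippets are executed and checked verbatim. A toy sketch (hypothetical; not from the script itself):

    def double(n):
        """
        Return *n* multiplied by two. For example::

            >>> double(2)
            4
            >>> double('ab')
            'abab'
        """
        return n * 2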

On the minus side:

* it nearly trebles the size of the file (!)
* running the test-suite with coverage is non-obvious; I've left comments on doing so in the main function (which future developers would presumably (hopefully?!) read :), but it's nothing like as easy as good ol' pytest-cov
* it adds a few functions that are *purely* for the benefit of the test-suite (fixtures, basically; see the condensed sketch after this list); I've left commentary in their doc-strings noting that these are nothing to do with the "meat" of the script, but it still feels a bit ugly
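
For a sense of what those fixture functions look like, the largest one (_test_server in the diff) boils down to a context manager that writes a dict mapping filenames to byte-strings into a temporary directory and serves it over HTTP; a condensed sketch (the real one also silences the request logging):

    import contextlib, functools, http.server, tempfile
    from pathlib import Path
    from threading import Thread

    @contextlib.contextmanager
    def _test_server(files, *, host='127.0.0.1', port=4444):
        with tempfile.TemporaryDirectory() as temp:
            # Write each file's contents into the temporary directory
            for filename, data in files.items():
                (Path(temp) / filename).write_bytes(data)
            handler = functools.partial(
                http.server.SimpleHTTPRequestHandler, directory=temp)
            with http.server.ThreadingHTTPServer((host, port), handler) as httpd:
                thread = Thread(target=httpd.serve_forever)
                thread.start()
                try:
                    yield f'http://{host}:{port}/'
                finally:
                    httpd.shutdown()
                    thread.join()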

Anyway, feel free to reject this -- maybe it's not the best approach, but it's an interesting experiment nonetheless.

lp:~waveform/meta-release/ubuntu updated
284. By Brian Murray

merge Dave's changes which adds 21.10 for raspi

285. By Brian Murray

add in impish to meta-release and meta-release-proposed

286. By Brian Murray

add jammy to meta-release-development

287. By Brian Murray

add in jammy to meta-release-lts-development

Revision history for this message
Brian Murray (brian-murray) wrote :

Unless you think this is particularly useful I'd pass on merging this. Do you really want to add it?

review: Needs Information
Revision history for this message
Dave Jones (waveform) wrote :

On balance, I'd like to add it.

Beyond the basic "moar tests good!" sentiment, and despite it being a bloody huge commit that drastically extends the size of the script, it means that should I (or some other poor soul) ever need to touch this again (maybe one day, someone will decide cdimage needs a bit of a makeover), I can at least run the test-suite and be vaguely confident nothing's borked.

Oh, it also adds doc-strings to everything which always makes me happy!

lp:~waveform/meta-release/ubuntu updated
288. By Brian Murray

add tests to the refresh_os_list script

Preview Diff

1=== modified file 'raspi/os_list_imagingutility_ubuntu.json'
2--- raspi/os_list_imagingutility_ubuntu.json 2021-08-26 14:01:38 +0000
3+++ raspi/os_list_imagingutility_ubuntu.json 2021-10-14 14:58:33 +0000
4@@ -1,40 +1,40 @@
5 {
6 "os_list": [
7 {
8- "name": "Ubuntu Desktop 21.04 (RPi 4/400)",
9+ "name": "Ubuntu Desktop 21.10 (RPi 4/400)",
10 "description": "64-bit desktop OS for Pi 4 models with 4Gb+",
11- "url": "http://cdimage.ubuntu.com/releases/hirsute/release/ubuntu-21.04-preinstalled-desktop-arm64+raspi.img.xz",
12+ "url": "http://cdimage.ubuntu.com/releases/impish/release/ubuntu-21.10-preinstalled-desktop-arm64+raspi.img.xz",
13 "icon": "https://assets.ubuntu.com/v1/85a9de76-ubuntu-icon.svg",
14- "extract_size": 9026085888,
15- "extract_sha256": "de4a0e5a40bab9fd9ced8d6b04388370c8bce28a185ed449eaa54c03ce2d77d7",
16- "image_download_size": 1743725624,
17- "release_date": "2021-04-22",
18- "image_download_sha256": "d89ee327a00b98d7166b1a8cc95d17762aaacd3b4d0fc756c5b6b65df1708f48",
19- "website": "https://ubuntu.com/raspberry-pi/desktop"
20+ "release_date": "2021-10-14",
21+ "website": "https://ubuntu.com/raspberry-pi/desktop",
22+ "extract_size": 9400728576,
23+ "extract_sha256": "132af57a4bb711273c78e25119b5ec92766f8bc7884d9bf27f15faa176bc952c",
24+ "image_download_size": 2027828568,
25+ "image_download_sha256": "5187d507099f26bc4d8218085109af498fae5ff93b40c668f83bab5c7574d954"
26 },
27 {
28- "name": "Ubuntu Server 21.04 (RPi 2/3/4/400)",
29+ "name": "Ubuntu Server 21.10 (RPi 2/3/4/400)",
30 "description": "32-bit server OS for armhf architectures",
31- "url": "http://cdimage.ubuntu.com/releases/hirsute/release/ubuntu-21.04-preinstalled-server-armhf+raspi.img.xz",
32+ "url": "http://cdimage.ubuntu.com/releases/impish/release/ubuntu-21.10-preinstalled-server-armhf+raspi.img.xz",
33 "icon": "https://assets.ubuntu.com/v1/85a9de76-ubuntu-icon.svg",
34- "extract_size": 3211852800,
35- "extract_sha256": "8eeeaa116b91f4f622bc3abaa80453387783de6bfce77f7a51a02c5337f5c7e1",
36- "image_download_size": 756021496,
37- "release_date": "2021-04-21",
38- "image_download_sha256": "c9a9a5177a03fcbb6203b38e5c3c4e5447fd9e8891515da4146f319f04eb3495",
39- "website": "https://ubuntu.com/raspberry-pi/server"
40+ "release_date": "2021-10-14",
41+ "website": "https://ubuntu.com/raspberry-pi/server",
42+ "extract_size": 3732335616,
43+ "extract_sha256": "56caec8fd34aa4aec01641aa3ac3993d21468b375835ca40a7ccf948947ca353",
44+ "image_download_size": 886173524,
45+ "image_download_sha256": "341593c9607ed20744cd86941d94d73e3ba4f74e8ef2633eec63ce9b0cfac5a1"
46 },
47 {
48- "name": "Ubuntu Server 21.04 (RPi 3/4/400)",
49+ "name": "Ubuntu Server 21.10 (RPi 3/4/400)",
50 "description": "64-bit server OS for arm64 architectures",
51- "url": "http://cdimage.ubuntu.com/releases/hirsute/release/ubuntu-21.04-preinstalled-server-arm64+raspi.img.xz",
52+ "url": "http://cdimage.ubuntu.com/releases/impish/release/ubuntu-21.10-preinstalled-server-arm64+raspi.img.xz",
53 "icon": "https://assets.ubuntu.com/v1/85a9de76-ubuntu-icon.svg",
54- "extract_size": 3491662848,
55- "extract_sha256": "0d1eb068e55879ed279a3c9ba79fe186919db5746606477a3ba4b76318fd10cd",
56- "image_download_size": 787262724,
57- "release_date": "2021-04-21",
58- "image_download_sha256": "3df85b93b66ccd2d370c844568b37888de66c362eebae5204bf017f6f5875207",
59- "website": "https://ubuntu.com/raspberry-pi/server"
60+ "release_date": "2021-10-14",
61+ "website": "https://ubuntu.com/raspberry-pi/server",
62+ "extract_size": 4068480000,
63+ "extract_sha256": "4cf06429e0367f0a59b890819d1792b0d816bc531fcb5bd092e441d8d6a942b9",
64+ "image_download_size": 921117368,
65+ "image_download_sha256": "126f940d3b270a6c1fc5a183ac8a3d193805fead4f517296a7df9d3e7d691a03"
66 },
67 {
68 "name": "Ubuntu Server 20.04.3 LTS (RPi 2/3/4/400)",
69
70=== modified file 'refresh_os_list'
71--- refresh_os_list 2021-08-26 12:35:57 +0000
72+++ refresh_os_list 2021-10-14 14:58:33 +0000
73@@ -1,5 +1,30 @@
74 #!/usr/bin/python3
75
76+"""
77+A script for generating the JSON list required by the Raspberry Pi imaging
78+utility. Takes an existing JSON list as input and updates it by calculating new
79+image sizes and check-sums from the source files on the server. Typical usage
80+is to tweak the existing JSON (e.g. moving a URL from one point release to the
81+next), feed the script the edited JSON, directing output to a new JSON file,
82+then diff the results for sanity before renaming the new file over the old:
83+
84+$ vim raspi/os*.json
85+$ ./refresh_os_list raspi/os*.json > new.json
86+$ diff raspi/os*.json new.json
87+$ mv new.json raspi/os*.json
88+
89+The only mandatory field in each entry is "url"; the script will fill out all
90+other missing fields (albeit with place-holders in the case of "name" and
91+"description" which it can't derive). If you need to add new images to the
92+JSON, you can simply add a new entry with just the "url" field and have the
93+script fill out everything else (remembering to replace the "name" and
94+"description" placeholders afterwards!).
95+
96+If you need to override the "url" field (e.g. when dealing with the list prior
97+to release), add an "override_url" field which will be removed from the output but
98+will be used as the actual source of the image to calculate hashes and sizes.
99+"""
100+
101 import os
102 import io
103 import sys
104@@ -12,6 +37,7 @@
105 import textwrap
106 import warnings
107 import functools
108+import contextlib
109 import datetime as dt
110 from html.parser import HTMLParser
111 from urllib.parse import urlsplit, urlunsplit
112@@ -21,36 +47,23 @@
113
114
115 def main(args=None):
116- """
117- A script for generating the JSON list required by the Raspberry Pi imaging
118- utility. Takes an existing JSON list as input and updates it by calculating
119- new image sizes and check-sums from the source files on the server. Typical
120- usage is to tweak the existing JSON (e.g. moving a URL from one point
121- release to the next), feed the script the edited JSON, directing output to
122- a new JSON file, then diff the results for sanity before renaming the new
123- file over the old:
124-
125- $ vim raspi/os*.json
126- $ ./refresh_os_list raspi/os*.json > new.json
127- $ diff raspi/os*.json new.json
128- $ mv new.json raspi/os*.json
129-
130- The only mandatory field in each entry is "url"; the script will fill out
131- all other missing fields (albeit with place-holders in the case of "name"
132- and "description" which it can't derive). If you need to add new images to
133- the JSON, you can simply add a new entry with just the "url" field and have
134- the script fill out everything else (remembering to replace the "name" and
135- "description" placeholders afterwards!).
136-
137- If you need to override the "url" field (e.g. when dealing with the list
138- prior to release), add an "override_url" field which will removed from the
139- output but will be used as the actual source of the image to calculate
140- hashes and sizes.
141- """
142- if args is None:
143- args = sys.argv[1:]
144+ if sys.version_info < (3, 8):
145+ raise SystemExit('This script requires Python 3.8 or later')
146+
147+ if int(os.environ.get('TEST', '0')):
148+ # To run the test suite (via the built-in doctest module):
149+ #
150+ # $ TEST=1 ./refresh_os_list
151+ #
152+ # Optionally, if you have python3-coverage installed, and you want to
153+ # track the coverage of the test suite you can further do:
154+ #
155+ # $ TEST=1 python3-coverage run --source=./ ./refresh_os_list
156+ # $ python3-coverage report --show-missing
157+ return _test_main()
158+
159 parser = argparse.ArgumentParser(
160- description=textwrap.dedent(main.__doc__),
161+ description=textwrap.dedent(__doc__),
162 formatter_class=argparse.RawDescriptionHelpFormatter)
163 parser.add_argument(
164 'input_file', type=argparse.FileType('r', encoding='utf-8'),
165@@ -64,28 +77,57 @@
166 help="Force the utility to refresh all images even if the release "
167 "date and download size have not changed in the index")
168 args = parser.parse_args(args)
169+
170 try:
171 update_template(args.input_file, args.output_file, args.force)
172 except Exception as e:
173- # NOTE: If you want full stack traces just run me like this:
174- # DEBUG=1 ./refresh_os_list blah
175- if not int(os.environ.get('DEBUG', '0')):
176+ # If you want full stack traces just run me like this:
177+ #
178+ # $ DEBUG=1 ./refresh_os_list blah
179+ if int(os.environ.get('DEBUG', '0')):
180+ raise
181+ else:
182 print(str(e), file=sys.stderr)
183 sys.exit(1)
184- else:
185- raise
186-
187-
188-def update_template(input_file, output_file, force=False):
189+
190+
191+def update_template(
192+ input_file, output_file, force=False,
193+ icon_url='https://assets.ubuntu.com/v1/85a9de76-ubuntu-icon.svg'):
194 """
195 Reads JSON data from *input_file* (a file-like object), updates any entries
196 found to have incorrect size and/or check-sums, fills out mandatory missing
197 fields with placeholders (name, description, website), and updates the
198 release-date (if it's out of date). The output is written (again in JSON
199- format) to *output_file* (another file-like object).
200+ format) to *output_file* (another file-like object). For example::
201+
202+ >>> images = {
203+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo' * 123456),
204+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar' * 234567),
205+ ... }
206+ >>> ts = dt.datetime(2021, 10, 25)
207+ >>> with contextlib.redirect_stderr(io.StringIO()):
208+ ... with _test_server(_make_index(_make_sums(images), ts)) as url:
209+ ... input_file = io.StringIO(json.dumps({'os_list': [
210+ ... {'url': f'{url}impish-armhf+raspi.img.gz'},
211+ ... {'url': f'{url}impish-arm64+raspi.img.gz'},
212+ ... ]}))
213+ ... output_file = io.StringIO()
214+ ... update_template(input_file, output_file)
215+ >>> output = json.loads(output_file.getvalue())
216+ >>> len(output['os_list'])
217+ 2
218+ >>> sorted(output['os_list'][0].keys()) # doctest: +NORMALIZE_WHITESPACE
219+ ['description', 'extract_sha256', 'extract_size', 'icon',
220+ 'image_download_sha256', 'image_download_size', 'name',
221+ 'release_date', 'url', 'website']
222+ >>> output['os_list'][0].keys() == output['os_list'][1].keys()
223+ True
224
225 If the *force* parameter is ``True`` then all entries are updated
226- regardless of whether they are up to date or not.
227+ regardless of whether they are up to date or not. The *icon_url* parameter
228+ optionally specifies the URL of the icon to include in entries that lack
229+ one.
230
231 Progress information is printed to stderr while the routine is running.
232 """
233@@ -95,7 +137,7 @@
234 # diff'ing the output substantially easier
235 template = json.load(input_file, object_pairs_hook=OrderedDict)
236 if not isinstance(template, dict):
237- raise ValueError('expected a JSON object in {}'.format(args.input_file))
238+ raise ValueError(f'expected a JSON object in {input_file.name}')
239 if template.keys() != {'os_list'}:
240 raise ValueError('expected a single "os_list" entry')
241 if not all('url' in entry for entry in template['os_list']):
242@@ -105,7 +147,7 @@
243 url = entry.get('override_url', entry['url'])
244 source = get_entry(url)
245 if 'icon' not in entry:
246- entry['icon'] = 'https://assets.ubuntu.com/v1/85a9de76-ubuntu-icon.svg'
247+ entry['icon'] = icon_url
248 if 'name' not in entry:
249 warnings.warn(Warning('Inserted placeholder entries; check output'))
250 entry['name'] = 'PLACEHOLDER'
251@@ -126,11 +168,10 @@
252 ).date() < source.date
253 )
254 if update:
255- print('Updating {}'.format(source.name), file=sys.stderr)
256+ print(f'Updating {source.name}', file=sys.stderr)
257 entry.update(update_entry(entry, source))
258 else:
259- print('Entry for {} is up to date'.format(source.name),
260- file=sys.stderr)
261+ print(f'Entry for {source.name} is up to date', file=sys.stderr)
262 # Always update image_download_sha256 to fill it out when missing
263 entry['image_download_sha256'] = source.sha256
264 entry.pop('override_url', None)
265@@ -140,18 +181,58 @@
266 urlopen(entry['website'])
267 except HTTPError as exc:
268 warnings.warn(
269- Warning('Failed to access {entry[website]}: {exc!r}'.format(
270- entry=entry, exc=exc)))
271+ Warning(f'Failed to access {entry["website"]}: {exc!r}'))
272 json.dump(template, output_file, indent=4)
273
274
275 class HashStream:
276+ """
277+ When constructed with *stream*, a file-like object, this class proxies the
278+ usual :meth:`~io.BufferedIOBase.read` calls (which must be sequential, as
279+ the class provides no "seek" method), totting up the number of bytes read
280+ in the :attr:`size` attribute and the SHA256 checksum of all data read in
281+ the :attr:`cksum` attribute.
282+
283+ For example::
284+
285+ >>> import io
286+ >>> stream = io.BytesIO(b'foo bar baz')
287+ >>> h = HashStream(stream)
288+ >>> h.size
289+ 0
290+ >>> h.cksum
291+ 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
292+ >>> h.read(3)
293+ b'foo'
294+ >>> h.size
295+ 3
296+ >>> h.cksum
297+ '2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae'
298+ >>> h.read()
299+ b' bar baz'
300+ >>> h.size == len(stream.getvalue())
301+ True
302+ >>> h.cksum
303+ 'dbd318c1c462aee872f41109a4dfd3048871a03dedd0fe0e757ced57dad6f2d7'
304+
305+ The class may also be used as a context manager, in which case exiting the
306+ context manager closes the underlying stream::
307+
308+ >>> stream = io.BytesIO(b'foo bar baz')
309+ >>> stream.closed
310+ False
311+ >>> with HashStream(stream) as h:
312+ ... print(h.read().decode('ascii'))
313+ foo bar baz
314+ >>> stream.closed
315+ True
316+ """
317 def __init__(self, stream):
318 self.stream = stream
319 self._cksum = hashlib.sha256()
320 self._size = 0
321
322- def read(self, n):
323+ def read(self, n=-1):
324 result = self.stream.read(n)
325 self._size += len(result)
326 self._cksum.update(result)
327@@ -165,10 +246,17 @@
328
329 @property
330 def cksum(self):
331+ """
332+ The SHA256 checksum of all data read so far, returned as a lowercased
333+ hex-string.
334+ """
335 return self._cksum.hexdigest().lower()
336
337 @property
338 def size(self):
339+ """
340+ The number of bytes of data read so far.
341+ """
342 return self._size
343
344
345@@ -180,7 +268,27 @@
346 It stores the content of all ``<th>`` and ``<td>`` tags under each ``<tr>``
347 tag in the :attr:`table` attribute as a list of lists (the outer list of
348 rows, the inner lists of cells within those rows). All data is represented
349- as strings, or as ``None`` for entirely empty entries.
350+ as strings, or as ``None`` for entirely empty entries. For example::
351+
352+ >>> html = '''
353+ ... <html><body><table>
354+ ... <p>A table:
355+ ... <tr><th>#</th><th>Name</th></tr>
356+ ... <tr><td>1</td><td>foo</td></tr>
357+ ... <tr><td>2</td><td>bar</td></tr>
358+ ... <tr><td></td><td>quux</td></tr>
359+ ... </table></body></html>
360+ ... '''
361+ >>> parser = TableParser()
362+ >>> parser.feed(html)
363+ >>> parser.table
364+ [['#', 'Name'], ['1', 'foo'], ['2', 'bar'], [None, 'quux']]
365+
366+ .. note::
367+
368+ As this is a subclass of an HTML parser (as opposed to an XML parser)
369+ there is no requirement that the input be strictly valid XML; hence the
370+ lack of a closing ``</p>`` tag above is acceptable.
371 """
372 def __init__(self):
373 super().__init__(convert_charrefs=True)
374@@ -217,6 +325,20 @@
375 """
376 Given the *url* of an image, returns an :class:`IndexEntry` named tuple
377 containing the url, name, generated date, SHA-256 check-sum, and file size.
378+ For example::
379+
380+ >>> images = {
381+ ... 'impish-armhf+raspi.img.xz': b'foo' * 123456,
382+ ... 'impish-arm64+raspi.img.xz': b'bar' * 234567,
383+ ... }
384+ >>> ts = dt.datetime(2021, 10, 25)
385+ >>> with _test_server(_make_index(_make_sums(images), ts)) as url:
386+ ... entry = get_entry(f'{url}impish-armhf+raspi.img.xz')
387+ >>> entry # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
388+ IndexEntry(url='http://127.0.0.1:4444/impish-armhf+raspi.img.xz',
389+ name='impish-armhf+raspi.img.xz', date=datetime.date(2021, 10, 25),
390+ sha256='f4c1a97c6eef546ce6814a31abd371f3072bd9056377fcfee...',
391+ size=370368)
392 """
393 split = urlsplit(url)
394 path, name = split.path.rsplit('/', 1)
395@@ -224,19 +346,35 @@
396 try:
397 entry = get_index(index)[name]
398 except KeyError:
399- raise ValueError('unable to find {}; are you sure the filename is '
400- 'correct?'.format(url))
401+ raise ValueError(
402+ f'unable to find {url}; are you sure the filename is correct?')
403 if entry.size is None or entry.sha256 is None:
404- raise ValueError('unable to retrieve file-size or checksum for '
405- '{}'.format(url))
406+ raise ValueError(f'unable to retrieve file-size or checksum for {url}')
407 return entry
408
409
410-@functools.lru_cache()
411+@functools.lru_cache
412 def get_index(url):
413 """
414 Given the *url* of a cdimage directory containing images, returns a dict
415 mapping filenames to :class:`IndexEntry` named tuples.
416+ For example::
417+
418+ >>> images = {
419+ ... 'impish-armhf+raspi.img.xz': b'foo' * 123456,
420+ ... 'impish-arm64+raspi.img.xz': b'bar' * 234567,
421+ ... }
422+ >>> ts = dt.datetime(2021, 10, 25)
423+ >>> with _test_server(_make_index(_make_sums(images), ts)) as url:
424+ ... index = get_index(url)
425+ >>> sorted(index.keys())
426+ ['impish-arm64+raspi.img.xz', 'impish-armhf+raspi.img.xz']
427+ >>> entry = index['impish-arm64+raspi.img.xz']
428+ >>> entry # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
429+ IndexEntry(url='http://127.0.0.1:4444/impish-arm64+raspi.img.xz',
430+ name='impish-arm64+raspi.img.xz', date=datetime.date(2021, 10, 25),
431+ sha256='e9cd9718e97ac951c0ead5de8069d0ff5de188620b12b02...',
432+ size=703701)
433 """
434 # NOTE: This code relies on the current layout of pages on
435 # cdimage.ubuntu.com; if extra tables or columns are introduced or
436@@ -246,8 +384,8 @@
437 with urlopen(url) as page:
438 parser.feed(page.read().decode('utf-8'))
439 except HTTPError:
440- raise ValueError('unable to find {}; are you sure the path is '
441- 'correct?'.format(url))
442+ raise ValueError(
443+ f'unable to get {url}; are you sure the path is correct?')
444 entries = {}
445 for row in parser.table:
446 try:
447@@ -277,6 +415,7 @@
448 entries[name] = entries[name]._replace(sha256=cksum)
449 except KeyError:
450 pass
451+ entries.pop('SHA256SUMS', None)
452 return entries
453
454
455@@ -284,10 +423,26 @@
456 """
457 Given an *entry* (a dict, read from the source JSON) representing a single
458 image, and a *source* (an :class:`IndexEntry` namedtuple) representing the
459- current state of that source on the cdimage server, download the
460- specified image and update the "extract_size", "extract_sha256",
461+ current state of that source on the cdimage server, download the specified
462+ image and update the "extract_size", "extract_sha256",
463 "image_download_size", "image_download_sha256", and "release_date" fields
464- (as necessary), returning the new entry.
465+ (as necessary), returning the new entry. For example::
466+
467+ >>> images = {
468+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo' * 123456),
469+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar' * 234567),
470+ ... }
471+ >>> ts = dt.datetime(2021, 10, 25)
472+ >>> with _test_server(_make_index(_make_sums(images), ts)) as url:
473+ ... entry = {'url': f'{url}impish-armhf+raspi.img.gz'}
474+ ... new_entry = update_entry(entry, get_entry(entry['url']))
475+ >>> sorted(new_entry.keys()) # doctest: +NORMALIZE_WHITESPACE
476+ ['extract_sha256', 'extract_size', 'image_download_sha256',
477+ 'image_download_size', 'release_date', 'url']
478+ >>> new_entry['release_date']
479+ '2021-10-25'
480+ >>> new_entry['extract_size']
481+ 370368
482
483 Note: the routine does *not* modify the entry dict passed in.
484 """
485@@ -300,8 +455,8 @@
486 }[source.url.rsplit('.', 1)[1]]
487 with HashStream(decompresser(compressed_image)) as decompressed_image:
488 while True:
489- print('\rDownloading & verifying image:{:5.1f}%'.format(
490- compressed_image.size * 100 / source.size),
491+ print('\rDownloading & verifying image:'
492+ f'{compressed_image.size * 100 / source.size:5.1f}%',
493 end='', file=sys.stderr)
494 buf = decompressed_image.read(65536)
495 if not buf:
496@@ -326,5 +481,480 @@
497 return entry
498
499
500+def _test_main():
501+ """
502+ Run the test suite via doctest.
503+ """
504+ # All functions from here on are purely for the benefit of the test suite
505+ import doctest
506+
507+ # Undecorate get_index to prevent the cache from breaking many tests (could
508+ # use get_index.cache_clear but this hack is marginally cleaner at least
509+ # from the perspective of the tests themselves)
510+ global get_index
511+ get_index = get_index.__wrapped__
512+ failures, total = doctest.testmod()
513+ return bool(failures)
514+
515+
516+@contextlib.contextmanager
517+def _test_server(files, *, host='127.0.0.1', port=4444):
518+ """
519+ This function provides a test HTTP server for the doctest suite.
520+
521+ It expects to be called with *files*, a :class:`dict` mapping filenames
522+ to byte-strings representing file contents. All contents will be written to
523+ a temporary directory, and a trivial HTTP server will be started to serve
524+ its content on the specified *host* and *port* (defaults to port 4444 on
525+ localhost).
526+
527+ The function acts as a context manager, cleaning up the http daemon and
528+ temporary directory upon exit. The URL of the root of the server is yielded
529+ by the context manager.
530+ """
531+ import tempfile
532+ import http.server
533+ from pathlib import Path
534+ from threading import Thread
535+
536+ class SilentHandler(http.server.SimpleHTTPRequestHandler):
537+ def log_message(self, fmt, *args):
538+ # Don't spam the console
539+ pass
540+
541+ with tempfile.TemporaryDirectory() as temp:
542+ for filename, data in files.items():
543+ filepath = Path(temp) / filename
544+ filepath.write_bytes(data)
545+
546+ handler = functools.partial(SilentHandler, directory=temp)
547+ with http.server.ThreadingHTTPServer(
548+ (host, port), handler) as httpd:
549+ httpd_thread = Thread(target=httpd.serve_forever)
550+ httpd_thread.start()
551+ try:
552+ yield f'http://{host}:{port}/'
553+ finally:
554+ httpd.shutdown()
555+ httpd_thread.join(timeout=5)
556+ assert not httpd_thread.is_alive()
557+
558+
559+def _make_sums(files):
560+ """
561+ This function exists to generate SHA256SUMS files for the doctest suite.
562+
563+ Given *files*, a :class:`dict` mapping filenames to byte-strings of
564+ file contents, this function returns a new :class:`dict` which is a copy of
565+ *files* with one additional entry titled "SHA256SUMS" which contains the
566+ output of the "sha256sum" command for the given content.
567+ """
568+ files = files.copy()
569+ files['SHA256SUMS'] = '\n'.join(
570+ f'{hashlib.sha256(data).hexdigest()} {filename}'
571+ for filename, data in files.items()
572+ ).encode('ascii')
573+ return files
574+
575+
576+def _make_index(files, timestamp=None):
577+ """
578+ This function generates index.html files for the doctest suite.
579+
580+ Given *files*, a :class:`dict` mapping image filenames to byte-strings
581+ of file contents, this function generates an appropriate "index.html" file,
582+ returning a copy of the original :class:`dict` with this new entry.
583+
584+ Additionally *timestamp*, a :class:`~datetime.datetime` representing the
585+ last modification date, can be specified. It defaults to the current time
586+ if not given.
587+ """
588+ if timestamp is None:
589+ timestamp = dt.datetime.now()
590+ files = files.copy()
591+ rows = '\n'.join(
592+ f'<tr><td>Icon</td><td>{filename}</td>'
593+ f'<td>{timestamp.strftime("%Y-%m-%d %H:%M")}</td>'
594+ f'<td>{len(data) // 1024}K</td><td>Descriptive text</td></tr>'
595+ for filename, data in files.items()
596+ )
597+ files['index.html'] = f"""
598+ <html><body>
599+ <p>The following files are available:</p>
600+ <table>
601+ <tr><th></th><th>Name</th><th>LastMod</th><th>Size</th><th>Desc</th></tr>
602+ {rows}
603+ </table>
604+ </body></html>
605+ """.encode('utf-8')
606+ return files
607+
608+
609+# Extra tests to bump test suite coverage
610+
611+__test__ = {
612+ 'bad-json': """
613+ The input file must be valid JSON::
614+
615+ >>> import io
616+ >>> update_template(io.StringIO('foo'), io.StringIO())
617+ Traceback (most recent call last):
618+ File "<stdin>", line 2, in <module>
619+ update_template(io.StringIO('foo'), io.StringIO())
620+ File "./refresh_os_list", line 117, in update_template
621+ template = json.load(input_file, object_pairs_hook=OrderedDict)
622+ File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode
623+ raise JSONDecodeError("Expecting value", s, err.value) from None
624+ json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
625+
626+ The input file has to contain a JSON object at the top level::
627+
628+ >>> data = io.StringIO('[]')
629+ >>> data.name = 'foo'
630+ >>> update_template(data, io.StringIO())
631+ Traceback (most recent call last):
632+ File "<stdin>", line 5, in <module>
633+ update_template(data, io.StringIO())
634+ File "./refresh_os_list", line 119, in update_template
635+ raise ValueError(f'expected a JSON object in {input_file.name}')
636+ ValueError: expected a JSON object in foo
637+
638+ The top-level JSON object must contain a single "os_list" entry::
639+
640+ >>> data = io.StringIO('{"foo": 1}')
641+ >>> data.name = 'foo'
642+ >>> update_template(data, io.StringIO())
643+ Traceback (most recent call last):
644+ File "<stdin>", line 8, in <module>
645+ update_template(data, io.StringIO())
646+ File "./refresh_os_list", line 121, in update_template
647+ raise ValueError('expected a single "os_list" entry')
648+ ValueError: expected a single "os_list" entry
649+
650+ Every JSON object below os_list must have a "url" field::
651+
652+ >>> data = io.StringIO(
653+ ... '{"os_list": {"name": "impish-server-arm64+raspi.img.xz"}}')
654+ >>> data.name = 'foo'
655+ >>> update_template(data, io.StringIO())
656+ Traceback (most recent call last):
657+ File "<stdin>", line 11, in <module>
658+ update_template(data, io.StringIO())
659+ File "./refresh_os_list", line 123, in update_template
660+ raise ValueError('all "os_list" entries must contain a "url" entry')
661+ ValueError: all "os_list" entries must contain a "url" entry
662+ """,
663+
664+ 'bad-url': """
665+ The URL provided to get_entry must be valid::
666+
667+ >>> images = {
668+ ... 'impish-armhf+raspi.img.xz': b'foo',
669+ ... 'impish-arm64+raspi.img.xz': b'bar',
670+ ... }
671+ >>> with _test_server(_make_index(_make_sums(images))) as url:
672+ ... wrong_url = f'{url}wrong/impish-armhf+raspi.img.xz'
673+ ... get_entry(wrong_url) # doctest: +ELLIPSIS
674+ Traceback (most recent call last):
675+ File "<stdin>", line 3, in <module>
676+ get_entry(wrong_url) # doctest: +ELLIPSIS
677+ File "./refresh_os_list", line 327, in get_entry
678+ entry = get_index(index)[name]
679+ File "./refresh_os_list", line 368, in get_index
680+ raise ValueError(...)
681+ ValueError: unable to get http://...; are you sure the path is correct?
682+ >>> with _test_server(_make_index(_make_sums(images))) as url:
683+ ... image_url = f'{url}focal-arm64+raspi.img.xz'
684+ ... get_entry(image_url) # doctest: +ELLIPSIS
685+ Traceback (most recent call last):
686+ File "<stdin>", line 4, in <module>
687+ entry = get_entry(f'{url}focal-arm64+raspi.img.xz')
688+ File "./refresh_os_list", line 328, in get_entry
689+ raise ValueError(...)
690+ ValueError: unable to find http://...; are you sure the filename is correct?
691+ """,
692+
693+ 'no-checksums': """
694+ The SHA256SUMS file must exist on the server::
695+
696+ >>> images = {
697+ ... 'hirsute-armhf+raspi.img.xz': b'foo',
698+ ... 'hirsute-arm64+raspi.img.xz': b'bar',
699+ ... }
700+ >>> with _test_server(_make_index(images)) as url:
701+ ... image_url = f'{url}hirsute-arm64+raspi.img.xz'
702+ ... get_entry(image_url) # doctest: +ELLIPSIS
703+ Traceback (most recent call last):
704+ File "<stdin>", line 3, in <module>
705+ get_entry(image_url)
706+ File "./refresh_os_list", line 332, in get_entry
707+ raise ValueError(f'unable to retrieve file-size or checksum for {url}')
708+ ValueError: unable to retrieve file-size or checksum for http://...
709+ """,
710+
711+ 'ignore-star-prefixes': """
712+ Filenames in checksum files can have star prefixes (indicating binary
713+ input) which should be ignored::
714+
715+ >>> images = _make_sums({
716+ ... 'impish-armhf+raspi.img.xz': b'foo' * 123456,
717+ ... 'impish-arm64+raspi.img.xz': b'bar' * 234567,
718+ ... })
719+ >>> cksums = images['SHA256SUMS'].decode('utf-8').splitlines(True)
720+ >>> cksums = [f'{cksum} *{filename}' for line in cksums
721+ ... for cksum, filename in (line.split(None, 1),)]
722+ >>> images['SHA256SUMS'] = ''.join(cksums).encode('utf-8')
723+ >>> ts = dt.datetime(2021, 10, 25)
724+ >>> with _test_server(_make_index(images, ts)) as url:
725+ ... entry = get_entry(f'{url}impish-armhf+raspi.img.xz')
726+ >>> entry # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
727+ IndexEntry(url='http://127.0.0.1:4444/impish-armhf+raspi.img.xz',
728+ name='impish-armhf+raspi.img.xz', date=datetime.date(2021, 10, 25),
729+ sha256='f4c1a97c6eef546ce6814a31abd371f3072bd9056377fcfee...',
730+ size=370368)
731+ """,
732+
733+ 'ignore-extra-cksums': """
734+ Files may be present in the checksum file which we didn't find (or more
735+ likely ignored) in the index.html. This should not cause an error::
736+
737+ >>> images = _make_sums({
738+ ... 'impish-armhf+raspi.img.xz': b'foo' * 123456,
739+ ... 'impish-arm64+raspi.img.xz': b'bar' * 234567,
740+ ... })
741+ >>> cksums = images['SHA256SUMS'].decode('utf-8')
742+ >>> cksums += '\\n' + '0123abcd' * 8 + ' weird-hash.img.xz'
743+ >>> images['SHA256SUMS'] = cksums.encode('utf-8')
744+ >>> ts = dt.datetime(2021, 10, 25)
745+ >>> with _test_server(_make_index(images, ts)) as url:
746+ ... entry = get_entry(f'{url}impish-armhf+raspi.img.xz')
747+ >>> entry # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
748+ IndexEntry(url='http://127.0.0.1:4444/impish-armhf+raspi.img.xz',
749+ name='impish-armhf+raspi.img.xz', date=datetime.date(2021, 10, 25),
750+ sha256='f4c1a97c6eef546ce6814a31abd371f3072bd9056377fcfee...',
751+ size=370368)
752+ """,
753+
754+ 'ignore-bad-rows': """
755+ If the table contains rows with the wrong number of columns, or rows with
756+ timestamps that cannot be converted, ignore those rows::
757+
758+ >>> content = _make_sums({
759+ ... 'hirsute-armhf+raspi.img.xz': b'foo',
760+ ... 'hirsute-arm64+raspi.img.xz': b'bar',
761+ ... })
762+ >>> content['index.html'] = f'''
763+ ... <html><body><table>
764+ ... <tr>
765+ ... <th></th><th>Name</th>
766+ ... <th>LastMod</th><th>Size</th><th>Desc</th>
767+ ... </tr>
768+ ... <tr><td>Silly row with one column</td></tr>
769+ ... <tr>
770+ ... <td>Icon</td>
771+ ... <td>SHA256SUMS</td>
772+ ... <td>2021-10-25 00:00</td><td>{len(content['SHA256SUMS'])}</td>
773+ ... <td>sha256 checksums</td>
774+ ... </tr>
775+ ... <tr>
776+ ... <td>Icon</td>
777+ ... <td>hirsute-armhf+raspi.img.xz</td>
778+ ... <td>2021-10-25 00:00</td><td>3</td><td>Hirsute release for Pi</td>
779+ ... </tr>
780+ ... <tr>
781+ ... <td>Icon</td>
782+ ... <td>hirsute-arm64+raspi.img.xz</td>
783+ ... <td>2021-10-25 00:00</td><td>3</td><td>Hirsute release for Pi</td>
784+ ... </tr>
785+ ... <tr>
786+ ... <td>Icon</td>
787+ ... <td>hirsute-amd64.img.xz</td>
788+ ... <td>Bad timestamp</td><td>3</td><td>Focal release for Pi</td>
789+ ... </tr>
790+ ... </table></body></html>
791+ ... '''.encode('utf-8')
792+ >>> with _test_server(content) as url:
793+ ... image_url = f'{url}hirsute-armhf+raspi.img.xz'
794+ ... entry = get_entry(image_url)
795+ >>> entry # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
796+ IndexEntry(url='http://127.0.0.1:4444/hirsute-armhf+raspi.img.xz',
797+ name='hirsute-armhf+raspi.img.xz', date=datetime.date(2021, 10, 25),
798+ sha256='2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f...',
799+ size=3)
800+ """,
801+
802+ 'corrupted-size': """
803+ We should notice if the downloaded size doesn't match what's reported by
804+ the server. Here, we grab the new source entry for an image, then switch
805+ out the content of the image before calling update_entry so it appears the
806+ image has been corrupted on the server::
807+
808+ >>> images = _make_sums({
809+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo'),
810+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar'),
811+ ... })
812+ >>> ts = dt.datetime(2021, 10, 25)
813+ >>> with _test_server(_make_index(images, ts)) as url:
814+ ... entry = {'url': f'{url}impish-armhf+raspi.img.gz'}
815+ ... source = get_entry(entry['url'])
816+ >>> images['impish-armhf+raspi.img.gz'] = gzip.compress(b'corrupted!')
817+ >>> with _test_server(_make_index(images, ts)) as url:
818+ ... update_entry(entry, source)
819+ Traceback (most recent call last):
820+ File "<stdin>", line 6, in <module>
821+ update_entry(entry, source)
822+ File "./refresh_os_list", line 451, in update_entry
823+ raise ValueError('Corrupted download; size check failed')
824+ ValueError: Corrupted download; size check failed
825+ """,
826+
827+ 'corrupted-checksum': """
828+ If the size of a download matches, but the SHA256 checksum doesn't, we
829+ should again notice. In this case we use gzip compresslevel=0 (no
830+ compression) with data which is different but the same size as the
831+ original, and pull the same trick of grabbing a new source entry then
832+ switching out the file on the server before calling update_entry::
833+
834+ >>> images = _make_sums({
835+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo', compresslevel=0),
836+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar', compresslevel=0),
837+ ... })
838+ >>> ts = dt.datetime(2021, 10, 25)
839+ >>> with _test_server(_make_index(images, ts)) as url:
840+ ... entry = {'url': f'{url}impish-armhf+raspi.img.gz'}
841+ ... source = get_entry(entry['url'])
842+ >>> images['impish-armhf+raspi.img.gz'] = gzip.compress(b'baz', compresslevel=0)
843+ >>> with _test_server(_make_index(images, ts)) as url:
844+ ... update_entry(entry, source)
845+ Traceback (most recent call last):
846+ File "<stdin>", line 6, in <module>
847+ update_entry(entry, source)
848+ File "./refresh_os_list", line 453, in update_entry
849+ raise ValueError('Corrupted download; SHA256 check failed')
850+ ValueError: Corrupted download; SHA256 check failed
851+ """,
852+
853+ 'use-existing-fields': """
853+ If entries bring their own name, description, and/or website fields we
854+ should simply keep them as-is::
856+
857+ >>> images = {
858+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo'),
859+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar'),
860+ ... }
861+ >>> ts = dt.datetime(2021, 10, 25)
862+ >>> with contextlib.redirect_stderr(io.StringIO()):
863+ ... with _test_server(_make_index(_make_sums(images), ts)) as url:
864+ ... input_file = io.StringIO(json.dumps({'os_list': [
865+ ... {'name': name,
866+ ... 'website': 'https://ubuntu.com/',
867+ ... 'url': f'{url}{name}',
868+ ... 'description': f'The {name} image'}
869+ ... for name in images
870+ ... ]}))
871+ ... output_file = io.StringIO()
872+ ... update_template(input_file, output_file)
873+ >>> output = json.loads(output_file.getvalue())
874+ >>> len(output['os_list'])
875+ 2
876+ >>> sorted(output['os_list'][0].keys()) # doctest: +NORMALIZE_WHITESPACE
877+ ['description', 'extract_sha256', 'extract_size', 'icon',
878+ 'image_download_sha256', 'image_download_size', 'name',
879+ 'release_date', 'url', 'website']
880+ >>> output['os_list'][0]['website']
881+ 'https://ubuntu.com/'
882+ """,
883+
884+ 'keep-up-to-date': """
885+ If an entry is already up to date (same size, same release date) don't
886+ bother to update it. In fact, as long as the existing entry date is greater
887+ than or equal to the date on the server we accept it (to deal with official
888+ release dates later than the image generation date)::
889+
890+ >>> images = {
891+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo'),
892+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar'),
893+ ... }
894+ >>> ts = dt.datetime(2021, 10, 25)
895+ >>> with contextlib.redirect_stderr(io.StringIO()):
896+ ... with _test_server(_make_index(_make_sums(images), ts)) as url:
897+ ... input_file = io.StringIO(json.dumps({'os_list': [
898+ ... {'name': name,
899+ ... 'image_download_sha256':
900+ ... hashlib.sha256(images[name]).hexdigest(),
901+ ... 'image_download_size': len(images[name]),
902+ ... 'icon': 'https://ubuntu.com/ubuntu.svg',
903+ ... 'extract_sha256': '0123abcd' * 8,
904+ ... 'extract_size': 3,
905+ ... 'website': 'https://ubuntu.com/',
906+ ... 'url': f'{url}{name}',
907+ ... 'release_date': '2021-10-27',
909+ ... 'description': f'The {name} image'}
909+ ... for name in images
910+ ... ]}))
911+ ... output_file = io.StringIO()
912+ ... update_template(input_file, output_file)
913+ >>> output = json.loads(output_file.getvalue())
914+ >>> output['os_list'][0]['extract_sha256'] == '0123abcd' * 8
915+ True
916+
917+ Note that the extracted SHA256 is *not* re-checked because we don't even
918+ bother to download the image; the untouched placeholder above proves it.
919+ """,
920+
921+ 'main --help': """
922+ Running the main function with --help results in help output::
923+
924+ >>> import os
925+ >>> os.environ['TEST'] = '0' # don't run the test-suite recursively!
926+ >>> try:
927+ ... main(['--help']) # doctest: +ELLIPSIS
928+ ... except SystemExit:
929+ ... pass
930+ usage: refresh_os_list [-h] ... input_file [output_file]
931+ ...
932+ """,
933+
934+ 'main works': """
935+ Running the main function actually does what it says on the tin::
936+
937+ >>> import tempfile
938+ >>> from pathlib import Path
939+ >>> tmp = tempfile.TemporaryDirectory()
940+ >>> os.environ['TEST'] = '0' # don't run the test-suite recursively!
941+ >>> images = {
942+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo'),
943+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar'),
944+ ... }
945+ >>> ts = dt.datetime(2021, 10, 25)
946+ >>> with _test_server(_make_index(_make_sums(images), ts)) as url:
947+ ... old_data = {'os_list': [
948+ ... {'url': f'{url}{name}'}
949+ ... for name in images
950+ ... ]}
951+ ... old_filename = Path(tmp.name) / 'old.json'
952+ ... old_size = old_filename.write_text(json.dumps(old_data))
953+ ... new_filename = Path(tmp.name) / 'new.json'
954+ ... with contextlib.redirect_stderr(io.StringIO()):
955+ ... main([str(old_filename), str(new_filename)])
956+ ... new_data = json.loads(new_filename.read_text())
957+ >>> list(new_data.keys())
958+ ['os_list']
959+ >>> len(new_data['os_list'])
960+ 2
961+ >>> entry = new_data['os_list'][0]
962+ >>> sorted(entry.keys()) # doctest: +NORMALIZE_WHITESPACE
963+ ['description', 'extract_sha256', 'extract_size', 'icon',
964+ 'image_download_sha256', 'image_download_size', 'name',
965+ 'release_date', 'url', 'website']
966+ >>> entry['extract_size']
967+ 3
968+ >>> entry['extract_sha256']
969+ '2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae'
970+ >>> tmp.cleanup()
971+ """,
972+}
973+
974+
975 if __name__ == '__main__':
976 main()
