Merge lp:~waveform/meta-release/ubuntu into lp:~ubuntu-core-dev/meta-release/ubuntu

Proposed by Dave Jones
Status: Merged
Merged at revision: 288
Proposed branch: lp:~waveform/meta-release/ubuntu
Merge into: lp:~ubuntu-core-dev/meta-release/ubuntu
Diff against target: 976 lines (+714/-84)
2 files modified
raspi/os_list_imagingutility_ubuntu.json (+24/-24)
refresh_os_list (+690/-60)
To merge this branch: bzr merge lp:~waveform/meta-release/ubuntu
Reviewer: Brian Murray
Status: Needs Information
Review via email: mp+408826@code.launchpad.net

Commit message

Impish release changes and tests added to the refresh_os_list script.

Add tests via the built-in doctest module. Simply run the script with TEST=1 in the environment (e.g. TEST=1 ./refresh_os_list) to invoke the test-suite. If all tests pass, no output is produced and the exit-code is 0. You can add "-v" to run tests verbosely, but this tends to produce a lot of (quite confusing) output.
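
For reference, the dispatch that enables this is tiny; abridged from the diff below, main() checks the environment before doing any argument parsing:

    if int(os.environ.get('TEST', '0')):
        # Hand over to the doctest-based suite instead of the normal CLI
        return _test_main()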

Instructions are included (in comments under main()) for running the test suite under the coverage tool. Coverage is currently about 96%, which is as good as I can manage without going completely overboard.
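
The test entry point itself is just a thin wrapper around doctest (again, abridged from the diff):

    def _test_main():
        import doctest
        # Undecorate get_index so its lru_cache can't leak state between tests
        global get_index
        get_index = get_index.__wrapped__
        failures, total = doctest.testmod()
        return bool(failures)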

Description of the change

I'm in two minds as to whether this is the right approach. On the plus side:

* adding tests is good
* this keeps everything in a single file, which is good given this is a standalone utility only really applicable to this repo
* it adds no dependencies; everything the script and its test-suite require is built into a base Python 3 installation
* as the tests use doctest, they mostly enhance the documentation by providing useful examples (see the toy sketch after this list); those that exist largely for coverage rather than usefulness are pushed to the end of the file, away from the doc-strings
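
To illustrate the examples-as-documentation point: a doctest is just a doc-string whose interactive snippets are executed and checked verbatim. A toy sketch (hypothetical; not from the script itself):

    def double(n):
        """
        Return *n* multiplied by two. For example::

            >>> double(2)
            4
            >>> double('ab')
            'abab'
        """
        return n * 2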

On the minus side:

* it nearly trebles the size of the file (!)
* running the test-suite with coverage is non-obvious; I've left comments on doing so in the main function (which future developers would presumably (hopefully?!) read :), but it's nothing like as easy as good ol' pytest-cov
* it adds a few functions that are *purely* for the benefit of the test-suite (fixtures, basically; see the condensed sketch after this list); I've left commentary in their doc-strings noting that these are nothing to do with the "meat" of the script, but it still feels a bit ugly
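
For a sense of what those fixture functions look like, the largest one (_test_server in the diff) boils down to a context manager that writes a dict mapping filenames to byte-strings into a temporary directory and serves it over HTTP; a condensed sketch (the real one also silences the request logging):

    import contextlib, functools, http.server, tempfile
    from pathlib import Path
    from threading import Thread

    @contextlib.contextmanager
    def _test_server(files, *, host='127.0.0.1', port=4444):
        with tempfile.TemporaryDirectory() as temp:
            # Write each file's contents into the temporary directory
            for filename, data in files.items():
                (Path(temp) / filename).write_bytes(data)
            handler = functools.partial(
                http.server.SimpleHTTPRequestHandler, directory=temp)
            with http.server.ThreadingHTTPServer((host, port), handler) as httpd:
                thread = Thread(target=httpd.serve_forever)
                thread.start()
                try:
                    yield f'http://{host}:{port}/'
                finally:
                    httpd.shutdown()
                    thread.join()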

Anyway, feel free to reject this -- maybe it's not the best approach, but it's an interesting experiment nonetheless.

lp:~waveform/meta-release/ubuntu updated
284. By Brian Murray

merge Dave's changes which adds 21.10 for raspi

285. By Brian Murray

add in impish to meta-release and meta-release-proposed

286. By Brian Murray

add jammy to meta-release-development

287. By Brian Murray

add in jammy to meta-release-lts-development

Revision history for this message
Brian Murray (brian-murray) wrote :

Unless you think this is particularly useful I'd pass on merging this. Do you really want to add it?

review: Needs Information
Revision history for this message
Dave Jones (waveform) wrote :

On balance, I'd like to add it.

Beyond the basic "moar tests good!" sentiment, and despite it being a bloody huge commit that drastically extends the size of the script, it means that should I (or some other poor soul) ever need to touch this again (maybe one day, someone will decide cdimage needs a bit of a makeover), I can at least run the test-suite and be vaguely confident nothing's borked.

Oh, it also adds doc-strings to everything which always makes me happy!

lp:~waveform/meta-release/ubuntu updated
288. By Brian Murray

add tests to the refresh_os_list script

Preview Diff

1=== modified file 'raspi/os_list_imagingutility_ubuntu.json'
2--- raspi/os_list_imagingutility_ubuntu.json 2021-08-26 14:01:38 +0000
3+++ raspi/os_list_imagingutility_ubuntu.json 2021-10-14 14:58:33 +0000
4@@ -1,40 +1,40 @@
5 {
6 "os_list": [
7 {
8- "name": "Ubuntu Desktop 21.04 (RPi 4/400)",
9+ "name": "Ubuntu Desktop 21.10 (RPi 4/400)",
10 "description": "64-bit desktop OS for Pi 4 models with 4Gb+",
11- "url": "http://cdimage.ubuntu.com/releases/hirsute/release/ubuntu-21.04-preinstalled-desktop-arm64+raspi.img.xz",
12+ "url": "http://cdimage.ubuntu.com/releases/impish/release/ubuntu-21.10-preinstalled-desktop-arm64+raspi.img.xz",
13 "icon": "https://assets.ubuntu.com/v1/85a9de76-ubuntu-icon.svg",
14- "extract_size": 9026085888,
15- "extract_sha256": "de4a0e5a40bab9fd9ced8d6b04388370c8bce28a185ed449eaa54c03ce2d77d7",
16- "image_download_size": 1743725624,
17- "release_date": "2021-04-22",
18- "image_download_sha256": "d89ee327a00b98d7166b1a8cc95d17762aaacd3b4d0fc756c5b6b65df1708f48",
19- "website": "https://ubuntu.com/raspberry-pi/desktop"
20+ "release_date": "2021-10-14",
21+ "website": "https://ubuntu.com/raspberry-pi/desktop",
22+ "extract_size": 9400728576,
23+ "extract_sha256": "132af57a4bb711273c78e25119b5ec92766f8bc7884d9bf27f15faa176bc952c",
24+ "image_download_size": 2027828568,
25+ "image_download_sha256": "5187d507099f26bc4d8218085109af498fae5ff93b40c668f83bab5c7574d954"
26 },
27 {
28- "name": "Ubuntu Server 21.04 (RPi 2/3/4/400)",
29+ "name": "Ubuntu Server 21.10 (RPi 2/3/4/400)",
30 "description": "32-bit server OS for armhf architectures",
31- "url": "http://cdimage.ubuntu.com/releases/hirsute/release/ubuntu-21.04-preinstalled-server-armhf+raspi.img.xz",
32+ "url": "http://cdimage.ubuntu.com/releases/impish/release/ubuntu-21.10-preinstalled-server-armhf+raspi.img.xz",
33 "icon": "https://assets.ubuntu.com/v1/85a9de76-ubuntu-icon.svg",
34- "extract_size": 3211852800,
35- "extract_sha256": "8eeeaa116b91f4f622bc3abaa80453387783de6bfce77f7a51a02c5337f5c7e1",
36- "image_download_size": 756021496,
37- "release_date": "2021-04-21",
38- "image_download_sha256": "c9a9a5177a03fcbb6203b38e5c3c4e5447fd9e8891515da4146f319f04eb3495",
39- "website": "https://ubuntu.com/raspberry-pi/server"
40+ "release_date": "2021-10-14",
41+ "website": "https://ubuntu.com/raspberry-pi/server",
42+ "extract_size": 3732335616,
43+ "extract_sha256": "56caec8fd34aa4aec01641aa3ac3993d21468b375835ca40a7ccf948947ca353",
44+ "image_download_size": 886173524,
45+ "image_download_sha256": "341593c9607ed20744cd86941d94d73e3ba4f74e8ef2633eec63ce9b0cfac5a1"
46 },
47 {
48- "name": "Ubuntu Server 21.04 (RPi 3/4/400)",
49+ "name": "Ubuntu Server 21.10 (RPi 3/4/400)",
50 "description": "64-bit server OS for arm64 architectures",
51- "url": "http://cdimage.ubuntu.com/releases/hirsute/release/ubuntu-21.04-preinstalled-server-arm64+raspi.img.xz",
52+ "url": "http://cdimage.ubuntu.com/releases/impish/release/ubuntu-21.10-preinstalled-server-arm64+raspi.img.xz",
53 "icon": "https://assets.ubuntu.com/v1/85a9de76-ubuntu-icon.svg",
54- "extract_size": 3491662848,
55- "extract_sha256": "0d1eb068e55879ed279a3c9ba79fe186919db5746606477a3ba4b76318fd10cd",
56- "image_download_size": 787262724,
57- "release_date": "2021-04-21",
58- "image_download_sha256": "3df85b93b66ccd2d370c844568b37888de66c362eebae5204bf017f6f5875207",
59- "website": "https://ubuntu.com/raspberry-pi/server"
60+ "release_date": "2021-10-14",
61+ "website": "https://ubuntu.com/raspberry-pi/server",
62+ "extract_size": 4068480000,
63+ "extract_sha256": "4cf06429e0367f0a59b890819d1792b0d816bc531fcb5bd092e441d8d6a942b9",
64+ "image_download_size": 921117368,
65+ "image_download_sha256": "126f940d3b270a6c1fc5a183ac8a3d193805fead4f517296a7df9d3e7d691a03"
66 },
67 {
68 "name": "Ubuntu Server 20.04.3 LTS (RPi 2/3/4/400)",
69
70=== modified file 'refresh_os_list'
71--- refresh_os_list 2021-08-26 12:35:57 +0000
72+++ refresh_os_list 2021-10-14 14:58:33 +0000
73@@ -1,5 +1,30 @@
74 #!/usr/bin/python3
75
76+"""
77+A script for generating the JSON list required by the Raspberry Pi imaging
78+utility. Takes an existing JSON list as input and updates it by calculating new
79+image sizes and check-sums from the source files on the server. Typical usage
80+is to tweak the existing JSON (e.g. moving a URL from one point release to the
81+next), feed the script the edited JSON, directing output to a new JSON file,
82+then diff the results for sanity before renaming the new file over the old:
83+
84+$ vim raspi/os*.json
85+$ ./refresh_os_list raspi/os*.json > new.json
86+$ diff raspi/os*.json new.json
87+$ mv new.json raspi/os*.json
88+
89+The only mandatory field in each entry is "url"; the script will fill out all
90+other missing fields (albeit with place-holders in the case of "name" and
91+"description" which it can't derive). If you need to add new images to the
92+JSON, you can simply add a new entry with just the "url" field and have the
93+script fill out everything else (remembering to replace the "name" and
94+"description" placeholders afterwards!).
95+
96+If you need to override the "url" field (e.g. when dealing with the list prior
97+to release), add an "override_url" field which will be removed from the output but
98+will be used as the actual source of the image to calculate hashes and sizes.
99+"""
100+
101 import os
102 import io
103 import sys
104@@ -12,6 +37,7 @@
105 import textwrap
106 import warnings
107 import functools
108+import contextlib
109 import datetime as dt
110 from html.parser import HTMLParser
111 from urllib.parse import urlsplit, urlunsplit
112@@ -21,36 +47,23 @@
113
114
115 def main(args=None):
116- """
117- A script for generating the JSON list required by the Raspberry Pi imaging
118- utility. Takes an existing JSON list as input and updates it by calculating
119- new image sizes and check-sums from the source files on the server. Typical
120- usage is to tweak the existing JSON (e.g. moving a URL from one point
121- release to the next), feed the script the edited JSON, directing output to
122- a new JSON file, then diff the results for sanity before renaming the new
123- file over the old:
124-
125- $ vim raspi/os*.json
126- $ ./refresh_os_list raspi/os*.json > new.json
127- $ diff raspi/os*.json new.json
128- $ mv new.json raspi/os*.json
129-
130- The only mandatory field in each entry is "url"; the script will fill out
131- all other missing fields (albeit with place-holders in the case of "name"
132- and "description" which it can't derive). If you need to add new images to
133- the JSON, you can simply add a new entry with just the "url" field and have
134- the script fill out everything else (remembering to replace the "name" and
135- "description" placeholders afterwards!).
136-
137- If you need to override the "url" field (e.g. when dealing with the list
138- prior to release), add an "override_url" field which will removed from the
139- output but will be used as the actual source of the image to calculate
140- hashes and sizes.
141- """
142- if args is None:
143- args = sys.argv[1:]
144+ if sys.version_info < (3, 8):
145+ raise SystemExit('This script requires Python 3.8 or later')
146+
147+ if int(os.environ.get('TEST', '0')):
148+ # To run the test suite (via the built-in doctest module):
149+ #
150+ # $ TEST=1 ./refresh_os_list
151+ #
152+ # Optionally, if you have python3-coverage installed, and you want to
153+ # track the coverage of the test suite you can further do:
154+ #
155+ # $ TEST=1 python3-coverage run --source=./ ./refresh_os_list
156+ # $ python3-coverage report --show-missing
157+ return _test_main()
158+
159 parser = argparse.ArgumentParser(
160- description=textwrap.dedent(main.__doc__),
161+ description=textwrap.dedent(__doc__),
162 formatter_class=argparse.RawDescriptionHelpFormatter)
163 parser.add_argument(
164 'input_file', type=argparse.FileType('r', encoding='utf-8'),
165@@ -64,28 +77,57 @@
166 help="Force the utility to refresh all images even if the release "
167 "date and download size have not changed in the index")
168 args = parser.parse_args(args)
169+
170 try:
171 update_template(args.input_file, args.output_file, args.force)
172 except Exception as e:
173- # NOTE: If you want full stack traces just run me like this:
174- # DEBUG=1 ./refresh_os_list blah
175- if not int(os.environ.get('DEBUG', '0')):
176+ # If you want full stack traces just run me like this:
177+ #
178+ # $ DEBUG=1 ./refresh_os_list blah
179+ if int(os.environ.get('DEBUG', '0')):
180+ raise
181+ else:
182 print(str(e), file=sys.stderr)
183 sys.exit(1)
184- else:
185- raise
186-
187-
188-def update_template(input_file, output_file, force=False):
189+
190+
191+def update_template(
192+ input_file, output_file, force=False,
193+ icon_url='https://assets.ubuntu.com/v1/85a9de76-ubuntu-icon.svg'):
194 """
195 Reads JSON data from *input_file* (a file-like object), updates any entries
196 found to have incorrect size and/or check-sums, fills out mandatory missing
197 fields with placeholders (name, description, website), and updates the
198 release-date (if it's out of date). The output is written (again in JSON
199- format) to *output_file* (another file-like object).
200+ format) to *output_file* (another file-like object). For example::
201+
202+ >>> images = {
203+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo' * 123456),
204+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar' * 234567),
205+ ... }
206+ >>> ts = dt.datetime(2021, 10, 25)
207+ >>> with contextlib.redirect_stderr(io.StringIO()):
208+ ... with _test_server(_make_index(_make_sums(images), ts)) as url:
209+ ... input_file = io.StringIO(json.dumps({'os_list': [
210+ ... {'url': f'{url}impish-armhf+raspi.img.gz'},
211+ ... {'url': f'{url}impish-arm64+raspi.img.gz'},
212+ ... ]}))
213+ ... output_file = io.StringIO()
214+ ... update_template(input_file, output_file)
215+ >>> output = json.loads(output_file.getvalue())
216+ >>> len(output['os_list'])
217+ 2
218+ >>> sorted(output['os_list'][0].keys()) # doctest: +NORMALIZE_WHITESPACE
219+ ['description', 'extract_sha256', 'extract_size', 'icon',
220+ 'image_download_sha256', 'image_download_size', 'name',
221+ 'release_date', 'url', 'website']
222+ >>> output['os_list'][0].keys() == output['os_list'][1].keys()
223+ True
224
225 If the *force* parameter is ``True`` then all entries are updated
226- regardless of whether they are up to date or not.
227+ regardless of whether they are up to date or not. The *icon_url* parameter
228+ optionally specifies the URL of the icon to include in entries that lack
229+ one.
230
231 Progress information is printed to stderr while the routine is running.
232 """
233@@ -95,7 +137,7 @@
234 # diff'ing the output substantially easier
235 template = json.load(input_file, object_pairs_hook=OrderedDict)
236 if not isinstance(template, dict):
237- raise ValueError('expected a JSON object in {}'.format(args.input_file))
238+ raise ValueError(f'expected a JSON object in {input_file.name}')
239 if template.keys() != {'os_list'}:
240 raise ValueError('expected a single "os_list" entry')
241 if not all('url' in entry for entry in template['os_list']):
242@@ -105,7 +147,7 @@
243 url = entry.get('override_url', entry['url'])
244 source = get_entry(url)
245 if 'icon' not in entry:
246- entry['icon'] = 'https://assets.ubuntu.com/v1/85a9de76-ubuntu-icon.svg'
247+ entry['icon'] = icon_url
248 if 'name' not in entry:
249 warnings.warn(Warning('Inserted placeholder entries; check output'))
250 entry['name'] = 'PLACEHOLDER'
251@@ -126,11 +168,10 @@
252 ).date() < source.date
253 )
254 if update:
255- print('Updating {}'.format(source.name), file=sys.stderr)
256+ print(f'Updating {source.name}', file=sys.stderr)
257 entry.update(update_entry(entry, source))
258 else:
259- print('Entry for {} is up to date'.format(source.name),
260- file=sys.stderr)
261+ print(f'Entry for {source.name} is up to date', file=sys.stderr)
262 # Always update image_download_sha256 to fill it out when missing
263 entry['image_download_sha256'] = source.sha256
264 entry.pop('override_url', None)
265@@ -140,18 +181,58 @@
266 urlopen(entry['website'])
267 except HTTPError as exc:
268 warnings.warn(
269- Warning('Failed to access {entry[website]}: {exc!r}'.format(
270- entry=entry, exc=exc)))
271+ Warning(f'Failed to access {entry["website"]}: {exc!r}'))
272 json.dump(template, output_file, indent=4)
273
274
275 class HashStream:
276+ """
277+ When constructed with *stream*, a file-like object, this class proxies the
278+ usual :meth:`~io.BufferedIOBase.read` calls (which must be sequential, as
279+ the class provides no "seek" method), totting up the number of bytes read
280+ in the :attr:`size` attribute and the SHA256 checksum of all data read in
281+ the :attr:`cksum` attribute.
282+
283+ For example::
284+
285+ >>> import io
286+ >>> stream = io.BytesIO(b'foo bar baz')
287+ >>> h = HashStream(stream)
288+ >>> h.size
289+ 0
290+ >>> h.cksum
291+ 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
292+ >>> h.read(3)
293+ b'foo'
294+ >>> h.size
295+ 3
296+ >>> h.cksum
297+ '2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae'
298+ >>> h.read()
299+ b' bar baz'
300+ >>> h.size == len(stream.getvalue())
301+ True
302+ >>> h.cksum
303+ 'dbd318c1c462aee872f41109a4dfd3048871a03dedd0fe0e757ced57dad6f2d7'
304+
305+ The class may also be used as a context manager, in which case exiting the
306+ context manager closes the underlying stream::
307+
308+ >>> stream = io.BytesIO(b'foo bar baz')
309+ >>> stream.closed
310+ False
311+ >>> with HashStream(stream) as h:
312+ ... print(h.read().decode('ascii'))
313+ foo bar baz
314+ >>> stream.closed
315+ True
316+ """
317 def __init__(self, stream):
318 self.stream = stream
319 self._cksum = hashlib.sha256()
320 self._size = 0
321
322- def read(self, n):
323+ def read(self, n=-1):
324 result = self.stream.read(n)
325 self._size += len(result)
326 self._cksum.update(result)
327@@ -165,10 +246,17 @@
328
329 @property
330 def cksum(self):
331+ """
332+ The SHA256 checksum of all data read so far, returned as a lowercased
333+ hex-string.
334+ """
335 return self._cksum.hexdigest().lower()
336
337 @property
338 def size(self):
339+ """
340+ The number of bytes of data read so far.
341+ """
342 return self._size
343
344
345@@ -180,7 +268,27 @@
346 It stores the content of all ``<th>`` and ``<td>`` tags under each ``<tr>``
347 tag in the :attr:`table` attribute as a list of lists (the outer list of
348 rows, the inner lists of cells within those rows). All data is represented
349- as strings, or as ``None`` for entirely empty entries.
350+ as strings, or as ``None`` for entirely empty entries. For example::
351+
352+ >>> html = '''
353+ ... <html><body><table>
354+ ... <p>A table:
355+ ... <tr><th>#</th><th>Name</th></tr>
356+ ... <tr><td>1</td><td>foo</td></tr>
357+ ... <tr><td>2</td><td>bar</td></tr>
358+ ... <tr><td></td><td>quux</td></tr>
359+ ... </table></body></html>
360+ ... '''
361+ >>> parser = TableParser()
362+ >>> parser.feed(html)
363+ >>> parser.table
364+ [['#', 'Name'], ['1', 'foo'], ['2', 'bar'], [None, 'quux']]
365+
366+ .. note::
367+
368+ As this is a subclass of an HTML parser (as opposed to an XML parser)
369+ there is no requirement that the input be strictly valid XML; hence the
370+ lack of a closing ``</p>`` tag above is acceptable.
371 """
372 def __init__(self):
373 super().__init__(convert_charrefs=True)
374@@ -217,6 +325,20 @@
375 """
376 Given the *url* of an image, returns an :class:`IndexEntry` named tuple
377 containing the url, name, generated date, SHA-256 check-sum, and file size.
378+ For example::
379+
380+ >>> images = {
381+ ... 'impish-armhf+raspi.img.xz': b'foo' * 123456,
382+ ... 'impish-arm64+raspi.img.xz': b'bar' * 234567,
383+ ... }
384+ >>> ts = dt.datetime(2021, 10, 25)
385+ >>> with _test_server(_make_index(_make_sums(images), ts)) as url:
386+ ... entry = get_entry(f'{url}impish-armhf+raspi.img.xz')
387+ >>> entry # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
388+ IndexEntry(url='http://127.0.0.1:4444/impish-armhf+raspi.img.xz',
389+ name='impish-armhf+raspi.img.xz', date=datetime.date(2021, 10, 25),
390+ sha256='f4c1a97c6eef546ce6814a31abd371f3072bd9056377fcfee...',
391+ size=370368)
392 """
393 split = urlsplit(url)
394 path, name = split.path.rsplit('/', 1)
395@@ -224,19 +346,35 @@
396 try:
397 entry = get_index(index)[name]
398 except KeyError:
399- raise ValueError('unable to find {}; are you sure the filename is '
400- 'correct?'.format(url))
401+ raise ValueError(
402+ f'unable to find {url}; are you sure the filename is correct?')
403 if entry.size is None or entry.sha256 is None:
404- raise ValueError('unable to retrieve file-size or checksum for '
405- '{}'.format(url))
406+ raise ValueError(f'unable to retrieve file-size or checksum for {url}')
407 return entry
408
409
410-@functools.lru_cache()
411+@functools.lru_cache
412 def get_index(url):
413 """
414 Given the *url* of a cdimage directory containing images, returns a dict
415 mapping filenames to :class:`IndexEntry` named tuples.
416+ For example::
417+
418+ >>> images = {
419+ ... 'impish-armhf+raspi.img.xz': b'foo' * 123456,
420+ ... 'impish-arm64+raspi.img.xz': b'bar' * 234567,
421+ ... }
422+ >>> ts = dt.datetime(2021, 10, 25)
423+ >>> with _test_server(_make_index(_make_sums(images), ts)) as url:
424+ ... index = get_index(url)
425+ >>> sorted(index.keys())
426+ ['impish-arm64+raspi.img.xz', 'impish-armhf+raspi.img.xz']
427+ >>> entry = index['impish-arm64+raspi.img.xz']
428+ >>> entry # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
429+ IndexEntry(url='http://127.0.0.1:4444/impish-arm64+raspi.img.xz',
430+ name='impish-arm64+raspi.img.xz', date=datetime.date(2021, 10, 25),
431+ sha256='e9cd9718e97ac951c0ead5de8069d0ff5de188620b12b02...',
432+ size=703701)
433 """
434 # NOTE: This code relies on the current layout of pages on
435 # cdimage.ubuntu.com; if extra tables or columns are introduced or
436@@ -246,8 +384,8 @@
437 with urlopen(url) as page:
438 parser.feed(page.read().decode('utf-8'))
439 except HTTPError:
440- raise ValueError('unable to find {}; are you sure the path is '
441- 'correct?'.format(url))
442+ raise ValueError(
443+ f'unable to get {url}; are you sure the path is correct?')
444 entries = {}
445 for row in parser.table:
446 try:
447@@ -277,6 +415,7 @@
448 entries[name] = entries[name]._replace(sha256=cksum)
449 except KeyError:
450 pass
451+ entries.pop('SHA256SUMS', None)
452 return entries
453
454
455@@ -284,10 +423,26 @@
456 """
457 Given an *entry* (a dict, read from the source JSON) representing a single
458 image, and a *source* (an :class:`IndexEntry` namedtuple) representing the
459- current state of that source on the cdimage server, download the
460- specified image and update the "extract_size", "extract_sha256",
461+ current state of that source on the cdimage server, download the specified
462+ image and update the "extract_size", "extract_sha256",
463 "image_download_size", "image_download_sha256", and "release_date" fields
464- (as necessary), returning the new entry.
465+ (as necessary), returning the new entry. For example::
466+
467+ >>> images = {
468+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo' * 123456),
469+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar' * 234567),
470+ ... }
471+ >>> ts = dt.datetime(2021, 10, 25)
472+ >>> with _test_server(_make_index(_make_sums(images), ts)) as url:
473+ ... entry = {'url': f'{url}impish-armhf+raspi.img.gz'}
474+ ... new_entry = update_entry(entry, get_entry(entry['url']))
475+ >>> sorted(new_entry.keys()) # doctest: +NORMALIZE_WHITESPACE
476+ ['extract_sha256', 'extract_size', 'image_download_sha256',
477+ 'image_download_size', 'release_date', 'url']
478+ >>> new_entry['release_date']
479+ '2021-10-25'
480+ >>> new_entry['extract_size']
481+ 370368
482
483 Note: the routine does *not* modify the entry dict passed in.
484 """
485@@ -300,8 +455,8 @@
486 }[source.url.rsplit('.', 1)[1]]
487 with HashStream(decompresser(compressed_image)) as decompressed_image:
488 while True:
489- print('\rDownloading & verifying image:{:5.1f}%'.format(
490- compressed_image.size * 100 / source.size),
491+ print('\rDownloading & verifying image:'
492+ f'{compressed_image.size * 100 / source.size:5.1f}%',
493 end='', file=sys.stderr)
494 buf = decompressed_image.read(65536)
495 if not buf:
496@@ -326,5 +481,480 @@
497 return entry
498
499
500+def _test_main():
501+ """
502+ Run the test suite via doctest.
503+ """
504+ # All functions from here on are purely for the benefit of the test suite
505+ import doctest
506+
507+ # Undecorate get_index to prevent the cache from breaking many tests (could
508+ # use get_index.cache_clear but this hack is marginally cleaner at least
509+ # from the perspective of the tests themselves)
510+ global get_index
511+ get_index = get_index.__wrapped__
512+ failures, total = doctest.testmod()
513+ return bool(failures)
514+
515+
516+@contextlib.contextmanager
517+def _test_server(files, *, host='127.0.0.1', port=4444):
518+ """
519+ This function provides a test HTTP server for the doctest suite.
520+
521+ It expects to be called with *files*, a :class:`dict` mapping filenames
522+ to byte-strings representing file contents. All contents will be written to
523+ a temporary directory, and a trivial HTTP server will be started to serve
524+ its content on the specified *host* and *port* (defaults to port 4444 on
525+ localhost).
526+
527+ The function acts as a context manager, cleaning up the http daemon and
528+ temporary directory upon exit. The URL of the root of the server is yielded
529+ by the context manager.
530+ """
531+ import tempfile
532+ import http.server
533+ from pathlib import Path
534+ from threading import Thread
535+
536+ class SilentHandler(http.server.SimpleHTTPRequestHandler):
537+ def log_message(self, fmt, *args):
538+ # Don't spam the console
539+ pass
540+
541+ with tempfile.TemporaryDirectory() as temp:
542+ for filename, data in files.items():
543+ filepath = Path(temp) / filename
544+ filepath.write_bytes(data)
545+
546+ handler = functools.partial(SilentHandler, directory=temp)
547+ with http.server.ThreadingHTTPServer(
548+ (host, port), handler) as httpd:
549+ httpd_thread = Thread(target=httpd.serve_forever)
550+ httpd_thread.start()
551+ try:
552+ yield f'http://{host}:{port}/'
553+ finally:
554+ httpd.shutdown()
555+ httpd_thread.join(timeout=5)
556+ assert not httpd_thread.is_alive()
557+
558+
559+def _make_sums(files):
560+ """
561+ This function exists to generate SHA256SUMS files for the doctest suite.
562+
563+ Given *files*, a :class:`dict` mapping filenames to byte-strings of
564+ file contents, this function returns a new :class:`dict` which is a copy of
565+ *files* with one additional entry titled "SHA256SUMS" which contains the
566+ output of the "sha256sum" command for the given content.
567+ """
568+ files = files.copy()
569+ files['SHA256SUMS'] = '\n'.join(
570+ f'{hashlib.sha256(data).hexdigest()} {filename}'
571+ for filename, data in files.items()
572+ ).encode('ascii')
573+ return files
574+
575+
576+def _make_index(files, timestamp=None):
577+ """
578+ This function generates index.html files for the doctest suite.
579+
580+ Given *files*, a :class:`dict` mapping image filenames to byte-strings
581+ of file contents, this function generates an appropriate "index.html" file,
582+ returning a copy of the original :class:`dict` with this new entry.
583+
584+ Additionally *timestamp*, a :class:`~datetime.datetime` representing the
585+ last modification date, can be specified. It defaults to the current time
586+ if not given.
587+ """
588+ if timestamp is None:
589+ timestamp = dt.datetime.now()
590+ files = files.copy()
591+ rows = '\n'.join(
592+ f'<tr><td>Icon</td><td>{filename}</td>'
593+ f'<td>{timestamp.strftime("%Y-%m-%d %H:%M")}</td>'
594+ f'<td>{len(data) // 1024}K</td><td>Descriptive text</td></tr>'
595+ for filename, data in files.items()
596+ )
597+ files['index.html'] = f"""
598+ <html><body>
599+ <p>The following files are available:</p>
600+ <table>
601+ <tr><th></th><th>Name</th><th>LastMod</th><th>Size</th><th>Desc</th></tr>
602+ {rows}
603+ </table>
604+ </body></html>
605+ """.encode('utf-8')
606+ return files
607+
608+
609+# Extra tests to bump test suite coverage
610+
611+__test__ = {
612+ 'bad-json': """
613+ The input file must be valid JSON::
614+
615+ >>> import io
616+ >>> update_template(io.StringIO('foo'), io.StringIO())
617+ Traceback (most recent call last):
618+ File "<stdin>", line 2, in <module>
619+ update_template(io.StringIO('foo'), io.StringIO())
620+ File "./refresh_os_list", line 117, in update_template
621+ template = json.load(input_file, object_pairs_hook=OrderedDict)
622+ File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode
623+ raise JSONDecodeError("Expecting value", s, err.value) from None
624+ json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
625+
626+ The input file has to contain a JSON object at the top level::
627+
628+ >>> data = io.StringIO('[]')
629+ >>> data.name = 'foo'
630+ >>> update_template(data, io.StringIO())
631+ Traceback (most recent call last):
632+ File "<stdin>", line 5, in <module>
633+ update_template(data, io.StringIO())
634+ File "./refresh_os_list", line 119, in update_template
635+ raise ValueError(f'expected a JSON object in {input_file.name}')
636+ ValueError: expected a JSON object in foo
637+
638+ The top-level JSON object must contain a single "os_list" entry::
639+
640+ >>> data = io.StringIO('{"foo": 1}')
641+ >>> data.name = 'foo'
642+ >>> update_template(data, io.StringIO())
643+ Traceback (most recent call last):
644+ File "<stdin>", line 8, in <module>
645+ update_template(data, io.StringIO())
646+ File "./refresh_os_list", line 121, in update_template
647+ raise ValueError('expected a single "os_list" entry')
648+ ValueError: expected a single "os_list" entry
649+
650+ Every JSON object below os_list must have a "url" field::
651+
652+ >>> data = io.StringIO(
653+ ... '{"os_list": {"name": "impish-server-arm64+raspi.img.xz"}}')
654+ >>> data.name = 'foo'
655+ >>> update_template(data, io.StringIO())
656+ Traceback (most recent call last):
657+ File "<stdin>", line 11, in <module>
658+ update_template(data, io.StringIO())
659+ File "./refresh_os_list", line 123, in update_template
660+ raise ValueError('all "os_list" entries must contain a "url" entry')
661+ ValueError: all "os_list" entries must contain a "url" entry
662+ """,
663+
664+ 'bad-url': """
665+ The URL provided to get_entry must be valid::
666+
667+ >>> images = {
668+ ... 'impish-armhf+raspi.img.xz': b'foo',
669+ ... 'impish-arm64+raspi.img.xz': b'bar',
670+ ... }
671+ >>> with _test_server(_make_index(_make_sums(images))) as url:
672+ ... wrong_url = f'{url}wrong/impish-armhf+raspi.img.xz'
673+ ... get_entry(wrong_url) # doctest: +ELLIPSIS
674+ Traceback (most recent call last):
675+ File "<stdin>", line 3, in <module>
676+ get_entry(wrong_url) # doctest: +ELLIPSIS
677+ File "./refresh_os_list", line 327, in get_entry
678+ entry = get_index(index)[name]
679+ File "./refresh_os_list", line 368, in get_index
680+ raise ValueError(...)
681+ ValueError: unable to get http://...; are you sure the path is correct?
682+ >>> with _test_server(_make_index(_make_sums(images))) as url:
683+ ... image_url = f'{url}focal-arm64+raspi.img.xz'
684+ ... get_entry(image_url) # doctest: +ELLIPSIS
685+ Traceback (most recent call last):
686+ File "<stdin>", line 4, in <module>
687+ entry = get_entry(f'{url}focal-arm64+raspi.img.xz')
688+ File "./refresh_os_list", line 328, in get_entry
689+ raise ValueError(...)
690+ ValueError: unable to find http://...; are you sure the filename is correct?
691+ """,
692+
693+ 'no-checksums': """
694+ The SHA256SUMS file must exist on the server::
695+
696+ >>> images = {
697+ ... 'hirsute-armhf+raspi.img.xz': b'foo',
698+ ... 'hirsute-arm64+raspi.img.xz': b'bar',
699+ ... }
700+ >>> with _test_server(_make_index(images)) as url:
701+ ... image_url = f'{url}hirsute-arm64+raspi.img.xz'
702+ ... get_entry(image_url) # doctest: +ELLIPSIS
703+ Traceback (most recent call last):
704+ File "<stdin>", line 3, in <module>
705+ get_entry(image_url)
706+ File "./refresh_os_list", line 332, in get_entry
707+ raise ValueError(f'unable to retrieve file-size or checksum for {url}')
708+ ValueError: unable to retrieve file-size or checksum for http://...
709+ """,
710+
711+ 'ignore-star-prefixes': """
712+ Filenames in checksum files can have star prefixes (indicating binary
713+ input) which should be ignored::
714+
715+ >>> images = _make_sums({
716+ ... 'impish-armhf+raspi.img.xz': b'foo' * 123456,
717+ ... 'impish-arm64+raspi.img.xz': b'bar' * 234567,
718+ ... })
719+ >>> cksums = images['SHA256SUMS'].decode('utf-8').splitlines(True)
720+ >>> cksums = [f'{cksum} *{filename}' for line in cksums
721+ ... for cksum, filename in (line.split(None, 1),)]
722+ >>> images['SHA256SUMS'] = ''.join(cksums).encode('utf-8')
723+ >>> ts = dt.datetime(2021, 10, 25)
724+ >>> with _test_server(_make_index(images, ts)) as url:
725+ ... entry = get_entry(f'{url}impish-armhf+raspi.img.xz')
726+ >>> entry # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
727+ IndexEntry(url='http://127.0.0.1:4444/impish-armhf+raspi.img.xz',
728+ name='impish-armhf+raspi.img.xz', date=datetime.date(2021, 10, 25),
729+ sha256='f4c1a97c6eef546ce6814a31abd371f3072bd9056377fcfee...',
730+ size=370368)
731+ """,
732+
733+ 'ignore-extra-cksums': """
734+ Files may be present in the checksum file which we didn't find (or more
735+ likely ignored) in the index.html. This should not cause an error::
736+
737+ >>> images = _make_sums({
738+ ... 'impish-armhf+raspi.img.xz': b'foo' * 123456,
739+ ... 'impish-arm64+raspi.img.xz': b'bar' * 234567,
740+ ... })
741+ >>> cksums = images['SHA256SUMS'].decode('utf-8')
742+ >>> cksums += '\\n' + '0123abcd' * 8 + ' weird-hash.img.xz'
743+ >>> images['SHA256SUMS'] = cksums.encode('utf-8')
744+ >>> ts = dt.datetime(2021, 10, 25)
745+ >>> with _test_server(_make_index(images, ts)) as url:
746+ ... entry = get_entry(f'{url}impish-armhf+raspi.img.xz')
747+ >>> entry # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
748+ IndexEntry(url='http://127.0.0.1:4444/impish-armhf+raspi.img.xz',
749+ name='impish-armhf+raspi.img.xz', date=datetime.date(2021, 10, 25),
750+ sha256='f4c1a97c6eef546ce6814a31abd371f3072bd9056377fcfee...',
751+ size=370368)
752+ """,
753+
754+ 'ignore-bad-rows': """
755+ If the table contains rows with the wrong number of columns, or rows with
756+ timestamps that cannot be converted, ignore those rows::
757+
758+ >>> content = _make_sums({
759+ ... 'hirsute-armhf+raspi.img.xz': b'foo',
760+ ... 'hirsute-arm64+raspi.img.xz': b'bar',
761+ ... })
762+ >>> content['index.html'] = f'''
763+ ... <html><body><table>
764+ ... <tr>
765+ ... <th></th><th>Name</th>
766+ ... <th>LastMod</th><th>Size</th><th>Desc</th>
767+ ... </tr>
768+ ... <tr><td>Silly row with one column</td></tr>
769+ ... <tr>
770+ ... <td>Icon</td>
771+ ... <td>SHA256SUMS</td>
772+ ... <td>2021-10-25 00:00</td><td>{len(content['SHA256SUMS'])}</td>
773+ ... <td>sha256 checksums</td>
774+ ... </tr>
775+ ... <tr>
776+ ... <td>Icon</td>
777+ ... <td>hirsute-armhf+raspi.img.xz</td>
778+ ... <td>2021-10-25 00:00</td><td>3</td><td>Hirsute release for Pi</td>
779+ ... </tr>
780+ ... <tr>
781+ ... <td>Icon</td>
782+ ... <td>hirsute-arm64+raspi.img.xz</td>
783+ ... <td>2021-10-25 00:00</td><td>3</td><td>Hirsute release for Pi</td>
784+ ... </tr>
785+ ... <tr>
786+ ... <td>Icon</td>
787+ ... <td>hirsute-amd64.img.xz</td>
788+ ... <td>Bad timestamp</td><td>3</td><td>Focal release for Pi</td>
789+ ... </tr>
790+ ... </table></body></html>
791+ ... '''.encode('utf-8')
792+ >>> with _test_server(content) as url:
793+ ... image_url = f'{url}hirsute-armhf+raspi.img.xz'
794+ ... entry = get_entry(image_url)
795+ >>> entry # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
796+ IndexEntry(url='http://127.0.0.1:4444/hirsute-armhf+raspi.img.xz',
797+ name='hirsute-armhf+raspi.img.xz', date=datetime.date(2021, 10, 25),
798+ sha256='2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f...',
799+ size=3)
800+ """,
801+
802+ 'corrupted-size': """
803+ We should notice if the downloaded size doesn't match what's reported by
804+ the server. Here, we grab the new source entry for an image, then switch
805+ out the content of the image before calling update_entry so it appears the
806+ image has been corrupted on the server::
807+
808+ >>> images = _make_sums({
809+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo'),
810+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar'),
811+ ... })
812+ >>> ts = dt.datetime(2021, 10, 25)
813+ >>> with _test_server(_make_index(images, ts)) as url:
814+ ... entry = {'url': f'{url}impish-armhf+raspi.img.gz'}
815+ ... source = get_entry(entry['url'])
816+ >>> images['impish-armhf+raspi.img.gz'] = gzip.compress(b'corrupted!')
817+ >>> with _test_server(_make_index(images, ts)) as url:
818+ ... update_entry(entry, source)
819+ Traceback (most recent call last):
820+ File "<stdin>", line 6, in <module>
821+ update_entry(entry, source)
822+ File "./refresh_os_list", line 451, in update_entry
823+ raise ValueError('Corrupted download; size check failed')
824+ ValueError: Corrupted download; size check failed
825+ """,
826+
827+ 'corrupted-checksum': """
828+ If the size of a download matches, but the SHA256 checksum doesn't, we
829+ should again notice. In this case we use gzip compresslevel=0 (no
830+ compression) with data which is different but the same size as the
831+ original, and pull the same trick of grabbing a new source entry then
832+ switching out the file on the server before calling update_entry::
833+
834+ >>> images = _make_sums({
835+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo', compresslevel=0),
836+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar', compresslevel=0),
837+ ... })
838+ >>> ts = dt.datetime(2021, 10, 25)
839+ >>> with _test_server(_make_index(images, ts)) as url:
840+ ... entry = {'url': f'{url}impish-armhf+raspi.img.gz'}
841+ ... source = get_entry(entry['url'])
842+ >>> images['impish-armhf+raspi.img.gz'] = gzip.compress(b'baz', compresslevel=0)
843+ >>> with _test_server(_make_index(images, ts)) as url:
844+ ... update_entry(entry, source)
845+ Traceback (most recent call last):
846+ File "<stdin>", line 6, in <module>
847+ update_entry(entry, source)
848+ File "./refresh_os_list", line 453, in update_entry
849+ raise ValueError('Corrupted download; SHA256 check failed')
850+ ValueError: Corrupted download; SHA256 check failed
851+ """,
852+
853+ 'use-existing-fields': """
853+ If entries bring their own name, description, and/or website fields we
854+ should simply keep them as-is::
856+
857+ >>> images = {
858+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo'),
859+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar'),
860+ ... }
861+ >>> ts = dt.datetime(2021, 10, 25)
862+ >>> with contextlib.redirect_stderr(io.StringIO()):
863+ ... with _test_server(_make_index(_make_sums(images), ts)) as url:
864+ ... input_file = io.StringIO(json.dumps({'os_list': [
865+ ... {'name': name,
866+ ... 'website': 'https://ubuntu.com/',
867+ ... 'url': f'{url}{name}',
868+ ... 'description': f'The {name} image'}
869+ ... for name in images
870+ ... ]}))
871+ ... output_file = io.StringIO()
872+ ... update_template(input_file, output_file)
873+ >>> output = json.loads(output_file.getvalue())
874+ >>> len(output['os_list'])
875+ 2
876+ >>> sorted(output['os_list'][0].keys()) # doctest: +NORMALIZE_WHITESPACE
877+ ['description', 'extract_sha256', 'extract_size', 'icon',
878+ 'image_download_sha256', 'image_download_size', 'name',
879+ 'release_date', 'url', 'website']
880+ >>> output['os_list'][0]['website']
881+ 'https://ubuntu.com/'
882+ """,
883+
884+ 'keep-up-to-date': """
885+ If an entry is already up to date (same size, same release date) don't
886+ bother to update it. In fact, as long as the existing entry date is greater
887+ than or equal to the date on the server we accept it (to deal with official
888+ release dates later than the image generation date)::
889+
890+ >>> images = {
891+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo'),
892+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar'),
893+ ... }
894+ >>> ts = dt.datetime(2021, 10, 25)
895+ >>> with contextlib.redirect_stderr(io.StringIO()):
896+ ... with _test_server(_make_index(_make_sums(images), ts)) as url:
897+ ... input_file = io.StringIO(json.dumps({'os_list': [
898+ ... {'name': name,
899+ ... 'image_download_sha256':
900+ ... hashlib.sha256(images[name]).hexdigest(),
901+ ... 'image_download_size': len(images[name]),
902+ ... 'icon': 'https://ubuntu.com/ubuntu.svg',
903+ ... 'extract_sha256': '0123abcd' * 8,
904+ ... 'extract_size': 3,
905+ ... 'website': 'https://ubuntu.com/',
906+ ... 'url': f'{url}{name}',
907+ ... 'release_date': '2021-10-27',
909+ ... 'description': f'The {name} image'}
909+ ... for name in images
910+ ... ]}))
911+ ... output_file = io.StringIO()
912+ ... update_template(input_file, output_file)
913+ >>> output = json.loads(output_file.getvalue())
914+ >>> output['os_list'][0]['extract_sha256'] == '0123abcd' * 8
915+ True
916+
917+ Note that the extracted SHA256 is *not* re-checked because we don't even
918+ bother to download the image; the untouched placeholder above proves it.
919+ """,
920+
921+ 'main --help': """
922+ Running the main function with --help results in help output::
923+
924+ >>> import os
925+ >>> os.environ['TEST'] = '0' # don't run the test-suite recursively!
926+ >>> try:
927+ ... main(['--help']) # doctest: +ELLIPSIS
928+ ... except SystemExit:
929+ ... pass
930+ usage: refresh_os_list [-h] ... input_file [output_file]
931+ ...
932+ """,
933+
934+ 'main works': """
935+ Running the main function actually does what it says on the tin::
936+
937+ >>> import tempfile
938+ >>> from pathlib import Path
939+ >>> tmp = tempfile.TemporaryDirectory()
940+ >>> os.environ['TEST'] = '0' # don't run the test-suite recursively!
941+ >>> images = {
942+ ... 'impish-armhf+raspi.img.gz': gzip.compress(b'foo'),
943+ ... 'impish-arm64+raspi.img.gz': gzip.compress(b'bar'),
944+ ... }
945+ >>> ts = dt.datetime(2021, 10, 25)
946+ >>> with _test_server(_make_index(_make_sums(images), ts)) as url:
947+ ... old_data = {'os_list': [
948+ ... {'url': f'{url}{name}'}
949+ ... for name in images
950+ ... ]}
951+ ... old_filename = Path(tmp.name) / 'old.json'
952+ ... old_size = old_filename.write_text(json.dumps(old_data))
953+ ... new_filename = Path(tmp.name) / 'new.json'
954+ ... with contextlib.redirect_stderr(io.StringIO()):
955+ ... main([str(old_filename), str(new_filename)])
956+ ... new_data = json.loads(new_filename.read_text())
957+ >>> list(new_data.keys())
958+ ['os_list']
959+ >>> len(new_data['os_list'])
960+ 2
961+ >>> entry = new_data['os_list'][0]
962+ >>> sorted(entry.keys()) # doctest: +NORMALIZE_WHITESPACE
963+ ['description', 'extract_sha256', 'extract_size', 'icon',
964+ 'image_download_sha256', 'image_download_size', 'name',
965+ 'release_date', 'url', 'website']
966+ >>> entry['extract_size']
967+ 3
968+ >>> entry['extract_sha256']
969+ '2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae'
970+ >>> tmp.cleanup()
971+ """,
972+}
973+
974+
975 if __name__ == '__main__':
976 main()
