Merge lp:~matsubara/curtin/simplestreams into lp:~curtin-dev/curtin/trunk

Proposed by Diogo Matsubara
Status: Merged
Merge reported by: Diogo Matsubara
Merged at revision: not available
Proposed branch: lp:~matsubara/curtin/simplestreams
Merge into: lp:~curtin-dev/curtin/trunk
Diff against target: 346 lines (+167/-47)
6 files modified
Makefile (+6/-0)
doc/devel/README-vmtest.txt (+4/-4)
tests/vmtests/__init__.py (+108/-42)
tests/vmtests/helpers.py (+23/-1)
tools/vmtest-sync-images (+21/-0)
tools/vmtest-system-setup (+5/-0)
To merge this branch: bzr merge lp:~matsubara/curtin/simplestreams
Reviewer Review Type Date Requested Status
Ryan Harper (community) Needs Fixing
Review via email: mp+275836@code.launchpad.net

Commit message

Decouple image download from serving images to tests

Description of the change

This branch decouples ImageStore action of updating the cache of images and providing the images to tests. This is useful for debugging without updating the images as they might introduce changes that cause other failures, masking the original failure and making debugging harder.

To post a comment you must log in.
Revision history for this message
Ryan Harper (raharper) wrote :

+1 to decoupling the download from the test run.

Couple of comments though:

Don't drop the checksum verification of the image; we need to keep that otherwise we can't trust the cache. If we need to be able to supply unsigned images, I think we can introduce a separate flag to disable checksums when that is needed.

It appears that the sync pulls down all images in one go, which is nice for the case where we're running the entire test suite. But if I just want to run a single test on just one release, I have to pay for all of the sync. I'd like to retain the ability to sync and run just one image.

We now have ssquery and sync data (what versions, what releases) in two places (vmtest/__init__.py and tools/vmtest-sync-images).

So, I think we need to following features:

1. a make target to pre-sync all required images
   - it needs to extract from the test cases which images and releases to pull down
     so that when we add a new test-case that requires a new release or kver or something
     this program knows that and syncs that without having to modify the source
   - images should be checksummed and the hash written out for validation on subsequent invocations.

2. ability to sync a single image based on the test-case that's being run
   - I should be able to run nosetest3 on a single test and the images should get synced if needed

3. I should be able to disable (syncing/downloading a newer image) via a flag. As you mentioned in the commit message, sometimes we don't want to sync a new image while we're debugging; in which case we want to optionally skip the sync; since this is most useful in debugging, it shouldn't be the default behavior, instead surface it through a flag.

review: Needs Fixing
Revision history for this message
Scott Moser (smoser) wrote :

a couple comments.
using sstream-mirror will allow you to just pass on the releases that you need.
   sstream-mirror .... release=wily
will just then filter to only wily things.

second, i think that Christian is looking at also getting us some hwe-X kernels and initramfses.
so there is likely overlap there.

Revision history for this message
Diogo Matsubara (matsubara) wrote :

Some additional comments from Scott over IRC:

<smoser> cpaelzer, matsubara wrt your sstream stuff, https://code.launchpad.net/~matsubara/curtin/simplestreams/+merge/275836
<smoser> it'd be wonderful if sstream-mirror realized it had the same hash and size and just reused it magically.
<smoser> it doesnt do that, but it could and shoudl. sstream-mirror wont re-download the path if it has it, but it is kind of dumb in doing that. it just says "dont download if path exists". and paths are supposed to be immutable.
<smoser> would be nice if it recognized matches even across paths though.
<smoser> as an image hash might appear in daily/foo/bar and released/foo/bar

lp:~matsubara/curtin/simplestreams updated
284. By Diogo Matsubara

address review comments: sync images on demand, fix tools/sync-images script to sync all and keep DEFAULT_FILTERS in a single place, add flag to disable/enable syncing

285. By Diogo Matsubara

merge from trunk

286. By Diogo Matsubara

sync if there's no local image or if ImageStore says so

287. By Diogo Matsubara

find which releases to tests based on the existing test cases

288. By Diogo Matsubara

add image checksum back

Revision history for this message
Diogo Matsubara (matsubara) wrote :

> +1 to decoupling the download from the test run.
>
> Couple of comments though:
>
> Don't drop the checksum verification of the image; we need to keep that
> otherwise we can't trust the cache. If we need to be able to supply unsigned
> images, I think we can introduce a separate flag to disable checksums when
> that is needed.

I assumed sstream-mirror would always give the correct images, that's why I dropped the verification. More on this below.

>
> It appears that the sync pulls down all images in one go, which is nice for
> the case where we're running the entire test suite. But if I just want to run
> a single test on just one release, I have to pay for all of the sync. I'd
> like to retain the ability to sync and run just one image.
>
> We now have ssquery and sync data (what versions, what releases) in two places
> (vmtest/__init__.py and tools/vmtest-sync-images).
>
> So, I think we need to following features:
>
> 1. a make target to pre-sync all required images
> - it needs to extract from the test cases which images and releases to pull
> down
> so that when we add a new test-case that requires a new release or kver
> or something
> this program knows that and syncs that without having to modify the
> source

Done. I added a helpers.find_releases() function to find the releases based on the test classes we have defined.

> - images should be checksummed and the hash written out for validation on
> subsequent invocations.

I re-added the checksum verification but ideally this would be fixed by sstream-mirror. I filed bug 1513625 and left a comment to that effect so this could be dropped once sstream-mirror does the verification for us.

>
> 2. ability to sync a single image based on the test-case that's being run
> - I should be able to run nosetest3 on a single test and the images should
> get synced if needed

Done. By default the images will get synced on deman when you run nosetests3 on a single test. Image syncing is disabled for the make vmtests target (i.e. running all tests) so we can use that in CI.

>
> 3. I should be able to disable (syncing/downloading a newer image) via a flag.
> As you mentioned in the commit message, sometimes we don't want to sync a new
> image while we're debugging; in which case we want to optionally skip the
> sync; since this is most useful in debugging, it shouldn't be the default
> behavior, instead surface it through a flag.

Added that flag.

Please take another look at my changes addressing your review comments.

Thanks for the review!

Revision history for this message
Ryan Harper (raharper) wrote :

I really like the usage changes. +1

Only a few comments below.

lp:~matsubara/curtin/simplestreams updated
289. By Diogo Matsubara

address review comments

Revision history for this message
Diogo Matsubara (matsubara) wrote :

Hi Ryan,

I addressed your comments. Thanks!

lp:~matsubara/curtin/simplestreams updated
290. By Diogo Matsubara

Incorporate Scott's changes to simplestreams branch
 - add a new get_env_var_bool() function that does the right things with env vars
 - add qemu as a dependency for tools/vmtest-system-setup
 - make IMAGE_SRC_URL settable from a env var
 - add progress output if sstream-mirror supports it
 - fix filters to be a list rather than string

Revision history for this message
Diogo Matsubara (matsubara) wrote :

This changes plus Scott's changes was merged in by Scott into trunk's revno 293

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'Makefile'
2--- Makefile 2015-09-18 18:55:35 +0000
3+++ Makefile 2015-11-10 19:57:52 +0000
4@@ -1,6 +1,8 @@
5 TOP := $(abspath $(dir $(lastword $(MAKEFILE_LIST))))
6 CWD := $(shell pwd)
7 PYTHON ?= python3
8+CURTIN_VMTEST_IMAGE_SYNC ?= False
9+export CURTIN_VMTEST_IMAGE_SYNC
10
11 build:
12
13@@ -28,11 +30,15 @@
14 echo " apt-get install -qy python3-sphinx"; exit 1; } 1>&2
15 make -C doc html
16
17+# By default don't sync images when running all tests.
18 vmtest:
19 nosetests3 $(noseopts) tests/vmtests
20
21 vmtest-deps:
22 @$(CWD)/tools/vmtest-system-setup
23
24+sync-images:
25+ @$(CWD)/tools/vmtest-sync-images
26+
27
28 .PHONY: all test pyflakes pyflakes3 pep8 build
29
30=== modified file 'doc/devel/README-vmtest.txt'
31--- doc/devel/README-vmtest.txt 2015-10-26 10:07:30 +0000
32+++ doc/devel/README-vmtest.txt 2015-11-10 19:57:52 +0000
33@@ -52,8 +52,8 @@
34 sudo ./backdoor-image -v --user=<USER> --password-auth --password=<PW> IMG
35 At step 6 -> 7
36 - You might want to keep all the temporary images around.
37- To do so you can set KEEP_VMTEST_DATA to true:
38- export KEEP_VMTEST_DATA=true
39+ To do so you can set CURTIN_VMTEST_KEEP_DATA to true:
40+ export CURTIN_VMTEST_KEEP_DATA=true
41 That will keep the /tmp/tmpXXXXX directories and all files in there for
42 further execution.
43 At step 7
44@@ -95,6 +95,6 @@
45 look at the host's apt config and read 'Acquire::HTTP::Proxy'
46
47 If required for further debugging one can set the environment variable
48-"KEEP_VMTEST_DATA" like:
49-export KEEP_VMTEST_DATA=true
50+"CURTIN_VMTEST_KEEP_DATA" like:
51+export CURTIN_VMTEST_KEEP_DATA=true
52 This will keep the temporary disk files around by skipping __init__.py:__del__
53
54=== modified file 'tests/vmtests/__init__.py'
55--- tests/vmtests/__init__.py 2015-10-26 19:54:04 +0000
56+++ tests/vmtests/__init__.py 2015-11-10 19:57:52 +0000
57@@ -4,16 +4,26 @@
58 import logging
59 import json
60 import os
61+import pathlib
62 import re
63 import shutil
64 import subprocess
65 import tempfile
66 import textwrap
67+import urllib
68 import curtin.net as curtin_net
69
70 from .helpers import check_call
71
72+IMAGE_SRC_URL = os.environ.get(
73+ 'IMAGE_SRC_URL',
74+ "http://maas.ubuntu.com/images/ephemeral-v2/daily/streams/v1/index.sjson")
75+
76 IMAGE_DIR = os.environ.get("IMAGE_DIR", "/srv/images")
77+DEFAULT_SSTREAM_OPTS = [
78+ '--max=1', '--keyring=/usr/share/keyrings/ubuntu-cloudimage-keyring.gpg']
79+DEFAULT_FILTERS = ['arch=amd64', 'item_name=root-image.gz']
80+
81
82 DEVNULL = open(os.devnull, 'w')
83
84@@ -39,53 +49,90 @@
85
86
87 class ImageStore:
88- def __init__(self, base_dir):
89+ """Local mirror of MAAS images simplestreams data."""
90+
91+ # By default sync on demand.
92+ sync = True
93+
94+ def __init__(self, source_url, base_dir):
95+ """Initialize the ImageStore.
96+
97+ source_url is the simplestreams source from where the images will be
98+ downloaded.
99+ base_dir is the target dir in the filesystem to keep the mirror.
100+ """
101+ self.source_url = source_url
102 self.base_dir = base_dir
103 if not os.path.isdir(self.base_dir):
104 os.makedirs(self.base_dir)
105-
106- def get_image(self, repo, release, arch):
107- # query sstream for root image
108+ self.url = pathlib.Path(self.base_dir).as_uri()
109+
110+ def convert_image(self, root_image_path):
111+ """Use maas2roottar to unpack root-image, kernel and initrd."""
112+ logger.debug('Converting image format')
113+ subprocess.check_call(["tools/maas2roottar", root_image_path])
114+
115+ def sync_images(self, filters=DEFAULT_FILTERS):
116+ """Sync MAAS images from source_url simplestreams."""
117+ try:
118+ out = subprocess.check_output(
119+ ['sstream-mirror', '--help'], stderr=subprocess.STDOUT)
120+ if isinstance(out, bytes):
121+ out = out.decode()
122+ if '--progress' in out:
123+ progress = ['--progress']
124+ except subprocess.CalledProcessError:
125+ progress = []
126+
127+ cmd = ['sstream-mirror'] + DEFAULT_SSTREAM_OPTS + progress + [
128+ self.source_url, self.base_dir] + filters
129+ logger.debug('Syncing images {}'.format(cmd))
130+ out = subprocess.check_output(cmd)
131+ logger.debug(out)
132+
133+ def get_image(self, release, arch, filters=None):
134+ """Return local path for root image, kernel and initrd."""
135+ if filters is None:
136+ filters = [
137+ 'release=%s' % release, 'krel=%s' % release, 'arch=%s' % arch]
138+ # Query local sstream for compressed root image.
139 logger.debug(
140- 'Query simplestreams for root image: '
141- 'release={release} arch={arch}'.format(release=release, arch=arch))
142- cmd = ["tools/usquery", "--max=1", repo, "release=%s" % release,
143- "krel=%s" % release, "arch=%s" % arch,
144- "item_name=root-image.gz"]
145+ 'Query simplestreams for root image: %s', filters)
146+ cmd = ['sstream-query'] + DEFAULT_SSTREAM_OPTS + [
147+ self.url, 'item_name=root-image.gz'] + filters
148 logger.debug(" ".join(cmd))
149 out = subprocess.check_output(cmd)
150 logger.debug(out)
151+ # No output means we have no images locally, so try to sync.
152+ # If there's output, ImageStore.sync decides if images should be
153+ # synced or not.
154+ if not out or self.sync:
155+ self.sync_images(filters=filters)
156+ out = subprocess.check_output(cmd)
157 sstream_data = ast.literal_eval(bytes.decode(out))
158-
159- # Check if we already have the image
160- logger.debug('Checking cache for image')
161- checksum = ""
162- release_dir = os.path.join(self.base_dir, repo, release, arch,
163- sstream_data['version_name'])
164-
165- def p(x):
166- return os.path.join(release_dir, x)
167-
168- if os.path.isdir(release_dir) and os.path.exists(p("root-image.gz")):
169- # get checksum of cached image
170- with open(p("root-image.gz"), "rb") as fp:
171- checksum = hashlib.sha256(fp.read()).hexdigest()
172- if not os.path.exists(p("root-image.gz")) or \
173- checksum != sstream_data['sha256'] or \
174- not os.path.exists(p("root-image")) or \
175- not os.path.exists(p("root-image-kernel")) or \
176- not os.path.exists(p("root-image-initrd")):
177- # clear dir, download, and extract image
178- logger.debug('Image not found, downloading...')
179- shutil.rmtree(release_dir, ignore_errors=True)
180- os.makedirs(release_dir)
181- subprocess.check_call(
182- ["wget", "-c", sstream_data['item_url'], "-O",
183- p("root-image.gz")])
184- logger.debug('Converting image format')
185- subprocess.check_call(["tools/maas2roottar", p("root-image.gz")])
186- return (p("root-image"), p("root-image-kernel"),
187- p("root-image-initrd"))
188+ root_image_gz = urllib.parse.urlsplit(sstream_data['item_url']).path
189+ # Make sure the image checksums correctly. Ideally
190+ # sstream-[query,mirror] would make the verification for us.
191+ # See https://bugs.launchpad.net/simplestreams/+bug/1513625
192+ with open(root_image_gz, "rb") as fp:
193+ checksum = hashlib.sha256(fp.read()).hexdigest()
194+ if checksum != sstream_data['sha256']:
195+ self.sync_images(filters=filters)
196+ out = subprocess.check_output(cmd)
197+ sstream_data = ast.literal_eval(bytes.decode(out))
198+ root_image_gz = urllib.parse.urlsplit(
199+ sstream_data['item_url']).path
200+
201+ image_dir = os.path.dirname(root_image_gz)
202+ root_image_path = os.path.join(image_dir, 'root-image')
203+ kernel_path = os.path.join(image_dir, 'root-image-kernel')
204+ initrd_path = os.path.join(image_dir, 'root-image-initrd')
205+ # Check if we need to unpack things from the compressed image.
206+ if (not os.path.exists(root_image_path) or
207+ not os.path.exists(kernel_path) or
208+ not os.path.exists(initrd_path)):
209+ self.convert_image(root_image_gz)
210+ return (root_image_path, kernel_path, initrd_path)
211
212
213 class TempDir:
214@@ -134,7 +181,7 @@
215 stdout=DEVNULL, stderr=subprocess.STDOUT)
216
217 def __del__(self):
218- if (os.getenv('KEEP_VMTEST_DATA', "false") != "true"):
219+ if get_env_var_bool('CURTIN_VMTEST_KEEP_DATA', False):
220 # remove tempdir
221 shutil.rmtree(self.tmpdir)
222
223@@ -148,9 +195,13 @@
224 def setUpClass(self):
225 logger.debug('Acquiring boot image')
226 # get boot img
227- image_store = ImageStore(IMAGE_DIR)
228+ image_store = ImageStore(IMAGE_SRC_URL, IMAGE_DIR)
229+ # Disable sync if env var is set.
230+ if get_env_var_bool('CURTIN_VMTEST_IMAGE_SYNC', False):
231+ logger.debug("Images not synced.")
232+ image_store.sync = False
233 (boot_img, boot_kernel, boot_initrd) = image_store.get_image(
234- self.repo, self.release, self.arch)
235+ self.release, self.arch)
236
237 # set up tempdir
238 logger.debug('Setting up tempdir')
239@@ -431,3 +482,18 @@
240 logger.debug('Cloud config archive content (pre-json):' + part)
241 parts.append({'content': part, 'type': 'text/x-shellscript'})
242 return '#cloud-config-archive\n' + json.dumps(parts, indent=1)
243+
244+
245+def get_env_var_bool(envname, default=False):
246+ """get a boolean environment variable.
247+
248+ If environment variable is not set, use default.
249+ False values are case insensitive 'false', '0', ''."""
250+ if not isinstance(default, bool):
251+ raise ValueError("default '%s' for '%s' is not a boolean" %
252+ (default, envname))
253+ val = os.environ.get(envname)
254+ if val is None:
255+ return default
256+
257+ return val.lower() not in ("false", "0", "")
258
259=== modified file 'tests/vmtests/helpers.py'
260--- tests/vmtests/helpers.py 2015-09-23 14:59:38 +0000
261+++ tests/vmtests/helpers.py 2015-11-10 19:57:52 +0000
262@@ -15,9 +15,11 @@
263 #
264 # You should have received a copy of the GNU Affero General Public License
265 # along with Curtin. If not, see <http://www.gnu.org/licenses/>.
266-import threading
267+import os
268 import subprocess
269 import signal
270+import threading
271+from unittest import TestLoader
272
273
274 class Command(object):
275@@ -93,3 +95,23 @@
276 def check_call(cmd, signal=signal.SIGTERM, **kwargs):
277 # provide a 'check_call' like interface, but kill with a nice signal
278 return Command(cmd, signal).run(**kwargs)
279+
280+
281+def find_releases():
282+ """Return a sorted list of releases defined in test cases."""
283+ # Use the TestLoader to load all tests cases defined within
284+ # tests/vmtests/ and figure out which releases they are testing.
285+ loader = TestLoader()
286+ # dir with the vmtest modules (i.e. tests/vmtests/)
287+ tests_dir = os.path.dirname(__file__)
288+ # The root_dir for the curtin branch. (i.e. curtin/)
289+ root_dir = os.path.split(os.path.split(tests_dir)[0])[0]
290+ # Find all test modules defined in curtin/tests/vmtests/
291+ module_test_suites = loader.discover(tests_dir, top_level_dir=root_dir)
292+ releases = set()
293+ for mts in module_test_suites:
294+ for class_test_suite in mts:
295+ for test_case in class_test_suite:
296+ if getattr(test_case, 'release', ''):
297+ releases.add(getattr(test_case, 'release'))
298+ return sorted(releases)
299
300=== added file 'tools/vmtest-sync-images'
301--- tools/vmtest-sync-images 1970-01-01 00:00:00 +0000
302+++ tools/vmtest-sync-images 2015-11-10 19:57:52 +0000
303@@ -0,0 +1,21 @@
304+#!/usr/bin/python3
305+# This tool keeps a local copy of the maas images used by vmtests.
306+# It keeps only the latest copy of the available images.
307+import os
308+import sys
309+
310+# Fix path so we can import ImageStore class.
311+sys.path.append(os.path.join(os.path.dirname(__file__), '..'))
312+
313+from tests.vmtests import (
314+ ImageStore, IMAGE_DIR, IMAGE_SRC_URL, DEFAULT_FILTERS)
315+from tests.vmtests.helpers import find_releases
316+
317+
318+if __name__ == '__main__':
319+ # Instantiate the ImageStore object.
320+ store = ImageStore(IMAGE_SRC_URL, IMAGE_DIR)
321+ release_filter = 'release~{}'.format('|'.join(find_releases()))
322+ DEFAULT_FILTERS.append(release_filter)
323+ # Sync images.
324+ store.sync_images(filters=DEFAULT_FILTERS)
325
326=== modified file 'tools/vmtest-system-setup'
327--- tools/vmtest-system-setup 2015-09-21 16:19:01 +0000
328+++ tools/vmtest-system-setup 2015-11-10 19:57:52 +0000
329@@ -5,6 +5,10 @@
330 fail() { [ $# -eq 0 ] || error "$@"; exit 2; }
331
332 rel="$(lsb_release -sc)"
333+case "$(uname -m)" in
334+ i?86|x86_64) qemu="qemu-system-x86";;
335+ ppc*) qemu="qemu-system-ppc";;
336+esac
337 DEPS=(
338 cloud-image-utils
339 make
340@@ -12,6 +16,7 @@
341 python3-nose
342 python3-yaml
343 simplestreams
344+ $qemu
345 ubuntu-cloudimage-keyring
346 )
347

Subscribers

People subscribed via source and target branches