Merge lp:~raharper/curtin/trunk.vmtest-sync-only-once into lp:~curtin-dev/curtin/trunk

Proposed by Ryan Harper
Status: Merged
Merged at revision: 443
Proposed branch: lp:~raharper/curtin/trunk.vmtest-sync-only-once
Merge into: lp:~curtin-dev/curtin/trunk
Diff against target: 240 lines (+72/-27)
5 files modified
tests/vmtests/__init__.py (+28/-14)
tests/vmtests/helpers.py (+27/-6)
tests/vmtests/image_sync.py (+2/-1)
tools/jenkins-runner (+3/-1)
tools/vmtest-sync-images (+12/-5)
To merge this branch: bzr merge lp:~raharper/curtin/trunk.vmtest-sync-only-once
Reviewer Review Type Date Requested Status
Server Team CI bot continuous-integration Approve
Scott Moser (community) Approve
Review via email: mp+315387@code.launchpad.net

Commit message

vmtest: overhaul image sync

Changes to support HWE and Centos images have exposed issues in the image
sync process. Our 'make sync-images' was using a simplestreams filter
that did not pickup any of the required HWE kernels which meant that
during vmtest runs we would trigger new image syncs. Compounding this
issue was the fact that the filter used was too-wide which picked up
things like di-initrd,di-kernel files.

Additionally for an empty IMAGE_DIR repository, there were two bugs; one
it assumed that it could load the vmtest.json/vmtest-centos.json files.
Second when the repo was empty, we did not return an iterable type.

The following changes have been made to achieve the goals:
  - make sync-images should download everything we need to run a complete
    vmtest run without acquiring any new files from our image server.
  - Handle empty repositories properly
  - Do not attempt to sync with the streamds on each test, but only if
    specific needed files are missing

Description of the change

vmtest: overhaul image sync

Changes to support HWE and Centos images have exposed issues in the image
sync process. Our 'make sync-images' was using a simplestreams filter
that did not pickup any of the required HWE kernels which meant that
during vmtest runs we would trigger new image syncs. Compounding this
issue was the fact that the filter used was too-wide which picked up
things like di-initrd,di-kernel files.

Additionally for an empty IMAGE_DIR repository, there were two bugs; one
it assumed that it could load the vmtest.json/vmtest-centos.json files.
Second when the repo was empty, we did not return an iterable type.

The following changes have been made to achieve the goals:
  - make sync-images should download everything we need to run a complete
    vmtest run without acquiring any new files from our image server.
  - Handle empty repositories properly
  - Do not attempt to sync with the streamds on each test, but only if
    specific needed files are missing

To post a comment you must log in.
Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Scott Moser (smoser) wrote :

image_sync.py:query
  query returns a list of dictionaries that is not guaranteed
  to be unique per 'ftype'. We'll take the one that sorts first.

  maybe we should raise exception if its not unique ?

other comments inline.

Revision history for this message
Ryan Harper (raharper) wrote :
Download full text (8.7 KiB)

On Mon, Jan 23, 2017 at 1:18 PM, Scott Moser <email address hidden> wrote:

> image_sync.py:query
> query returns a list of dictionaries that is not guaranteed
> to be unique per 'ftype'. We'll take the one that sorts first.
>
> maybe we should raise exception if its not unique ?
>

Maybe? I think you may have authored that function; I'll defer to what
changes you think best there.

>
>
> other comments inline.
>
>
> Diff comments:
>
> > === modified file 'tests/vmtests/__init__.py'
> > --- tests/vmtests/__init__.py 2016-12-02 02:01:20 +0000
> > +++ tests/vmtests/__init__.py 2017-01-23 17:05:51 +0000
> > @@ -165,7 +165,7 @@
> > return
> >
> >
> > -def get_images(src_url, local_d, distro, release, arch, krel=None,
> sync=True,
> > +def get_images(src_url, local_d, distro, release, arch, krel=None,
> sync=False,
>
> i'd think we would not want to change the signature if we dont have to.
> can't the caller be changed just as easily?
>

Well, I think it got flipped when I merged the centos support; so I prefer
it to not sync-by default given that
the method is called get_images vs. the method above in the file which is
explictly, sync_images.

The callers of get_images use it more like (get_path_to_ftypes) which
happens to have an sync/fallback
method to go and get said requested items.

>
> Also can we change the comment / function comment here to say that it
> returns a dictionary of full paths to files.
>

ACK

>
> > ftypes=None):
> > # ensure that the image items (roottar, kernel, initrd)
> > # we need for release and arch are available in base_dir.
> > @@ -186,10 +186,12 @@
> > common_filters.append('krel=%s' % krel)
> > filters = ['ftype~(%s)' % ("|".join(ftypes.keys()))] +
> common_filters
> >
> > + # only sync if requested, allow env to override
>
> the comment doesnt make sense here, here we are not allowing the
> environment to override (the input parameter is used always)
>

I'll update; I did have the env variable there but changed to having the
callers pass in the environment value.

>
> unrelated to your change, but right around here, it'd be better if we did:
> elif isinstance(ftypes, (list, tuple)):
> - ftypes = dict().fromkeys(ftypes)
> + ftypes = dict().fromkeys(ftypes, '')
>

ACK

>
> > if sync:
> > + logger.info('Syncing images from %s with filters=%s', src_url,
> filters)
> > imagesync_mirror(output_d=local_d, source=src_url,
> > - mirror_filters=common_filters,
> > - max_items=IMAGES_TO_KEEP)
> > + mirror_filters=filters,
> > + max_items=IMAGES_TO_KEEP, verbosity=1)
> >
> > query_str = 'query = %s' % (' '.join(filters))
> > logger.debug('Query %s for image. %s', local_d, query_str)
>
> also unrelated to your change, but confusing...
> down belo here in the code we do:
> if not results and not sync:
> # try to fix this with a sync
>
> we do that even if user passed in sync=False. I think this can result in
> a case where we would sync iteratively.
>

Yes; that's true (and useful) but I think having that controll...

Read more...

441. By Ryan Harper

Address review feedback

442. By Ryan Harper

vmtest: image-sync rework sync logic

Invert some of the sync values to support two main use-cases
 - jenkins-runner uses make-sync first, so diable syncing during
   unittest execution, raise exception on missing files.
 - Developers using nose directly can benefit from on-the-fly image
   syncing

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
443. By Ryan Harper

merge from trunk

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Ryan Harper (raharper) wrote :

This passes a full-vmtest run locally and on diglett (save for the old-apt test, due to missing apt-proxy on host and is addressed in another branch, https://code.launchpad.net/~raharper/curtin/trunk.skip-apt-proxy-test-if-not-set/+merge/315688)

444. By Ryan Harper

Drop util.{is_true,is_false}; sync parameter uses '1' for true to match env

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Scott Moser (smoser) :
Revision history for this message
Ryan Harper (raharper) wrote :

On Fri, Jan 27, 2017 at 9:05 AM, Scott Moser <email address hidden> wrote:

>
>
> Diff comments:
>
> >
> > === modified file 'tests/vmtests/__init__.py'
> > --- tests/vmtests/__init__.py 2016-12-02 02:01:20 +0000
> > +++ tests/vmtests/__init__.py 2017-01-25 22:27:40 +0000
> > @@ -186,12 +188,22 @@
> > common_filters.append('krel=%s' % krel)
> > filters = ['ftype~(%s)' % ("|".join(ftypes.keys()))] +
> common_filters
> >
> > - if sync:
> > + if util.is_true(sync):
> > + # sync with the default items + common filters to ensure we get
> > + # everything in one go.
> > + sync_filters = common_filters + ITEM_NAME_FILTERS
> > + logger.info('Syncing images from %s with filters=%s', src_url,
> > + sync_filters)
> > imagesync_mirror(output_d=local_d, source=src_url,
> > - mirror_filters=common_filters,
> > - max_items=IMAGES_TO_KEEP)
> > + mirror_filters=sync_filters,
> > + max_items=IMAGES_TO_KEEP, verbosity=1)
> > + else:
> > + logger.info('Image sync disabled, sync=%s', sync)
> > + logger.info('env var CURTIN_VMTEST_IMAGE_SYNC=%s',
> > + CURTIN_VMTEST_IMAGE_SYNC)
>
> this is failure path, right ?
> shouldnt we just raise an exception ? is there some case where this *is
> not* failure?
>

No, this is informative that we're not syncing.
Next we query to see if we have the files we need, and if so we move on.
If we don't have them *and* we've disabled sync; then we rase ValueError on
missing required images

> >
> > - query_str = 'query = %s' % (' '.join(filters))
> > + query_cmd = 'python3 tests/vmtests/image_sync.py'
> > + query_str = '%s query %s %s' % (query_cmd, local_d, '
> '.join(filters))
> > logger.debug('Query %s for image. %s', local_d, query_str)
> > fail_msg = None
> >
>
>
> --
> https://code.launchpad.net/~raharper/curtin/trunk.vmtest-
> sync-only-once/+merge/315387
> You are the owner of lp:~raharper/curtin/trunk.vmtest-sync-only-once.
>

Revision history for this message
Scott Moser (smoser) wrote :

So, i'm almost all +1 on this..
My comments:
 * remove the is_true/is_false, and revert to the old
    (I'm not completely opposed to this, but in cloud-init experience i have found
     that it just makes things sloppier. Now, instead of '0', you have to support 'FALSE'
     or 'false', and ultimately that just makes any checker or something more complex (like a jasonschema or something).

Then, 2 inline minor things.

thank you ryan!

Revision history for this message
Ryan Harper (raharper) wrote :

On Fri, Jan 27, 2017 at 9:38 AM, Scott Moser <email address hidden> wrote:

> So, i'm almost all +1 on this..
> My comments:
> * remove the is_true/is_false, and revert to the old
> (I'm not completely opposed to this, but in cloud-init experience i
> have found
> that it just makes things sloppier. Now, instead of '0', you have to
> support 'FALSE'
> or 'false', and ultimately that just makes any checker or something
> more complex (like a jasonschema or something).
>

Done

>
> Then, 2 inline minor things.
>

Done

>
> thank you ryan!
>
>
> Diff comments:
>
> >
> > === modified file 'tools/jenkins-runner'
> > --- tools/jenkins-runner 2016-08-05 18:01:25 +0000
> > +++ tools/jenkins-runner 2017-01-25 22:27:40 +0000
> > @@ -58,7 +59,8 @@
> > fmt=" %(release)-7s %(arch)s/%(subarch)s %(version_name)-10s"
> > PYTHONPATH="$PWD" python3 tests/vmtests/image_sync.py query \
> > --output-format="$fmt" "$IMAGE_DIR" ftype=root-image.gz ||
> > - { echo "WARNING: error querying images in $IMAGE_DIR" 1>&2; }
> > + { ret=$?; echo "WARNING: error querying images in $IMAGE_DIR" 1>&2;
> > + exit $ret; }
>
> might as well say FAIL not warn, since you're exiting.
>
> >
> > echo "$(date -R): vmtest start: nosetests3 ${pargs[*]} ${ntargs[*]}"
> > nosetests3 "${pargs[@]}" "${ntargs[@]}"
>
>
> --
> https://code.launchpad.net/~raharper/curtin/trunk.vmtest-
> sync-only-once/+merge/315387
> You are the owner of lp:~raharper/curtin/trunk.vmtest-sync-only-once.
>

445. By Ryan Harper

vmtests: fix spelling of comment and logging noise

- Drop sync message to debug
- Don't log image sync status on every invocation
- Fix misspelling of absolute
- Change error message in jenkins-runner; it's not a warning if it exits.

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Scott Moser (smoser) wrote :

I am marking approve, but please remove
TRUE_STRINGS and FALSE_STRINGS from util.py.

review: Approve
446. By Ryan Harper

util: drop TRUE_STRINGS,FALSE_STRINGS; unneeded

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'tests/vmtests/__init__.py'
2--- tests/vmtests/__init__.py 2016-12-02 02:01:20 +0000
3+++ tests/vmtests/__init__.py 2017-01-27 19:45:16 +0000
4@@ -18,7 +18,7 @@
5
6 from .image_sync import query as imagesync_query
7 from .image_sync import mirror as imagesync_mirror
8-from .image_sync import (IMAGE_SRC_URL, IMAGE_DIR)
9+from .image_sync import (IMAGE_SRC_URL, IMAGE_DIR, ITEM_NAME_FILTERS)
10 from .helpers import check_call, TimeoutExpired
11 from unittest import TestCase, SkipTest
12
13@@ -32,7 +32,7 @@
14
15 DEVNULL = open(os.devnull, 'w')
16 KEEP_DATA = {"pass": "none", "fail": "all"}
17-CURTIN_VMTEST_IMAGE_SYNC = os.environ.get("CURTIN_VMTEST_IMAGE_SYNC", False)
18+CURTIN_VMTEST_IMAGE_SYNC = os.environ.get("CURTIN_VMTEST_IMAGE_SYNC", "1")
19 IMAGE_SYNCS = []
20 TARGET_IMAGE_FORMAT = "raw"
21
22@@ -165,11 +165,13 @@
23 return
24
25
26-def get_images(src_url, local_d, distro, release, arch, krel=None, sync=True,
27+def get_images(src_url, local_d, distro, release, arch, krel=None, sync="1",
28 ftypes=None):
29 # ensure that the image items (roottar, kernel, initrd)
30 # we need for release and arch are available in base_dir.
31- # returns updated ftypes dictionary {ftype: item_url}
32+ #
33+ # returns ftype dictionary with path to each ftype as values
34+ # {ftype: item_url}
35 if not ftypes:
36 ftypes = {
37 'vmtest.root-image': '',
38@@ -178,7 +180,7 @@
39 'boot-initrd': ''
40 }
41 elif isinstance(ftypes, (list, tuple)):
42- ftypes = dict().fromkeys(ftypes)
43+ ftypes = dict().fromkeys(ftypes, '')
44
45 common_filters = ['release=%s' % release,
46 'arch=%s' % arch, 'os=%s' % distro]
47@@ -186,12 +188,17 @@
48 common_filters.append('krel=%s' % krel)
49 filters = ['ftype~(%s)' % ("|".join(ftypes.keys()))] + common_filters
50
51- if sync:
52+ if sync == "1":
53+ # sync with the default items + common filters to ensure we get
54+ # everything in one go.
55+ sync_filters = common_filters + ITEM_NAME_FILTERS
56+ logger.debug('Syncing images from %s with filters=%s', src_url,
57+ sync_filters)
58 imagesync_mirror(output_d=local_d, source=src_url,
59- mirror_filters=common_filters,
60- max_items=IMAGES_TO_KEEP)
61-
62- query_str = 'query = %s' % (' '.join(filters))
63+ mirror_filters=sync_filters,
64+ max_items=IMAGES_TO_KEEP, verbosity=1)
65+ query_cmd = 'python3 tests/vmtests/image_sync.py'
66+ query_str = '%s query %s %s' % (query_cmd, local_d, ' '.join(filters))
67 logger.debug('Query %s for image. %s', local_d, query_str)
68 fail_msg = None
69
70@@ -205,14 +212,15 @@
71 results = None
72 fail_msg = str(e)
73
74- if not results and not sync:
75+ if not results and sync == "1":
76 # try to fix this with a sync
77 logger.info(fail_msg + " Attempting to fix with an image sync. (%s)",
78 query_str)
79 return get_images(src_url, local_d, distro, release, arch,
80- krel=krel, sync=True, ftypes=ftypes)
81+ krel=krel, sync="1", ftypes=ftypes)
82 elif not results:
83- raise ValueError("Nothing found in query: %s" % query_str)
84+ raise ValueError("Required images not found and "
85+ "syncing disabled:\n%s" % query_str)
86
87 missing = []
88 found = sorted(f.get('ftype') for f in results)
89@@ -344,19 +352,25 @@
90
91 @classmethod
92 def get_test_files(cls):
93+ # get local absolute filesystem paths for each of the needed file types
94 img_verstr, ftypes = get_images(
95 IMAGE_SRC_URL, IMAGE_DIR, cls.distro, cls.release, cls.arch,
96 krel=cls.krel if cls.krel else cls.release,
97+ sync=CURTIN_VMTEST_IMAGE_SYNC,
98 ftypes=('boot-initrd', 'boot-kernel', 'vmtest.root-image'))
99 logger.debug("Install Image %s\n, ftypes: %s\n", img_verstr, ftypes)
100 logger.info("Install Image: %s", img_verstr)
101 if not cls.target_krel and cls.krel:
102 cls.target_krel = cls.krel
103+
104+ # get local absolute filesystem paths for the OS tarball to be
105+ # installed
106 img_verstr, found = get_images(
107 IMAGE_SRC_URL, IMAGE_DIR,
108 cls.target_distro if cls.target_distro else cls.distro,
109 cls.target_release if cls.target_release else cls.release,
110- cls.arch, krel=cls.target_krel, ftypes=('vmtest.root-tgz',))
111+ cls.arch, krel=cls.target_krel, sync=CURTIN_VMTEST_IMAGE_SYNC,
112+ ftypes=('vmtest.root-tgz',))
113 logger.debug("Target Tarball %s\n, ftypes: %s\n", img_verstr, found)
114 logger.info("Target Tarball: %s", img_verstr)
115 ftypes.update(found)
116
117=== modified file 'tests/vmtests/helpers.py'
118--- tests/vmtests/helpers.py 2016-11-14 22:55:12 +0000
119+++ tests/vmtests/helpers.py 2017-01-27 19:45:16 +0000
120@@ -103,6 +103,14 @@
121 def find_releases_by_distro():
122 """
123 Returns a dictionary of distros and the distro releases that will be tested
124+
125+ distros:
126+ ubuntu:
127+ releases: []
128+ krels: []
129+ centos:
130+ releases: []
131+ krels: []
132 """
133 # Use the TestLoder to load all test cases defined within tests/vmtests/
134 # and figure out what distros and releases they are testing. Any tests
135@@ -115,20 +123,33 @@
136 # Find all test modules defined in curtin/tests/vmtests/
137 module_test_suites = loader.discover(tests_dir, top_level_dir=root_dir)
138 # find all distros and releases tested for each distro
139- distros = {}
140+ releases = []
141+ krels = []
142+ rel_by_dist = {}
143 for mts in module_test_suites:
144 for class_test_suite in mts:
145 for test_case in class_test_suite:
146 # skip disabled tests
147 if not getattr(test_case, '__test__', False):
148 continue
149- for (dist, rel) in (
150+ for (dist, rel, krel) in (
151 (getattr(test_case, a, None) for a in attrs)
152- for attrs in (('distro', 'release'),
153- ('target_distro', 'target_release'))):
154+ for attrs in (('distro', 'release', 'krel'),
155+ ('target_distro', 'target_release',
156+ 'krel'))):
157+
158 if dist and rel:
159- distros[dist] = distros.get(dist, set()).union((rel,))
160- return {k: sorted(v) for (k, v) in distros.items()}
161+ distro = rel_by_dist.get(dist, {'releases': [],
162+ 'krels': []})
163+ releases = distro.get('releases')
164+ krels = distro.get('krels')
165+ if rel not in releases:
166+ releases.append(rel)
167+ if krel and krel not in krels:
168+ krels.append(krel)
169+ rel_by_dist.update({dist: distro})
170+
171+ return rel_by_dist
172
173
174 def _parse_ip_a(ip_a):
175
176=== modified file 'tests/vmtests/image_sync.py'
177--- tests/vmtests/image_sync.py 2016-11-14 22:55:12 +0000
178+++ tests/vmtests/image_sync.py 2017-01-27 19:45:16 +0000
179@@ -404,7 +404,8 @@
180 return next((q for q in (
181 query_ptree(sutil.load_content(util.load_file(fpath(path))),
182 max_num=max_items, ifilters=ifilters, path2url=fpath)
183- for path in VMTEST_CONTENT_ID_PATH_MAP.values()) if q), None)
184+ for path in VMTEST_CONTENT_ID_PATH_MAP.values() if os.path.exists(
185+ fpath(path))) if q), [])
186
187
188 def main_query(args):
189
190=== modified file 'tools/jenkins-runner'
191--- tools/jenkins-runner 2016-08-05 18:01:25 +0000
192+++ tools/jenkins-runner 2017-01-27 19:45:16 +0000
193@@ -3,6 +3,7 @@
194 topdir="${CURTIN_VMTEST_TOPDIR:-${WORKSPACE:-$PWD}/output}"
195 pkeep=${CURTIN_VMTEST_KEEP_DATA_PASS:-logs,collect}
196 fkeep=${CURTIN_VMTEST_KEEP_DATA_FAIL:-logs,collect}
197+export CURTIN_VMTEST_IMAGE_SYNC=${CURTIN_VMTEST_IMAGE_SYNC:-0}
198 export CURTIN_VMTEST_KEEP_DATA_PASS=$pkeep
199 export CURTIN_VMTEST_KEEP_DATA_FAIL=$fkeep
200 export CURTIN_VMTEST_TOPDIR="$topdir"
201@@ -58,7 +59,8 @@
202 fmt=" %(release)-7s %(arch)s/%(subarch)s %(version_name)-10s"
203 PYTHONPATH="$PWD" python3 tests/vmtests/image_sync.py query \
204 --output-format="$fmt" "$IMAGE_DIR" ftype=root-image.gz ||
205- { echo "WARNING: error querying images in $IMAGE_DIR" 1>&2; }
206+ { ret=$?; echo "FATAL: error querying images in $IMAGE_DIR" 1>&2;
207+ exit $ret; }
208
209 echo "$(date -R): vmtest start: nosetests3 ${pargs[*]} ${ntargs[*]}"
210 nosetests3 "${pargs[@]}" "${ntargs[@]}"
211
212=== modified file 'tools/vmtest-sync-images'
213--- tools/vmtest-sync-images 2016-09-21 04:26:33 +0000
214+++ tools/vmtest-sync-images 2017-01-27 19:45:16 +0000
215@@ -42,13 +42,20 @@
216 arch_filters = ['arch={}'.format(DEFAULT_ARCH)]
217 filter_sets = []
218 if len(arg_releases):
219- filter_sets.append([_fmt_list_filter('release', arg_releases)])
220+ filter_sets.append([_fmt_list_filter('release', arg_releases),
221+ _fmt_list_filter('krel', arg_releases)])
222 else:
223- filter_sets.extend(
224- (['os={}'.format(distro), _fmt_list_filter('release', rels)]
225- for (distro, rels) in find_releases_by_distro().items()))
226+ for distname, distro in find_releases_by_distro().items():
227+ f = ['os={}'.format(distname),
228+ _fmt_list_filter('release', distro.get('releases'))]
229+ # ensure we fetch release=x krel=x items
230+ krels = distro.get('krels')
231+ if krels:
232+ krels = set(krels).union(set(distro.get('releases')))
233+ f.append(_fmt_list_filter('krel', krels))
234+ filter_sets.extend([f])
235
236 # Sync images.
237 for filter_set in filter_sets:
238- sync_images(IMAGE_SRC_URL, IMAGE_DIR, verbosity=1,
239+ sync_images(IMAGE_SRC_URL, IMAGE_DIR, verbosity=2,
240 filters=filter_set + ITEM_NAME_FILTERS + arch_filters)

Subscribers

People subscribed via source and target branches