Merge lp:~allenap/maas/nose-select-bucket-plugin into lp:~maas-committers/maas/trunk

Proposed by Gavin Panella
Status: Merged
Approved by: Gavin Panella
Approved revision: no longer in the source branch.
Merged at revision: 5645
Proposed branch: lp:~allenap/maas/nose-select-bucket-plugin
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 380 lines (+247/-26)
4 files modified
buildout.cfg (+1/-3)
src/maasserver/djangosettings/development.py (+2/-5)
src/maastesting/noseplug.py (+161/-16)
src/maastesting/tests/test_noseplug.py (+83/-2)
To merge this branch: bzr merge lp:~allenap/maas/nose-select-bucket-plugin
Reviewer Review Type Date Requested Status
Mike Pontillo (community) Approve
Review via email: mp+314602@code.launchpad.net

Commit message

New --select-bucket option for the test scripts.

Tests can be split into a configurable number of buckets and then the tests for one of those buckets can be then run. This is a step towards being able to parallelise test runs. The mechanism for choosing buckets is stable so this will work locally or in a distributed environment.

To post a comment you must log in.
Revision history for this message
Mike Pontillo (mpontillo) wrote :

Looks good to me; mostly-minor comments below.

I won't block you on it, but I would seriously consider using a standards-based hash algorithm for better pseudo-random distribution. See comments below.

review: Approve
Revision history for this message
Mike Pontillo (mpontillo) wrote :

Argh. Launchpad ate my most important comment, on line ~140 where you do the hash. (Mental note: press little save tickbox-button next time.) Basically I think you should do something like:

from hashlib import sha1

def _getHashValue(test_id):
    return int(sha1(test_id.encode('utf-8')).hexdigest(), 16)

You could then take the modulus of `_getHashValue(test.id())` to get your bucket.

Though my inner crypto nerd tells me otherwise, what you have is probably fine. But using a cryptographic-quality hash should lead to more evenly distributed test cases.

Revision history for this message
Gavin Panella (allenap) wrote :

Thanks for all your useful suggestions!

Revision history for this message
Gavin Panella (allenap) wrote :

> Argh. Launchpad ate my most important comment, on line ~140 where you
> do the hash. (Mental note: press little save tickbox-button next
> time.) Basically I think you should do something like:
>
> from hashlib import sha1
>
> def _getHashValue(test_id):
> return int(sha1(test_id.encode('utf-8')).hexdigest(), 16)
>
> You could then take the modulus of `_getHashValue(test.id())` to get
> your bucket.
>
> Though my inner crypto nerd tells me otherwise, what you have is
> probably fine. But using a cryptographic-quality hash should lead to
> more evenly distributed test cases.

This interested me, so I checked it out.

Using http://paste.ubuntu.com/23832921/, I printed out the bucket sizes
when between 2 and 10 buckets are desired, then calculated the mean
absolute deviation for each, and a mean of those figures for each
hashing method.

The results at http://paste.ubuntu.com/23832909/ indicate that:

- For the region, SHA1 is slightly fairer than the sum-of-code-points
  method, but that SHA256 is less fair.

- For the rack, sum-of-code-points is the most fair, and that SHA256 is
  fairer than SHA1.

That's certainly not what I would have expected.

For now I'm going to stick with sum-of-code-points because it's simple
and seems to be fair enough.

Revision history for this message
MAAS Lander (maas-lander) wrote :
Download full text (1.7 MiB)

The attempt to merge lp:~allenap/maas/nose-select-bucket-plugin into lp:maas failed. Below is the output from the failed tests.

Get:1 http://security.ubuntu.com/ubuntu xenial-security InRelease [102 kB]
Hit:2 http://prodstack-zone-2.clouds.archive.ubuntu.com/ubuntu xenial InRelease
Get:3 http://prodstack-zone-2.clouds.archive.ubuntu.com/ubuntu xenial-updates InRelease [102 kB]
Hit:4 http://prodstack-zone-2.clouds.archive.ubuntu.com/ubuntu xenial-backports InRelease
Fetched 204 kB in 0s (481 kB/s)
Reading package lists...
sudo DEBIAN_FRONTEND=noninteractive apt-get -y \
    --no-install-recommends install apache2 archdetect-deb authbind avahi-utils bash bind9 bind9utils build-essential bzr bzr-builddeb chromium-browser chromium-chromedriver curl daemontools debhelper dh-apport dh-systemd distro-info dnsutils firefox freeipmi-tools git gjs ipython isc-dhcp-common isc-dhcp-server libjs-angularjs libjs-jquery libjs-jquery-hotkeys libjs-yui3-full libjs-yui3-min libnss-wrapper libpq-dev make nodejs-legacy npm postgresql pxelinux python3-all python3-apt python3-attr python3-bson python3-convoy python3-crochet python3-cssselect python3-curtin python3-dev python3-distro-info python3-django python3-django-nose python3-django-piston3 python3-dnspython python3-docutils python3-formencode python3-hivex python3-httplib2 python3-jinja2 python3-jsonschema python3-lxml python3-netaddr python3-netifaces python3-novaclient python3-oauth python3-oauthlib python3-openssl python3-paramiko python3-petname python3-pexpect python3-psycopg2 python3-pyinotify python3-pyparsing python3-pyvmomi python3-requests python3-seamicroclient python3-setuptools python3-simplestreams python3-sphinx python3-tempita python3-twisted python3-txtftp python3-tz python3-yaml python3-zope.interface python-bson python-crochet python-django python-django-piston python-djorm-ext-pgarray python-formencode python-lxml python-netaddr python-netifaces python-pocket-lint python-psycopg2 python-simplejson python-tempita python-twisted python-yaml socat syslinux-common tgt ubuntu-cloudimage-keyring wget xvfb
Reading package lists...
Building dependency tree...
Reading state information...
authbind is already the newest version (2.1.1+nmu1).
avahi-utils is already the newest version (0.6.32~rc+dfsg-1ubuntu2).
build-essential is already the newest version (12.1ubuntu2).
debhelper is already the newest version (9.20160115ubuntu3).
distro-info is already the newest version (0.14build1).
git is already the newest version (1:2.7.4-0ubuntu1).
libjs-angularjs is already the newest version (1.2.28-1ubuntu2).
libjs-jquery is already the newest version (1.11.3+dfsg-4).
libjs-yui3-full is already the newest version (3.5.1-1ubuntu3).
libjs-yui3-min is already the newest version (3.5.1-1ubuntu3).
make is already the newest version (4.1-6).
postgresql is already the newest version (9.5+173).
pxelinux is already the newest version (3:6.03+dfsg-11ubuntu1).
python-formencode is already the newest version (1.3.0-0ubuntu5).
python-lxml is already the newest version (3.5.0-1build1).
python-netaddr is already the newest version (0.7.18-1).
python-netifaces is already the newest version (0.10.4-0.1build2).
python-psycop...

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'buildout.cfg'
2--- buildout.cfg 2017-01-11 10:29:20 +0000
3+++ buildout.cfg 2017-01-20 14:09:17 +0000
4@@ -73,10 +73,8 @@
5 # When running headless increase the verbosity so we can see the test
6 # being run from a log file. An `options` list must be defined ahead
7 # of the use of this snippet.
8- # *** NOSE PROGRESSIVE IS BROKEN IN PYTHON 3.5 ***
9- # ["--verbosity=0", "--with-progressive"]
10 options += (
11- ["--verbosity=1"]
12+ ["--verbosity=0", "--with-progressive"]
13 if sys.stdout.isatty() else
14 ["--verbosity=2"]
15 )
16
17=== modified file 'src/maasserver/djangosettings/development.py'
18--- src/maasserver/djangosettings/development.py 2016-12-07 12:46:14 +0000
19+++ src/maasserver/djangosettings/development.py 2017-01-20 14:09:17 +0000
20@@ -65,11 +65,6 @@
21 # Example: "/home/media/media.lawrence.com/media/"
22 MEDIA_ROOT = abspath("media/development")
23
24-INSTALLED_APPS += (
25- 'maastesting',
26- 'django_nose',
27-)
28-
29 INTERNAL_IPS = ('127.0.0.1', '::1')
30
31 # Make all nodes' metadata visible. This is not safe; do not enable it
32@@ -111,6 +106,8 @@
33 "maastesting.noseplug.Resources",
34 "maastesting.noseplug.Scenarios",
35 "maastesting.noseplug.Select",
36+ "maastesting.noseplug.SelectBucket",
37+ "maastesting.noseplug.Subunit",
38 ]
39
40 # Fix crooked settings.
41
42=== modified file 'src/maastesting/noseplug.py'
43--- src/maastesting/noseplug.py 2016-12-14 09:12:04 +0000
44+++ src/maastesting/noseplug.py 2017-01-20 14:09:17 +0000
45@@ -1,4 +1,4 @@
46-# Copyright 2015-2016 Canonical Ltd. This software is licensed under the
47+# Copyright 2015-2017 Canonical Ltd. This software is licensed under the
48 # GNU Affero General Public License version 3 (see the file LICENSE).
49
50 """Nose plugins for MAAS."""
51@@ -14,6 +14,8 @@
52 import inspect
53 import io
54 import logging
55+import optparse
56+import sys
57 import unittest
58
59 from nose.case import Test
60@@ -24,6 +26,21 @@
61 from twisted.python.filepath import FilePath
62
63
64+def flattenTests(tests):
65+ """Recursively eliminate nested test suites.
66+
67+ A `TestSuite` can contain zero or more tests or other suites, or a mix of
68+ both. This function undoes all of this nesting. Be sure you know what you
69+ intend when doing this, because any behaviour that is endowed by those
70+ nested test suites will be lost.
71+ """
72+ for test in tests:
73+ if isinstance(test, unittest.TestSuite):
74+ yield from flattenTests(test)
75+ else:
76+ yield test
77+
78+
79 class Crochet(Plugin):
80 """Start the Twisted reactor via Crochet."""
81
82@@ -97,18 +114,10 @@
83
84 :return: An instance of :class:`OptimisingTestSuite`.
85 """
86- tests = self._flattenTests(test)
87+ tests = flattenTests(test)
88 tests = map(self._hoistResources, tests)
89 return OptimisingTestSuite(tests)
90
91- def _flattenTests(self, tests):
92- """Recursively eliminate test suites."""
93- for test in tests:
94- if isinstance(test, unittest.TestSuite):
95- yield from self._flattenTests(test)
96- else:
97- yield test
98-
99 def _hoistResources(self, test):
100 """Hoist resources from the real test to Nose's test wrapper."""
101 if isinstance(test, Test):
102@@ -262,6 +271,143 @@
103 return inspect.getdoc(self)
104
105
106+class SelectBucket(Plugin):
107+ """Select tests from buckets derived from their names.
108+
109+ Each test's ID is hashed into a number. This number, modulo the given
110+ number of "buckets", defines the "bucket" for the test. This bucket can
111+ then be selected by an option. This gives a caller a rough but stable way
112+ to split up a test suite into parts, for running in parallel perhaps.
113+
114+ Note that, when using this plug-in, the nose-progressive plug-in will be
115+ inoperative. Both this and nose-progressive attempt to customise the test
116+ runner, but this wins.
117+ """
118+
119+ name = "select-bucket"
120+ option_selected_bucket = "%s_selected_bucket" % name
121+ log = logging.getLogger('nose.plugins.%s' % name)
122+ score = 10001 # Run before nose-progressive.
123+
124+ def options(self, parser, env):
125+ """Add options to Nose's parser.
126+
127+ :attention: This is part of the Nose plugin contract.
128+ """
129+ # This plugin is not compatible with django-nose. Nose passes in an
130+ # optparse-based argument parser, which django-nose later strips,
131+ # putting the options into an argparse-based parser. This breaks
132+ # because argparse uses `type` and `action` instead of `callback` but
133+ # django-nose does not provide a compatibility shim for that. We could
134+ # probably work around that here, but django-nose is deprecated in
135+ # MAAS so it's not worth the effort. Hence, when django_nose has been
136+ # loaded, this plugin is simply disabled.
137+ if "django_nose" in sys.modules:
138+ return
139+
140+ super(SelectBucket, self).options(parser, env)
141+ parser.add_option(
142+ "--%s" % self.name, dest=self.option_selected_bucket,
143+ action="callback", callback=self._ingestSelectedBucket,
144+ type="str", metavar="BUCKET/BUCKETS", default=None, help=(
145+ "Select the number of buckets in which to split tests, and "
146+ "which of these buckets to then run, e.g. 8/13 will split "
147+ "tests into 13 buckets and will run those in the 8th bucket "
148+ "(well... the 7th; the bucket number is indexed from 1)."
149+ ),
150+ )
151+
152+ def configure(self, options, conf):
153+ """Configure, based on the parsed options.
154+
155+ :attention: This is part of the Nose plugin contract.
156+ """
157+ super(SelectBucket, self).configure(options, conf)
158+ if self.enabled:
159+ bucket_buckets = getattr(options, self.option_selected_bucket)
160+ if bucket_buckets is None:
161+ self._selectTest = None
162+ else:
163+ bucket, buckets = bucket_buckets
164+ offset = bucket - 1 # Zero-based bucket number.
165+ self._selectTest = lambda test: (
166+ sum(map(ord, test.id())) % buckets == offset)
167+
168+ def _ingestSelectedBucket(self, option, option_string, value, parser):
169+ """Callback for the `--select-bucket` option.
170+
171+ The recognises values that match \d+/\d+, i.e. two integers. The
172+ latter is the total number of buckets, the first is which of those
173+ buckets to select (starting at 1).
174+ """
175+ try:
176+ value = self._parseSelectedBucket(value)
177+ except ValueError as error:
178+ raise optparse.OptionValueError(
179+ "%s: %s" % (option_string, error))
180+ else:
181+ setattr(parser.values, option.dest, value)
182+
183+ def _parseSelectedBucket(self, option):
184+ """Helper for `_ingestSelectedBucket`."""
185+ if option is None:
186+ return None
187+ elif "/" in option:
188+ bucket, buckets = option.split("/", 1)
189+ try:
190+ bucket = int(bucket)
191+ buckets = int(buckets)
192+ except ValueError:
193+ raise ValueError("not bucket/buckets")
194+ else:
195+ if buckets <= 0:
196+ raise ValueError("buckets must be >= 0")
197+ elif bucket <= 0:
198+ raise ValueError("bucket must be >= 0")
199+ elif bucket > buckets:
200+ raise ValueError("bucket must be <= buckets")
201+ else:
202+ return bucket, buckets
203+ else:
204+ raise ValueError("not bucket/buckets")
205+
206+ def prepareTestRunner(self, runner):
207+ """Convert `runner` to a selective variant.
208+
209+ :attention: This is part of the Nose plugin contract.
210+ """
211+ if self._selectTest is not None:
212+ return SelectiveTestRunner(runner, self._selectTest)
213+
214+ def help(self):
215+ """Used in the --help text.
216+
217+ :attention: This is part of the Nose plugin contract.
218+ """
219+ return inspect.getdoc(self)
220+
221+
222+class SelectiveTestRunner:
223+ """Wrap a test runner in order to filter the test suite to run."""
224+
225+ def __init__(self, runner, select):
226+ super(SelectiveTestRunner, self).__init__()
227+ self._runner = runner
228+ self._select = select
229+
230+ def run(self, test):
231+ if isinstance(test, unittest.TestSuite):
232+ tests = flattenTests(test)
233+ tests = filter(self._select, tests)
234+ test = type(test)(tests)
235+ else:
236+ raise TypeError(
237+ "Expected test suite, got %s: %r" % (
238+ type(test).__class__.__name__, test))
239+
240+ return self._runner.run(test)
241+
242+
243 class Subunit(Plugin):
244 """Emit test results as a subunit stream."""
245
246@@ -311,10 +457,9 @@
247 def main():
248 """Invoke Nose's `TestProgram` with extra plugins.
249
250- Specifically the `Crochet`, `Resources`, `Scenarios`, and `Select`
251- plugins. At the command-line it's still necessary to enable these with the
252- flags ``--with-crochet``, ``--with-resources``, ``--with-scenarios``,
253- and/or ``--with-select``.
254+ At the command-line it's still necessary to enable these with the flags
255+ ``--with-crochet``, ``--with-resources``, ``--with-scenarios``, and so on.
256 """
257- plugins = Crochet(), Resources(), Scenarios(), Select(), Subunit()
258- return TestProgram(addplugins=plugins)
259+ return TestProgram(addplugins=(
260+ Crochet(), Resources(), Scenarios(), Select(), SelectBucket(),
261+ Subunit()))
262
263=== modified file 'src/maastesting/tests/test_noseplug.py'
264--- src/maastesting/tests/test_noseplug.py 2016-12-14 09:12:04 +0000
265+++ src/maastesting/tests/test_noseplug.py 2017-01-20 14:09:17 +0000
266@@ -25,6 +25,7 @@
267 from maastesting import noseplug
268 from maastesting.factory import factory
269 from maastesting.matchers import (
270+ IsCallable,
271 MockCalledOnceWith,
272 MockNotCalled,
273 )
274@@ -33,6 +34,7 @@
275 Resources,
276 Scenarios,
277 Select,
278+ SelectBucket,
279 Subunit,
280 )
281 from maastesting.testcase import MAASTestCase
282@@ -362,6 +364,85 @@
283 join(directory, factory.make_name("other-child"))))
284
285
286+class TestSelectBucket(MAASTestCase):
287+
288+ def test__options_adds_options(self):
289+ select = SelectBucket()
290+ parser = OptionParser()
291+ select.options(parser=parser, env={})
292+ self.assertThat(
293+ parser.option_list[-2:],
294+ MatchesListwise([
295+ # The --with-select-bucket option.
296+ MatchesStructure.byEquality(
297+ action="store_true", default=None,
298+ dest="enable_plugin_select_bucket",
299+ ),
300+ # The --select-bucket option.
301+ MatchesStructure.byEquality(
302+ action="callback", default=None,
303+ dest="select-bucket_selected_bucket",
304+ metavar="BUCKET/BUCKETS", type="string",
305+ _short_opts=[], _long_opts=["--select-bucket"],
306+ )
307+ ]))
308+
309+ def test__configure_parses_selected_bucket(self):
310+ select = SelectBucket()
311+ parser = OptionParser()
312+ select.add_options(parser=parser, env={})
313+ options, rest = parser.parse_args(
314+ ["--with-select-bucket", "--select-bucket", "8/13"])
315+ select.configure(options, sentinel.conf)
316+ self.assertThat(select, MatchesStructure(_selectTest=IsCallable()))
317+
318+ @staticmethod
319+ def _make_test_with_id(test_id):
320+ test = unittest.TestCase()
321+ test.id = lambda: test_id
322+ return test
323+
324+ def test__prepareTestRunner_wraps_given_runner_and_filters_tests(self):
325+ select = SelectBucket()
326+ parser = OptionParser()
327+ select.add_options(parser=parser, env={})
328+ options, rest = parser.parse_args(
329+ ["--with-select-bucket", "--select-bucket", "8/13"])
330+ select.configure(options, sentinel.conf)
331+
332+ # We start at 65 because chr(65) is "A" and so makes a nice readable
333+ # ID for the test. We end at 77 because chr(77) is "M", a readable ID
334+ # once again, but more importantly it means we'll have 13 tests, which
335+ # is the modulus we started with.
336+ tests = map(self._make_test_with_id, map(chr, range(65, 78)))
337+ test = unittest.TestSuite(tests)
338+ self.assertThat(test.countTestCases(), Equals(13))
339+
340+ class MockTestRunner:
341+ def run(self, test):
342+ self.test = test
343+
344+ runner = runner_orig = MockTestRunner()
345+ runner = select.prepareTestRunner(runner)
346+ self.assertThat(runner, IsInstance(noseplug.SelectiveTestRunner))
347+
348+ runner.run(test)
349+
350+ self.assertThat(runner_orig.test, IsInstance(type(test)))
351+ self.assertThat(runner_orig.test.countTestCases(), Equals(1))
352+ # Note how the test with ID of "H" is the only one selected.
353+ self.assertThat({t.id() for t in runner_orig.test}, Equals({"H"}))
354+ self.assertThat(ord("H") % 13, Equals(7))
355+
356+ def test__prepareTestRunner_does_nothing_when_no_bucket_selected(self):
357+ select = SelectBucket()
358+ parser = OptionParser()
359+ select.add_options(parser=parser, env={})
360+ options, rest = parser.parse_args(["--with-select-bucket"])
361+ select.configure(options, sentinel.conf)
362+ self.assertThat(select.prepareTestRunner(sentinel.runner), Is(None))
363+
364+
365 class TestSubunit(MAASTestCase):
366
367 def test__options_adds_options(self):
368@@ -411,10 +492,10 @@
369 noseplug.main()
370 self.assertThat(
371 noseplug.TestProgram,
372- MockCalledOnceWith(addplugins=(ANY, ANY, ANY, ANY, ANY)))
373+ MockCalledOnceWith(addplugins=(ANY, ANY, ANY, ANY, ANY, ANY)))
374 plugins = noseplug.TestProgram.call_args[1]["addplugins"]
375 self.assertThat(plugins, MatchesSetwise(
376 IsInstance(Crochet), IsInstance(Resources),
377 IsInstance(Scenarios), IsInstance(Select),
378- IsInstance(Subunit),
379+ IsInstance(SelectBucket), IsInstance(Subunit),
380 ))