Merge lp:~gz/bzr/cleanup_testcases_by_collection_613247 into lp:bzr

Proposed by Martin Packman
Status: Superseded
Proposed branch: lp:~gz/bzr/cleanup_testcases_by_collection_613247
Merge into: lp:bzr
Diff against target: 427 lines (+190/-88) (has conflicts)
4 files modified
bzrlib/errors.py (+1/-0)
bzrlib/tests/TestUtil.py (+25/-1)
bzrlib/tests/__init__.py (+56/-87)
bzrlib/tests/test_selftest.py (+108/-0)
Text conflict in bzrlib/tests/__init__.py
To merge this branch: bzr merge lp:~gz/bzr/cleanup_testcases_by_collection_613247
Reviewer             Status
Andrew Bennetts      Needs Fixing
Vincent Ladeuil      Needs Fixing
Review via email: mp+38999@code.launchpad.net

This proposal supersedes a proposal from 2010-08-15.

Description of the change

This branch ensures that each test case is cleaned up after it is run, rather than lingering until the entire suite completes. On my machine this reduces the memory usage of a full test run from ~575MB to ~140MB, which prevents OOM related errors towards the end. In fact, the first revision alone is a sufficient fix; however, Robert's intention in removing an earlier hack was to allow each test case instance to be deallocated after being run. To make that happen, a lot of code needs to be touched so that decrementing the refcount after the test completes will actually result in its deallocation.

The most visible change is that there will now be a whine printed to the output stream if a testcase stays alive. As this style has proved rather too easy to ignore, I may try an alternative approach once the remaining cases are fixed. The test result has already been recorded by that point, but adding a new stub test that runs next and fails may be an option.

As written, the code relies on refcounting working, so cycles need to be avoided. The alternative is to run a full gc collection after every test, which is slow enough to be worth avoiding.
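
For illustration, this is the kind of case that stays alive under pure refcounting (a standalone sketch, not code from the branch):

    import gc
    import unittest
    import weakref

    class SelfReferencing(unittest.TestCase):

        def test_cycle(self):
            # Storing a bound method on the instance makes a cycle:
            # instance -> bound method -> instance
            self.also_self = self.test_cycle

    case = SelfReferencing("test_cycle")
    ref = weakref.ref(case)
    case.run(unittest.TestResult())
    del case
    print ref() is None  # False: refcounting alone cannot free the cycle
    gc.collect()
    print ref() is None  # True: only the cycle collector reclaims it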

Some changes to testtools are needed to avoid some needless exc_info cycles, but even with that fixed we're hamstrung on a few fronts. Firstly, cases with errors or failures are kept in lists on the result object. In testtools, there is also a list for expected failures, but Bazaar doesn't use that one. However, the new unittest _ExpectedFailure gets the old exc_info, which creates cycles.

The last change on the branch guts the TestDecorator classes. This is needed because otherwise they were keeping extra references to test cases alive, but I found the way they worked previously overly complicated anyway. I'd like to simplify them further if there's not a good backward compatibility reason for keeping the old style.

The branch still needs a bit more work: about a hundred tests still need poking to avoid cycles, and it needs a NEWS entry and so on. I'd like some feedback on whether people think this is a worthwhile change over just stripping the __dict__, and whether there are any other issues I've not realised.

Revision history for this message
Robert Collins (lifeless) wrote : Posted in a previous version of this proposal

So, I think this is more correct; fixing cycles is pretty important
for us in other ways : I like that we're finding latent issues here.

I'm not worried about the references to test cases in errors/failures
lists : they are a benefit (and the __dict__ stripping was a _REAL_
pain there in the past).

-Rob

Revision history for this message
Martin Packman (gz) wrote : Posted in a previous version of this proposal

> I'm not worried about the references to test cases in errors/failures
> lists : they are a benefit (and the __dict__ stripping was a _REAL_
> pain there in the past).

Indeed, that mostly won't be an issue (and can sometimes be useful); my one worry is that the occasional breaks-everything rev will then go back to swapping my machine to death. That's why the current implementation still wipes all uncollected tests, but I'm okay with the idea of removing that once everything else is sorted.
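
(For reference, the wiping just swaps out the case's __dict__ for a stub that keeps id() working for reporting, as in the branch:)

    # Zombie the testcase but leave a working stub id method
    case.__dict__ = {"id": lambda _id=case.id(): _id}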

A few things I forgot to mention in the initial review comment.

The lock decorators now use this horribly obnoxious pattern:

    self.lock_read()
    try:
        result = unbound(self, *args, **kwargs)
    except:
        # Stash the exception so the original error is re-raised even if
        # unlock() raises too
        import sys
        exc_info = sys.exc_info()
        try:
            self.unlock()
        finally:
            try:
                raise exc_info[0], exc_info[1], exc_info[2]
            finally:
                # Drop the traceback reference so this frame doesn't become
                # part of a cycle and keep the test case alive
                del exc_info
    else:
        self.unlock()
        return result

If anyone can think of a sane way of spelling that, please say.

Cycles needing to be avoided will have to be documented if we go with this, and can be annoying to understand without decent knowledge of Python internals. I've been using a rough reference tracking function to fix things, but it's not worth landing. Is there any package that can help contributors diagnose things like this?
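
For the curious, the tracking function is roughly along these lines (a simplified sketch, not the exact code):

    import gc

    def report_referrers(obj):
        """Print a line for everything still referring to obj."""
        gc.collect()
        for referrer in gc.get_referrers(obj):
            # gc.get_referrers also reports this frame, as it holds obj
            print type(referrer).__name__, str(referrer)[:60]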

Would there be any benefit in running all tests with gc disabled by default, and only enabling it for those that explicitly require it?

Is there a better way of telling if the test you just ran failed other than comparing failure counts on the result object?
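
What the branch does now is have the result object expose a count of the test instances it stores, which TestSuite.run compares before and after each case:

    def _count_stored_tests(self):
        """Count of test instances kept alive due to not succeeding"""
        return self.error_count + self.failure_count + self.known_failure_count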

Revision history for this message
Robert Collins (lifeless) wrote : Posted in a previous version of this proposal

> Is there a better way of telling if the test you just ran failed other than comparing failure counts on the result object?

Where do you need to know this?

Revision history for this message
Martin Packman (gz) wrote : Posted in a previous version of this proposal

> > Is there a better way of telling if the test you just ran failed other than
> > comparing failure counts on the result object?
>
> Where do you need to know this?

It has to be outside all the normal methods, unfortunately. Even stopTest is too early, as that's passed a reference to the test case, so clearly the caller is going to be keeping it alive. Instead I added the collection accounting in TestSuite.run, which was already popping testcases from its internal list; now it uses a weakref to check that they really expire after being run.
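
The key part is small (simplified from the branch):

    import weakref

    def _run_and_collect_case(case, res):
        """Run test case against result and use weakref to drop the refcount"""
        case.run(res)
        return weakref.ref(case)

    # back in TestSuite.run: if dereferencing the weakref still yields the
    # case after the helper's frame is gone, something is keeping it alive
    case = _run_and_collect_case(tests.pop(), result)()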

I've put up <lp:~gz/testtools/avoid_exc_info_cycles> with the testtools changes I needed.

Revision history for this message
Vincent Ladeuil (vila) wrote : Posted in a previous version of this proposal

Hmm, I appreciate the effort to reduce the memory footprint, but unfortunately this seems far from landable :-/

BZR_PLUGIN_PATH=-site ./bzr selftest --parallel=fork
bzr selftest: /home/vila/src/bzr/reviews/cleanup_testcases_by_collection_613247/bzr
   /home/vila/src/bzr/reviews/cleanup_testcases_by_collection_613247/bzrlib
   bzr-2.2b3 python-2.6.5 Linux-2.6.32-24-generic-x86_64-with-Ubuntu-10.04-lucid

----------------------------------------------------------------------
Ran 8 tests in 0.489s

OK
2 tests skipped

Huh? Where have my tests gone? (Strong suspicion around your modification of __iter__)

Ran 24856 tests in 1050.614s

FAILED (failures=2, errors=2, known_failure_count=48)

Including failures for bb.test_log so your fix is wrong there.

Overall, from a superficial reading, the way you avoid ref cycles seems hard to understand,
so cycles are very likely to be re-introduced. How about breaking the ref cycles with addCleanup calls instead
(by deleting the offending attributes)?
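
Something like this, I mean (class and attribute names invented for the example):

    class TestWithFormatter(tests.TestCase):

        def setUp(self):
            super(TestWithFormatter, self).setUp()
            # hypothetical: the formatter holds a reference back to the test
            self.formatter = self.make_formatter()
            # deleting the offending attribute in a cleanup breaks the cycle
            self.addCleanup(self.__dict__.pop, "formatter", None)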

Now, a way to track the uncollectable test cases sounds like a good way to make progress, but given the amount of work, could you instead devise a patch where the check depends on some option?
Once we reach 0 we can toggle the default value for this option.

For the run above, I also get ~4000 Uncollected test cases, far too many to land this patch :-/

review: Needs Fixing
Revision history for this message
Martin Packman (gz) wrote : Posted in a previous version of this proposal

Thanks for looking at this Vincent.

> Huh? Where have my tests gone? (Strong suspicion around your modification of
> __iter__)

I suspect this is related to --parallel, which I thought might be affected but didn't have any easy way of testing on my machine; I'll take another look at that code.

> Including failures for bb.test_log so your fix is wrong there.

Gah, how did I miss those...

> Overall, from a superficial reading, the way you avoid ref cycles seems hard
> to understand, so cycles are very likely to be re-introduced. How about
> breaking the ref cycles with addCleanup calls instead (by deleting the
> offending attributes)?

Yes, that it's hard to understand is a big concern. At least, this patch, retroactively changing a bunch of tests in subtle ways, is the worst part of it. It's much easier to adjust the test you're writing right now, because you get told there's a cycle. So, moving in the direction of actually causing a failure once all existing issues are fixed would, I think, prevent regressions without being too taxing on contributors.

Note that sometimes cycles do have to be avoided, because it's easier than trying to break them later, particularly with exc_info and closures, where it's internal objects pointing at each other rather than public TestCase attributes.

> > Now, a way to track the uncollectable test cases sounds like a good way to
> > make progress, but given the amount of work, could you instead devise a
> > patch where the check depends on some option?
> > Once we reach 0 we can toggle the default value for this option.

Well, the "print junk to stdout" is my less intrusive option, but the aim is to get it to 0 before landing. However...

> For the run above, I also get ~4000 Uncollected test cases, far too many to
> land this patch :-/

This could well be just needing the testtools exc_info cycle cleanup as well, branch for that linked in previous comment.

...getting 4000 complaints from not having the testtools change probably does mean this needs to land quieter; making everyone patch testtools or be spammed isn't polite.

Revision history for this message
Vincent Ladeuil (vila) wrote : Posted in a previous version of this proposal

> Thanks for looking at this Vincent.
>
> > Huh? Where have my tests gone? (Strong suspicion around your modification
> > of __iter__)
>
> I suspect this is related to --parallel, which I thought might be affected but
> didn't have any easy way of testing on my machine; I'll take another look at
> that code.

Clearly --parallel specific and at least one __iter__ you've modified should be involved.

>
> > Including failures for bb.test_log so your fix is wrong there.
>
> Gah, how did I miss those...

I'm pretty sure the way the log formatters are registered requires that a single instance
is used, and you've changed that. In this case I think breaking the cycle with an addCleanup should be ok.

>
> > Overall, from a superficial reading, the way you avoid ref cycles seems
> > hard to understand, so cycles are very likely to be re-introduced. How
> > about breaking the ref cycles with addCleanup calls instead (by deleting
> > the offending attributes)?
>
> Yes, that it's hard to understand is a big concern. At least this patch, with
> retroactively changing a bunch of tests in subtle ways, is the worst part of
> it. It's much easier to adjust the test you're writing right now because you
> get told there's a cycle.

I understand that, but avoiding the cycle may make the code less clear than breaking it during tearDown,
especially if there is an accompanying comment ;)

> So, moving in the direction of actually causing a
> failure once all existing issues are fixed would I think prevent regressions
> without being too taxing on contributors.

Yup, +1 on that.

> Note, that sometimes cycles do have to be avoided, because it's easier than
> trying to break them later, particularly with exc_info and closures where it's
> internal objects pointing at each other rather than public TestCase
> attributes.

No problem with that, but again adding a comment will help document such cases (so people
can learn from them) and avoid regressions.

> > Now, a way to track the uncollectable test cases sounds like a good way to
> > make progress, but given the amount of work, could you instead devise a
> > patch where the check depends on some option?
> > Once we reach 0 we can toggle the default value for this option.
>
> Well, the "print junk to stdout" is my less intrusive option, but the aim is
> to get it to 0 before landing. However...

I'm fine with that as long as an option controls it; there are precedents, look for the lock checks.

> > For the run above, I also get ~4000 Uncollected test cases, far too many to
> > land this patch :-/
>
> This could well be just needing the testtools exc_info cycle cleanup as well,
> branch for that linked in previous comment.

Right, testing with your testtools patch, I still get ~2000 uncollected tests.

>
> ...getting 4000 complaints from not having the testtools change probably does
> mean this needs to land quieter, making everyone patch testtools or be spammed
> isn't polite.

Yup, that's why I suggest an option to control the output that could stay off by default
until you reach 0 uncollected tests, which you can then target in several steps.

In fact, ideally, the first step could be a patch that just imp...

Revision history for this message
Martin Packman (gz) wrote : Posted in a previous version of this proposal

Quick followup on some points from review last time round.

> > > Huh? Where have my tests gone? (Strong suspicion around your modification
> > > of __iter__)
> >
> > I suspect this is related to --parallel, which I thought might be affected
> > but didn't have any easy way of testing on my machine; I'll take another
> > look at that code.
>
> Clearly --parallel specific and at least one __iter__ you've modified should
> be involved.

This was a combination of subunit behaving very differently from the normal selftest result object, and bug 625551 meaning the errors from the children were silently eaten, so I didn't notice.

> I'm pretty sure the way the log formatters are registered requires that a
> single instance is used, and you've changed that. In this case I think
> breaking the cycle with an addCleanup should be ok.

Made this change.

> I'm fine with that as long as an option controls it; there are precedents,
> look for the lock checks.

I'm still avoiding this for the moment, so people with 8GB of RAM at least get some console spam before they completely break my box.

> Right, testing with your testtools patch, I still get ~2000 uncollected tests.

The testtools change landed in 0.9.7, and I'd be interested in a new count. It should now also not break with subunit, which should help.

Revision history for this message
Andrew Bennetts (spiv) wrote :

Martin [gz] wrote:
[...]
> As written, the code relies on refcounting working, so cycles need to be
> avoided. The alternative is to run a full gc collection after every test,
> which is slow enough to be worth avoiding.

Another alternative: why not a single full gc collection at the end of the
entire run, and then report any test cases that are unexpectedly still
alive?
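
That is, something along these lines (only a sketch, all names invented):

    import gc
    import weakref

    _case_refs = []

    def track_case(case):
        # called as each test case finishes running
        _case_refs.append(weakref.ref(case))

    def report_uncollected(stream):
        gc.collect()  # a single full collection for the whole run
        for ref in _case_refs:
            case = ref()
            if case is not None:
                stream.write("Test case still alive: %s\n" % (case.id(),))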

I don't think you should assume that “uncollected at this instant” implies
“can't be collected yet”. And the real issue is uncollectable objects, not
objects that have fractionally outlived their usefulness, AFAICS.

The point is that if a VM chooses to allow say 20 or even 200 used test case
instances to be alive simultaneously before collecting them, then that's
fine. It'll still keep the memory ceiling reasonably low rather than
growing in proportion to the number of tests. So requiring that a test case
lifetime is strictly minimal is too strong a requirement, and I'm not sure
it's worth paying the price for it. The price is either assuming a
refcounting VM (and the fairly extreme sensitivity to refcounting differences
due to different versions of dependencies like testtools or even the Python
stdlib), or forcing a full GC after every test. I agree that the latter is
too much, but I also think the former isn't worth paying either.

To be clear, I don't mind (small) contortions to make most or even all tests
be collected promptly via refcounting, under common conditions. I do mind
emitting warnings when that fails, because I think the noise will outweigh
the benefits.

I'd be very happy to have an optional flag to selftest to force a full GC
after every test (or more creatively, after every N tests, then once a leak
is found re-run the last N individually to pinpoint it...). And maybe even
an optional flag to warn if the refcount-based attempt to release the test
case doesn't release it. It's good to have tools to help developers
diagnose memory issues. I just don't believe these tools should be on all
the time unless they are very cheap and very accurate.
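
For the every-N-tests flavour, a small result wrapper would be enough (again a sketch, nothing like this exists in the branch):

    import gc

    class PeriodicGCResult(object):
        """Forward to a real result, forcing a collection every N tests."""

        def __init__(self, result, every=100):
            self._result = result
            self._every = every
            self._count = 0

        def stopTest(self, test):
            self._result.stopTest(test)
            self._count += 1
            if self._count % self._every == 0:
                gc.collect()

        def __getattr__(self, name):
            # everything else passes straight through to the wrapped result
            return getattr(self._result, name)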

> The branch still needs a bit more work: about a hundred tests still need
> poking to avoid cycles, and it needs a NEWS entry and so on. I'd like some
> feedback on whether people think this is a worthwhile change over just
> stripping the __dict__, and whether there are any other issues I've not
> realised.

Why not strip __dict__ as well? Defense in depth...

Basically, I'm not inclined to go to extreme lengths to get utter perfection
here: if the memory consumption is down, and is likely stay down, and is
fairly easy to diagnose and correct if it doesn't stay down, then I think
that's clearly Good Enough, even if there are a handful of tests that don't
relinquish all their memory, or don't relinquish it as promptly as they
theoretically could.

Aside from the comments above, most of the diff looks ok to me except these
bits:

> +++ bzrlib/tests/test_errors.py 2010-10-20 23:41:12 +0000
> @@ -575,12 +575,15 @@
> try:
> 1/0
> except ZeroDivisionError:
> - exc_info = sys.exc_info()
> - err = errors.HookFailed('hook stage', 'hook name', exc_info, warn=False)
> - self.asser...

Revision history for this message
Andrew Bennetts (spiv) wrote :

I said:

> Why not strip __dict__ as well? Defense in depth...

I just realised you do! So that's nice :)

Revision history for this message
Andrew Bennetts (spiv) wrote :

FWIW, under 4-way --parallel=fork on my laptop:

* I see significant memory savings too (no hard figures as I wasn't watching closely enough, but at least half the size, possibly better).

* The time drops by about 30s (from 560s down to 530s), which is the sort of modest improvement I was hoping to see :)

So, nice work! I just don't want it to be a burden on future development.

Revision history for this message
Martin Packman (gz) wrote :

Thanks Andrew, this is great feedback.

> Another alternative: why not a single full gc collection at the end of the
> entire run, and then report any test cases that are unexpectedly still
> alive?

I didn't take this approach because I'm already in swapping death by that point. Now that the main issues are fixed in this branch, that sounds like a reasonable option.

> The point is that if a VM chooses to allow say 20 or even 200 used test case
> instances to be alive simultaneously before collecting them, then that's
> fine. It'll still keep the memory ceiling reasonably low rather than
> growing in proportion to the number of tests. So requiring that a test case
> lifetime is strictly minimal is too strong a requirement, and I'm not sure
> it's worth paying the price for it. The price is either assuming a
> refcounting VM (and the fairly extreme sensitivity to refcounting differences
> due to different versions of dependencies like testtools or even the Python
> stdlib), or forcing a full GC after every test. I agree that the latter is
> too much, but I also think the former isn't worth paying either.

It's true cycles are harmless in and of themselves nowadays, but they do make execution non-deterministic. By and large, they are also avoidable, and I think something of a code smell in tests at least. Many of the changes I've made to tests actually improve the code, and where it makes things worse tends to be in tests using slightly odd patterns.

I think there are benefits in not creating cycles in bzrlib proper. There are cases where building up a lot of garbage really hurts†, and while performance regressions should be noticed, I like selftest coverage.

There's only one Python implementation bzr can use, and it refcounts, so I don't think relying on that is much of a stretch. Neither Jython nor IronPython‡ are funded any more, and PyPy...

> Why not strip __dict__ as well? Defense in depth...

As you noted, this branch does do that, but Robert was trying to avoid this. Having failed tests in an examinable state at the end of the run is potentially useful.

> Basically, I'm not inclined to go to extreme lengths to get utter perfection
> here: if the memory consumption is down, and is likely stay down, and is
> fairly easy to diagnose and correct if it doesn't stay down, then I think
> that's clearly Good Enough, even if there are a handful of tests that don't
> relinquish all their memory, or don't relinquish it as promptly as they
> theoretically could.

I feel this is where we are and have been with thread leaks, and it's been a pain felt disproportionately by the wrong people. It's much easier for the test writer to get things correct than for those who end up suffering from the random problems. This is why I'm adding noise, and even considering making tests fail, for such a minor thing. It might be an overreaction, but I'd prefer an always-on, testable solution to one big switch that will be generally ignored.

> That hunk looks odd. Surely falling off the end of the function
> already deletes all the locals? And if the test fails, I don't really care
> about the extra memory cost. Certainly not enough to clutter the test.

What...

Revision history for this message
Andrew Bennetts (spiv) wrote :

Martin [gz] wrote:
> Thanks Andrew, this is great feedback.
>
> > Another alternative: why not a single full gc collection at the end of the
> > entire run, and then report any test cases that are unexpectedly still
> > alive?
>
> I didn't take this approach because I'm already in swapping death by that
> point. Now that the main issues are fixed in this branch, that sounds like a
> reasonable option.

Ah, right. For unexpected swap death it would be good to have an optional mode
along the lines of this patch. Although hopefully it'll never return :)

> > The point is that if a VM chooses to allow say 20 or even 200 used test case
> > instances to be alive simultaneously before collecting them, then that's
> > fine. It'll still keep the memory ceiling reasonably low rather than
> > growing in proportion to the number of tests. So requiring that a test case
> > lifetime is strictly minimal is too strong a requirement, and I'm not sure
> > it's worth paying the price for it. The price is either assuming a
> > refcounting VM (and the fairly extreme sensitivity to refcounting differences
> > due to different versions of dependencies like testtools or even the Python
> > stdlib), or forcing a full GC after every test. I agree that the latter is
> > too much, but I also think the former isn't worth paying either.
>
> It's true cycles are harmless in and of themselves nowadays, but they do make
> execution non-deterministic. By and large, they are also avoidable, and I think
> something of a code smell in tests at least. Many of the changes I've made to
> tests actually improve the code, and where it makes things worse tends to be
> in tests using slightly odd patterns.

I'm not really bothered by a little non-determinism in tests, especially as we
do so little with __del__ methods or weakref callbacks. If we want to remove
non-determinism in tests we have a much bigger problem: threads.

I'd say the changes you've made to test code are 50-50: about half a touch
nicer, about half a bit worse. I'm talking in terms of clarity, which is
probably the most important concern for test code. Adding try/finally blocks
that happen to break cycles a casual reader or writer wouldn't even think of,
when cycles aren't significant to the behaviour being tested, detracts from the
test.

> I think there are benefits in not creating cycles in bzrlib proper. There are
> cases where building up a lot of garbage really hurts†, and while performance
> regressions should be noticed, I like selftest coverage.

Yes, John's noticed the same thing. Manually breaking cycles is a terribly
fragile way to get a 5% benefit though (5% being what I saw in terms of selftest
performance, and roughly what John's noticed in the past for some other issues).
The best fix is to simply create fewer objects, or at least create fewer
gc-tracked objects: hence StaticTuple, and other performance work John's done.
Manually breaking cycles here and there throughout the code isn't really
tackling root causes.

I don't mind it in generic higher-order function infrastructure like
bzrlib.decorators and bzrlib.cleanups, where many places benefit from a fairly
simple change (although even there it m...

Revision history for this message
Vincent Ladeuil (vila) wrote :

This is definitely cleaner! Excellent.

Yet:

./bzr selftest -s bzrlib.tests.test_selftest.TestUncollectedWarnings => 7 failures

and doing a full run outputs an 'Uncollected test case:' line for every test.

Am I missing some special version of testtools? (I'm running with lp:testtools revno 128.)

review: Needs Fixing
Revision history for this message
Martin Packman (gz) wrote :

Sorry, I had left the tip on Launchpad in a slightly broken state; I've synced it up with my local branch again.

Revision history for this message
Andrew Bennetts (spiv) wrote :

I still see a fair few "Uncollected test case" messages when doing "./bzr selftest". Too many to be useful, I fear.

And if I use --parallel=fork (which I do almost all the time) then every test is reported as uncollected! So this still needs a bit more work I guess. I'm going to put this back to Work in Progress to declutter the review queue.

review: Needs Fixing
Revision history for this message
Andrew Bennetts (spiv) wrote :

On the plus side it does still pass regular selftest locally, even after merging latest lp:bzr (no conflicts currently).

Preview Diff

=== modified file 'bzrlib/errors.py'
--- bzrlib/errors.py 2011-06-21 11:34:52 +0000
+++ bzrlib/errors.py 2011-07-01 00:12:55 +0000
@@ -1659,6 +1659,7 @@
 
     def __init__(self, exc_info):
         import traceback
+        # GZ 2010-08-10: Cycle with exc_tb/exc_info affects at least one test
         self.exc_type, self.exc_value, self.exc_tb = exc_info
         self.exc_info = exc_info
         traceback_strings = traceback.format_exception(
 
=== modified file 'bzrlib/tests/TestUtil.py'
--- bzrlib/tests/TestUtil.py 2011-05-26 08:05:45 +0000
+++ bzrlib/tests/TestUtil.py 2011-07-01 00:12:55 +0000
@@ -19,6 +19,7 @@
 import sys
 import logging
 import unittest
+import weakref
 
 from bzrlib import pyutils
 
@@ -82,14 +83,37 @@
         tests = list(self)
         tests.reverse()
         self._tests = []
+        stream = getattr(result, "stream", None)
+        # With subunit, not only is stream underscored, but the actual result
+        # object is hidden inside a wrapper decorator, get out the real stream
+        if stream is None:
+            stream = result.decorated._stream
+        stored_count = 0
+        count_stored_tests = getattr(result, "_count_stored_tests", int)
+        from bzrlib.tests import selftest_debug_flags
+        notify = "uncollected_cases" in selftest_debug_flags
         while tests:
             if result.shouldStop:
                 self._tests = reversed(tests)
                 break
-            tests.pop().run(result)
+            case = _run_and_collect_case(tests.pop(), result)()
+            new_stored_count = count_stored_tests()
+            if case is not None and isinstance(case, unittest.TestCase):
+                if stored_count == new_stored_count and notify:
+                    # Testcase didn't fail, but somehow is still alive
+                    stream.write("Uncollected test case: %s\n" % (case.id(),))
+                # Zombie the testcase but leave a working stub id method
+                case.__dict__ = {"id": lambda _id=case.id(): _id}
+            stored_count = new_stored_count
         return result
 
 
+def _run_and_collect_case(case, res):
+    """Run test case against result and use weakref to drop the refcount"""
+    case.run(res)
+    return weakref.ref(case)
+
+
 class TestLoader(unittest.TestLoader):
     """Custom TestLoader to extend the stock python one."""
 
 
=== modified file 'bzrlib/tests/__init__.py'
--- bzrlib/tests/__init__.py 2011-06-27 15:42:09 +0000
+++ bzrlib/tests/__init__.py 2011-07-01 00:12:55 +0000
@@ -495,6 +495,10 @@
         self.not_applicable_count += 1
         self.report_not_applicable(test, reason)
 
+    def _count_stored_tests(self):
+        """Count of tests instances kept alive due to not succeeding"""
+        return self.error_count + self.failure_count + self.known_failure_count
+
     def _post_mortem(self, tb=None):
         """Start a PDB post mortem session."""
         if os.environ.get('BZR_TEST_PDB', None):
@@ -3159,6 +3163,9 @@
         result_decorators=result_decorators,
         )
     runner.stop_on_failure=stop_on_failure
+    if isinstance(suite, unittest.TestSuite):
+        # Empty out _tests list of passed suite and populate new TestSuite
+        suite._tests[:], suite = [], TestSuite(suite)
     # built in decorator factories:
     decorators = [
         random_order(random_seed, runner),
@@ -3262,34 +3269,17 @@
 
 class TestDecorator(TestUtil.TestSuite):
     """A decorator for TestCase/TestSuite objects.
-
-    Usually, subclasses should override __iter__(used when flattening test
-    suites), which we do to filter, reorder, parallelise and so on, run() and
-    debug().
+
+    Contains rather than flattening suite passed on construction
     """
 
-    def __init__(self, suite):
-        TestUtil.TestSuite.__init__(self)
-        self.addTest(suite)
-
-    def countTestCases(self):
-        cases = 0
-        for test in self:
-            cases += test.countTestCases()
-        return cases
-
-    def debug(self):
-        for test in self:
-            test.debug()
-
-    def run(self, result):
-        # Use iteration on self, not self._tests, to allow subclasses to hook
-        # into __iter__.
-        for test in self:
-            if result.shouldStop:
-                break
-            test.run(result)
-        return result
+    def __init__(self, suite=None):
+        super(TestDecorator, self).__init__()
+        if suite is not None:
+            self.addTest(suite)
+
+    # Don't need subclass run method with suite emptying
+    run = unittest.TestSuite.run
 
 
 class CountingDecorator(TestDecorator):
@@ -3306,90 +3296,50 @@
     """A decorator which excludes test matching an exclude pattern."""
 
     def __init__(self, suite, exclude_pattern):
-        TestDecorator.__init__(self, suite)
-        self.exclude_pattern = exclude_pattern
-        self.excluded = False
-
-    def __iter__(self):
-        if self.excluded:
-            return iter(self._tests)
-        self.excluded = True
-        suite = exclude_tests_by_re(self, self.exclude_pattern)
-        del self._tests[:]
-        self.addTests(suite)
-        return iter(self._tests)
+        super(ExcludeDecorator, self).__init__(
+            exclude_tests_by_re(suite, exclude_pattern))
 
 
 class FilterTestsDecorator(TestDecorator):
     """A decorator which filters tests to those matching a pattern."""
 
     def __init__(self, suite, pattern):
-        TestDecorator.__init__(self, suite)
-        self.pattern = pattern
-        self.filtered = False
-
-    def __iter__(self):
-        if self.filtered:
-            return iter(self._tests)
-        self.filtered = True
-        suite = filter_suite_by_re(self, self.pattern)
-        del self._tests[:]
-        self.addTests(suite)
-        return iter(self._tests)
+        super(FilterTestsDecorator, self).__init__(
+            filter_suite_by_re(suite, pattern))
 
 
 class RandomDecorator(TestDecorator):
     """A decorator which randomises the order of its tests."""
 
     def __init__(self, suite, random_seed, stream):
-        TestDecorator.__init__(self, suite)
-        self.random_seed = random_seed
-        self.randomised = False
-        self.stream = stream
-
-    def __iter__(self):
-        if self.randomised:
-            return iter(self._tests)
-        self.randomised = True
-        self.stream.write("Randomizing test order using seed %s\n\n" %
-            (self.actual_seed()))
+        random_seed = self.actual_seed(random_seed)
+        stream.write("Randomizing test order using seed %s\n\n" %
+            (random_seed,))
         # Initialise the random number generator.
-        random.seed(self.actual_seed())
-        suite = randomize_suite(self)
-        del self._tests[:]
-        self.addTests(suite)
-        return iter(self._tests)
+        random.seed(random_seed)
+        super(RandomDecorator, self).__init__(randomize_suite(suite))
 
-    def actual_seed(self):
-        if self.random_seed == "now":
+    @staticmethod
+    def actual_seed(seed):
+        if seed == "now":
             # We convert the seed to a long to make it reuseable across
             # invocations (because the user can reenter it).
-            self.random_seed = long(time.time())
+            return long(time.time())
         else:
             # Convert the seed to a long if we can
            try:
-                self.random_seed = long(self.random_seed)
-            except:
+                return long(seed)
+            except (TypeError, ValueError):
                 pass
-        return self.random_seed
+        return seed
 
 
 class TestFirstDecorator(TestDecorator):
     """A decorator which moves named tests to the front."""
 
     def __init__(self, suite, pattern):
-        TestDecorator.__init__(self, suite)
-        self.pattern = pattern
-        self.filtered = False
-
-    def __iter__(self):
-        if self.filtered:
-            return iter(self._tests)
-        self.filtered = True
-        suites = split_suite_by_re(self, self.pattern)
-        del self._tests[:]
-        self.addTests(suites)
-        return iter(self._tests)
+        super(TestFirstDecorator, self).__init__()
+        self.addTests(split_suite_by_re(suite, pattern))
 
 
 def partition_tests(suite, count):
@@ -3442,9 +3392,12 @@
             os.waitpid(self.pid, 0)
 
     test_blocks = partition_tests(suite, concurrency)
+    # Clear the tests from the original suite so it doesn't keep them alive
+    suite._tests[:] = []
     for process_tests in test_blocks:
-        process_suite = TestUtil.TestSuite()
-        process_suite.addTests(process_tests)
+        process_suite = TestUtil.TestSuite(process_tests)
+        # Also clear each split list so new suite has only reference
+        process_tests[:] = []
         c2pread, c2pwrite = os.pipe()
         pid = os.fork()
         if pid == 0:
@@ -3456,12 +3409,15 @@
                 # read from stdin (otherwise its a roulette to see what
                 # child actually gets keystrokes for pdb etc).
                 sys.stdin.close()
-                sys.stdin = None
+                # GZ 2011-06-16: Why set stdin to None? Breaks multi fork.
+                #sys.stdin = None
                 stream = os.fdopen(c2pwrite, 'wb', 1)
                 subunit_result = AutoTimingTestResultDecorator(
                     TestProtocolClient(stream))
                 process_suite.run(subunit_result)
             finally:
+                # GZ 2011-06-16: Is always exiting with silent success
+                # really the right thing? Hurts debugging.
                 os._exit(0)
         else:
             os.close(c2pwrite)
@@ -3574,8 +3530,13 @@
 #                           with proper exclusion rules.
 #   -Ethreads               Will display thread ident at creation/join time to
 #                           help track thread leaks
+<<<<<<< TREE
 
 #   -Econfig_stats          Will collect statistics using addDetail
+=======
+#   -Euncollected_cases     Display the identity of any test cases that weren't
+#                           deallocated after being completed.
+>>>>>>> MERGE-SOURCE
 selftest_debug_flags = set()
 
 
@@ -4683,6 +4644,14 @@
     from subunit.test_results import AutoTimingTestResultDecorator
     class SubUnitBzrProtocolClient(TestProtocolClient):
 
+        # GZ 2011-05-26: This duplicates logic in ExtendedTestResult.stopTest
+        def stopTest(self, test):
+            super(SubUnitBzrProtocolClient, self).stopTest(test)
+            # Break bound method cycles added in Python 2.7 unittest rewrite
+            type_equality_funcs = getattr(test, "_type_equality_funcs", None)
+            if type_equality_funcs is not None:
+                type_equality_funcs.clear()
+
         def addSuccess(self, test, details=None):
             # The subunit client always includes the details in the subunit
             # stream, but we don't want to include it in ours.
 
=== modified file 'bzrlib/tests/test_selftest.py'
--- bzrlib/tests/test_selftest.py 2011-06-17 13:59:23 +0000
+++ bzrlib/tests/test_selftest.py 2011-07-01 00:12:55 +0000
@@ -17,6 +17,7 @@
 """Tests for the test framework."""
 
 from cStringIO import StringIO
+import gc
 import doctest
 import os
 import signal
@@ -3385,6 +3386,113 @@
         self.assertLength(1, calls)
 
 
+class TestUncollectedWarnings(tests.TestCase):
+    """Check a test case still alive after being run emits a warning"""
+
+    class Test(tests.TestCase):
+        def test_pass(self):
+            pass
+        def test_self_ref(self):
+            self.also_self = self.test_self_ref
+        def test_skip(self):
+            self.skip("Don't need")
+
+    def _get_suite(self):
+        return TestUtil.TestSuite([
+            self.Test("test_pass"),
+            self.Test("test_self_ref"),
+            self.Test("test_skip"),
+            ])
+
+    def _inject_stream_into_subunit(self, stream):
+        """To be overridden by subclasses that run tests out of process"""
+
+    def _run_selftest_with_suite(self, **kwargs):
+        sio = StringIO()
+        self._inject_stream_into_subunit(sio)
+        old_flags = tests.selftest_debug_flags
+        tests.selftest_debug_flags = old_flags.union(["uncollected_cases"])
+        gc_on = gc.isenabled()
+        if gc_on:
+            gc.disable()
+        try:
+            tests.selftest(test_suite_factory=self._get_suite, stream=sio,
+                **kwargs)
+        finally:
+            if gc_on:
+                gc.enable()
+            tests.selftest_debug_flags = old_flags
+        output = sio.getvalue()
+        self.assertNotContainsRe(output, "Uncollected test case.*test_pass")
+        self.assertContainsRe(output, "Uncollected test case.*test_self_ref")
+        return output
+
+    def test_testsuite(self):
+        self._run_selftest_with_suite()
+
+    def test_pattern(self):
+        out = self._run_selftest_with_suite(pattern="test_(?:pass|self_ref)$")
+        self.assertNotContainsRe(out, "test_skip")
+
+    def test_exclude_pattern(self):
+        out = self._run_selftest_with_suite(exclude_pattern="test_skip$")
+        self.assertNotContainsRe(out, "test_skip")
+
+    def test_random_seed(self):
+        self._run_selftest_with_suite(random_seed="now")
+
+    def test_matching_tests_first(self):
+        self._run_selftest_with_suite(matching_tests_first=True,
+            pattern="test_self_ref$")
+
+    def test_starting_with_and_exclude(self):
+        out = self._run_selftest_with_suite(starting_with=["bt."],
+            exclude_pattern="test_skip$")
+        self.assertNotContainsRe(out, "test_skip")
+
+    def test_additonal_decorator(self):
+        out = self._run_selftest_with_suite(
+            suite_decorators=[tests.TestDecorator])
+
+
+class TestUncollectedWarningsSubunit(TestUncollectedWarnings):
+    """Check warnings from tests staying alive are emitted with subunit"""
+
+    _test_needs_features = [features.subunit]
+
+    def _run_selftest_with_suite(self, **kwargs):
+        return TestUncollectedWarnings._run_selftest_with_suite(self,
+            runner_class=tests.SubUnitBzrRunner, **kwargs)
+
+
+class TestUncollectedWarningsForking(TestUncollectedWarnings):
+    """Check warnings from tests staying alive are emitted when forking"""
+
+    _test_needs_features = [features.subunit]
+
+    def _inject_stream_into_subunit(self, stream):
+        """Monkey-patch subunit so the extra output goes to stream not stdout
+
+        Some APIs need rewriting so this kind of bogus hackery can be replaced
+        by passing the stream param from run_tests down into ProtocolTestCase.
+        """
+        from subunit import ProtocolTestCase
+        _original_init = ProtocolTestCase.__init__
+        def _init_with_passthrough(self, *args, **kwargs):
+            _original_init(self, *args, **kwargs)
+            self._passthrough = stream
+        self.overrideAttr(ProtocolTestCase, "__init__", _init_with_passthrough)
+
+    def _run_selftest_with_suite(self, **kwargs):
+        # GZ 2011-05-26: Add a PosixSystem feature so this check can go away
+        if getattr(os, "fork", None) is None:
+            raise tests.TestNotApplicable("Platform doesn't support forking")
+        # Make sure the fork code is actually invoked by claiming two cores
+        self.overrideAttr(osutils, "local_concurrency", lambda: 2)
+        kwargs.setdefault("suite_decorators", []).append(tests.fork_decorator)
+        return TestUncollectedWarnings._run_selftest_with_suite(self, **kwargs)
+
+
 class TestEnvironHandling(tests.TestCase):
 
     def test_overrideEnv_None_called_twice_doesnt_leak(self):