Merge lp:~gz/bzr/cleanup_testcases_by_collection_613247 into lp:bzr
| Status: | Superseded |
|---|---|
| Proposed branch: | lp:~gz/bzr/cleanup_testcases_by_collection_613247 |
| Merge into: | lp:bzr |
| Diff against target: |
838 lines (+212/-143) 24 files modified
bzrlib/branch.py (+4/-1) bzrlib/cleanup.py (+4/-1) bzrlib/decorators.py (+16/-4) bzrlib/errors.py (+1/-0) bzrlib/osutils.py (+4/-1) bzrlib/tests/TestUtil.py (+17/-1) bzrlib/tests/__init__.py (+48/-89) bzrlib/tests/blackbox/test_breakin.py (+5/-4) bzrlib/tests/blackbox/test_log.py (+1/-0) bzrlib/tests/blackbox/test_serve.py (+2/-1) bzrlib/tests/per_repository/test_check.py (+2/-2) bzrlib/tests/per_workingtree/test_smart_add.py (+1/-0) bzrlib/tests/test_chk_map.py (+1/-1) bzrlib/tests/test_clean_tree.py (+0/-1) bzrlib/tests/test_errors.py (+16/-11) bzrlib/tests/test_http.py (+4/-3) bzrlib/tests/test_knit.py (+1/-0) bzrlib/tests/test_merge.py (+2/-2) bzrlib/tests/test_registry.py (+4/-3) bzrlib/tests/test_repository.py (+2/-2) bzrlib/tests/test_selftest.py (+70/-10) bzrlib/tests/test_smart.py (+1/-1) bzrlib/tests/test_smart_transport.py (+1/-0) bzrlib/tests/test_workingtree_4.py (+5/-5) |
| To merge this branch: | bzr merge lp:~gz/bzr/cleanup_testcases_by_collection_613247 |
| Related bugs: |
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Vincent Ladeuil | 2010-08-15 | Needs Fixing on 2010-08-19 | |
|
Review via email:
|
|||
This proposal has been superseded by a proposal from 2010-10-20.
Description of the Change
This branch ensures that after each test case is run, they are cleaned up rather than lingering until the entire suite completes. On my machine this reduces the memory usage of a full test run from ~575MB to ~140MB, which prevents OOM related errors towards the end. In fact, the first revision is a sufficient fix, however Robert's intention in removing an earlier hack was to allow each test case instance to be deallocated after being run. To make that happen, a lot of code needs to be touched to make sure decrementing the refcount after the test completes will actually result in its deallocation.
The most visible change is there will now be a whine printed to the output stream if a testcase stays alive. As this style has proved rather too easy to ignore, I may try an alternative approach once the remaining ones are fixed. The test result has already been recorded by this point, but adding a new stub test to run next that fails may be an option.
As written, the code relies on refcounting working, so cycles need to be avoided. The alternative is to run a full gc collection after every test, which slow enough to be worth avoiding.
Some changes to testtools are needed to avoid some needless exc_info cycles, but even with that fixed we're hamstrung on a few fronts. Firstly, cases with errors or failures are kept in lists on the result object. In testtools, there is also a list for expected failures but bazaar doesn't use this one. However, the new unittest _ExpectedFailure gets the old exc_info which creates cycles.
The last change on the branch guts the TestDecorator classes. This is needed because otherwise they were keeping extra references to test cases alive, but I found the way they worked previously overly complicated anyway. I'd like to simplify them further if there's not a good backward compatibility reason for keeping the old style.
The branch needs a bit more work still, about a hundred tests still need poking to avoid cycles, and needs a NEWS entry and so on. I'd like some feedback on if people think this is a worthwhile change over just stripping the __dict__ and if there are any other issues I've not realised.
| Robert Collins (lifeless) wrote : | # |
| Martin Packman (gz) wrote : | # |
> I'm not worried about the references to test cases in errors/failures
> lists : they are a benefit (and the __dict__ stripping was a _REAL_
> pain there in the past).
Indeed that mostly won't be an issue (and can sometimes be useful), my one worry is the occasional breaks-everything rev will then go back to swapping my machine to death. That's why the current implementation still wipes all uncollected tests, but I'm okay with the idea of removing that once everything else is sorted.
A few things I forgot to mention in the initial review comment.
The lock decorators now use this horribly obnoxious pattern:
self.
try:
result = unbound(self, *args, **kwargs)
except:
import sys
exc_info = sys.exc_info()
try:
finally:
try:
del exc_info
else:
return result
If anyone can think of a sane way of spelling that, please say.
Cycles needing to be avoided will have to be documented if we go with this, and can be annoying to understand without decent knowledge of python internals. I've been using a rough reference tracking function to fix things, but it's not worth landing. Is there any package that can help contributors diagnose things like this?
Would there be any benefit in running all tests with gc disabled by default, and only enabling it for those that explicitly require it?
Is there a better way of telling if the test you just ran failed other than comparing failure counts on the result object?
| Robert Collins (lifeless) wrote : | # |
> Is there a better way of telling if the test you just ran failed other than comparing failure counts on the result object?
Where do you need to know this?
| Martin Packman (gz) wrote : | # |
> > Is there a better way of telling if the test you just ran failed other than
> comparing failure counts on the result object?
>
> Where do you need to know this?
It has to be outside all the normal methods unfortunately. Even stopTest is too early, as that's passed a reference to the test case, so clearly the caller is going to be keeping it alive. Instead I added the collection accounting in TestSuite.run which was already popping testcases from its internal list, now it uses weakref to check that they really expire after being run.
I've put up <lp:~gz/testtools/avoid_exc_info_cycles> with the testtools changes I needed.
| Vincent Ladeuil (vila) wrote : | # |
Hmm, I appreciate the effort to reduce the memory footprint, but unfortunately this seems far from landable :-/
BZR_PLUGIN_
bzr selftest: /home/vila/
/home/
bzr-2.2b3 python-2.6.5 Linux-2.
-------
Ran 8 tests in 0.489s
OK
2 tests skipped
Huh ? Where are my tests gone ? (Strong suspicion around your modification of __iter__)
Ran 24856 tests in 1050.614s
FAILED (failures=2, errors=2, known_failure_
Including failures for bb.test_log so your fix is wrong there.
On the overall, with a superficial reading, the way you avoid ref cycles seem hard to understand
so very likely to be re-introduced. How about breaking the ref-cycles with addCleanup calls instead
(by deleting the offending attributes) ?
Now, a way to track the uncollectable test case sounds like a good way to progress but given the amount of work, could you instead devise a patch where the check depends on some option ?
Once we reach 0 we can toggle the default value for this option.
For the run above, I also get ~4000 Uncollected test cases, far too much to land this patch :-/
| Martin Packman (gz) wrote : | # |
Thanks for looking at this Vincent.
> Huh ? Where are my tests gone ? (Strong suspicion around your modification of
> __iter__)
I suspect this is related to --parallel which I thought might be affected but didn't have any easy way of testing on my machine, I'll take another look at that code.
> Including failures for bb.test_log so your fix is wrong there.
Gah, how did I miss those...
> On the overall, with a superficial reading, the way you avoid ref cycles seem
> hard to understand
> so very likely to be re-introduced. How about breaking the ref-cycles with
> addCleanup calls instead
> (by deleting the offending attributes) ?
Yes, that it's hard to understand is a big concern. At least this patch, with retroactively changing a bunch of tests in subtle ways, is the worst part of it. It's much easier to adjust the test you're writing right now because you get told there's a cycle. So, moving in the direction of actually causing a failure once all existing issues are fixed would I think prevent regressions without being too taxing on contributors.
Note, that sometimes cycles do have to be avoided, because it's easier than trying to break them later, particularly with exc_info and closures where it's internal objects pointing at each other rather than public TestCase attributes.
> Now, a way to track the uncollectable test case sounds like a good way to
> progress but given the amount of work, could you instead devise a patch where
> the check depends on some option ?
> Once we reach 0 we can toggle the default value for this option.
Well, the "print junk to stdout" is my less intrusive option, but the aim is to get it to 0 before landing. However...
> For the run above, I also get ~4000 Uncollected test cases, far too much to
> land this patch :-/
This could well be just needing the testtools exc_info cycle cleanup as well, branch for that linked in previous comment.
...getting 4000 complaints from not having the testtools change probably does mean this needs to land quieter, making everyone patch testtools or be spammed isn't polite.
| Vincent Ladeuil (vila) wrote : | # |
> Thanks for looking at this Vincent.
>
> > Huh ? Where are my tests gone ? (Strong suspicion around your modification
> of
> > __iter__)
>
> I suspect this is related to --parallel which I thought might be affected but
> didn't have any easy way of testing on my machine, I'll take another look at
> that code.
Clearly --parallel specific and at least one __iter__ you've modified should be involved.
>
> > Including failures for bb.test_log so your fix is wrong there.
>
> Gah, how did I miss those...
I'm pretty sure the way the log formatter are registered requires that a single instance is used
and you've changed that. In this case I think breaking the cycle with an addCleanup should be ok.
>
> > On the overall, with a superficial reading, the way you avoid ref cycles
> seem
> > hard to understand
> > so very likely to be re-introduced. How about breaking the ref-cycles with
> > addCleanup calls instead
> > (by deleting the offending attributes) ?
>
> Yes, that it's hard to understand is a big concern. At least this patch, with
> retroactively changing a bunch of tests in subtle ways, is the worst part of
> it. It's much easier to adjust the test you're writing right now because you
> get told there's a cycle.
I understand that, but avoiding the cycle may make the code less clear than breaking it during tearDown,
especially if there is an accompanying comment ;)
> So, moving in the direction of actually causing a
> failure once all existing issues are fixed would I think prevent regressions
> without being too taxing on contributors.
Yup, +1 on that.
> Note, that sometimes cycles do have to be avoided, because it's easier than
> trying to break them later, particularly with exc_info and closures where it's
> internal objects pointing at each other rather than public TestCase
> attributes.
No problem with that, but again adding a comment will help document such cases (so people
can learn from them) and avoid regressions.
> > Now, a way to track the uncollectable test case sounds like a good way to
> > progress but given the amount of work, could you instead devise a patch
> where
> > the check depends on some option ?
> > Once we reach 0 we can toggle the default value for this option.
>
> Well, the "print junk to stdout" is my less intrusive option, but the aim is
> to get it to 0 before landing. However...
I'm fine with that as long as an option control it, there are precedents, look for the lock checks.
> > For the run above, I also get ~4000 Uncollected test cases, far too much to
> > land this patch :-/
>
> This could well be just needing the testtools exc_info cycle cleanup as well,
> branch for that linked in previous comment.
Right, testing with your testtools patch, I still get ~2000 uncollected tests.
>
> ...getting 4000 complaints from not having the testtools change probably does
> mean this needs to land quieter, making everyone patch testtools or be spammed
> isn't polite.
Yup, that's why I suggest an option to control the output that could be stay off by default
until you reach 0 uncollected tests which you can then target in several steps.
In fact, ideally, the first step could be a patch that just imp...
- 5357. By Martin Packman on 2010-08-23
-
Revert changes to bb.test_log which caused some failures
- 5358. By Martin Packman on 2010-08-26
-
Instead break cycle from closure over testcase instance in bb.test_log
- 5359. By Martin Packman on 2010-10-19
-
Merge bzr.dev to pick up leak fixes
- 5360. By Martin Packman on 2010-10-19
-
Use jam's trick with getDetails rather than hacking around double underscore name mangling
- 5361. By Martin Packman on 2010-10-20
-
A few more simple test changes to avoid or break cycles
- 5362. By Martin Packman on 2010-10-20
-
Adapt test cleanup accounting to subunit quirks
| Martin Packman (gz) wrote : | # |
Quick followup on some points from review last time round.
> > > Huh ? Where are my tests gone ? (Strong suspicion around your modification
> > of
> > > __iter__)
> >
> > I suspect this is related to --parallel which I thought might be affected
> but
> > didn't have any easy way of testing on my machine, I'll take another look at
> > that code.
>
> Clearly --parallel specific and at least one __iter__ you've modified should
> be involved.
This was a combination of subunit playing very differently to the normal selftest result object, and bug 625551 meaning the errors from the children were silently eaten so I didn't notice.
> I'm pretty sure the way the log formatter are registered requires that a
> single instance is used
> and you've changed that. In this case I think breaking the cycle with an
> addCleanup should be ok.
Made this change.
> I'm fine with that as long as an option control it, there are precedents, look
> for the lock checks.
I'm still avoiding this for the moment, so people with 8GB ram at least get some console spam before they completely break my box.
> Right, testing with your testtools patch, I still get ~2000 uncollected tests.
The testtools change landed for 0.9.7 and I'd be interested in a new count. Should now not break with subunit which should help.
- 5363. By Martin Packman on 2010-11-08
-
Resort to relying on subunit internals to avoid messing around with stdout in testing
- 5364. By Martin Packman on 2010-11-08
-
Add testing for uncollected case warnings under subunit
- 5365. By Martin Packman on 2010-11-08
-
Only fiddle with gc for tests if it is enabled, as noted by spiv in review
- 5366. By Martin Packman on 2010-11-12
-
Tidy up and comment some of the test changes
- 5367. By Martin Packman on 2011-05-18
-
Merge bzr.dev to renew work
- 5368. By Martin Packman on 2011-05-18
-
Fix mixup in lazy hook restoration and clear dict
- 5369. By Martin Packman on 2011-05-18
-
Put reporting of uncollected tests behind a -Ecollection flag
- 5370. By Martin Packman on 2011-05-18
-
Cleanup testcase logging internals and avoid a cycle
- 5371. By Martin Packman on 2011-05-18
-
Clear extra lists with test instances in fork_for_tests
- 5372. By Martin Packman on 2011-05-20
-
Supersede all of the split out work that jam did
- 5373. By John A Meinel on 2011-05-25
-
Bring bzr.dev 5912 back into gz's branch.
- 5374. By Martin Packman on 2011-05-26
-
Use 'uncollected_cases' as the flag name which seems a bit clearer
- 5375. By Martin Packman on 2011-05-26
-
Fix tests to work regardless of flag state
- 5376. By Martin Packman on 2011-05-26
-
Add test subclass for collection of cases under --parallel=fork
- 5377. By Martin Packman on 2011-05-26
-
Fix subunit collection tests on Python 2.7 with a little copy code
- 5378. By Martin Packman on 2011-05-26
-
Make the fork tests check the right code on single core machines
- 5379. By Martin Packman on 2011-05-26
-
Really fix the fork tests by screwing around with subunit internals
- 5380. By Martin Packman on 2011-05-26
-
Tweaks and comments
- 5381. By Martin Packman on 2011-05-26
-
Merge bzr.dev to resolve conflicts
- 5382. By Martin Packman on 2011-06-16
-
Fix fork tests when run using fork, this was a pain to debug
- 5383. By Martin Packman on 2011-07-27
-
Use tweaks for subunit protocol client with --parallel=fork as well as --subunit
- 5384. By Martin Packman on 2011-09-16
-
Merge bzr.dev to unify _type_equality_
funcs workarounds - 5385. By Martin Packman on 2011-09-16
-
Factor out _type_equality_
funcs to a new function - 5386. By Martin Packman on 2011-09-16
-
Use pseudo testcase to always fail as suggested by lifeless in review rather than hacking the subunit stream
- 5387. By Martin Packman on 2011-09-16
-
Stop making zombie testcases for now as it's an extra risk
- 5388. By Martin Packman on 2011-10-04
-
Ensure all inner tests in TestUncollected
Warnings are run - 5389. By Martin Packman on 2011-10-04
-
After adding a pseudo-failure update the stored test count again
- 5390. By Martin Packman on 2011-10-04
-
Add release notes
- 5391. By Martin Packman on 2011-10-05
-
Revert checking of test count in TestUncollected
Cases.test_ random_ seed - 5392. By Martin Packman on 2011-10-18
-
Dedent release notes to right format

So, I think this is more correct; fixing cycles is pretty important
for us in other ways : I like that we're finding latent issues here.
I'm not worried about the references to test cases in errors/failures
lists : they are a benefit (and the __dict__ stripping was a _REAL_
pain there in the past).
-Rob