Code review comment for lp:~gz/bzr/cleanup_testcases_by_collection_613247

Revision history for this message
Martin Packman (gz) wrote :

Thanks Andrew, this is great feedback.

> Another alternative: why not a single full gc collection at the end of the
> entire run, and then report any test cases that are unexpectedly still
> alive?

I didn't take this approach because I'm already in swapping death by that point. Now the main issues are fixed in this branch that sounds like a reasonable option.

> The point is that if a VM chooses to allow say 20 or even 200 used test case
> instances to be alive simultaneously before collecting them, then that's
> fine. It'll still keep the memory ceiling reasonably low rather than
> growing in proportion to the number of tests. So requiring that a test case
> lifetime is strictly minimal is too strong a requirement, and I'm not sure
> it's worth paying the price for it. The price is either assuming a
> refcounting VM (and the fairly extreme senstivity to refcounting differences
> due to different versions of dependencies like testtools or even the Python
> stdlib), or forcing a full GC after every test. I agree that the latter is
> too much, but I also think the former isn't worth paying either.

It's true cycles are harmless in and off themselves nowadays, but they do make execution non-deterministic. By and large, they are also avoidable and I think something of a code smell in tests at least. Many of the changes I've made to tests actually improve the code, and where it makes things worse tends to be in tests using slightly odd patterns.

I think there are benefits in not creating cycles in bzrlib proper. There are cases where building up a lot of garbage really hurts†, and while performance regressions should be noticed, I like selftest coverage.

There's only one Python implementation bzr can use, and it refcounts, so I don't think relying on that is much of a stretch. Neither Jython nor IronPython‡ are funded any more, and PyPy...

> Why not strip __dict__ as well? Defense in depth...

As you noted this branch does do that, but Robert was trying to avoid this. Having failed tests in a examinable state at the end of the run is potentially useful.

> Basically, I'm not inclined to go to extreme lengths to get utter perfection
> here: if the memory consumption is down, and is likely stay down, and is
> fairly easy to diagnose and correct if it doesn't stay down, then I think
> that's clearly Good Enough, even if there are a handful of tests that don't
> relinquish all their memory, or don't relinquish it as promptly as they
> theoretically could.

I feel this is where we are and have been with thread leaks, and it's been pain felt disproportionately by the wrong people. It's much easier for the test writer to get things correct, than those who end up suffering from the random problems. This is why I'm adding noise, and even considering making tests fail, for such a minor thing. It might be an overreaction, but I'd prefer to have an always on, testable solution than one big switch that will be generally ignored.

> That hunk looks odd. Surely falling off the end of the function
> already deletes all the locals? And if the test fails, I don't really care
> about the extra memory cost. Certainly not enough to clutter the test.

What's happening here was;

    exc_info -> (3-tuple)[2] -> (traceback).tb_frame -> (frame).f_locals['exc_info'] -> ...

The change is about as ugly as breaking cycles gets, but I left this and a few other things where I feel modifying the actual logic would be preferable. HookFailed is 1) deprecated, and 2) doesn't actually need to keep the traceback object.

> To be pedantic this shouldn't gc.enable unless gc.isenabled() was true
> before.

Good catch.

† <http://benjamin.smedbergs.us/blog/tag/pymake/>
‡ <http://hugunin.net/microsoft_farewell.html>

« Back to merge proposal