Tests sometimes fail on EC2 due to _LockWarner garbage

Bug #721166 reported by Jonathan Lange
This bug affects 1 person
Affects: Launchpad itself
Status: Fix Released
Importance: Critical
Assigned to: Martin Pool
Milestone: (none)

Bug Description

I've just had two unrelated branches fail on EC2 test with errors about _LockWarner garbage.

In one, all of the devscripts tests failed like:
======================================================================
ERROR: devscripts.tests.test_sourcecode.TestPlanUpdate.test_trivial (subunit.RemotedTestCase)
----------------------------------------------------------------------
_StringException: Text attachment: garbage
------------
[<bzrlib.lockable_files._LockWarner object at 0xdbc49d0>]
------------

In the other, a test which checked for output to stderr failed with:
======================================================================
FAILURE: lp.scripts.tests.test_sphinxdocs.TestSphinxDocumentation.test_docs_build_without_error (subunit.RemotedTestCase)
----------------------------------------------------------------------
_StringException: Text attachment: traceback
------------
Traceback (most recent call last):
 File "/var/launchpad/tmp/eggs/testtools-0.9.8-py2.6.egg/testtools/runtest.py", line 169, in _run_user
   return fn(*args, **kwargs)
 File "/var/launchpad/tmp/eggs/testtools-0.9.8-py2.6.egg/testtools/testcase.py", line 499, in _run_test_method
   return self._get_test_method()()
 File "/var/launchpad/test/lib/lp/scripts/tests/test_sphinxdocs.py", line 34, in test_docs_build_without_error
   self.assertEqual('Making output directory...\n', stderr)
 File "/var/launchpad/tmp/eggs/testtools-0.9.8-py2.6.egg/testtools/testcase.py", line 268, in assertEqual
   self.assertThat(observed, matcher)
 File "/var/launchpad/tmp/eggs/testtools-0.9.8-py2.6.egg/testtools/testcase.py", line 345, in assertThat
   % (matchee, matcher, mismatch.describe()))
AssertionError: Match failed. Matchee: "Making output directory...
Exception IndexError: IndexError('list index out of range',) in <bound method _LockWarner.__del__ of <bzrlib.lockable_files._LockWarner object at 0xef9ea50>> ignored
"
Matcher: Equals('Making output directory...\n')
Difference: !=:
reference = 'Making output directory...\n'
actual = "Making output directory...\nException IndexError: IndexError('list index out of range',) in <bound method _LockWarner.__del__ of <bzrlib.lockable_files._LockWarner object at 0xef9ea50>> ignored\n"

------------

Neither of the branches touched Bazaar in any way. Julian also reports getting the devscripts failures from another branch.

Jonathan Lange (jml) wrote :

Note that the second error in the bug description has occurred twice.

Jonathan Lange (jml)
Changed in launchpad:
importance: High → Critical
Robert Collins (lifeless) wrote : Re: [Bug 721166] Re: Tests sometimes fail on EC2 due to _LockWarner garbage

I suspect that warnings is in bad shape.

Michael Hudson-Doyle (mwhudson) wrote :

I would guess that these messages are spammed to stderr in every test run, but it's a matter of luck as to whether this happens during a test that cares about output on stderr?

Jonathan Lange (jml) wrote :

Yeah, that would be my guess.

Jonathan Lange (jml) wrote :

Well, also, it can cause failures when it occurs as uncollected garbage at the time when Zope is checking for such things.
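To illustrate the kind of check involved, here is a minimal sketch of a garbage check a test runner could run after a test. It is not Zope's actual code; `Node`, `make_cycle` and `leaked_instances` are hypothetical names. On Python 2 (the tracebacks above are from 2.6), objects with `__del__` caught in a reference cycle landed in `gc.garbage` on their own; Python 3.4 and later collect them, so the sketch sets `gc.DEBUG_SAVEALL` to make unreachable objects visible.

```python
import gc


class Node(object):
    """Hypothetical object that takes part in a reference cycle."""

    def __init__(self):
        self.other = None


def make_cycle():
    # Two objects pointing at each other: only the cycle collector can free them.
    a, b = Node(), Node()
    a.other, b.other = b, a


def leaked_instances(fn, cls):
    """Run fn, then return instances of cls that the cycle collector found."""
    gc.collect()                      # start from a clean state
    old_flags = gc.get_debug()
    before = len(gc.garbage)
    gc.set_debug(gc.DEBUG_SAVEALL)    # keep unreachable objects in gc.garbage
    try:
        fn()
        gc.collect()
        return [o for o in gc.garbage[before:] if isinstance(o, cls)]
    finally:
        gc.set_debug(old_flags)
        del gc.garbage[before:]       # drop only what this check added


leaked = leaked_instances(make_cycle, Node)
print(len(leaked))
```

A runner that fails a test whenever this list is non-empty would blame whichever test happened to be running when the garbage showed up, which matches the unrelated failures reported here.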

Jonathan Lange (jml)
tags: added: spurious-test-failure
Martin Pool (mbp) wrote :

This could be fixed by just eliminating the __del__ method in <https://code.launchpad.net/~mbp/bzr/rm-del-methods/+merge/64466>. We need to see if anyone thinks it's worth keeping. This bug is a good example of why they're so problematic, if you can't even reliably print a warning.
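For readers unfamiliar with the failure mode, here is a minimal sketch of how an exception raised inside `__del__` is handled. `LockWarnerSketch` is a hypothetical stand-in, not bzrlib's `_LockWarner`. The sketch assumes Python 3.8 or later, where such exceptions go to `sys.unraisablehook`; on Python 2.6 the interpreter wrote the "Exception ... ignored" line straight to stderr, as seen in the bug description.

```python
import gc
import sys


class LockWarnerSketch(object):
    """Hypothetical stand-in whose finalizer fails, as in the report."""

    def __init__(self):
        self.items = []

    def __del__(self):
        # Indexing an empty list raises IndexError inside the finalizer.
        self.items[0]


captured = []


def record_unraisable(unraisable):
    # The exception cannot propagate to any caller, so Python reports it here.
    captured.append(unraisable.exc_type)


previous_hook = sys.unraisablehook
sys.unraisablehook = record_unraisable
try:
    obj = LockWarnerSketch()
    del obj        # on CPython the finalizer runs as the last reference goes
    gc.collect()   # for implementations without reference counting
finally:
    sys.unraisablehook = previous_hook

print(captured)
```

The `del` statement itself does not raise: the error is only reported out of band, at whatever moment the object happens to be finalized.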

Martin Pool (mbp)
Changed in launchpad:
assignee: nobody → Martin Pool (mbp)
Martin Pool (mbp)
Changed in launchpad:
status: Triaged → In Progress
Jonathan Lange (jml) wrote :

Thanks Martin. If you re-enable lib/lp/scripts/tests/disabled_test_sphinxdocs.py by renaming it to 'test_sphinxdocs.py', you should get a close-to-reliable way to reproduce this error from the Launchpad test suite.

Jelmer Vernooij (jelmer) wrote :

Since the fix for this has already landed in bzr 2.4b4, I wonder if this would be resolved in Launchpad when lp:~jelmer/launchpad/bzr-2.4b4 lands?

Jonathan Lange (jml) wrote :

On Tue, Jul 19, 2011 at 3:00 PM, Jelmer Vernooij
<email address hidden> wrote:
> Since the fix for this has already landed in bzr 2.4b4, I wonder if this
> would be resolved in Launchpad when lp:~jelmer/launchpad/bzr-2.4b4
> lands?

There's one easy way to find out:

Re-enable 'lib/lp/scripts/tests/disabled_test_sphinxdocs.py' by
renaming it to 'test_sphinxdocs.py' and see if it fails.

jml

Martin Pool (mbp) wrote :

I re-enabled the tests as jml suggested and ran them 20 times with no failures, so it looks like the fix in bzr has indeed fixed the problem. The test_sourcecode tests don't seem to be disabled (either by the file being renamed or within the file). So I'm going to propose a change that renames test_sphinxdocs, and if that passes ec2 we can call this done?
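A rough sketch of the repeat-run approach, for anyone wanting to check an intermittent failure the same way. `count_failures` is a hypothetical helper, and the command shown is a placeholder rather than the real Launchpad test invocation, which this thread does not spell out.

```python
import subprocess
import sys


def count_failures(command, runs=20):
    """Run command `runs` times and count the non-zero exit statuses."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures


# Placeholder command: substitute the actual test runner invocation.
placeholder_command = [sys.executable, "-c", "pass"]
print(count_failures(placeholder_command, runs=3))
```

Zero failures over a number of runs is evidence rather than proof for a spurious failure, so the run count is worth raising if the original failure rate was low.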

Martin Pool (mbp)
tags: added: no-qa
Martin Pool (mbp)
Changed in launchpad:
status: In Progress → Fix Committed
Launchpad QA Bot (lpqabot) wrote :
tags: added: qa-needstesting
William Grant (wgrant)
tags: added: qa-untestable
removed: qa-needstesting
William Grant (wgrant)
Changed in launchpad:
status: Fix Committed → Fix Released