Merge lp:~spiv/bzr/finishLogFile-robustness-531746 into lp:bzr

Proposed by Andrew Bennetts
Status: Merged
Approved by: Vincent Ladeuil
Approved revision: no longer in the source branch.
Merged at revision: not available
Proposed branch: lp:~spiv/bzr/finishLogFile-robustness-531746
Merge into: lp:bzr
Diff against target: 56 lines (+34/-1)
2 files modified
NEWS (+7/-0)
bzrlib/tests/__init__.py (+27/-1)
To merge this branch: bzr merge lp:~spiv/bzr/finishLogFile-robustness-531746
Reviewer Review Type Date Requested Status
Vincent Ladeuil Needs Fixing
Review via email: mp+20722@code.launchpad.net

Description of the change

This branch simply brute-forces away the traceback I reported in bug 531746.

I checked the C code for file.close, and I've verified by inspection that
 * IOError with .errno==None from file.close only occurs for the 'concurrent operation' error, and
 * it is safe to retry in this case, as no state relating to the file has been mutated.

Ideally perhaps we'd track down every single stray thread and make sure they are stopped before this point, but that's a rather daunting task.

(We may need to do some of that anyway soon, to track down and fix the thread leaks that cause 'cannot start new thread' errors, but we can at least squash this problem.)

To post a comment you must log in.
Revision history for this message
Vincent Ladeuil (vila) wrote :

BB:tweak

I agree with the fix, but make it verbose, please, so we at least get
some indication that *something* occurred.

We'll probably want to get rid of that code in the future so a comment
mentioning the bug (or this merge proposal) will help get back the context
at that point.

review: Needs Fixing
Revision history for this message
Vincent Ladeuil (vila) wrote :

(Using review needsfixing, status approved to emulate BB:tweak).

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'NEWS'
--- NEWS 2010-03-07 03:04:12 +0000
+++ NEWS 2010-03-10 06:05:37 +0000
@@ -168,6 +168,13 @@
168* Stop sending apport crash files to ``.cache`` in the directory from168* Stop sending apport crash files to ``.cache`` in the directory from
169 which ``bzr selftest`` was run. (Martin Pool, #422350)169 which ``bzr selftest`` was run. (Martin Pool, #422350)
170170
171* Tests no longer fail if "close() called during concurrent
172 operation on the same file object" occurs when closing the log file
173 (which can happen if a thread tries to write to the log file at the
174 wrong moment). An warning will be written to ``stderr`` when this
175 happens, and another warning will be written if the log file could not
176 be closed after retrying 100 times. (Andrew Bennetts, #531746)
177
171bzr 2.1.1178bzr 2.1.1
172#########179#########
173180
174181
=== modified file 'bzrlib/tests/__init__.py'
--- bzrlib/tests/__init__.py 2010-02-18 02:15:48 +0000
+++ bzrlib/tests/__init__.py 2010-03-10 06:05:37 +0000
@@ -1671,7 +1671,33 @@
1671 unicodestr = log_contents.decode('utf8', 'replace')1671 unicodestr = log_contents.decode('utf8', 'replace')
1672 log_contents = unicodestr.encode('utf8')1672 log_contents = unicodestr.encode('utf8')
1673 if not keep_log_file:1673 if not keep_log_file:
1674 self._log_file.close()1674 close_attempts = 0
1675 max_close_attempts = 100
1676 first_close_error = None
1677 while close_attempts < max_close_attempts:
1678 close_attempts += 1
1679 try:
1680 self._log_file.close()
1681 except IOError, ioe:
1682 if ioe.errno is None:
1683 # No errno implies 'close() called during
1684 # concurrent operation on the same file object', so
1685 # retry. Probably a thread is trying to write to
1686 # the log file.
1687 if first_close_error is None:
1688 first_close_error = ioe
1689 continue
1690 raise
1691 else:
1692 break
1693 if close_attempts > 1:
1694 sys.stderr.write(
1695 'Unable to close log file on first attempt, '
1696 'will retry: %s\n' % (first_close_error,))
1697 if close_attempts == max_close_attempts:
1698 sys.stderr.write(
1699 'Unable to close log file after %d attempts.\n'
1700 % (max_close_attempts,))
1675 self._log_file = None1701 self._log_file = None
1676 # Permit multiple calls to get_log until we clean it up in1702 # Permit multiple calls to get_log until we clean it up in
1677 # finishLogFile1703 # finishLogFile