On 01.09.2011 10:11, Vincent Ladeuil wrote:
> [feedback] Overall, the approach sounds good. I'm not sure I'm
> reading this correctly but it seems you've now made
> ScopeReplacer._replace() the only place where races can occur. If
> that's the case and we want to lock something, that should
> probably be there.

Depends on what race you're talking about.

A race resulting in double import will occur if the second thread reads
the _real_obj member before the first thread has written it. As the read
of _real_obj occurs in _resolve, that's involved in the race as well,
and a lock should encompass that whole code section, i.e. be aquired
before _real_obj is first accessed.

A race resulting in an UNDETECTED double replacement is restricted to
the very end of _replace, starting at the second read for _real_obj, the
one to the prev_obj local variable.

Reducing the chances for the second kind of race is the only reason for
the duplicate read of _real_obj. The assumption is that importing a
module might take considerable time, so a critical code section only
after the import represents a much shorter timespan.

> [nit]
> 
> 17	+
> 18	+ disallow_proxying()
> 
> A bit naked ;) A comment explaining why it's done here (as early
> as possible but no sooner) will be good.

Guess you're right. To make things even cleaner, I could import the
lazy_import module instead of the function of the same name. That would
make this call use a qualified name, giving it more context. Will do
both soon.

> [needsinfo]
> 
> 218	- if event == 'line' and 'lazy_import.py' in filename:
> 219	- line = linecache.getline(filename, frame.f_lineno)
> 220	- # ...and the line of code is the one we're looking for...
> 221	- if 'del self._factory' in line:
> <...>
> 228	+ if (filename.endswith('lazy_import.py') and
> 229	+ function_name == self.method_to_trace):
> 
> This looks nice and less hackish, but I thought you were relying
> on the code you're replacing here in lp:~gagern/bzr/bug-702914 ?

This branch here supersedes that one, in my opinion. Notice that I never
requested that one to be merged. It was only intended as a proof that
the race problems hadn't been completely resolved yet. And the fact that
problems remained despite the original test cases shows another fact:
you cannot reasonably test for all possible races, but only those you
can think of. Thus the more people reviewing the code and thinking of
possible races, the more likely they won't miss one. The number of test
cases alone does little to increase confidence.

> I wouldn't mind merging these tests instead (or their moral equivalent).

There is no direct equivalent, as the lines of code mentioned no longer
exist in that fashion. I also came to believe that the scenarios
approach was bad, as it implied that all three kinds of test could be
matched by the same break point line regexp, which isn't necessarily
obvious.

The tests here are imo suitable to guard against a wide range of likely
races. The fact that the method name is an argument makes adding more
tests easy. So I'd rather prefer these tests here over the ones from
lp:~gagern/bzr/bug-702914 . I'd be happy to add three more tests for
_replace instead of _resolve, to match the number of distinct tests and
implement a VERY ROUGH approximation of the respective code locations.

One drawback of the code here is that it will only suspend the thread at
function entry, not at arbitrary lines within the function body, the way
the replaced code did. On the other hand, my code should work even if
people delete the *.py files and run bzrlib from the *.pyc only.