Merge lp:~gagern/bzr/bug396819-lazy_import-threadsafe into lp:bzr

Proposed by Martin von Gagern
Status: Merged
Merged at revision: 6426
Proposed branch: lp:~gagern/bzr/bug396819-lazy_import-threadsafe
Merge into: lp:bzr
Diff against target: 304 lines (+115/-82)
3 files modified
bzrlib/builtins.py (+11/-2)
bzrlib/lazy_import.py (+67/-53)
bzrlib/tests/test_lazy_import.py (+37/-27)
To merge this branch: bzr merge lp:~gagern/bzr/bug396819-lazy_import-threadsafe
Reviewer Review Type Date Requested Status
Martin Packman (community) Needs Fixing
Jelmer Vernooij (community) Abstain
Vincent Ladeuil Needs Information
John A Meinel Pending
Benji York Pending
Review via email: mp+73475@code.launchpad.net

Commit message

Make lazy imports (more) thread-safe

Description of the change

I know this still lacks a news entry, but apart from that, would you be willing to merge something along these lines to make lazy imports properly thread safe?

As I argued in bug #396819, I believe that you cannot have guaranteed tread-safe lazy imports without allowing the same replacer to resolve to the real object an arbitrary number of times. And you cannot avoid duplicate calls to the fatory (i.e. duplicate imports) unless you want to use locks.

The attached code avoids locks, as concurrent replacements should be rare and as duplicate imports should be harmless except for a slight cpu time penalty to detect it. It simplifies the replacer class by introducing a _resolve method which always returns the actual object.

As there are no longer any errors when the replace code is called repeatedly, we might miss the fact that the replacer object itself has been copied to some other name. To deal with this problem, the old error-raising behaviour will be restored for selftests, thus ensuring that the code is clean in that respect. Obviously, when the selftests reach the part where the thread safety of the replacer is tested, those checks have to be disabled again...

If you want any kind of beautification, name changes, or whatever, simply ask me or feel free to branch this yourself and apply changes as you see fit. I'm only proposing the general idea as a kind of proof of concept, and thus am most interested in your overall general opinion.

To post a comment you must log in.
Revision history for this message
Vincent Ladeuil (vila) wrote :

Thanks for working on this !

This looks promising and I won't mind polishing it as patch pilot
(or helping you) but before digging I'd like some feedback from
jam and benji which have been involved recently.

[feedback] Overall, the approach sounds good. I'm not sure I'm
reading this correctly but it seems you've now made
ScopeReplacer._replace() the only place where races can occur. If
that's the case and we want to lock something, that should
probably be there.

[nit]

17 +
18 + disallow_proxying()

A bit naked ;) A comment explaining why it's done here (as early
as possible but no sooner) will be good.

[needsinfo]

218 - if event == 'line' and 'lazy_import.py' in filename:
219 - line = linecache.getline(filename, frame.f_lineno)
220 - # ...and the line of code is the one we're looking for...
221 - if 'del self._factory' in line:
<...>
228 + if (filename.endswith('lazy_import.py') and
229 + function_name == self.method_to_trace):

This looks nice and less hackish, but I thought you were relying
on the code you're replacing here in lp:~gagern/bzr/bug-702914 ?

I wouldn't mind merging these tests instead (or their moral equivalent).

review: Needs Information
Revision history for this message
Martin von Gagern (gagern) wrote :
Download full text (3.5 KiB)

On 01.09.2011 10:11, Vincent Ladeuil wrote:
> [feedback] Overall, the approach sounds good. I'm not sure I'm
> reading this correctly but it seems you've now made
> ScopeReplacer._replace() the only place where races can occur. If
> that's the case and we want to lock something, that should
> probably be there.

Depends on what race you're talking about.

A race resulting in double import will occur if the second thread reads
the _real_obj member before the first thread has written it. As the read
of _real_obj occurs in _resolve, that's involved in the race as well,
and a lock should encompass that whole code section, i.e. be aquired
before _real_obj is first accessed.

A race resulting in an UNDETECTED double replacement is restricted to
the very end of _replace, starting at the second read for _real_obj, the
one to the prev_obj local variable.

Reducing the chances for the second kind of race is the only reason for
the duplicate read of _real_obj. The assumption is that importing a
module might take considerable time, so a critical code section only
after the import represents a much shorter timespan.

> [nit]
>
> 17 +
> 18 + disallow_proxying()
>
> A bit naked ;) A comment explaining why it's done here (as early
> as possible but no sooner) will be good.

Guess you're right. To make things even cleaner, I could import the
lazy_import module instead of the function of the same name. That would
make this call use a qualified name, giving it more context. Will do
both soon.

> [needsinfo]
>
> 218 - if event == 'line' and 'lazy_import.py' in filename:
> 219 - line = linecache.getline(filename, frame.f_lineno)
> 220 - # ...and the line of code is the one we're looking for...
> 221 - if 'del self._factory' in line:
> <...>
> 228 + if (filename.endswith('lazy_import.py') and
> 229 + function_name == self.method_to_trace):
>
> This looks nice and less hackish, but I thought you were relying
> on the code you're replacing here in lp:~gagern/bzr/bug-702914 ?

This branch here supersedes that one, in my opinion. Notice that I never
requested that one to be merged. It was only intended as a proof that
the race problems hadn't been completely resolved yet. And the fact that
problems remained despite the original test cases shows another fact:
you cannot reasonably test for all possible races, but only those you
can think of. Thus the more people reviewing the code and thinking of
possible races, the more likely they won't miss one. The number of test
cases alone does little to increase confidence.

> I wouldn't mind merging these tests instead (or their moral equivalent).

There is no direct equivalent, as the lines of code mentioned no longer
exist in that fashion. I also came to believe that the scenarios
approach was bad, as it implied that all three kinds of test could be
matched by the same break point line regexp, which isn't necessarily
obvious.

The tests here are imo suitable to guard against a wide range of likely
races. The fact that the method name is an argument makes adding more
tests easy. So I'd rather prefer these tests here over the ones from
lp:~gagern/bzr/bug-702914 . I'd be happy to add three more tests for
_replace instead of ...

Read more...

6113. By Martin von Gagern

Allow subclasses to override _should_proxy.

This avoids a nasty hack in the unit test, as it allows single tests to
still use a proxying replacer even if proxying is disabled during selftest
by default.

6114. By Martin von Gagern

Document the call to disallow_proxying in cmd_selftest.

Now we use a qualified name and have a verbose comment explaining it.

Revision history for this message
Martin von Gagern (gagern) wrote :

Ping?

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

On Tue 06 Sep 2011 20:53:04 CEST, Martin von Gagern wrote:
> Ping?
John is sprinting this week so it might take a while longer before he
has time to look at this.

Cheers,

Jelmer

Revision history for this message
Jelmer Vernooij (jelmer) :
review: Abstain
Revision history for this message
Martin von Gagern (gagern) wrote :

Second ping? Merge request has been out more than a month, please don't forget about it!

Revision history for this message
Martin Packman (gz) wrote :

Sorry we've not managed to get this reviewed MvG. Your analysis here and in the bug about the previous change seem correct to me, but I'm unconvinced about making further changes to lazy_import in this regard.

I can't offer useful suggestions for improving the branch, only complain about the state of the pitch:

* There's no way to disable lazy imports for things like loggerhead that care about threading not startup time.

* Telling whether a given lazy import is actually doing anything useful is far too hard.

  * It's not uncommon to see an object get imported then unconditionally used further down in the module.

  * Many parts of bzrlib are lazily imported somewhere, but end up always being required anyway.

  * The import tariff tests are nice, but very heavy compared to say, a lint style check.

* Parsing import statements out of strings is silly in many respects.

What I'd like to see would be a replacement with fewer features bzrlib could move to instead. I'm imagining it would:

* Use the import statement directly.

* Only works on module objects.

* Use a proxy that remains in place but gains attributes.

Since bzrlib.lazy_import was introduced we've gained a bunch of other lazy mechanisms which take over some of the work, but it's all very ad hoc.

Revision history for this message
Martin von Gagern (gagern) wrote :

Am 23.11.11 15:19, schrieb Martin Packman:
> Sorry we've not managed to get this reviewed MvG. Your analysis here and in the bug about the previous change seem correct to me, but I'm unconvinced about making further changes to lazy_import in this regard.

Currently, lazy_import is not merely ill designed, but actually breaking
stuff. So I believe that this proposal still should receive a proper
review, and the fix be committed. Unless there actually is code ready to
replace it, fixing it is better than not fixing it.

> I can't offer useful suggestions for improving the branch, only complain about the state of the pitch:
> * There's no way to disable lazy imports for things like loggerhead that care about threading not startup time.

Would be easy, but a different issue, to be implemented in a different
branch, preferrably on top of this one.

> * Telling whether a given lazy import is actually doing anything useful is far too hard.
> * It's not uncommon to see an object get imported then unconditionally used further down in the module.
> * Many parts of bzrlib are lazily imported somewhere, but end up always being required anyway.
> * The import tariff tests are nice, but very heavy compared to say, a lint style check.

tariff tests can deal with only a very limited number of scenarios to
check, but will by design also catch obscure paths leading to some
import. Lint-style checks could process a lot more code paths, detecting
lots of the stuff you describe above, but would be likely to miss some
more obscure constructs. So I'd consider them complementary.

> What I'd like to see would be a replacement with fewer features bzrlib could move to instead. I'm imagining it would:
>
> * Use the import statement directly.

I know that strange things are possible, but how would this work?
"from bzrlib.lazyly import foobar", with "lazily" not being a module but
instead an object working as a proxy factory? I guess that should be
possible, although I have some concerns about nested modules.

> * Only works on module objects.

That would surely make things a lot easier and cleaner.

> * Use a proxy that remains in place but gains attributes.

Not sure about performance here. I can imagine why a self-replacing
proxy would at least feel faster. It should be possible to do some
benchmarks on this.

Revision history for this message
Martin Packman (gz) wrote :

> Currently, lazy_import is not merely ill designed, but actually breaking
> stuff. So I believe that this proposal still should receive a proper
> review, and the fix be committed. Unless there actually is code ready to
> replace it, fixing it is better than not fixing it.

That's fair. Between me and Jelmer I'll see if we can manage a review here.

I'll leave the discussion of alternatives to a later merge proposal. :)

Revision history for this message
Martin Packman (gz) wrote :

Okay, Jelmer and I have discussed the general issue and the approach this branch takes. So, the code here is aimed at removing cases that result in exceptions, either usefully (in the case of multiply assigned proxies), or incorrectly (in the case of thread races). On that basis, the branch should land, and soon so it has some time on trunk to shake out any further issues.

The disallow_proxying mechanism for keeping the old testing for accidentally assigning a lazy object before it gets thunked I'm not convinced should be included with this change.

+ ScopeReplacer._last_duplicate_replacement = self

This is module-global state which is only used so the one call point of disallow_proxying() in selftest can retroactively raise (the most recent occurrence of) the error. Given that bzrlib.tests and test modules do not lazily import the will have already pulled in many things (varying if -s is used), this will only test one real import path.

For this branch, I suggest pulling out that code and living with the reduced coverage of assignment accidents, and just using `self.overrideAttr(ScopeReplacer, "_should_proxy", False)` for tests.

To get coverage for import issues again, using a debug flag to set the _should_proxy value and something similar to the import tarrif tests would work, and should be done in a future branch.

+ The loosing caller used to see an exception (bugs 396819 and 702914).
...
+ in order win the race, setting up the original caller to loose.

When changing this docstring it'd be nice to fix the spelling of 'lose'. :)

review: Needs Fixing

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/builtins.py'
2--- bzrlib/builtins.py 2011-08-31 15:25:11 +0000
3+++ bzrlib/builtins.py 2011-09-01 20:49:31 +0000
4@@ -18,8 +18,8 @@
5
6 import os
7
8-from bzrlib.lazy_import import lazy_import
9-lazy_import(globals(), """
10+from bzrlib import lazy_import
11+lazy_import.lazy_import(globals(), """
12 import cStringIO
13 import sys
14 import time
15@@ -3800,6 +3800,15 @@
16 randomize=None, exclude=None, strict=False,
17 load_list=None, debugflag=None, starting_with=None, subunit=False,
18 parallel=None, lsprof_tests=False):
19+
20+ # During selftest, disallow proxying, as it can cause severe
21+ # performance penalties and is only needed for thread
22+ # safety. The selftest command is assumed to not use threads
23+ # too heavily. The call should be as early as possible, as
24+ # error reporting for past duplicate imports won't have useful
25+ # backtraces.
26+ lazy_import.disallow_proxying()
27+
28 from bzrlib import tests
29
30 if testspecs_list is not None:
31
32=== modified file 'bzrlib/lazy_import.py'
33--- bzrlib/lazy_import.py 2011-08-19 18:02:37 +0000
34+++ bzrlib/lazy_import.py 2011-09-01 20:49:31 +0000
35@@ -51,10 +51,15 @@
36
37 __slots__ = ('_scope', '_factory', '_name', '_real_obj')
38
39- # Setting this to True will allow you to do x = y, and still access members
40- # from both variables. This should not normally be enabled, but is useful
41- # when building documentation.
42- _should_proxy = False
43+ # This will record the last duplicate replacement. It is used
44+ # during selftests where duplicate replacements are forbidden.
45+ _last_duplicate_replacement = None
46+
47+ # If you to do x = y, setting this to False will disallow access to
48+ # members from the second variable (i.e. x). This should normally
49+ # be enabled for reasons of thread safety and documentation, but
50+ # will be disabled during the selftest command to check for abuse.
51+ _should_proxy = True
52
53 def __init__(self, scope, factory, name):
54 """Create a temporary object in the specified scope.
55@@ -71,74 +76,83 @@
56 object.__setattr__(self, '_real_obj', None)
57 scope[name] = self
58
59+ def _resolve(self):
60+ """Return the real object for which this is a placeholder"""
61+ real_obj = object.__getattribute__(self, '_real_obj')
62+ if real_obj is not None:
63+ object.__getattribute__(self, '_duplicate_replacement')()
64+ return real_obj
65+ replace = object.__getattribute__(self, '_replace')
66+ return replace()
67+
68 def _replace(self):
69 """Actually replace self with other in the given scope"""
70 name = object.__getattribute__(self, '_name')
71- try:
72- factory = object.__getattribute__(self, '_factory')
73- scope = object.__getattribute__(self, '_scope')
74- except AttributeError, e:
75- # Because ScopeReplacer objects only replace a single
76- # item, passing them to another variable before they are
77- # replaced would cause them to keep getting replaced
78- # (only they are replacing the wrong variable). So we
79- # make it forbidden, and try to give a good error.
80- raise errors.IllegalUseOfScopeReplacer(
81- name, msg="Object already cleaned up, did you assign it"
82- " to another variable?",
83- extra=e)
84+ factory = object.__getattribute__(self, '_factory')
85+ scope = object.__getattribute__(self, '_scope')
86 obj = factory(self, scope, name)
87 if obj is self:
88 raise errors.IllegalUseOfScopeReplacer(name, msg="Object tried"
89 " to replace itself, check it's not using its own scope.")
90- if ScopeReplacer._should_proxy:
91+
92+ # The same object might be replaced repeatedly due to a race
93+ # condition. We want to detect this fact for possible reporting.
94+ prev_obj = object.__getattribute__(self, '_real_obj')
95+ if prev_obj is None:
96 object.__setattr__(self, '_real_obj', obj)
97- scope[name] = obj
98+ scope[name] = obj
99+ else:
100+ # This indicates a duplicate replacement and should never
101+ # happen except under rare race conditions. See lp:396819.
102+ # Note: the above check is not thread-safe, so there is a
103+ # slight chance that concurrent duplicate replacements are
104+ # missed. Not worth the cost of a lock, though.
105+ object.__getattribute__(self, '_duplicate_replacement')()
106 return obj
107
108- def _cleanup(self):
109- """Stop holding on to all the extra stuff"""
110- try:
111- del self._factory
112- except AttributeError:
113- # Oops, we just lost a race with another caller of _cleanup. Just
114- # ignore it.
115- pass
116-
117- try:
118- del self._scope
119- except AttributeError:
120- # Another race loss. See above.
121- pass
122-
123- # We keep _name, so that we can report errors
124- # del self._name
125-
126 def __getattribute__(self, attr):
127- obj = object.__getattribute__(self, '_real_obj')
128- if obj is None:
129- _replace = object.__getattribute__(self, '_replace')
130- obj = _replace()
131- _cleanup = object.__getattribute__(self, '_cleanup')
132- _cleanup()
133+ obj = object.__getattribute__(self, '_resolve')()
134 return getattr(obj, attr)
135
136 def __setattr__(self, attr, value):
137- obj = object.__getattribute__(self, '_real_obj')
138- if obj is None:
139- _replace = object.__getattribute__(self, '_replace')
140- obj = _replace()
141- _cleanup = object.__getattribute__(self, '_cleanup')
142- _cleanup()
143+ obj = object.__getattribute__(self, '_resolve')()
144 return setattr(obj, attr, value)
145
146 def __call__(self, *args, **kwargs):
147- _replace = object.__getattribute__(self, '_replace')
148- obj = _replace()
149- _cleanup = object.__getattribute__(self, '_cleanup')
150- _cleanup()
151+ obj = object.__getattribute__(self, '_resolve')()
152 return obj(*args, **kwargs)
153
154+ def _duplicate_replacement(self):
155+ if object.__getattribute__(self, '_should_proxy'):
156+ # Do not report now, but remember in case we want to
157+ # report this later.
158+ ScopeReplacer._last_duplicate_replacement = self
159+ else:
160+ name = object.__getattribute__(self, '_name')
161+ raise errors.IllegalUseOfScopeReplacer(
162+ name, msg="Object already replaced, did you assign it"
163+ " to another variable?")
164+
165+
166+def disallow_proxying():
167+ """Disallow lazily imported modules to be used as proxies.
168+
169+ Calling this function might cause problems with concurrent imports
170+ in multithreaded environments, but will help detecting wasteful
171+ indirection, so it should be called when executing unit tests.
172+
173+ Calling this function will even act for the most recent past case
174+ of proxying, reporting the name of the proxied object. So a
175+ successful call to this function will ensure that scope replacer
176+ objects were never misused as proxies in the past and will never
177+ be misused in the future."""
178+
179+ ScopeReplacer._should_proxy = False
180+ last = ScopeReplacer._last_duplicate_replacement
181+ if last is not None:
182+ ScopeReplacer._last_duplicate_replacement = None
183+ object.__getattribute__(last, '_duplicate_replacement')()
184+
185
186 class ImportReplacer(ScopeReplacer):
187 """This is designed to replace only a portion of an import list.
188
189=== modified file 'bzrlib/tests/test_lazy_import.py'
190--- bzrlib/tests/test_lazy_import.py 2011-08-19 18:19:31 +0000
191+++ bzrlib/tests/test_lazy_import.py 2011-09-01 20:49:31 +0000
192@@ -375,7 +375,6 @@
193 ('foo', 2),
194 ('foo', 3),
195 ('__getattribute__', 'foo'),
196- '_replace',
197 ], actions)
198
199 def test_enable_proxying(self):
200@@ -1171,51 +1170,62 @@
201 ], self.actions)
202
203
204+class ProxyingReplacer(lazy_import.ScopeReplacer):
205+ """Replacer class that will do proxying even during selftest.
206+
207+ This is useful to check that the lazy import code is sufficiently
208+ thread-safe.
209+ """
210+
211+ _should_proxy = True
212+
213+
214 class TestScopeReplacerReentrance(TestCase):
215 """The ScopeReplacer should be reentrant.
216
217 Invoking a replacer while an invocation was already on-going leads to a
218- race to see which invocation will be the first to delete the _factory and
219- _scope attributes. The loosing caller used to see AttributeErrors (bug
220- 702914).
221+ race to see which invocation will be the first to call _replace.
222+ The loosing caller used to see an exception (bugs 396819 and 702914).
223
224- These tests set up a tracer that stops at the moment just before one of
225- the attributes is being deleted and starts another call to the
226- functionality in question (__call__, __getattribute__, __setattr_) in
227- order win the race, setting up the originall caller to loose.
228+ These tests set up a tracer that stops at a suitable moment (upon
229+ entry of a specified method) and starts another call to the
230+ functionality in question (__call__, __getattribute__, __setattr_)
231+ in order win the race, setting up the original caller to loose.
232 """
233
234 def tracer(self, frame, event, arg):
235+ if event != 'call':
236+ return self.tracer
237 # Grab the name of the file that contains the code being executed.
238- filename = frame.f_globals["__file__"]
239+ code = frame.f_code
240+ filename = code.co_filename
241 # Convert ".pyc" and ".pyo" file names to their ".py" equivalent.
242 filename = re.sub(r'\.py[co]$', '.py', filename)
243+ function_name = code.co_name
244 # If we're executing a line of code from the right module...
245- if event == 'line' and 'lazy_import.py' in filename:
246- line = linecache.getline(filename, frame.f_lineno)
247- # ...and the line of code is the one we're looking for...
248- if 'del self._factory' in line:
249- # We don't need to trace any more.
250- sys.settrace(None)
251- # Run another racer. This one will "win" the race, deleting
252- # the attributes. When the first racer resumes it will loose
253- # the race, generating an AttributeError.
254- self.racer()
255+ if (filename.endswith('lazy_import.py') and
256+ function_name == self.method_to_trace):
257+ # We don't need to trace any more.
258+ sys.settrace(None)
259+ # Run another racer. This one will "win" the race.
260+ self.racer()
261 return self.tracer
262
263- def run_race(self, racer):
264+ def run_race(self, racer, method_to_trace='_resolve'):
265+ lazy_import.ScopeReplacer._should_proxy = True
266 self.racer = racer
267+ self.method_to_trace = method_to_trace
268 sys.settrace(self.tracer)
269- self.racer() # Should not raise an AttributeError
270- # Make sure the tracer actually found the code it was looking for. If
271- # not, maybe the code was refactored in such a way that these tests
272- # aren't needed any more.
273+ self.racer() # Should not raise any exception
274+ # Make sure the tracer actually found the code it was
275+ # looking for. If not, maybe the code was refactored in
276+ # such a way that these tests aren't needed any more.
277 self.assertEqual(None, sys.gettrace())
278
279 def test_call(self):
280 def factory(*args):
281 return factory
282- replacer = lazy_import.ScopeReplacer({}, factory, 'name')
283+ replacer = ProxyingReplacer({}, factory, 'name')
284 self.run_race(replacer)
285
286 def test_setattr(self):
287@@ -1225,7 +1235,7 @@
288 def factory(*args):
289 return Replaced()
290
291- replacer = lazy_import.ScopeReplacer({}, factory, 'name')
292+ replacer = ProxyingReplacer({}, factory, 'name')
293
294 def racer():
295 replacer.foo = 42
296@@ -1239,7 +1249,7 @@
297 def factory(*args):
298 return Replaced()
299
300- replacer = lazy_import.ScopeReplacer({}, factory, 'name')
301+ replacer = ProxyingReplacer({}, factory, 'name')
302
303 def racer():
304 replacer.foo