Merge lp:~allenap/maas/lazy-module-loading into lp:maas/trunk

Proposed by Gavin Panella
Status: Rejected
Rejected by: MAAS Lander
Proposed branch: lp:~allenap/maas/lazy-module-loading
Merge into: lp:maas/trunk
Diff against target: 377 lines (+292/-13)
4 files modified
src/provisioningserver/utils/__init__.py (+8/-9)
src/provisioningserver/utils/lazy.py (+137/-0)
src/provisioningserver/utils/tests/test_lazy.py (+145/-0)
src/provisioningserver/utils/tests/test_utils.py (+2/-4)
To merge this branch: bzr merge lp:~allenap/maas/lazy-module-loading
Reviewer Review Type Date Requested Status
Mike Pontillo (community) Needs Information
Review via email: mp+274405@code.launchpad.net

Commit message

A lazy module loader to take away those circular import blues.

Description of the change

This has been sitting around since June and I finally got around to finishing it off when travelling home from Seattle. This contains the module importer and an example or two of its use. There's a follow-up branch that replaces most/all of the remaining circular imports with it.

To post a comment you must log in.
Revision history for this message
Mike Pontillo (mpontillo) wrote :

I think this is a novel way to solve the problem, and I think it could be a step in the right direction.

However, I'm not sure we should land it this late in 1.9. It seems risky to introduce the extra logic now. It seems you are creating "proxy" objects, and I don't fully understand how all of the proxying logic works. For example, do the objects returned by the lazy importer obey "normal" python conventions, such as returning the appropriate type when isinstance() is called?

It seems that another way to solve the circular import problem would be to import modules rather than functions at the top level, so that the specific dependencies we need aren't resolved until the correct time. But that would lead to code like:

from maasserver.models import fabric

...

def foo():
    f = fabric.Fabric()
    # or...
    Fabric = fabric.Fabric
    f = Fabric()

... which seems less than ideal. (but perhaps no worse than the extra 'import' line we have now, or writing a few hundred lines of lazy-importing code!)

review: Needs Information
Revision history for this message
Mike Pontillo (mpontillo) wrote :

Just thought of another thing - this may throw off code introspection tools (such as IPython, PyCharm, etc), which gives me pause...

Revision history for this message
Gavin Panella (allenap) wrote :
Download full text (3.5 KiB)

> I think this is a novel way to solve the problem, and I think it could
> be a step in the right direction.
>
> However, I'm not sure we should land it this late in 1.9. It seems
> risky to introduce the extra logic now. It seems you are creating
> "proxy" objects, and I don't fully understand how all of the proxying
> logic works. For example, do the objects returned by the lazy importer
> obey "normal" python conventions, such as returning the appropriate
> type when isinstance() is called?

https://docs.python.org/2/reference/datamodel.html#customizing-instance-and-subclass-checks
says that isinstance() and issubclass() are implemented by calling
__instancecheck__() and __subclasscheck__() on the type being checked
against, i.e.:

    isintance(thing, TypeA) --> TypeA.__instancecheck__(thing)
    issubclass(TypeA, TypeB) --> TypeB.__subclasscheck__(TypeA)

The Stub instances that this branch deposits override __getattribute__()
to forward attribute look-ups to the lazily-loaded object. This method
always called, even when accessing preexisting attributes. The similar
__getattr__() method is only called for attributes that don't exist.

The Stub instances also override __setattr__(), __delattr__(), and
__call__() but it's possible these are not actually required. I will try
removing them. I will also add isinstance() and issubclass() checks.

It's possible that the one thing that these Stub instances won't fool is
an `is` check, e.g. `stub is Something`, because `is` is not implemented
as a call to a __double_underscore__() method.

>
> It seems that another way to solve the circular import problem would
> be to import modules rather than functions at the top level, so that
> the specific dependencies we need aren't resolved until the correct
> time. But that would lead to code like:

This is a good point. It works up to a point. Subclassing, for example,
demands that we dereference names within the module. However, the code
in this branch can't help that situation either.

We may be able to solve all circular reference problems by only ever
importing modules. We might need to avoid code in packages' __init__.py
too.

This has a few downsides:

* Lint checks will catch fewer naming issues; none currently check that
  a name exists in the target module.

* Patching in tests, while not an ideal practice, would have to be done
  in the owning module rather than in the module under test. This can be
  problematic (patching too much) but nothing we can't work around.

* Readability. It helps to work with short names and we'd lose that.

* A small performance penalty, though nothing we ought to care about.

None of those are particularly bad. Your perspective has made me realise
that lazy-importing is (only?) useful for:

(a) Using short names for convenience and readability.

(b) Deferring work to increase start-up performance.

Now, (b) is irrelevant to us, and (a) might not be a big enough win to
justify this branch's code.

>
> from maasserver.models import fabric
>
> ...
>
> def foo():
> f = fabric.Fabric()
> # or...
> Fabric = fabric.Fabric
> f = Fabric()
>
> ... which seems less than ideal. (but perhaps no worse than th...

Read more...

Revision history for this message
MAAS Lander (maas-lander) wrote :

Transitioned to Git.

lp:maas has now moved from Bzr to Git.
Please propose your branches with Launchpad using Git.

git clone https://git.launchpad.net/maas

Unmerged revisions

4018. By Gavin Panella

Merge trunk, resolving minor conflicts.

4017. By Gavin Panella

Update tests and importer()'s docstring.

4016. By Gavin Panella

Actually add the tests.

4015. By Gavin Panella

Extract much of update_references() into gen_reference_updates().

4014. By Gavin Panella

Remove talk of changing frames; local variables can be read but not written.

4013. By Gavin Panella

Tests for lazy.find() and lazy.update_references().

4012. By Gavin Panella

Move the lazy module importer to a module of its own.

4011. By Gavin Panella

lazy() helper for circular imports and deferred loading, work in progress.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/provisioningserver/utils/__init__.py'
2--- src/provisioningserver/utils/__init__.py 2015-06-29 13:42:10 +0000
3+++ src/provisioningserver/utils/__init__.py 2015-10-14 14:00:38 +0000
4@@ -48,6 +48,7 @@
5 NoConnectionsAvailable,
6 NodeAlreadyExists,
7 )
8+from provisioningserver.utils import lazy
9 from provisioningserver.utils.twisted import (
10 asynchronous,
11 pause,
12@@ -66,6 +67,11 @@
13 )
14
15
16+region = lazy.importer("provisioningserver.rpc.region")
17+ClusterConfiguration = lazy.importer(
18+ "provisioningserver.config.ClusterConfiguration")
19+
20+
21 maaslog = get_maas_logger("utils")
22
23
24@@ -113,10 +119,6 @@
25 :param power_parameters: The power parameters for the node, as a
26 dict.
27 """
28- # Avoid circular dependencies.
29- from provisioningserver.rpc.region import CreateNode
30- from provisioningserver.config import ClusterConfiguration
31-
32 with ClusterConfiguration.open() as config:
33 cluster_uuid = config.cluster_uuid
34
35@@ -139,7 +141,7 @@
36 macs = sorted(set(macs))
37 try:
38 response = yield client(
39- CreateNode,
40+ region.CreateNode,
41 cluster_uuid=cluster_uuid,
42 architecture=arch,
43 power_type=power_type,
44@@ -181,9 +183,6 @@
45 :param system_id: system_id of node to commission.
46 :param user: user for the node.
47 """
48- # Avoid circular dependencies.
49- from provisioningserver.rpc.region import CommissionNode
50-
51 for elapsed, remaining, wait in retries(15, 5, reactor):
52 try:
53 client = getRegionClient()
54@@ -197,7 +196,7 @@
55
56 try:
57 yield client(
58- CommissionNode,
59+ region.CommissionNode,
60 system_id=system_id,
61 user=user)
62 except CommissionNodeFailed as e:
63
64=== added file 'src/provisioningserver/utils/lazy.py'
65--- src/provisioningserver/utils/lazy.py 1970-01-01 00:00:00 +0000
66+++ src/provisioningserver/utils/lazy.py 2015-10-14 14:00:38 +0000
67@@ -0,0 +1,137 @@
68+# Copyright 2015 Canonical Ltd. This software is licensed under the
69+# GNU Affero General Public License version 3 (see the file LICENSE).
70+
71+"""Doing things lazily, like importing modules."""
72+
73+from __future__ import (
74+ absolute_import,
75+ print_function,
76+ unicode_literals,
77+ )
78+
79+str = None
80+
81+__metaclass__ = type
82+__all__ = [
83+ "importer",
84+]
85+
86+from collections import (
87+ MutableMapping,
88+ MutableSequence,
89+ MutableSet,
90+)
91+import ctypes
92+from functools import partial
93+from gc import get_referrers
94+from importlib import import_module
95+from operator import setitem
96+from threading import RLock
97+
98+# A Python `cell`, used internally to implement variables referenced by
99+# multiple scopes. See https://docs.python.org/2/c-api/cell.html.
100+PyCell = type((lambda o: lambda: o)(0).func_closure[0])
101+
102+# Mechanism to delve into Python's brain to *set* the contents of cells.
103+PyCell_Set = ctypes.pythonapi.PyCell_Set
104+
105+
106+def find(name):
107+ """Find the given `name` as a module or object within."""
108+ try:
109+ return import_module(name)
110+ except ImportError:
111+ name, _, last = name.rpartition(".")
112+ if len(name) > 0:
113+ try:
114+ return getattr(find(name), last)
115+ except AttributeError:
116+ raise ImportError(
117+ "Cannot import %s from %s" % (last, name))
118+ else:
119+ raise
120+
121+
122+def gen_reference_updates(placeholder, actual):
123+ """Generate the updates needed to replace `placeholder` with `actual`.
124+
125+ Currently this can only replaces references within mutable mappings,
126+ lists, and sets, and within cells. This returns no-argument callables;
127+ these should be called only *after* exhausting this generator.
128+ """
129+ for referrer in get_referrers(placeholder):
130+ if isinstance(referrer, MutableMapping):
131+ for key, value in referrer.viewitems():
132+ if value is placeholder:
133+ yield partial(setitem, referrer, key, actual)
134+ elif isinstance(referrer, MutableSequence):
135+ for index, value in enumerate(referrer):
136+ if value is placeholder:
137+ yield partial(setitem, referrer, index, actual)
138+ elif isinstance(referrer, MutableSet):
139+ yield partial(referrer.discard, placeholder)
140+ yield partial(referrer.add, actual)
141+ elif isinstance(referrer, PyCell):
142+ # This was an attempt to get updates in dicts working, but it does
143+ # not appear to have any effect in that regard.
144+ yield partial(
145+ PyCell_Set, ctypes.py_object(referrer),
146+ ctypes.py_object(actual))
147+ else:
148+ # Don't yet know how to handle this. For example, this might be a
149+ # frame (it's not possible to update locals without major surgery)
150+ # or an immutable container.
151+ pass
152+
153+
154+def update_references(placeholder, actual):
155+ """Update references to `placeholder` with `actual`.
156+
157+ See `gen_reference_updates` for limitations.
158+ """
159+ updates = gen_reference_updates(placeholder, actual)
160+ for update in list(updates):
161+ update()
162+
163+
164+def importer(name):
165+ """Import the given module lazily.
166+
167+ This returns a stub object that proxies attribute access and calls through
168+ to the lazily imported object. On first attribute access, or on the first
169+ call, the stub will be replaced in modules, dicts, lists, and sets
170+ throughout the running interpreter.
171+
172+ There are some limitations. It can't replace references in local
173+ variables. It can't replace references to objects without a ``__dict__``
174+ attribute when that reference is within a dict.
175+
176+ :param name: The dotted path to a module or object within that module,
177+ e.g. "provisioningserver.config.ClusterConfiguration".
178+ """
179+
180+ def load(store=[], lock=RLock()):
181+ with lock:
182+ if len(store) == 0:
183+ thing = find(name)
184+ update_references(stub, thing)
185+ store.append(thing)
186+ return store[0]
187+
188+ class Stub:
189+
190+ def __getattribute__(self, name):
191+ return getattr(load(), name)
192+
193+ def __setattr__(self, name, value):
194+ return setattr(load(), name, value)
195+
196+ def __delattr__(self, name):
197+ return delattr(load(), name)
198+
199+ def __call__(self, *args, **kwargs):
200+ return load()(*args, **kwargs)
201+
202+ stub = Stub()
203+
204+ return stub
205
206=== added file 'src/provisioningserver/utils/tests/test_lazy.py'
207--- src/provisioningserver/utils/tests/test_lazy.py 1970-01-01 00:00:00 +0000
208+++ src/provisioningserver/utils/tests/test_lazy.py 2015-10-14 14:00:38 +0000
209@@ -0,0 +1,145 @@
210+# Copyright 2012-2015 Canonical Ltd. This software is licensed under the
211+# GNU Affero General Public License version 3 (see the file LICENSE).
212+
213+"""Tests for `provisioningserver.utils.lazy`."""
214+
215+from __future__ import (
216+ absolute_import,
217+ print_function,
218+ unicode_literals,
219+ )
220+
221+str = None
222+
223+__metaclass__ = type
224+__all__ = []
225+
226+import sys
227+
228+from maastesting.factory import factory
229+from maastesting.testcase import MAASTestCase
230+from provisioningserver.utils import lazy
231+from testtools.matchers import Contains
232+
233+
234+class TestFind(MAASTestCase):
235+ """Tests for `provisioningserver.utils.lazy.find`."""
236+
237+ def test_finds_builtin_module(self):
238+ self.assertIs(sys, lazy.find("sys"))
239+
240+ def test_finds_module_on_sys_path(self):
241+ self.assertIs(lazy, lazy.find("provisioningserver.utils.lazy"))
242+
243+ def test_finds_attribute_in_builtin(self):
244+ self.assertIs(sys.exit, lazy.find("sys.exit"))
245+
246+ def test_finds_attribute_in_module_on_sys_path(self):
247+ self.assertIs(lazy.find, lazy.find(
248+ "provisioningserver.utils.lazy.find"))
249+
250+ def test_raises_ImportError_when_module_cannot_be_found(self):
251+ name = factory.make_name()
252+ error = self.assertRaises(ImportError, lazy.find, name)
253+ self.assertEqual("No module named " + name, unicode(error))
254+
255+ def test_raises_ImportError_when_attribute_cannot_be_found(self):
256+ name = factory.make_name("attr")
257+ error = self.assertRaises(ImportError, lazy.find, "sys." + name)
258+ self.assertEqual("Cannot import %s from sys" % name, unicode(error))
259+
260+
261+class TestUpdateReferences(MAASTestCase):
262+ """Tests for `provisioningserver.utils.lazy.update_references`."""
263+
264+ def test_does_not_update_locals(self):
265+ ref = object()
266+ lazy.update_references(ref, sys)
267+ self.assertIsNot(ref, sys)
268+
269+ def test_updates_reference_in_module(self):
270+ global ref_in_module
271+ ref_in_module = object()
272+ self.assertIsNot(sys, ref_in_module)
273+ lazy.update_references(ref_in_module, sys)
274+ self.assertIs(sys, ref_in_module)
275+
276+ def test_updates_reference_in_dict_value(self):
277+ thing = type(b"Thing", (object,), {})()
278+ ref_in_dict = {None: thing}
279+ self.assertIsNot(sys, ref_in_dict[None])
280+ lazy.update_references(thing, sys)
281+ self.assertIs(sys, ref_in_dict[None])
282+
283+ def test_does_not_update_reference_in_dict_key(self):
284+ thing = type(b"Thing", (object,), {})()
285+ ref_in_dict = {thing: None}
286+ self.assertThat(ref_in_dict, Contains(thing))
287+ lazy.update_references(thing, sys)
288+ self.assertThat(ref_in_dict, Contains(thing))
289+
290+ def test_does_not_update_reference_to_dictless_object_in_dict_value(self):
291+ thing = type(b"Thing", (object,), {b"__slots__": ()})()
292+ ref_in_dict = {None: thing}
293+ self.assertIsNot(sys, ref_in_dict[None])
294+ lazy.update_references(ref_in_dict[None], sys)
295+ self.assertIsNot(sys, ref_in_dict[None])
296+
297+ def test_does_not_update_reference_to_dictless_object_in_dict_key(self):
298+ thing = type(b"Thing", (object,), {b"__slots__": ()})()
299+ ref_in_dict = {thing: None}
300+ self.assertThat(ref_in_dict, Contains(thing))
301+ lazy.update_references(thing, sys)
302+ self.assertThat(ref_in_dict, Contains(thing))
303+
304+ def test_updates_reference_in_list(self):
305+ ref_in_list = [object()]
306+ self.assertIsNot(sys, ref_in_list[0])
307+ lazy.update_references(ref_in_list[0], sys)
308+ self.assertIs(sys, ref_in_list[0])
309+
310+ def test_updates_reference_in_set(self):
311+ ref_in_set = {object()}
312+ self.assertIsNot(sys, list(ref_in_set)[0])
313+ lazy.update_references(list(ref_in_set)[0], sys)
314+ self.assertIs(sys, list(ref_in_set)[0])
315+
316+
317+class TestImporter(MAASTestCase):
318+ """Tests for `provisioningserver.utils.lazy.importer`."""
319+
320+ def test_returns_stub(self):
321+ stub = lazy.importer("sys")
322+ self.assertIsNot(stub, sys)
323+ self.assertEqual("Stub", type(stub).__name__)
324+
325+ def test_does_not_update_import_in_locals(self):
326+ stub_in_locals = lazy.importer("sys")
327+ self.assertIsNot(sys, stub_in_locals)
328+ stub_in_locals.exit # Look-up an attribute.
329+ self.assertIsNot(sys, stub_in_locals)
330+
331+ def test_updates_import_in_module(self):
332+ global stub_in_module
333+ stub_in_module = lazy.importer("sys")
334+ self.assertIsNot(sys, stub_in_module)
335+ stub_in_module.exit # Look-up an attribute.
336+ self.assertIs(sys, stub_in_module)
337+
338+ def test_updates_import_in_dict(self):
339+ stub_in_dict = {None: lazy.importer("sys")}
340+ self.assertIsNot(sys, stub_in_dict[None])
341+ stub_in_dict[None].exit # Look-up an attribute.
342+ self.assertIs(sys, stub_in_dict[None])
343+
344+ def test_updates_import_in_list(self):
345+ stub_in_list = [lazy.importer("sys")]
346+ self.assertIsNot(sys, stub_in_list[0])
347+ stub_in_list[0].exit # Look-up an attribute.
348+ self.assertIs(sys, stub_in_list[0])
349+
350+ def test_updates_import_in_set(self):
351+ stub_in_set = {lazy.importer("sys")}
352+ self.assertIsNot(sys, list(stub_in_set)[0])
353+ list(stub_in_set)[0].exit # Look-up an attribute.
354+ self.assertIs(sys, list(stub_in_set)[0])
355
356=== modified file 'src/provisioningserver/utils/tests/test_utils.py'
357--- src/provisioningserver/utils/tests/test_utils.py 2015-06-29 13:42:10 +0000
358+++ src/provisioningserver/utils/tests/test_utils.py 2015-10-14 14:00:38 +0000
359@@ -487,9 +487,8 @@
360 system_id = factory.make_name("system-id")
361 protocol.CreateNode.return_value = defer.succeed(
362 {"system_id": system_id})
363- get_cluster_uuid = self.patch(
364- provisioningserver.utils, 'get_cluster_uuid')
365- get_cluster_uuid.return_value = 'cluster-' + factory.make_UUID()
366+ test_uuid = factory.make_UUID()
367+ self.useFixture(ClusterConfigurationFixture(cluster_uuid=test_uuid))
368
369 uuid = 'node-' + factory.make_UUID()
370 macs = sorted(factory.make_mac_address() for _ in range(3))
371@@ -513,7 +512,6 @@
372 system_id = factory.make_name("system-id")
373 protocol.CreateNode.return_value = defer.succeed(
374 {"system_id": system_id})
375-
376 test_uuid = factory.make_UUID()
377 self.useFixture(ClusterConfigurationFixture(cluster_uuid=test_uuid))
378