Merge lp:~therve/storm/cache-compile into lp:storm

Proposed by Thomas Herve
Status: Merged
Merged at revision: 400
Proposed branch: lp:~therve/storm/cache-compile
Merge into: lp:storm
Diff against target: 282 lines (+40/-26)
4 files modified
storm/expr.py (+14/-5)
storm/info.py (+10/-11)
tests/expr.py (+7/-1)
tests/info.py (+9/-9)
To merge this branch: bzr merge lp:~therve/storm/cache-compile
Reviewer Review Type Date Requested Status
Gustavo Niemeyer Approve
Jamu Kakar (community) Approve
Free Ekanayaka (community) Approve
Review via email: mp+71473@code.launchpad.net

Description of the change

Here it is. The branch adds a new attribute on Table and Column to cache the compilation. AFAIU, those compilations don't depend on the state, so we can store directly the value. It also changes the ClassInfo to store the table as a Table object instead of a SQLToken, to go through the compile_table logic.

I made a quick benchmark to show the gains:

                        | Trunk | Branch | Gain
10 inserts, 3 colums | 1.68s | 1.55s | 7.8%
100 insersts, 3 columns | 14.15s | 12.79s | 9.6%
20 insersts, 10 columns | 5.38s | 4.71s | 12.4%

I think it's worth it considering the low impact of the changes.

To post a comment you must log in.
Revision history for this message
Gustavo Niemeyer (niemeyer) wrote :

Nice!

Would you mind to repeat the benchmark with 1000 inserts and 10 columns each, to see a bit better the effects? The first number feels a bit suspect (hard to believe 10 inserts would take 1.6 seconds).

review: Needs Information
Revision history for this message
Thomas Herve (therve) wrote :

Gustavo: sorry this was explicit, but it's run 1000 times over.

Revision history for this message
Thomas Herve (therve) wrote :

Here's the script I used, with the last run: http://pastebin.ubuntu.com/666307/

Revision history for this message
Free Ekanayaka (free.ekanayaka) wrote :

Hi Thomas,

that's indeed significantly faster for use cases similar to the one in the attached script.

+1

[1]

- self.table = SQLToken(self.table)
+ self.table = Table(self.table)

I guess this is to make use of the cache? Are SQLToken and Table fully interchangeable in this case?

[2]

It'd be good to add a couple of unit tests for the new behavior.

review: Approve
lp:~therve/storm/cache-compile updated
399. By Thomas Herve

Add tests for compile_cache value

Revision history for this message
Thomas Herve (therve) wrote :

[1] Yes, it's for using the cache. table has been documented as an expression, so I don't see any problem changing it. The compilation passes token=True which results to the same result.

[2] I added some tests.

Thanks!

Revision history for this message
Jamu Kakar (jkakar) wrote :

Looks good to me, +1!

review: Approve
Revision history for this message
Gustavo Niemeyer (niemeyer) wrote :

Thomas, I mentioned 1000 inserts, not running it 1000 times.

Either way, sounds like it's improved things in practice, so +1.

review: Approve
Revision history for this message
Thomas Herve (therve) wrote :

Right, with 1000 inserts

Branch: 0.227118968964
Trunk: 0.256669044495

So around 11% faster.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'storm/expr.py'
2--- storm/expr.py 2011-05-16 10:39:36 +0000
3+++ storm/expr.py 2011-08-24 11:55:39 +0000
4@@ -800,7 +800,8 @@
5 @ivar variable_factory: Factory producing C{Variable} instances typed
6 according to this column.
7 """
8- __slots__ = ("name", "table", "primary", "variable_factory")
9+ __slots__ = ("name", "table", "primary", "variable_factory",
10+ "compile_cache")
11
12 def __init__(self, name=Undef, table=Undef, primary=False,
13 variable_factory=None):
14@@ -808,6 +809,7 @@
15 self.table = table
16 self.primary = int(primary)
17 self.variable_factory = variable_factory or Variable
18+ self.compile_cache = None
19
20 @compile.when(Column)
21 def compile_column(compile, column, state):
22@@ -819,11 +821,15 @@
23 alias = state.aliases.get(column)
24 if alias is not None:
25 return compile(alias.name, state, token=True)
26- return compile(column.name, state, token=True)
27+ if column.compile_cache is None:
28+ column.compile_cache = compile(column.name, state, token=True)
29+ return column.compile_cache
30 state.push("context", COLUMN_PREFIX)
31 table = compile(column.table, state, token=True)
32 state.pop()
33- return "%s.%s" % (table, compile(column.name, state, token=True))
34+ if column.compile_cache is None:
35+ column.compile_cache = compile(column.name, state, token=True)
36+ return "%s.%s" % (table, column.compile_cache)
37
38 @compile_python.when(Column)
39 def compile_python_column(compile, column, state):
40@@ -870,14 +876,17 @@
41
42
43 class Table(FromExpr):
44- __slots__ = ("name",)
45+ __slots__ = ("name", "compile_cache")
46
47 def __init__(self, name):
48 self.name = name
49+ self.compile_cache = None
50
51 @compile.when(Table)
52 def compile_table(compile, table, state):
53- return compile(table.name, state, token=True)
54+ if table.compile_cache is None:
55+ table.compile_cache = compile(table.name, state, token=True)
56+ return table.compile_cache
57
58
59 class JoinExpr(FromExpr):
60
61=== modified file 'storm/info.py'
62--- storm/info.py 2008-06-20 06:50:47 +0000
63+++ storm/info.py 2011-08-24 11:55:39 +0000
64@@ -18,11 +18,11 @@
65 # You should have received a copy of the GNU Lesser General Public License
66 # along with this program. If not, see <http://www.gnu.org/licenses/>.
67 #
68-from weakref import ref, WeakKeyDictionary
69+from weakref import ref
70
71 from storm.exceptions import ClassInfoError
72-from storm.expr import Expr, FromExpr, Column, Desc, TABLE
73-from storm.expr import SQLToken, CompileError, compile
74+from storm.expr import Column, Desc, TABLE
75+from storm.expr import compile, Table
76 from storm.event import EventSystem
77 from storm import Undef, has_cextensions
78
79@@ -40,9 +40,11 @@
80 obj_info = ObjectInfo(obj)
81 return obj.__dict__.setdefault("__storm_object_info__", obj_info)
82
83+
84 def set_obj_info(obj, obj_info):
85 obj.__dict__["__storm_object_info__"] = obj_info
86
87+
88 def get_cls_info(cls):
89 if "__storm_class_info__" in cls.__dict__:
90 # Can't use attribute access here, otherwise subclassing won't work.
91@@ -51,6 +53,7 @@
92 cls.__storm_class_info__ = ClassInfo(cls)
93 return cls.__storm_class_info__
94
95+
96 class ClassInfo(dict):
97 """Persistent storm-related information of a class.
98
99@@ -71,7 +74,7 @@
100 self.cls = cls
101
102 if isinstance(self.table, basestring):
103- self.table = SQLToken(self.table)
104+ self.table = Table(self.table)
105
106 pairs = []
107 for attr in dir(cls):
108@@ -79,7 +82,6 @@
109 if isinstance(column, Column):
110 pairs.append((attr, column))
111
112-
113 pairs.sort()
114
115 self.columns = tuple(pair[1] for pair in pairs)
116@@ -121,7 +123,6 @@
117 self.primary_key_pos = tuple(id_positions[id(column)]
118 for column in self.primary_key)
119
120-
121 __order__ = getattr(cls, "__storm_order__", None)
122 if __order__ is None:
123 self.default_order = Undef
124@@ -151,7 +152,7 @@
125 __hash__ = object.__hash__
126
127 # For get_obj_info(), an ObjectInfo is its own obj_info.
128- __storm_object_info__ = property(lambda self:self)
129+ __storm_object_info__ = property(lambda self: self)
130
131 def __init__(self, obj):
132 # FASTPATH This method is part of the fast path. Be careful when
133@@ -171,7 +172,7 @@
134 column.variable_factory(column=column,
135 event=event,
136 validator_object_factory=self.get_obj)
137-
138+
139 self.primary_vars = tuple(variables[column]
140 for column in self.cls_info.primary_key)
141
142@@ -199,7 +200,6 @@
143 from storm.cextensions import ObjectInfo, get_obj_info
144
145
146-
147 class ClassAlias(object):
148 """Create a named alias for a Storm class for use in queries.
149
150@@ -229,8 +229,7 @@
151 cls._storm_alias_cache = {}
152 elif name in cache:
153 return cache[name]
154- cls_info = get_cls_info(cls)
155- alias_cls = type(cls.__name__+"Alias", (self_cls,),
156+ alias_cls = type(cls.__name__ + "Alias", (self_cls,),
157 {"__storm_table__": name})
158 alias_cls.__bases__ = (cls, self_cls)
159 alias_cls_info = get_cls_info(alias_cls)
160
161=== modified file 'tests/expr.py'
162--- tests/expr.py 2011-05-11 05:38:41 +0000
163+++ tests/expr.py 2011-08-24 11:55:39 +0000
164@@ -135,6 +135,7 @@
165 expr = Column()
166 self.assertEquals(expr.name, Undef)
167 self.assertEquals(expr.table, Undef)
168+ self.assertIdentical(expr.compile_cache, None)
169
170 # Test for identity. We don't want False there.
171 self.assertTrue(expr.primary is 0)
172@@ -1007,13 +1008,16 @@
173 statement = compile(expr, state)
174 self.assertEquals(statement, "column1")
175 self.assertEquals(state.parameters, [])
176+ self.assertEquals(expr.compile_cache, "column1")
177
178 def test_column_table(self):
179- expr = Select(Column(column1, Func1()))
180+ column = Column(column1, Func1())
181+ expr = Select(column)
182 state = State()
183 statement = compile(expr, state)
184 self.assertEquals(statement, "SELECT func1().column1 FROM func1()")
185 self.assertEquals(state.parameters, [])
186+ self.assertEquals(column.compile_cache, "column1")
187
188 def test_column_contexts(self):
189 table, = track_contexts(1)
190@@ -1579,10 +1583,12 @@
191
192 def test_table(self):
193 expr = Table(table1)
194+ self.assertIdentical(expr.compile_cache, None)
195 state = State()
196 statement = compile(expr, state)
197 self.assertEquals(statement, '"table 1"')
198 self.assertEquals(state.parameters, [])
199+ self.assertEquals(expr.compile_cache, '"table 1"')
200
201 def test_alias(self):
202 expr = Alias(Table(table1), "name")
203
204=== modified file 'tests/info.py'
205--- tests/info.py 2008-06-18 17:28:37 +0000
206+++ tests/info.py 2011-08-24 11:55:39 +0000
207@@ -94,7 +94,7 @@
208 (self.Class.prop1, self.Class.prop2))
209
210 def test_table(self):
211- self.assertEquals(self.cls_info.table, "table")
212+ self.assertEquals(self.cls_info.table.name, "table")
213
214 def test_primary_key(self):
215 # Can't use == for props.
216@@ -202,7 +202,7 @@
217
218 def test_variables(self):
219 self.assertTrue(isinstance(self.obj_info.variables, dict))
220-
221+
222 for column in self.cls_info.columns:
223 variable = self.obj_info.variables.get(column)
224 self.assertTrue(isinstance(variable, Variable))
225@@ -227,7 +227,7 @@
226
227 def test_primary_vars(self):
228 self.assertTrue(isinstance(self.obj_info.primary_vars, tuple))
229-
230+
231 for column, variable in zip(self.cls_info.primary_key,
232 self.obj_info.primary_vars):
233 self.assertEquals(self.obj_info.variables.get(column),
234@@ -316,7 +316,7 @@
235 self.obj_info.checkpoint()
236
237 obj = object()
238-
239+
240 self.obj_info.event.hook("changed", object_changed1, obj)
241 self.obj_info.event.hook("changed", object_changed2, obj)
242
243@@ -326,7 +326,7 @@
244 self.assertEquals(changes1,
245 [(1, self.obj_info, self.variable2, Undef, 10, False, obj),
246 (1, self.obj_info, self.variable1, Undef, 20, False, obj)])
247- self.assertEquals(changes2,
248+ self.assertEquals(changes2,
249 [(2, self.obj_info, self.variable2, Undef, 10, False, obj),
250 (2, self.obj_info, self.variable1, Undef, 20, False, obj)])
251
252@@ -339,7 +339,7 @@
253 self.assertEquals(changes1,
254 [(1, self.obj_info, self.variable1, 20, None, False, obj),
255 (1, self.obj_info, self.variable2, 10, None, False, obj)])
256- self.assertEquals(changes2,
257+ self.assertEquals(changes2,
258 [(2, self.obj_info, self.variable1, 20, None, False, obj),
259 (2, self.obj_info, self.variable2, 10, None, False, obj)])
260
261@@ -352,7 +352,7 @@
262 self.assertEquals(changes1,
263 [(1, self.obj_info, self.variable1, None, Undef, False, obj),
264 (1, self.obj_info, self.variable2, None, Undef, False, obj)])
265- self.assertEquals(changes2,
266+ self.assertEquals(changes2,
267 [(2, self.obj_info, self.variable1, None, Undef, False, obj),
268 (2, self.obj_info, self.variable2, None, Undef, False, obj)])
269
270@@ -528,11 +528,11 @@
271 prop1 = Property("column1", primary=True)
272 self.Class = Class
273 self.ClassAlias = ClassAlias(self.Class, "alias")
274-
275+
276 def test_cls_info_cls(self):
277 cls_info = get_cls_info(self.ClassAlias)
278 self.assertEquals(cls_info.cls, self.Class)
279- self.assertEquals(cls_info.table, "alias")
280+ self.assertEquals(cls_info.table.name, "alias")
281 self.assertEquals(self.ClassAlias.prop1.name, "column1")
282 self.assertEquals(self.ClassAlias.prop1.table, self.ClassAlias)
283

Subscribers

People subscribed via source and target branches

to status/vote changes: