Merge lp:~gz/bzr/lru_cache_refactor into lp:bzr
Status: Merged
Approved by: Martin Packman
Approved revision: no longer in the source branch
Merged at revision: 6235
Proposed branch: lp:~gz/bzr/lru_cache_refactor
Merge into: lp:bzr
Diff against target: 655 lines (+139/-256), 4 files modified
  bzrlib/chk_map.py (+3/-3)
  bzrlib/lru_cache.py (+32/-103)
  bzrlib/tests/test_lru_cache.py (+101/-150)
  doc/en/release-notes/bzr-2.5.txt (+3/-0)
To merge this branch: bzr merge lp:~gz/bzr/lru_cache_refactor
Related bugs: (none)
Reviewer: Jelmer Vernooij (community), Approve
Review via email: mp+79439@code.launchpad.net
Commit message
Refactor and deprecate unused parts of lru_cache
Description of the change
Remove some complications from bzrlib.lru_cache that made it harder for me to understand. There should be no functional change for Bazaar.
The LRUCache and LRUSizeCache objects basically act like dicts, but may also evict items if a threshold is passed, starting with those touched 'least recently'.
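For context, a minimal sketch of such a dict-like LRU cache, built on `collections.OrderedDict` rather than bzrlib's hand-rolled linked list (the class name is invented for illustration):

```python
from collections import OrderedDict

class SimpleLRUCache:
    """Dict-like cache that evicts least-recently-used entries past a limit."""

    def __init__(self, max_cache=10):
        self._max_cache = max_cache
        self._cache = OrderedDict()

    def __getitem__(self, key):
        value = self._cache.pop(key)   # KeyError if absent, like a dict
        self._cache[key] = value       # re-insert => most recently used
        return value

    def __setitem__(self, key, value):
        self._cache.pop(key, None)
        self._cache[key] = value
        while len(self._cache) > self._max_cache:
            self._cache.popitem(last=False)   # drop least recently used

    def __len__(self):
        return len(self._cache)

cache = SimpleLRUCache(max_cache=2)
cache['a'] = 1
cache['b'] = 2
cache['a']           # touch 'a', so 'b' is now least recently used
cache['c'] = 3       # over the limit: evicts 'b'
print(sorted(cache._cache))   # ['a', 'c']
```
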
The branch addresses the following wrinkles:
* Deprecates the `items` method, which despite its name doesn't return a list of 2-tuples. It seems to be mostly for test usage, so it has been renamed to `as_dict`, which is what it actually does.
* There's a test helper method `walk_lru` that may as well live with the tests.
* Given that usage of LRUSizeCache in bzrlib sensibly sticks to storing simple immutable objects (or nearly so; GroupCompressBlock is a bit odd), their size may as well be recalculated on eviction. This is a judgement call: given bad objects that change size, this could make caches behave oddly. That's already the case if given a `_compute_size` function that returns funny values.
* Remove per-node callback functions. Given that bzrlib has no callers that want this, and to work properly they need a fair bit of fixing, this seems like the simplest option. In short, the issues are:
** The nomenclature is confusing. The caches have a cleanup method, and their nodes have a cleanup slot that calling cleanup may, or may not, call.
** LRUSizeCache.add doesn't do the right thing with cleanup callbacks. It uses the one passed in with the new node if an old node is being replaced and there's not room for the new node, rather than the function from the slot. Otherwise when replacing a node, it doesn't call cleanup on the old node at all.
** The cleanup is per-node, but even the tests for it use it for a per-cache callback only.
** It'd be risky anyway to put things that need explicit finalizing in a cache like this, which may keep them alive indefinitely.
** The `add` method that exposed the cleanup object has been deprecated in favour of just using the dict-style access that nearly all callers use anyway. This avoids mixing up the check for removed functionality with live code.
** It seems that bzrlib.fifo_cache has a similar cleanup callback interface, but if it's not being used for lru_cache I see no reason to keep it around.
** I tried having a test-only cache with a cleanup callback, but basically the tests that need it are only testing things to do with the callback, the other tests do well enough at ensuring the caches behave in a sane manner.
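As an illustration of the recompute-on-eviction approach described above, here is a minimal size-bounded sketch (not bzrlib's implementation, which uses a hand-rolled linked list; the class name is invented, and `_compute_size` defaults to `len` as an assumption):

```python
from collections import OrderedDict

_MISSING = object()

class SimpleLRUSizeCache:
    """Size-bounded LRU sketch: evicts oldest entries once the tracked
    total exceeds max_size. Sizes are recomputed at eviction time,
    as this branch proposes, rather than stored per node."""

    def __init__(self, max_size, compute_size=len):
        self._max_size = max_size
        self._compute_size = compute_size
        self._cache = OrderedDict()
        self._value_size = 0

    def __setitem__(self, key, value):
        old = self._cache.pop(key, _MISSING)
        if old is not _MISSING:
            self._value_size -= self._compute_size(old)
        if self._compute_size(value) > self._max_size:
            return   # value too large to ever fit: not cached at all
        self._cache[key] = value
        self._value_size += self._compute_size(value)
        while self._value_size > self._max_size:
            _, evicted = self._cache.popitem(last=False)
            # size recomputed here, not read from a per-node slot
            self._value_size -= self._compute_size(evicted)

cache = SimpleLRUSizeCache(max_size=10)
cache['a'] = '12345'      # tracked size 5
cache['b'] = '123456'     # would be 11 > 10, so 'a' is evicted
print(list(cache._cache), cache._value_size)   # ['b'] 6
```
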
Martin Packman (gz) wrote:
> This (and the deprecation of 'add') seems like a worthwhile thing to document
> in the release notes, in case external users rely on it. bzr-git uses
> LRUSizeCache for example.
Noted. I take it bzr-git just uses LRUSizeCache as a dict like everything else?
> In general, this seems like a reasonable cleanup (excuse the pun :-P).
More puns please!
John A Meinel (jameinel) wrote:
On 10/14/2011 10:12 PM, Martin Packman wrote:
> Martin Packman has proposed merging lp:~gz/bzr/lru_cache_refactor
> into lp:bzr.
>
> Requested reviews: bzr-core (bzr-core)
>
> For more details, see:
> https:/
>
> Remove some complications from bzrlib.lru_cache that made it harder
> for me to understand. There should be no functional change for
> Bazaar.
>
> The LRUCache and LRUSize cache objects basically act like dicts,
> but may also evict items if a threshold is passed, starting with
> those touched 'least recently'.
>
>
> The branch addresses the following wrinkles:
>
> * Deprecates the `items` method that doesn't return a list of
> 2-tuples. This seems to be mostly for test usage so has been
> renamed to `as_dict` which is what it actually does.
>
> * There's a test helper method `walk_lru` that may as well live
> with the tests.
>
> * Given that usage of LRUSizeCache in bzrlib sensibly sticks to
> using to store simple immutable (or sort of at least,
> GroupCompressBlock is a bit odd) objects, their size may as well be
> recalculated on eviction. This is a judgement call, given bad
> objects that change size this could make caches behave oddly.
> That's already the case if given a `_compute_size` function that
> returns funny values.
If something's size does change, then this quickly messes up the size
tracking. I think it is reasonable to eject an item with the size that
you added it.
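The drift John is describing can be shown with a tiny worked example of the accounting: if the size is recomputed at eviction time and the cached object has grown in the meantime, the tracked total goes wrong (here, negative):

```python
# Accounting sketch: add using size-at-insert, subtract using a
# recomputed size-at-eviction (the approach under discussion).
value_size = 0
item = [1, 2]
value_size += len(item)      # inserted at size 2
item.extend([3, 4, 5])       # cached object mutates to size 5
value_size -= len(item)      # evicted at recomputed size 5
print(value_size)            # -3: the tracked total has drifted
```

Storing the size at insertion, as the old code did, keeps the total consistent at the cost of one extra slot per node.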
>
> * Remove per-node callback functions. Given that bzrlib has no
> callers that want this, and to work properly they need a fair bit
> of fixing, this seems like the simplest option. In short, the
> issues are:
>
I do believe there is already 3rd party code (Launchpad, loggerhead?)
that makes use of this. bzr-cvsps-import I think also uses it.
John
=:->
> ** The nomenclature is confusing. The caches have a cleanup method,
> and their nodes have a cleanup slot that calling cleanup may, or
> may not, call.
>
> ** LRUSizeCache.add doesn't do the right thing with cleanup
> callbacks. It uses the one passed in with the new node if an old
> node is being replaced and there's not room for the new node,
> rather than the function from the slot. Otherwise when replacing a
> node, it doesn't call cleanup on the old node at all.
>
> ** The cleanup is per-node, but even the tests for it use it for a
> per-cache callback only.
bzr-cvsps-import passes in Branch objects, and uses their
Branch.unlock() as the cleanup callback. Which does use a per-node
callback (every branch's unlock is different). We can certainly update
the tests to reflect this.
>
> ** It'd be risky anyway to put things that need explict finalizing
> in a cache like this which may keep them alive indefinitely.
>
> ** The `add` method that exposed the cleanup object has been
> deprecated in favour of just using the dict-style access that
> nearly all callers use anyway. This avoids mixing up the check for
> removed functionality with live code.
>
> ** It seems that bzrlib.fifo_cache has a similar cleanup callback
> interface, but if it's not being used for lru_cache I see...
Martin Packman (gz) wrote:
> If something's size does change, then this quickly messes up the size
> tracking. I think it is reasonable to eject an item with the size that
> you added it.
This is true, but if cached objects change size, that already makes the cache dangerous rather than useful. I guess it's a question of how much we want to defend against misuse.
> bzr-cvsps-import passes in Branch objects, and uses their
> Branch.unlock() as the cleanup callback. Which does use a per-node
> callback (every branch's unlock is different). We can certainly update
> the tests to reflect this.
My initial thought was to add a new base class without the per-node cleanup, move all the bzrlib users to it, and write a small subclass under the current name that adds callbacks. I could still do this, especially if there is code in launchpad or loggerhead that wants it.
I'm unconvinced by the cvsps use case though, caching locked branches doesn't seem a great idea to start with. It would be simple enough to do this with a per-cache callback though, with something like `getattr(value, "unlock")()` but... wait, that's not needed: the plugin has its own lru_cache module and doesn't use the one in bzrlib at all.
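The per-cache callback alternative sketched here could look something like the following (hypothetical code, not bzrlib's API; `FakeBranch` stands in for a lockable object):

```python
from collections import OrderedDict

class CallbackLRUCache:
    """LRU sketch with one eviction callback for the whole cache,
    instead of a callback stored on every node."""

    def __init__(self, max_cache, on_evict=None):
        self._max_cache = max_cache
        self._on_evict = on_evict
        self._cache = OrderedDict()

    def __setitem__(self, key, value):
        self._cache.pop(key, None)
        self._cache[key] = value
        while len(self._cache) > self._max_cache:
            _, evicted = self._cache.popitem(last=False)
            if self._on_evict is not None:
                self._on_evict(evicted)   # one hook for every eviction

unlocked = []

class FakeBranch:
    """Stand-in for a lockable object such as a Branch."""
    def unlock(self):
        unlocked.append(self)

cache = CallbackLRUCache(2, on_evict=lambda value: getattr(value, 'unlock')())
cache['a'] = FakeBranch()
cache['b'] = FakeBranch()
cache['c'] = FakeBranch()   # evicts 'a', which gets unlocked via the hook
print(len(unlocked))        # 1
```
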
Martin Packman (gz) wrote:
Loggerhead just uses LRUCache as a dict, and none of the deprecated methods. Launchpad I can't branch, so I can't grep.
Robert Collins (lifeless) wrote:
On Sun, Oct 16, 2011 at 1:10 AM, Martin Packman
<email address hidden> wrote:
> Loggerhead just uses LRUCache as a dict, and none of the deprecated methods. Launchpad I can't branch, so I can't grep.
What happens when you try?
John A Meinel (jameinel) wrote:
On 10/16/2011 7:46 PM, Robert Collins wrote:
> On Sun, Oct 16, 2011 at 1:10 AM, Martin Packman
> <email address hidden> wrote:
>> Loggerhead just uses LRUCache as a dict, and none of the
>> deprecated methods. Launchpad I can't branch, so I can't grep.
>
> What happens when you try?
>
I think he is on Windows, and Launchpad has symlinks in the source tree.
John
=:->
Martin Packman (gz) wrote:
> > Loggerhead just uses LRUCache as a dict, and none of the deprecated methods.
> Launchpad I can't branch, so I can't grep.
>
> What happens when you try?
Sorry, that was just me being lazy over the weekend. It was easy enough, following John's suggestion, to branch and inspect the Loggerhead source, but previous experience stopped me from trying with Launchpad - I should have just asked someone to grep it for me.
The basic problem in the past has been that it takes a long time and a lot of resources, and the connection hangs after a while, and it's not possible to resume an interrupted branch. I could certainly try again at some point.
Robert Collins (lifeless) wrote:
On Mon, Oct 17, 2011 at 9:30 PM, Martin Packman
<email address hidden> wrote:
> The basic problem in the past has been that it takes a long time and a lot of resources, and the connection hangs after a while, and it's not possible to resume an interrupted branch. I could certainly try again at some point.
I highly recommend following the Running/LXC wiki page so that you
have a dev environment available: LP is one of the larger single
consumers of bzr, and being able to see what we do with it firsthand
(and poke at it) is invaluable.
-Rob
Martin Packman (gz) wrote:
Have just grepped the launchpad source, via the kind assistance of allenap, and there are no uses of lru_cache outside loggerhead, which I've already checked.
Martin Packman (gz) wrote:
sent to pqm by email
Martin Packman (gz) wrote:
sent to pqm by email
Preview Diff
1 | === modified file 'bzrlib/chk_map.py' |
2 | --- bzrlib/chk_map.py 2011-05-20 14:46:02 +0000 |
3 | +++ bzrlib/chk_map.py 2011-10-27 15:17:22 +0000 |
4 | @@ -920,7 +920,7 @@ |
5 | bytes = ''.join(lines) |
6 | if len(bytes) != self._current_size(): |
7 | raise AssertionError('Invalid _current_size') |
8 | - _get_cache().add(self._key, bytes) |
9 | + _get_cache()[self._key] = bytes |
10 | return [self._key] |
11 | |
12 | def refs(self): |
13 | @@ -1193,7 +1193,7 @@ |
14 | prefix, node_key_filter = keys[record.key] |
15 | node_and_filters.append((node, node_key_filter)) |
16 | self._items[prefix] = node |
17 | - _get_cache().add(record.key, bytes) |
18 | + _get_cache()[record.key] = bytes |
19 | for info in node_and_filters: |
20 | yield info |
21 | |
22 | @@ -1319,7 +1319,7 @@ |
23 | lines.append(serialised[prefix_len:]) |
24 | sha1, _, _ = store.add_lines((None,), (), lines) |
25 | self._key = StaticTuple("sha1:" + sha1,).intern() |
26 | - _get_cache().add(self._key, ''.join(lines)) |
27 | + _get_cache()[self._key] = ''.join(lines) |
28 | yield self._key |
29 | |
30 | def _search_key(self, key): |
31 | |
32 | === modified file 'bzrlib/lru_cache.py' |
33 | --- bzrlib/lru_cache.py 2010-07-15 13:05:40 +0000 |
34 | +++ bzrlib/lru_cache.py 2011-10-27 15:17:22 +0000 |
35 | @@ -17,6 +17,7 @@ |
36 | """A simple least-recently-used (LRU) cache.""" |
37 | |
38 | from bzrlib import ( |
39 | + symbol_versioning, |
40 | trace, |
41 | ) |
42 | |
43 | @@ -25,18 +26,13 @@ |
44 | class _LRUNode(object): |
45 | """This maintains the linked-list which is the lru internals.""" |
46 | |
47 | - __slots__ = ('prev', 'next_key', 'key', 'value', 'cleanup', 'size') |
48 | + __slots__ = ('prev', 'next_key', 'key', 'value') |
49 | |
50 | - def __init__(self, key, value, cleanup=None): |
51 | + def __init__(self, key, value): |
52 | self.prev = None |
53 | self.next_key = _null_key |
54 | self.key = key |
55 | self.value = value |
56 | - self.cleanup = cleanup |
57 | - # TODO: We could compute this 'on-the-fly' like we used to, and remove |
58 | - # one pointer from this object, we just need to decide if it |
59 | - # actually costs us much of anything in normal usage |
60 | - self.size = None |
61 | |
62 | def __repr__(self): |
63 | if self.prev is None: |
64 | @@ -46,16 +42,6 @@ |
65 | return '%s(%r n:%r p:%r)' % (self.__class__.__name__, self.key, |
66 | self.next_key, prev_key) |
67 | |
68 | - def run_cleanup(self): |
69 | - try: |
70 | - if self.cleanup is not None: |
71 | - self.cleanup(self.key, self.value) |
72 | - finally: |
73 | - # cleanup might raise an exception, but we want to make sure |
74 | - # to break refcycles, etc |
75 | - self.cleanup = None |
76 | - self.value = None |
77 | - |
78 | |
79 | class LRUCache(object): |
80 | """A class which manages a cache of entries, removing unused ones.""" |
81 | @@ -105,62 +91,23 @@ |
82 | def __len__(self): |
83 | return len(self._cache) |
84 | |
85 | - def _walk_lru(self): |
86 | - """Walk the LRU list, only meant to be used in tests.""" |
87 | - node = self._most_recently_used |
88 | - if node is not None: |
89 | - if node.prev is not None: |
90 | - raise AssertionError('the _most_recently_used entry is not' |
91 | - ' supposed to have a previous entry' |
92 | - ' %s' % (node,)) |
93 | - while node is not None: |
94 | - if node.next_key is _null_key: |
95 | - if node is not self._least_recently_used: |
96 | - raise AssertionError('only the last node should have' |
97 | - ' no next value: %s' % (node,)) |
98 | - node_next = None |
99 | - else: |
100 | - node_next = self._cache[node.next_key] |
101 | - if node_next.prev is not node: |
102 | - raise AssertionError('inconsistency found, node.next.prev' |
103 | - ' != node: %s' % (node,)) |
104 | - if node.prev is None: |
105 | - if node is not self._most_recently_used: |
106 | - raise AssertionError('only the _most_recently_used should' |
107 | - ' not have a previous node: %s' |
108 | - % (node,)) |
109 | - else: |
110 | - if node.prev.next_key != node.key: |
111 | - raise AssertionError('inconsistency found, node.prev.next' |
112 | - ' != node: %s' % (node,)) |
113 | - yield node |
114 | - node = node_next |
115 | - |
116 | + @symbol_versioning.deprecated_method( |
117 | + symbol_versioning.deprecated_in((2, 5, 0))) |
118 | def add(self, key, value, cleanup=None): |
119 | - """Add a new value to the cache. |
120 | - |
121 | - Also, if the entry is ever removed from the cache, call |
122 | - cleanup(key, value). |
123 | - |
124 | - :param key: The key to store it under |
125 | - :param value: The object to store |
126 | - :param cleanup: None or a function taking (key, value) to indicate |
127 | - 'value' should be cleaned up. |
128 | - """ |
129 | + if cleanup is not None: |
130 | + raise ValueError("Per-node cleanup functions no longer supported") |
131 | + return self.__setitem__(key, value) |
132 | + |
133 | + def __setitem__(self, key, value): |
134 | + """Add a new value to the cache""" |
135 | if key is _null_key: |
136 | raise ValueError('cannot use _null_key as a key') |
137 | if key in self._cache: |
138 | node = self._cache[key] |
139 | - try: |
140 | - node.run_cleanup() |
141 | - finally: |
142 | - # Maintain the LRU properties, even if cleanup raises an |
143 | - # exception |
144 | - node.value = value |
145 | - node.cleanup = cleanup |
146 | - self._record_access(node) |
147 | + node.value = value |
148 | + self._record_access(node) |
149 | else: |
150 | - node = _LRUNode(key, value, cleanup=cleanup) |
151 | + node = _LRUNode(key, value) |
152 | self._cache[key] = node |
153 | self._record_access(node) |
154 | |
155 | @@ -190,10 +137,13 @@ |
156 | """ |
157 | return self._cache.keys() |
158 | |
159 | - def items(self): |
160 | - """Get the key:value pairs as a dict.""" |
161 | + def as_dict(self): |
162 | + """Get a new dict with the same key:value pairs as the cache""" |
163 | return dict((k, n.value) for k, n in self._cache.iteritems()) |
164 | |
165 | + items = symbol_versioning.deprecated_method( |
166 | + symbol_versioning.deprecated_in((2, 5, 0)))(as_dict) |
167 | + |
168 | def cleanup(self): |
169 | """Clear the cache until it shrinks to the requested size. |
170 | |
171 | @@ -204,10 +154,6 @@ |
172 | while len(self._cache) > self._after_cleanup_count: |
173 | self._remove_lru() |
174 | |
175 | - def __setitem__(self, key, value): |
176 | - """Add a value to the cache, there will be no cleanup function.""" |
177 | - self.add(key, value, cleanup=None) |
178 | - |
179 | def _record_access(self, node): |
180 | """Record that key was accessed.""" |
181 | # Move 'node' to the front of the queue |
182 | @@ -241,19 +187,14 @@ |
183 | # If we have removed all entries, remove the head pointer as well |
184 | if self._least_recently_used is None: |
185 | self._most_recently_used = None |
186 | - try: |
187 | - node.run_cleanup() |
188 | - finally: |
189 | - # cleanup might raise an exception, but we want to make sure to |
190 | - # maintain the linked list |
191 | - if node.prev is not None: |
192 | - node.prev.next_key = node.next_key |
193 | - if node.next_key is not _null_key: |
194 | - node_next = self._cache[node.next_key] |
195 | - node_next.prev = node.prev |
196 | - # And remove this node's pointers |
197 | - node.prev = None |
198 | - node.next_key = _null_key |
199 | + if node.prev is not None: |
200 | + node.prev.next_key = node.next_key |
201 | + if node.next_key is not _null_key: |
202 | + node_next = self._cache[node.next_key] |
203 | + node_next.prev = node.prev |
204 | + # And remove this node's pointers |
205 | + node.prev = None |
206 | + node.next_key = _null_key |
207 | |
208 | def _remove_lru(self): |
209 | """Remove one entry from the lru, and handle consequences. |
210 | @@ -316,17 +257,8 @@ |
211 | self._update_max_size(max_size, after_cleanup_size=after_cleanup_size) |
212 | LRUCache.__init__(self, max_cache=max(int(max_size/512), 1)) |
213 | |
214 | - def add(self, key, value, cleanup=None): |
215 | - """Add a new value to the cache. |
216 | - |
217 | - Also, if the entry is ever removed from the cache, call |
218 | - cleanup(key, value). |
219 | - |
220 | - :param key: The key to store it under |
221 | - :param value: The object to store |
222 | - :param cleanup: None or a function taking (key, value) to indicate |
223 | - 'value' should be cleaned up. |
224 | - """ |
225 | + def __setitem__(self, key, value): |
226 | + """Add a new value to the cache""" |
227 | if key is _null_key: |
228 | raise ValueError('cannot use _null_key as a key') |
229 | node = self._cache.get(key, None) |
230 | @@ -341,15 +273,12 @@ |
231 | if node is not None: |
232 | # We won't be replacing the old node, so just remove it |
233 | self._remove_node(node) |
234 | - if cleanup is not None: |
235 | - cleanup(key, value) |
236 | return |
237 | if node is None: |
238 | - node = _LRUNode(key, value, cleanup=cleanup) |
239 | + node = _LRUNode(key, value) |
240 | self._cache[key] = node |
241 | else: |
242 | - self._value_size -= node.size |
243 | - node.size = value_len |
244 | + self._value_size -= self._compute_size(node.value) |
245 | self._value_size += value_len |
246 | self._record_access(node) |
247 | |
248 | @@ -368,7 +297,7 @@ |
249 | self._remove_lru() |
250 | |
251 | def _remove_node(self, node): |
252 | - self._value_size -= node.size |
253 | + self._value_size -= self._compute_size(node.value) |
254 | LRUCache._remove_node(self, node) |
255 | |
256 | def resize(self, max_size, after_cleanup_size=None): |
257 | |
258 | === modified file 'bzrlib/tests/test_lru_cache.py' |
259 | --- bzrlib/tests/test_lru_cache.py 2011-05-13 12:51:05 +0000 |
260 | +++ bzrlib/tests/test_lru_cache.py 2011-10-27 15:17:22 +0000 |
261 | @@ -18,10 +18,43 @@ |
262 | |
263 | from bzrlib import ( |
264 | lru_cache, |
265 | + symbol_versioning, |
266 | tests, |
267 | ) |
268 | |
269 | |
270 | +def walk_lru(lru): |
271 | + """Test helper to walk the LRU list and assert its consistency""" |
272 | + node = lru._most_recently_used |
273 | + if node is not None: |
274 | + if node.prev is not None: |
275 | + raise AssertionError('the _most_recently_used entry is not' |
276 | + ' supposed to have a previous entry' |
277 | + ' %s' % (node,)) |
278 | + while node is not None: |
279 | + if node.next_key is lru_cache._null_key: |
280 | + if node is not lru._least_recently_used: |
281 | + raise AssertionError('only the last node should have' |
282 | + ' no next value: %s' % (node,)) |
283 | + node_next = None |
284 | + else: |
285 | + node_next = lru._cache[node.next_key] |
286 | + if node_next.prev is not node: |
287 | + raise AssertionError('inconsistency found, node.next.prev' |
288 | + ' != node: %s' % (node,)) |
289 | + if node.prev is None: |
290 | + if node is not lru._most_recently_used: |
291 | + raise AssertionError('only the _most_recently_used should' |
292 | + ' not have a previous node: %s' |
293 | + % (node,)) |
294 | + else: |
295 | + if node.prev.next_key != node.key: |
296 | + raise AssertionError('inconsistency found, node.prev.next' |
297 | + ' != node: %s' % (node,)) |
298 | + yield node |
299 | + node = node_next |
300 | + |
301 | + |
302 | class TestLRUCache(tests.TestCase): |
303 | """Test that LRU cache properly keeps track of entries.""" |
304 | |
305 | @@ -61,11 +94,12 @@ |
306 | cache[None] |
307 | cache[1] |
308 | cache[None] |
309 | - self.assertEqual([None, 1], [n.key for n in cache._walk_lru()]) |
310 | + self.assertEqual([None, 1], [n.key for n in walk_lru(cache)]) |
311 | |
312 | def test_add__null_key(self): |
313 | cache = lru_cache.LRUCache(max_cache=10) |
314 | - self.assertRaises(ValueError, cache.add, lru_cache._null_key, 1) |
315 | + self.assertRaises(ValueError, |
316 | + cache.__setitem__, lru_cache._null_key, 1) |
317 | |
318 | def test_overflow(self): |
319 | """Adding extra entries will pop out old ones.""" |
320 | @@ -94,80 +128,12 @@ |
321 | |
322 | self.assertFalse('foo' in cache) |
323 | |
324 | - def test_cleanup(self): |
325 | - """Test that we can use a cleanup function.""" |
326 | - cleanup_called = [] |
327 | - def cleanup_func(key, val): |
328 | - cleanup_called.append((key, val)) |
329 | - |
330 | - cache = lru_cache.LRUCache(max_cache=2) |
331 | - |
332 | - cache.add('baz', '1', cleanup=cleanup_func) |
333 | - cache.add('foo', '2', cleanup=cleanup_func) |
334 | - cache.add('biz', '3', cleanup=cleanup_func) |
335 | - |
336 | - self.assertEqual([('baz', '1')], cleanup_called) |
337 | - |
338 | - # 'foo' is now most recent, so final cleanup will call it last |
339 | - cache['foo'] |
340 | - cache.clear() |
341 | - self.assertEqual([('baz', '1'), ('biz', '3'), ('foo', '2')], |
342 | - cleanup_called) |
343 | - |
344 | - def test_cleanup_on_replace(self): |
345 | - """Replacing an object should cleanup the old value.""" |
346 | - cleanup_called = [] |
347 | - def cleanup_func(key, val): |
348 | - cleanup_called.append((key, val)) |
349 | - |
350 | - cache = lru_cache.LRUCache(max_cache=2) |
351 | - cache.add(1, 10, cleanup=cleanup_func) |
352 | - cache.add(2, 20, cleanup=cleanup_func) |
353 | - cache.add(2, 25, cleanup=cleanup_func) |
354 | - |
355 | - self.assertEqual([(2, 20)], cleanup_called) |
356 | - self.assertEqual(25, cache[2]) |
357 | - |
358 | - # Even __setitem__ should make sure cleanup() is called |
359 | - cache[2] = 26 |
360 | - self.assertEqual([(2, 20), (2, 25)], cleanup_called) |
361 | - |
362 | - def test_cleanup_error_maintains_linked_list(self): |
363 | - cleanup_called = [] |
364 | - def cleanup_func(key, val): |
365 | - cleanup_called.append((key, val)) |
366 | - raise ValueError('failure during cleanup') |
367 | - |
368 | - cache = lru_cache.LRUCache(max_cache=10) |
369 | - for i in xrange(10): |
370 | - cache.add(i, i, cleanup=cleanup_func) |
371 | - for i in xrange(10, 20): |
372 | - self.assertRaises(ValueError, |
373 | - cache.add, i, i, cleanup=cleanup_func) |
374 | - |
375 | - self.assertEqual([(i, i) for i in xrange(10)], cleanup_called) |
376 | - |
377 | - self.assertEqual(range(19, 9, -1), [n.key for n in cache._walk_lru()]) |
378 | - |
379 | - def test_cleanup_during_replace_still_replaces(self): |
380 | - cleanup_called = [] |
381 | - def cleanup_func(key, val): |
382 | - cleanup_called.append((key, val)) |
383 | - raise ValueError('failure during cleanup') |
384 | - |
385 | - cache = lru_cache.LRUCache(max_cache=10) |
386 | - for i in xrange(10): |
387 | - cache.add(i, i, cleanup=cleanup_func) |
388 | - self.assertRaises(ValueError, |
389 | - cache.add, 1, 20, cleanup=cleanup_func) |
390 | - # We also still update the recent access to this node |
391 | - self.assertEqual([1, 9, 8, 7, 6, 5, 4, 3, 2, 0], |
392 | - [n.key for n in cache._walk_lru()]) |
393 | - self.assertEqual(20, cache[1]) |
394 | - |
395 | - self.assertEqual([(1, 1)], cleanup_called) |
396 | - self.assertEqual([1, 9, 8, 7, 6, 5, 4, 3, 2, 0], |
397 | - [n.key for n in cache._walk_lru()]) |
398 | + def test_cleanup_function_deprecated(self): |
399 | + """Test that per-node cleanup functions are no longer allowed""" |
400 | + cache = lru_cache.LRUCache() |
401 | + self.assertRaises(ValueError, self.applyDeprecated, |
402 | + symbol_versioning.deprecated_in((2, 5, 0)), |
403 | + cache.add, "key", 1, cleanup=lambda: None) |
404 | |
405 | def test_len(self): |
406 | cache = lru_cache.LRUCache(max_cache=10, after_cleanup_count=10) |
407 | @@ -197,21 +163,21 @@ |
408 | # We hit the max |
409 | self.assertEqual(10, len(cache)) |
410 | self.assertEqual([11, 10, 9, 1, 8, 7, 6, 5, 4, 3], |
411 | - [n.key for n in cache._walk_lru()]) |
412 | + [n.key for n in walk_lru(cache)]) |
413 | |
414 | def test_cleanup_shrinks_to_after_clean_count(self): |
415 | cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=3) |
416 | |
417 | - cache.add(1, 10) |
418 | - cache.add(2, 20) |
419 | - cache.add(3, 25) |
420 | - cache.add(4, 30) |
421 | - cache.add(5, 35) |
422 | + cache[1] = 10 |
423 | + cache[2] = 20 |
424 | + cache[3] = 25 |
425 | + cache[4] = 30 |
426 | + cache[5] = 35 |
427 | |
428 | self.assertEqual(5, len(cache)) |
429 | # This will bump us over the max, which causes us to shrink down to |
430 | # after_cleanup_cache size |
431 | - cache.add(6, 40) |
432 | + cache[6] = 40 |
433 | self.assertEqual(3, len(cache)) |
434 | |
435 | def test_after_cleanup_larger_than_max(self): |
436 | @@ -227,11 +193,11 @@ |
437 | cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=2) |
438 | |
439 | # Add these in order |
440 | - cache.add(1, 10) |
441 | - cache.add(2, 20) |
442 | - cache.add(3, 25) |
443 | - cache.add(4, 30) |
444 | - cache.add(5, 35) |
445 | + cache[1] = 10 |
446 | + cache[2] = 20 |
447 | + cache[3] = 25 |
448 | + cache[4] = 30 |
449 | + cache[5] = 35 |
450 | |
451 | self.assertEqual(5, len(cache)) |
452 | # Force a compaction |
453 | @@ -242,33 +208,33 @@ |
454 | cache = lru_cache.LRUCache(max_cache=5) |
455 | |
456 | # Add these in order |
457 | - cache.add(1, 10) |
458 | - cache.add(2, 20) |
459 | - cache.add(3, 25) |
460 | - cache.add(4, 30) |
461 | - cache.add(5, 35) |
462 | + cache[1] = 10 |
463 | + cache[2] = 20 |
464 | + cache[3] = 25 |
465 | + cache[4] = 30 |
466 | + cache[5] = 35 |
467 | |
468 | - self.assertEqual([5, 4, 3, 2, 1], [n.key for n in cache._walk_lru()]) |
469 | + self.assertEqual([5, 4, 3, 2, 1], [n.key for n in walk_lru(cache)]) |
470 | |
471 | # Now access some randomly |
472 | cache[2] |
473 | cache[5] |
474 | cache[3] |
475 | cache[2] |
476 | - self.assertEqual([2, 3, 5, 4, 1], [n.key for n in cache._walk_lru()]) |
477 | + self.assertEqual([2, 3, 5, 4, 1], [n.key for n in walk_lru(cache)]) |
478 | |
479 | def test_get(self): |
480 | cache = lru_cache.LRUCache(max_cache=5) |
481 | |
482 | - cache.add(1, 10) |
483 | - cache.add(2, 20) |
484 | + cache[1] = 10 |
485 | + cache[2] = 20 |
486 | self.assertEqual(20, cache.get(2)) |
487 | self.assertIs(None, cache.get(3)) |
488 | obj = object() |
489 | self.assertIs(obj, cache.get(3, obj)) |
490 | - self.assertEqual([2, 1], [n.key for n in cache._walk_lru()]) |
491 | + self.assertEqual([2, 1], [n.key for n in walk_lru(cache)]) |
492 | self.assertEqual(10, cache.get(1)) |
493 | - self.assertEqual([1, 2], [n.key for n in cache._walk_lru()]) |
494 | + self.assertEqual([1, 2], [n.key for n in walk_lru(cache)]) |
495 | |
496 | def test_keys(self): |
497 | cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=5) |
498 | @@ -332,18 +298,19 @@ |
499 | |
500 | def test_add__null_key(self): |
501 | cache = lru_cache.LRUSizeCache() |
502 | - self.assertRaises(ValueError, cache.add, lru_cache._null_key, 1) |
503 | + self.assertRaises(ValueError, |
504 | + cache.__setitem__, lru_cache._null_key, 1) |
505 | |
506 | def test_add_tracks_size(self): |
507 | cache = lru_cache.LRUSizeCache() |
508 | self.assertEqual(0, cache._value_size) |
509 | - cache.add('my key', 'my value text') |
510 | + cache['my key'] = 'my value text' |
511 | self.assertEqual(13, cache._value_size) |
512 | |
513 | def test_remove_tracks_size(self): |
514 | cache = lru_cache.LRUSizeCache() |
515 | self.assertEqual(0, cache._value_size) |
516 | - cache.add('my key', 'my value text') |
517 | + cache['my key'] = 'my value text' |
518 | self.assertEqual(13, cache._value_size) |
519 | node = cache._cache['my key'] |
520 | cache._remove_node(node) |
521 | @@ -353,64 +320,48 @@ |
         """Adding a large value may not be cached at all."""
         cache = lru_cache.LRUSizeCache(max_size=10, after_cleanup_size=5)
         self.assertEqual(0, cache._value_size)
-        self.assertEqual({}, cache.items())
-        cache.add('test', 'key')
-        self.assertEqual(3, cache._value_size)
-        self.assertEqual({'test': 'key'}, cache.items())
-        cache.add('test2', 'key that is too big')
-        self.assertEqual(3, cache._value_size)
-        self.assertEqual({'test':'key'}, cache.items())
+        self.assertEqual({}, cache.as_dict())
+        cache['test'] = 'key'
+        self.assertEqual(3, cache._value_size)
+        self.assertEqual({'test': 'key'}, cache.as_dict())
+        cache['test2'] = 'key that is too big'
+        self.assertEqual(3, cache._value_size)
+        self.assertEqual({'test':'key'}, cache.as_dict())
         # If we would add a key, only to cleanup and remove all cached entries,
         # then obviously that value should not be stored
-        cache.add('test3', 'bigkey')
-        self.assertEqual(3, cache._value_size)
-        self.assertEqual({'test':'key'}, cache.items())
-
-        cache.add('test4', 'bikey')
-        self.assertEqual(3, cache._value_size)
-        self.assertEqual({'test':'key'}, cache.items())
-
-    def test_no_add_over_size_cleanup(self):
-        """If a large value is not cached, we will call cleanup right away."""
-        cleanup_calls = []
-        def cleanup(key, value):
-            cleanup_calls.append((key, value))
-
-        cache = lru_cache.LRUSizeCache(max_size=10, after_cleanup_size=5)
-        self.assertEqual(0, cache._value_size)
-        self.assertEqual({}, cache.items())
-        cache.add('test', 'key that is too big', cleanup=cleanup)
-        # key was not added
-        self.assertEqual(0, cache._value_size)
-        self.assertEqual({}, cache.items())
-        # and cleanup was called
-        self.assertEqual([('test', 'key that is too big')], cleanup_calls)
+        cache['test3'] = 'bigkey'
+        self.assertEqual(3, cache._value_size)
+        self.assertEqual({'test':'key'}, cache.as_dict())
+
+        cache['test4'] = 'bikey'
+        self.assertEqual(3, cache._value_size)
+        self.assertEqual({'test':'key'}, cache.as_dict())

     def test_adding_clears_cache_based_on_size(self):
         """The cache is cleared in LRU order until small enough"""
         cache = lru_cache.LRUSizeCache(max_size=20)
-        cache.add('key1', 'value') # 5 chars
-        cache.add('key2', 'value2') # 6 chars
-        cache.add('key3', 'value23') # 7 chars
+        cache['key1'] = 'value' # 5 chars
+        cache['key2'] = 'value2' # 6 chars
+        cache['key3'] = 'value23' # 7 chars
         self.assertEqual(5+6+7, cache._value_size)
         cache['key2'] # reference key2 so it gets a newer reference time
-        cache.add('key4', 'value234') # 8 chars, over limit
+        cache['key4'] = 'value234' # 8 chars, over limit
         # We have to remove 2 keys to get back under limit
         self.assertEqual(6+8, cache._value_size)
         self.assertEqual({'key2':'value2', 'key4':'value234'},
-                         cache.items())
+                         cache.as_dict())

     def test_adding_clears_to_after_cleanup_size(self):
         cache = lru_cache.LRUSizeCache(max_size=20, after_cleanup_size=10)
-        cache.add('key1', 'value') # 5 chars
-        cache.add('key2', 'value2') # 6 chars
-        cache.add('key3', 'value23') # 7 chars
+        cache['key1'] = 'value' # 5 chars
+        cache['key2'] = 'value2' # 6 chars
+        cache['key3'] = 'value23' # 7 chars
         self.assertEqual(5+6+7, cache._value_size)
         cache['key2'] # reference key2 so it gets a newer reference time
-        cache.add('key4', 'value234') # 8 chars, over limit
+        cache['key4'] = 'value234' # 8 chars, over limit
         # We have to remove 3 keys to get back under limit
         self.assertEqual(8, cache._value_size)
-        self.assertEqual({'key4':'value234'}, cache.items())
+        self.assertEqual({'key4':'value234'}, cache.as_dict())

     def test_custom_sizes(self):
         def size_of_list(lst):
@@ -418,23 +369,23 @@
         cache = lru_cache.LRUSizeCache(max_size=20, after_cleanup_size=10,
                                        compute_size=size_of_list)

-        cache.add('key1', ['val', 'ue']) # 5 chars
-        cache.add('key2', ['val', 'ue2']) # 6 chars
-        cache.add('key3', ['val', 'ue23']) # 7 chars
+        cache['key1'] = ['val', 'ue'] # 5 chars
+        cache['key2'] = ['val', 'ue2'] # 6 chars
+        cache['key3'] = ['val', 'ue23'] # 7 chars
         self.assertEqual(5+6+7, cache._value_size)
         cache['key2'] # reference key2 so it gets a newer reference time
-        cache.add('key4', ['value', '234']) # 8 chars, over limit
+        cache['key4'] = ['value', '234'] # 8 chars, over limit
         # We have to remove 3 keys to get back under limit
         self.assertEqual(8, cache._value_size)
-        self.assertEqual({'key4':['value', '234']}, cache.items())
+        self.assertEqual({'key4':['value', '234']}, cache.as_dict())

     def test_cleanup(self):
         cache = lru_cache.LRUSizeCache(max_size=20, after_cleanup_size=10)

         # Add these in order
-        cache.add('key1', 'value') # 5 chars
-        cache.add('key2', 'value2') # 6 chars
-        cache.add('key3', 'value23') # 7 chars
+        cache['key1'] = 'value' # 5 chars
+        cache['key2'] = 'value2' # 6 chars
+        cache['key3'] = 'value23' # 7 chars
         self.assertEqual(5+6+7, cache._value_size)

         cache.cleanup()

=== modified file 'doc/en/release-notes/bzr-2.5.txt'
--- doc/en/release-notes/bzr-2.5.txt	2011-10-25 12:50:34 +0000
+++ doc/en/release-notes/bzr-2.5.txt	2011-10-27 15:17:22 +0000
@@ -81,6 +81,9 @@

 * ``Branch.revision_history`` is now deprecated. (Jelmer Vernooij, #799519)

+* Methods ``add`` and ``items`` of ``LRUCache`` and ``LRUSizeCache`` are
+  deprecated. Use normal dict-style access instead. (Martin Packman)
+
 * New flag ``RepositoryFormat.supports_unreferenced_revisions`` which
   indicates whether revisions can be present in a repository without
   being referenced from e.g. a branch history at the same time.
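The size-eviction behaviour the updated tests exercise can be sketched with a minimal stand-in built on `collections.OrderedDict`. This is an illustrative model of the LRUSizeCache contract only (the class name `TinySizeCache` and its internals are invented here, not bzrlib's code): values larger than `after_cleanup_size` are never cached, and once the total exceeds `max_size`, least-recently-used entries are evicted until the total is back under `after_cleanup_size`.

```python
from collections import OrderedDict

class TinySizeCache:
    """Illustrative model of LRUSizeCache's eviction rules; not bzrlib's code."""

    def __init__(self, max_size, after_cleanup_size, compute_size=len):
        self._max_size = max_size
        self._after_cleanup_size = after_cleanup_size
        self._compute_size = compute_size
        self._cache = OrderedDict()
        self._value_size = 0

    def __setitem__(self, key, value):
        size = self._compute_size(value)
        old = self._cache.pop(key, None)
        if old is not None:
            self._value_size -= self._compute_size(old)
        if size > self._after_cleanup_size:
            # An oversized value is not cached at all, matching the
            # 'key that is too big' cases in the tests above.
            return
        self._cache[key] = value
        self._value_size += size
        if self._value_size > self._max_size:
            # Evict least recently used entries until under the cleanup target.
            while self._value_size > self._after_cleanup_size:
                _, evicted = self._cache.popitem(last=False)
                self._value_size -= self._compute_size(evicted)

    def __getitem__(self, key):
        value = self._cache.pop(key)  # re-insert to refresh recency
        self._cache[key] = value
        return value

    def as_dict(self):
        return dict(self._cache)

cache = TinySizeCache(max_size=20, after_cleanup_size=10)
cache['key1'] = 'value'     # size 5
cache['key2'] = 'value2'    # size 6
cache['key3'] = 'value23'   # size 7, total now 18
cache['key2']               # touch key2 so key1 becomes least recent
cache['key4'] = 'value234'  # size 8 pushes total to 26; evict down to <= 10
print(cache.as_dict())      # only key4 survives, as in the tests above
```

Note that evicting down to `after_cleanup_size` rather than `max_size` removes three keys here, not one, which is exactly what `test_adding_clears_to_after_cleanup_size` asserts.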
> * Deprecates the `items` method that doesn't return a list of 2-tuples. This seems to be mostly for test usage so has been renamed to `as_dict` which is what it actually does.
This (and the deprecation of 'add') seems like a worthwhile thing to document in the release notes, in case external users rely on these methods. bzr-git uses LRUSizeCache, for example.
In general, this seems like a reasonable cleanup (excuse the pun :-P). Perhaps it would be good if somebody with more experience of this code could also take a look at the API changes, though.
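For external callers like bzr-git, the migration this deprecation implies is mechanical: `cache.add(key, value)` becomes `cache[key] = value`, and `cache.items()` (which confusingly returned a dict, not 2-tuples) becomes `cache.as_dict()`. A minimal stand-in sketching the dict-style surface (an invented `DictStyleLRUCache` for illustration, not bzrlib's implementation):

```python
from collections import OrderedDict

class DictStyleLRUCache:
    """Illustrative sketch of the dict-style LRUCache API; not bzrlib's code."""

    def __init__(self, max_cache=100):
        self._max_cache = max_cache
        self._cache = OrderedDict()

    def __setitem__(self, key, value):      # replaces the deprecated add()
        self._cache.pop(key, None)
        self._cache[key] = value
        while len(self._cache) > self._max_cache:
            self._cache.popitem(last=False)  # evict least recently used

    def __getitem__(self, key):
        value = self._cache.pop(key)         # a read refreshes recency
        self._cache[key] = value
        return value

    def as_dict(self):                       # replaces the deprecated items()
        return dict(self._cache)

cache = DictStyleLRUCache(max_cache=2)
cache['a'] = 1        # old style: cache.add('a', 1)
cache['b'] = 2
cache['a']            # touch 'a' so 'b' becomes least recently used
cache['c'] = 3        # evicts 'b'
print(cache.as_dict())  # old style: cache.items()
```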