Merge lp:~gz/bzr/lru_cache_refactor into lp:bzr

Proposed by Martin Packman
Status: Merged
Approved by: Martin Packman
Approved revision: no longer in the source branch.
Merged at revision: 6235
Proposed branch: lp:~gz/bzr/lru_cache_refactor
Merge into: lp:bzr
Diff against target: 655 lines (+139/-256)
4 files modified
bzrlib/chk_map.py (+3/-3)
bzrlib/lru_cache.py (+32/-103)
bzrlib/tests/test_lru_cache.py (+101/-150)
doc/en/release-notes/bzr-2.5.txt (+3/-0)
To merge this branch: bzr merge lp:~gz/bzr/lru_cache_refactor
Reviewer: Jelmer Vernooij (community) - Approve
Review via email: mp+79439@code.launchpad.net

Commit message

Refactor and deprecate unused parts of lru_cache

Description of the change

Remove some complications from bzrlib.lru_cache that made it harder for me to understand. There should be no functional change for Bazaar.

The LRUCache and LRUSizeCache objects basically act like dicts, but may also evict items if a threshold is passed, starting with those touched 'least recently'.
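The dict-like behaviour described above can be sketched in a few lines. This is a simplified stand-in (not bzrlib's implementation, which keeps an explicit linked list of nodes), using OrderedDict to track recency:

```python
from collections import OrderedDict

class SimpleLRUCache(object):
    """Minimal sketch of an LRU cache with dict-style access."""

    def __init__(self, max_cache=100):
        self._max_cache = max_cache
        self._cache = OrderedDict()

    def __setitem__(self, key, value):
        if key in self._cache:
            del self._cache[key]
        self._cache[key] = value  # most recently used lives at the end
        while len(self._cache) > self._max_cache:
            self._cache.popitem(last=False)  # evict least recently used

    def __getitem__(self, key):
        value = self._cache.pop(key)
        self._cache[key] = value  # record the access by moving to the end
        return value

    def __contains__(self, key):
        return key in self._cache

cache = SimpleLRUCache(max_cache=2)
cache['a'] = 1
cache['b'] = 2
cache['a']      # touch 'a', so 'b' is now least recently used
cache['c'] = 3  # over the threshold: evicts 'b'
```

The point of the refactoring is that nearly all callers only ever need this `cache[key] = value` / `cache[key]` surface, which is why the extra `add` entry point could be deprecated.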

The branch addresses the following wrinkles:

* Deprecates the `items` method, which doesn't return a list of 2-tuples. It seems to be mostly for test usage, so it has been renamed to `as_dict`, which is what it actually does.

* There's a test helper method `walk_lru` that may as well live with the tests.

* Given that usage of LRUSizeCache in bzrlib sensibly sticks to storing simple, more or less immutable objects (GroupCompressBlock is a bit odd), their size may as well be recalculated on eviction. This is a judgement call: given badly behaved objects that change size, this could make caches behave oddly. That's already the case if given a `_compute_size` function that returns funny values.

* Remove per-node callback functions. Given that bzrlib has no callers that want this, and to work properly they need a fair bit of fixing, this seems like the simplest option. In short, the issues are:

** The nomenclature is confusing. The caches have a cleanup method, and their nodes have a cleanup slot that calling cleanup may, or may not, call.

** LRUSizeCache.add doesn't do the right thing with cleanup callbacks. When an old node is being replaced and there's no room for the new node, it uses the callback passed in with the new node rather than the function from the old node's slot. Otherwise, when replacing a node, it doesn't call cleanup on the old node at all.

** The cleanup is per-node, but even the tests for it use it for a per-cache callback only.

** It'd be risky anyway to put things that need explicit finalizing in a cache like this, which may keep them alive indefinitely.

** The `add` method that exposed the cleanup object has been deprecated in favour of just using the dict-style access that nearly all callers use anyway. This avoids mixing up the check for removed functionality with live code.

** It seems that bzrlib.fifo_cache has a similar cleanup callback interface, but if it's not being used for lru_cache I see no reason to keep it around.

** I tried having a test-only cache with a cleanup callback, but basically the tests that need it are only testing things to do with the callback, the other tests do well enough at ensuring the caches behave in a sane manner.
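The size-recalculation point above can be illustrated with a toy size-tracking cache (a hypothetical stand-in, not bzrlib's LRUSizeCache): recomputing a value's size on eviction gives the same accounting as storing the size per node, as long as cached values never change size.

```python
from collections import OrderedDict

class SimpleSizeCache(object):
    """Toy cache that evicts by total value size, recomputing each
    value's size on eviction instead of storing it per node."""

    def __init__(self, max_size, compute_size=len):
        self._max_size = max_size
        self._compute_size = compute_size
        self._cache = OrderedDict()
        self._value_size = 0

    def __setitem__(self, key, value):
        if key in self._cache:
            self._evict(key)
        value_len = self._compute_size(value)
        if value_len > self._max_size:
            return  # too big to cache at all
        self._cache[key] = value
        self._value_size += value_len
        while self._value_size > self._max_size:
            self._evict(next(iter(self._cache)))  # oldest entry first

    def _evict(self, key):
        value = self._cache.pop(key)
        # Size is recomputed here. If the value changed size while cached,
        # the accounting would drift -- the judgement call discussed above.
        self._value_size -= self._compute_size(value)

cache = SimpleSizeCache(max_size=10)
cache['a'] = 'aaaa'
cache['b'] = 'bbbb'
cache['c'] = 'cccc'  # total would be 12 > 10, so 'a' is evicted
```

This drops one slot per node at the cost of trusting `_compute_size` to be stable, which is the trade-off the branch accepts.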

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

> * Deprecates the `items` method that doesn't return a list of 2-tuples. This seems to be mostly for test usage so has been renamed to `as_dict` which is what it actually does.
This (and the deprecation of 'add') seems like a worthwhile thing to document in the release notes, in case external users rely on it. bzr-git uses LRUSizeCache for example.

In general, this seems like a reasonable cleanup (excuse the pun :-P). Perhaps it would be good if somebody who has more experience with this code could also have a look at the API changes though.

review: Approve
Revision history for this message
Martin Packman (gz) wrote :

> This (and the deprecation of 'add') seems like a worthwhile thing to document
> in the release notes, in case external users rely on it. bzr-git uses
> LRUSizeCache for example.

Noted. I take it bzr-git just uses LRUSizeCache as a dict like everything else?

> In general, this seems like a reasonable cleanup (excuse the pun :-P).

More puns please!

Revision history for this message
John A Meinel (jameinel) wrote :


On 10/14/2011 10:12 PM, Martin Packman wrote:
> Martin Packman has proposed merging lp:~gz/bzr/lru_cache_refactor
> into lp:bzr.
>
> Requested reviews: bzr-core (bzr-core)
>
> For more details, see:
> https://code.launchpad.net/~gz/bzr/lru_cache_refactor/+merge/79439
>
> Remove some complications from bzrlib.lru_cache that made it harder
> for me to understand. There should be no functional change for
> Bazaar.
>
> The LRUCache and LRUSize cache objects basically act like dicts,
> but may also evict items if a threshold is passed, starting with
> those touched 'least recently'.
>
>
> The branch addresses the following wrinkles:
>
> * Deprecates the `items` method that doesn't return a list of
> 2-tuples. This seems to be mostly for test usage so has been
> renamed to `as_dict` which is what it actually does.
>
> * There's a test helper method `walk_lru` that may as well live
> with the tests.
>
> * Given that usage of LRUSizeCache in bzrlib sensibly sticks to
> using to store simple immutable (or sort of at least,
> GroupCompressBlock is a bit odd) objects, their size may as well be
> recalculated on eviction. This is a judgement call, given bad
> objects that change size this could make caches behave oddly.
> That's already the case if given a `_compute_size` function that
> returns funny values.

If something's size does change, then this quickly messes up the size
tracking. I think it is reasonable to eject an item with the size that
you added it.

>
> * Remove per-node callback functions. Given that bzrlib has no
> callers that want this, and to work properly they need a fair bit
> of fixing, this seems like the simplest option. In short, the
> issues are:
>

I do believe there is already 3rd party code (Launchpad, loggerhead?)
that makes use of this. bzr-cvsps-import I think also uses it.

John
=:->

> ** The nomenclature is confusing. The caches have a cleanup method,
> and their nodes have a cleanup slot that calling cleanup may, or
> may not, call.
>
> ** LRUSizeCache.add doesn't do the right thing with cleanup
> callbacks. It uses the one passed in with the new node if an old
> node is being replaced and there's not room for the new node,
> rather than the function from the slot. Otherwise when replacing a
> node, it doesn't call cleanup on the old node at all.
>
> ** The cleanup is per-node, but even the tests for it use it for a
> per-cache callback only.

bzr-cvsps-import passes in Branch objects, and uses their
Branch.unlock() as the cleanup callback. Which does use a per-node
callback (every branch's unlock is different). We can certainly update
the tests to reflect this.

>
> ** It'd be risky anyway to put things that need explict finalizing
> in a cache like this which may keep them alive indefinitely.
>
> ** The `add` method that exposed the cleanup object has been
> deprecated in favour of just using the dict-style access that
> nearly all callers use anyway. This avoids mixing up the check for
> removed functionality with live code.
>
> ** It seems that bzrlib.fifo_cache has a similar cleanup callback
> interface, but if it's not being used for lru_cache I see...


Revision history for this message
Martin Packman (gz) wrote :

> If something's size does change, then this quickly messes up the size
> tracking. I think it is reasonable to eject an item with the size that
> you added it.

This is true, but if cached objects change size, that already makes the cache a dangerous rather than useful thing. I guess it's a question of how much we want to defend against misuse.

> bzr-cvsps-import passes in Branch objects, and uses their
> Branch.unlock() as the cleanup callback. Which does use a per-node
> callback (every branch's unlock is different). We can certainly update
> the tests to reflect this.

My initial thought was to add a new base class without the per-node cleanup, move all the bzrlib users to it, and write a small subclass under the current name that adds callbacks. I could still do this, especially if there is code in launchpad or loggerhead that wants it.

I'm unconvinced by the cvsps use case though; caching locked branches doesn't seem a great idea to start with. It would be simple enough to do this with a per-cache callback, with something like `getattr(value, "unlock")()`, but... wait, that's not needed: the plugin has its own lru_cache module and doesn't use the one in bzrlib at all.
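The per-cache alternative floated here could look something like the following sketch (hypothetical, not code from the branch): a single callback owned by the cache, fired on every eviction, instead of one callback stored per node.

```python
from collections import OrderedDict

class CallbackLRUCache(object):
    """Sketch of an LRU cache with one per-cache eviction callback."""

    def __init__(self, max_cache, on_evict=None):
        self._max_cache = max_cache
        self._on_evict = on_evict
        self._cache = OrderedDict()

    def __setitem__(self, key, value):
        self._cache.pop(key, None)
        self._cache[key] = value
        while len(self._cache) > self._max_cache:
            old_key, old_value = self._cache.popitem(last=False)
            if self._on_evict is not None:
                # The cvsps-style use would call old_value.unlock() here.
                self._on_evict(old_key, old_value)

evicted = []
cache = CallbackLRUCache(max_cache=1,
                         on_evict=lambda k, v: evicted.append(k))
cache['branch-a'] = object()
cache['branch-b'] = object()  # evicts 'branch-a', callback fires
```

A single per-cache hook avoids the per-node slot entirely, and the tests for it would only be testing the callback, which matches the observation above that the existing cleanup tests did exactly that.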

Revision history for this message
Martin Packman (gz) wrote :

Loggerhead just uses LRUCache as a dict, and none of the deprecated methods. Launchpad I can't branch, so I can't grep.

Revision history for this message
Robert Collins (lifeless) wrote :

On Sun, Oct 16, 2011 at 1:10 AM, Martin Packman
<email address hidden> wrote:
> Loggerhead just uses LRUCache as a dict, and none of the deprecated methods. Launchpad I can't branch, so I can't grep.

What happens when you try?

Revision history for this message
John A Meinel (jameinel) wrote :


On 10/16/2011 7:46 PM, Robert Collins wrote:
> On Sun, Oct 16, 2011 at 1:10 AM, Martin Packman
> <email address hidden> wrote:
>> Loggerhead just uses LRUCache as a dict, and none of the
>> deprecated methods. Launchpad I can't branch, so I can't grep.
>
> What happens when you try?
>

I think he is on Windows, and Launchpad has symlinks in the source tree.

John
=:->


Revision history for this message
Martin Packman (gz) wrote :

> > Loggerhead just uses LRUCache as a dict, and none of the deprecated methods.
> Launchpad I can't branch, so I can't grep.
>
> What happens when you try?

Sorry, that was just me being lazy over the weekend. It was easy enough from John's suggestion to branch and inspect the Loggerhead source, but previous experience stopped me trying with Launchpad - I should have just asked someone to grep it for me.

The basic problem in the past has been that it takes a long time and a lot of resources, and the connection hangs after a while, and it's not possible to resume an interrupted branch. I could certainly try again at some point.

Revision history for this message
Robert Collins (lifeless) wrote :

On Mon, Oct 17, 2011 at 9:30 PM, Martin Packman
<email address hidden> wrote:
> The basic problem in the past has been that it takes a long time and a lot of resources, and the connection hangs after a while, and it's not possible to resume an interrupted branch. I could certainly try again at some point.

I highly recommend following the Running/LXC wiki page so that you
have a dev environment available: LP is one of the larger single
consumers of bzr, and being able to see what we do with it firsthand
(and poke at it) is invaluable.

-Rob

Revision history for this message
Martin Packman (gz) wrote :

Have just grepped the Launchpad source, via the kind assistance of allenap, and there are no uses of lru_cache outside loggerhead, which I've already checked.

Revision history for this message
Martin Packman (gz) wrote :

sent to pqm by email

Revision history for this message
Martin Packman (gz) wrote :

sent to pqm by email

Preview Diff

=== modified file 'bzrlib/chk_map.py'
--- bzrlib/chk_map.py	2011-05-20 14:46:02 +0000
+++ bzrlib/chk_map.py	2011-10-27 15:17:22 +0000
@@ -920,7 +920,7 @@
         bytes = ''.join(lines)
         if len(bytes) != self._current_size():
             raise AssertionError('Invalid _current_size')
-        _get_cache().add(self._key, bytes)
+        _get_cache()[self._key] = bytes
         return [self._key]
 
     def refs(self):
@@ -1193,7 +1193,7 @@
                 prefix, node_key_filter = keys[record.key]
                 node_and_filters.append((node, node_key_filter))
                 self._items[prefix] = node
-            _get_cache().add(record.key, bytes)
+            _get_cache()[record.key] = bytes
             for info in node_and_filters:
                 yield info
 
@@ -1319,7 +1319,7 @@
             lines.append(serialised[prefix_len:])
         sha1, _, _ = store.add_lines((None,), (), lines)
         self._key = StaticTuple("sha1:" + sha1,).intern()
-        _get_cache().add(self._key, ''.join(lines))
+        _get_cache()[self._key] = ''.join(lines)
         yield self._key
 
     def _search_key(self, key):
=== modified file 'bzrlib/lru_cache.py'
--- bzrlib/lru_cache.py	2010-07-15 13:05:40 +0000
+++ bzrlib/lru_cache.py	2011-10-27 15:17:22 +0000
@@ -17,6 +17,7 @@
 """A simple least-recently-used (LRU) cache."""
 
 from bzrlib import (
+    symbol_versioning,
     trace,
     )
 
@@ -25,18 +26,13 @@
 class _LRUNode(object):
     """This maintains the linked-list which is the lru internals."""
 
-    __slots__ = ('prev', 'next_key', 'key', 'value', 'cleanup', 'size')
+    __slots__ = ('prev', 'next_key', 'key', 'value')
 
-    def __init__(self, key, value, cleanup=None):
+    def __init__(self, key, value):
         self.prev = None
         self.next_key = _null_key
         self.key = key
         self.value = value
-        self.cleanup = cleanup
-        # TODO: We could compute this 'on-the-fly' like we used to, and remove
-        #       one pointer from this object, we just need to decide if it
-        #       actually costs us much of anything in normal usage
-        self.size = None
 
     def __repr__(self):
         if self.prev is None:
@@ -46,16 +42,6 @@
         return '%s(%r n:%r p:%r)' % (self.__class__.__name__, self.key,
                                      self.next_key, prev_key)
 
-    def run_cleanup(self):
-        try:
-            if self.cleanup is not None:
-                self.cleanup(self.key, self.value)
-        finally:
-            # cleanup might raise an exception, but we want to make sure
-            # to break refcycles, etc
-            self.cleanup = None
-            self.value = None
-
 
 class LRUCache(object):
     """A class which manages a cache of entries, removing unused ones."""
@@ -105,62 +91,23 @@
     def __len__(self):
         return len(self._cache)
 
-    def _walk_lru(self):
-        """Walk the LRU list, only meant to be used in tests."""
-        node = self._most_recently_used
-        if node is not None:
-            if node.prev is not None:
-                raise AssertionError('the _most_recently_used entry is not'
-                                     ' supposed to have a previous entry'
-                                     ' %s' % (node,))
-        while node is not None:
-            if node.next_key is _null_key:
-                if node is not self._least_recently_used:
-                    raise AssertionError('only the last node should have'
-                                         ' no next value: %s' % (node,))
-                node_next = None
-            else:
-                node_next = self._cache[node.next_key]
-                if node_next.prev is not node:
-                    raise AssertionError('inconsistency found, node.next.prev'
-                                         ' != node: %s' % (node,))
-            if node.prev is None:
-                if node is not self._most_recently_used:
-                    raise AssertionError('only the _most_recently_used should'
-                                         ' not have a previous node: %s'
-                                         % (node,))
-            else:
-                if node.prev.next_key != node.key:
-                    raise AssertionError('inconsistency found, node.prev.next'
-                                         ' != node: %s' % (node,))
-            yield node
-            node = node_next
-
+    @symbol_versioning.deprecated_method(
+        symbol_versioning.deprecated_in((2, 5, 0)))
     def add(self, key, value, cleanup=None):
-        """Add a new value to the cache.
-
-        Also, if the entry is ever removed from the cache, call
-        cleanup(key, value).
-
-        :param key: The key to store it under
-        :param value: The object to store
-        :param cleanup: None or a function taking (key, value) to indicate
-                        'value' should be cleaned up.
-        """
+        if cleanup is not None:
+            raise ValueError("Per-node cleanup functions no longer supported")
+        return self.__setitem__(key, value)
+
+    def __setitem__(self, key, value):
+        """Add a new value to the cache"""
         if key is _null_key:
             raise ValueError('cannot use _null_key as a key')
         if key in self._cache:
             node = self._cache[key]
-            try:
-                node.run_cleanup()
-            finally:
-                # Maintain the LRU properties, even if cleanup raises an
-                # exception
-                node.value = value
-                node.cleanup = cleanup
-                self._record_access(node)
+            node.value = value
+            self._record_access(node)
         else:
-            node = _LRUNode(key, value, cleanup=cleanup)
+            node = _LRUNode(key, value)
             self._cache[key] = node
             self._record_access(node)
 
@@ -190,10 +137,13 @@
         """
         return self._cache.keys()
 
-    def items(self):
-        """Get the key:value pairs as a dict."""
+    def as_dict(self):
+        """Get a new dict with the same key:value pairs as the cache"""
         return dict((k, n.value) for k, n in self._cache.iteritems())
 
+    items = symbol_versioning.deprecated_method(
+        symbol_versioning.deprecated_in((2, 5, 0)))(as_dict)
+
     def cleanup(self):
         """Clear the cache until it shrinks to the requested size.
 
@@ -204,10 +154,6 @@
         while len(self._cache) > self._after_cleanup_count:
             self._remove_lru()
 
-    def __setitem__(self, key, value):
-        """Add a value to the cache, there will be no cleanup function."""
-        self.add(key, value, cleanup=None)
-
     def _record_access(self, node):
         """Record that key was accessed."""
         # Move 'node' to the front of the queue
@@ -241,19 +187,14 @@
         # If we have removed all entries, remove the head pointer as well
         if self._least_recently_used is None:
             self._most_recently_used = None
-        try:
-            node.run_cleanup()
-        finally:
-            # cleanup might raise an exception, but we want to make sure to
-            # maintain the linked list
-            if node.prev is not None:
-                node.prev.next_key = node.next_key
-            if node.next_key is not _null_key:
-                node_next = self._cache[node.next_key]
-                node_next.prev = node.prev
-            # And remove this node's pointers
-            node.prev = None
-            node.next_key = _null_key
+        if node.prev is not None:
+            node.prev.next_key = node.next_key
+        if node.next_key is not _null_key:
+            node_next = self._cache[node.next_key]
+            node_next.prev = node.prev
+        # And remove this node's pointers
+        node.prev = None
+        node.next_key = _null_key
 
     def _remove_lru(self):
         """Remove one entry from the lru, and handle consequences.
@@ -316,17 +257,8 @@
         self._update_max_size(max_size, after_cleanup_size=after_cleanup_size)
         LRUCache.__init__(self, max_cache=max(int(max_size/512), 1))
 
-    def add(self, key, value, cleanup=None):
-        """Add a new value to the cache.
-
-        Also, if the entry is ever removed from the cache, call
-        cleanup(key, value).
-
-        :param key: The key to store it under
-        :param value: The object to store
-        :param cleanup: None or a function taking (key, value) to indicate
-                        'value' should be cleaned up.
-        """
+    def __setitem__(self, key, value):
+        """Add a new value to the cache"""
         if key is _null_key:
             raise ValueError('cannot use _null_key as a key')
         node = self._cache.get(key, None)
@@ -341,15 +273,12 @@
             if node is not None:
                 # We won't be replacing the old node, so just remove it
                 self._remove_node(node)
-                if cleanup is not None:
-                    cleanup(key, value)
             return
         if node is None:
-            node = _LRUNode(key, value, cleanup=cleanup)
+            node = _LRUNode(key, value)
             self._cache[key] = node
         else:
-            self._value_size -= node.size
-        node.size = value_len
+            self._value_size -= self._compute_size(node.value)
         self._value_size += value_len
         self._record_access(node)
 
@@ -368,7 +297,7 @@
             self._remove_lru()
 
     def _remove_node(self, node):
-        self._value_size -= node.size
+        self._value_size -= self._compute_size(node.value)
         LRUCache._remove_node(self, node)
 
     def resize(self, max_size, after_cleanup_size=None):
=== modified file 'bzrlib/tests/test_lru_cache.py'
--- bzrlib/tests/test_lru_cache.py 2011-05-13 12:51:05 +0000
+++ bzrlib/tests/test_lru_cache.py 2011-10-27 15:17:22 +0000
@@ -18,10 +18,43 @@
1818
19from bzrlib import (19from bzrlib import (
20 lru_cache,20 lru_cache,
21 symbol_versioning,
21 tests,22 tests,
22 )23 )
2324
2425
26def walk_lru(lru):
27 """Test helper to walk the LRU list and assert its consistency"""
28 node = lru._most_recently_used
29 if node is not None:
30 if node.prev is not None:
31 raise AssertionError('the _most_recently_used entry is not'
32 ' supposed to have a previous entry'
33 ' %s' % (node,))
34 while node is not None:
35 if node.next_key is lru_cache._null_key:
36 if node is not lru._least_recently_used:
37 raise AssertionError('only the last node should have'
38 ' no next value: %s' % (node,))
39 node_next = None
40 else:
41 node_next = lru._cache[node.next_key]
42 if node_next.prev is not node:
43 raise AssertionError('inconsistency found, node.next.prev'
44 ' != node: %s' % (node,))
45 if node.prev is None:
46 if node is not lru._most_recently_used:
47 raise AssertionError('only the _most_recently_used should'
48 ' not have a previous node: %s'
49 % (node,))
50 else:
51 if node.prev.next_key != node.key:
52 raise AssertionError('inconsistency found, node.prev.next'
53 ' != node: %s' % (node,))
54 yield node
55 node = node_next
56
57
25class TestLRUCache(tests.TestCase):58class TestLRUCache(tests.TestCase):
26 """Test that LRU cache properly keeps track of entries."""59 """Test that LRU cache properly keeps track of entries."""
2760
@@ -61,11 +94,12 @@
61 cache[None]94 cache[None]
62 cache[1]95 cache[1]
63 cache[None]96 cache[None]
64 self.assertEqual([None, 1], [n.key for n in cache._walk_lru()])97 self.assertEqual([None, 1], [n.key for n in walk_lru(cache)])
6598
66 def test_add__null_key(self):99 def test_add__null_key(self):
67 cache = lru_cache.LRUCache(max_cache=10)100 cache = lru_cache.LRUCache(max_cache=10)
68 self.assertRaises(ValueError, cache.add, lru_cache._null_key, 1)101 self.assertRaises(ValueError,
102 cache.__setitem__, lru_cache._null_key, 1)
69103
70 def test_overflow(self):104 def test_overflow(self):
71 """Adding extra entries will pop out old ones."""105 """Adding extra entries will pop out old ones."""
@@ -94,80 +128,12 @@
94128
95 self.assertFalse('foo' in cache)129 self.assertFalse('foo' in cache)
96130
97 def test_cleanup(self):131 def test_cleanup_function_deprecated(self):
98 """Test that we can use a cleanup function."""132 """Test that per-node cleanup functions are no longer allowed"""
99 cleanup_called = []133 cache = lru_cache.LRUCache()
100 def cleanup_func(key, val):134 self.assertRaises(ValueError, self.applyDeprecated,
101 cleanup_called.append((key, val))135 symbol_versioning.deprecated_in((2, 5, 0)),
102136 cache.add, "key", 1, cleanup=lambda: None)
103 cache = lru_cache.LRUCache(max_cache=2)
104
105 cache.add('baz', '1', cleanup=cleanup_func)
106 cache.add('foo', '2', cleanup=cleanup_func)
107 cache.add('biz', '3', cleanup=cleanup_func)
108
109 self.assertEqual([('baz', '1')], cleanup_called)
110
111 # 'foo' is now most recent, so final cleanup will call it last
112 cache['foo']
113 cache.clear()
114 self.assertEqual([('baz', '1'), ('biz', '3'), ('foo', '2')],
115 cleanup_called)
116
117 def test_cleanup_on_replace(self):
118 """Replacing an object should cleanup the old value."""
119 cleanup_called = []
120 def cleanup_func(key, val):
121 cleanup_called.append((key, val))
122
123 cache = lru_cache.LRUCache(max_cache=2)
124 cache.add(1, 10, cleanup=cleanup_func)
125 cache.add(2, 20, cleanup=cleanup_func)
126 cache.add(2, 25, cleanup=cleanup_func)
127
128 self.assertEqual([(2, 20)], cleanup_called)
129 self.assertEqual(25, cache[2])
130
131 # Even __setitem__ should make sure cleanup() is called
132 cache[2] = 26
133 self.assertEqual([(2, 20), (2, 25)], cleanup_called)
134
135 def test_cleanup_error_maintains_linked_list(self):
136 cleanup_called = []
137 def cleanup_func(key, val):
138 cleanup_called.append((key, val))
139 raise ValueError('failure during cleanup')
140
141 cache = lru_cache.LRUCache(max_cache=10)
142 for i in xrange(10):
143 cache.add(i, i, cleanup=cleanup_func)
144 for i in xrange(10, 20):
145 self.assertRaises(ValueError,
146 cache.add, i, i, cleanup=cleanup_func)
147
148 self.assertEqual([(i, i) for i in xrange(10)], cleanup_called)
149
150 self.assertEqual(range(19, 9, -1), [n.key for n in cache._walk_lru()])
151
152 def test_cleanup_during_replace_still_replaces(self):
153 cleanup_called = []
154 def cleanup_func(key, val):
155 cleanup_called.append((key, val))
156 raise ValueError('failure during cleanup')
157
158 cache = lru_cache.LRUCache(max_cache=10)
159 for i in xrange(10):
160 cache.add(i, i, cleanup=cleanup_func)
161 self.assertRaises(ValueError,
162 cache.add, 1, 20, cleanup=cleanup_func)
163 # We also still update the recent access to this node
164 self.assertEqual([1, 9, 8, 7, 6, 5, 4, 3, 2, 0],
165 [n.key for n in cache._walk_lru()])
166 self.assertEqual(20, cache[1])
167
168 self.assertEqual([(1, 1)], cleanup_called)
169 self.assertEqual([1, 9, 8, 7, 6, 5, 4, 3, 2, 0],
170 [n.key for n in cache._walk_lru()])
171137
172 def test_len(self):138 def test_len(self):
173 cache = lru_cache.LRUCache(max_cache=10, after_cleanup_count=10)139 cache = lru_cache.LRUCache(max_cache=10, after_cleanup_count=10)
@@ -197,21 +163,21 @@
197 # We hit the max163 # We hit the max
198 self.assertEqual(10, len(cache))164 self.assertEqual(10, len(cache))
199 self.assertEqual([11, 10, 9, 1, 8, 7, 6, 5, 4, 3],165 self.assertEqual([11, 10, 9, 1, 8, 7, 6, 5, 4, 3],
200 [n.key for n in cache._walk_lru()])166 [n.key for n in walk_lru(cache)])
201167
202 def test_cleanup_shrinks_to_after_clean_count(self):168 def test_cleanup_shrinks_to_after_clean_count(self):
203 cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=3)169 cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=3)
204170
205 cache.add(1, 10)171 cache[1] = 10
206 cache.add(2, 20)172 cache[2] = 20
207 cache.add(3, 25)173 cache[3] = 25
208 cache.add(4, 30)174 cache[4] = 30
209 cache.add(5, 35)175 cache[5] = 35
210176
211 self.assertEqual(5, len(cache))177 self.assertEqual(5, len(cache))
212 # This will bump us over the max, which causes us to shrink down to178 # This will bump us over the max, which causes us to shrink down to
213 # after_cleanup_cache size179 # after_cleanup_cache size
214 cache.add(6, 40)180 cache[6] = 40
215 self.assertEqual(3, len(cache))181 self.assertEqual(3, len(cache))
216182
217 def test_after_cleanup_larger_than_max(self):183 def test_after_cleanup_larger_than_max(self):
@@ -227,11 +193,11 @@
227 cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=2)193 cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=2)
228194
229 # Add these in order195 # Add these in order
230 cache.add(1, 10)196 cache[1] = 10
231 cache.add(2, 20)197 cache[2] = 20
232 cache.add(3, 25)198 cache[3] = 25
233 cache.add(4, 30)199 cache[4] = 30
234 cache.add(5, 35)200 cache[5] = 35
235201
236 self.assertEqual(5, len(cache))202 self.assertEqual(5, len(cache))
237 # Force a compaction203 # Force a compaction
@@ -242,33 +208,33 @@
242 cache = lru_cache.LRUCache(max_cache=5)208 cache = lru_cache.LRUCache(max_cache=5)
243209
244 # Add these in order210 # Add these in order
245 cache.add(1, 10)211 cache[1] = 10
246 cache.add(2, 20)212 cache[2] = 20
247 cache.add(3, 25)213 cache[3] = 25
248 cache.add(4, 30)214 cache[4] = 30
249 cache.add(5, 35)215 cache[5] = 35
250216
251 self.assertEqual([5, 4, 3, 2, 1], [n.key for n in cache._walk_lru()])217 self.assertEqual([5, 4, 3, 2, 1], [n.key for n in walk_lru(cache)])
252218
253 # Now access some randomly219 # Now access some randomly
254 cache[2]220 cache[2]
255 cache[5]221 cache[5]
256 cache[3]222 cache[3]
257 cache[2]223 cache[2]
258 self.assertEqual([2, 3, 5, 4, 1], [n.key for n in cache._walk_lru()])224 self.assertEqual([2, 3, 5, 4, 1], [n.key for n in walk_lru(cache)])
259225
260 def test_get(self):226 def test_get(self):
261 cache = lru_cache.LRUCache(max_cache=5)227 cache = lru_cache.LRUCache(max_cache=5)
262228
263 cache.add(1, 10)229 cache[1] = 10
264 cache.add(2, 20)230 cache[2] = 20
265 self.assertEqual(20, cache.get(2))231 self.assertEqual(20, cache.get(2))
266 self.assertIs(None, cache.get(3))232 self.assertIs(None, cache.get(3))
267 obj = object()233 obj = object()
268 self.assertIs(obj, cache.get(3, obj))234 self.assertIs(obj, cache.get(3, obj))
269 self.assertEqual([2, 1], [n.key for n in cache._walk_lru()])235 self.assertEqual([2, 1], [n.key for n in walk_lru(cache)])
270 self.assertEqual(10, cache.get(1))236 self.assertEqual(10, cache.get(1))
271 self.assertEqual([1, 2], [n.key for n in cache._walk_lru()])237 self.assertEqual([1, 2], [n.key for n in walk_lru(cache)])
272238
273 def test_keys(self):239 def test_keys(self):
274 cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=5)240 cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=5)
@@ -332,18 +298,19 @@

     def test_add__null_key(self):
         cache = lru_cache.LRUSizeCache()
-        self.assertRaises(ValueError, cache.add, lru_cache._null_key, 1)
+        self.assertRaises(ValueError,
+            cache.__setitem__, lru_cache._null_key, 1)

     def test_add_tracks_size(self):
         cache = lru_cache.LRUSizeCache()
         self.assertEqual(0, cache._value_size)
-        cache.add('my key', 'my value text')
+        cache['my key'] = 'my value text'
         self.assertEqual(13, cache._value_size)

     def test_remove_tracks_size(self):
         cache = lru_cache.LRUSizeCache()
         self.assertEqual(0, cache._value_size)
-        cache.add('my key', 'my value text')
+        cache['my key'] = 'my value text'
         self.assertEqual(13, cache._value_size)
         node = cache._cache['my key']
         cache._remove_node(node)
@@ -353,64 +320,48 @@
         """Adding a large value may not be cached at all."""
         cache = lru_cache.LRUSizeCache(max_size=10, after_cleanup_size=5)
         self.assertEqual(0, cache._value_size)
-        self.assertEqual({}, cache.items())
-        cache.add('test', 'key')
+        self.assertEqual({}, cache.as_dict())
+        cache['test'] = 'key'
         self.assertEqual(3, cache._value_size)
-        self.assertEqual({'test': 'key'}, cache.items())
-        cache.add('test2', 'key that is too big')
+        self.assertEqual({'test': 'key'}, cache.as_dict())
+        cache['test2'] = 'key that is too big'
         self.assertEqual(3, cache._value_size)
-        self.assertEqual({'test':'key'}, cache.items())
+        self.assertEqual({'test':'key'}, cache.as_dict())
         # If we would add a key, only to cleanup and remove all cached entries,
         # then obviously that value should not be stored
-        cache.add('test3', 'bigkey')
+        cache['test3'] = 'bigkey'
         self.assertEqual(3, cache._value_size)
-        self.assertEqual({'test':'key'}, cache.items())
+        self.assertEqual({'test':'key'}, cache.as_dict())

-        cache.add('test4', 'bikey')
+        cache['test4'] = 'bikey'
         self.assertEqual(3, cache._value_size)
-        self.assertEqual({'test':'key'}, cache.items())
-
-    def test_no_add_over_size_cleanup(self):
-        """If a large value is not cached, we will call cleanup right away."""
-        cleanup_calls = []
-        def cleanup(key, value):
-            cleanup_calls.append((key, value))
-
-        cache = lru_cache.LRUSizeCache(max_size=10, after_cleanup_size=5)
-        self.assertEqual(0, cache._value_size)
-        self.assertEqual({}, cache.items())
-        cache.add('test', 'key that is too big', cleanup=cleanup)
-        # key was not added
-        self.assertEqual(0, cache._value_size)
-        self.assertEqual({}, cache.items())
-        # and cleanup was called
-        self.assertEqual([('test', 'key that is too big')], cleanup_calls)
+        self.assertEqual({'test':'key'}, cache.as_dict())

     def test_adding_clears_cache_based_on_size(self):
         """The cache is cleared in LRU order until small enough"""
         cache = lru_cache.LRUSizeCache(max_size=20)
-        cache.add('key1', 'value') # 5 chars
-        cache.add('key2', 'value2') # 6 chars
-        cache.add('key3', 'value23') # 7 chars
+        cache['key1'] = 'value' # 5 chars
+        cache['key2'] = 'value2' # 6 chars
+        cache['key3'] = 'value23' # 7 chars
         self.assertEqual(5+6+7, cache._value_size)
         cache['key2'] # reference key2 so it gets a newer reference time
-        cache.add('key4', 'value234') # 8 chars, over limit
+        cache['key4'] = 'value234' # 8 chars, over limit
         # We have to remove 2 keys to get back under limit
         self.assertEqual(6+8, cache._value_size)
         self.assertEqual({'key2':'value2', 'key4':'value234'},
-                         cache.items())
+                         cache.as_dict())

     def test_adding_clears_to_after_cleanup_size(self):
         cache = lru_cache.LRUSizeCache(max_size=20, after_cleanup_size=10)
-        cache.add('key1', 'value') # 5 chars
-        cache.add('key2', 'value2') # 6 chars
-        cache.add('key3', 'value23') # 7 chars
+        cache['key1'] = 'value' # 5 chars
+        cache['key2'] = 'value2' # 6 chars
+        cache['key3'] = 'value23' # 7 chars
         self.assertEqual(5+6+7, cache._value_size)
         cache['key2'] # reference key2 so it gets a newer reference time
-        cache.add('key4', 'value234') # 8 chars, over limit
+        cache['key4'] = 'value234' # 8 chars, over limit
         # We have to remove 3 keys to get back under limit
         self.assertEqual(8, cache._value_size)
-        self.assertEqual({'key4':'value234'}, cache.items())
+        self.assertEqual({'key4':'value234'}, cache.as_dict())

     def test_custom_sizes(self):
         def size_of_list(lst):
@@ -418,23 +369,23 @@
         cache = lru_cache.LRUSizeCache(max_size=20, after_cleanup_size=10,
                                        compute_size=size_of_list)

-        cache.add('key1', ['val', 'ue']) # 5 chars
-        cache.add('key2', ['val', 'ue2']) # 6 chars
-        cache.add('key3', ['val', 'ue23']) # 7 chars
+        cache['key1'] = ['val', 'ue'] # 5 chars
+        cache['key2'] = ['val', 'ue2'] # 6 chars
+        cache['key3'] = ['val', 'ue23'] # 7 chars
         self.assertEqual(5+6+7, cache._value_size)
         cache['key2'] # reference key2 so it gets a newer reference time
-        cache.add('key4', ['value', '234']) # 8 chars, over limit
+        cache['key4'] = ['value', '234'] # 8 chars, over limit
         # We have to remove 3 keys to get back under limit
         self.assertEqual(8, cache._value_size)
-        self.assertEqual({'key4':['value', '234']}, cache.items())
+        self.assertEqual({'key4':['value', '234']}, cache.as_dict())

     def test_cleanup(self):
         cache = lru_cache.LRUSizeCache(max_size=20, after_cleanup_size=10)

         # Add these in order
-        cache.add('key1', 'value') # 5 chars
-        cache.add('key2', 'value2') # 6 chars
-        cache.add('key3', 'value23') # 7 chars
+        cache['key1'] = 'value' # 5 chars
+        cache['key2'] = 'value2' # 6 chars
+        cache['key3'] = 'value23' # 7 chars
         self.assertEqual(5+6+7, cache._value_size)

         cache.cleanup()

=== modified file 'doc/en/release-notes/bzr-2.5.txt'
--- doc/en/release-notes/bzr-2.5.txt 2011-10-25 12:50:34 +0000
+++ doc/en/release-notes/bzr-2.5.txt 2011-10-27 15:17:22 +0000
@@ -81,6 +81,9 @@

 * ``Branch.revision_history`` is now deprecated. (Jelmer Vernooij, #799519)

+* Methods ``add`` and ``items`` of ``LRUCache`` and ``LRUSizeCache`` are
+  deprecated. Use normal dict-style access instead. (Martin Packman)
+
 * New flag ``RepositoryFormat.supports_unreferenced_revisions`` which
   indicates whether revisions can be present in a repository without
   being referenced from e.g. a branch history at the same time.
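
For readers unfamiliar with the behaviour the updated tests exercise, here is a minimal sketch of a dict-style LRU cache built on `collections.OrderedDict`. It is NOT bzrlib's implementation (bzrlib.lru_cache keeps a doubly linked list of nodes), and the `MiniLRUCache` name is made up for this illustration; it only mirrors the observable behaviour the tests rely on: `cache[key] = value` assignment, recency refresh on access, eviction of the least recently used entries past `max_cache`, and an `as_dict` snapshot.

```python
from collections import OrderedDict


class MiniLRUCache:
    """Illustrative LRU cache: dict-style access, evicts least recently used.

    Not bzrlib's implementation; a sketch of the same observable behaviour.
    """

    def __init__(self, max_cache=100):
        self._max_cache = max_cache
        self._cache = OrderedDict()

    def __setitem__(self, key, value):
        # Inserting (or updating) a key makes it the most recently used.
        if key in self._cache:
            del self._cache[key]
        self._cache[key] = value
        # Evict from the least-recently-used end when over the limit.
        while len(self._cache) > self._max_cache:
            self._cache.popitem(last=False)

    def __getitem__(self, key):
        # A cache hit also refreshes the entry's recency.
        value = self._cache.pop(key)
        self._cache[key] = value
        return value

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def as_dict(self):
        # Mirrors the renamed test helper: a plain dict snapshot of the cache.
        return dict(self._cache)


cache = MiniLRUCache(max_cache=3)
cache[1] = 10
cache[2] = 20
cache[3] = 30
cache[1]        # touch key 1, so key 2 becomes the least recently used
cache[4] = 40   # over the limit: key 2 is evicted
print(sorted(cache.as_dict()))  # prints [1, 3, 4]
```

LRUSizeCache differs only in the eviction trigger: instead of counting entries, it sums `compute_size(value)` over the stored values and evicts in the same LRU order until the total drops back under the size threshold.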