Merge lp:~jelmer/bzr/inventorytree into lp:bzr

Proposed by Jelmer Vernooij
Status: Merged
Approved by: Andrew Bennetts
Approved revision: no longer in the source branch.
Merged at revision: 5781
Proposed branch: lp:~jelmer/bzr/inventorytree
Merge into: lp:bzr
Diff against target: 507 lines (+183/-119)
8 files modified
bzrlib/memorytree.py (+3/-1)
bzrlib/mutabletree.py (+1/-1)
bzrlib/revisiontree.py (+1/-1)
bzrlib/tests/per_workingtree/test_inv.py (+1/-5)
bzrlib/transform.py (+1/-1)
bzrlib/tree.py (+160/-105)
bzrlib/workingtree_4.py (+12/-5)
doc/en/release-notes/bzr-2.4.txt (+4/-0)
To merge this branch: bzr merge lp:~jelmer/bzr/inventorytree
Reviewer Review Type Date Requested Status
Andrew Bennetts Approve
Review via email: mp+57059@code.launchpad.net

Commit message

Split inventory-specific functionality out of Tree into InventoryTree.

Description of the change

Don't require Tree implementations to have an inventory object; this
moves the bits dependent on Inventory into a new InventoryTree class.

Inventories are either hard (code-wise and performance-wise) or impossible to
construct for foreign trees, and they should generally be an implementation
detail.

I'd like to make a similar change to split inventory-dependent code out of
working tree, which should make the base working tree a fair bit simpler
and make it easier to create subclasses that don't use inventories.

To post a comment you must log in.
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

This reflects the existing situation where some things already don't provide an inventory (or just an empty inventory), including PreviewTree, SvnWorkingTree, GitWorkingTree, GitRevisionTree, SvnRevisionTree.

Revision history for this message
Andrew Bennetts (spiv) wrote :

I see that Martin has already approved and sent to PQM <https://code.launchpad.net/~jelmer/bzr/mutableinventorytree/+merge/57061>, which depends on this change, so it's fortunate that this change looks good too :)

I wonder a little about how 3rd-party code copes with Trees without .inventory attributes. Probably poorly, I think we've occasionally seen bugs like that already with e.g. PreviewTrees. I guess there's not much we can do about that.

I think the docstring for InventoryTree ought to explain a bit more how to subclass it, e.g. like suggesting overriding the _inventory attribute or _get_inventory method. I don't think that's part of the interface is especially clear at the moment either, so this patch doesn't make things worse :)

It'd be good to have the API Changes section of release notes mention all the new classes. It's fine with me if that happens when you land the end of this series of changes rather than right away.

One PEP 8 nit: “class C(B1,B2):” ought to have a space after the comma: “class C(B1, B2):”

review: Approve
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

sent to pqm by email

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

sent to pqm by email

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

sent to pqm by email

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/memorytree.py'
2--- bzrlib/memorytree.py 2011-03-06 18:10:47 +0000
3+++ bzrlib/memorytree.py 2011-04-11 11:23:29 +0000
4@@ -26,6 +26,8 @@
5 errors,
6 mutabletree,
7 revision as _mod_revision,
8+ revisiontree,
9+ tree,
10 )
11 from bzrlib.decorators import needs_read_lock
12 from bzrlib.osutils import sha_file
13@@ -33,7 +35,7 @@
14 from bzrlib.transport.memory import MemoryTransport
15
16
17-class MemoryTree(mutabletree.MutableTree):
18+class MemoryTree(mutabletree.MutableTree, tree.InventoryTree):
19 """A MemoryTree is a specialisation of MutableTree.
20
21 It maintains nearly no state outside of read_lock and write_lock
22
23=== modified file 'bzrlib/mutabletree.py'
24--- bzrlib/mutabletree.py 2011-04-05 01:12:15 +0000
25+++ bzrlib/mutabletree.py 2011-04-11 11:23:29 +0000
26@@ -54,7 +54,7 @@
27 return tree_write_locked
28
29
30-class MutableTree(tree.Tree):
31+class MutableTree(tree.InventoryTree):
32 """A MutableTree is a specialisation of Tree which is able to be mutated.
33
34 Generally speaking these mutations are only possible within a lock_write
35
36=== modified file 'bzrlib/revisiontree.py'
37--- bzrlib/revisiontree.py 2010-11-29 23:35:13 +0000
38+++ bzrlib/revisiontree.py 2011-04-11 11:23:29 +0000
39@@ -25,7 +25,7 @@
40 )
41
42
43-class RevisionTree(tree.Tree):
44+class RevisionTree(tree.InventoryTree):
45 """Tree viewing a previous revision.
46
47 File text can be retrieved from the text store.
48
49=== modified file 'bzrlib/tests/per_workingtree/test_commit.py'
50=== modified file 'bzrlib/tests/per_workingtree/test_inv.py'
51--- bzrlib/tests/per_workingtree/test_inv.py 2009-07-10 07:14:02 +0000
52+++ bzrlib/tests/per_workingtree/test_inv.py 2011-04-11 11:23:29 +0000
53@@ -18,12 +18,8 @@
54
55
56 import os
57-import time
58
59-from bzrlib import (
60- errors,
61- inventory,
62- )
63+from bzrlib import inventory
64 from bzrlib.tests.per_workingtree import TestCaseWithWorkingTree
65
66
67
68=== modified file 'bzrlib/transform.py'
69--- bzrlib/transform.py 2011-04-08 12:28:05 +0000
70+++ bzrlib/transform.py 2011-04-11 11:23:29 +0000
71@@ -1932,7 +1932,7 @@
72 raise NotImplementedError(self.new_orphan)
73
74
75-class _PreviewTree(tree.Tree):
76+class _PreviewTree(tree.InventoryTree):
77 """Partial implementation of Tree to support show_diff_trees"""
78
79 def __init__(self, transform):
80
81=== modified file 'bzrlib/tree.py'
82--- bzrlib/tree.py 2011-04-09 15:16:09 +0000
83+++ bzrlib/tree.py 2011-04-11 11:23:29 +0000
84@@ -50,13 +50,6 @@
85
86 * `RevisionTree` is a tree as recorded at some point in the past.
87
88- Trees contain an `Inventory` object, and also know how to retrieve
89- file texts mentioned in the inventory, either from a working
90- directory or from a store.
91-
92- It is possible for trees to contain files that are not described
93- in their inventory or vice versa; for this use `filenames()`.
94-
95 Trees can be compared, etc, regardless of whether they are working
96 trees or versioned trees.
97 """
98@@ -128,13 +121,13 @@
99 raise NotImplementedError(self.has_filename)
100
101 def has_id(self, file_id):
102- return self.inventory.has_id(file_id)
103+ raise NotImplementedError(self.has_id)
104
105 def __contains__(self, file_id):
106 return self.has_id(file_id)
107
108 def has_or_had_id(self, file_id):
109- return self.inventory.has_id(file_id)
110+ raise NotImplementedError(self.has_or_had_id)
111
112 def is_ignored(self, filename):
113 """Check whether the filename is ignored by this tree.
114@@ -145,7 +138,8 @@
115 return False
116
117 def __iter__(self):
118- return iter(self.inventory)
119+ """Yield all file ids in this tree."""
120+ raise NotImplementedError(self.__iter__)
121
122 def all_file_ids(self):
123 """Iterate through all file ids, including ids for missing files."""
124@@ -156,9 +150,8 @@
125
126 :raises NoSuchId:
127 """
128- return self.inventory.id2path(file_id)
129+ raise NotImplementedError(self.id2path)
130
131- @needs_read_lock
132 def iter_entries_by_dir(self, specific_file_ids=None, yield_parents=False):
133 """Walk the tree in 'by_dir' order.
134
135@@ -187,8 +180,7 @@
136 down to specific_file_ids that have been requested. This has no
137 impact if specific_file_ids is None.
138 """
139- return self.inventory.iter_entries_by_dir(
140- specific_file_ids=specific_file_ids, yield_parents=yield_parents)
141+ raise NotImplementedError(self.iter_entries_by_dir)
142
143 def iter_references(self):
144 if self.supports_tree_reference():
145@@ -245,9 +237,6 @@
146 def _file_size(self, entry, stat_value):
147 raise NotImplementedError(self._file_size)
148
149- def _get_inventory(self):
150- return self._inventory
151-
152 def get_file(self, file_id, path=None):
153 """Return a file object for the file file_id in the tree.
154
155@@ -322,7 +311,7 @@
156 raise NotImplementedError(self.get_file_size)
157
158 def get_file_by_path(self, path):
159- return self.get_file(self._inventory.path2id(path), path)
160+ raise NotImplementedError(self.get_file_by_path)
161
162 def is_executable(self, file_id, path=None):
163 """Check if a file is executable.
164@@ -369,83 +358,6 @@
165 """
166 raise NotImplementedError(self.get_symlink_target)
167
168- def get_canonical_inventory_paths(self, paths):
169- """Like get_canonical_inventory_path() but works on multiple items.
170-
171- :param paths: A sequence of paths relative to the root of the tree.
172- :return: A list of paths, with each item the corresponding input path
173- adjusted to account for existing elements that match case
174- insensitively.
175- """
176- return list(self._yield_canonical_inventory_paths(paths))
177-
178- def get_canonical_inventory_path(self, path):
179- """Returns the first inventory item that case-insensitively matches path.
180-
181- If a path matches exactly, it is returned. If no path matches exactly
182- but more than one path matches case-insensitively, it is implementation
183- defined which is returned.
184-
185- If no path matches case-insensitively, the input path is returned, but
186- with as many path entries that do exist changed to their canonical
187- form.
188-
189- If you need to resolve many names from the same tree, you should
190- use get_canonical_inventory_paths() to avoid O(N) behaviour.
191-
192- :param path: A paths relative to the root of the tree.
193- :return: The input path adjusted to account for existing elements
194- that match case insensitively.
195- """
196- return self._yield_canonical_inventory_paths([path]).next()
197-
198- def _yield_canonical_inventory_paths(self, paths):
199- for path in paths:
200- # First, if the path as specified exists exactly, just use it.
201- if self.path2id(path) is not None:
202- yield path
203- continue
204- # go walkin...
205- cur_id = self.get_root_id()
206- cur_path = ''
207- bit_iter = iter(path.split("/"))
208- for elt in bit_iter:
209- lelt = elt.lower()
210- new_path = None
211- for child in self.iter_children(cur_id):
212- try:
213- # XXX: it seem like if the child is known to be in the
214- # tree, we shouldn't need to go from its id back to
215- # its path -- mbp 2010-02-11
216- #
217- # XXX: it seems like we could be more efficient
218- # by just directly looking up the original name and
219- # only then searching all children; also by not
220- # chopping paths so much. -- mbp 2010-02-11
221- child_base = os.path.basename(self.id2path(child))
222- if (child_base == elt):
223- # if we found an exact match, we can stop now; if
224- # we found an approximate match we need to keep
225- # searching because there might be an exact match
226- # later.
227- cur_id = child
228- new_path = osutils.pathjoin(cur_path, child_base)
229- break
230- elif child_base.lower() == lelt:
231- cur_id = child
232- new_path = osutils.pathjoin(cur_path, child_base)
233- except errors.NoSuchId:
234- # before a change is committed we can see this error...
235- continue
236- if new_path:
237- cur_path = new_path
238- else:
239- # got to the end of this directory and no entries matched.
240- # Return what matched so far, plus the rest as specified.
241- cur_path = osutils.pathjoin(cur_path, elt, *list(bit_iter))
242- break
243- yield cur_path
244- # all done.
245
246 def get_root_id(self):
247 """Return the file_id for the root of this tree."""
248@@ -513,6 +425,7 @@
249 @staticmethod
250 def _file_revision(revision_tree, file_id):
251 """Determine the revision associated with a file in a given tree."""
252+ # FIXME: Shouldn't this be a RevisionTree method?
253 revision_tree.lock_read()
254 try:
255 return revision_tree.inventory[file_id].revision
256@@ -537,9 +450,6 @@
257 vf.fallback_versionedfiles.append(base_vf)
258 return last_revision
259
260- inventory = property(_get_inventory,
261- doc="Inventory of this Tree")
262-
263 def _check_retrieved(self, ie, f):
264 if not __debug__:
265 return
266@@ -562,10 +472,9 @@
267 "file is actually %s" % fp['sha1'],
268 "store is probably damaged/corrupt"])
269
270- @needs_read_lock
271 def path2id(self, path):
272 """Return the id for path in this tree."""
273- return self._inventory.path2id(path)
274+ raise NotImplementedError(self.path2id)
275
276 def paths2ids(self, paths, trees=[], require_versioned=True):
277 """Return all the ids that can be reached by walking from paths.
278@@ -628,11 +537,7 @@
279
280 :return: set of paths.
281 """
282- # NB: we specifically *don't* call self.has_filename, because for
283- # WorkingTrees that can indicate files that exist on disk but that
284- # are not versioned.
285- pred = self.inventory.has_filename
286- return set((p for p in paths if not pred(p)))
287+ raise NotImplementedError(self.filter_unversioned_files)
288
289 def walkdirs(self, prefix=""):
290 """Walk the contents of this tree from path down.
291@@ -735,6 +640,156 @@
292 return searcher
293
294
295+class InventoryTree(Tree):
296+ """A tree that relies on an inventory for its metadata.
297+
298+ Trees contain an `Inventory` object, and also know how to retrieve
299+ file texts mentioned in the inventory, either from a working
300+ directory or from a store.
301+
302+ It is possible for trees to contain files that are not described
303+ in their inventory or vice versa; for this use `filenames()`.
304+
305+ Subclasses should set the _inventory attribute, which is considered
306+ private to external API users.
307+ """
308+
309+ def get_canonical_inventory_paths(self, paths):
310+ """Like get_canonical_inventory_path() but works on multiple items.
311+
312+ :param paths: A sequence of paths relative to the root of the tree.
313+ :return: A list of paths, with each item the corresponding input path
314+ adjusted to account for existing elements that match case
315+ insensitively.
316+ """
317+ return list(self._yield_canonical_inventory_paths(paths))
318+
319+ def get_canonical_inventory_path(self, path):
320+ """Returns the first inventory item that case-insensitively matches path.
321+
322+ If a path matches exactly, it is returned. If no path matches exactly
323+ but more than one path matches case-insensitively, it is implementation
324+ defined which is returned.
325+
326+ If no path matches case-insensitively, the input path is returned, but
327+ with as many path entries that do exist changed to their canonical
328+ form.
329+
330+ If you need to resolve many names from the same tree, you should
331+ use get_canonical_inventory_paths() to avoid O(N) behaviour.
332+
333+ :param path: A paths relative to the root of the tree.
334+ :return: The input path adjusted to account for existing elements
335+ that match case insensitively.
336+ """
337+ return self._yield_canonical_inventory_paths([path]).next()
338+
339+ def _yield_canonical_inventory_paths(self, paths):
340+ for path in paths:
341+ # First, if the path as specified exists exactly, just use it.
342+ if self.path2id(path) is not None:
343+ yield path
344+ continue
345+ # go walkin...
346+ cur_id = self.get_root_id()
347+ cur_path = ''
348+ bit_iter = iter(path.split("/"))
349+ for elt in bit_iter:
350+ lelt = elt.lower()
351+ new_path = None
352+ for child in self.iter_children(cur_id):
353+ try:
354+ # XXX: it seem like if the child is known to be in the
355+ # tree, we shouldn't need to go from its id back to
356+ # its path -- mbp 2010-02-11
357+ #
358+ # XXX: it seems like we could be more efficient
359+ # by just directly looking up the original name and
360+ # only then searching all children; also by not
361+ # chopping paths so much. -- mbp 2010-02-11
362+ child_base = os.path.basename(self.id2path(child))
363+ if (child_base == elt):
364+ # if we found an exact match, we can stop now; if
365+ # we found an approximate match we need to keep
366+ # searching because there might be an exact match
367+ # later.
368+ cur_id = child
369+ new_path = osutils.pathjoin(cur_path, child_base)
370+ break
371+ elif child_base.lower() == lelt:
372+ cur_id = child
373+ new_path = osutils.pathjoin(cur_path, child_base)
374+ except errors.NoSuchId:
375+ # before a change is committed we can see this error...
376+ continue
377+ if new_path:
378+ cur_path = new_path
379+ else:
380+ # got to the end of this directory and no entries matched.
381+ # Return what matched so far, plus the rest as specified.
382+ cur_path = osutils.pathjoin(cur_path, elt, *list(bit_iter))
383+ break
384+ yield cur_path
385+ # all done.
386+
387+ def _get_inventory(self):
388+ return self._inventory
389+
390+ inventory = property(_get_inventory,
391+ doc="Inventory of this Tree")
392+
393+ @needs_read_lock
394+ def path2id(self, path):
395+ """Return the id for path in this tree."""
396+ return self._inventory.path2id(path)
397+
398+ def id2path(self, file_id):
399+ """Return the path for a file id.
400+
401+ :raises NoSuchId:
402+ """
403+ return self.inventory.id2path(file_id)
404+
405+ def has_id(self, file_id):
406+ return self.inventory.has_id(file_id)
407+
408+ def has_or_had_id(self, file_id):
409+ return self.inventory.has_id(file_id)
410+
411+ def __iter__(self):
412+ return iter(self.inventory)
413+
414+ def filter_unversioned_files(self, paths):
415+ """Filter out paths that are versioned.
416+
417+ :return: set of paths.
418+ """
419+ # NB: we specifically *don't* call self.has_filename, because for
420+ # WorkingTrees that can indicate files that exist on disk but that
421+ # are not versioned.
422+ pred = self.inventory.has_filename
423+ return set((p for p in paths if not pred(p)))
424+
425+ @needs_read_lock
426+ def iter_entries_by_dir(self, specific_file_ids=None, yield_parents=False):
427+ """Walk the tree in 'by_dir' order.
428+
429+ This will yield each entry in the tree as a (path, entry) tuple.
430+ The order that they are yielded is:
431+
432+ See Tree.iter_entries_by_dir for details.
433+
434+ :param yield_parents: If True, yield the parents from the root leading
435+ down to specific_file_ids that have been requested. This has no
436+ impact if specific_file_ids is None.
437+ """
438+ return self.inventory.iter_entries_by_dir(
439+ specific_file_ids=specific_file_ids, yield_parents=yield_parents)
440+
441+ def get_file_by_path(self, path):
442+ return self.get_file(self._inventory.path2id(path), path)
443+
444+
445 ######################################################################
446 # diff
447
448
449=== modified file 'bzrlib/workingtree_4.py'
450--- bzrlib/workingtree_4.py 2011-04-08 09:54:43 +0000
451+++ bzrlib/workingtree_4.py 2011-04-11 11:23:29 +0000
452@@ -60,12 +60,19 @@
453 safe_unicode,
454 )
455 from bzrlib.transport.local import LocalTransport
456-from bzrlib.tree import InterTree
457-from bzrlib.tree import Tree
458-from bzrlib.workingtree import WorkingTree, WorkingTree3, WorkingTreeFormat3
459+from bzrlib.tree import (
460+ InterTree,
461+ InventoryTree,
462+ )
463+from bzrlib.workingtree import (
464+ WorkingTree,
465+ WorkingTree3,
466+ WorkingTreeFormat3,
467+ )
468
469
470 class DirStateWorkingTree(WorkingTree3):
471+
472 def __init__(self, basedir,
473 branch,
474 _control_files=None,
475@@ -1250,7 +1257,7 @@
476 def rename_one(self, from_rel, to_rel, after=False):
477 """See WorkingTree.rename_one"""
478 self.flush()
479- WorkingTree.rename_one(self, from_rel, to_rel, after)
480+ super(DirStateWorkingTree, self).rename_one(from_rel, to_rel, after)
481
482 @needs_tree_write_lock
483 def apply_inventory_delta(self, changes):
484@@ -1604,7 +1611,7 @@
485 return True
486
487
488-class DirStateRevisionTree(Tree):
489+class DirStateRevisionTree(InventoryTree):
490 """A revision tree pulling the inventory from a dirstate.
491
492 Note that this is one of the historical (ie revision) trees cached in the
493
494=== modified file 'doc/en/release-notes/bzr-2.4.txt'
495--- doc/en/release-notes/bzr-2.4.txt 2011-04-08 22:51:59 +0000
496+++ doc/en/release-notes/bzr-2.4.txt 2011-04-11 11:23:29 +0000
497@@ -95,6 +95,10 @@
498 mysteriously silent.)
499 (Martin Pool)
500
501+* Inventory-specific functionality has been split out of ``Tree`` into
502+ a new ``InventoryTree`` class. Tree instances no longer
503+ necessarily provide an ``inventory`` attribute. (Jelmer Vernooij)
504+
505 * New method ``Hooks.uninstall_named_hook``. (Jelmer Vernooij, #301472)
506
507 * ``revision_graph_can_have_wrong_parents`` is now an attribute