Merge lp:~lifeless/bzr/apply-inventory-delta into lp:~bzr/bzr/trunk-old

Proposed by Robert Collins
Status: Merged
Merged at revision: not available
Proposed branch: lp:~lifeless/bzr/apply-inventory-delta
Merge into: lp:~bzr/bzr/trunk-old
Diff against target: 669 lines
To merge this branch: bzr merge lp:~lifeless/bzr/apply-inventory-delta
Reviewer Review Type Date Requested Status
Martin Pool Approve
Review via email: mp+8511@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Robert Collins (lifeless) wrote :

While my work on validating inventory deltas is not complete, this
particular patch is self contained and already has a number of
significant bugs fixed. In particular it catches silent corruption of
dirstate basis trees by malformed commits.

I'd like this reviewed at this point as landing it will may well improve
the user experience by turning fail-later experiences into fail-early,
making some other bugs easier to diagnose and fix. I have yet to check
that there aren't corruptions being caused unwittingly by the test suite
- however I'll do that concurrently with review.

-Rob

--

Revision history for this message
Robert Collins (lifeless) wrote :

It turns out there are some [small] issues. In particular adding these
guards exposes current live code exercised in the test suite that
generates inconsistent deltas. I'm fixing these now.

-Rob

Revision history for this message
Martin Pool (mbp) wrote :

18 + # Accumulate parent references (path and id), to check for parentless
19 + # items or items placed under files/links/tree-references.
20 + parents = set()

Please make the comment consistent with what's actually in the set, ie (path, id).

62 + except errors.BzrError, e:
63 + if 'integrity error' not in str(e):
64 + raise

I'd much rather that were an isinstance check, because people would tend to think of the string as something they can freely change in the UI without affecting the behaviour. Also it would let you make the except clause just catch that one type of error.

135 + # before starting to mutate the inventory, as there isn't a rollback
136 + # facility.
137 + list(_check_delta_unique_ids(_check_delta_unique_new_paths(
138 + _check_delta_unique_old_paths(_check_delta_ids_match_entry(
139 + delta)))))

That's very cute. It seems a bit like they should be consumers rather than pass-throughs, but I'm not sure if that would actually be any clearer, or easier to write in Python.

review: Approve
Revision history for this message
Robert Collins (lifeless) wrote :

I didn't realise that you'd commented when you +1'd on IRC, so these
aren't in the branch thats landing.

On Fri, 2009-07-10 at 05:14 +0000, Martin Pool wrote:
> Review: Approve
> 18 + # Accumulate parent references (path and id), to check for parentless
> 19 + # items or items placed under files/links/tree-references.
> 20 + parents = set()
>
> Please make the comment consistent with what's actually in the set, ie (path, id).

Will tweak.

> 62 + except errors.BzrError, e:
> 63 + if 'integrity error' not in str(e):
> 64 + raise
>
> I'd much rather that were an isinstance check, because people would tend to think of the string as something they can freely change in the UI without affecting the behaviour. Also it would let you make the except clause just catch that one type of error.

I wish whatever interface you reviewed in would do line wrapping :(.
Could you perhaps file a bug on code-reviews about this?

I'd prefer that too, but we have this tension about exception
proliferation; the exception being raised *actually is* a BzrError :(.

> 135 + # before starting to mutate the inventory, as there isn't a rollback
> 136 + # facility.
> 137 + list(_check_delta_unique_ids(_check_delta_unique_new_paths(
> 138 + _check_delta_unique_old_paths(_check_delta_ids_match_entry(
> 139 + delta)))))
>
> That's very cute. It seems a bit like they should be consumers rather than pass-throughs, but I'm not sure if that would actually be any clearer, or easier to write in Python.

Do you mean coroutines? That needs python 2.5 Or do you mean things that
do
for item in delta:
 ....

and don't yield? - such an arrangement can't stack, and means less
locality of reference - rather than applying N tests to each item, we'd
apply one test to all items N times.

-Rob

Revision history for this message
Martin Pool (mbp) wrote :

2009/7/10 Robert Collins <email address hidden>:
>> 62    + except errors.BzrError, e:
>> 63    + if 'integrity error' not in str(e):
>> 64    + raise
>>
>> I'd much rather that were an isinstance check, because people would tend to think of the string as something they can freely change in the UI without affecting the behaviour.  Also it would let you make the except clause just catch that one type of error.
>
> I wish whatever interface you reviewed in would do line wrapping :(.
> Could you perhaps file a bug on code-reviews about this?

I used the web ui. It looks ok to me: it's shown wrapped in the code
review page and in my mail client. Presumably you got this as mail.
Does it behave differently to Launchpad bug reports?

> I'd prefer that too, but we have this tension about exception
> proliferation; the exception being raised *actually is* a BzrError :(.

Couldn't we change them?

>
>> 135   + # before starting to mutate the inventory, as there isn't a rollback
>> 136   + # facility.
>> 137   + list(_check_delta_unique_ids(_check_delta_unique_new_paths(
>> 138   + _check_delta_unique_old_paths(_check_delta_ids_match_entry(
>> 139   + delta)))))
>>
>> That's very cute.  It seems a bit like they should be consumers rather than pass-throughs, but I'm not sure if that would actually be any clearer, or easier to write in Python.
>
> Do you mean coroutines? That needs python 2.5 Or do you mean things that
> do
> for item in delta:
>  ....
>
> and don't yield? - such an arrangement can't stack, and means less
> locality of reference - rather than applying N tests to each item, we'd
> apply one test to all items N times.

I'd like to say this:

  multi_map([check_delta_unique_new_paths, check_delta_unique_old_paths, ...],
     delta)

and have these closures all called for side effects one-by-one through
the iterator, allowing them to accumulate internal state as they go
along, without any idea that they can modify the data or that there is
a data dependency between them.

--
Martin <http://launchpad.net/~mbp/>

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin Pool wrote:
> 2009/7/10 Robert Collins <email address hidden>:
>>> 62 + except errors.BzrError, e:
>>> 63 + if 'integrity error' not in str(e):
>>> 64 + raise
>>>
>>> I'd much rather that were an isinstance check, because people would tend to think of the string as something they can freely change in the UI without affecting the behaviour. Also it would let you make the except clause just catch that one type of error.
>> I wish whatever interface you reviewed in would do line wrapping :(.
>> Could you perhaps file a bug on code-reviews about this?
>
> I used the web ui. It looks ok to me: it's shown wrapped in the code
> review page and in my mail client. Presumably you got this as mail.
> Does it behave differently to Launchpad bug reports?

I've filed a bug on it. It *doesn't* wrap and it makes most mail clients
wrap when reading, but *unwrap* when responding. Which is pretty
annoying to try and quote.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkpXdaoACgkQJdeBCYSNAANI0QCcDS0tIxGtVbs8BBR7PAsthmkb
kXYAn2G4kyIgiZdcO3spWas+rmf8IsE8
=AIyx
-----END PGP SIGNATURE-----

Revision history for this message
Robert Collins (lifeless) wrote :

On Fri, 2009-07-10 at 07:45 +0000, Martin Pool wrote:
>
> multi_map([check_delta_unique_new_paths,
> check_delta_unique_old_paths, ...],
> delta)
>
> and have these closures all called for side effects one-by-one through
> the iterator, allowing them to accumulate internal state as they go
> along, without any idea that they can modify the data or that there is
> a data dependency between them.

That would be nice. It requires considerably more code though (objects
rather than generators) and will likely be measurably slower, as
generators don't require new frames on every entrance.

-Rob

Revision history for this message
Martin Pool (mbp) wrote :

2009/7/13 Robert Collins <email address hidden>:
> That would be nice. It requires considerably more code though (objects
> rather than generators) and will likely be measurably slower, as
> generators don't require new frames on every entrance.

Yes that's pretty much what I concluded.

--
Martin <http://launchpad.net/~mbp/>

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/dirstate.py'
2--- bzrlib/dirstate.py 2009-06-22 14:32:48 +0000
3+++ bzrlib/dirstate.py 2009-07-10 03:35:16 +0000
4@@ -1285,7 +1285,8 @@
5 removals = {}
6 for old_path, new_path, file_id, inv_entry in sorted(delta, reverse=True):
7 if (file_id in insertions) or (file_id in removals):
8- raise AssertionError("repeated file id in delta %r" % (file_id,))
9+ raise errors.InconsistentDelta(old_path or new_path, file_id,
10+ "repeated file_id")
11 if old_path is not None:
12 old_path = old_path.encode('utf-8')
13 removals[file_id] = old_path
14@@ -1399,7 +1400,19 @@
15 # At the same time, to reduce interface friction we convert the input
16 # inventory entries to dirstate.
17 root_only = ('', '')
18+ # Accumulate parent references (path and id), to check for parentless
19+ # items or items placed under files/links/tree-references.
20+ parents = set()
21 for old_path, new_path, file_id, inv_entry in delta:
22+ if inv_entry is not None and file_id != inv_entry.file_id:
23+ raise errors.InconsistentDelta(new_path, file_id,
24+ "mismatched entry file_id %r" % inv_entry)
25+ if new_path is not None:
26+ new_path_utf8 = encode(new_path)
27+ # note the parent for validation
28+ dirname, basename = osutils.split(new_path_utf8)
29+ if basename:
30+ parents.add((dirname, inv_entry.parent_id))
31 if old_path is None:
32 adds.append((None, encode(new_path), file_id,
33 inv_to_entry(inv_entry), True))
34@@ -1420,7 +1433,6 @@
35 # for 'r' items on every pass.
36 self._update_basis_apply_deletes(deletes)
37 deletes = []
38- new_path_utf8 = encode(new_path)
39 # Split into an add/delete pair recursively.
40 adds.append((None, new_path_utf8, file_id,
41 inv_to_entry(inv_entry), False))
42@@ -1447,18 +1459,34 @@
43 (source_path, target_path, entry[0][2], None, False))
44 deletes.append(
45 (encode(old_path), new_path, file_id, None, False))
46+ # note the parent for validation
47+ dirname, basename = osutils.split(new_path)
48+ parents.add((dirname, inv_entry.parent_id))
49 else:
50 # changes to just the root should not require remove/insertion
51 # of everything.
52 changes.append((encode(old_path), encode(new_path), file_id,
53 inv_to_entry(inv_entry)))
54
55- # Finish expunging deletes/first half of renames.
56- self._update_basis_apply_deletes(deletes)
57- # Reinstate second half of renames and new paths.
58- self._update_basis_apply_adds(adds)
59- # Apply in-situ changes.
60- self._update_basis_apply_changes(changes)
61+ try:
62+ # Finish expunging deletes/first half of renames.
63+ self._update_basis_apply_deletes(deletes)
64+ # Reinstate second half of renames and new paths.
65+ self._update_basis_apply_adds(adds)
66+ # Apply in-situ changes.
67+ self._update_basis_apply_changes(changes)
68+ # Validate parents
69+ self._update_basis_check_parents(parents)
70+ except errors.BzrError, e:
71+ if 'integrity error' not in str(e):
72+ raise
73+ # _get_entry raises BzrError when a request is inconsistent; we
74+ # want such errors to be shown as InconsistentDelta - and that
75+ # fits the behaviour we trigger. Partof this is driven by dirstate
76+ # only supporting deltas that turn the basis into a closer fit to
77+ # the active tree.
78+ self._changes_aborted = True
79+ raise errors.InconsistentDeltaDelta(delta, "error from _get_entry.")
80
81 self._dirblock_state = DirState.IN_MEMORY_MODIFIED
82 self._header_state = DirState.IN_MEMORY_MODIFIED
83@@ -1573,6 +1601,22 @@
84 # it is being resurrected here, so blank it out temporarily.
85 self._dirblocks[block_index][1][entry_index][1][1] = null
86
87+ def _update_basis_check_parents(self, parents):
88+ """Check that parents required by the delta are all intact."""
89+ for dirname, file_id in parents:
90+ # Get the entry - the ensures that file_id, dirname exists and has
91+ # the right file id.
92+ entry = self._get_entry(1, file_id, dirname)
93+ if entry[1] is None:
94+ self._changes_aborted = True
95+ raise errors.InconsistentDelta(dirname, file_id,
96+ "This parent is not present.")
97+ # Parents of things must be directories
98+ if entry[1][1][0] != 'd':
99+ self._changes_aborted = True
100+ raise errors.InconsistentDelta(dirname, file_id,
101+ "This parent is not a directory.")
102+
103 def _observed_sha1(self, entry, sha1, stat_value,
104 _stat_to_minikind=_stat_to_minikind, _pack_stat=pack_stat):
105 """Note the sha1 of a file.
106
107=== modified file 'bzrlib/errors.py'
108--- bzrlib/errors.py 2009-06-30 15:54:23 +0000
109+++ bzrlib/errors.py 2009-07-10 03:35:16 +0000
110@@ -2162,6 +2162,18 @@
111 self.reason = reason
112
113
114+class InconsistentDeltaDelta(InconsistentDelta):
115+ """Used when we get a delta that is not valid."""
116+
117+ _fmt = ("An inconsistent delta was supplied: %(delta)r"
118+ "\nreason: %(reason)s")
119+
120+ def __init__(self, delta, reason):
121+ BzrError.__init__(self)
122+ self.delta = delta
123+ self.reason = reason
124+
125+
126 class UpgradeRequired(BzrError):
127
128 _fmt = "To use this feature you must upgrade your branch at %(path)s."
129
130=== modified file 'bzrlib/inventory.py'
131--- bzrlib/inventory.py 2009-07-02 23:10:53 +0000
132+++ bzrlib/inventory.py 2009-07-10 03:35:16 +0000
133@@ -1127,12 +1127,11 @@
134 """
135 # Check that the delta is legal. It would be nice if this could be
136 # done within the loops below but it's safer to validate the delta
137- # before starting to mutate the inventory.
138- unique_file_ids = set([f for _, _, f, _ in delta])
139- if len(unique_file_ids) != len(delta):
140- raise AssertionError("a file-id appears multiple times in %r"
141- % (delta,))
142- del unique_file_ids
143+ # before starting to mutate the inventory, as there isn't a rollback
144+ # facility.
145+ list(_check_delta_unique_ids(_check_delta_unique_new_paths(
146+ _check_delta_unique_old_paths(_check_delta_ids_match_entry(
147+ delta)))))
148
149 children = {}
150 # Remove all affected items which were in the original inventory,
151@@ -1167,7 +1166,11 @@
152 replacement.revision = new_entry.revision
153 replacement.children = children.pop(replacement.file_id, {})
154 new_entry = replacement
155- self.add(new_entry)
156+ try:
157+ self.add(new_entry)
158+ except AttributeError:
159+ raise errors.InconsistentDelta(new_path, new_entry.file_id,
160+ "Parent is not a directory.")
161 if len(children):
162 # Get the parent id that was deleted
163 parent_id, children = children.popitem()
164@@ -1265,13 +1268,13 @@
165 try:
166 parent = self._byid[entry.parent_id]
167 except KeyError:
168- raise BzrError("parent_id {%s} not in inventory" %
169- entry.parent_id)
170-
171+ raise errors.InconsistentDelta("<unknown>", entry.parent_id,
172+ "Parent not in inventory.")
173 if entry.name in parent.children:
174- raise BzrError("%s is already versioned" %
175- osutils.pathjoin(self.id2path(parent.file_id),
176- entry.name).encode('utf-8'))
177+ raise errors.InconsistentDelta(
178+ self.id2path(parent.children[entry.name].file_id),
179+ entry.file_id,
180+ "Path already versioned")
181 parent.children[entry.name] = entry
182 return self._add_child(entry)
183
184@@ -1619,6 +1622,18 @@
185 result.parent_id_basename_to_file_id = None
186 result.root_id = self.root_id
187 id_to_entry_delta = []
188+ # inventory_delta is only traversed once, so we just update the
189+ # variable.
190+ # Check for repeated file ids
191+ inventory_delta = _check_delta_unique_ids(inventory_delta)
192+ # Repeated old paths
193+ inventory_delta = _check_delta_unique_old_paths(inventory_delta)
194+ # Check for repeated new paths
195+ inventory_delta = _check_delta_unique_new_paths(inventory_delta)
196+ # Check for entries that don't match the fileid
197+ inventory_delta = _check_delta_ids_match_entry(inventory_delta)
198+ # All changed entries need to have their parents be directories.
199+ parents = set()
200 for old_path, new_path, file_id, entry in inventory_delta:
201 # file id changes
202 if new_path == '':
203@@ -1639,6 +1654,7 @@
204 # Update caches. It's worth doing this whether
205 # we're propagating the old caches or not.
206 result._path_to_fileid_cache[new_path] = file_id
207+ parents.add(entry.parent_id)
208 if old_path is None:
209 old_key = None
210 else:
211@@ -1664,6 +1680,14 @@
212 result.id_to_entry.apply_delta(id_to_entry_delta)
213 if parent_id_basename_delta:
214 result.parent_id_basename_to_file_id.apply_delta(parent_id_basename_delta)
215+ for parent in parents:
216+ try:
217+ if result[parent].kind != 'directory':
218+ raise errors.InconsistentDelta(result.id2path(parent), parent,
219+ 'Not a directory, but given children')
220+ except errors.NoSuchId:
221+ raise errors.InconsistentDelta("<unknown>", parent,
222+ "Parent is not present in resulting inventory.")
223 return result
224
225 @classmethod
226@@ -2072,3 +2096,64 @@
227 _NAME_RE = re.compile(r'^[^/\\]+$')
228
229 return bool(_NAME_RE.match(name))
230+
231+
232+def _check_delta_unique_ids(delta):
233+ """Decorate a delta and check that the file ids in it are unique.
234+
235+ :return: A generator over delta.
236+ """
237+ ids = set()
238+ for item in delta:
239+ length = len(ids) + 1
240+ ids.add(item[2])
241+ if len(ids) != length:
242+ raise errors.InconsistentDelta(item[0] or item[1], item[2],
243+ "repeated file_id")
244+ yield item
245+
246+
247+def _check_delta_unique_new_paths(delta):
248+ """Decorate a delta and check that the new paths in it are unique.
249+
250+ :return: A generator over delta.
251+ """
252+ paths = set()
253+ for item in delta:
254+ length = len(paths) + 1
255+ path = item[1]
256+ if path is not None:
257+ paths.add(path)
258+ if len(paths) != length:
259+ raise errors.InconsistentDelta(path, item[2], "repeated path")
260+ yield item
261+
262+
263+def _check_delta_unique_old_paths(delta):
264+ """Decorate a delta and check that the old paths in it are unique.
265+
266+ :return: A generator over delta.
267+ """
268+ paths = set()
269+ for item in delta:
270+ length = len(paths) + 1
271+ path = item[0]
272+ if path is not None:
273+ paths.add(path)
274+ if len(paths) != length:
275+ raise errors.InconsistentDelta(path, item[2], "repeated path")
276+ yield item
277+
278+
279+def _check_delta_ids_match_entry(delta):
280+ """Decorate a delta and check that the ids in it match the entry.file_id.
281+
282+ :return: A generator over delta.
283+ """
284+ for item in delta:
285+ entry = item[3]
286+ if entry is not None:
287+ if entry.file_id != item[2]:
288+ raise errors.InconsistentDelta(item[0] or item[1], item[2],
289+ "mismatched id with %r" % entry)
290+ yield item
291
292=== modified file 'bzrlib/tests/test_errors.py'
293--- bzrlib/tests/test_errors.py 2009-06-19 00:33:36 +0000
294+++ bzrlib/tests/test_errors.py 2009-07-10 03:35:16 +0000
295@@ -87,6 +87,12 @@
296 "reason: reason for foo",
297 str(error))
298
299+ def test_inconsistent_delta_delta(self):
300+ error = errors.InconsistentDeltaDelta([], 'reason')
301+ self.assertEqualDiff(
302+ "An inconsistent delta was supplied: []\nreason: reason",
303+ str(error))
304+
305 def test_in_process_transport(self):
306 error = errors.InProcessTransport('fpp')
307 self.assertEqualDiff(
308
309=== modified file 'bzrlib/tests/test_inv.py'
310--- bzrlib/tests/test_inv.py 2009-06-17 18:41:26 +0000
311+++ bzrlib/tests/test_inv.py 2009-07-10 03:35:16 +0000
312@@ -15,10 +15,311 @@
313 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
314
315
316-from bzrlib import errors, chk_map, inventory, osutils
317+from bzrlib import (
318+ chk_map,
319+ bzrdir,
320+ errors,
321+ inventory,
322+ osutils,
323+ repository,
324+ revision,
325+ )
326 from bzrlib.inventory import (CHKInventory, Inventory, ROOT_ID, InventoryFile,
327 InventoryDirectory, InventoryEntry, TreeReference)
328-from bzrlib.tests import TestCase, TestCaseWithTransport
329+from bzrlib.tests import (
330+ TestCase,
331+ TestCaseWithTransport,
332+ condition_isinstance,
333+ multiply_tests,
334+ split_suite_by_condition,
335+ )
336+from bzrlib.tests.workingtree_implementations import workingtree_formats
337+
338+
339+def load_tests(standard_tests, module, loader):
340+ """Parameterise some inventory tests."""
341+ to_adapt, result = split_suite_by_condition(standard_tests,
342+ condition_isinstance(TestDeltaApplication))
343+ scenarios = [
344+ ('Inventory', {'apply_delta':apply_inventory_Inventory}),
345+ ]
346+ # Working tree basis delta application
347+ # Repository add_inv_by_delta.
348+ # Reduce form of the per_repository test logic - that logic needs to be
349+ # be able to get /just/ repositories whereas these tests are fine with
350+ # just creating trees.
351+ formats = set()
352+ for _, format in repository.format_registry.iteritems():
353+ scenarios.append((str(format.__name__), {
354+ 'apply_delta':apply_inventory_Repository_add_inventory_by_delta,
355+ 'format':format}))
356+ for format in workingtree_formats():
357+ scenarios.append((str(format.__class__.__name__), {
358+ 'apply_delta':apply_inventory_WT_basis,
359+ 'format':format}))
360+ return multiply_tests(to_adapt, scenarios, result)
361+
362+
363+def apply_inventory_Inventory(self, basis, delta):
364+ """Apply delta to basis and return the result.
365+
366+ :param basis: An inventory to be used as the basis.
367+ :param delta: The inventory delta to apply:
368+ :return: An inventory resulting from the application.
369+ """
370+ basis.apply_delta(delta)
371+ return basis
372+
373+
374+def apply_inventory_WT_basis(self, basis, delta):
375+ """Apply delta to basis and return the result.
376+
377+ This sets the parent and then calls update_basis_by_delta.
378+ It also puts the basis in the repository under both 'basis' and 'result' to
379+ allow safety checks made by the WT to succeed, and finally ensures that all
380+ items in the delta with a new path are present in the WT before calling
381+ update_basis_by_delta.
382+
383+ :param basis: An inventory to be used as the basis.
384+ :param delta: The inventory delta to apply:
385+ :return: An inventory resulting from the application.
386+ """
387+ control = self.make_bzrdir('tree', format=self.format._matchingbzrdir)
388+ control.create_repository()
389+ control.create_branch()
390+ tree = self.format.initialize(control)
391+ tree.lock_write()
392+ try:
393+ repo = tree.branch.repository
394+ repo.start_write_group()
395+ try:
396+ rev = revision.Revision('basis', timestamp=0, timezone=None,
397+ message="", committer="foo@example.com")
398+ basis.revision_id = 'basis'
399+ repo.add_revision('basis', rev, basis)
400+ # Add a revision for the result, with the basis content -
401+ # update_basis_by_delta doesn't check that the delta results in
402+ # result, and we want inconsistent deltas to get called on the
403+ # tree, or else the code isn't actually checked.
404+ rev = revision.Revision('result', timestamp=0, timezone=None,
405+ message="", committer="foo@example.com")
406+ basis.revision_id = 'result'
407+ repo.add_revision('result', rev, basis)
408+ except:
409+ repo.abort_write_group()
410+ raise
411+ else:
412+ repo.commit_write_group()
413+ # Set the basis state as the trees current state
414+ tree._write_inventory(basis)
415+ # This reads basis from the repo and puts it into the tree's local
416+ # cache, if it has one.
417+ tree.set_parent_ids(['basis'])
418+ paths = {}
419+ parents = set()
420+ for old, new, id, entry in delta:
421+ if entry is None:
422+ continue
423+ paths[new] = (entry.file_id, entry.kind)
424+ parents.add(osutils.dirname(new))
425+ parents = osutils.minimum_path_selection(parents)
426+ parents.discard('')
427+ # Put place holders in the tree to permit adding the other entries.
428+ for pos, parent in enumerate(parents):
429+ if not tree.path2id(parent):
430+ # add a synthetic directory in the tree so we can can put the
431+ # tree0 entries in place for dirstate.
432+ tree.add([parent], ["id%d" % pos], ["directory"])
433+ if paths:
434+ # Many deltas may cause this mini-apply to fail, but we want to see what
435+ # the delta application code says, not the prep that we do to deal with
436+ # limitations of dirstate's update_basis code.
437+ for path, (file_id, kind) in sorted(paths.items()):
438+ try:
439+ tree.add([path], [file_id], [kind])
440+ except (KeyboardInterrupt, SystemExit):
441+ raise
442+ except:
443+ pass
444+ finally:
445+ tree.unlock()
446+ # Fresh lock, reads disk again.
447+ tree.lock_write()
448+ try:
449+ tree.update_basis_by_delta('result', delta)
450+ finally:
451+ tree.unlock()
452+ # reload tree - ensure we get what was written.
453+ tree = tree.bzrdir.open_workingtree()
454+ basis_tree = tree.basis_tree()
455+ basis_tree.lock_read()
456+ self.addCleanup(basis_tree.unlock)
457+ # Note, that if the tree does not have a local cache, the trick above of
458+ # setting the result as the basis, will come back to bite us. That said,
459+ # all the implementations in bzr do have a local cache.
460+ return basis_tree.inventory
461+
462+
463+def apply_inventory_Repository_add_inventory_by_delta(self, basis, delta):
464+ """Apply delta to basis and return the result.
465+
466+ This inserts basis as a whole inventory and then uses
467+ add_inventory_by_delta to add delta.
468+
469+ :param basis: An inventory to be used as the basis.
470+ :param delta: The inventory delta to apply:
471+ :return: An inventory resulting from the application.
472+ """
473+ format = self.format()
474+ control = self.make_bzrdir('tree', format=format._matchingbzrdir)
475+ repo = format.initialize(control)
476+ repo.lock_write()
477+ try:
478+ repo.start_write_group()
479+ try:
480+ rev = revision.Revision('basis', timestamp=0, timezone=None,
481+ message="", committer="foo@example.com")
482+ basis.revision_id = 'basis'
483+ repo.add_revision('basis', rev, basis)
484+ except:
485+ repo.abort_write_group()
486+ raise
487+ else:
488+ repo.commit_write_group()
489+ finally:
490+ repo.unlock()
491+ repo.lock_write()
492+ try:
493+ repo.start_write_group()
494+ try:
495+ inv_sha1 = repo.add_inventory_by_delta('basis', delta,
496+ 'result', ['basis'])
497+ except:
498+ repo.abort_write_group()
499+ raise
500+ else:
501+ repo.commit_write_group()
502+ finally:
503+ repo.unlock()
504+ # Fresh lock, reads disk again.
505+ repo = repo.bzrdir.open_repository()
506+ repo.lock_read()
507+ self.addCleanup(repo.unlock)
508+ return repo.get_inventory('result')
509+
510+
511+class TestDeltaApplication(TestCaseWithTransport):
512+
513+ def get_empty_inventory(self, reference_inv=None):
514+ """Get an empty inventory.
515+
516+ Note that tests should not depend on the revision of the root for
517+ setting up test conditions, as it has to be flexible to accomodate non
518+ rich root repositories.
519+
520+ :param reference_inv: If not None, get the revision for the root from
521+ this inventory. This is useful for dealing with older repositories
522+ that routinely discarded the root entry data. If None, the root's
523+ revision is set to 'basis'.
524+ """
525+ inv = inventory.Inventory()
526+ if reference_inv is not None:
527+ inv.root.revision = reference_inv.root.revision
528+ else:
529+ inv.root.revision = 'basis'
530+ return inv
531+
532+ def test_empty_delta(self):
533+ inv = self.get_empty_inventory()
534+ delta = []
535+ inv = self.apply_delta(self, inv, delta)
536+ inv2 = self.get_empty_inventory(inv)
537+ self.assertEqual([], inv2._make_delta(inv))
538+
539+ def test_repeated_file_id(self):
540+ inv = self.get_empty_inventory()
541+ file1 = inventory.InventoryFile('id', 'path1', inv.root.file_id)
542+ file1.revision = 'result'
543+ file1.text_size = 0
544+ file1.text_sha1 = ""
545+ file2 = inventory.InventoryFile('id', 'path2', inv.root.file_id)
546+ file2.revision = 'result'
547+ file2.text_size = 0
548+ file2.text_sha1 = ""
549+ delta = [(None, u'path1', 'id', file1), (None, u'path2', 'id', file2)]
550+ self.assertRaises(errors.InconsistentDelta, self.apply_delta, self,
551+ inv, delta)
552+
553+ def test_repeated_new_path(self):
554+ inv = self.get_empty_inventory()
555+ file1 = inventory.InventoryFile('id1', 'path', inv.root.file_id)
556+ file1.revision = 'result'
557+ file1.text_size = 0
558+ file1.text_sha1 = ""
559+ file2 = inventory.InventoryFile('id2', 'path', inv.root.file_id)
560+ file2.revision = 'result'
561+ file2.text_size = 0
562+ file2.text_sha1 = ""
563+ delta = [(None, u'path', 'id1', file1), (None, u'path', 'id2', file2)]
564+ self.assertRaises(errors.InconsistentDelta, self.apply_delta, self,
565+ inv, delta)
566+
567+ def test_repeated_old_path(self):
568+ inv = self.get_empty_inventory()
569+ file1 = inventory.InventoryFile('id1', 'path', inv.root.file_id)
570+ file1.revision = 'result'
571+ file1.text_size = 0
572+ file1.text_sha1 = ""
573+ # We can't *create* a source inventory with the same path, but
574+ # a badly generated partial delta might claim the same source twice.
575+ # This would be buggy in two ways: the path is repeated in the delta,
576+ # And the path for one of the file ids doesn't match the source
577+ # location. Alternatively, we could have a repeated fileid, but that
578+ # is separately checked for.
579+ file2 = inventory.InventoryFile('id2', 'path2', inv.root.file_id)
580+ file2.revision = 'result'
581+ file2.text_size = 0
582+ file2.text_sha1 = ""
583+ inv.add(file1)
584+ inv.add(file2)
585+ delta = [(u'path', None, 'id1', None), (u'path', None, 'id2', None)]
586+ self.assertRaises(errors.InconsistentDelta, self.apply_delta, self,
587+ inv, delta)
588+
589+ def test_mismatched_id_entry_id(self):
590+ inv = self.get_empty_inventory()
591+ file1 = inventory.InventoryFile('id1', 'path', inv.root.file_id)
592+ file1.revision = 'result'
593+ file1.text_size = 0
594+ file1.text_sha1 = ""
595+ delta = [(None, u'path', 'id', file1)]
596+ self.assertRaises(errors.InconsistentDelta, self.apply_delta, self,
597+ inv, delta)
598+
599+ def test_parent_is_not_directory(self):
600+ inv = self.get_empty_inventory()
601+ file1 = inventory.InventoryFile('id1', 'path', inv.root.file_id)
602+ file1.revision = 'result'
603+ file1.text_size = 0
604+ file1.text_sha1 = ""
605+ file2 = inventory.InventoryFile('id2', 'path2', 'id1')
606+ file2.revision = 'result'
607+ file2.text_size = 0
608+ file2.text_sha1 = ""
609+ inv.add(file1)
610+ delta = [(None, u'path/path2', 'id2', file2)]
611+ self.assertRaises(errors.InconsistentDelta, self.apply_delta, self,
612+ inv, delta)
613+
614+ def test_parent_is_missing(self):
615+ inv = self.get_empty_inventory()
616+ file2 = inventory.InventoryFile('id2', 'path2', 'missingparent')
617+ file2.revision = 'result'
618+ file2.text_size = 0
619+ file2.text_sha1 = ""
620+ delta = [(None, u'path/path2', 'id2', file2)]
621+ self.assertRaises(errors.InconsistentDelta, self.apply_delta, self,
622+ inv, delta)
623
624
625 class TestInventoryEntry(TestCase):
626
627=== modified file 'bzrlib/tests/workingtree_implementations/__init__.py'
628--- bzrlib/tests/workingtree_implementations/__init__.py 2009-07-06 21:13:30 +0000
629+++ bzrlib/tests/workingtree_implementations/__init__.py 2009-07-10 03:35:16 +0000
630@@ -59,6 +59,12 @@
631 return self.workingtree_format.initialize(made_control)
632
633
634+def workingtree_formats():
635+ """The known working tree formats."""
636+ return (workingtree.WorkingTreeFormat._formats.values() +
637+ workingtree._legacy_formats)
638+
639+
640 def load_tests(standard_tests, module, loader):
641 test_workingtree_implementations = [
642 'bzrlib.tests.workingtree_implementations.test_add_reference',
643@@ -106,8 +112,8 @@
644 # None here will cause a readonly decorator to be created
645 # by the TestCaseWithTransport.get_readonly_transport method.
646 None,
647- workingtree.WorkingTreeFormat._formats.values()
648- + workingtree._legacy_formats)
649+ workingtree_formats()
650+ )
651
652 # add the tests for the sub modules
653 return tests.multiply_tests(
654
655=== modified file 'bzrlib/xml8.py'
656--- bzrlib/xml8.py 2009-06-09 00:59:51 +0000
657+++ bzrlib/xml8.py 2009-07-10 03:35:16 +0000
658@@ -168,9 +168,9 @@
659 :raises: AssertionError if an error has occurred.
660 """
661 if inv.revision_id is None:
662- raise AssertionError()
663+ raise AssertionError("inv.revision_id is None")
664 if inv.root.revision is None:
665- raise AssertionError()
666+ raise AssertionError("inv.root.revision is None")
667
668 def _check_cache_size(self, inv_size, entry_cache):
669 """Check that the entry_cache is large enough.