Merge lp:~jameinel/bzr/1.15-gc-stacking into lp:~bzr/bzr/trunk-old

Proposed by John A Meinel on 2009-05-28
Status: Superseded
Proposed branch: lp:~jameinel/bzr/1.15-gc-stacking
Merge into: lp:~bzr/bzr/trunk-old
Diff against target: 2011 lines (has conflicts)
Text conflict in NEWS
To merge this branch: bzr merge lp:~jameinel/bzr/1.15-gc-stacking
Reviewer Review Type Date Requested Status
Andrew Bennetts 2009-05-28 Needs Fixing on 2009-05-29
Review via email: mp+6852@code.launchpad.net

This proposal has been superseded by a proposal from 2009-05-29.

John A Meinel (jameinel) wrote :

This change enables --development6-rich-root to stack. It ends up including the Repository fallback locking fixes, and a few other code cleanups that we encountered along the way.

Unfortunately, it adds a somewhat more direct coupling between PackRepository and PackRepository.revisions._index._key_dependencies.

We already had an explicit connection, because get_missing_parent_inventories() was accessing that variable directly. What we added is a reset() of that cache whenever a write group is committed, aborted, or suspended. We felt that this was the 'right thing', but it was also required to fix a test about ghosts.

(We had a test that ghosts aren't filled in, but without resetting the key dependencies, an earlier commit that introduced the ghost still records that the ghost is missing. Existing Pack fetching would suffer from this as well if it used the Stream code for fetching rather than Pack => Pack.)
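A minimal, self-contained sketch of why that reset matters. KeyDeps below is an illustrative stand-in, not bzrlib's actual _KeyRefs class: without clearing the per-index dependency cache when a write group ends, a ghost parent recorded by an earlier commit still shows up as "missing" in a later operation.

```python
class KeyDeps:
    """Toy key-dependency tracker (hypothetical, not bzrlib's _KeyRefs)."""

    def __init__(self):
        self._unsatisfied = set()  # parents referenced but never added
        self._satisfied = set()    # keys actually added to the index

    def add_references(self, key, parents):
        # Adding a key satisfies any earlier reference to it...
        self._satisfied.add(key)
        self._unsatisfied.discard(key)
        # ...and records its parents as pending unless already seen.
        for parent in parents:
            if parent not in self._satisfied:
                self._unsatisfied.add(parent)

    def get_unsatisfied_refs(self):
        return set(self._unsatisfied)

    def reset(self):
        self._unsatisfied.clear()
        self._satisfied.clear()


deps = KeyDeps()
# First write group: commit 'rev-2', whose parent 'ghost' is deliberately
# absent (the "ghosts aren't filled in" scenario).
deps.add_references('rev-2', ['ghost'])
assert deps.get_unsatisfied_refs() == {'ghost'}
# Without a reset, a later write group would still report 'ghost' as
# missing and try to fill it in.  Resetting on commit/abort/suspend
# clears the stale state:
deps.reset()
assert deps.get_unsatisfied_refs() == set()
```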

This is potentially up for backporting to a 1.15.1 release.

Andrew Bennetts (spiv) wrote :

It's a shame that you add both "_find_present_inventory_ids" and "_find_present_inventories" to groupcompress_repo.py, but it's not trivial to factor out that duplication. Similarly, around line 950 of that file you have a duplication of the logic of find_parent_ids_of_revisions, but again reusing that code isn't trivial. Something to cleanup in the future I guess...

In test_sprout_from_stacked_with_short_history in bzrlib/tests/per_repository_reference/test_fetch.py you start with a comment saying "Now copy this ...", which is a bit weird as the first thing in a test. Probably this comment wasn't updated when you refactored the test? Anyway, please update it.

for record in stream:
    records.append(record.key)
    if record.key == ('a-id', 'A-id'):
        self.assertEqual(''.join(content[:-2]),
            record.get_bytes_as('fulltext'))
    elif record.key == ('a-id', 'B-id'):
        self.assertEqual(''.join(content[:-1]),
            record.get_bytes_as('fulltext'))
    elif record.key == ('a-id', 'C-id'):
        self.assertEqual(''.join(content),
            record.get_bytes_as('fulltext'))
    else:
        self.fail('Unexpected record: %s' % (record.key,))

This is ok, but I think I'd rather:

for record in stream:
    records.append((record.key, record.get_bytes_as('fulltext')))
records.sort()
self.assertEqual(
    [(('a-id', 'A-id'), ''.join(content[:-2])), (('a-id', 'B-id'), ''.join(content[:-1])),
     (('a-id', 'C-id'), ''.join(content))],
    records)

That is more compact, needs no conditionals in the test, and will probably give more informative failures.
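Andrew's sort-and-compare pattern, made self-contained for illustration (FakeRecord is a stand-in for bzrlib's record objects; the keys and content mirror the test being discussed): collecting (key, text) pairs and sorting makes the comparison order-independent, and a failure shows both full lists instead of one mismatched text.

```python
class FakeRecord:
    """Stand-in for a bzrlib stream record (hypothetical, for illustration)."""
    def __init__(self, key, text):
        self.key = key
        self._text = text

    def get_bytes_as(self, kind):
        return self._text


content = ['line-%d\n' % i for i in range(5)]
# Stream arrives in arbitrary order:
stream = [FakeRecord(('a-id', 'C-id'), ''.join(content)),
          FakeRecord(('a-id', 'A-id'), ''.join(content[:-2])),
          FakeRecord(('a-id', 'B-id'), ''.join(content[:-1]))]

records = []
for record in stream:
    records.append((record.key, record.get_bytes_as('fulltext')))
records.sort()

expected = [(('a-id', 'A-id'), ''.join(content[:-2])),
            (('a-id', 'B-id'), ''.join(content[:-1])),
            (('a-id', 'C-id'), ''.join(content))]
# In a TestCase this would be self.assertEqual(expected, records):
assert records == expected
```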

bzrlib/tests/per_repository_reference/test_initialize.py adds a test with no assert* calls. Is that intentional?

In bzrlib/tests/test_pack_repository.py, test_resume_chk_bytes has a line of unreachable code after a raise statement.

In bzrlib/tests/test_repository.py, is the typo in 'abcdefghijklmnopqrstuvwxzy123456789' meant to be a test to see how attentive your reviewer is? ;)

Other than those, this seems fine to me.

review: Needs Fixing
John A Meinel (jameinel) wrote :


Andrew Bennetts wrote:
> Review: Needs Fixing
> It's a shame that you add both "_find_present_inventory_ids" and "_find_present_inventories" to groupcompress_repo.py, but it's not trivial to factor out that duplication. Similarly, around line 950 of that file you have a duplication of the logic of find_parent_ids_of_revisions, but again reusing that code isn't trivial. Something to cleanup in the future I guess...
>

_find_present_inventory_ids and _find_present_inventories are actually
interchangeable; it is just
self.from_repository._find_present_inventory_ids
rather than
self._find_present_inventories.

I'm glad you caught the duplication.

And _find_parent_ids_of_revisions() is likewise available as
self.from_repository....

Mostly because this is GroupCHKStreamSource, which can assume that it has
a RepositoryCHK1 as its .from_repository.

Ultimately, we should probably move those functions onto Repository,
and potentially make them public. I don't really like widening the
Repository API, but since the default implementation works just fine
for all other implementations, it doesn't impose much of a burden on
something like SVNRepository.
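A sketch of what such a default implementation on Repository could look like. FakeRepo and the public method name are illustrative (the real helper is private in groupcompress_repo.py): the point is that it only needs get_parent_map, so every repository implementation would inherit it for free.

```python
NULL_REVISION = 'null:'  # bzrlib's sentinel for the empty revision


class FakeRepo:
    """Toy repository exposing only get_parent_map, for illustration."""

    def __init__(self, ancestry):
        self._ancestry = ancestry  # revid -> tuple of parent revids

    def get_parent_map(self, revision_ids):
        # Like bzrlib: silently omits ghosts (ids we know nothing about).
        return {r: self._ancestry[r] for r in revision_ids
                if r in self._ancestry}

    def find_parent_ids_of_revisions(self, revision_ids):
        """Return parents of revision_ids that are outside the set itself."""
        parent_map = self.get_parent_map(revision_ids)
        parents = set()
        for parent_ids in parent_map.values():
            parents.update(parent_ids)
        parents.difference_update(revision_ids)
        parents.discard(NULL_REVISION)
        return parents


repo = FakeRepo({'A': (NULL_REVISION,),
                 'B': ('A',),
                 'C': ('B', 'X')})  # 'X' is a ghost parent
# Fetching B and C: only A and the ghost X sit at the edge of the set.
assert repo.find_parent_ids_of_revisions({'B', 'C'}) == {'A', 'X'}
```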

> In test_sprout_from_stacked_with_short_history in bzrlib/tests/per_repository_reference/test_fetch.py you start with a comment saying "Now copy this ...", which is a bit weird as the first thing in a test. Probably this comment hasn't been updated after you refactored the test? Anyway, please update it.

Done.

...

> for record in stream:
> records.append((record.key, record.get_bytes_as('fulltext')))
> records.sort()
> self.assertEqual(
> [(('a-id', 'A-id'), ''.join(content[:-2])), (('a-id', 'B-id'), ''.join(content[:-1])),
> (('a-id', 'C-id'), ''.join(content))],
> records)
>
> Which is more compact and doesn't have any need for conditionals in the test, and will probably give more informative failures.

Done.

>
> bzrlib/tests/per_repository_reference/test_initialize.py adds a test with no assert* calls. Is that intentional?
>

It exercises the code that was broken by doing things differently (as
in, you would get an exception). I can add arbitrary assertions, but
the reason for the test was to have a simple call to
initialize_on_transport_ex() across all repository formats, remote
requests, etc.

I'll add some basic assertions, just to make it look like a real test. I'll
even add one that checks we can initialize all formats over the smart server.
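The pattern being described can be sketched like this. initialize_on_transport_ex and the format names here are hypothetical stand-ins, not the real BzrDir API: completing without an exception is the actual regression check, and a small assertion on the result makes the test self-documenting.

```python
import unittest


def initialize_on_transport_ex(format_name):
    # Stand-in for the real call; the original bug made it raise for
    # non-default formats.
    if not format_name:
        raise ValueError('unknown format')
    return {'format': format_name}


class TestInitializeSmoke(unittest.TestCase):

    def test_initialize_all_formats(self):
        for fmt in ['pack-0.92', '1.9-rich-root', 'development6-rich-root']:
            # The call completing without raising is the regression check.
            result = initialize_on_transport_ex(fmt)
            # A basic assertion so the test reads as a real test:
            self.assertEqual(fmt, result['format'])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestInitializeSmoke)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```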

> In bzrlib/tests/test_pack_repository.py, test_resume_chk_bytes has a line of unreachable code after a raise statement.
>
> In bzrlib/tests/test_repository.py, is the typo in 'abcdefghijklmnopqrstuvwxzy123456789' meant to be a test to see how attentive your reviewer is? ;)
>
> Other than those, this seems fine to me though.

Fixed.
John
=:->

lp:~jameinel/bzr/1.15-gc-stacking updated on 2009-05-29
4378. By John A Meinel on 2009-05-28

Remove some debugging note() and mutter() calls.

4379. By John A Meinel on 2009-05-29

Some review feedback from Andrew.

Also found that BzrDir.initialize_on_transport_ex(Remote) is actually quite
broken for non-default formats.

4380. By John A Meinel on 2009-05-29

A bit more review feedback from Andrew.

4381. By John A Meinel on 2009-05-29

Add NEWS entry for fixing bug #373455

4382. By John A Meinel on 2009-05-29

Merge bzr.dev

Unmerged revisions

Preview Diff

1=== modified file 'NEWS'
2--- NEWS 2009-05-28 18:56:55 +0000
3+++ NEWS 2009-05-29 00:35:25 +0000
4@@ -28,9 +28,17 @@
5 Bug Fixes
6 *********
7
8+<<<<<<< TREE
9 * Better message in ``bzr add`` output suggesting using ``bzr ignored`` to
10 see which files can also be added. (Jason Spashett, #76616)
11
12+=======
13+* Clarify the rules for locking and fallback repositories. Fix bugs in how
14+ ``RemoteRepository`` was handling fallbacks along with the
15+ ``_real_repository``. (Andrew Bennetts, John Arbash Meinel, #375496)
16+
17+
18+>>>>>>> MERGE-SOURCE
19 Documentation
20 *************
21
22@@ -76,6 +84,15 @@
23 New Features
24 ************
25
26+* New command ``bzr dpush`` that can push changes to foreign
27+ branches (svn, git) without setting custom bzr-specific metadata.
28+ (Jelmer Vernooij)
29+
30+* The new development format ``--development6-rich-root`` now supports
31+ stacking. We chose not to use a new format marker, since old clients
32+ will just fail to open stacked branches, the same as if we used a new
33+ format flag. (John Arbash Meinel, #373455)
34+
35 * Plugins can now define their own annotation tie-breaker when two revisions
36 introduce the exact same line. See ``bzrlib.annotate._break_annotation_tie``
37 Be aware though that this is temporary, private (as indicated by the leading
38
39=== modified file 'bzrlib/branch.py'
40--- bzrlib/branch.py 2009-05-26 20:32:34 +0000
41+++ bzrlib/branch.py 2009-05-29 00:35:25 +0000
42@@ -101,13 +101,9 @@
43 def _open_hook(self):
44 """Called by init to allow simpler extension of the base class."""
45
46- def _activate_fallback_location(self, url, lock_style):
47+ def _activate_fallback_location(self, url):
48 """Activate the branch/repository from url as a fallback repository."""
49 repo = self._get_fallback_repository(url)
50- if lock_style == 'write':
51- repo.lock_write()
52- elif lock_style == 'read':
53- repo.lock_read()
54 self.repository.add_fallback_repository(repo)
55
56 def break_lock(self):
57@@ -656,7 +652,7 @@
58 self.repository.fetch(source_repository, revision_id,
59 find_ghosts=True)
60 else:
61- self._activate_fallback_location(url, 'write')
62+ self._activate_fallback_location(url)
63 # write this out after the repository is stacked to avoid setting a
64 # stacked config that doesn't work.
65 self._set_config_location('stacked_on_location', url)
66@@ -2370,7 +2366,7 @@
67 raise AssertionError(
68 "'transform_fallback_location' hook %s returned "
69 "None, not a URL." % hook_name)
70- self._activate_fallback_location(url, None)
71+ self._activate_fallback_location(url)
72
73 def __init__(self, *args, **kwargs):
74 self._ignore_fallbacks = kwargs.get('ignore_fallbacks', False)
75
76=== modified file 'bzrlib/bzrdir.py'
77--- bzrlib/bzrdir.py 2009-05-23 04:55:52 +0000
78+++ bzrlib/bzrdir.py 2009-05-29 00:35:25 +0000
79@@ -3177,6 +3177,8 @@
80 remote_repo.dont_leave_lock_in_place()
81 else:
82 remote_repo.lock_write()
83+ note('Final stack: %s', final_stack)
84+ mutter('Aquiring fallback: %s', remote_repo)
85 policy = UseExistingRepository(remote_repo, final_stack,
86 final_stack_pwd, require_stacking)
87 policy.acquire_repository()
88@@ -3475,6 +3477,7 @@
89 stacked_repo = stacked_dir.open_branch().repository
90 except errors.NotBranchError:
91 stacked_repo = stacked_dir.open_repository()
92+ mutter('stacking on %s', stacked_repo)
93 try:
94 repository.add_fallback_repository(stacked_repo)
95 except errors.UnstackableRepositoryFormat:
96
97=== modified file 'bzrlib/groupcompress.py'
98--- bzrlib/groupcompress.py 2009-05-25 19:04:59 +0000
99+++ bzrlib/groupcompress.py 2009-05-29 00:35:25 +0000
100@@ -31,13 +31,13 @@
101 diff,
102 errors,
103 graph as _mod_graph,
104+ knit,
105 osutils,
106 pack,
107 patiencediff,
108 trace,
109 )
110 from bzrlib.graph import Graph
111-from bzrlib.knit import _DirectPackAccess
112 from bzrlib.btree_index import BTreeBuilder
113 from bzrlib.lru_cache import LRUSizeCache
114 from bzrlib.tsort import topo_sort
115@@ -911,7 +911,7 @@
116 writer.begin()
117 index = _GCGraphIndex(graph_index, lambda:True, parents=parents,
118 add_callback=graph_index.add_nodes)
119- access = _DirectPackAccess({})
120+ access = knit._DirectPackAccess({})
121 access.set_writer(writer, graph_index, (transport, 'newpack'))
122 result = GroupCompressVersionedFiles(index, access, delta)
123 result.stream = stream
124@@ -1547,7 +1547,7 @@
125 """Mapper from GroupCompressVersionedFiles needs into GraphIndex storage."""
126
127 def __init__(self, graph_index, is_locked, parents=True,
128- add_callback=None):
129+ add_callback=None, track_external_parent_refs=False):
130 """Construct a _GCGraphIndex on a graph_index.
131
132 :param graph_index: An implementation of bzrlib.index.GraphIndex.
133@@ -1558,12 +1558,19 @@
134 :param add_callback: If not None, allow additions to the index and call
135 this callback with a list of added GraphIndex nodes:
136 [(node, value, node_refs), ...]
137+ :param track_external_parent_refs: As keys are added, keep track of the
138+ keys they reference, so that we can query get_missing_parents(),
139+ etc.
140 """
141 self._add_callback = add_callback
142 self._graph_index = graph_index
143 self._parents = parents
144 self.has_graph = parents
145 self._is_locked = is_locked
146+ if track_external_parent_refs:
147+ self._key_dependencies = knit._KeyRefs()
148+ else:
149+ self._key_dependencies = None
150
151 def add_records(self, records, random_id=False):
152 """Add multiple records to the index.
153@@ -1614,6 +1621,11 @@
154 for key, (value, node_refs) in keys.iteritems():
155 result.append((key, value))
156 records = result
157+ key_dependencies = self._key_dependencies
158+ if key_dependencies is not None and self._parents:
159+ for key, value, refs in records:
160+ parents = refs[0]
161+ key_dependencies.add_references(key, parents)
162 self._add_callback(records)
163
164 def _check_read(self):
165@@ -1668,6 +1680,14 @@
166 result[node[1]] = None
167 return result
168
169+ def get_missing_parents(self):
170+ """Return the keys of missing parents."""
171+ # Copied from _KnitGraphIndex.get_missing_parents
172+ # We may have false positives, so filter those out.
173+ self._key_dependencies.add_keys(
174+ self.get_parent_map(self._key_dependencies.get_unsatisfied_refs()))
175+ return frozenset(self._key_dependencies.get_unsatisfied_refs())
176+
177 def get_build_details(self, keys):
178 """Get the various build details for keys.
179
180@@ -1719,6 +1739,23 @@
181 delta_end = int(bits[3])
182 return node[0], start, stop, basis_end, delta_end
183
184+ def scan_unvalidated_index(self, graph_index):
185+ """Inform this _GCGraphIndex that there is an unvalidated index.
186+
187+ This allows this _GCGraphIndex to keep track of any missing
188+ compression parents we may want to have filled in to make those
189+ indices valid.
190+
191+ :param graph_index: A GraphIndex
192+ """
193+ if self._key_dependencies is not None:
194+ # Add parent refs from graph_index (and discard parent refs that
195+ # the graph_index has).
196+ add_refs = self._key_dependencies.add_references
197+ for node in graph_index.iter_all_entries():
198+ add_refs(node[1], node[3][0])
199+
200+
201
202 from bzrlib._groupcompress_py import (
203 apply_delta,
204
205=== modified file 'bzrlib/inventory.py'
206--- bzrlib/inventory.py 2009-04-10 12:11:58 +0000
207+++ bzrlib/inventory.py 2009-05-29 00:35:25 +0000
208@@ -1547,11 +1547,9 @@
209 def _get_mutable_inventory(self):
210 """See CommonInventory._get_mutable_inventory."""
211 entries = self.iter_entries()
212- if self.root_id is not None:
213- entries.next()
214- inv = Inventory(self.root_id, self.revision_id)
215+ inv = Inventory(None, self.revision_id)
216 for path, inv_entry in entries:
217- inv.add(inv_entry)
218+ inv.add(inv_entry.copy())
219 return inv
220
221 def create_by_apply_delta(self, inventory_delta, new_revision_id,
222
223=== modified file 'bzrlib/knit.py'
224--- bzrlib/knit.py 2009-05-25 19:04:59 +0000
225+++ bzrlib/knit.py 2009-05-29 00:35:25 +0000
226@@ -2882,6 +2882,8 @@
227
228 def get_missing_parents(self):
229 """Return the keys of missing parents."""
230+ # If updating this, you should also update
231+ # groupcompress._GCGraphIndex.get_missing_parents
232 # We may have false positives, so filter those out.
233 self._key_dependencies.add_keys(
234 self.get_parent_map(self._key_dependencies.get_unsatisfied_refs()))
235
236=== modified file 'bzrlib/remote.py'
237--- bzrlib/remote.py 2009-05-10 23:45:33 +0000
238+++ bzrlib/remote.py 2009-05-29 00:35:26 +0000
239@@ -670,9 +670,10 @@
240 self._ensure_real()
241 return self._real_repository.suspend_write_group()
242
243- def get_missing_parent_inventories(self):
244+ def get_missing_parent_inventories(self, check_for_missing_texts=True):
245 self._ensure_real()
246- return self._real_repository.get_missing_parent_inventories()
247+ return self._real_repository.get_missing_parent_inventories(
248+ check_for_missing_texts=check_for_missing_texts)
249
250 def _ensure_real(self):
251 """Ensure that there is a _real_repository set.
252@@ -860,10 +861,10 @@
253 self._unstacked_provider.enable_cache(cache_misses=True)
254 if self._real_repository is not None:
255 self._real_repository.lock_read()
256+ for repo in self._fallback_repositories:
257+ repo.lock_read()
258 else:
259 self._lock_count += 1
260- for repo in self._fallback_repositories:
261- repo.lock_read()
262
263 def _remote_lock_write(self, token):
264 path = self.bzrdir._path_for_remote_call(self._client)
265@@ -901,13 +902,13 @@
266 self._lock_count = 1
267 cache_misses = self._real_repository is None
268 self._unstacked_provider.enable_cache(cache_misses=cache_misses)
269+ for repo in self._fallback_repositories:
270+ # Writes don't affect fallback repos
271+ repo.lock_read()
272 elif self._lock_mode == 'r':
273 raise errors.ReadOnlyError(self)
274 else:
275 self._lock_count += 1
276- for repo in self._fallback_repositories:
277- # Writes don't affect fallback repos
278- repo.lock_read()
279 return self._lock_token or None
280
281 def leave_lock_in_place(self):
282@@ -1015,6 +1016,10 @@
283 self._lock_token = None
284 if not self._leave_lock:
285 self._unlock(old_token)
286+ # Fallbacks are always 'lock_read()' so we don't pay attention to
287+ # self._leave_lock
288+ for repo in self._fallback_repositories:
289+ repo.unlock()
290
291 def break_lock(self):
292 # should hand off to the network
293@@ -1084,6 +1089,11 @@
294 # We need to accumulate additional repositories here, to pass them in
295 # on various RPC's.
296 #
297+ if self.is_locked():
298+ # We will call fallback.unlock() when we transition to the unlocked
299+ # state, so always add a lock here. If a caller passes us a locked
300+ # repository, they are responsible for unlocking it later.
301+ repository.lock_read()
302 self._fallback_repositories.append(repository)
303 # If self._real_repository was parameterised already (e.g. because a
304 # _real_branch had its get_stacked_on_url method called), then the
305@@ -1971,7 +1981,7 @@
306 except (errors.NotStacked, errors.UnstackableBranchFormat,
307 errors.UnstackableRepositoryFormat), e:
308 return
309- self._activate_fallback_location(fallback_url, None)
310+ self._activate_fallback_location(fallback_url)
311
312 def _get_config(self):
313 return RemoteBranchConfig(self)
314
315=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
316--- bzrlib/repofmt/groupcompress_repo.py 2009-05-26 13:12:59 +0000
317+++ bzrlib/repofmt/groupcompress_repo.py 2009-05-29 00:35:26 +0000
318@@ -51,6 +51,7 @@
319 PackRootCommitBuilder,
320 RepositoryPackCollection,
321 RepositoryFormatPack,
322+ ResumedPack,
323 Packer,
324 )
325
326@@ -163,7 +164,21 @@
327 have deltas based on a fallback repository.
328 (See <https://bugs.launchpad.net/bzr/+bug/288751>)
329 """
330- # Groupcompress packs don't have any external references
331+ # Groupcompress packs don't have any external references, arguably CHK
332+ # pages have external references, but we cannot 'cheaply' determine
333+ # them without actually walking all of the chk pages.
334+
335+
336+class ResumedGCPack(ResumedPack):
337+
338+ def _check_references(self):
339+ """Make sure our external compression parents are present."""
340+ # See GCPack._check_references for why this is empty
341+
342+ def _get_external_refs(self, index):
343+ # GC repositories don't have compression parents external to a given
344+ # pack file
345+ return set()
346
347
348 class GCCHKPacker(Packer):
349@@ -540,6 +555,7 @@
350 class GCRepositoryPackCollection(RepositoryPackCollection):
351
352 pack_factory = GCPack
353+ resumed_pack_factory = ResumedGCPack
354
355 def _already_packed(self):
356 """Is the collection already packed?"""
357@@ -609,7 +625,8 @@
358 self.revisions = GroupCompressVersionedFiles(
359 _GCGraphIndex(self._pack_collection.revision_index.combined_index,
360 add_callback=self._pack_collection.revision_index.add_callback,
361- parents=True, is_locked=self.is_locked),
362+ parents=True, is_locked=self.is_locked,
363+ track_external_parent_refs=True),
364 access=self._pack_collection.revision_index.data_access,
365 delta=False)
366 self.signatures = GroupCompressVersionedFiles(
367@@ -719,52 +736,21 @@
368 # make it raise to trap naughty direct users.
369 raise NotImplementedError(self._iter_inventory_xmls)
370
371- def _find_revision_outside_set(self, revision_ids):
372- revision_set = frozenset(revision_ids)
373- for revid in revision_ids:
374- parent_ids = self.get_parent_map([revid]).get(revid, ())
375- for parent in parent_ids:
376- if parent in revision_set:
377- # Parent is not outside the set
378- continue
379- if parent not in self.get_parent_map([parent]):
380- # Parent is a ghost
381- continue
382- return parent
383- return _mod_revision.NULL_REVISION
384+ def _find_parent_ids_of_revisions(self, revision_ids):
385+ # TODO: we probably want to make this a helper that other code can get
386+ # at
387+ parent_map = self.get_parent_map(revision_ids)
388+ parents = set()
389+ map(parents.update, parent_map.itervalues())
390+ parents.difference_update(revision_ids)
391+ parents.discard(_mod_revision.NULL_REVISION)
392+ return parents
393
394- def _find_file_keys_to_fetch(self, revision_ids, pb):
395- rich_root = self.supports_rich_root()
396- revision_outside_set = self._find_revision_outside_set(revision_ids)
397- if revision_outside_set == _mod_revision.NULL_REVISION:
398- uninteresting_root_keys = set()
399- else:
400- uninteresting_inv = self.get_inventory(revision_outside_set)
401- uninteresting_root_keys = set([uninteresting_inv.id_to_entry.key()])
402- interesting_root_keys = set()
403- for idx, inv in enumerate(self.iter_inventories(revision_ids)):
404- interesting_root_keys.add(inv.id_to_entry.key())
405- revision_ids = frozenset(revision_ids)
406- file_id_revisions = {}
407- bytes_to_info = inventory.CHKInventory._bytes_to_utf8name_key
408- for record, items in chk_map.iter_interesting_nodes(self.chk_bytes,
409- interesting_root_keys, uninteresting_root_keys,
410- pb=pb):
411- # This is cheating a bit to use the last grabbed 'inv', but it
412- # works
413- for name, bytes in items:
414- (name_utf8, file_id, revision_id) = bytes_to_info(bytes)
415- if not rich_root and name_utf8 == '':
416- continue
417- if revision_id in revision_ids:
418- # Would we rather build this up into file_id => revision
419- # maps?
420- try:
421- file_id_revisions[file_id].add(revision_id)
422- except KeyError:
423- file_id_revisions[file_id] = set([revision_id])
424- for file_id, revisions in file_id_revisions.iteritems():
425- yield ('file', file_id, revisions)
426+ def _find_present_inventory_ids(self, revision_ids):
427+ keys = [(r,) for r in revision_ids]
428+ parent_map = self.inventories.get_parent_map(keys)
429+ present_inventory_ids = set(k[-1] for k in parent_map)
430+ return present_inventory_ids
431
432 def fileids_altered_by_revision_ids(self, revision_ids, _inv_weave=None):
433 """Find the file ids and versions affected by revisions.
434@@ -776,23 +762,39 @@
435 revision_ids. Each altered file-ids has the exact revision_ids that
436 altered it listed explicitly.
437 """
438- rich_roots = self.supports_rich_root()
439- result = {}
440+ rich_root = self.supports_rich_root()
441+ bytes_to_info = inventory.CHKInventory._bytes_to_utf8name_key
442+ file_id_revisions = {}
443 pb = ui.ui_factory.nested_progress_bar()
444 try:
445- total = len(revision_ids)
446- for pos, inv in enumerate(self.iter_inventories(revision_ids)):
447- pb.update("Finding text references", pos, total)
448- for entry in inv.iter_just_entries():
449- if entry.revision != inv.revision_id:
450- continue
451- if not rich_roots and entry.file_id == inv.root_id:
452- continue
453- alterations = result.setdefault(entry.file_id, set([]))
454- alterations.add(entry.revision)
455- return result
456+ parent_ids = self._find_parent_ids_of_revisions(revision_ids)
457+ present_parent_inv_ids = self._find_present_inventory_ids(parent_ids)
458+ uninteresting_root_keys = set()
459+ interesting_root_keys = set()
460+ inventories_to_read = set(present_parent_inv_ids)
461+ inventories_to_read.update(revision_ids)
462+ for inv in self.iter_inventories(inventories_to_read):
463+ entry_chk_root_key = inv.id_to_entry.key()
464+ if inv.revision_id in present_parent_inv_ids:
465+ uninteresting_root_keys.add(entry_chk_root_key)
466+ else:
467+ interesting_root_keys.add(entry_chk_root_key)
468+
469+ chk_bytes = self.chk_bytes
470+ for record, items in chk_map.iter_interesting_nodes(chk_bytes,
471+ interesting_root_keys, uninteresting_root_keys,
472+ pb=pb):
473+ for name, bytes in items:
474+ (name_utf8, file_id, revision_id) = bytes_to_info(bytes)
475+ if not rich_root and name_utf8 == '':
476+ continue
477+ try:
478+ file_id_revisions[file_id].add(revision_id)
479+ except KeyError:
480+ file_id_revisions[file_id] = set([revision_id])
481 finally:
482 pb.finished()
483+ return file_id_revisions
484
485 def find_text_key_references(self):
486 """Find the text key references within the repository.
487@@ -843,12 +845,6 @@
488 return GroupCHKStreamSource(self, to_format)
489 return super(CHKInventoryRepository, self)._get_source(to_format)
490
491- def suspend_write_group(self):
492- raise errors.UnsuspendableWriteGroup(self)
493-
494- def _resume_write_group(self, tokens):
495- raise errors.UnsuspendableWriteGroup(self)
496-
497
498 class GroupCHKStreamSource(repository.StreamSource):
499 """Used when both the source and target repo are GroupCHK repos."""
500@@ -861,7 +857,7 @@
501 self._chk_id_roots = None
502 self._chk_p_id_roots = None
503
504- def _get_filtered_inv_stream(self):
505+ def _get_inventory_stream(self, inventory_keys):
506 """Get a stream of inventory texts.
507
508 When this function returns, self._chk_id_roots and self._chk_p_id_roots
509@@ -873,7 +869,7 @@
510 id_roots_set = set()
511 p_id_roots_set = set()
512 source_vf = self.from_repository.inventories
513- stream = source_vf.get_record_stream(self._revision_keys,
514+ stream = source_vf.get_record_stream(inventory_keys,
515 'groupcompress', True)
516 for record in stream:
517 bytes = record.get_bytes_as('fulltext')
518@@ -897,16 +893,27 @@
519 p_id_roots_set.clear()
520 return ('inventories', _filtered_inv_stream())
521
522- def _get_filtered_chk_streams(self, excluded_keys):
523+ def _find_present_inventories(self, revision_ids):
524+ revision_keys = [(r,) for r in revision_ids]
525+ inventories = self.from_repository.inventories
526+ present_inventories = inventories.get_parent_map(revision_keys)
527+ return [p[-1] for p in present_inventories]
528+
529+ def _get_filtered_chk_streams(self, excluded_revision_ids):
530 self._text_keys = set()
531- excluded_keys.discard(_mod_revision.NULL_REVISION)
532- if not excluded_keys:
533+ excluded_revision_ids.discard(_mod_revision.NULL_REVISION)
534+ if not excluded_revision_ids:
535 uninteresting_root_keys = set()
536 uninteresting_pid_root_keys = set()
537 else:
538+ # filter out any excluded revisions whose inventories are not
539+ # actually present
540+ # TODO: Update Repository.iter_inventories() to add
541+ # ignore_missing=True
542+ present_ids = self._find_present_inventories(excluded_revision_ids)
543 uninteresting_root_keys = set()
544 uninteresting_pid_root_keys = set()
545- for inv in self.from_repository.iter_inventories(excluded_keys):
546+ for inv in self.from_repository.iter_inventories(present_ids):
547 uninteresting_root_keys.add(inv.id_to_entry.key())
548 uninteresting_pid_root_keys.add(
549 inv.parent_id_basename_to_file_id.key())
550@@ -922,12 +929,16 @@
551 self._text_keys.add((file_id, revision_id))
552 if record is not None:
553 yield record
554+ # Consumed
555+ self._chk_id_roots = None
556 yield 'chk_bytes', _filter_id_to_entry()
557 def _get_parent_id_basename_to_file_id_pages():
558 for record, items in chk_map.iter_interesting_nodes(chk_bytes,
559 self._chk_p_id_roots, uninteresting_pid_root_keys):
560 if record is not None:
561 yield record
562+ # Consumed
563+ self._chk_p_id_roots = None
564 yield 'chk_bytes', _get_parent_id_basename_to_file_id_pages()
565
566 def _get_text_stream(self):
567@@ -943,18 +954,45 @@
568 for stream_info in self._fetch_revision_texts(revision_ids):
569 yield stream_info
570 self._revision_keys = [(rev_id,) for rev_id in revision_ids]
571- yield self._get_filtered_inv_stream()
572- # The keys to exclude are part of the search recipe
573- _, _, exclude_keys, _ = search.get_recipe()
574- for stream_info in self._get_filtered_chk_streams(exclude_keys):
575+ yield self._get_inventory_stream(self._revision_keys)
576+ # TODO: The keys to exclude might be part of the search recipe
577+ # For now, exclude all parents that are at the edge of ancestry, for
578+ # which we have inventories
579+ parent_map = self.from_repository.get_parent_map(revision_ids)
580+ parent_ids = set()
581+ map(parent_ids.update, parent_map.itervalues())
582+ parent_ids.difference_update(parent_map)
583+ for stream_info in self._get_filtered_chk_streams(parent_ids):
584 yield stream_info
585 yield self._get_text_stream()
586
587+ def get_stream_for_missing_keys(self, missing_keys):
588+ # missing keys can only occur when we are byte copying and not
589+ # translating (because translation means we don't send
590+ # unreconstructable deltas ever).
591+ missing_inventory_keys = set()
592+ for key in missing_keys:
593+ if key[0] != 'inventories':
594+ raise AssertionError('The only missing keys we should'
595+ ' be filling in are inventory keys, not %s'
596+ % (key[0],))
597+ missing_inventory_keys.add(key[1:])
598+ if self._chk_id_roots or self._chk_p_id_roots:
599+ raise AssertionError('Cannot call get_stream_for_missing_keys'
600+ ' untill all of get_stream() has been consumed.')
601+ # Yield the inventory stream, so we can find the chk stream
602+ yield self._get_inventory_stream(missing_inventory_keys)
603+ # We use the empty set for excluded_revision_ids, to make it clear that
604+ # we want to transmit all referenced chk pages.
605+ for stream_info in self._get_filtered_chk_streams(set()):
606+ yield stream_info
607+
608
609 class RepositoryFormatCHK1(RepositoryFormatPack):
610 """A hashed CHK+group compress pack repository."""
611
612 repository_class = CHKInventoryRepository
613+ supports_external_lookups = True
614 supports_chks = True
615 # For right now, setting this to True gives us InterModel1And2 rather
616 # than InterDifferingSerializer
617
618=== modified file 'bzrlib/repofmt/pack_repo.py'
619--- bzrlib/repofmt/pack_repo.py 2009-04-27 23:14:00 +0000
620+++ bzrlib/repofmt/pack_repo.py 2009-05-29 00:35:26 +0000
621@@ -268,10 +268,11 @@
622
623 def __init__(self, name, revision_index, inventory_index, text_index,
624 signature_index, upload_transport, pack_transport, index_transport,
625- pack_collection):
626+ pack_collection, chk_index=None):
627 """Create a ResumedPack object."""
628 ExistingPack.__init__(self, pack_transport, name, revision_index,
629- inventory_index, text_index, signature_index)
630+ inventory_index, text_index, signature_index,
631+ chk_index=chk_index)
632 self.upload_transport = upload_transport
633 self.index_transport = index_transport
634 self.index_sizes = [None, None, None, None]
635@@ -281,6 +282,9 @@
636 ('text', text_index),
637 ('signature', signature_index),
638 ]
639+ if chk_index is not None:
640+ indices.append(('chk', chk_index))
641+ self.index_sizes.append(None)
642 for index_type, index in indices:
643 offset = self.index_offset(index_type)
644 self.index_sizes[offset] = index._size
645@@ -301,6 +305,8 @@
646 self.upload_transport.delete(self.file_name())
647 indices = [self.revision_index, self.inventory_index, self.text_index,
648 self.signature_index]
649+ if self.chk_index is not None:
650+ indices.append(self.chk_index)
651 for index in indices:
652 index._transport.delete(index._name)
653
654@@ -308,7 +314,10 @@
655 self._check_references()
656 new_name = '../packs/' + self.file_name()
657 self.upload_transport.rename(self.file_name(), new_name)
658- for index_type in ['revision', 'inventory', 'text', 'signature']:
659+ index_types = ['revision', 'inventory', 'text', 'signature']
660+ if self.chk_index is not None:
661+ index_types.append('chk')
662+ for index_type in index_types:
663 old_name = self.index_name(index_type, self.name)
664 new_name = '../indices/' + old_name
665 self.upload_transport.rename(old_name, new_name)
666@@ -316,6 +325,11 @@
667 self._state = 'finished'
668
669 def _get_external_refs(self, index):
670+ """Return compression parents for this index that are not present.
671+
672+ This returns any compression parents that are referenced by this index,
673+ which are not contained *in* this index. They may be present elsewhere.
674+ """
675 return index.external_references(1)
676
677
678@@ -1352,6 +1366,7 @@
679 """
680
681 pack_factory = NewPack
682+ resumed_pack_factory = ResumedPack
683
684 def __init__(self, repo, transport, index_transport, upload_transport,
685 pack_transport, index_builder_class, index_class,
686@@ -1680,9 +1695,14 @@
687 inv_index = self._make_index(name, '.iix', resume=True)
688 txt_index = self._make_index(name, '.tix', resume=True)
689 sig_index = self._make_index(name, '.six', resume=True)
690- result = ResumedPack(name, rev_index, inv_index, txt_index,
691- sig_index, self._upload_transport, self._pack_transport,
692- self._index_transport, self)
693+ if self.chk_index is not None:
694+ chk_index = self._make_index(name, '.cix', resume=True)
695+ else:
696+ chk_index = None
697+ result = self.resumed_pack_factory(name, rev_index, inv_index,
698+ txt_index, sig_index, self._upload_transport,
699+ self._pack_transport, self._index_transport, self,
700+ chk_index=chk_index)
701 except errors.NoSuchFile, e:
702 raise errors.UnresumableWriteGroup(self.repo, [name], str(e))
703 self.add_pack_to_memory(result)
704@@ -1809,14 +1829,11 @@
705 def reset(self):
706 """Clear all cached data."""
707 # cached revision data
708- self.repo._revision_knit = None
709 self.revision_index.clear()
710 # cached signature data
711- self.repo._signature_knit = None
712 self.signature_index.clear()
713 # cached file text data
714 self.text_index.clear()
715- self.repo._text_knit = None
716 # cached inventory data
717 self.inventory_index.clear()
718 # cached chk data
719@@ -2035,7 +2052,6 @@
720 except KeyError:
721 pass
722 del self._resumed_packs[:]
723- self.repo._text_knit = None
724
725 def _remove_resumed_pack_indices(self):
726 for resumed_pack in self._resumed_packs:
727@@ -2081,7 +2097,6 @@
728 # when autopack takes no steps, the names list is still
729 # unsaved.
730 self._save_pack_names()
731- self.repo._text_knit = None
732
733 def _suspend_write_group(self):
734 tokens = [pack.name for pack in self._resumed_packs]
735@@ -2095,7 +2110,6 @@
736 self._new_pack.abort()
737 self._new_pack = None
738 self._remove_resumed_pack_indices()
739- self.repo._text_knit = None
740 return tokens
741
742 def _resume_write_group(self, tokens):
743@@ -2202,6 +2216,7 @@
744 % (self._format, self.bzrdir.transport.base))
745
746 def _abort_write_group(self):
747+ self.revisions._index._key_dependencies.refs.clear()
748 self._pack_collection._abort_write_group()
749
750 def _find_inconsistent_revision_parents(self):
751@@ -2262,11 +2277,13 @@
752 self._pack_collection._start_write_group()
753
754 def _commit_write_group(self):
755+ self.revisions._index._key_dependencies.refs.clear()
756 return self._pack_collection._commit_write_group()
757
758 def suspend_write_group(self):
759 # XXX check self._write_group is self.get_transaction()?
760 tokens = self._pack_collection._suspend_write_group()
761+ self.revisions._index._key_dependencies.refs.clear()
762 self._write_group = None
763 return tokens
764
765@@ -2295,10 +2312,10 @@
766 self._write_lock_count += 1
767 if self._write_lock_count == 1:
768 self._transaction = transactions.WriteTransaction()
769+ if not locked:
770 for repo in self._fallback_repositories:
771 # Writes don't affect fallback repos
772 repo.lock_read()
773- if not locked:
774 self._refresh_data()
775
776 def lock_read(self):
777@@ -2307,10 +2324,9 @@
778 self._write_lock_count += 1
779 else:
780 self.control_files.lock_read()
781+ if not locked:
782 for repo in self._fallback_repositories:
783- # Writes don't affect fallback repos
784 repo.lock_read()
785- if not locked:
786 self._refresh_data()
787
788 def leave_lock_in_place(self):
789@@ -2356,10 +2372,10 @@
790 transaction = self._transaction
791 self._transaction = None
792 transaction.finish()
793- for repo in self._fallback_repositories:
794- repo.unlock()
795 else:
796 self.control_files.unlock()
797+
798+ if not self.is_locked():
799 for repo in self._fallback_repositories:
800 repo.unlock()
801
802
803=== modified file 'bzrlib/repository.py'
804--- bzrlib/repository.py 2009-05-12 04:54:04 +0000
805+++ bzrlib/repository.py 2009-05-29 00:35:26 +0000
806@@ -969,6 +969,10 @@
807 """
808 if not self._format.supports_external_lookups:
809 raise errors.UnstackableRepositoryFormat(self._format, self.base)
810+ if self.is_locked():
811+ # This repository will call fallback.unlock() when we transition to
812+ # the unlocked state, so we make sure to increment the lock count
813+ repository.lock_read()
814 self._check_fallback_repository(repository)
815 self._fallback_repositories.append(repository)
816 self.texts.add_fallback_versioned_files(repository.texts)
817@@ -1240,19 +1244,19 @@
818 """
819 locked = self.is_locked()
820 result = self.control_files.lock_write(token=token)
821- for repo in self._fallback_repositories:
822- # Writes don't affect fallback repos
823- repo.lock_read()
824 if not locked:
825+ for repo in self._fallback_repositories:
826+ # Writes don't affect fallback repos
827+ repo.lock_read()
828 self._refresh_data()
829 return result
830
831 def lock_read(self):
832 locked = self.is_locked()
833 self.control_files.lock_read()
834- for repo in self._fallback_repositories:
835- repo.lock_read()
836 if not locked:
837+ for repo in self._fallback_repositories:
838+ repo.lock_read()
839 self._refresh_data()
840
841 def get_physical_lock_status(self):
842@@ -1424,7 +1428,7 @@
843 def suspend_write_group(self):
844 raise errors.UnsuspendableWriteGroup(self)
845
846- def get_missing_parent_inventories(self):
847+ def get_missing_parent_inventories(self, check_for_missing_texts=True):
848 """Return the keys of missing inventory parents for revisions added in
849 this write group.
850
851@@ -1439,7 +1443,7 @@
852 return set()
853 if not self.is_in_write_group():
854 raise AssertionError('not in a write group')
855-
856+
857 # XXX: We assume that every added revision already has its
858 # corresponding inventory, so we only check for parent inventories that
859 # might be missing, rather than all inventories.
860@@ -1448,9 +1452,12 @@
861 unstacked_inventories = self.inventories._index
862 present_inventories = unstacked_inventories.get_parent_map(
863 key[-1:] for key in parents)
864- if len(parents.difference(present_inventories)) == 0:
865+ parents.difference_update(present_inventories)
866+ if len(parents) == 0:
867 # No missing parent inventories.
868 return set()
869+ if not check_for_missing_texts:
870+ return set(('inventories', rev_id) for (rev_id,) in parents)
871 # Ok, now we have a list of missing inventories. But these only matter
872 # if the inventories that reference them are missing some texts they
873 # appear to introduce.
874@@ -1577,8 +1584,8 @@
875 self.control_files.unlock()
876 if self.control_files._lock_count == 0:
877 self._inventory_entry_cache.clear()
878- for repo in self._fallback_repositories:
879- repo.unlock()
880+ for repo in self._fallback_repositories:
881+ repo.unlock()
882
883 @needs_read_lock
884 def clone(self, a_bzrdir, revision_id=None):
885@@ -4003,18 +4010,20 @@
886 try:
887 if resume_tokens:
888 self.target_repo.resume_write_group(resume_tokens)
889+ is_resume = True
890 else:
891 self.target_repo.start_write_group()
892+ is_resume = False
893 try:
894 # locked_insert_stream performs a commit|suspend.
895- return self._locked_insert_stream(stream, src_format)
896+ return self._locked_insert_stream(stream, src_format, is_resume)
897 except:
898 self.target_repo.abort_write_group(suppress_errors=True)
899 raise
900 finally:
901 self.target_repo.unlock()
902
903- def _locked_insert_stream(self, stream, src_format):
904+ def _locked_insert_stream(self, stream, src_format, is_resume):
905 to_serializer = self.target_repo._format._serializer
906 src_serializer = src_format._serializer
907 new_pack = None
908@@ -4070,14 +4079,18 @@
909 if new_pack is not None:
910 new_pack._write_data('', flush=True)
911 # Find all the new revisions (including ones from resume_tokens)
912- missing_keys = self.target_repo.get_missing_parent_inventories()
913+ missing_keys = self.target_repo.get_missing_parent_inventories(
914+ check_for_missing_texts=is_resume)
915 try:
916 for prefix, versioned_file in (
917 ('texts', self.target_repo.texts),
918 ('inventories', self.target_repo.inventories),
919 ('revisions', self.target_repo.revisions),
920 ('signatures', self.target_repo.signatures),
921+ ('chk_bytes', self.target_repo.chk_bytes),
922 ):
923+ if versioned_file is None:
924+ continue
925 missing_keys.update((prefix,) + key for key in
926 versioned_file.get_missing_compression_parent_keys())
927 except NotImplementedError:
928@@ -4230,6 +4243,7 @@
929 keys['texts'] = set()
930 keys['revisions'] = set()
931 keys['inventories'] = set()
932+ keys['chk_bytes'] = set()
933 keys['signatures'] = set()
934 for key in missing_keys:
935 keys[key[0]].add(key[1:])
936@@ -4242,6 +4256,13 @@
937 keys['revisions'],))
938 for substream_kind, keys in keys.iteritems():
939 vf = getattr(self.from_repository, substream_kind)
940+ if vf is None and keys:
941+ raise AssertionError(
942+ "cannot fill in keys for a versioned file we don't"
943+ " have: %s needs %s" % (substream_kind, keys))
944+ if not keys:
945+ # No need to stream something we don't have
946+ continue
947 # Ask for full texts always so that we don't need more round trips
948 # after this stream.
949 stream = vf.get_record_stream(keys,
950
951=== modified file 'bzrlib/tests/per_repository/test_fileid_involved.py'
952--- bzrlib/tests/per_repository/test_fileid_involved.py 2009-03-23 14:59:43 +0000
953+++ bzrlib/tests/per_repository/test_fileid_involved.py 2009-05-29 00:35:26 +0000
954@@ -1,4 +1,4 @@
955-# Copyright (C) 2005 Canonical Ltd
956+# Copyright (C) 2005, 2009 Canonical Ltd
957 #
958 # This program is free software; you can redistribute it and/or modify
959 # it under the terms of the GNU General Public License as published by
960@@ -16,7 +16,12 @@
961
962 import os
963 import sys
964+import time
965
966+from bzrlib import (
967+ revision as _mod_revision,
968+ tests,
969+ )
970 from bzrlib.errors import IllegalPath, NonAsciiRevisionId
971 from bzrlib.tests import TestSkipped
972 from bzrlib.tests.per_repository.test_repository import TestCaseWithRepository
973@@ -49,11 +54,11 @@
974 super(TestFileIdInvolved, self).setUp()
975 # create three branches, and merge it
976 #
977- # /-->J ------>K (branch2)
978- # / \
979- # A ---> B --->C ---->D->G (main)
980- # \ / /
981- # \---> E---/----> F (branch1)
982+ # ,-->J------>K (branch2)
983+ # / \
984+ # A --->B --->C---->D-->G (main)
985+ # \ / /
986+ # '--->E---+---->F (branch1)
987
988 # A changes:
989 # B changes: 'a-file-id-2006-01-01-abcd'
990@@ -137,8 +142,6 @@
991 self.branch = main_branch
992
993 def test_fileids_altered_between_two_revs(self):
994- def foo(old, new):
995- print set(self.branch.repository.get_ancestry(new)).difference(set(self.branch.repository.get_ancestry(old)))
996 self.branch.lock_read()
997 self.addCleanup(self.branch.unlock)
998 self.branch.repository.fileids_altered_by_revision_ids(["rev-J","rev-K"])
999@@ -295,7 +298,7 @@
1000 self.branch = main_branch
1001
1002 def test_fileid_involved_full_compare2(self):
1003- # this tests that fileids_alteted_by_revision_ids returns
1004+ # this tests that fileids_altered_by_revision_ids returns
1005 # more information than compare_tree can, because it
1006 # sees each change rather than the aggregate delta.
1007 self.branch.lock_read()
1008@@ -315,6 +318,73 @@
1009 self.assertSubset(l2, l1)
1010
1011
1012+class FileIdInvolvedWGhosts(TestCaseWithRepository):
1013+
1014+ def create_branch_with_ghost_text(self):
1015+ builder = self.make_branch_builder('ghost')
1016+ builder.build_snapshot('A-id', None, [
1017+ ('add', ('', 'root-id', 'directory', None)),
1018+ ('add', ('a', 'a-file-id', 'file', 'some content\n'))])
1019+ b = builder.get_branch()
1020+ old_rt = b.repository.revision_tree('A-id')
1021+ new_inv = old_rt.inventory._get_mutable_inventory()
1022+ new_inv.revision_id = 'B-id'
1023+ new_inv['a-file-id'].revision = 'ghost-id'
1024+ new_rev = _mod_revision.Revision('B-id',
1025+ timestamp=time.time(),
1026+ timezone=0,
1027+ message='Committing against a ghost',
1028+ committer='Joe Foo <joe@foo.com>',
1029+ properties={},
1030+ parent_ids=('A-id', 'ghost-id'),
1031+ )
1032+ b.lock_write()
1033+ self.addCleanup(b.unlock)
1034+ b.repository.start_write_group()
1035+ b.repository.add_revision('B-id', new_rev, new_inv)
1036+ b.repository.commit_write_group()
1037+ return b
1038+
1039+ def test_file_ids_include_ghosts(self):
1040+ b = self.create_branch_with_ghost_text()
1041+ repo = b.repository
1042+ self.assertEqual(
1043+ {'a-file-id':set(['ghost-id'])},
1044+ repo.fileids_altered_by_revision_ids(['B-id']))
1045+
1046+ def test_file_ids_uses_fallbacks(self):
1047+ builder = self.make_branch_builder('source',
1048+ format=self.bzrdir_format)
1049+ repo = builder.get_branch().repository
1050+ if not repo._format.supports_external_lookups:
1051+ raise tests.TestNotApplicable('format does not support stacking')
1052+ builder.start_series()
1053+ builder.build_snapshot('A-id', None, [
1054+ ('add', ('', 'root-id', 'directory', None)),
1055+ ('add', ('file', 'file-id', 'file', 'contents\n'))])
1056+ builder.build_snapshot('B-id', ['A-id'], [
1057+ ('modify', ('file-id', 'new-content\n'))])
1058+ builder.build_snapshot('C-id', ['B-id'], [
1059+ ('modify', ('file-id', 'yet more content\n'))])
1060+ builder.finish_series()
1061+ source_b = builder.get_branch()
1062+ source_b.lock_read()
1063+ self.addCleanup(source_b.unlock)
1064+ base = self.make_branch('base')
1065+ base.pull(source_b, stop_revision='B-id')
1066+ stacked = self.make_branch('stacked')
1067+ stacked.set_stacked_on_url('../base')
1068+ stacked.pull(source_b, stop_revision='C-id')
1069+
1070+ stacked.lock_read()
1071+ self.addCleanup(stacked.unlock)
1072+ repo = stacked.repository
1073+ keys = {'file-id': set(['A-id'])}
1074+ if stacked.repository.supports_rich_root():
1075+ keys['root-id'] = set(['A-id'])
1076+ self.assertEqual(keys, repo.fileids_altered_by_revision_ids(['A-id']))
1077+
1078+
1079 def set_executability(wt, path, executable=True):
1080 """Set the executable bit for the file at path in the working tree
1081
1082
1083=== modified file 'bzrlib/tests/per_repository/test_write_group.py'
1084--- bzrlib/tests/per_repository/test_write_group.py 2009-05-12 09:05:30 +0000
1085+++ bzrlib/tests/per_repository/test_write_group.py 2009-05-29 00:35:26 +0000
1086@@ -18,7 +18,15 @@
1087
1088 import sys
1089
1090-from bzrlib import bzrdir, errors, graph, memorytree, remote
1091+from bzrlib import (
1092+ bzrdir,
1093+ errors,
1094+ graph,
1095+ memorytree,
1096+ osutils,
1097+ remote,
1098+ versionedfile,
1099+ )
1100 from bzrlib.branch import BzrBranchFormat7
1101 from bzrlib.inventory import InventoryDirectory
1102 from bzrlib.transport import local, memory
1103@@ -240,9 +248,9 @@
1104 inventory) in it must have all the texts in its inventory (even if not
1105 changed w.r.t. to the absent parent), otherwise it will report missing
1106 texts/parent inventory.
1107-
1108+
1109 The core of this test is that a file was changed in rev-1, but in a
1110- stacked repo that only has rev-2
1111+ stacked repo that only has rev-2
1112 """
1113 # Make a trunk with one commit.
1114 trunk_repo = self.make_stackable_repo()
1115@@ -284,6 +292,69 @@
1116 set(), reopened_repo.get_missing_parent_inventories())
1117 reopened_repo.abort_write_group()
1118
1119+ def test_get_missing_parent_inventories_check(self):
1120+ builder = self.make_branch_builder('test')
1121+ builder.build_snapshot('A-id', ['ghost-parent-id'], [
1122+ ('add', ('', 'root-id', 'directory', None)),
1123+ ('add', ('file', 'file-id', 'file', 'content\n'))],
1124+ allow_leftmost_as_ghost=True)
1125+ b = builder.get_branch()
1126+ b.lock_read()
1127+ self.addCleanup(b.unlock)
1128+ repo = self.make_repository('test-repo')
1129+ repo.lock_write()
1130+ self.addCleanup(repo.unlock)
1131+ repo.start_write_group()
1132+ self.addCleanup(repo.abort_write_group)
1133+ # Now, add the objects manually
1134+ text_keys = [('file-id', 'A-id')]
1135+ if repo.supports_rich_root():
1136+ text_keys.append(('root-id', 'A-id'))
1137+ # Directly add the texts, inventory, and revision object for 'A-id'
1138+ repo.texts.insert_record_stream(b.repository.texts.get_record_stream(
1139+ text_keys, 'unordered', True))
1140+ repo.add_revision('A-id', b.repository.get_revision('A-id'),
1141+ b.repository.get_inventory('A-id'))
1142+ get_missing = repo.get_missing_parent_inventories
1143+ if repo._format.supports_external_lookups:
1144+ self.assertEqual(set([('inventories', 'ghost-parent-id')]),
1145+ get_missing(check_for_missing_texts=False))
1146+ self.assertEqual(set(), get_missing(check_for_missing_texts=True))
1147+ self.assertEqual(set(), get_missing())
1148+ else:
1149+ # If we don't support external lookups, we always return empty
1150+ self.assertEqual(set(), get_missing(check_for_missing_texts=False))
1151+ self.assertEqual(set(), get_missing(check_for_missing_texts=True))
1152+ self.assertEqual(set(), get_missing())
1153+
1154+ def test_insert_stream_passes_resume_info(self):
1155+ repo = self.make_repository('test-repo')
1156+ if not repo._format.supports_external_lookups:
1157+ raise TestNotApplicable('only valid in resumable repos')
1158+ # log calls to get_missing_parent_inventories, so that we can assert it
1159+ # is called with the correct parameters
1160+ call_log = []
1161+ orig = repo.get_missing_parent_inventories
1162+ def get_missing(check_for_missing_texts=True):
1163+ call_log.append(check_for_missing_texts)
1164+ return orig(check_for_missing_texts=check_for_missing_texts)
1165+ repo.get_missing_parent_inventories = get_missing
1166+ repo.lock_write()
1167+ self.addCleanup(repo.unlock)
1168+ sink = repo._get_sink()
1169+ sink.insert_stream((), repo._format, [])
1170+ self.assertEqual([False], call_log)
1171+ del call_log[:]
1172+ repo.start_write_group()
1173+ # We need to insert something, or suspend_write_group won't actually
1174+ # create a token
1175+ repo.texts.insert_record_stream([versionedfile.FulltextContentFactory(
1176+ ('file-id', 'rev-id'), (), None, 'lines\n')])
1177+ tokens = repo.suspend_write_group()
1178+ self.assertNotEqual([], tokens)
1179+ sink.insert_stream((), repo._format, tokens)
1180+ self.assertEqual([True], call_log)
1181+
1182
1183 class TestResumeableWriteGroup(TestCaseWithRepository):
1184
1185@@ -518,9 +589,12 @@
1186 source_repo.start_write_group()
1187 key_base = ('file-id', 'base')
1188 key_delta = ('file-id', 'delta')
1189- source_repo.texts.add_lines(key_base, (), ['lines\n'])
1190- source_repo.texts.add_lines(
1191- key_delta, (key_base,), ['more\n', 'lines\n'])
1192+ def text_stream():
1193+ yield versionedfile.FulltextContentFactory(
1194+ key_base, (), None, 'lines\n')
1195+ yield versionedfile.FulltextContentFactory(
1196+ key_delta, (key_base,), None, 'more\nlines\n')
1197+ source_repo.texts.insert_record_stream(text_stream())
1198 source_repo.commit_write_group()
1199 return source_repo
1200
1201@@ -536,9 +610,20 @@
1202 stream = source_repo.texts.get_record_stream(
1203 [key_delta], 'unordered', False)
1204 repo.texts.insert_record_stream(stream)
1205- # It's not commitable due to the missing compression parent.
1206- self.assertRaises(
1207- errors.BzrCheckError, repo.commit_write_group)
1208+ # It's either not commitable due to the missing compression parent, or
1209+ # the stacked location has already filled in the fulltext.
1210+ try:
1211+ repo.commit_write_group()
1212+ except errors.BzrCheckError:
1213+ # It refused to commit because we have a missing parent
1214+ pass
1215+ else:
1216+ same_repo = self.reopen_repo(repo)
1217+ same_repo.lock_read()
1218+ record = same_repo.texts.get_record_stream([key_delta],
1219+ 'unordered', True).next()
1220+ self.assertEqual('more\nlines\n', record.get_bytes_as('fulltext'))
1221+ return
1222 # Merely suspending and resuming doesn't make it commitable either.
1223 wg_tokens = repo.suspend_write_group()
1224 same_repo = self.reopen_repo(repo)
1225@@ -570,8 +655,19 @@
1226 same_repo.texts.insert_record_stream(stream)
1227 # Just like if we'd added that record without a suspend/resume cycle,
1228 # commit_write_group fails.
1229- self.assertRaises(
1230- errors.BzrCheckError, same_repo.commit_write_group)
1231+ try:
1232+ same_repo.commit_write_group()
1233+ except errors.BzrCheckError:
1234+ pass
1235+ else:
1236+ # If the commit_write_group didn't fail, that is because the
1237+ # insert_record_stream already gave it a fulltext.
1238+ same_repo = self.reopen_repo(repo)
1239+ same_repo.lock_read()
1240+ record = same_repo.texts.get_record_stream([key_delta],
1241+ 'unordered', True).next()
1242+ self.assertEqual('more\nlines\n', record.get_bytes_as('fulltext'))
1243+ return
1244 same_repo.abort_write_group()
1245
1246 def test_add_missing_parent_after_resume(self):
1247
1248=== modified file 'bzrlib/tests/per_repository_reference/__init__.py'
1249--- bzrlib/tests/per_repository_reference/__init__.py 2009-03-23 14:59:43 +0000
1250+++ bzrlib/tests/per_repository_reference/__init__.py 2009-05-29 00:35:26 +0000
1251@@ -97,6 +97,9 @@
1252 'bzrlib.tests.per_repository_reference.test_break_lock',
1253 'bzrlib.tests.per_repository_reference.test_check',
1254 'bzrlib.tests.per_repository_reference.test_default_stacking',
1255+ 'bzrlib.tests.per_repository_reference.test_fetch',
1256+ 'bzrlib.tests.per_repository_reference.test_initialize',
1257+ 'bzrlib.tests.per_repository_reference.test_unlock',
1258 ]
1259 # Parameterize per_repository_reference test modules by format.
1260 standard_tests.addTests(loader.loadTestsFromModuleNames(module_list))
1261
1262=== modified file 'bzrlib/tests/per_repository_reference/test_default_stacking.py'
1263--- bzrlib/tests/per_repository_reference/test_default_stacking.py 2009-03-23 14:59:43 +0000
1264+++ bzrlib/tests/per_repository_reference/test_default_stacking.py 2009-05-29 00:35:26 +0000
1265@@ -21,19 +21,13 @@
1266
1267 class TestDefaultStackingPolicy(TestCaseWithRepository):
1268
1269- # XXX: this helper probably belongs on TestCaseWithTransport
1270- def make_smart_server(self, path):
1271- smart_server = server.SmartTCPServer_for_testing()
1272- smart_server.setUp(self.get_server())
1273- return smart_server.get_url() + path
1274-
1275 def test_sprout_to_smart_server_stacking_policy_handling(self):
1276 """Obey policy where possible, ignore otherwise."""
1277 stack_on = self.make_branch('stack-on')
1278 parent_bzrdir = self.make_bzrdir('.', format='default')
1279 parent_bzrdir.get_config().set_default_stack_on('stack-on')
1280 source = self.make_branch('source')
1281- url = self.make_smart_server('target')
1282+ url = self.make_smart_server('target').abspath('')
1283 target = source.bzrdir.sprout(url).open_branch()
1284 self.assertEqual('../stack-on', target.get_stacked_on_url())
1285 self.assertEqual(
1286
1287=== added file 'bzrlib/tests/per_repository_reference/test_fetch.py'
1288--- bzrlib/tests/per_repository_reference/test_fetch.py 1970-01-01 00:00:00 +0000
1289+++ bzrlib/tests/per_repository_reference/test_fetch.py 2009-05-29 00:35:26 +0000
1290@@ -0,0 +1,110 @@
1291+# Copyright (C) 2009 Canonical Ltd
1292+#
1293+# This program is free software; you can redistribute it and/or modify
1294+# it under the terms of the GNU General Public License as published by
1295+# the Free Software Foundation; either version 2 of the License, or
1296+# (at your option) any later version.
1297+#
1298+# This program is distributed in the hope that it will be useful,
1299+# but WITHOUT ANY WARRANTY; without even the implied warranty of
1300+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
1301+# GNU General Public License for more details.
1302+#
1303+# You should have received a copy of the GNU General Public License
1304+# along with this program; if not, write to the Free Software
1305+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
1306+
1307+
1308+from bzrlib.smart import server
1309+from bzrlib.tests.per_repository import TestCaseWithRepository
1310+
1311+
1312+class TestFetch(TestCaseWithRepository):
1313+
1314+ def make_source_branch(self):
1315+ # It would be nice if there was a way to force this to be memory-only
1316+ builder = self.make_branch_builder('source')
1317+ content = ['content lines\n'
1318+ 'for the first revision\n'
1319+ 'which is a marginal amount of content\n'
1320+ ]
1321+ builder.start_series()
1322+ builder.build_snapshot('A-id', None, [
1323+ ('add', ('', 'root-id', 'directory', None)),
1324+ ('add', ('a', 'a-id', 'file', ''.join(content))),
1325+ ])
1326+ content.append('and some more lines for B\n')
1327+ builder.build_snapshot('B-id', ['A-id'], [
1328+ ('modify', ('a-id', ''.join(content)))])
1329+ content.append('and yet even more content for C\n')
1330+ builder.build_snapshot('C-id', ['B-id'], [
1331+ ('modify', ('a-id', ''.join(content)))])
1332+ builder.finish_series()
1333+ source_b = builder.get_branch()
1334+ source_b.lock_read()
1335+ self.addCleanup(source_b.unlock)
1336+ return content, source_b
1337+
1338+ def test_sprout_from_stacked_with_short_history(self):
1339+ # Now copy this data into a branch, and stack on it
1340+ # Use 'make_branch' which gives us a bzr:// branch when appropriate,
1341+ # rather than creating a branch-on-disk
1342+ content, source_b = self.make_source_branch()
1343+ stack_b = self.make_branch('stack-on')
1344+ stack_b.pull(source_b, stop_revision='B-id')
1345+ target_b = self.make_branch('target')
1346+ target_b.set_stacked_on_url('../stack-on')
1347+ target_b.pull(source_b, stop_revision='C-id')
1348+ # At this point, we should have a target branch, with 1 revision, on
1349+ # top of the source.
1350+ final_b = self.make_branch('final')
1351+ final_b.pull(target_b)
1352+ final_b.lock_read()
1353+ self.addCleanup(final_b.unlock)
1354+ self.assertEqual('C-id', final_b.last_revision())
1355+ text_keys = [('a-id', 'A-id'), ('a-id', 'B-id'), ('a-id', 'C-id')]
1356+ stream = final_b.repository.texts.get_record_stream(text_keys,
1357+ 'unordered', True)
1358+ records = []
1359+ for record in stream:
1360+ records.append(record.key)
1361+ if record.key == ('a-id', 'A-id'):
1362+ self.assertEqual(''.join(content[:-2]),
1363+ record.get_bytes_as('fulltext'))
1364+ elif record.key == ('a-id', 'B-id'):
1365+ self.assertEqual(''.join(content[:-1]),
1366+ record.get_bytes_as('fulltext'))
1367+ elif record.key == ('a-id', 'C-id'):
1368+ self.assertEqual(''.join(content),
1369+ record.get_bytes_as('fulltext'))
1370+ else:
1371+ self.fail('Unexpected record: %s' % (record.key,))
1372+ self.assertEqual(text_keys, sorted(records))
1373+
1374+ def test_sprout_from_smart_stacked_with_short_history(self):
1375+ content, source_b = self.make_source_branch()
1376+ transport = self.make_smart_server('server')
1377+ transport.ensure_base()
1378+ url = transport.abspath('')
1379+ stack_b = source_b.bzrdir.sprout(url + '/stack-on', revision_id='B-id')
1380+ # self.make_branch only takes relative paths, so we do it the 'hard'
1381+ # way
1382+ target_transport = transport.clone('target')
1383+ target_transport.ensure_base()
1384+ target_bzrdir = self.bzrdir_format.initialize_on_transport(
1385+ target_transport)
1386+ target_bzrdir.create_repository()
1387+ target_b = target_bzrdir.create_branch()
1388+ target_b.set_stacked_on_url('../stack-on')
1389+ target_b.pull(source_b, stop_revision='C-id')
1390+ # Now we should be able to branch from the remote location to a local
1391+ # location
1392+ final_b = target_b.bzrdir.sprout('final').open_branch()
1393+ self.assertEqual('C-id', final_b.last_revision())
1394+
1395+ # bzrdir.sprout() has slightly different code paths if you supply a
1396+ # revision_id versus not. If you supply revision_id, then you get a
1397+ # PendingAncestryResult for the search, versus a SearchResult...
1398+ final2_b = target_b.bzrdir.sprout('final2',
1399+ revision_id='C-id').open_branch()
1400+        self.assertEqual('C-id', final2_b.last_revision())
1401
1402=== added file 'bzrlib/tests/per_repository_reference/test_initialize.py'
1403--- bzrlib/tests/per_repository_reference/test_initialize.py 1970-01-01 00:00:00 +0000
1404+++ bzrlib/tests/per_repository_reference/test_initialize.py 2009-05-29 00:35:26 +0000
1405@@ -0,0 +1,39 @@
1406+# Copyright (C) 2009 Canonical Ltd
1407+#
1408+# This program is free software; you can redistribute it and/or modify
1409+# it under the terms of the GNU General Public License as published by
1410+# the Free Software Foundation; either version 2 of the License, or
1411+# (at your option) any later version.
1412+#
1413+# This program is distributed in the hope that it will be useful,
1414+# but WITHOUT ANY WARRANTY; without even the implied warranty of
1415+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
1416+# GNU General Public License for more details.
1417+#
1418+# You should have received a copy of the GNU General Public License
1419+# along with this program; if not, write to the Free Software
1420+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
1421+
1422+"""Tests for initializing a repository with external references."""
1423+
1424+
1425+from bzrlib import (
1426+ errors,
1427+ )
1428+from bzrlib.tests.per_repository_reference import (
1429+ TestCaseWithExternalReferenceRepository,
1430+ )
1431+
1432+
1433+class TestInitialize(TestCaseWithExternalReferenceRepository):
1434+
1435+ def test_initialize_on_transport_ex(self):
1436+ base = self.make_branch('base')
1437+ network_name = base.repository._format.network_name()
1438+ trans = self.get_transport('stacked')
1439+ result = self.bzrdir_format.initialize_on_transport_ex(
1440+ trans, use_existing_dir=False, create_prefix=False,
1441+ stacked_on='../base', stack_on_pwd=base.base,
1442+ repo_format_name=network_name)
1443+ result_repo, a_bzrdir, require_stacking, repo_policy = result
1444+ result_repo.unlock()
1445
1446=== added file 'bzrlib/tests/per_repository_reference/test_unlock.py'
1447--- bzrlib/tests/per_repository_reference/test_unlock.py 1970-01-01 00:00:00 +0000
1448+++ bzrlib/tests/per_repository_reference/test_unlock.py 2009-05-29 00:35:26 +0000
1449@@ -0,0 +1,76 @@
1450+# Copyright (C) 2009 Canonical Ltd
1451+#
1452+# This program is free software; you can redistribute it and/or modify
1453+# it under the terms of the GNU General Public License as published by
1454+# the Free Software Foundation; either version 2 of the License, or
1455+# (at your option) any later version.
1456+#
1457+# This program is distributed in the hope that it will be useful,
1458+# but WITHOUT ANY WARRANTY; without even the implied warranty of
1459+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
1460+# GNU General Public License for more details.
1461+#
1462+# You should have received a copy of the GNU General Public License
1463+# along with this program; if not, write to the Free Software
1464+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
1465+
1466+"""Tests for locking/unlocking a repository with external references."""
1467+
1468+from bzrlib import (
1469+ branch,
1470+ tests,
1471+ )
1472+from bzrlib.tests.per_repository_reference import (
1473+ TestCaseWithExternalReferenceRepository,
1474+ )
1475+
1476+
1477+class TestUnlock(TestCaseWithExternalReferenceRepository):
1478+
1479+ def create_stacked_branch(self):
1480+ builder = self.make_branch_builder('source',
1481+ format=self.bzrdir_format)
1482+ builder.start_series()
1483+ repo = builder.get_branch().repository
1484+ if not repo._format.supports_external_lookups:
1485+ raise tests.TestNotApplicable('format does not support stacking')
1486+ builder.build_snapshot('A-id', None, [
1487+ ('add', ('', 'root-id', 'directory', None)),
1488+ ('add', ('file', 'file-id', 'file', 'contents\n'))])
1489+ builder.build_snapshot('B-id', ['A-id'], [
1490+ ('modify', ('file-id', 'new-content\n'))])
1491+ builder.build_snapshot('C-id', ['B-id'], [
1492+ ('modify', ('file-id', 'yet more content\n'))])
1493+ builder.finish_series()
1494+ source_b = builder.get_branch()
1495+ source_b.lock_read()
1496+ self.addCleanup(source_b.unlock)
1497+ base = self.make_branch('base')
1498+ base.pull(source_b, stop_revision='B-id')
1499+ stacked = self.make_branch('stacked')
1500+ stacked.set_stacked_on_url('../base')
1501+ stacked.pull(source_b, stop_revision='C-id')
1502+
1503+ return base, stacked
1504+
1505+ def test_unlock_unlocks_fallback(self):
1506+ base = self.make_branch('base')
1507+ stacked = self.make_branch('stacked')
1508+ repo = stacked.repository
1509+ stacked.set_stacked_on_url('../base')
1510+ self.assertEqual(1, len(repo._fallback_repositories))
1511+ fallback_repo = repo._fallback_repositories[0]
1512+ self.assertFalse(repo.is_locked())
1513+ self.assertFalse(fallback_repo.is_locked())
1514+ repo.lock_read()
1515+ self.assertTrue(repo.is_locked())
1516+ self.assertTrue(fallback_repo.is_locked())
1517+ repo.unlock()
1518+ self.assertFalse(repo.is_locked())
1519+ self.assertFalse(fallback_repo.is_locked())
1520+ repo.lock_write()
1521+ self.assertTrue(repo.is_locked())
1522+ self.assertTrue(fallback_repo.is_locked())
1523+ repo.unlock()
1524+ self.assertFalse(repo.is_locked())
1525+ self.assertFalse(fallback_repo.is_locked())
1526
1527=== modified file 'bzrlib/tests/test_graph.py'
1528--- bzrlib/tests/test_graph.py 2009-03-24 23:19:12 +0000
1529+++ bzrlib/tests/test_graph.py 2009-05-29 00:35:26 +0000
1530@@ -1558,6 +1558,19 @@
1531 result = _mod_graph.PendingAncestryResult(['rev-2'], repo)
1532 self.assertEqual(set(['rev-1', 'rev-2']), set(result.get_keys()))
1533
1534+ def test_get_keys_excludes_ghosts(self):
1535+ builder = self.make_branch_builder('b')
1536+ builder.start_series()
1537+ builder.build_snapshot('rev-1', None, [
1538+ ('add', ('', 'root-id', 'directory', ''))])
1539+ builder.build_snapshot('rev-2', ['rev-1', 'ghost'], [])
1540+ builder.finish_series()
1541+ repo = builder.get_branch().repository
1542+ repo.lock_read()
1543+ self.addCleanup(repo.unlock)
1544+ result = _mod_graph.PendingAncestryResult(['rev-2'], repo)
1545+ self.assertEqual(sorted(['rev-1', 'rev-2']), sorted(result.get_keys()))
1546+
1547 def test_get_keys_excludes_null(self):
1548 # Make a 'graph' with an iter_ancestry that returns NULL_REVISION
1549 # somewhere other than the last element, which can happen in real
1550
1551=== modified file 'bzrlib/tests/test_groupcompress.py'
1552--- bzrlib/tests/test_groupcompress.py 2009-04-22 17:18:45 +0000
1553+++ bzrlib/tests/test_groupcompress.py 2009-05-29 00:35:26 +0000
1554@@ -19,8 +19,10 @@
1555 import zlib
1556
1557 from bzrlib import (
1558+ btree_index,
1559 groupcompress,
1560 errors,
1561+ index as _mod_index,
1562 osutils,
1563 tests,
1564 versionedfile,
1565@@ -475,6 +477,23 @@
1566
1567 class TestGroupCompressVersionedFiles(TestCaseWithGroupCompressVersionedFiles):
1568
1569+ def make_g_index(self, name, ref_lists=0, nodes=[]):
1570+ builder = btree_index.BTreeBuilder(ref_lists)
1571+ for node, references, value in nodes:
1572+ builder.add_node(node, references, value)
1573+ stream = builder.finish()
1574+ trans = self.get_transport()
1575+ size = trans.put_file(name, stream)
1576+ return btree_index.BTreeGraphIndex(trans, name, size)
1577+
1578+ def make_g_index_missing_parent(self):
1579+ graph_index = self.make_g_index('missing_parent', 1,
1580+ [(('parent', ), '2 78 2 10', ([],)),
1581+ (('tip', ), '2 78 2 10',
1582+ ([('parent', ), ('missing-parent', )],)),
1583+ ])
1584+ return graph_index
1585+
1586 def test_get_record_stream_as_requested(self):
1587 # Consider promoting 'as-requested' to general availability, and
1588 # make this a VF interface test
1589@@ -606,6 +625,30 @@
1590 else:
1591 self.assertIs(block, record._manager._block)
1592
1593+ def test_add_missing_noncompression_parent_unvalidated_index(self):
1594+ unvalidated = self.make_g_index_missing_parent()
1595+ combined = _mod_index.CombinedGraphIndex([unvalidated])
1596+ index = groupcompress._GCGraphIndex(combined,
1597+ is_locked=lambda: True, parents=True,
1598+ track_external_parent_refs=True)
1599+ index.scan_unvalidated_index(unvalidated)
1600+ self.assertEqual(
1601+ frozenset([('missing-parent',)]), index.get_missing_parents())
1602+
1603+ def test_track_external_parent_refs(self):
1604+ g_index = self.make_g_index('empty', 1, [])
1605+ mod_index = btree_index.BTreeBuilder(1, 1)
1606+ combined = _mod_index.CombinedGraphIndex([g_index, mod_index])
1607+ index = groupcompress._GCGraphIndex(combined,
1608+ is_locked=lambda: True, parents=True,
1609+ add_callback=mod_index.add_nodes,
1610+ track_external_parent_refs=True)
1611+ index.add_records([
1612+ (('new-key',), '2 10 2 10', [(('parent-1',), ('parent-2',))])])
1613+ self.assertEqual(
1614+ frozenset([('parent-1',), ('parent-2',)]),
1615+ index.get_missing_parents())
1616+
1617
1618 class TestLazyGroupCompress(tests.TestCaseWithTransport):
1619
1620
1621=== modified file 'bzrlib/tests/test_pack_repository.py'
1622--- bzrlib/tests/test_pack_repository.py 2009-05-11 15:30:40 +0000
1623+++ bzrlib/tests/test_pack_repository.py 2009-05-29 00:35:26 +0000
1624@@ -620,7 +620,7 @@
1625 Also requires that the exception is logged.
1626 """
1627 self.vfs_transport_factory = memory.MemoryServer
1628- repo = self.make_repository('repo')
1629+ repo = self.make_repository('repo', format=self.get_format())
1630 token = repo.lock_write()
1631 self.addCleanup(repo.unlock)
1632 repo.start_write_group()
1633@@ -637,7 +637,7 @@
1634
1635 def test_abort_write_group_does_raise_when_not_suppressed(self):
1636 self.vfs_transport_factory = memory.MemoryServer
1637- repo = self.make_repository('repo')
1638+ repo = self.make_repository('repo', format=self.get_format())
1639 token = repo.lock_write()
1640 self.addCleanup(repo.unlock)
1641 repo.start_write_group()
1642@@ -650,23 +650,52 @@
1643
1644 def test_suspend_write_group(self):
1645 self.vfs_transport_factory = memory.MemoryServer
1646- repo = self.make_repository('repo')
1647+ repo = self.make_repository('repo', format=self.get_format())
1648 token = repo.lock_write()
1649 self.addCleanup(repo.unlock)
1650 repo.start_write_group()
1651 repo.texts.add_lines(('file-id', 'revid'), (), ['lines'])
1652 wg_tokens = repo.suspend_write_group()
1653 expected_pack_name = wg_tokens[0] + '.pack'
1654+ expected_names = [wg_tokens[0] + ext for ext in
1655+ ('.rix', '.iix', '.tix', '.six')]
1656+ if repo.chk_bytes is not None:
1657+ expected_names.append(wg_tokens[0] + '.cix')
1658+ expected_names.append(expected_pack_name)
1659 upload_transport = repo._pack_collection._upload_transport
1660 limbo_files = upload_transport.list_dir('')
1661- self.assertTrue(expected_pack_name in limbo_files, limbo_files)
1662+ self.assertEqual(sorted(expected_names), sorted(limbo_files))
1663 md5 = osutils.md5(upload_transport.get_bytes(expected_pack_name))
1664 self.assertEqual(wg_tokens[0], md5.hexdigest())
1665
1666+ def test_resume_chk_bytes(self):
1667+ self.vfs_transport_factory = memory.MemoryServer
1668+ repo = self.make_repository('repo', format=self.get_format())
1669+ if repo.chk_bytes is None:
1670+ raise TestNotApplicable('no chk_bytes for this repository')
1671+
1672+ token = repo.lock_write()
1673+ self.addCleanup(repo.unlock)
1674+ repo.start_write_group()
1675+ text = 'a bit of text\n'
1676+ key = ('sha1:' + osutils.sha_string(text),)
1677+ repo.chk_bytes.add_lines(key, (), [text])
1678+ wg_tokens = repo.suspend_write_group()
1679+ same_repo = repo.bzrdir.open_repository()
1680+ same_repo.lock_write()
1681+ self.addCleanup(same_repo.unlock)
1682+ same_repo.resume_write_group(wg_tokens)
1683+ self.assertEqual([key], list(same_repo.chk_bytes.keys()))
1684+ self.assertEqual(
1685+ text, same_repo.chk_bytes.get_record_stream([key],
1686+ 'unordered', True).next().get_bytes_as('fulltext'))
1687+ same_repo.abort_write_group()
1688+ self.assertEqual([], list(same_repo.chk_bytes.keys()))
1689+
1690 def test_resume_write_group_then_abort(self):
1691 # Create a repo, start a write group, insert some data, suspend.
1692 self.vfs_transport_factory = memory.MemoryServer
1693- repo = self.make_repository('repo')
1694+ repo = self.make_repository('repo', format=self.get_format())
1695 token = repo.lock_write()
1696 self.addCleanup(repo.unlock)
1697 repo.start_write_group()
1698@@ -685,10 +714,38 @@
1699 self.assertEqual(
1700 [], same_repo._pack_collection._pack_transport.list_dir(''))
1701
1702+ def test_commit_resumed_write_group(self):
1703+ self.vfs_transport_factory = memory.MemoryServer
1704+ repo = self.make_repository('repo', format=self.get_format())
1705+ token = repo.lock_write()
1706+ self.addCleanup(repo.unlock)
1707+ repo.start_write_group()
1708+ text_key = ('file-id', 'revid')
1709+ repo.texts.add_lines(text_key, (), ['lines'])
1710+ wg_tokens = repo.suspend_write_group()
1711+ # Get a fresh repository object for the repo on the filesystem.
1712+ same_repo = repo.bzrdir.open_repository()
1713+ # Resume
1714+ same_repo.lock_write()
1715+ self.addCleanup(same_repo.unlock)
1716+ same_repo.resume_write_group(wg_tokens)
1717+ same_repo.commit_write_group()
1718+ expected_pack_name = wg_tokens[0] + '.pack'
1719+ expected_names = [wg_tokens[0] + ext for ext in
1720+ ('.rix', '.iix', '.tix', '.six')]
1721+ if repo.chk_bytes is not None:
1722+ expected_names.append(wg_tokens[0] + '.cix')
1723+ self.assertEqual(
1724+ [], same_repo._pack_collection._upload_transport.list_dir(''))
1725+ index_names = repo._pack_collection._index_transport.list_dir('')
1726+ self.assertEqual(sorted(expected_names), sorted(index_names))
1727+ pack_names = repo._pack_collection._pack_transport.list_dir('')
1728+ self.assertEqual([expected_pack_name], pack_names)
1729+
1730 def test_resume_malformed_token(self):
1731 self.vfs_transport_factory = memory.MemoryServer
1732 # Make a repository with a suspended write group
1733- repo = self.make_repository('repo')
1734+ repo = self.make_repository('repo', format=self.get_format())
1735 token = repo.lock_write()
1736 self.addCleanup(repo.unlock)
1737 repo.start_write_group()
1738@@ -696,7 +753,7 @@
1739 repo.texts.add_lines(text_key, (), ['lines'])
1740 wg_tokens = repo.suspend_write_group()
1741 # Make a new repository
1742- new_repo = self.make_repository('new_repo')
1743+ new_repo = self.make_repository('new_repo', format=self.get_format())
1744 token = new_repo.lock_write()
1745 self.addCleanup(new_repo.unlock)
1746 hacked_wg_token = (
1747@@ -732,12 +789,12 @@
1748 # can only stack on repositories that have compatible internal
1749 # metadata
1750 if getattr(repo._format, 'supports_tree_reference', False):
1751+ matching_format_name = 'pack-0.92-subtree'
1752+ else:
1753 if repo._format.supports_chks:
1754 matching_format_name = 'development6-rich-root'
1755 else:
1756- matching_format_name = 'pack-0.92-subtree'
1757- else:
1758- matching_format_name = 'rich-root-pack'
1759+ matching_format_name = 'rich-root-pack'
1760 mismatching_format_name = 'pack-0.92'
1761 else:
1762 # We don't have a non-rich-root CHK format.
1763@@ -763,15 +820,14 @@
1764 if getattr(repo._format, 'supports_tree_reference', False):
1765 # can only stack on repositories that have compatible internal
1766 # metadata
1767- if repo._format.supports_chks:
1768- # No CHK subtree formats in bzr.dev, so this doesn't execute.
1769- matching_format_name = 'development6-subtree'
1770- else:
1771- matching_format_name = 'pack-0.92-subtree'
1772+ matching_format_name = 'pack-0.92-subtree'
1773 mismatching_format_name = 'rich-root-pack'
1774 else:
1775 if repo.supports_rich_root():
1776- matching_format_name = 'rich-root-pack'
1777+ if repo._format.supports_chks:
1778+ matching_format_name = 'development6-rich-root'
1779+ else:
1780+ matching_format_name = 'rich-root-pack'
1781 mismatching_format_name = 'pack-0.92-subtree'
1782 else:
1783 raise TestNotApplicable('No formats use non-v5 serializer'
1784@@ -844,6 +900,66 @@
1785 self.assertTrue(large_pack_name in pack_names)
1786
1787
1788+class TestKeyDependencies(TestCaseWithTransport):
1789+
1790+ def get_format(self):
1791+ return bzrdir.format_registry.make_bzrdir(self.format_name)
1792+
1793+ def create_source_and_target(self):
1794+ builder = self.make_branch_builder('source', format=self.get_format())
1795+ builder.start_series()
1796+ builder.build_snapshot('A-id', None, [
1797+ ('add', ('', 'root-id', 'directory', None))])
1798+ builder.build_snapshot('B-id', ['A-id', 'ghost-id'], [])
1799+ builder.finish_series()
1800+ repo = self.make_repository('target', format=self.get_format())
1801+ b = builder.get_branch()
1802+ b.lock_read()
1803+ self.addCleanup(b.unlock)
1804+ repo.lock_write()
1805+ self.addCleanup(repo.unlock)
1806+ return b.repository, repo
1807+
1808+ def test_key_dependencies_cleared_on_abort(self):
1809+ source_repo, target_repo = self.create_source_and_target()
1810+ target_repo.start_write_group()
1811+ try:
1812+ stream = source_repo.revisions.get_record_stream([('B-id',)],
1813+ 'unordered', True)
1814+ target_repo.revisions.insert_record_stream(stream)
1815+ key_refs = target_repo.revisions._index._key_dependencies
1816+ self.assertEqual([('B-id',)], sorted(key_refs.get_referrers()))
1817+ finally:
1818+ target_repo.abort_write_group()
1819+ self.assertEqual([], sorted(key_refs.get_referrers()))
1820+
1821+ def test_key_dependencies_cleared_on_suspend(self):
1822+ source_repo, target_repo = self.create_source_and_target()
1823+ target_repo.start_write_group()
1824+ try:
1825+ stream = source_repo.revisions.get_record_stream([('B-id',)],
1826+ 'unordered', True)
1827+ target_repo.revisions.insert_record_stream(stream)
1828+ key_refs = target_repo.revisions._index._key_dependencies
1829+ self.assertEqual([('B-id',)], sorted(key_refs.get_referrers()))
1830+ finally:
1831+ target_repo.suspend_write_group()
1832+ self.assertEqual([], sorted(key_refs.get_referrers()))
1833+
1834+ def test_key_dependencies_cleared_on_commit(self):
1835+ source_repo, target_repo = self.create_source_and_target()
1836+ target_repo.start_write_group()
1837+ try:
1838+ stream = source_repo.revisions.get_record_stream([('B-id',)],
1839+ 'unordered', True)
1840+ target_repo.revisions.insert_record_stream(stream)
1841+ key_refs = target_repo.revisions._index._key_dependencies
1842+ self.assertEqual([('B-id',)], sorted(key_refs.get_referrers()))
1843+ finally:
1844+ target_repo.commit_write_group()
1845+ self.assertEqual([], sorted(key_refs.get_referrers()))
1846+
1847+
1848 class TestSmartServerAutopack(TestCaseWithTransport):
1849
1850 def setUp(self):
1851@@ -931,7 +1047,7 @@
1852 dict(format_name='development6-rich-root',
1853 format_string='Bazaar development format - group compression '
1854 'and chk inventory (needs bzr.dev from 1.14)\n',
1855- format_supports_external_lookups=False,
1856+ format_supports_external_lookups=True,
1857 index_class=BTreeGraphIndex),
1858 ]
1859 # name of the scenario is the format name
1860
1861=== modified file 'bzrlib/tests/test_repository.py'
1862--- bzrlib/tests/test_repository.py 2009-04-09 20:23:07 +0000
1863+++ bzrlib/tests/test_repository.py 2009-05-29 00:35:26 +0000
1864@@ -686,11 +686,11 @@
1865 inv.parent_id_basename_to_file_id._root_node.maximum_size)
1866
1867
1868-class TestDevelopment6FindRevisionOutsideSet(TestCaseWithTransport):
1869- """Tests for _find_revision_outside_set."""
1870+class TestDevelopment6FindParentIdsOfRevisions(TestCaseWithTransport):
1871+ """Tests for _find_parent_ids_of_revisions."""
1872
1873 def setUp(self):
1874- super(TestDevelopment6FindRevisionOutsideSet, self).setUp()
1875+ super(TestDevelopment6FindParentIdsOfRevisions, self).setUp()
1876 self.builder = self.make_branch_builder('source',
1877 format='development6-rich-root')
1878 self.builder.start_series()
1879@@ -699,42 +699,42 @@
1880 self.repo = self.builder.get_branch().repository
1881 self.addCleanup(self.builder.finish_series)
1882
1883- def assertRevisionOutsideSet(self, expected_result, rev_set):
1884- self.assertEqual(
1885- expected_result, self.repo._find_revision_outside_set(rev_set))
1886+ def assertParentIds(self, expected_result, rev_set):
1887+ self.assertEqual(sorted(expected_result),
1888+ sorted(self.repo._find_parent_ids_of_revisions(rev_set)))
1889
1890 def test_simple(self):
1891 self.builder.build_snapshot('revid1', None, [])
1892- self.builder.build_snapshot('revid2', None, [])
1893+ self.builder.build_snapshot('revid2', ['revid1'], [])
1894 rev_set = ['revid2']
1895- self.assertRevisionOutsideSet('revid1', rev_set)
1896+ self.assertParentIds(['revid1'], rev_set)
1897
1898 def test_not_first_parent(self):
1899 self.builder.build_snapshot('revid1', None, [])
1900- self.builder.build_snapshot('revid2', None, [])
1901- self.builder.build_snapshot('revid3', None, [])
1902+ self.builder.build_snapshot('revid2', ['revid1'], [])
1903+ self.builder.build_snapshot('revid3', ['revid2'], [])
1904 rev_set = ['revid3', 'revid2']
1905- self.assertRevisionOutsideSet('revid1', rev_set)
1906+ self.assertParentIds(['revid1'], rev_set)
1907
1908 def test_not_null(self):
1909 rev_set = ['initial']
1910- self.assertRevisionOutsideSet(_mod_revision.NULL_REVISION, rev_set)
1911+ self.assertParentIds([], rev_set)
1912
1913 def test_not_null_set(self):
1914 self.builder.build_snapshot('revid1', None, [])
1915 rev_set = [_mod_revision.NULL_REVISION]
1916- self.assertRevisionOutsideSet(_mod_revision.NULL_REVISION, rev_set)
1917+ self.assertParentIds([], rev_set)
1918
1919 def test_ghost(self):
1920 self.builder.build_snapshot('revid1', None, [])
1921 rev_set = ['ghost', 'revid1']
1922- self.assertRevisionOutsideSet('initial', rev_set)
1923+ self.assertParentIds(['initial'], rev_set)
1924
1925 def test_ghost_parent(self):
1926 self.builder.build_snapshot('revid1', None, [])
1927 self.builder.build_snapshot('revid2', ['revid1', 'ghost'], [])
1928 rev_set = ['revid2', 'revid1']
1929- self.assertRevisionOutsideSet('initial', rev_set)
1930+ self.assertParentIds(['ghost', 'initial'], rev_set)
1931
1932 def test_righthand_parent(self):
1933 self.builder.build_snapshot('revid1', None, [])
1934@@ -742,7 +742,7 @@
1935 self.builder.build_snapshot('revid2b', ['revid1'], [])
1936 self.builder.build_snapshot('revid3', ['revid2a', 'revid2b'], [])
1937 rev_set = ['revid3', 'revid2a']
1938- self.assertRevisionOutsideSet('revid2b', rev_set)
1939+ self.assertParentIds(['revid1', 'revid2b'], rev_set)
1940
1941
1942 class TestWithBrokenRepo(TestCaseWithTransport):
1943@@ -1220,3 +1220,68 @@
1944 stream = source._get_source(target._format)
1945 # We don't want the child GroupCHKStreamSource
1946 self.assertIs(type(stream), repository.StreamSource)
1947+
1948+ def test_get_stream_for_missing_keys_includes_all_chk_refs(self):
1949+ source_builder = self.make_branch_builder('source',
1950+ format='development6-rich-root')
1951+ # We have to build a fairly large tree, so that we are sure the chk
1952+ # pages will have split into multiple pages.
1953+ entries = [('add', ('', 'a-root-id', 'directory', None))]
1954+ for i in 'abcdefghijklmnopqrstuvwxzy123456789':
1955+ for j in 'abcdefghijklmnopqrstuvwxzy123456789':
1956+ fname = i + j
1957+ fid = fname + '-id'
1958+ content = 'content for %s\n' % (fname,)
1959+ entries.append(('add', (fname, fid, 'file', content)))
1960+ source_builder.start_series()
1961+ source_builder.build_snapshot('rev-1', None, entries)
1962+ # Now change a few of them, so we get a few new pages for the second
1963+ # revision
1964+ source_builder.build_snapshot('rev-2', ['rev-1'], [
1965+ ('modify', ('aa-id', 'new content for aa-id\n')),
1966+ ('modify', ('cc-id', 'new content for cc-id\n')),
1967+ ('modify', ('zz-id', 'new content for zz-id\n')),
1968+ ])
1969+ source_builder.finish_series()
1970+ source_branch = source_builder.get_branch()
1971+ source_branch.lock_read()
1972+ self.addCleanup(source_branch.unlock)
1973+ target = self.make_repository('target', format='development6-rich-root')
1974+ source = source_branch.repository._get_source(target._format)
1975+ self.assertIsInstance(source, groupcompress_repo.GroupCHKStreamSource)
1976+
1977+ # On a regular pass, getting the inventories and chk pages for rev-2
1978+ # would only get the newly created chk pages
1979+ search = graph.SearchResult(set(['rev-2']), set(['rev-1']), 1,
1980+ set(['rev-2']))
1981+ simple_chk_records = []
1982+ for vf_name, substream in source.get_stream(search):
1983+ if vf_name == 'chk_bytes':
1984+ for record in substream:
1985+ simple_chk_records.append(record.key)
1986+ else:
1987+ for _ in substream:
1988+ continue
1989+ # 3 pages, the root (InternalNode), + 2 pages which actually changed
1990+ self.assertEqual([('sha1:91481f539e802c76542ea5e4c83ad416bf219f73',),
1991+ ('sha1:4ff91971043668583985aec83f4f0ab10a907d3f',),
1992+ ('sha1:81e7324507c5ca132eedaf2d8414ee4bb2226187',),
1993+ ('sha1:b101b7da280596c71a4540e9a1eeba8045985ee0',)],
1994+ simple_chk_records)
1995+ # Now, when we do a similar call using 'get_stream_for_missing_keys'
1996+ # we should get a much larger set of pages.
1997+ missing = [('inventories', 'rev-2')]
1998+ full_chk_records = []
1999+ for vf_name, substream in source.get_stream_for_missing_keys(missing):
2000+ if vf_name == 'inventories':
2001+ for record in substream:
2002+ self.assertEqual(('rev-2',), record.key)
2003+ elif vf_name == 'chk_bytes':
2004+ for record in substream:
2005+ full_chk_records.append(record.key)
2006+ else:
2007+ self.fail('Should not be getting a stream of %s' % (vf_name,))
2008+ # We have 257 records now. This is because we have 1 root page, and 256
2009+ # leaf pages in a complete listing.
2010+ self.assertEqual(257, len(full_chk_records))
2011+ self.assertSubset(simple_chk_records, full_chk_records)
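
The behavior the TestKeyDependencies tests exercise can be sketched with a simplified stand-in (an illustrative model only; `KeyDependencies` and its methods here are hypothetical and differ from bzrlib's actual `_key_dependencies` object): during a write group, the index tracks which keys refer to parents not yet present, and that cache must be reset when the group is committed, aborted, or suspended so a ghost recorded by an earlier group does not linger as "missing".

```python
# Hypothetical, simplified model of the parent-reference cache; the
# real bzrlib implementation differs in structure and API.

class KeyDependencies(object):

    def __init__(self):
        # maps a not-yet-present parent key -> set of keys referring to it
        self._refs = {}

    def add_references(self, key, parent_keys):
        # record that 'key' depends on each of its parents
        for parent in parent_keys:
            self._refs.setdefault(parent, set()).add(key)

    def satisfy(self, key):
        # 'key' has been inserted, so it is no longer a missing parent
        self._refs.pop(key, None)

    def get_referrers(self):
        # all keys that still reference at least one missing parent
        result = set()
        for referrers in self._refs.values():
            result.update(referrers)
        return result

    def reset(self):
        # called when a write group is committed, aborted, or suspended,
        # so stale 'missing parent' entries do not leak into later groups
        self._refs.clear()


deps = KeyDependencies()
# B-id has one real parent and one ghost parent
deps.add_references(('B-id',), [('A-id',), ('ghost-id',)])
deps.satisfy(('A-id',))
# at this point only the ghost keeps B-id in get_referrers(); after
# reset() the cache is empty, matching the cleared-on-abort/suspend/
# commit assertions above
```

Under this model, the ghost-filling bug described in the proposal is simply a missing `reset()`: without it, `('ghost-id',)` stays in `_refs` across write groups, and a later fetch would treat the ghost as a parent that still needs to be supplied.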