Merge lp:~jelmer/bzr/faster-stacked-tree-build into lp:bzr
Status: Rejected
Rejected by: Jelmer Vernooij
Proposed branch: lp:~jelmer/bzr/faster-stacked-tree-build
Merge into: lp:bzr
Diff against target: 741 lines (+284/-80), 14 files modified:

- bzrlib/__init__.py (+4/-7)
- bzrlib/bzrdir.py (+21/-4)
- bzrlib/chk_map.py (+9/-0)
- bzrlib/remote.py (+5/-3)
- bzrlib/repofmt/groupcompress_repo.py (+56/-3)
- bzrlib/smart/repository.py (+4/-0)
- bzrlib/tests/blackbox/test_branch.py (+2/-3)
- bzrlib/tests/per_branch/test_stacking.py (+12/-10)
- bzrlib/tests/per_repository/test_commit_builder.py (+19/-13)
- bzrlib/tests/per_repository_reference/test_commit_with_stacking.py (+11/-17)
- bzrlib/tests/test_remote.py (+42/-19)
- bzrlib/tests/test_repository.py (+2/-1)
- bzrlib/vf_search.py (+93/-0)
- doc/en/release-notes/bzr-2.5.txt (+4/-0)

To merge this branch: bzr merge lp:~jelmer/bzr/faster-stacked-tree-build
Related bugs:

Reviewer | Review Type | Date Requested | Status
---|---|---|---
Vincent Ladeuil | | | Pending

Review via email: mp+82416@code.launchpad.net
This proposal supersedes a proposal from 2011-08-03.
Commit message
Description of the change
This branch was originally submitted by spiv. Here is his cover letter:
Here's my branch that makes 'bzr branch --stacked $SMART_SERVER_URL' much, much faster. It adds a new kind of search, “texts for revision”, adds it to the HPSS protocol (no new verb, just a new search body; as usual, the client falls back to VFS if the server doesn't support the new search), and checks for the new search in get_stream.
The basic idea here is that to build a working tree we need to fetch all the texts, and if the new local repo is stacked we may as well fetch all that data in a single stream and save it into that repo (rather than fetching it text-by-text when we try to build the tree from the empty local repo stacked on the remote fallback).
I forget exactly what is missing from this branch, but probably not much. I think mainly just some clarity about the new SearchResult APIs it adds, and that's perhaps best resolved by having other folks review this anyway. So: review away!
(This is actually all code from while I was still working at Canonical, but the flu stopped me from having time to give it a final self-review and/or polish, so here it is without that.)
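The wire-level change described above is small enough to sketch. Going by the bzrlib/smart/repository.py hunk in the preview diff below, the new search body is just two lines, 'texts-for-rev' followed by a revision id, and anything the server doesn't recognise comes back as a BadSearch error, which triggers the VFS fallback. A minimal stand-alone sketch, with simplified stand-ins for the real bzrlib classes and response objects:

```python
class TextsForRevisionSearchResult(object):
    """Stand-in for bzrlib.vf_search.TextsForRevisionSearchResult:
    all records needed to build a checkout of one revision."""

    def __init__(self, revision_id):
        self.revision_id = revision_id


def recreate_search(body_bytes):
    """Parse a smart-server search body; return (search, error).

    Modelled on the bzrlib/smart/repository.py hunk in the preview
    diff; the ('BadSearch',) tuple stands in for the real
    FailedSmartServerResponse.
    """
    lines = body_bytes.split('\n')
    if lines[0] == 'texts-for-rev':
        if len(lines) != 2:
            return (None, ('BadSearch',))  # malformed body
        return (TextsForRevisionSearchResult(lines[1]), None)
    # Unknown search kinds (e.g. on pre-2.5 servers) are rejected,
    # and the client falls back to VFS access.
    return (None, ('BadSearch',))
```

On the client side, the bzrlib/remote.py hunk reacts to BadSearch for these new search kinds by remembering that the server is pre-2.5 before retrying.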
Martin Pool (mbp) wrote (posted in a previous version of this proposal):
+ # XXX: this is a gross hack.
+ mutter('HACK adding texts-for-rev:%s', tip)
[tweak] Can you add something more about what's specifically wrong with it or what ought to be done instead (even just a pointer to bug 791704)? Otherwise, for someone who comes to look at it, it's just kind of rubbing it in.
I presume what you're saying is gross is the special case for when this fires? Or is it that it is done as a special search at all? It does look like it will cause a similar performance cliff to what we currently have when you branch into a new empty repository as opposed to creating a new repository.
[optional] Thinking this through makes me wonder if we should have a debug option to force this on or force it off, so that we can eg use it as a workaround and can more easily compare the performance.
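For reference, the condition being called a gross hack is in the bzrlib/bzrdir.py hunk of the preview diff: the one-shot texts-for-rev fetch spec is used only when a working tree is about to be built on a brand new stacked repository. A hedged sketch of just that decision, with the repository state reduced to plain booleans (the names here are illustrative, not bzrlib API):

```python
NULL_REVISION = 'null:'  # stand-in for bzrlib.revision.NULL_REVISION


def choose_fetch_spec(tip, create_tree, is_new_repo, is_stacked,
                      make_default_spec):
    """Sketch of the bzrdir.py decision point.

    Only when a tree will be created locally, the repository is newly
    made, stacked, and there is a real tip revision, do we swap in the
    special search; everything else keeps the normal fetch spec.
    """
    if (create_tree and is_new_repo and tip != NULL_REVISION
            and is_stacked):
        # The "gross hack" branch: fetch everything needed for the
        # tree of `tip` in one stream.
        return ('texts-for-rev', tip)
    return make_default_spec()
```

A debug flag of the kind suggested above would amount to forcing the first branch of this predicate on or off.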
+# def get_full_
+# return set(sr.
[query] If this is really dead can you just remove it? Likewise below.
+ def __init__(self, revision_id, repo=None):
+ """Constructor.
+
+ :param revision_id: A revision ID. All records needed to create a
+ checkout of this revision are referenced by this result (i.e. the
+ revision record and corresponding inventory record, and all
+ referenced texts and chk_bytes).
+ :param repo: An optional repository. If not given, then
+ ``get_full_keys`` cannot be used.
+ """
+ self.revision_id = revision_id
+# self.repo = repo
[query] Is the repo really unused? Maybe the function prototype or docstring needs to be changed?
+ if self.texts_for_inv is not None:
+ # Emit all CHK records for this inventory, and add all texts
+ # those nodes reference.
+ keys = set()
+ for node in self.texts_for_inv.id_to_entry.iter_nodes():
+ keys.add(node.key())
+ if type(node) is chk_map.LeafNode:
+ self._text_keys.update(
+ [chk_map._bytes_to_text_key(b) for b in
+ node._items.values()])
+ stream = self.from_repository.chk_bytes.get_record_stream(
+ keys, 'unordered', True)
+ for record in stream:
+ yield record
+ uninteresting_root_keys.add(
+ self.texts_for_inv.id_to_entry.key())
[comment] This seems to be getting kind of large for a nested function but perhaps refactoring it is out of scope.
[query] It's not obvious to me that there are tests for this actually being reached, but I might be just missing that.
I'd like to get a second review, probably from John, and someone should also build this and just play with it to check the behaviour is reasonable. This needs to be considered as a risk factor for the landing on lp (maybe we should have somewhere to accumulate a list of them).
Martin Pool (mbp) wrote (posted in a previous version of this proposal):
So, I can do these tweaks, if spiv is busy, but I would definitely like a second review.
Vincent Ladeuil (vila) wrote (posted in a previous version of this proposal):
@jam: AIUI you're working on this bug now; what should be the status of *this* proposal in that respect?
Jelmer Vernooij (jelmer) wrote (posted in a previous version of this proposal):
I'm looking at finishing this.
Andrew Bennetts (spiv) wrote:
On Wed, Nov 16, 2011 at 03:52:39PM -0000, Jelmer Vernooij wrote:
> Jelmer Vernooij has proposed merging lp:~jelmer/bzr/faster-stacked-tree-build into lp:bzr.
Thanks for picking this up!
-Andrew.
Preview Diff
1 | === modified file 'bzrlib/__init__.py' |
2 | --- bzrlib/__init__.py 2012-01-16 17:57:43 +0000 |
3 | +++ bzrlib/__init__.py 2012-01-25 17:56:30 +0000 |
4 | @@ -148,13 +148,10 @@ |
5 | """ |
6 | try: |
7 | import ctypes |
8 | - except ImportError: |
9 | - return |
10 | - pythonapi = getattr(ctypes, "pythonapi", None) |
11 | - if pythonapi is None: |
12 | - # Not CPython ctypes implementation |
13 | - return |
14 | - old_ptr = ctypes.c_void_p.in_dll(pythonapi, "Py_FileSystemDefaultEncoding") |
15 | + old_ptr = ctypes.c_void_p.in_dll(ctypes.pythonapi, |
16 | + "Py_FileSystemDefaultEncoding") |
17 | + except (ImportError, ValueError): |
18 | + return # No ctypes or not CPython implementation, do nothing |
19 | new_ptr = ctypes.cast(ctypes.c_char_p(intern(new_enc)), ctypes.c_void_p) |
20 | old_ptr.value = new_ptr.value |
21 | if sys.getfilesystemencoding() != new_enc: |
22 | |
23 | === modified file 'bzrlib/bzrdir.py' |
24 | --- bzrlib/bzrdir.py 2012-01-18 20:27:41 +0000 |
25 | +++ bzrlib/bzrdir.py 2012-01-25 17:56:30 +0000 |
26 | @@ -410,8 +410,27 @@ |
27 | else: |
28 | target_repo_kind = fetch.TargetRepoKinds.PREEXISTING |
29 | fetch_spec_factory.target_repo_kind = target_repo_kind |
30 | + create_tree = (create_tree_if_local and not result.has_workingtree() and |
31 | + isinstance(target_transport, local.LocalTransport) and |
32 | + (result_repo is None or result_repo.make_working_trees())) |
33 | + |
34 | if source_repository is not None: |
35 | - fetch_spec = fetch_spec_factory.make_fetch_spec() |
36 | + if revision_id is not None: |
37 | + tip = revision_id |
38 | + elif source_branch is not None: |
39 | + tip = source_branch.last_revision() |
40 | + else: |
41 | + tip = _mod_revision.NULL_REVISION |
42 | + if (create_tree and is_new_repo |
43 | + and tip != _mod_revision.NULL_REVISION |
44 | + and target_repo_kind == fetch.TargetRepoKinds.STACKED): |
45 | + # We're creating a tree for a stacked branch, so make sure we |
46 | + # fetch the texts we're about to need to construct that tree. |
47 | + # XXX: this is a gross hack. |
48 | + mutter('HACK adding texts-for-rev:%s', tip) |
49 | + fetch_spec = vf_search.TextsForRevisionSearchResult(tip) |
50 | + else: |
51 | + fetch_spec = fetch_spec_factory.make_fetch_spec() |
52 | result_repo.fetch(source_repository, fetch_spec=fetch_spec) |
53 | |
54 | if source_branch is None: |
55 | @@ -426,9 +445,7 @@ |
56 | mutter("created new branch %r" % (result_branch,)) |
57 | |
58 | # Create/update the result working tree |
59 | - if (create_tree_if_local and not result.has_workingtree() and |
60 | - isinstance(target_transport, local.LocalTransport) and |
61 | - (result_repo is None or result_repo.make_working_trees())): |
62 | + if create_tree: |
63 | wt = result.create_workingtree(accelerator_tree=accelerator_tree, |
64 | hardlink=hardlink, from_branch=result_branch) |
65 | wt.lock_write() |
66 | |
67 | === modified file 'bzrlib/chk_map.py' |
68 | --- bzrlib/chk_map.py 2011-12-19 13:23:58 +0000 |
69 | +++ bzrlib/chk_map.py 2012-01-25 17:56:30 +0000 |
70 | @@ -222,6 +222,15 @@ |
71 | result.append(' %r %r' % (tuple(key), value)) |
72 | return result |
73 | |
74 | + def iter_nodes(self): |
75 | + """Yield all nodes of this CHK map.""" |
76 | + self._ensure_root() |
77 | + node = self._root_node |
78 | + yield node |
79 | + if type(node) is InternalNode: |
80 | + for node, node_filter in node._iter_nodes(self._store): |
81 | + yield node |
82 | + |
83 | @classmethod |
84 | def from_dict(klass, store, initial_value, maximum_size=0, key_width=1, |
85 | search_key_func=None): |
86 | |
87 | === modified file 'bzrlib/remote.py' |
88 | --- bzrlib/remote.py 2012-01-19 14:17:55 +0000 |
89 | +++ bzrlib/remote.py 2012-01-25 17:56:30 +0000 |
90 | @@ -2998,14 +2998,16 @@ |
91 | except errors.UnknownSmartMethod: |
92 | medium._remember_remote_is_before(version) |
93 | except errors.UnknownErrorFromSmartServer, e: |
94 | - if isinstance(search, vf_search.EverythingResult): |
95 | + new_in_2_5_searches = ( |
96 | + vf_search.EverythingResult, vf_search.TextsForRevisionSearchResult) |
97 | + if isinstance(search, new_in_2_5_searches): |
98 | error_verb = e.error_from_smart_server.error_verb |
99 | if error_verb == 'BadSearch': |
100 | - # Pre-2.4 servers don't support this sort of search. |
101 | + # Pre-2.5 servers don't support this sort of search. |
102 | # XXX: perhaps falling back to VFS on BadSearch is a |
103 | # good idea in general? It might provide a little bit |
104 | # of protection against client-side bugs. |
105 | - medium._remember_remote_is_before((2, 4)) |
106 | + medium._remember_remote_is_before((2, 5)) |
107 | break |
108 | raise |
109 | else: |
110 | |
111 | === modified file 'bzrlib/repofmt/groupcompress_repo.py' |
112 | --- bzrlib/repofmt/groupcompress_repo.py 2011-12-19 19:15:58 +0000 |
113 | +++ bzrlib/repofmt/groupcompress_repo.py 2012-01-25 17:56:30 +0000 |
114 | @@ -786,7 +786,8 @@ |
115 | pass |
116 | except errors.NoSuchRevision, e: |
117 | problems.append( |
118 | - "missing chk node(s) for parent_id_basename_to_file_id maps") |
119 | + "missing chk node(s) for parent_id_basename_to_file_id maps: %s" |
120 | + % (str(e),)) |
121 | present_text_keys = no_fallback_texts_index.get_parent_map(text_keys) |
122 | missing_text_keys = text_keys.difference(present_text_keys) |
123 | if missing_text_keys: |
124 | @@ -1158,6 +1159,8 @@ |
125 | self._text_fetch_order = 'groupcompress' |
126 | self._chk_id_roots = None |
127 | self._chk_p_id_roots = None |
128 | + self.texts_for_rev = None |
129 | + self.texts_for_inv = None |
130 | |
131 | def _get_inventory_stream(self, inventory_keys, allow_absent=False): |
132 | """Get a stream of inventory texts. |
133 | @@ -1167,6 +1170,7 @@ |
134 | """ |
135 | self._chk_id_roots = [] |
136 | self._chk_p_id_roots = [] |
137 | + self._texts_for_inv = None |
138 | def _filtered_inv_stream(): |
139 | id_roots_set = set() |
140 | p_id_roots_set = set() |
141 | @@ -1180,8 +1184,13 @@ |
142 | else: |
143 | raise errors.NoSuchRevision(self, record.key) |
144 | bytes = record.get_bytes_as('fulltext') |
145 | - chk_inv = inventory.CHKInventory.deserialise(None, bytes, |
146 | - record.key) |
147 | + if record.key[-1] == self.texts_for_rev: |
148 | + chk_inv = inventory.CHKInventory.deserialise( |
149 | + self.from_repository.chk_bytes, bytes, record.key) |
150 | + self.texts_for_inv = chk_inv |
151 | + else: |
152 | + chk_inv = inventory.CHKInventory.deserialise(None, bytes, |
153 | + record.key) |
154 | key = chk_inv.id_to_entry.key() |
155 | if key not in id_roots_set: |
156 | self._chk_id_roots.append(key) |
157 | @@ -1222,6 +1231,22 @@ |
158 | inv.parent_id_basename_to_file_id.key()) |
159 | chk_bytes = self.from_repository.chk_bytes |
160 | def _filter_id_to_entry(): |
161 | + if self.texts_for_inv is not None: |
162 | + # Emit all CHK records for this inventory, and add all texts |
163 | + # those nodes reference. |
164 | + keys = set() |
165 | + for node in self.texts_for_inv.id_to_entry.iter_nodes(): |
166 | + keys.add(node.key()) |
167 | + if type(node) is chk_map.LeafNode: |
168 | + self._text_keys.update( |
169 | + [chk_map._bytes_to_text_key(b) for b in |
170 | + node._items.values()]) |
171 | + stream = self.from_repository.chk_bytes.get_record_stream( |
172 | + keys, 'unordered', True) |
173 | + for record in stream: |
174 | + yield record |
175 | + uninteresting_root_keys.add( |
176 | + self.texts_for_inv.id_to_entry.key()) |
177 | interesting_nodes = chk_map.iter_interesting_nodes(chk_bytes, |
178 | self._chk_id_roots, uninteresting_root_keys) |
179 | for record in _filter_text_keys(interesting_nodes, self._text_keys, |
180 | @@ -1232,12 +1257,25 @@ |
181 | self._chk_id_roots = None |
182 | yield 'chk_bytes', _filter_id_to_entry() |
183 | def _get_parent_id_basename_to_file_id_pages(): |
184 | + if self.texts_for_inv is not None: |
185 | + # Emit all CHK records for this inventory. |
186 | + keys = set() |
187 | + inv = self.texts_for_inv |
188 | + for node in inv.parent_id_basename_to_file_id.iter_nodes(): |
189 | + keys.add(node.key()) |
190 | + stream = self.from_repository.chk_bytes.get_record_stream( |
191 | + keys, 'unordered', True) |
192 | + for record in stream: |
193 | + yield record |
194 | + uninteresting_pid_root_keys.add( |
195 | + inv.parent_id_basename_to_file_id.key()) |
196 | for record, items in chk_map.iter_interesting_nodes(chk_bytes, |
197 | self._chk_p_id_roots, uninteresting_pid_root_keys): |
198 | if record is not None: |
199 | yield record |
200 | # Consumed |
201 | self._chk_p_id_roots = None |
202 | + self.texts_for_inv = None |
203 | yield 'chk_bytes', _get_parent_id_basename_to_file_id_pages() |
204 | |
205 | def _get_text_stream(self): |
206 | @@ -1260,6 +1298,20 @@ |
207 | yield record |
208 | |
209 | revision_ids = search.get_keys() |
210 | + texts_for_rev = None |
211 | + for special in search.get_special(): |
212 | + if special[0] == 'texts-for-rev': |
213 | + if texts_for_rev is not None: |
214 | + raise errors.BzrError("Cannot handle special search: %r" % |
215 | + (search,)) |
216 | + texts_for_rev = special[1] |
217 | + if texts_for_rev not in revision_ids: |
218 | + raise AssertionError('texts_for_rev %r not in %r' |
219 | + % (texts_for_rev, revision_ids)) |
220 | + else: |
221 | + raise errors.BzrError("Cannot handle special search: %r" % |
222 | + (search,)) |
223 | + revision_ids = frozenset(revision_ids) |
224 | pb = ui.ui_factory.nested_progress_bar() |
225 | rc = self._record_counter |
226 | self._record_counter.setup(len(revision_ids)) |
227 | @@ -1278,6 +1330,7 @@ |
228 | # Clear the repo's get_parent_map cache too. |
229 | self.from_repository._unstacked_provider.disable_cache() |
230 | self.from_repository._unstacked_provider.enable_cache() |
231 | + self.texts_for_rev = texts_for_rev |
232 | s = self._get_inventory_stream(self._revision_keys) |
233 | yield (s[0], wrap_and_count(pb, rc, s[1])) |
234 | self.from_repository.inventories.clear_cache() |
235 | |
236 | === modified file 'bzrlib/smart/repository.py' |
237 | --- bzrlib/smart/repository.py 2011-12-19 13:23:58 +0000 |
238 | +++ bzrlib/smart/repository.py 2012-01-25 17:56:30 +0000 |
239 | @@ -99,6 +99,10 @@ |
240 | elif lines[0] == 'search': |
241 | return self.recreate_search_from_recipe(repository, lines[1:], |
242 | discard_excess=discard_excess) |
243 | + elif lines[0] == 'texts-for-rev': |
244 | + if len(lines) > 2: |
245 | + return (None, FailedSmartServerResponse(('BadSearch',))) |
246 | + return (vf_search.TextsForRevisionSearchResult(lines[1]), None) |
247 | else: |
248 | return (None, FailedSmartServerResponse(('BadSearch',))) |
249 | |
250 | |
251 | === modified file 'bzrlib/tests/blackbox/test_branch.py' |
252 | --- bzrlib/tests/blackbox/test_branch.py 2012-01-19 12:01:12 +0000 |
253 | +++ bzrlib/tests/blackbox/test_branch.py 2012-01-25 17:56:30 +0000 |
254 | @@ -424,13 +424,12 @@ |
255 | |
256 | def test_branch_stacked(self): |
257 | # We have a mainline |
258 | - trunk_tree = self.make_branch_and_tree('mainline', |
259 | - format='1.9') |
260 | + trunk_tree = self.make_branch_and_tree('mainline') |
261 | original_revid = trunk_tree.commit('mainline') |
262 | self.assertRevisionInRepository('mainline', original_revid) |
263 | # and a branch from it which is stacked |
264 | out, err = self.run_bzr(['branch', '--stacked', 'mainline', |
265 | - 'newbranch']) |
266 | + 'newbranch', '--no-tree']) |
267 | self.assertEqual('', out) |
268 | self.assertEqual('Created new stacked branch referring to %s.\n' % |
269 | trunk_tree.branch.base, err) |
270 | |
271 | === modified file 'bzrlib/tests/per_branch/test_stacking.py' |
272 | --- bzrlib/tests/per_branch/test_stacking.py 2011-11-24 17:24:21 +0000 |
273 | +++ bzrlib/tests/per_branch/test_stacking.py 2012-01-25 17:56:30 +0000 |
274 | @@ -143,7 +143,8 @@ |
275 | trunk_revid = trunk_tree.commit('mainline') |
276 | # and make branch from it which is stacked |
277 | try: |
278 | - new_dir = trunk_tree.bzrdir.sprout('newbranch', stacked=True) |
279 | + new_dir = trunk_tree.bzrdir.sprout('newbranch', stacked=True, |
280 | + create_tree_if_local=False) |
281 | except unstackable_format_errors, e: |
282 | raise TestNotApplicable(e) |
283 | # stacked repository |
284 | @@ -160,14 +161,16 @@ |
285 | trunk_revid = trunk_tree.commit('mainline') |
286 | # Make sure that we can make a stacked branch from it |
287 | try: |
288 | - trunk_tree.bzrdir.sprout('testbranch', stacked=True) |
289 | + trunk_tree.bzrdir.sprout('testbranch', stacked=True, |
290 | + create_tree_if_local=False) |
291 | except unstackable_format_errors, e: |
292 | raise TestNotApplicable(e) |
293 | # Now serve the original mainline from a smart server |
294 | remote_transport = self.make_smart_server('mainline') |
295 | remote_bzrdir = bzrdir.BzrDir.open_from_transport(remote_transport) |
296 | # and make branch from the smart server which is stacked |
297 | - new_dir = remote_bzrdir.sprout('newbranch', stacked=True) |
298 | + new_dir = remote_bzrdir.sprout('newbranch', stacked=True, |
299 | + create_tree_if_local=False) |
300 | # stacked repository |
301 | self.assertRevisionNotInRepository('newbranch', trunk_revid) |
302 | tree = new_dir.open_branch().create_checkout('local') |
303 | @@ -187,7 +190,8 @@ |
304 | mainline_revid = 'rev-1' |
305 | # and make branch from it which is stacked (with no tags) |
306 | try: |
307 | - new_dir = trunk.bzrdir.sprout(self.get_url('newbranch'), stacked=True) |
308 | + new_dir = trunk.bzrdir.sprout(self.get_url('newbranch'), |
309 | + stacked=True, create_tree_if_local=False) |
310 | except unstackable_format_errors, e: |
311 | raise TestNotApplicable(e) |
312 | # stacked repository |
313 | @@ -534,14 +538,12 @@ |
314 | stack_on.commit('first commit', rev_id='rev1') |
315 | try: |
316 | stacked_dir = stack_on.bzrdir.sprout( |
317 | - self.get_url('stacked'), stacked=True) |
318 | + self.get_url('stacked'), stacked=True, |
319 | + create_tree_if_local=False) |
320 | except unstackable_format_errors, e: |
321 | raise TestNotApplicable('Format does not support stacking.') |
322 | - try: |
323 | - stacked = stacked_dir.open_workingtree() |
324 | - except errors.NoWorkingTree: |
325 | - stacked = stacked_dir.open_branch().create_checkout( |
326 | - 'stacked-checkout', lightweight=True) |
327 | + stacked = stacked_dir.open_branch().create_checkout( |
328 | + 'stacked-checkout', lightweight=True) |
329 | tree = stacked.branch.create_checkout('local') |
330 | tree.commit('second commit', rev_id='rev2') |
331 | # Sanity check: stacked's repo should not contain rev1, otherwise this |
332 | |
333 | === modified file 'bzrlib/tests/per_repository/test_commit_builder.py' |
334 | --- bzrlib/tests/per_repository/test_commit_builder.py 2011-12-22 16:17:28 +0000 |
335 | +++ bzrlib/tests/per_repository/test_commit_builder.py 2012-01-25 17:56:30 +0000 |
336 | @@ -284,6 +284,7 @@ |
337 | if not builder.supports_record_entry_contents: |
338 | raise tests.TestNotApplicable("CommitBuilder doesn't support " |
339 | "record_entry_contents") |
340 | + builder.will_record_deletes() |
341 | ie = inventory.make_entry('directory', '', None, |
342 | tree.get_root_id()) |
343 | delta, version_recorded, fs_hash = builder.record_entry_contents( |
344 | @@ -299,7 +300,7 @@ |
345 | if got_new_revision: |
346 | self.assertEqual(('', '', ie.file_id, ie), delta) |
347 | # The delta should be tracked |
348 | - self.assertEqual(delta, builder._basis_delta[-1]) |
349 | + self.assertEqual(delta, builder.get_basis_delta()[-1]) |
350 | else: |
351 | self.assertEqual(None, delta) |
352 | # Directories do not get hashed. |
353 | @@ -441,7 +442,7 @@ |
354 | # the delta should be returned, and recorded in _basis_delta |
355 | delta = builder.record_delete("foo", "foo-id") |
356 | self.assertEqual(("foo", None, "foo-id", None), delta) |
357 | - self.assertEqual(delta, builder._basis_delta[-1]) |
358 | + self.assertEqual(delta, builder.get_basis_delta()[-1]) |
359 | builder.finish_inventory() |
360 | rev_id2 = builder.commit('delete foo') |
361 | except: |
362 | @@ -463,13 +464,14 @@ |
363 | try: |
364 | builder = tree.branch.get_commit_builder([rev_id]) |
365 | try: |
366 | + builder.will_record_deletes() |
367 | delete_change = ('foo-id', ('foo', None), True, (True, False), |
368 | (tree.path2id(''), None), ('foo', None), ('file', None), |
369 | (False, None)) |
370 | list(builder.record_iter_changes(tree, rev_id, |
371 | [delete_change])) |
372 | self.assertEqual(("foo", None, "foo-id", None), |
373 | - builder._basis_delta[0]) |
374 | + builder.get_basis_delta()[0]) |
375 | self.assertTrue(builder.any_changes()) |
376 | builder.finish_inventory() |
377 | rev_id2 = builder.commit('delete foo') |
378 | @@ -866,6 +868,7 @@ |
379 | if not builder.supports_record_entry_contents: |
380 | raise tests.TestNotApplicable("CommitBuilder doesn't " |
381 | "support record_entry_contents") |
382 | + builder.will_record_deletes() |
383 | parent_tree = tree.basis_tree() |
384 | parent_tree.lock_read() |
385 | self.addCleanup(parent_tree.unlock) |
386 | @@ -913,7 +916,8 @@ |
387 | if delta_against_basis: |
388 | expected_delta = (name, new_name, file_id, new_entry) |
389 | # The delta should be recorded |
390 | - self.assertEqual(expected_delta, builder._basis_delta[-1]) |
391 | + self.assertEqual(expected_delta, |
392 | + builder.get_basis_delta()[-1]) |
393 | else: |
394 | expected_delta = None |
395 | self.assertEqual(expected_delta, delta) |
396 | @@ -954,6 +958,7 @@ |
397 | # record_entry_contents. |
398 | parent_ids = tree.get_parent_ids() |
399 | builder = tree.branch.get_commit_builder(parent_ids) |
400 | + builder.will_record_deletes() |
401 | parent_tree = tree.basis_tree() |
402 | parent_tree.lock_read() |
403 | self.addCleanup(parent_tree.unlock) |
404 | @@ -975,15 +980,6 @@ |
405 | self.assertEqualStat(result[2][1], tree_file_stat[1]) |
406 | else: |
407 | self.assertEqual([], result) |
408 | - delta = builder._basis_delta |
409 | - delta_dict = dict((change[2], change) for change in delta) |
410 | - version_recorded = (file_id in delta_dict and |
411 | - delta_dict[file_id][3] is not None and |
412 | - delta_dict[file_id][3].revision == builder._new_revision_id) |
413 | - if records_version: |
414 | - self.assertTrue(version_recorded) |
415 | - else: |
416 | - self.assertFalse(version_recorded) |
417 | self.assertIs(None, builder.new_inventory) |
418 | builder.finish_inventory() |
419 | if tree.branch.repository._format.supports_full_versioned_files: |
420 | @@ -993,6 +989,16 @@ |
421 | self.assertEqual(inv_sha1, builder.inv_sha1) |
422 | self.assertIs(None, builder.new_inventory) |
423 | rev2 = builder.commit('') |
424 | + delta = builder.get_basis_delta() |
425 | + delta_dict = dict((change[2], change) for change in delta) |
426 | + version_recorded = (file_id in delta_dict and |
427 | + delta_dict[file_id][3] is not None and |
428 | + delta_dict[file_id][3].revision == rev2) |
429 | + if records_version: |
430 | + self.assertTrue(version_recorded) |
431 | + else: |
432 | + self.assertFalse(version_recorded) |
433 | + |
434 | new_inventory = builder.revision_tree().inventory |
435 | new_entry = new_inventory[file_id] |
436 | if delta_against_basis: |
437 | |
438 | === modified file 'bzrlib/tests/per_repository_reference/test_commit_with_stacking.py' |
439 | --- bzrlib/tests/per_repository_reference/test_commit_with_stacking.py 2011-01-11 00:36:10 +0000 |
440 | +++ bzrlib/tests/per_repository_reference/test_commit_with_stacking.py 2012-01-25 17:56:30 +0000 |
441 | @@ -39,13 +39,10 @@ |
442 | base_tree.commit('base adds f2', rev_id=self.r2_key[0]) |
443 | stacked_url = urlutils.join(base_tree.branch.base, '../stacked') |
444 | stacked_bzrdir = base_tree.bzrdir.sprout(stacked_url, |
445 | - stacked=True) |
446 | - if isinstance(stacked_bzrdir, remote.RemoteBzrDir): |
447 | - stacked_branch = stacked_bzrdir.open_branch() |
448 | - stacked_tree = stacked_branch.create_checkout('stacked', |
449 | - lightweight=True) |
450 | - else: |
451 | - stacked_tree = stacked_bzrdir.open_workingtree() |
452 | + stacked=True, create_tree_if_local=False) |
453 | + stacked_branch = stacked_bzrdir.open_branch() |
454 | + stacked_tree = stacked_branch.create_checkout('stacked-tree', |
455 | + lightweight=True) |
456 | return base_tree, stacked_tree |
457 | |
458 | |
459 | @@ -79,7 +76,7 @@ |
460 | base_tree, stacked_tree = self.make_stacked_target() |
461 | self.assertEqual(1, |
462 | len(stacked_tree.branch.repository._fallback_repositories)) |
463 | - self.build_tree_contents([('stacked/f1.txt', 'new content\n')]) |
464 | + self.build_tree_contents([('stacked-tree/f1.txt', 'new content\n')]) |
465 | stacked_tree.commit('new content', rev_id='new-rev-id') |
466 | # We open the repository without fallbacks to ensure the data is |
467 | # locally true |
468 | @@ -142,7 +139,7 @@ |
469 | def test_multi_stack(self): |
470 | """base + stacked + stacked-on-stacked""" |
471 | base_tree, stacked_tree = self.make_stacked_target() |
472 | - self.build_tree(['stacked/f3.txt']) |
473 | + self.build_tree(['stacked-tree/f3.txt']) |
474 | stacked_tree.add(['f3.txt'], ['f3.txt-id']) |
475 | stacked_key = ('stacked-rev-id',) |
476 | stacked_tree.commit('add f3', rev_id=stacked_key[0]) |
477 | @@ -153,16 +150,13 @@ |
478 | stacked2_url = urlutils.join(base_tree.branch.base, '../stacked2') |
479 | stacked2_bzrdir = stacked_tree.bzrdir.sprout(stacked2_url, |
480 | revision_id=self.r1_key[0], |
481 | - stacked=True) |
482 | - if isinstance(stacked2_bzrdir, remote.RemoteBzrDir): |
483 | - stacked2_branch = stacked2_bzrdir.open_branch() |
484 | - stacked2_tree = stacked2_branch.create_checkout('stacked2', |
485 | - lightweight=True) |
486 | - else: |
487 | - stacked2_tree = stacked2_bzrdir.open_workingtree() |
488 | + stacked=True, create_tree_if_local=False) |
489 | + stacked2_branch = stacked2_bzrdir.open_branch() |
490 | + stacked2_tree = stacked2_branch.create_checkout('stacked2-tree', |
491 | + lightweight=True) |
492 | # stacked2 is stacked on stacked, but its content is rev1, so |
493 | # it needs to pull the basis information from a fallback-of-fallback. |
494 | - self.build_tree(['stacked2/f3.txt']) |
495 | + self.build_tree(['stacked2-tree/f3.txt']) |
496 | stacked2_only_repo = self.get_only_repo(stacked2_tree) |
497 | self.assertPresent([], stacked2_only_repo.inventories, |
498 | [self.r1_key, self.r2_key]) |
499 | |
500 | === modified file 'bzrlib/tests/test_remote.py' |
501 | --- bzrlib/tests/test_remote.py 2012-01-06 22:06:36 +0000 |
502 | +++ bzrlib/tests/test_remote.py 2012-01-25 17:56:30 +0000 |
503 | @@ -3934,11 +3934,11 @@ |
504 | self.setup_smart_server_with_call_log() |
505 | tree1 = self.make_branch_and_tree('tree1', format='1.9') |
506 | tree1.commit('rev1', rev_id='rev1') |
507 | - tree2 = tree1.branch.bzrdir.sprout('tree2', stacked=True |
508 | - ).open_workingtree() |
509 | - local_tree = tree2.branch.create_checkout('local') |
510 | + stacked_b = tree1.branch.bzrdir.sprout('branch2', |
511 | + stacked=True, create_tree_if_local=False).open_branch() |
512 | + local_tree = stacked_b.create_checkout('local') |
513 | local_tree.commit('local changes make me feel good.') |
514 | - branch2 = Branch.open(self.get_url('tree2')) |
515 | + branch2 = Branch.open(self.get_url('branch2')) |
516 | branch2.lock_read() |
517 | self.addCleanup(branch2.unlock) |
518 | return tree1.branch, branch2 |
519 | @@ -4013,9 +4013,9 @@ |
520 | # is itself stacked yields the full data from all three sources. |
521 | def make_stacked_stacked(): |
522 | _, stacked = self.prepare_stacked_remote_branch() |
523 | - tree = stacked.bzrdir.sprout('tree3', stacked=True |
524 | - ).open_workingtree() |
525 | - local_tree = tree.branch.create_checkout('local-tree3') |
526 | + local_branch = stacked.bzrdir.sprout('tree3', stacked=True, |
527 | + create_tree_if_local=False).open_branch() |
528 | + local_tree = local_branch.create_checkout('local-tree3') |
529 | local_tree.commit('more local changes are better') |
530 | branch = Branch.open(self.get_url('tree3')) |
531 | branch.lock_read() |
532 | @@ -4113,37 +4113,60 @@ |
533 | override_existing=True, info=orig_info) |
534 | |
535 | def test_fetch_everything_backwards_compat(self): |
536 | - """Can fetch with EverythingResult even with pre 2.4 servers. |
537 | + """Can fetch with EverythingResult even with pre 2.5 servers. |
538 | |
539 | - Pre-2.4 do not support 'everything' searches with the |
540 | + Pre-2.5 do not support 'everything' searches with the |
541 | Repository.get_stream_1.19 verb. |
542 | """ |
543 | - verb_log = [] |
544 | + self.disable_search(lambda bytes: bytes == 'everything') |
545 | + local = self.make_branch('local') |
546 | + builder = self.make_branch_builder('remote') |
547 | + builder.build_commit(message="Commit.") |
548 | + remote_branch_url = self.smart_server.get_url() + 'remote' |
549 | + remote_branch = bzrdir.BzrDir.open(remote_branch_url).open_branch() |
550 | + self.hpss_calls = [] |
551 | + fetch_spec = vf_search.EverythingResult(remote_branch.repository) |
552 | + local.repository.fetch(remote_branch.repository, fetch_spec=fetch_spec) |
553 | + # make sure the overridden verb was used |
554 | + self.assertLength(1, self.verb_log) |
555 | + # more than one HPSS call is needed, but because it's a VFS callback |
556 | + # its hard to predict exactly how many. |
557 | + self.assertTrue(len(self.hpss_calls) > 1) |
558 | + |
559 | + def disable_search(self, reject_search_func): |
560 | + self.verb_log = verb_log = [] |
561 | class OldGetStreamVerb(SmartServerRepositoryGetStream_1_19): |
562 | """A version of the Repository.get_stream_1.19 verb patched to |
563 | - reject 'everything' searches the way 2.3 and earlier do. |
564 | + reject searches matching reject_search_func (to emulate old |
565 | + servers). |
566 | """ |
567 | def recreate_search(self, repository, search_bytes, |
568 | discard_excess=False): |
569 | verb_log.append(search_bytes.split('\n', 1)[0]) |
570 | - if search_bytes == 'everything': |
571 | - return (None, |
572 | - request.FailedSmartServerResponse(('BadSearch',))) |
573 | + if reject_search_func(search_bytes): |
574 | + return (None, request.FailedSmartServerResponse(('BadSearch',))) |
575 | return super(OldGetStreamVerb, |
576 | self).recreate_search(repository, search_bytes, |
577 | discard_excess=discard_excess) |
578 | self.override_verb('Repository.get_stream_1.19', OldGetStreamVerb) |
579 | + |
580 | + def test_fetch_texts_for_rev_backwards_compat(self): |
581 | + """Can fetch with TextsForRevisionSearchResult even with pre 2.5 servers. |
582 | + |
583 | + Pre-2.5 servers do not support 'texts-for-rev' searches with the |
584 | + Repository.get_stream_1.19 verb. |
585 | + """ |
586 | + self.disable_search(lambda bytes: bytes.startswith('texts-for-rev\n')) |
587 | local = self.make_branch('local') |
588 | builder = self.make_branch_builder('remote') |
589 | - builder.build_commit(message="Commit.") |
590 | + builder.build_commit(message="Commit.", rev_id='rev-1') |
591 | remote_branch_url = self.smart_server.get_url() + 'remote' |
592 | remote_branch = bzrdir.BzrDir.open(remote_branch_url).open_branch() |
593 | self.hpss_calls = [] |
594 | - local.repository.fetch( |
595 | - remote_branch.repository, |
596 | - fetch_spec=vf_search.EverythingResult(remote_branch.repository)) |
597 | + local.repository.fetch(remote_branch.repository, |
598 | + fetch_spec=vf_search.TextsForRevisionSearchResult('rev-1')) |
599 | # make sure the overridden verb was used |
600 | - self.assertLength(1, verb_log) |
601 | + self.assertLength(1, self.verb_log) |
602 | # more than one HPSS call is needed, but because it's a VFS callback |
603 | # its hard to predict exactly how many. |
604 | self.assertTrue(len(self.hpss_calls) > 1) |
605 | |
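The backward-compatibility tests above work by wrapping the server-side verb so that it rejects particular search bodies, which forces the client down the VFS fallback path as an old server would. A minimal standalone sketch of that predicate-based rejection pattern (all class and function names here are illustrative stand-ins, not real bzrlib APIs):

```python
# Sketch of the predicate-based verb override used in the tests above.
# GetStreamVerb and make_rejecting_verb are illustrative, not bzrlib APIs.

class GetStreamVerb(object):
    """Stand-in for a server-side verb that parses search bodies."""

    def recreate_search(self, search_bytes):
        # A modern server accepts any search body.
        return ('ok', search_bytes)


def make_rejecting_verb(reject_search_func, verb_log):
    """Build a verb class that fails searches matching a predicate,
    emulating an older server that does not understand them."""

    class OldGetStreamVerb(GetStreamVerb):
        def recreate_search(self, search_bytes):
            # Log only the first line of the search body, as the test does.
            verb_log.append(search_bytes.split('\n', 1)[0])
            if reject_search_func(search_bytes):
                # Signal 'BadSearch' so the client falls back to VFS.
                return ('BadSearch', None)
            return super(OldGetStreamVerb, self).recreate_search(search_bytes)

    return OldGetStreamVerb


verb_log = []
Verb = make_rejecting_verb(lambda b: b == 'everything', verb_log)
verb = Verb()
assert verb.recreate_search('everything')[0] == 'BadSearch'
assert verb.recreate_search('search\nrev-1')[0] == 'ok'
assert verb_log == ['everything', 'search']
```

The same factory covers both tests: passing `lambda b: b.startswith('texts-for-rev\n')` emulates a server that rejects only the new 'texts-for-rev' search type.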
606 | === modified file 'bzrlib/tests/test_repository.py' |
607 | --- bzrlib/tests/test_repository.py 2011-12-22 19:54:56 +0000 |
608 | +++ bzrlib/tests/test_repository.py 2012-01-25 17:56:30 +0000 |
609 | @@ -1559,7 +1559,8 @@ |
610 | """ |
611 | b_source = self.make_abc_branch() |
612 | b_base = b_source.bzrdir.sprout('base', revision_id='A').open_branch() |
613 | - b_stacked = b_base.bzrdir.sprout('stacked', stacked=True).open_branch() |
614 | + b_stacked = b_base.bzrdir.sprout('stacked', stacked=True, |
615 | + create_tree_if_local=False).open_branch() |
616 | b_stacked.lock_write() |
617 | self.addCleanup(b_stacked.unlock) |
618 | b_stacked.fetch(b_source, 'B') |
619 | |
620 | === modified file 'bzrlib/vf_search.py' |
621 | --- bzrlib/vf_search.py 2011-12-19 13:23:58 +0000 |
622 | +++ bzrlib/vf_search.py 2012-01-25 17:56:30 +0000 |
623 | @@ -61,6 +61,9 @@ |
624 | """ |
625 | raise NotImplementedError(self.get_keys) |
626 | |
627 | + def get_special(self): |
628 | + return [] |
629 | + |
630 | def is_empty(self): |
631 | """Return false if the search lists 1 or more revisions.""" |
632 | raise NotImplementedError(self.is_empty) |
633 | @@ -509,3 +512,93 @@ |
634 | # walking, so we can exclude them from the start list. |
635 | start_keys = set(start_keys).difference(found_heads) |
636 | return start_keys, exclude_keys, len(keys) |
637 | + |
638 | + |
639 | +class UnionSearchResult(AbstractSearchResult): |
640 | + """The union of multiple search results.""" |
641 | + |
642 | + def __init__(self, search_results): |
643 | + self.search_results = frozenset(search_results) |
644 | + |
645 | + def get_recipe(self): |
646 | + """Return a recipe that can be used to replay this search. |
647 | + |
648 | + :return: A tuple of ('union', (recipe, ...)). |
649 | + """ |
650 | + return ('union', tuple(sr.get_recipe() for sr in self.search_results)) |
651 | + |
652 | + def get_network_struct(self): |
653 | + """Return a tuple that can be transmitted via the HPSS protocol.""" |
654 | + return ('union', |
655 | + tuple(sr.get_network_struct() for sr in self.search_results)) |
656 | + |
657 | + def get_keys(self): |
658 | + """Return the keys found in this search. |
659 | + |
660 | + :return: A set of keys. |
661 | + """ |
662 | + return set().union(*[sr.get_keys() for sr in self.search_results]) |
663 | + |
664 | + def get_special(self): |
665 | + specials = [] |
666 | + for sr in self.search_results: |
667 | + specials.extend(sr.get_special()) |
668 | + return specials |
669 | + |
670 | + def is_empty(self): |
671 | + """Return false if the search lists 1 or more revisions.""" |
672 | + return all(sr.is_empty() for sr in self.search_results) |
673 | + |
674 | + def refine(self, seen, referenced): |
675 | + """Create a new search by refining this search. |
676 | + |
677 | + :param seen: Revisions that have been satisfied. |
678 | + :param referenced: Revision references observed while satisfying some |
679 | + of this search. |
680 | + :return: A search result. |
681 | + """ |
682 | + raise NotImplementedError(self.refine) |
683 | + |
684 | + |
685 | +class TextsForRevisionSearchResult(AbstractSearchResult): |
686 | + """A search result specifying all the texts for a given revision.""" |
687 | + |
688 | + def __init__(self, revision_id): |
689 | + """Constructor. |
690 | + |
691 | + :param revision_id: A revision ID. All records needed to create a |
692 | + checkout of this revision are referenced by this result (i.e. the |
693 | + revision record and corresponding inventory record, and all |
694 | + referenced texts and chk_bytes). |
695 | + """ |
696 | + self.revision_id = revision_id |
697 | + |
698 | + def get_recipe(self): |
699 | + """Return a recipe that can be used to replay this search. |
700 | + |
701 | + :return: A tuple of ('texts-for-rev', revision_id) |
702 | + """ |
703 | + return ('texts-for-rev', self.revision_id) |
704 | + |
705 | + def get_network_struct(self): |
706 | + """Return a tuple that can be transmitted via the HPSS protocol.""" |
707 | + return ('texts-for-rev', self.revision_id) |
708 | + |
709 | + def get_keys(self): |
710 | + return [self.revision_id] |
711 | + |
712 | + def get_special(self): |
713 | + return [('texts-for-rev', self.revision_id)] |
714 | + |
715 | + def is_empty(self): |
716 | + """Return false if the search lists 1 or more revisions.""" |
717 | + return False |
718 | + |
719 | + def refine(self, seen, referenced): |
720 | + # XXX: this is pretty wrong, but texts-for-rev is advisory rather than |
721 | + # mandatory I guess... |
722 | + #trace.mutter("seen, ref'd: %r, %r", seen, referenced) |
723 | + res = SearchResult(set(), set(referenced), 0, set()).refine(seen, referenced) |
724 | + #trace.mutter("-> %r", res) |
725 | + return res |
726 | + #raise NotImplementedError(self.refine) |
727 | |
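The new search-result types above serialize to simple tuples for the wire, and 'union' recipes nest the recipes of their members. A standalone sketch of how the two compose (these are minimal illustrative stand-ins, not the actual bzrlib classes; the sketch sorts member recipes only to make its output deterministic):

```python
# Minimal stand-ins for the search-result classes added in vf_search.py.
# Illustrative only; the real classes also implement get_network_struct,
# get_special, is_empty and refine.

class TextsForRevisionSearchResult(object):
    """All records needed to build a checkout of one revision."""

    def __init__(self, revision_id):
        self.revision_id = revision_id

    def get_recipe(self):
        return ('texts-for-rev', self.revision_id)

    def get_keys(self):
        return set([self.revision_id])


class UnionSearchResult(object):
    """The union of multiple search results."""

    def __init__(self, search_results):
        self.search_results = frozenset(search_results)

    def get_recipe(self):
        # Sorted here purely so the sketch's output is deterministic.
        return ('union',
                tuple(sorted(sr.get_recipe() for sr in self.search_results)))

    def get_keys(self):
        # Merge the members' key sets explicitly; a set of sets would
        # not be hashable.
        keys = set()
        for sr in self.search_results:
            keys.update(sr.get_keys())
        return keys


a = TextsForRevisionSearchResult('rev-1')
b = TextsForRevisionSearchResult('rev-2')
u = UnionSearchResult([a, b])
assert u.get_keys() == set(['rev-1', 'rev-2'])
assert u.get_recipe() == ('union', (('texts-for-rev', 'rev-1'),
                                    ('texts-for-rev', 'rev-2')))
```

This is the shape the client sends via `Repository.get_stream_1.19`: a server that understands 'texts-for-rev' streams every record for the revision in one pass, while an older server answers 'BadSearch' and the client falls back to per-text VFS fetching.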
728 | === modified file 'doc/en/release-notes/bzr-2.5.txt' |
729 | --- doc/en/release-notes/bzr-2.5.txt 2012-01-23 15:02:51 +0000 |
730 | +++ doc/en/release-notes/bzr-2.5.txt 2012-01-25 17:56:30 +0000 |
731 | @@ -36,6 +36,10 @@ |
732 | a control directory (but no branch or working tree). |
733 | (Jelmer Vernooij, #913980) |
734 | |
735 | +* Much improved performance of ``bzr branch --stacked`` against the |
736 | + smart server, by introduction of a new search type. |
737 | + (Andrew Bennetts, Jelmer Vernooij, #791704) |
738 | + |
739 | * New HPSS call for ``BzrDir.get_branches``. (Jelmer Vernooij, #894460) |
740 | |
741 | * Two new command hooks, ``pre_command`` and ``post_command``, |
I haven't read it extremely carefully, but it looks basically OK.
Thanks for pushing it up.
It would be nice to get some NEWS and What's New entries that give
some indication of how much performance has actually changed.