Merge lp:~spiv/bzr/fetch-spec-everything-not-in-other into lp:bzr

Proposed by Andrew Bennetts
Status: Superseded
Proposed branch: lp:~spiv/bzr/fetch-spec-everything-not-in-other
Merge into: lp:bzr
Diff against target: 981 lines (+488/-74)
13 files modified
bzrlib/fetch.py (+36/-11)
bzrlib/graph.py (+174/-2)
bzrlib/remote.py (+30/-9)
bzrlib/repofmt/knitrepo.py (+19/-11)
bzrlib/repofmt/weaverepo.py (+19/-11)
bzrlib/repository.py (+94/-18)
bzrlib/smart/repository.py (+16/-0)
bzrlib/tests/per_interrepository/test_interrepository.py (+9/-2)
bzrlib/tests/per_repository_reference/test_fetch.py (+40/-1)
bzrlib/tests/test_remote.py (+27/-2)
bzrlib/tests/test_repository.py (+1/-1)
bzrlib/tests/test_smart.py (+18/-6)
doc/en/release-notes/bzr-2.3.txt (+5/-0)
To merge this branch: bzr merge lp:~spiv/bzr/fetch-spec-everything-not-in-other
Reviewer Review Type Date Requested Status
John A Meinel Needs Fixing
Review via email: mp+42078@code.launchpad.net

This proposal has been superseded by a proposal from 2010-12-06.

Commit message

Add EverythingResult, EverythingNotInOther, and NotInOtherForRevs.

Description of the change

Like <https://code.launchpad.net/~spiv/bzr/sprout-does-not-reopen-repo/+merge/41037>, this is a step towards bug 309682 (i.e. it is a prerequisite for <lp:~spiv/bzr/fetch-all-tags-309682>), as well as being a general improvement.

This patch defines some new “fetch specs” to be passed as the fetch_spec argument of fetch: EverythingResult, EmptySearchResult, EverythingNotInOther, and NotInOtherForRevs. I think the naming is a bit off: the original fetch spec was called “SearchResult”, which made some sense, but there's also “PendingAncestryResult”, which isn't really a “result”. So perhaps we should rename some or all of these things. As an added wrinkle, EverythingNotInOther and NotInOtherForRevs aren't directly “results” at all, but factories that can be turned into a result. Suggestions for clearer naming would be very welcome!
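
For illustration, a rough sketch of how a caller might use these (a sketch only; to_repo and from_repo stand for already-open, locked Repository objects, and the actual fetch() plumbing is in the diff below):

    from bzrlib import graph

    # Lazily describe "everything from_repo has that to_repo lacks":
    to_repo.fetch(from_repo,
        fetch_spec=graph.EverythingNotInOther(to_repo, from_repo))

    # Or the ancestry of specific heads that must exist in the source:
    to_repo.fetch(from_repo,
        fetch_spec=graph.NotInOtherForRevs(to_repo, from_repo, ['tip-rev-id']))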

A nice thing about adding these objects is that they shift the logic for handling these different cases out of the innards of fetch.py to somewhere more modular. This in turn means we can support these cases better over HPSS, and so this patch does that: the 'everything' search is now part of the network protocol. Happily, no new verb was needed: we can safely interpret a BadSearch error in this case as meaning we need to fall back.
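
As a sketch of the wire details (method names are from the diff below; repo and repository stand for the client- and server-side repository objects):

    # Client side: EverythingResult serialises to a one-element struct, which
    # _serialise_search_result joins into the request body for get_stream.
    parts = graph.EverythingResult(repo).get_network_struct()  # ('everything',)
    body = '\n'.join(parts)                                     # 'everything'

    # Server side: _recreate_search special-cases that body.  A pre-2.3 server
    # raises BadSearch instead, which the client treats as "remote is older
    # than 2.3" and falls back.
    if body == 'everything':
        search = graph.EverythingResult(repository)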

Another change in this patch is the deprecation of search_missing_revision_ids' revision_id argument in favour of the new revision_ids argument (plural).
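
For example (assuming target and source_repo are Repository objects):

    # Deprecated in 2.3; emits a DeprecationWarning:
    target.search_missing_revision_ids(source_repo, revision_id='rev-1')

    # New equivalent:
    target.search_missing_revision_ids(source_repo, revision_ids=['rev-1'])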

Revision history for this message
John A Meinel (jameinel) wrote :

+ if not isinstance(search, graph.EverythingResult):
^- can we add an attribute, rather than an isinstance check?

if not search.is_everything():
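
For illustration, the suggested flag might look like this (a sketch, not something in the patch):

    class AbstractSearchResult(object):

        def is_everything(self):
            # Ordinary search results enumerate specific revisions.
            return False


    class EverythingResult(AbstractSearchResult):

        def is_everything(self):
            return True

    # ...letting fetch.py ask the object instead of checking its type:
    if not search.is_everything():
        missing_keys.update(self._parent_inventories(search.get_keys()))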

490 + def _present_source_revisions_for(self, revision_ids):
491 + """Returns set of all revisions in ancestry of revision_ids present in
492 + the source repo.
493 +
494 + :param revision_ids: if None, all revisions in source are returned.
495 + """
496 + if revision_ids is not None:
497 + # First, ensure all specified revisions exist. Callers expect
498 + # NoSuchRevision when they pass absent revision_ids here.
499 + revision_ids = set(revision_ids)
500 + graph = self.source.get_graph()
501 + present_revs = set(graph.get_parent_map(revision_ids))
502 + missing = revision_ids.difference(present_revs)
503 + if missing:
504 + raise errors.NoSuchRevision(self.source, missing.pop())
505 + source_ids = [rev_id for (rev_id, parents) in
506 + self.source.get_graph().iter_ancestry(revision_ids)
507 + if rev_id != _mod_revision.NULL_REVISION
508 + and parents is not None]
^- you have "graph = self.source.get_graph()" and then you inline "self.source.get_graph()" into the source_ids list comprehension. Better to re-use the graph object.
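
(Roughly, the suggested change, reusing the graph object already bound above:)

    # Reuse the graph from the earlier get_graph() call instead of asking
    # the source repository for it a second time.
    source_ids = [rev_id for (rev_id, parents) in
                  graph.iter_ancestry(revision_ids)
                  if rev_id != _mod_revision.NULL_REVISION
                  and parents is not None]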

145 + def get_recipe(self):
146 + raise NotImplementedError(self.get_recipe)
147 +
148 + def get_network_struct(self):
149 + return ('everything',)

Why don't EverythingNotInOther and NotInOtherForRevs implement these functions?

Overall, I think these changes seem good, but I don't really see how this gets us closer to computing a search without having one end doing a step-by-step search. (Certainly you're still using the same 'search_missing_revision_ids', and you don't seem to be serializing anything over the wire...)

review: Needs Fixing
Revision history for this message
Andrew Bennetts (spiv) wrote :

John A Meinel wrote:
> Review: Needs Fixing
> + if not isinstance(search, graph.EverythingResult):
> ^- can we add an attribute, rather than an isinstance check?
>
> if not search.is_everything():

Hmm, I can do that, although currently there's no common base class for these
objects, and my gut feeling is that it would be good to keep their interface as
small and simple as possible. So I'm unsure about whether or not to do this. I
think it's probably easier to add this later than it is to remove it later.

> 490 + def _present_source_revisions_for(self, revision_ids):
> 491 + """Returns set of all revisions in ancestry of revision_ids present in
> 492 + the source repo.
> 493 +
> 494 + :param revision_ids: if None, all revisions in source are returned.
> 495 + """
> 496 + if revision_ids is not None:
> 497 + # First, ensure all specified revisions exist. Callers expect
> 498 + # NoSuchRevision when they pass absent revision_ids here.
> 499 + revision_ids = set(revision_ids)
> 500 + graph = self.source.get_graph()
> 501 + present_revs = set(graph.get_parent_map(revision_ids))
> 502 + missing = revision_ids.difference(present_revs)
> 503 + if missing:
> 504 + raise errors.NoSuchRevision(self.source, missing.pop())
> 505 + source_ids = [rev_id for (rev_id, parents) in
> 506 + self.source.get_graph().iter_ancestry(revision_ids)
> 507 + if rev_id != _mod_revision.NULL_REVISION
> 508 + and parents is not None]
> ^- you have "graph = self.source.get_graph()" and then you inline "self.source.get_graph()" into the source_ids list comprehension. Better to re-use the graph object.

Fixed, thanks for spotting that. I think it occurred because I initially
copy-and-pasted the list comprehension from graph.py.

> 145 + def get_recipe(self):
> 146 + raise NotImplementedError(self.get_recipe)
> 147 +
> 148 + def get_network_struct(self):
> 149 + return ('everything',)
>
> Why don't EverythingNotInOther and NotInOtherForRevs implement these functions?

Because they are a different kind of object. Obviously that's not clear enough
in the current code!

The names could stand to be improved, but there are basically two kinds of
“fetch spec” in this patch:

 * a network-ready search result: SearchResult, PendingAncestryResult,
   EverythingResult. They implement all the methods SearchResult implements
   (I'm not certain that's the perfect design, but it's not too far off.)
 * a lightweight object that can be resolved into a network-ready search result
   later: EverythingNotInOther, NotInOtherForRevs. They implement a
   get_search_result method that inspects the repository/ies to generate the
   search result.
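
In concrete terms (a sketch with assumed variable names), the second kind is cheap to construct and only does repository work when it is resolved:

    # A lightweight description of the desired fetch; no searching yet.
    spec = graph.NotInOtherForRevs(to_repo, from_repo, ['tip-rev-id'],
                                   if_present_ids=['possible-tag-rev-id'])

    # Resolving it runs search_missing_revision_ids and yields a
    # network-ready result that knows how to serialise itself.
    result = spec.get_search_result()
    result.get_network_struct()    # e.g. ('search', '<recipe bytes>')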

> Overall, I think these changes seem good, but I don't really see how this gets
> us closer to computing a search without having one end doing a step-by-step
> search. (Certainly you're still using the same 'search_missing_revision_ids',
> and you don't seem to be serializing anything over the wire...)

Right, this change isn't meant to solve that problem. The concrete performance
issue these changes help with is addressed in the next patch,
fetch-all-tags-309682, by making it easy for the sprout code to request a
fetch of all the tags at the same time as the tip.
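
(A hypothetical sketch of that follow-up usage, not part of this branch: the
tip is required, while tag revisions are fetched only if the source actually
has them.)

    spec = graph.NotInOtherForRevs(to_repo, from_repo,
        required_ids=[tip_revision_id],
        if_present_ids=tag_revision_ids)
    to_repo.fetch(from_repo, fetch_spec=spec)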

Separately...

Read more...

Preview Diff

1=== modified file 'bzrlib/fetch.py'
2--- bzrlib/fetch.py 2010-09-17 04:35:23 +0000
3+++ bzrlib/fetch.py 2010-12-03 07:07:15 +0000
4@@ -28,6 +28,7 @@
5 from bzrlib.lazy_import import lazy_import
6 lazy_import(globals(), """
7 from bzrlib import (
8+ graph,
9 tsort,
10 versionedfile,
11 )
12@@ -93,7 +94,8 @@
13 try:
14 pb.update("Finding revisions", 0, 2)
15 search = self._revids_to_fetch()
16- if search is None:
17+ mutter('fetching: %s', search)
18+ if search.is_empty():
19 return
20 pb.update("Fetching revisions", 1, 2)
21 self._fetch_everything_for_search(search)
22@@ -126,8 +128,15 @@
23 resume_tokens, missing_keys = self.sink.insert_stream(
24 stream, from_format, [])
25 if self.to_repository._fallback_repositories:
26- missing_keys.update(
27- self._parent_inventories(search.get_keys()))
28+ if not isinstance(search, graph.EverythingResult):
29+                # If search is EverythingResult, this is unnecessary,
30+ # so we can skip this step. The source will send us
31+ # every revision it has, and their parent inventories.
32+ # (Unless the source is damaged! but not really worth
33+ # optimising for that case. The pack code will reject bad
34+ # streams anyway.)
35+ missing_keys.update(
36+ self._parent_inventories(search.get_keys()))
37 if missing_keys:
38 pb.update("Missing keys")
39 stream = source.get_stream_for_missing_keys(missing_keys)
40@@ -151,17 +160,33 @@
41 """Determines the exact revisions needed from self.from_repository to
42 install self._last_revision in self.to_repository.
43
44- If no revisions need to be fetched, then this just returns None.
45+ :returns: A SearchResult of some sort. (Possibly a
46+ PendingAncestryResult, EmptySearchResult, etc.)
47 """
48- if self._fetch_spec is not None:
49+ mutter("self._fetch_spec, self._last_revision: %r, %r",
50+ self._fetch_spec, self._last_revision)
51+ get_search_result = getattr(self._fetch_spec, 'get_search_result', None)
52+ if get_search_result is not None:
53+ mutter(
54+ 'resolving fetch_spec into search result: %s', self._fetch_spec)
55+ # This is EverythingNotInOther or a similar kind of fetch_spec.
56+ # Turn it into a search result.
57+ return get_search_result()
58+ elif self._fetch_spec is not None:
59+ # The fetch spec is already a concrete search result.
60 return self._fetch_spec
61- mutter('fetch up to rev {%s}', self._last_revision)
62- if self._last_revision is NULL_REVISION:
63+ elif self._last_revision == NULL_REVISION:
64+ # fetch_spec is None + last_revision is null => empty fetch.
65 # explicit limit of no revisions needed
66- return None
67- return self.to_repository.search_missing_revision_ids(
68- self.from_repository, self._last_revision,
69- find_ghosts=self.find_ghosts)
70+ return graph.EmptySearchResult()
71+ elif self._last_revision is not None:
72+ return graph.NotInOtherForRevs(self.to_repository,
73+ self.from_repository, [self._last_revision],
74+ find_ghosts=self.find_ghosts).get_search_result()
75+ else: # self._last_revision is None:
76+ return graph.EverythingNotInOther(self.to_repository,
77+ self.from_repository,
78+ find_ghosts=self.find_ghosts).get_search_result()
79
80 def _parent_inventories(self, revision_ids):
81 # Find all the parent revisions referenced by the stream, but
82
83=== modified file 'bzrlib/graph.py'
84--- bzrlib/graph.py 2010-08-15 15:20:14 +0000
85+++ bzrlib/graph.py 2010-12-03 07:07:15 +0000
86@@ -1536,7 +1536,57 @@
87 return revs, ghosts
88
89
90-class SearchResult(object):
91+class AbstractSearchResult(object):
92+
93+ def get_recipe(self):
94+ """Return a recipe that can be used to replay this search.
95+
96+ The recipe allows reconstruction of the same results at a later date.
97+
98+ :return: A tuple of (search_kind_str, *details). The details vary by
99+ kind of search result.
100+ """
101+ raise NotImplementedError(self.get_recipe)
102+
103+ def get_network_struct(self):
104+ """Return a tuple that can be transmitted via the HPSS protocol."""
105+ raise NotImplementedError(self.get_network_struct)
106+
107+ def get_keys(self):
108+ """Return the keys found in this search.
109+
110+ :return: A set of keys.
111+ """
112+ raise NotImplementedError(self.get_keys)
113+
114+ def is_empty(self):
115+ """Return false if the search lists 1 or more revisions."""
116+ raise NotImplementedError(self.is_empty)
117+
118+ def refine(self, seen, referenced):
119+ """Create a new search by refining this search.
120+
121+ :param seen: Revisions that have been satisfied.
122+ :param referenced: Revision references observed while satisfying some
123+ of this search.
124+ :return: A search result.
125+ """
126+ raise NotImplementedError(self.refine)
127+
128+
129+class AbstractSearch(object):
130+
131+ def get_search_result(self):
132+ """Construct a network-ready search result from this search description.
133+
134+ This may take some time to search repositories, etc.
135+
136+ :return: A search result.
137+ """
138+ raise NotImplementedError(self.get_search_result)
139+
140+
141+class SearchResult(AbstractSearchResult):
142 """The result of a breadth first search.
143
144 A SearchResult provides the ability to reconstruct the search or access a
145@@ -1557,6 +1607,19 @@
146 self._recipe = ('search', start_keys, exclude_keys, key_count)
147 self._keys = frozenset(keys)
148
149+ def __repr__(self):
150+ kind, start_keys, exclude_keys, key_count = self._recipe
151+ if len(start_keys) > 5:
152+ start_keys_repr = repr(list(start_keys)[:5])[:-1] + ', ...]'
153+ else:
154+ start_keys_repr = repr(start_keys)
155+ if len(exclude_keys) > 5:
156+ exclude_keys_repr = repr(list(exclude_keys)[:5])[:-1] + ', ...]'
157+ else:
158+ exclude_keys_repr = repr(exclude_keys)
159+ return '<%s %s:(%s, %s, %d)>' % (self.__class__.__name__,
160+ kind, start_keys_repr, exclude_keys_repr, key_count)
161+
162 def get_recipe(self):
163 """Return a recipe that can be used to replay this search.
164
165@@ -1580,6 +1643,12 @@
166 """
167 return self._recipe
168
169+ def get_network_struct(self):
170+ start_keys = ' '.join(self._recipe[1])
171+ stop_keys = ' '.join(self._recipe[2])
172+ count = str(self._recipe[3])
173+ return (self._recipe[0], '\n'.join((start_keys, stop_keys, count)))
174+
175 def get_keys(self):
176 """Return the keys found in this search.
177
178@@ -1617,7 +1686,7 @@
179 return SearchResult(pending_refs, exclude, count, keys)
180
181
182-class PendingAncestryResult(object):
183+class PendingAncestryResult(AbstractSearchResult):
184 """A search result that will reconstruct the ancestry for some graph heads.
185
186 Unlike SearchResult, this doesn't hold the complete search result in
187@@ -1647,6 +1716,11 @@
188 """
189 return ('proxy-search', self.heads, set(), -1)
190
191+ def get_network_struct(self):
192+ parts = ['ancestry-of']
193+ parts.extend(self.heads)
194+ return parts
195+
196 def get_keys(self):
197 """See SearchResult.get_keys.
198
199@@ -1679,6 +1753,104 @@
200 return PendingAncestryResult(referenced - seen, self.repo)
201
202
203+class EmptySearchResult(AbstractSearchResult):
204+ """An empty search result."""
205+
206+ def is_empty(self):
207+ return True
208+
209+
210+class EverythingResult(AbstractSearchResult):
211+ """A search result that simply requests everything in the repository."""
212+
213+ def __init__(self, repo):
214+ self._repo = repo
215+
216+ def __repr__(self):
217+ return '%s(%r)' % (self.__class__.__name__, self._repo)
218+
219+ def get_recipe(self):
220+ raise NotImplementedError(self.get_recipe)
221+
222+ def get_network_struct(self):
223+ return ('everything',)
224+
225+ def get_keys(self):
226+ if 'evil' in debug.debug_flags:
227+ from bzrlib import remote
228+ if isinstance(self._repo, remote.RemoteRepository):
229+ # warn developers (not users) not to do this
230+ trace.mutter_callsite(
231+ 2, "EverythingResult(RemoteRepository).get_keys() is slow.")
232+ return self._repo.all_revision_ids()
233+
234+ def is_empty(self):
235+ # It's ok for this to wrongly return False: the worst that can happen
236+ # is that RemoteStreamSource will initiate a get_stream on an empty
237+ # repository. And almost all repositories are non-empty.
238+ return False
239+
240+ def refine(self, seen, referenced):
241+ heads = set(self._repo.all_revision_ids())
242+ heads.difference_update(seen)
243+ heads.update(referenced)
244+ return PendingAncestryResult(heads, self._repo)
245+
246+
247+class EverythingNotInOther(AbstractSearch):
248+    """Find all revisions that are in one repo but not the other."""
249+
250+ def __init__(self, to_repo, from_repo, find_ghosts=False):
251+ self.to_repo = to_repo
252+ self.from_repo = from_repo
253+ self.find_ghosts = find_ghosts
254+
255+ def get_search_result(self):
256+ return self.to_repo.search_missing_revision_ids(
257+ self.from_repo, find_ghosts=self.find_ghosts)
258+
259+
260+class NotInOtherForRevs(AbstractSearch):
261+    """Find all revisions missing in one repo for some specific heads."""
262+
263+ def __init__(self, to_repo, from_repo, required_ids, if_present_ids=None,
264+ find_ghosts=False):
265+ """Constructor.
266+
267+ :param required_ids: revision IDs of heads that must be found, or else
268+ the search will fail with NoSuchRevision. All revisions in their
269+ ancestry not already in the other repository will be included in
270+ the search result.
271+ :param if_present_ids: revision IDs of heads that may be absent in the
272+ source repository. If present, then their ancestry not already
273+ found in other will be included in the search result.
274+ """
275+ self.to_repo = to_repo
276+ self.from_repo = from_repo
277+ self.find_ghosts = find_ghosts
278+ self.required_ids = required_ids
279+ self.if_present_ids = if_present_ids
280+
281+ def __repr__(self):
282+ if len(self.required_ids) > 5:
283+ reqd_revs_repr = repr(list(self.required_ids)[:5])[:-1] + ', ...]'
284+ else:
285+ reqd_revs_repr = repr(self.required_ids)
286+ if self.if_present_ids and len(self.if_present_ids) > 5:
287+ ifp_revs_repr = repr(list(self.if_present_ids)[:5])[:-1] + ', ...]'
288+ else:
289+ ifp_revs_repr = repr(self.if_present_ids)
290+
291+ return "<%s from:%r to:%r find_ghosts:%r req'd:%r if-present:%r>" % (
292+ self.__class__.__name__, self.from_repo, self.to_repo,
293+ self.find_ghosts, reqd_revs_repr, ifp_revs_repr)
294+
295+ def get_search_result(self):
296+ return self.to_repo.search_missing_revision_ids(
297+ self.from_repo, revision_ids=self.required_ids,
298+ if_present_ids=self.if_present_ids, find_ghosts=self.find_ghosts)
299+
300+
301 def collapse_linear_regions(parent_map):
302 """Collapse regions of the graph that are 'linear'.
303
304
305=== modified file 'bzrlib/remote.py'
306--- bzrlib/remote.py 2010-12-02 10:41:05 +0000
307+++ bzrlib/remote.py 2010-12-03 07:07:15 +0000
308@@ -1344,15 +1344,29 @@
309 return result
310
311 @needs_read_lock
312- def search_missing_revision_ids(self, other, revision_id=None, find_ghosts=True):
313+ def search_missing_revision_ids(self, other,
314+ revision_id=symbol_versioning.DEPRECATED_PARAMETER,
315+ find_ghosts=True, revision_ids=None, if_present_ids=None):
316 """Return the revision ids that other has that this does not.
317
318 These are returned in topological order.
319
320 revision_id: only return revision ids included by revision_id.
321 """
322- return repository.InterRepository.get(
323- other, self).search_missing_revision_ids(revision_id, find_ghosts)
324+ if symbol_versioning.deprecated_passed(revision_id):
325+ symbol_versioning.warn(
326+ 'search_missing_revision_ids(revision_id=...) was '
327+ 'deprecated in 2.3. Use revision_ids=[...] instead.',
328+ DeprecationWarning, stacklevel=2)
329+ if revision_ids is not None:
330+ raise AssertionError(
331+ 'revision_ids is mutually exclusive with revision_id')
332+ if revision_id is not None:
333+ revision_ids = [revision_id]
334+ inter_repo = repository.InterRepository.get(other, self)
335+ return inter_repo.search_missing_revision_ids(
336+ find_ghosts=find_ghosts, revision_ids=revision_ids,
337+ if_present_ids=if_present_ids)
338
339 def fetch(self, source, revision_id=None, pb=None, find_ghosts=False,
340 fetch_spec=None):
341@@ -1759,12 +1773,7 @@
342 return '\n'.join((start_keys, stop_keys, count))
343
344 def _serialise_search_result(self, search_result):
345- if isinstance(search_result, graph.PendingAncestryResult):
346- parts = ['ancestry-of']
347- parts.extend(search_result.heads)
348- else:
349- recipe = search_result.get_recipe()
350- parts = [recipe[0], self._serialise_search_recipe(recipe)]
351+ parts = search_result.get_network_struct()
352 return '\n'.join(parts)
353
354 def autopack(self):
355@@ -1964,6 +1973,7 @@
356 candidate_verbs = [
357 ('Repository.get_stream_1.19', (1, 19)),
358 ('Repository.get_stream', (1, 13))]
359+
360 found_verb = False
361 for verb, version in candidate_verbs:
362 if medium._is_remote_before(version):
363@@ -1973,6 +1983,17 @@
364 verb, args, search_bytes)
365 except errors.UnknownSmartMethod:
366 medium._remember_remote_is_before(version)
367+ except errors.UnknownErrorFromSmartServer, e:
368+ if isinstance(search, graph.EverythingResult):
369+ error_verb = e.error_from_smart_server.error_verb
370+ if error_verb == 'BadSearch':
371+ # Pre-2.3 servers don't support this sort of search.
372+ # XXX: perhaps falling back to VFS on BadSearch is a
373+ # good idea in general? It might provide a little bit
374+ # of protection against client-side bugs.
375+ medium._remember_remote_is_before((2, 3))
376+ break
377+ raise
378 else:
379 response_tuple, response_handler = response
380 found_verb = True
381
382=== modified file 'bzrlib/repofmt/knitrepo.py'
383--- bzrlib/repofmt/knitrepo.py 2010-11-20 21:41:05 +0000
384+++ bzrlib/repofmt/knitrepo.py 2010-12-03 07:07:15 +0000
385@@ -43,6 +43,7 @@
386 RepositoryFormat,
387 RootCommitBuilder,
388 )
389+from bzrlib import symbol_versioning
390
391
392 class _KnitParentsProvider(object):
393@@ -534,16 +535,23 @@
394 return are_knits and InterRepository._same_model(source, target)
395
396 @needs_read_lock
397- def search_missing_revision_ids(self, revision_id=None, find_ghosts=True):
398- """See InterRepository.missing_revision_ids()."""
399- if revision_id is not None:
400- source_ids = self.source.get_ancestry(revision_id)
401- if source_ids[0] is not None:
402- raise AssertionError()
403- source_ids.pop(0)
404- else:
405- source_ids = self.source.all_revision_ids()
406- source_ids_set = set(source_ids)
407+ def search_missing_revision_ids(self,
408+ revision_id=symbol_versioning.DEPRECATED_PARAMETER,
409+ find_ghosts=True, revision_ids=None, if_present_ids=None):
410+        """See InterRepository.search_missing_revision_ids()."""
411+ if symbol_versioning.deprecated_passed(revision_id):
412+ symbol_versioning.warn(
413+ 'search_missing_revision_ids(revision_id=...) was '
414+ 'deprecated in 2.3. Use revision_ids=[...] instead.',
415+ DeprecationWarning, stacklevel=2)
416+ if revision_ids is not None:
417+ raise AssertionError(
418+ 'revision_ids is mutually exclusive with revision_id')
419+ if revision_id is not None:
420+ revision_ids = [revision_id]
421+ del revision_id
422+ source_ids_set = self._present_source_revisions_for(
423+ revision_ids, if_present_ids)
424 # source_ids is the worst possible case we may need to pull.
425 # now we want to filter source_ids against what we actually
426 # have in target, but don't try to check for existence where we know
427@@ -553,7 +561,7 @@
428 actually_present_revisions = set(
429 self.target._eliminate_revisions_not_present(possibly_present_revisions))
430 required_revisions = source_ids_set.difference(actually_present_revisions)
431- if revision_id is not None:
432+ if revision_ids is not None:
433 # we used get_ancestry to determine source_ids then we are assured all
434 # revisions referenced are present as they are installed in topological order.
435 # and the tip revision was validated by get_ancestry.
436
437=== modified file 'bzrlib/repofmt/weaverepo.py'
438--- bzrlib/repofmt/weaverepo.py 2010-11-20 21:41:05 +0000
439+++ bzrlib/repofmt/weaverepo.py 2010-12-03 07:07:15 +0000
440@@ -39,6 +39,7 @@
441 lockable_files,
442 lockdir,
443 osutils,
444+ symbol_versioning,
445 trace,
446 urlutils,
447 versionedfile,
448@@ -803,8 +804,10 @@
449 self.target.fetch(self.source, revision_id=revision_id)
450
451 @needs_read_lock
452- def search_missing_revision_ids(self, revision_id=None, find_ghosts=True):
453- """See InterRepository.missing_revision_ids()."""
454+ def search_missing_revision_ids(self,
455+ revision_id=symbol_versioning.DEPRECATED_PARAMETER,
456+ find_ghosts=True, revision_ids=None, if_present_ids=None):
457+ """See InterRepository.search_missing_revision_ids()."""
458 # we want all revisions to satisfy revision_id in source.
459 # but we don't want to stat every file here and there.
460 # we want then, all revisions other needs to satisfy revision_id
461@@ -816,14 +819,19 @@
462 # disk format scales terribly for push anyway due to rewriting
463 # inventory.weave, this is considered acceptable.
464 # - RBC 20060209
465- if revision_id is not None:
466- source_ids = self.source.get_ancestry(revision_id)
467- if source_ids[0] is not None:
468- raise AssertionError()
469- source_ids.pop(0)
470- else:
471- source_ids = self.source._all_possible_ids()
472- source_ids_set = set(source_ids)
473+ if symbol_versioning.deprecated_passed(revision_id):
474+ symbol_versioning.warn(
475+ 'search_missing_revision_ids(revision_id=...) was '
476+ 'deprecated in 2.3. Use revision_ids=[...] instead.',
477+ DeprecationWarning, stacklevel=2)
478+ if revision_ids is not None:
479+ raise AssertionError(
480+ 'revision_ids is mutually exclusive with revision_id')
481+ if revision_id is not None:
482+ revision_ids = [revision_id]
483+ del revision_id
484+ source_ids_set = self._present_source_revisions_for(
485+ revision_ids, if_present_ids)
486 # source_ids is the worst possible case we may need to pull.
487 # now we want to filter source_ids against what we actually
488 # have in target, but don't try to check for existence where we know
489@@ -833,7 +841,7 @@
490 actually_present_revisions = set(
491 self.target._eliminate_revisions_not_present(possibly_present_revisions))
492 required_revisions = source_ids_set.difference(actually_present_revisions)
493- if revision_id is not None:
494+ if revision_ids is not None:
495 # we used get_ancestry to determine source_ids then we are assured all
496 # revisions referenced are present as they are installed in topological order.
497 # and the tip revision was validated by get_ancestry.
498
499=== modified file 'bzrlib/repository.py'
500--- bzrlib/repository.py 2010-12-02 10:41:05 +0000
501+++ bzrlib/repository.py 2010-12-03 07:07:15 +0000
502@@ -42,7 +42,6 @@
503 pyutils,
504 revision as _mod_revision,
505 static_tuple,
506- symbol_versioning,
507 trace,
508 tsort,
509 versionedfile,
510@@ -57,6 +56,7 @@
511 from bzrlib import (
512 errors,
513 registry,
514+ symbol_versioning,
515 ui,
516 )
517 from bzrlib.decorators import needs_read_lock, needs_write_lock, only_raises
518@@ -1561,15 +1561,28 @@
519 return ret
520
521 @needs_read_lock
522- def search_missing_revision_ids(self, other, revision_id=None, find_ghosts=True):
523+ def search_missing_revision_ids(self, other,
524+ revision_id=symbol_versioning.DEPRECATED_PARAMETER,
525+ find_ghosts=True, revision_ids=None, if_present_ids=None):
526 """Return the revision ids that other has that this does not.
527
528 These are returned in topological order.
529
530 revision_id: only return revision ids included by revision_id.
531 """
532+ if symbol_versioning.deprecated_passed(revision_id):
533+ symbol_versioning.warn(
534+ 'search_missing_revision_ids(revision_id=...) was '
535+ 'deprecated in 2.3. Use revision_ids=[...] instead.',
536+ DeprecationWarning, stacklevel=3)
537+ if revision_ids is not None:
538+ raise AssertionError(
539+ 'revision_ids is mutually exclusive with revision_id')
540+ if revision_id is not None:
541+ revision_ids = [revision_id]
542 return InterRepository.get(other, self).search_missing_revision_ids(
543- revision_id, find_ghosts)
544+ find_ghosts=find_ghosts, revision_ids=revision_ids,
545+ if_present_ids=if_present_ids)
546
547 @staticmethod
548 def open(base):
549@@ -3430,7 +3443,7 @@
550 fetch_spec=fetch_spec,
551 find_ghosts=find_ghosts)
552
553- def _walk_to_common_revisions(self, revision_ids):
554+ def _walk_to_common_revisions(self, revision_ids, if_present_ids=None):
555 """Walk out from revision_ids in source to revisions target has.
556
557 :param revision_ids: The start point for the search.
558@@ -3438,10 +3451,14 @@
559 """
560 target_graph = self.target.get_graph()
561 revision_ids = frozenset(revision_ids)
562+ if if_present_ids:
563+ all_wanted_revs = revision_ids.union(if_present_ids)
564+ else:
565+ all_wanted_revs = revision_ids
566 missing_revs = set()
567 source_graph = self.source.get_graph()
568 # ensure we don't pay silly lookup costs.
569- searcher = source_graph._make_breadth_first_searcher(revision_ids)
570+ searcher = source_graph._make_breadth_first_searcher(all_wanted_revs)
571 null_set = frozenset([_mod_revision.NULL_REVISION])
572 searcher_exhausted = False
573 while True:
574@@ -3483,30 +3500,79 @@
575 return searcher.get_result()
576
577 @needs_read_lock
578- def search_missing_revision_ids(self, revision_id=None, find_ghosts=True):
579+ def search_missing_revision_ids(self,
580+ revision_id=symbol_versioning.DEPRECATED_PARAMETER,
581+ find_ghosts=True, revision_ids=None, if_present_ids=None):
582 """Return the revision ids that source has that target does not.
583
584 :param revision_id: only return revision ids included by this
585- revision_id.
586+ revision_id.
587+ :param revision_ids: return revision ids included by these
588+ revision_ids. NoSuchRevision will be raised if any of these
589+ revisions are not present.
590+ :param if_present_ids: like revision_ids, but will not cause
591+ NoSuchRevision if any of these are absent, instead they will simply
592+ not be in the result. This is useful for e.g. finding revisions
593+ to fetch for tags, which may reference absent revisions.
594 :param find_ghosts: If True find missing revisions in deep history
595 rather than just finding the surface difference.
596 :return: A bzrlib.graph.SearchResult.
597 """
598+ if symbol_versioning.deprecated_passed(revision_id):
599+ symbol_versioning.warn(
600+ 'search_missing_revision_ids(revision_id=...) was '
601+ 'deprecated in 2.3. Use revision_ids=[...] instead.',
602+ DeprecationWarning, stacklevel=2)
603+ if revision_ids is not None:
604+ raise AssertionError(
605+ 'revision_ids is mutually exclusive with revision_id')
606+ if revision_id is not None:
607+ revision_ids = [revision_id]
608+ del revision_id
609 # stop searching at found target revisions.
610- if not find_ghosts and revision_id is not None:
611- return self._walk_to_common_revisions([revision_id])
612+ if not find_ghosts and (revision_ids is not None or if_present_ids is
613+ not None):
614+ return self._walk_to_common_revisions(revision_ids,
615+ if_present_ids=if_present_ids)
616 # generic, possibly worst case, slow code path.
617 target_ids = set(self.target.all_revision_ids())
618- if revision_id is not None:
619- source_ids = self.source.get_ancestry(revision_id)
620- if source_ids[0] is not None:
621- raise AssertionError()
622- source_ids.pop(0)
623- else:
624- source_ids = self.source.all_revision_ids()
625+ source_ids = self._present_source_revisions_for(
626+ revision_ids, if_present_ids)
627 result_set = set(source_ids).difference(target_ids)
628 return self.source.revision_ids_to_search_result(result_set)
629
630+ def _present_source_revisions_for(self, revision_ids, if_present_ids=None):
631+ """Returns set of all revisions in ancestry of revision_ids present in
632+ the source repo.
633+
634+ :param revision_ids: if None, all revisions in source are returned.
635+ :param if_present_ids: like revision_ids, but if any/all of these are
636+ absent no error is raised.
637+ """
638+ if revision_ids is not None or if_present_ids is not None:
639+ # First, ensure all specified revisions exist. Callers expect
640+ # NoSuchRevision when they pass absent revision_ids here.
641+ if revision_ids is None:
642+ revision_ids = set()
643+ if if_present_ids is None:
644+ if_present_ids = set()
645+ revision_ids = set(revision_ids)
646+ if_present_ids = set(if_present_ids)
647+ all_wanted_ids = revision_ids.union(if_present_ids)
648+ graph = self.source.get_graph()
649+ present_revs = set(graph.get_parent_map(all_wanted_ids))
650+ missing = revision_ids.difference(present_revs)
651+ if missing:
652+ raise errors.NoSuchRevision(self.source, missing.pop())
653+ found_ids = all_wanted_ids.intersection(present_revs)
654+ source_ids = [rev_id for (rev_id, parents) in
655+ graph.iter_ancestry(found_ids)
656+ if rev_id != _mod_revision.NULL_REVISION
657+ and parents is not None]
658+ else:
659+ source_ids = self.source.all_revision_ids()
660+ return set(source_ids)
661+
662 @staticmethod
663 def _same_model(source, target):
664 """True if source and target have the same data representation.
665@@ -3830,7 +3896,13 @@
666 fetch_spec=None):
667 """See InterRepository.fetch()."""
668 if fetch_spec is not None:
669- raise AssertionError("Not implemented yet...")
670+ if (isinstance(fetch_spec, graph.NotInOtherForRevs) and
671+ len(fetch_spec.required_ids) == 1 and not
672+ fetch_spec.if_present_ids):
673+ revision_id = list(fetch_spec.required_ids)[0]
674+ del fetch_spec
675+ else:
676+ raise AssertionError("Not implemented yet...")
677 ui.ui_factory.warn_experimental_format_fetch(self)
678 if (not self.source.supports_rich_root()
679 and self.target.supports_rich_root()):
680@@ -3843,8 +3915,12 @@
681 ui.ui_factory.show_user_warning('cross_format_fetch',
682 from_format=self.source._format,
683 to_format=self.target._format)
684+ if revision_id:
685+ search_revision_ids = [revision_id]
686+ else:
687+ search_revision_ids = None
688 revision_ids = self.target.search_missing_revision_ids(self.source,
689- revision_id, find_ghosts=find_ghosts).get_keys()
690+ revision_ids=search_revision_ids, find_ghosts=find_ghosts).get_keys()
691 if not revision_ids:
692 return 0, 0
693 revision_ids = tsort.topo_sort(
694
695=== modified file 'bzrlib/smart/repository.py'
696--- bzrlib/smart/repository.py 2010-11-16 06:06:11 +0000
697+++ bzrlib/smart/repository.py 2010-12-03 07:07:15 +0000
698@@ -81,6 +81,8 @@
699 recreate_search trusts that clients will look for missing things
700 they expected and get it from elsewhere.
701 """
702+ if search_bytes == 'everything':
703+ return graph.EverythingResult(repository), None
704 lines = search_bytes.split('\n')
705 if lines[0] == 'ancestry-of':
706 heads = lines[1:]
707@@ -412,6 +414,13 @@
708 def do_repository_request(self, repository, to_network_name):
709 """Get a stream for inserting into a to_format repository.
710
711+ The request body is 'search_bytes', a description of the revisions
712+ being requested.
713+
714+ In 2.3 this verb added support for search_bytes == 'everything'. Older
715+ implementations will respond with a BadSearch error, and clients should
716+ catch this and fall back appropriately.
717+
718 :param repository: The repository to stream from.
719 :param to_network_name: The network name of the format of the target
720 repository.
721@@ -489,6 +498,13 @@
722
723
724 class SmartServerRepositoryGetStream_1_19(SmartServerRepositoryGetStream):
725+ """The same as Repository.get_stream, but will return stream CHK formats to
726+ clients.
727+
728+ See SmartServerRepositoryGetStream._should_fake_unknown.
729+
730+ New in 1.19.
731+ """
732
733 def _should_fake_unknown(self):
734 """Returns False; we don't need to workaround bugs in 1.19+ clients."""
735
736=== modified file 'bzrlib/tests/per_interrepository/test_interrepository.py'
737--- bzrlib/tests/per_interrepository/test_interrepository.py 2009-07-10 06:46:10 +0000
738+++ bzrlib/tests/per_interrepository/test_interrepository.py 2010-12-03 07:07:15 +0000
739@@ -139,9 +139,15 @@
740 self.assertFalse(repo_b.has_revision('pizza'))
741 # Asking specifically for an absent revision errors.
742 self.assertRaises(errors.NoSuchRevision,
743- repo_b.search_missing_revision_ids, repo_a, revision_id='pizza',
744+ repo_b.search_missing_revision_ids, repo_a, revision_ids=['pizza'],
745 find_ghosts=True)
746 self.assertRaises(errors.NoSuchRevision,
747+ repo_b.search_missing_revision_ids, repo_a, revision_ids=['pizza'],
748+ find_ghosts=False)
749+ self.callDeprecated(
750+ ['search_missing_revision_ids(revision_id=...) was deprecated in '
751+ '2.3. Use revision_ids=[...] instead.'],
752+ self.assertRaises, errors.NoSuchRevision,
753 repo_b.search_missing_revision_ids, repo_a, revision_id='pizza',
754 find_ghosts=False)
755
756@@ -151,7 +157,8 @@
757 # make a repository to compare against that is empty
758 repo_b = self.make_to_repository('empty')
759 repo_a = self.bzrdir.open_repository()
760- result = repo_b.search_missing_revision_ids(repo_a, revision_id='rev1')
761+ result = repo_b.search_missing_revision_ids(
762+ repo_a, revision_ids=['rev1'])
763 self.assertEqual(set(['rev1']), result.get_keys())
764 self.assertEqual(('search', set(['rev1']), set([NULL_REVISION]), 1),
765 result.get_recipe())
766
767=== modified file 'bzrlib/tests/per_repository_reference/test_fetch.py'
768--- bzrlib/tests/per_repository_reference/test_fetch.py 2009-06-01 18:13:46 +0000
769+++ bzrlib/tests/per_repository_reference/test_fetch.py 2010-12-03 07:07:15 +0000
770@@ -18,6 +18,7 @@
771 from bzrlib import (
772 branch,
773 errors,
774+ graph,
775 )
776 from bzrlib.smart import (
777 server,
778@@ -25,7 +26,7 @@
779 from bzrlib.tests.per_repository import TestCaseWithRepository
780
781
782-class TestFetch(TestCaseWithRepository):
783+class TestFetchBase(TestCaseWithRepository):
784
785 def make_source_branch(self):
786 # It would be nice if there was a way to force this to be memory-only
787@@ -51,6 +52,9 @@
788 self.addCleanup(source_b.unlock)
789 return content, source_b
790
791+
792+class TestFetch(TestFetchBase):
793+
794 def test_sprout_from_stacked_with_short_history(self):
795 content, source_b = self.make_source_branch()
796 # Split the generated content into a base branch, and a stacked branch
797@@ -149,3 +153,38 @@
798 source_b.lock_read()
799 self.addCleanup(source_b.unlock)
800 stacked.pull(source_b, stop_revision='B-id')
801+
802+
803+class TestFetchFromRepoWithUnconfiguredFallbacks(TestFetchBase):
804+
805+ def make_stacked_source_repo(self):
806+ _, source_b = self.make_source_branch()
807+ # Use 'make_branch' which gives us a bzr:// branch when appropriate,
808+ # rather than creating a branch-on-disk
809+ stack_b = self.make_branch('stack-on')
810+ stack_b.pull(source_b, stop_revision='B-id')
811+ stacked_b = self.make_branch('stacked')
812+ stacked_b.set_stacked_on_url('../stack-on')
813+ stacked_b.pull(source_b, stop_revision='C-id')
814+ return stacked_b.repository
815+
816+ def test_fetch_everything_includes_parent_invs(self):
817+ stacked = self.make_stacked_source_repo()
818+ repo_missing_fallbacks = stacked.bzrdir.open_repository()
819+ self.addCleanup(repo_missing_fallbacks.lock_read().unlock)
820+ target = self.make_repository('target')
821+ self.addCleanup(target.lock_write().unlock)
822+ target.fetch(
823+ repo_missing_fallbacks,
824+ fetch_spec=graph.EverythingResult(repo_missing_fallbacks))
825+ self.assertEqual(repo_missing_fallbacks.revisions.keys(),
826+ target.revisions.keys())
827+ self.assertEqual(repo_missing_fallbacks.inventories.keys(),
828+ target.inventories.keys())
829+ self.assertEqual(['C-id'],
830+ sorted(k[-1] for k in target.revisions.keys()))
831+ self.assertEqual(['B-id', 'C-id'],
832+ sorted(k[-1] for k in target.inventories.keys()))
833+
834+
835+
836
837=== modified file 'bzrlib/tests/test_remote.py'
838--- bzrlib/tests/test_remote.py 2010-12-02 10:41:05 +0000
839+++ bzrlib/tests/test_remote.py 2010-12-03 07:07:15 +0000
840@@ -3182,11 +3182,36 @@
841
842 def test_copy_content_into_avoids_revision_history(self):
843 local = self.make_branch('local')
844- remote_backing_tree = self.make_branch_and_tree('remote')
845- remote_backing_tree.commit("Commit.")
846+ builder = self.make_branch_builder('remote')
847+ builder.build_commit(message="Commit.")
848 remote_branch_url = self.smart_server.get_url() + 'remote'
849 remote_branch = bzrdir.BzrDir.open(remote_branch_url).open_branch()
850 local.repository.fetch(remote_branch.repository)
851 self.hpss_calls = []
852 remote_branch.copy_content_into(local)
853 self.assertFalse('Branch.revision_history' in self.hpss_calls)
854+
855+ def test_fetch_everything_needs_just_one_call(self):
856+ local = self.make_branch('local')
857+ builder = self.make_branch_builder('remote')
858+ builder.build_commit(message="Commit.")
859+ remote_branch_url = self.smart_server.get_url() + 'remote'
860+ remote_branch = bzrdir.BzrDir.open(remote_branch_url).open_branch()
861+ self.hpss_calls = []
862+ local.repository.fetch(remote_branch.repository,
863+ fetch_spec=graph.EverythingResult(remote_branch.repository))
864+ self.assertEqual(['Repository.get_stream_1.19'], self.hpss_calls)
865+
866+ def test_fetch_everything_backwards_compat(self):
867+ """Can fetch with EverythingResult even when the server does not have
868+ the Repository.get_stream_2.3 verb.
869+ """
870+ local = self.make_branch('local')
871+ builder = self.make_branch_builder('remote')
872+ builder.build_commit(message="Commit.")
873+ remote_branch_url = self.smart_server.get_url() + 'remote'
874+ remote_branch = bzrdir.BzrDir.open(remote_branch_url).open_branch()
875+ self.hpss_calls = []
876+ local.repository.fetch(remote_branch.repository,
877+ fetch_spec=graph.EverythingResult(remote_branch.repository))
878+
879
880=== modified file 'bzrlib/tests/test_repository.py'
881--- bzrlib/tests/test_repository.py 2010-11-22 22:27:58 +0000
882+++ bzrlib/tests/test_repository.py 2010-12-03 07:07:15 +0000
883@@ -1659,7 +1659,7 @@
884 self.orig_pack = target.pack
885 target.pack = self.log_pack
886 search = target.search_missing_revision_ids(
887- source_tree.branch.repository, tip)
888+ source_tree.branch.repository, revision_ids=[tip])
889 stream = source.get_stream(search)
890 from_format = source_tree.branch.repository._format
891 sink = target._get_sink()
892
893=== modified file 'bzrlib/tests/test_smart.py'
894--- bzrlib/tests/test_smart.py 2010-05-13 16:17:54 +0000
895+++ bzrlib/tests/test_smart.py 2010-12-03 07:07:15 +0000
896@@ -25,15 +25,11 @@
897 """
898
899 import bz2
900-from cStringIO import StringIO
901-import tarfile
902
903 from bzrlib import (
904- bencode,
905 branch as _mod_branch,
906 bzrdir,
907 errors,
908- pack,
909 tests,
910 transport,
911 urlutils,
912@@ -45,7 +41,6 @@
913 repository as smart_repo,
914 packrepository as smart_packrepo,
915 request as smart_req,
916- server,
917 vfs,
918 )
919 from bzrlib.tests import test_server
920@@ -1474,7 +1469,7 @@
921 request.execute('stacked', 1, (3, r3)))
922
923
924-class TestSmartServerRepositoryGetStream(tests.TestCaseWithMemoryTransport):
925+class GetStreamTestBase(tests.TestCaseWithMemoryTransport):
926
927 def make_two_commit_repo(self):
928 tree = self.make_branch_and_memory_tree('.')
929@@ -1486,6 +1481,9 @@
930 repo = tree.branch.repository
931 return repo, r1, r2
932
933+
934+class TestSmartServerRepositoryGetStream(GetStreamTestBase):
935+
936 def test_ancestry_of(self):
937 """The search argument may be a 'ancestry-of' some heads'."""
938 backing = self.get_transport()
939@@ -1512,6 +1510,18 @@
940 stream_bytes = ''.join(response.body_stream)
941 self.assertStartsWith(stream_bytes, 'Bazaar pack format 1')
942
943+ def test_search_everything(self):
944+ """A search of 'everything' returns a stream."""
945+ backing = self.get_transport()
946+ request = smart_repo.SmartServerRepositoryGetStream_1_19(backing)
947+ repo, r1, r2 = self.make_two_commit_repo()
948+ serialised_fetch_spec = 'everything'
949+ request.execute('', repo._format.network_name())
950+ response = request.do_body(serialised_fetch_spec)
951+ self.assertEqual(('ok',), response.args)
952+ stream_bytes = ''.join(response.body_stream)
953+ self.assertStartsWith(stream_bytes, 'Bazaar pack format 1')
954+
955
956 class TestSmartServerRequestHasRevision(tests.TestCaseWithMemoryTransport):
957
958@@ -1911,6 +1921,8 @@
959 smart_repo.SmartServerRepositoryGetRevisionGraph)
960 self.assertHandlerEqual('Repository.get_stream',
961 smart_repo.SmartServerRepositoryGetStream)
962+ self.assertHandlerEqual('Repository.get_stream_1.19',
963+ smart_repo.SmartServerRepositoryGetStream_1_19)
964 self.assertHandlerEqual('Repository.has_revision',
965 smart_repo.SmartServerRequestHasRevision)
966 self.assertHandlerEqual('Repository.insert_stream',
967
968=== modified file 'doc/en/release-notes/bzr-2.3.txt'
969--- doc/en/release-notes/bzr-2.3.txt 2010-12-02 16:24:54 +0000
970+++ doc/en/release-notes/bzr-2.3.txt 2010-12-03 07:07:15 +0000
971@@ -128,6 +128,11 @@
972 crashes when encountering private bugs (they are just displayed as such).
973 (Vincent Ladeuil, #354985)
974
975+* The ``revision_id`` parameter of
976+ ``Repository.search_missing_revision_ids`` and
977+ ``InterRepository.search_missing_revision_ids`` is deprecated. It is
978+ replaced by the ``revision_ids`` parameter. (Andrew Bennetts)
979+
980 Internals
981 *********
982