Bazaar

Merge lp:~spiv/bzr/fetch-all-tags-309682 into lp:bzr

fetch-all-tags-309682
Merge into bzr.dev

Proposed by Andrew Bennetts on 2010-12-07

Status:

Merged

Merged at revision:

5648

Proposed branch:

lp:~spiv/bzr/fetch-all-tags-309682

Merge into:

lp:bzr

Prerequisite:

lp:~spiv/bzr/fetch-integration

Diff against target:

776 lines (+339/-98)

15 files modified

bzrlib/controldir.py (+58/-49)
bzrlib/fetch.py (+93/-17)
bzrlib/graph.py (+8/-6)
bzrlib/remote.py (+3/-3)
bzrlib/repofmt/knitrepo.py (+1/-1)
bzrlib/repofmt/weaverepo.py (+1/-1)
bzrlib/repository.py (+9/-9)
bzrlib/tests/blackbox/test_branch.py (+33/-2)
bzrlib/tests/per_branch/test_sprout.py (+19/-0)
bzrlib/tests/per_controldir/test_controldir.py (+101/-1)
bzrlib/tests/per_interrepository/test_interrepository.py (+1/-1)
bzrlib/tests/test_bzrdir.py (+4/-0)
bzrlib/tests/test_remote.py (+3/-3)
doc/en/release-notes/bzr-2.3.txt (+0/-5)
doc/en/release-notes/bzr-2.4.txt (+5/-0)

To merge this branch:

bzr merge lp:~spiv/bzr/fetch-all-tags-309682

Low

Fix Released

Reviewer	Review Type	Date Requested	Status
John A Meinel		2010-12-07	Needs Fixing on 2010-12-14
Review via email: mp+42910@code.launchpad.net

Commit message

'bzr branch' now fetches revisions referred to by tags.

Description of the change

This branch fixes <https://bugs.launchpad.net/bzr/+bug/309682>: bzr branch should copy revisions referred to in tags, not just the tip.

The main difficulty is the ugliness of sprout and to a lesser extent fetch. We don't want fetching the tags to regress our performance: i.e. no extra roundtrips for redundant locking/unlocking/locking again, no multiple calls to fetch for one sprout, no errors from bzr branch simply because a tag refers to a ghost, etc. At the same time we don't want to make the code any more complex and harder to understand than it already is!

The prerequisite branch addresses much of that, so that this patch on its own is fairly small. (The prerequisite branch is simply lp:~spiv/bzr/sprout-does-not-reopen-repo and lp:~spiv/bzr/fetch-spec-everything-not-in-other merged together. Both of those are in the review queue.)

I've taken the approach of moving much of the complexity out of the sprout method, into more focussed methods and objects which are hopefully easy to understand in isolation.

Part of the problem is that the interface of sprout is pretty confusing, as demonstrated by the one test failure I've currently got in this branch: bzrlib.tests.per_controldir.test_controldir.TestControlDir.test_sprout_bzrdir_repository_branch_only_source_under_shared

That test fails because it verifies that bzrdir.sprout(url) from a bzrdir with a branch in a shared repo copies the entire repository. I think this behaviour is wrong, because it's likely to surprise callers. A quick check on IRC suggests that lifeless didn't expect this behaviour either. And a quick skim of the bzrlib source suggests that at least 'bzr switch --create-branch' is encountering this behaviour without intending it. So I intend to remove or adjust the test for the new behaviour, but I thought I should get some reviewer feedback first.

More generally, the interaction between the various combinations of passing or not passing revision_id and source_branch and various configurations of source branches and repositories is too complex. It's not a friendly interface. Unfortunately adding consideration of tags into the mix is probably only making it messier. FWIW the behaviour now is:

* If a source branch is passed or found, only its tip and tags are fetched
  * if revision_id is passed, then it overrides the tip
* otherwise if a revision_id is passed, then that is fetched
* otherwise the whole repository is fetched.
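
The rules in this list can be sketched as a small decision function. This is only an illustration, not the bzrlib code: the function name plan_fetch and the dict-shaped branch (with "tip" and "tags" keys) are assumptions made up for the example.

```python
def plan_fetch(branch=None, revision_id=None):
    """Return (kind, heads, tag_revisions) for a sprout-like fetch.

    `branch` is a hypothetical stand-in for a source branch: a dict with
    a "tip" revision id and a "tags" mapping of tag name -> revision id.
    """
    if branch is not None:
        # A source branch was passed or found: fetch its tip and its tags,
        # letting an explicit revision_id override the tip.
        tip = revision_id if revision_id is not None else branch["tip"]
        return ("branch", {tip}, set(branch["tags"].values()))
    if revision_id is not None:
        # No branch, but an explicit revision: fetch just that.
        return ("revision", {revision_id}, set())
    # Neither: fall back to fetching the whole repository.
    return ("everything", set(), set())
```

Calling plan_fetch with a branch and a revision_id still fetches the branch's tags, which is the case the bug report cares about.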

I think much of the mess stems from the weirdness of sprouting a bzrdir rather than a repository or branch directly. I'm starting to think we should remove ControlDir.sprout.

Finally, while this addresses the case reported in bug 309682 (bzr branch), I think a complete fix requires that pull, merge, update, etc also fetch tags. I have a follow-up branch for that.

Revision history for this message

John A Meinel (jameinel) wrote on 2010-12-14:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> I think much of the mess stems from the weirdness of sprouting a bzrdir rather than a repository or branch directly. I'm starting to think we should remove ControlDir.sprout.
>

I'm in favor of this. It will mess up our test suite a ton, because that
has become the defacto method of creating a new branch. However using:

tree.branch.sprout(new_branch) makes a lot more sense to me than
tree.bzrdir.sprout(new_something)

I think some of it stems from, do you want a WT in the target or not,
etc, etc. I would be happier with something that you can compose. This
is also cluttered because you can't just open the WT from a Branch, it
will re-open the branch, etc. (Note that there was also bugs with
branch.sprout() which didn't happen with bzrdir.sprout(), something with
Trees and merge, and double-locking the WT, IIRC)

Things we want to keep in mind:

a) In the ideal interface, we would have a way to copy any number of
   branches simultaneously into a target. This gives us a way to
   implement some form of Repository cloning that people would like.
   (Being able to grab the stack of a loom, or trunk+main features,
   etc.)
b) I think we do like the ability to branch from a source, and preserve
   the type. However, many extensions probably explicitly work around
   this (bzr branch svn:// is not creating an svn branch, same for
   hg/git/etc)

On 12/6/2010 11:28 PM, Andrew Bennetts wrote:
> Andrew Bennetts has proposed merging lp:~spiv/bzr/fetch-all-tags-309682 into lp:bzr with lp:~spiv/bzr/fetch-integration as a prerequisite.
> 
> Requested reviews:
>   bzr-core (bzr-core)
> Related bugs:
>   #309682 tags are copied but their revisions may not be
>   https://bugs.launchpad.net/bugs/309682
> 
> 
> This branch fixes <https://bugs.launchpad.net/bzr/+bug/309682>: bzr branch should copy revisions referred to in tags, not just the tip.
> 
> The main difficulty is the ugliness of sprout and to a lesser extent fetch.  We don't want fetching the tags to regress our performance: i.e. no extra roundtrips for redundant locking/unlocking/locking again, no multiple calls to fetch for one sprout, no errors from bzr branch simply because a tag refers to a ghost, etc.  At the same time we don't want to make the code any more complex and harder to understand than it already is!
> 
> The prerequisite branch addresses much of that, so that this patch on its own is fairly small.  (The prerequisite branch is simply lp:~spiv/bzr/sprout-does-not-reopen-repo and lp:~spiv/bzr/fetch-spec-everything-not-in-other merged together.  Both of those are in the review queue.)
> 
> I've taken the approach of moving much of the complexity out of the sprout method, into more focussed methods and objects which are hopefully easy to understand isolation.
> 
> Part of the problem is that the interface of sprout is pretty confusing, as demonstrated by the one test failure I've currently got in this branch: bzrlib.tests.per_controldir.test_controldir.TestControlDir.test_sprout_bzrdir_repository_branch_only_source_under_shared
> 
> That test fails because it verifies that bzrdir.sprout(url) from a bzrdir with a branch in a shared repo copies the entire repository.  I think this behaviour is wrong, because it's likely to surprise callers.  A quick check on IRC suggests that lifeless didn't expect this behaviour either.  And quick skim of the bzrlib source suggests that at least 'bzr switch --create-branch' is encountering this behaviour without intending it.  So I intend to remove or adjust the test for the new behaviour, but I thought I should get some reviewer feedback first.
> 
> More generally, the interaction between the various combinations of passing or not passing revision_id and source_branch and various configurations of source branches and repositories is too complex.  It's not a friendly interface.  Unfortunately adding consideration of tags into the mix is probably only making it messier.  FWIW the behaviour now is:
> 
>  * If a source branch is passed or found, only its tip and tags are fetched
>    * if revision_id is passed, then it overrides the tip
>  * otherwise if a revision_id is passed, then that is fetched
>  * otherwise the whole repository is fetched.
> 
> Finally, while this addresses the case reported in bug 309682 (bzr branch), I think a complete fix requires that pull, merge, update, etc also fetch tags.  I have a follow-up branch for that.

+class FetchSpecFactory(object):
+    """A helper for building the best fetch spec for a sprout call.
+
+    Factors that go into determining the sort of fetch to perform:
+     * did the caller specify any revision IDs?
+     * did the caller specify a source branch (need to fetch the tip +
+       tags)
+     * is there an existing target repo (don't need to refetch revs it
+       already has)
+     * target is stacked?  (similar to pre-existing target repo: even if
+       the target itself is new don't want to refetch existing revs)
+
+    :ivar source_branch: the source branch if one specified, else None.
+    :ivar source_branch_stop_revision: fetch up to this revision of
+        source_branch, rather than its tip.
+    :ivar source_repo: the source repository if one found, else None.
+    :ivar target_repo: the target repository acquired by sprout.
+    :ivar target_repo_kind: one of the _TargetRepoKinds constants.
+    """

^- I really prefer calling them "source_branch_stop_revision_id". or
"source_tip_revision_id". I'd like the id to be in there, to make it
clear that this is a string rev_id, not a tuple key, not a Revision
object, etc.

However, shouldn't it be "source_tip_revision_ids" (a list of strings).
To allow more flexibility? We've often been hampered by having a single
revision_id to work with.

I see that you have "explicit_rev_ids = set()" which isn't documented.
Is there a reason to keep "source_branch_stop_revision" at all?

...

+            try:
+                tags_to_fetch.update(
+                    self.source_branch.tags.get_reverse_tag_dict())

I'm worried about the performance of this, especially in the case of
stuff like Emacs. IIRC the conversion from CVS brought something like
2000 tags.

While it probably isn't critical for "bzr branch", if "bzr pull"
suddenly has to download and query against 2000 tips, that could get
expensive.

(Note that if we had versioned tags, then you could find the diff of the
tag definition, and then only fetch those tags. As a fairly reasonable
way to get incremental fetch only thinking about incremental changes.)

+        fetch_spec_factory = FetchSpecFactory()
+        if revision_id is not None:
+            fetch_spec_factory.add_revision_ids([revision_id])
+            fetch_spec_factory.source_branch_stop_revision = revision_id

^- This structure is a bit confusing. I'd rather see:

fetch_spec_factory.set_tip_revision(revision_id)

And then have that add_revision_ids() && set the member.

Otherwise it is just a hard-to-get-right api. But see my earlier
discussion about wanting the ability to fetch multiple branches. So maybe:
  fetch_spec_factory.add_head() or add_tip_revision_id() or ...
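
A rough sketch of the single-setter shape suggested here, to show why it is harder to misuse. The class FetchSpecFactorySketch, the method set_tip_revision_id and the helper heads_to_fetch are hypothetical names for illustration; they are not the API in the patch.

```python
class FetchSpecFactorySketch:
    """Hypothetical stand-in for the factory; not the bzrlib class."""

    def __init__(self):
        self._explicit_rev_ids = set()
        self.source_branch_stop_revision_id = None

    def add_revision_ids(self, revision_ids):
        self._explicit_rev_ids.update(revision_ids)

    def set_tip_revision_id(self, revision_id):
        # One call both records the stop revision and ensures it is fetched,
        # so callers cannot do one without the other.
        self.add_revision_ids([revision_id])
        self.source_branch_stop_revision_id = revision_id

    def heads_to_fetch(self, branch_tip):
        heads = set(self._explicit_rev_ids)
        # The stop revision is already in the explicit set, so only the
        # "no stop revision" case needs to add the branch tip.
        if self.source_branch_stop_revision_id is None:
            heads.add(branch_tip)
        return heads
```

With this shape the caller writes one line, factory.set_tip_revision_id(revision_id), instead of two that must be kept in sync.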

+        heads_to_fetch = set(self.explicit_rev_ids)
+        tags_to_fetch = set()
+        if self.source_branch is not None:
+            try:
+                tags_to_fetch.update(
+                    self.source_branch.tags.get_reverse_tag_dict())
+            except errors.TagsNotSupported:
+                pass
+            if self.source_branch_stop_revision is not None:
+                heads_to_fetch.add(self.source_branch_stop_revision)
+            else:
+                heads_to_fetch.add(self.source_branch.last_revision())

^- Note that adding self.source_branch_stop_revision() is redundant
here, with how you called add_revision_ids() and set the pointer.

...

+            for path, file_id in subtrees:
+                target = urlutils.join(url, urlutils.escape(path))
+                sublocation = source_branch.reference_parent(file_id, path)
+                sublocation.bzrdir.sprout(target,
+                    basis.get_reference_revision(file_id, path),
+                    force_new_repo=force_new_repo, recurse=recurse,
+                    stacked=stacked)

^- This is also a hint for a case where we would probably like to do one
large fetch across all the subtrees. Though it is a bit muddled because
Aaron didn't want to enforce that subtrees must share a repository.

The tests seem ok, I wasn't very thorough.

The idea that sprout will use branch.last_revision() if none is supplied
is ok. Though having a way to explicitly request all revisions for a
repository, even if it has a branch, might also be useful...

review: needs_fixing

The big thing is to clarify some of the new fetch spec interface, the
rest is all stuff we can discuss to figure out what will be the nicest
way forward.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk0H9WQACgkQJdeBCYSNAANbiACcDHWN+22LzAXoXBo3SU2fImYl
kd0AoKdlM5xlqiLtbDjp6TwXoblnB+ab
=CpfO
-----END PGP SIGNATURE-----

review: Needs Fixing

Revision history for this message

Martin Pool (mbp) wrote on 2010-12-14:

On 15 December 2010 09:56, John A Meinel <email address hidden> wrote:
> Review: Needs Fixing
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>> I think much of the mess stems from the weirdness of sprouting a bzrdir rather than a repository or branch directly. I'm starting to think we should remove ControlDir.sprout.
>>
>
> I'm in favor of this. It will mess up our test suite a ton, because that
> has become the defacto method of creating a new branch. However using:
>
> tree.branch.sprout(new_branch) makes a lot more sense to me than
> tree.bzrdir.sprout(new_something)
>
> I think some of it stems from, do you want a WT in the target or not,
> etc, etc. I would be happier with something that you can compose. This
> is also cluttered because you can't just open the WT from a Branch, it
> will re-open the branch, etc. (Note that there was also bugs with
> branch.sprout() which didn't happen with bzrdir.sprout(), something with
> Trees and merge, and double-locking the WT, IIRC)

+1 to removing it.

Years ago, Robert and I had the idea it would be good to be able to
just clone any object as it currently exists. So for a bzrdir, that
would imply copying the whole repository (if any), plus branches, etc.
However I don't think that makes much sense give how things have now
evolved.

The term 'sprout' is intended to denote 'I'm making a new divergent
line of development' vs 'I'm copying exactly what's there'. At one
conceptual level there is a distinction. But we don't really expose
that in the UI; we don't use this name for it in the UI or user model;
and I'm not sure it makes sense to have as a separate method anyhow.

--
Martin

Revision history for this message

Andrew Bennetts (spiv) wrote on 2010-12-16:

Martin Pool wrote:
> >> I think much of the mess stems from the weirdness of sprouting a
> >> bzrdir rather than a repository or branch directly. I'm starting
> >> to think we should remove ControlDir.sprout.
[...]
> +1 to removing it.
>
> Years ago, Robert and I had the idea it would be good to be able to
> just clone any object as it currently exists. So for a bzrdir, that
> would imply copying the whole repository (if any), plus branches, etc.
> However I don't think that makes much sense give how things have now
> evolved.

I agree that it doesn't make much sense now. It already produces
something that is not an exact clone, in hard to predict ways. e.g. it
will create a branch even if the source doesn't have one, and bzr-svn
chooses to make sprout from a SVN checkout produce a bzr-format bzrdir
rather than try to preserve format.

> The term 'sprout' is intended to denote 'I'm making a new divergent
> line of development' vs 'I'm copying exactly what's there'. At one
> conceptual level there is a distinction. But we don't really expose
> that in the UI; we don't use this name for it in the UI or user model;
> and I'm not sure it makes sense to have as a separate method anyhow.

I agree with this also.

Revision history for this message

Andrew Bennetts (spiv) wrote on 2010-12-16:

John A Meinel wrote:
[...]
> I think some of it stems from, do you want a WT in the target or not,
> etc, etc. I would be happier with something that you can compose. This
> is also cluttered because you can't just open the WT from a Branch, it
> will re-open the branch, etc. (Note that there was also bugs with
> branch.sprout() which didn't happen with bzrdir.sprout(), something with
> Trees and merge, and double-locking the WT, IIRC)

Perhaps we need an object that explicitly composes together all the components
of a controldir? i.e. rather than getting a controldir that you can call
open_branch etc on, but then having to reopen the working tree to go from the
branch to the working tree, you grab all the objects at once.

Without thinking about it too hard, maybe something like:

   bag_of_things = open_bag(location) # strawman name
   branch = bag_of_things.get_branch()
   wt = bag_of_things.get_workingtree() # does not reopen branch; it's opened
                                        # once and remembered for the lifetime
                                        # of the bag.

[etc etc]

I'm not sure that's better, but I am sure the existing design isn't
great :)

The current API design makes it hard to avoid reopening the same object
over and over. Ideally the design would require you to go out of your way to
get two different in-memory objects representing the same on-disk
object, because that's usually not what you want.
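
To make the strawman slightly more concrete, here is a minimal sketch of a "bag" that opens each component at most once. Everything in it (BagOfThings, the openers mapping, the dict-shaped branch and working tree) is invented for illustration and does not correspond to any bzrlib API.

```python
class BagOfThings:
    """Opens each component at most once and remembers it."""

    def __init__(self, location, openers):
        self.location = location
        self._openers = openers  # name -> callable(location, bag)
        self._opened = {}

    def _get(self, name):
        if name not in self._opened:
            self._opened[name] = self._openers[name](self.location, self)
        return self._opened[name]

    def get_branch(self):
        return self._get("branch")

    def get_workingtree(self):
        return self._get("workingtree")


opens = []  # records every real "open" so repeats would be visible


def open_branch(location, bag):
    opens.append("branch")
    return {"kind": "branch", "location": location}


def open_workingtree(location, bag):
    opens.append("workingtree")
    # The tree reuses the bag's branch rather than opening a second copy.
    return {"kind": "workingtree", "branch": bag.get_branch()}


bag = BagOfThings("/tmp/example",
                  {"branch": open_branch, "workingtree": open_workingtree})
wt = bag.get_workingtree()
branch = bag.get_branch()
```

Here wt["branch"] and branch are the same in-memory object, which is the property the paragraph above asks for.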

> Things we want to keep in mind:
>
> a) In the ideal interface, we would have a way to copy any number of
> branches simultaneously into a target. This gives us a way to
> implement some form of Repository cloning that people would like.
> (Being able to grab the stack of a loom, or trunk+main features,
> etc.)

We at least have a workable way to do the fetch part of that operation,
but we certainly lack nice ways to find, open, copy or create sets of
branch objects.

[...]
> +class FetchSpecFactory(object):
[...]
> ^- I really prefer calling them "source_branch_stop_revision_id". or
> "source_tip_revision_id". I'd like the id to be in there, to make it
> clear that this is a string rev_id, not a tuple key, not a Revision
> object, etc.

Good point, I'll make that change.

> However, shouldn't it be "source_tip_revision_ids" (a list of strings).
> To allow more flexibility? We've often been hampered by having a single
> revision_id to work with.

Hmm. Probably, although I'm inclined to delay complicating this further
until we have concrete use cases to drive the design. YAGNI, in other
words :)

I wouldn't be surprised to find at the point someone refactors to add
that sort of feature that the result turns out to be a deeper change.

> I see that you have "explicit_rev_ids = set()" which isn't documented.

Hmm, it's a class-private variable really, I'll prefix it with an
underscore.

> Is there a reason to keep "source_branch_stop_revision" at all?

Yes.  It's used by
<https://code.launchpad.net/~spiv/bzr/fetch-tags-from-non-sprout-too/+merge/42911>,
to support operations like “bzr pull -r 99”.  That fetch should not pull
revno 100.  So source_branch_stop_revision is the variable that
overrides the default “fetch the branch tip” behaviour, which is
hopefully fairly clear when you look at how that variable is used by
make_fetch_spec:

            if self.source_branch_stop_revision is not None:
                heads_to_fetch.add(self.source_branch_stop_revision)
            else:
                heads_to_fetch.add(self.source_branch.last_revision())

> +            try:
> +                tags_to_fetch.update(
> +                    self.source_branch.tags.get_reverse_tag_dict())
> 
> I'm worried about the performance of this, especially in the case of
> stuff like Emacs. IIRC the conversion from CVS brought something like
> 2000 tags.
> 
> While it probably isn't critical for "bzr branch", if "bzr pull"
> suddenly has to download and query against 2000 tips, that could get
> expensive.

Hmm.  It would be good to find out what the performance impact is for
branches like Emacs.  I'll do some experiments.  If/when this patch
lands, the common case for pull will be that the 2000 tips are already
in the local repo, so it'll depend on how fast it is to make that query
on a local repo.

On the other hand, if you have 2000 tags, and a large fraction of them
don't work because the revisions are missing, then that's also pretty
bad :)

> (Note that if we had versioned tags, then you could find the diff of the
> tag definition, and then only fetch those tags. As a fairly reasonable
> way to get incremental fetch only thinking about incremental changes.)

I suppose another possibility for pull is to compare the source and
target's tags dict, and only consider tips from new or changed tags.

This would have the interesting result that "bzr pull" in a branch made
from 2.2 will leave broken tags broken — but also that the first pull
made with 2.3 (assuming this patch lands) will not suddenly do an
unexpectedly large fetch as it fills in the broken tags.  I'm not sure
if that's a net positive or not :)

If we did this we'd probably want something like fetch-ghosts that users could
use if they did want to fill in broken tags in an existing branch.  I guess
deleting all tags and then pulling again would be a potential clumsy
workaround.
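
The tags-dict comparison floated above could look roughly like this. It is a sketch over plain dicts of tag name -> revision id; the function name tag_tips_to_fetch is made up, and real code would have to get the dicts from the branches' tag stores.

```python
def tag_tips_to_fetch(source_tags, target_tags):
    """Return revision ids for tags that are new or changed in the source.

    Tags whose name and revision id already match in the target are
    skipped, even if the target repository lacks the revision itself.
    """
    return {
        rev_id
        for name, rev_id in source_tags.items()
        if target_tags.get(name) != rev_id
    }
```

As the comment notes, a tag that is unchanged but already broken in the target stays broken under this scheme, which is the trade-off discussed above.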

> +        fetch_spec_factory = FetchSpecFactory()
> +        if revision_id is not None:
> +            fetch_spec_factory.add_revision_ids([revision_id])
> +            fetch_spec_factory.source_branch_stop_revision = revision_id
> 
> ^- This structure is a bit confusing. I'd rather see:
> 
>    fetch_spec_factory.set_tip_revision(revision_id)
> 
> And then have that add_revision_ids() && set the member.
> 
> Otherwise it is just a hard-to-get-right api. But see my earlier
> discussion about wanting the ability to fetch multiple branches. So maybe:
>   fetch_spec_factory.add_head() or add_tip_revision_id() or ...

Ok I'll try that, and see how it looks in this branch and the follow-on
branch fetch-tags-from-non-sprout-too.

> +            for path, file_id in subtrees:
> +                target = urlutils.join(url, urlutils.escape(path))
> +                sublocation = source_branch.reference_parent(file_id, path)
> +                sublocation.bzrdir.sprout(target,
> +                    basis.get_reference_revision(file_id, path),
> +                    force_new_repo=force_new_repo, recurse=recurse,
> +                    stacked=stacked)
> 
> ^- This is also a hint for a case where we would probably like to do one
> large fetch across all the subtrees. Though it is a bit muddled because
> Aaron didn't want to enforce that subtrees must share a repository.

This hunk is unchanged in my patch apart from indentation level, so I'm
not too concerned with tweaking it :)

> The tests seem ok, I wasn't very thorough.
> 
> The idea that sprout will use branch.last_revision() if none is supplied
> is ok. Though having a way to explicitly request all revisions for a
> repository, even if it has a branch, might also be useful...

It *might* be useful.  Citation needed ;)

>  review: needs_fixing
> 
> The big thing is to clarify some of the new fetch spec interface, the
> rest is all stuff we can discuss to figure out what will be the nicest
> way forward.

Agreed.  If you have the energy, take a look at how
fetch-tags-from-non-sprout-too uses it too.

-Andrew.

Preview Diff

 === modified file 'bzrlib/controldir.py'
 --- bzrlib/controldir.py	2011-01-26 19:34:58 +0000
 +++ bzrlib/controldir.py	2011-02-07 04:18:21 +0000
@@ -29,6 +29,7 @@
  from bzrlib import (
      cleanup,
      errors,
++    fetch,
      graph,
      revision as _mod_revision,
      transport as _mod_transport,
@@ -378,53 +379,41 @@
                 accelerator_tree=None, hardlink=False, stacked=False,
                 source_branch=None, create_tree_if_local=True):
          add_cleanup = op.add_cleanup
++        fetch_spec_factory = fetch.FetchSpecFactory()
++        if revision_id is not None:
++            fetch_spec_factory.add_revision_ids([revision_id])
++            fetch_spec_factory.source_branch_stop_revision_id = revision_id
          target_transport = _mod_transport.get_transport(url,
              possible_transports)
          target_transport.ensure_base()
          cloning_format = self.cloning_metadir(stacked)
          # Create/update the result branch
          result = cloning_format.initialize_on_transport(target_transport)
++        source_branch, source_repository = self._find_source_repo(
++            add_cleanup, source_branch)
++        fetch_spec_factory.source_branch = source_branch
          # if a stacked branch wasn't requested, we don't create one
          # even if the origin was stacked
--        stacked_branch_url = None
--        if source_branch is not None:
--            add_cleanup(source_branch.lock_read().unlock)
--            if stacked:
--                stacked_branch_url = self.root_transport.base
--            source_repository = source_branch.repository
++        if stacked and source_branch is not None:
++            stacked_branch_url = self.root_transport.base
          else:
--            try:
--                source_branch = self.open_branch()
--                source_repository = source_branch.repository
--                if stacked:
--                    stacked_branch_url = self.root_transport.base
--            except errors.NotBranchError:
--                source_branch = None
--                try:
--                    source_repository = self.open_repository()
--                except errors.NoRepositoryPresent:
--                    source_repository = None
--                else:
--                    add_cleanup(source_repository.lock_read().unlock)
--            else:
--                add_cleanup(source_branch.lock_read().unlock)
++            stacked_branch_url = None
          repository_policy = result.determine_repository_policy(
              force_new_repo, stacked_branch_url, require_stacking=stacked)
          result_repo, is_new_repo = repository_policy.acquire_repository()
          add_cleanup(result_repo.lock_write().unlock)
--        is_stacked = stacked or (len(result_repo._fallback_repositories) != 0)
--        if is_new_repo and revision_id is not None and not is_stacked:
--            fetch_spec = graph.PendingAncestryResult(
--                [revision_id], source_repository)
++        fetch_spec_factory.source_repo = source_repository
++        fetch_spec_factory.target_repo = result_repo
++        if stacked or (len(result_repo._fallback_repositories) != 0):
++            target_repo_kind = fetch.TargetRepoKinds.STACKED
++        elif is_new_repo:
++            target_repo_kind = fetch.TargetRepoKinds.EMPTY
          else:
--            fetch_spec = None
++            target_repo_kind = fetch.TargetRepoKinds.PREEXISTING
++        fetch_spec_factory.target_repo_kind = target_repo_kind
          if source_repository is not None:
--            # Fetch while stacked to prevent unstacked fetch from
--            # Branch.sprout.
--            if fetch_spec is None:
--                result_repo.fetch(source_repository, revision_id=revision_id)
--            else:
--                result_repo.fetch(source_repository, fetch_spec=fetch_spec)
++            fetch_spec = fetch_spec_factory.make_fetch_spec()
++            result_repo.fetch(source_repository, fetch_spec=fetch_spec)
          if source_branch is None:
              # this is for sprouting a controldir without a branch; is that
@@ -455,34 +444,54 @@
          else:
              wt = None
          if recurse == 'down':
++            basis = None
              if wt is not None:
                  basis = wt.basis_tree()
--                basis.lock_read()
--                subtrees = basis.iter_references()
              elif result_branch is not None:
                  basis = result_branch.basis_tree()
--                basis.lock_read()
--                subtrees = basis.iter_references()
              elif source_branch is not None:
                  basis = source_branch.basis_tree()
--                basis.lock_read()
++            if basis is not None:
++                add_cleanup(basis.lock_read().unlock)
                  subtrees = basis.iter_references()
              else:
                  subtrees = []
--                basis = None
--            try:
--                for path, file_id in subtrees:
--                    target = urlutils.join(url, urlutils.escape(path))
--                    sublocation = source_branch.reference_parent(file_id, path)
--                    sublocation.bzrdir.sprout(target,
--                        basis.get_reference_revision(file_id, path),
--                        force_new_repo=force_new_repo, recurse=recurse,
--                        stacked=stacked)
--            finally:
--                if basis is not None:
--                    basis.unlock()
++            for path, file_id in subtrees:
++                target = urlutils.join(url, urlutils.escape(path))
++                sublocation = source_branch.reference_parent(file_id, path)
++                sublocation.bzrdir.sprout(target,
++                    basis.get_reference_revision(file_id, path),
++                    force_new_repo=force_new_repo, recurse=recurse,
++                    stacked=stacked)
          return result
++    def _find_source_repo(self, add_cleanup, source_branch):
++        """Find the source branch and repo for a sprout operation.
++
++        This is a helper intended for use by _sprout.
++
++        :returns: (source_branch, source_repository).  Either or both may be
++            None.  If not None, they will be read-locked (and their unlock(s)
++            scheduled via the add_cleanup param).
++        """
++        if source_branch is not None:
++            add_cleanup(source_branch.lock_read().unlock)
++            return source_branch, source_branch.repository
++        try:
++            source_branch = self.open_branch()
++            source_repository = source_branch.repository
++        except errors.NotBranchError:
++            source_branch = None
++            try:
++                source_repository = self.open_repository()
++            except errors.NoRepositoryPresent:
++                source_repository = None
++            else:
++                add_cleanup(source_repository.lock_read().unlock)
++        else:
++            add_cleanup(source_branch.lock_read().unlock)
++        return source_branch, source_repository
++
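Note on the cleanup pattern used above: the try/finally unlock is replaced by registering each unlock callback at the moment the lock is taken. The following is only a minimal, standalone sketch of that idea. ToyLock and run_with_cleanups are hypothetical stand-ins, not bzrlib classes, and running the cleanups in reverse order is a choice made for this sketch.

```python
class ToyLock(object):
    """Hypothetical stand-in for a lockable branch, repository or tree."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def lock_read(self):
        self.log.append('lock ' + self.name)
        # Returning an object that has unlock() allows the chained form
        # add_cleanup(obj.lock_read().unlock).
        return self

    def unlock(self):
        self.log.append('unlock ' + self.name)


def run_with_cleanups(func):
    """Call func(add_cleanup), then run the registered cleanups."""
    cleanups = []
    try:
        return func(cleanups.append)
    finally:
        # Assumption of this sketch: last registered, first run.
        for cleanup in reversed(cleanups):
            cleanup()


log = []


def toy_sprout(add_cleanup):
    branch = ToyLock('branch', log)
    basis = ToyLock('basis', log)
    add_cleanup(branch.lock_read().unlock)
    add_cleanup(basis.lock_read().unlock)
    log.append('work')
    return 'done'


result = run_with_cleanups(toy_sprout)
```

Because each unlock is scheduled as soon as its lock succeeds, an early return or an exception between the two lock calls still releases whatever was already locked.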
      def push_branch(self, source, revision_id=None, overwrite=False,
          remember=False, create_prefix=False):
          """Push the source branch into this ControlDir."""
 === modified file 'bzrlib/fetch.py'
 --- bzrlib/fetch.py	2011-01-14 17:18:23 +0000
 +++ bzrlib/fetch.py	2011-02-07 04:18:21 +0000
@@ -55,6 +55,8 @@
          :param last_revision: If set, try to limit to the data this revision
              references.
++        :param fetch_spec: A SearchResult specifying which revisions to fetch.
++            If set, this overrides last_revision.
          :param find_ghosts: If True search the entire history for ghosts.
          """
          # repository.fetch has the responsibility for short-circuiting
@@ -93,12 +95,12 @@
          pb.show_pct = pb.show_count = False
          try:
              pb.update("Finding revisions", 0, 2)
--            search = self._revids_to_fetch()
--            mutter('fetching: %s', search)
--            if search.is_empty():
++            search_result = self._revids_to_fetch()
++            mutter('fetching: %s', search_result)
++            if search_result.is_empty():
                  return
              pb.update("Fetching revisions", 1, 2)
--            self._fetch_everything_for_search(search)
++            self._fetch_everything_for_search(search_result)
          finally:
              pb.finished()
@@ -151,18 +153,9 @@
          install self._last_revision in self.to_repository.
          :returns: A SearchResult of some sort.  (Possibly a
--        PendingAncestryResult, EmptySearchResult, etc.)
++            PendingAncestryResult, EmptySearchResult, etc.)
          """
--        mutter("self._fetch_spec, self._last_revision: %r, %r",
--                self._fetch_spec, self._last_revision)
--        get_search_result = getattr(self._fetch_spec, 'get_search_result', None)
--        if get_search_result is not None:
--            mutter(
--                'resolving fetch_spec into search result: %s', self._fetch_spec)
--            # This is EverythingNotInOther or a similar kind of fetch_spec.
--            # Turn it into a search result.
--            return get_search_result()
--        elif self._fetch_spec is not None:
++        if self._fetch_spec is not None:
              # The fetch spec is already a concrete search result.
              return self._fetch_spec
          elif self._last_revision == NULL_REVISION:
@@ -172,11 +165,11 @@
          elif self._last_revision is not None:
              return graph.NotInOtherForRevs(self.to_repository,
                  self.from_repository, [self._last_revision],
--                find_ghosts=self.find_ghosts).get_search_result()
++                find_ghosts=self.find_ghosts).execute()
          else: # self._last_revision is None:
              return graph.EverythingNotInOther(self.to_repository,
                  self.from_repository,
--                find_ghosts=self.find_ghosts).get_search_result()
++                find_ghosts=self.find_ghosts).execute()
  class Inter1and2Helper(object):
@@ -339,3 +332,86 @@
              selected_ids.append(parent_id)
      parent_keys = [(root_id, parent_id) for parent_id in selected_ids]
      return parent_keys
++
++
++class TargetRepoKinds(object):
++    """An enum-like set of constants.
++
++    They are the possible values of FetchSpecFactory.target_repo_kind.
++    """
++
++    PREEXISTING = 'preexisting'
++    STACKED = 'stacked'
++    EMPTY = 'empty'
++
++
++class FetchSpecFactory(object):
++    """A helper for building the best fetch spec for a sprout call.
++
++    Factors that go into determining the sort of fetch to perform:
++     * did the caller specify any revision IDs?
++     * did the caller specify a source branch (need to fetch the tip + tags)
++     * is there an existing target repo (don't need to refetch revs it
++       already has)
++     * target is stacked?  (similar to pre-existing target repo: even if
++       the target itself is new don't want to refetch existing revs)
++
++    :ivar source_branch: the source branch if one specified, else None.
++    :ivar source_branch_stop_revision_id: fetch up to this revision of
++        source_branch, rather than its tip.
++    :ivar source_repo: the source repository if one found, else None.
++    :ivar target_repo: the target repository acquired by sprout.
++    :ivar target_repo_kind: one of the TargetRepoKinds constants.
++    """
++
++    def __init__(self):
++        self._explicit_rev_ids = set()
++        self.source_branch = None
++        self.source_branch_stop_revision_id = None
++        self.source_repo = None
++        self.target_repo = None
++        self.target_repo_kind = None
++
++    def add_revision_ids(self, revision_ids):
++        """Add revision_ids to the set of revision_ids to be fetched."""
++        self._explicit_rev_ids.update(revision_ids)
++
++    def make_fetch_spec(self):
++        """Build a SearchResult or PendingAncestryResult or etc."""
++        if self.target_repo_kind is None or self.source_repo is None:
++            raise AssertionError(
++                'Incomplete FetchSpecFactory: %r' % (self.__dict__,))
++        if len(self._explicit_rev_ids) == 0 and self.source_branch is None:
++            # Caller hasn't specified any revisions or source branch
++            if self.target_repo_kind == TargetRepoKinds.EMPTY:
++                return graph.EverythingResult(self.source_repo)
++            else:
++                # We want everything not already in the target (or target's
++                # fallbacks).
++                return graph.EverythingNotInOther(
++                    self.target_repo, self.source_repo).execute()
++        heads_to_fetch = set(self._explicit_rev_ids)
++        tags_to_fetch = set()
++        if self.source_branch is not None:
++            try:
++                tags_to_fetch.update(
++                    self.source_branch.tags.get_reverse_tag_dict())
++            except errors.TagsNotSupported:
++                pass
++            if self.source_branch_stop_revision_id is not None:
++                heads_to_fetch.add(self.source_branch_stop_revision_id)
++            else:
++                heads_to_fetch.add(self.source_branch.last_revision())
++        if self.target_repo_kind == TargetRepoKinds.EMPTY:
++            # PendingAncestryResult does not raise errors if a requested head
++            # is absent.  Ideally it would support the
++            # required_ids/if_present_ids distinction, but in practice
++            # heads_to_fetch will almost certainly be present so this doesn't
++            # matter much.
++            all_heads = heads_to_fetch.union(tags_to_fetch)
++            return graph.PendingAncestryResult(all_heads, self.source_repo)
++        return graph.NotInOtherForRevs(self.target_repo, self.source_repo,
++            required_ids=heads_to_fetch, if_present_ids=tags_to_fetch
++            ).execute()
++
++
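Note on make_fetch_spec: the branching above can be summarised as a small decision table. The sketch below is a standalone approximation that returns the name of the chosen search type instead of a real bzrlib object. The function choose_fetch_spec and the dict used to describe the source branch are hypothetical, introduced only for illustration.

```python
EMPTY, STACKED, PREEXISTING = 'empty', 'stacked', 'preexisting'


def choose_fetch_spec(target_kind, explicit_rev_ids=(), source_branch=None):
    """Return a tuple naming the search that make_fetch_spec would build.

    source_branch is None, or a dict with 'tip', optional 'stop' and
    optional 'tags' (an iterable of tagged revision ids).
    """
    if not explicit_rev_ids and source_branch is None:
        # Nothing specific was requested: copy the whole source repository,
        # minus whatever the target (or its fallbacks) already holds.
        if target_kind == EMPTY:
            return ('EverythingResult',)
        return ('EverythingNotInOther',)
    heads = set(explicit_rev_ids)
    tags = set()
    if source_branch is not None:
        tags.update(source_branch.get('tags', ()))
        stop = source_branch.get('stop')
        heads.add(stop if stop is not None else source_branch['tip'])
    if target_kind == EMPTY:
        # An empty target has nothing to subtract, so heads and tags are
        # simply walked together.
        return ('PendingAncestryResult', heads | tags)
    # Otherwise heads are required, tags are fetched only if present.
    return ('NotInOtherForRevs', heads, tags)


everything = choose_fetch_spec(EMPTY)
stacked = choose_fetch_spec(STACKED)
fresh_branch = choose_fetch_spec(
    EMPTY, source_branch={'tip': 'rev-1', 'tags': ['rev-2']})
into_existing = choose_fetch_spec(
    PREEXISTING, ['rev-c2'],
    {'tip': 'rev-a2', 'stop': 'rev-c2', 'tags': ['rev-b2', 'absent-rev']})
```

The split between required heads and if-present tags is what lets a tag that points at an absent revision pass without failing the fetch.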
 === modified file 'bzrlib/graph.py'
 --- bzrlib/graph.py	2011-01-14 17:18:23 +0000
 +++ bzrlib/graph.py	2011-02-07 04:18:21 +0000
@@ -1576,14 +1576,15 @@
  class AbstractSearch(object):
--    def get_search_result(self):
++    def execute(self):
          """Construct a network-ready search result from this search description.
          This may take some time to search repositories, etc.
--        :return: A search result.
++        :return: A search result (an object that implements
++            AbstractSearchResult's API).
          """
--        raise NotImplementedError(self.get_search_result)
++        raise NotImplementedError(self.execute)
  class SearchResult(AbstractSearchResult):
@@ -1705,7 +1706,8 @@
      def __repr__(self):
          if len(self.heads) > 5:
--            heads_repr = repr(list(self.heads)[:5] + ', ...]')
++            heads_repr = repr(list(self.heads)[:5])[:-1]
++            heads_repr += ', <%d more>...]' % (len(self.heads) - 5,)
          else:
              heads_repr = repr(self.heads)
          return '<%s heads:%s repo:%r>' % (
@@ -1813,7 +1815,7 @@
          self.from_repo = from_repo
          self.find_ghosts = find_ghosts
--    def get_search_result(self):
++    def execute(self):
          return self.to_repo.search_missing_revision_ids(
              self.from_repo, find_ghosts=self.find_ghosts)
@@ -1853,7 +1855,7 @@
              self.__class__.__name__, self.from_repo, self.to_repo,
              self.find_ghosts, reqd_revs_repr, ifp_revs_repr)
--    def get_search_result(self):
++    def execute(self):
          return self.to_repo.search_missing_revision_ids(
              self.from_repo, revision_ids=self.required_ids,
              if_present_ids=self.if_present_ids, find_ghosts=self.find_ghosts)
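Note on the __repr__ hunk in this file: the removed line concatenated a list with a string inside repr(), which raises TypeError whenever there are more than five heads. The standalone sketch below reproduces both versions as plain functions. The function names are hypothetical, and a list is used instead of a set so the ordering is predictable.

```python
def old_heads_repr(heads):
    # The removed code: list + str is not a valid operation.
    return repr(list(heads)[:5] + ', ...]')


def new_heads_repr(heads):
    if len(heads) > 5:
        # Drop the closing bracket of the truncated repr, then append a
        # count of the omitted heads.
        text = repr(list(heads)[:5])[:-1]
        text += ', <%d more>...]' % (len(heads) - 5,)
    else:
        text = repr(heads)
    return text


heads = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
try:
    old_heads_repr(heads)
    old_failed = False
except TypeError:
    old_failed = True

truncated = new_heads_repr(heads)
```

With the seven heads above, truncated is the string ['a', 'b', 'c', 'd', 'e', <2 more>...].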
 === modified file 'bzrlib/remote.py'
 --- bzrlib/remote.py	2011-01-14 17:18:23 +0000
 +++ bzrlib/remote.py	2011-02-07 04:18:21 +0000
@@ -1360,7 +1360,7 @@
          if symbol_versioning.deprecated_passed(revision_id):
              symbol_versioning.warn(
                  'search_missing_revision_ids(revision_id=...) was '
--                'deprecated in 2.3.  Use revision_ids=[...] instead.',
++                'deprecated in 2.4.  Use revision_ids=[...] instead.',
                  DeprecationWarning, stacklevel=2)
              if revision_ids is not None:
                  raise AssertionError(
@@ -1991,11 +1991,11 @@
                  if isinstance(search, graph.EverythingResult):
                      error_verb = e.error_from_smart_server.error_verb
                      if error_verb == 'BadSearch':
--                        # Pre-2.3 servers don't support this sort of search.
++                        # Pre-2.4 servers don't support this sort of search.
                          # XXX: perhaps falling back to VFS on BadSearch is a
                          # good idea in general?  It might provide a little bit
                          # of protection against client-side bugs.
--                        medium._remember_remote_is_before((2, 3))
++                        medium._remember_remote_is_before((2, 4))
                          break
                  raise
              else:
 === modified file 'bzrlib/repofmt/knitrepo.py'
 --- bzrlib/repofmt/knitrepo.py	2010-12-21 06:10:11 +0000
 +++ bzrlib/repofmt/knitrepo.py	2011-02-07 04:18:21 +0000
@@ -542,7 +542,7 @@
          if symbol_versioning.deprecated_passed(revision_id):
              symbol_versioning.warn(
                  'search_missing_revision_ids(revision_id=...) was '
--                'deprecated in 2.3.  Use revision_ids=[...] instead.',
++                'deprecated in 2.4.  Use revision_ids=[...] instead.',
                  DeprecationWarning, stacklevel=2)
              if revision_ids is not None:
                  raise AssertionError(
 === modified file 'bzrlib/repofmt/weaverepo.py'
 --- bzrlib/repofmt/weaverepo.py	2011-01-14 17:18:23 +0000
 +++ bzrlib/repofmt/weaverepo.py	2011-02-07 04:18:21 +0000
@@ -826,7 +826,7 @@
          if symbol_versioning.deprecated_passed(revision_id):
              symbol_versioning.warn(
                  'search_missing_revision_ids(revision_id=...) was '
--                'deprecated in 2.3.  Use revision_ids=[...] instead.',
++                'deprecated in 2.4.  Use revision_ids=[...] instead.',
                  DeprecationWarning, stacklevel=2)
              if revision_ids is not None:
                  raise AssertionError(
 === modified file 'bzrlib/repository.py'
 --- bzrlib/repository.py	2011-01-19 23:00:12 +0000
 +++ bzrlib/repository.py	2011-02-07 04:18:21 +0000
@@ -1608,7 +1608,7 @@
          if symbol_versioning.deprecated_passed(revision_id):
              symbol_versioning.warn(
                  'search_missing_revision_ids(revision_id=...) was '
--                'deprecated in 2.3.  Use revision_ids=[...] instead.',
++                'deprecated in 2.4.  Use revision_ids=[...] instead.',
                  DeprecationWarning, stacklevel=3)
              if revision_ids is not None:
                  raise AssertionError(
@@ -1785,9 +1785,6 @@
                  not _mod_revision.is_null(revision_id)):
                  self.get_revision(revision_id)
              return 0, []
--        # if there is no specific appropriate InterRepository, this will get
--        # the InterRepository base class, which raises an
--        # IncompatibleRepositories when asked to fetch.
          inter = InterRepository.get(source, self)
          return inter.fetch(revision_id=revision_id, pb=pb,
              find_ghosts=find_ghosts, fetch_spec=fetch_spec)
@@ -3482,8 +3479,12 @@
              # them, make sure that they are present in the target.
              # We don't care about other ghosts as we can't fetch them and
              # haven't been asked to.
++            mutter('reqd: %r  if-present: %r  ->  ghosts: %r', revision_ids,
++                if_present_ids, ghosts)
              ghosts_to_check = set(revision_ids.intersection(ghosts))
              revs_to_get = set(next_revs).union(ghosts_to_check)
++            mutter('ghosts_to_check: %r  revs_to_get: %r  searcher_exhausted: %r',
++                ghosts_to_check, revs_to_get, searcher_exhausted)
              if revs_to_get:
                  have_revs = set(target_graph.get_parent_map(revs_to_get))
                  # we always have NULL_REVISION present.
@@ -3526,7 +3527,7 @@
          if symbol_versioning.deprecated_passed(revision_id):
              symbol_versioning.warn(
                  'search_missing_revision_ids(revision_id=...) was '
--                'deprecated in 2.3.  Use revision_ids=[...] instead.',
++                'deprecated in 2.4.  Use revision_ids=[...] instead.',
                  DeprecationWarning, stacklevel=2)
              if revision_ids is not None:
                  raise AssertionError(
@@ -3901,10 +3902,9 @@
              fetch_spec=None):
          """See InterRepository.fetch()."""
          if fetch_spec is not None:
--            if (isinstance(fetch_spec, graph.NotInOtherForRevs) and
--                    len(fetch_spec.required_ids) == 1 and not
--                    fetch_spec.if_present_ids):
--                revision_id = list(fetch_spec.required_ids)[0]
++            if (isinstance(fetch_spec, graph.SearchResult) and
++                    len(fetch_spec.get_keys()) == 1):
++                revision_id = list(fetch_spec.get_keys())[0]
                  del fetch_spec
              else:
                  raise AssertionError("Not implemented yet...")
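Note on the deprecation hunks: several files in this diff warn when the old revision_id keyword is passed and reject passing it together with revision_ids. The sketch below shows the general shape of that pattern using only the standard library. The sentinel _NOT_PASSED and the function search_missing are hypothetical stand-ins, not the bzrlib symbol_versioning API, and the fallback of wrapping the single id in a list is an assumption of this sketch.

```python
import warnings

# Hypothetical sentinel: distinguishes "argument omitted" from "passed None".
_NOT_PASSED = object()


def search_missing(revision_id=_NOT_PASSED, revision_ids=None):
    if revision_id is not _NOT_PASSED:
        warnings.warn(
            'search_missing(revision_id=...) is deprecated.  '
            'Use revision_ids=[...] instead.',
            DeprecationWarning, stacklevel=2)
        if revision_ids is not None:
            raise AssertionError(
                'revision_ids is mutually exclusive with revision_id')
        if revision_id is not None:
            revision_ids = [revision_id]
    return revision_ids


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    old_style = search_missing(revision_id='rev-1')
    new_style = search_missing(revision_ids=['rev-1'])
```

Only the old-style call records a warning, and both calls end up with the same list of revision ids.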
 === modified file 'bzrlib/tests/blackbox/test_branch.py'
 --- bzrlib/tests/blackbox/test_branch.py	2011-01-10 22:20:12 +0000
 +++ bzrlib/tests/blackbox/test_branch.py	2011-02-07 04:18:21 +0000
@@ -23,13 +23,11 @@
      branch,
      bzrdir,
      errors,
--    repository,
      revision as _mod_revision,
+     )
  from bzrlib.repofmt.knitrepo import RepositoryFormatKnit1
  from bzrlib.tests import TestCaseWithTransport
  from bzrlib.tests import (
--    KnownFailure,
      HardlinkFeature,
      test_server,
+     )
@@ -270,6 +268,20 @@
          self.run_bzr('checkout --lightweight a b')
          self.assertLength(2, calls)
++    def test_branch_fetches_all_tags(self):
++        builder = self.make_branch_builder('source')
++        builder.build_commit(message="Rev 1", rev_id='rev-1')
++        builder.build_commit(message="Rev 2", rev_id='rev-2')
++        source = builder.get_branch()
++        source.tags.set_tag('tag-a', 'rev-2')
++        source.set_last_revision_info(1, 'rev-1')
++        # Now source has a tag not in its ancestry.  Make a branch from it.
++        self.run_bzr('branch source new-branch')
++        new_branch = branch.Branch.open('new-branch')
++        # The tag is present, and so is its revision.
++        self.assertEqual('rev-2', new_branch.tags.lookup_tag('tag-a'))
++        new_branch.repository.get_revision('rev-2')
++
  class TestBranchStacked(TestCaseWithTransport):
      """Tests for branch --stacked"""
@@ -451,6 +463,25 @@
          # upwards without agreement from bzr's network support maintainers.
          self.assertLength(14, self.hpss_calls)
++    def test_branch_from_branch_with_tags(self):
++        self.setup_smart_server_with_call_log()
++        builder = self.make_branch_builder('source')
++        builder.build_commit(message="Rev 1", rev_id='rev-1')
++        builder.build_commit(message="Rev 2", rev_id='rev-2')
++        source = builder.get_branch()
++        source.tags.set_tag('tag-a', 'rev-2')
++        source.tags.set_tag('tag-missing', 'missing-rev')
++        source.set_last_revision_info(1, 'rev-1')
++        # Now source has a tag not in its ancestry.  Make a branch from it.
++        self.reset_smart_call_log()
++        out, err = self.run_bzr(['branch', self.get_url('source'), 'target'])
++        # This figure represents the amount of work to perform this use case. It
++        # is entirely ok to reduce this number if a test fails due to rpc_count
++        # being too low. If rpc_count increases, more network roundtrips have
++        # become necessary for this use case. Please do not adjust this number
++        # upwards without agreement from bzr's network support maintainers.
++        self.assertLength(9, self.hpss_calls)
++
  class TestRemoteBranch(TestCaseWithSFTPServer):
 === modified file 'bzrlib/tests/per_branch/test_sprout.py'
 --- bzrlib/tests/per_branch/test_sprout.py	2011-01-13 05:36:39 +0000
 +++ bzrlib/tests/per_branch/test_sprout.py	2011-02-07 04:18:21 +0000
@@ -111,6 +111,25 @@
          self.assertEqual((2, 'rev2-alt'), branch2.last_revision_info())
          self.assertEqual(['rev1', 'rev2-alt'], branch2.revision_history())
++    def test_sprout_preserves_tags(self):
++        """Sprout preserves tags, even tags of absent revisions."""
++        try:
++            builder = self.make_branch_builder('source')
++        except errors.UninitializableFormat:
++            raise tests.TestSkipped('Uninitializable branch format')
++        builder.build_commit(message="Rev 1", rev_id='rev-1')
++        source = builder.get_branch()
++        try:
++            source.tags.set_tag('tag-a', 'missing-rev')
++        except errors.TagsNotSupported:
++            raise tests.TestNotApplicable(
++                'Branch format does not support tags.')
++        # Now source has a tag pointing to an absent revision.  Sprout it.
++        target_bzrdir = self.make_repository('target').bzrdir
++        new_branch = source.sprout(target_bzrdir)
++        # The tag is present in the target
++        self.assertEqual('missing-rev', new_branch.tags.lookup_tag('tag-a'))
++
      def test_sprout_from_any_repo_revision(self):
          """We should be able to sprout from any revision."""
          wt = self.make_branch_and_tree('source')
 === modified file 'bzrlib/tests/per_controldir/test_controldir.py'
 --- bzrlib/tests/per_controldir/test_controldir.py	2011-01-27 14:27:18 +0000
 +++ bzrlib/tests/per_controldir/test_controldir.py	2011-02-07 04:18:21 +0000
@@ -503,7 +503,10 @@
          self.assertNotEqual(dir.transport.base, target.transport.base)
          self.assertNotEqual(dir.transport.base, shared_repo.bzrdir.transport.base)
          branch = target.open_branch()
--        self.assertTrue(branch.repository.has_revision('1'))
++        # The sprouted bzrdir has a branch, so only revisions referenced by
++        # that branch are copied, rather than the whole repository.  It's an
++        # empty branch, so none are copied.
++        self.assertEqual([], branch.repository.all_revision_ids())
          if branch.bzrdir._format.supports_workingtrees:
              self.assertTrue(branch.repository.make_working_trees())
          self.assertFalse(branch.repository.is_shared())
@@ -669,6 +672,103 @@
          target = dir.sprout(self.get_url('target'), revision_id='1')
          self.assertEqual('1', target.open_branch().last_revision())
++    def test_sprout_bzrdir_branch_with_tags(self):
++        # when sprouting a branch all revisions named in the tags are copied
++        # too.
++        builder = self.make_branch_builder('source')
++        builder.build_commit(message="Rev 1", rev_id='rev-1')
++        builder.build_commit(message="Rev 2", rev_id='rev-2')
++        source = builder.get_branch()
++        try:
++            source.tags.set_tag('tag-a', 'rev-2')
++        except errors.TagsNotSupported:
++            raise TestNotApplicable('Branch format does not support tags.')
++        source.set_last_revision_info(1, 'rev-1')
++        # Now source has a tag not in its ancestry.  Sprout its controldir.
++        dir = source.bzrdir
++        target = dir.sprout(self.get_url('target'))
++        # The tag is present, and so is its revision.
++        new_branch = target.open_branch()
++        self.assertEqual('rev-2', new_branch.tags.lookup_tag('tag-a'))
++        new_branch.repository.get_revision('rev-2')
++
++    def test_sprout_bzrdir_branch_with_absent_tag(self):
++        # tags referencing absent revisions are copied (and those absent
++        # revisions do not prevent the sprout.)
++        builder = self.make_branch_builder('source')
++        builder.build_commit(message="Rev 1", rev_id='rev-1')
++        source = builder.get_branch()
++        try:
++            source.tags.set_tag('tag-a', 'missing-rev')
++        except errors.TagsNotSupported:
++            raise TestNotApplicable('Branch format does not support tags.')
++        # Now source has a tag pointing to an absent revision.  Sprout its
++        # controldir.
++        dir = source.bzrdir
++        target = dir.sprout(self.get_url('target'))
++        # The tag is present in the target
++        new_branch = target.open_branch()
++        self.assertEqual('missing-rev', new_branch.tags.lookup_tag('tag-a'))
++
++    def test_sprout_bzrdir_passing_source_branch_with_absent_tag(self):
++        # tags referencing absent revisions are copied (and those absent
++        # revisions do not prevent the sprout.)
++        builder = self.make_branch_builder('source')
++        builder.build_commit(message="Rev 1", rev_id='rev-1')
++        source = builder.get_branch()
++        try:
++            source.tags.set_tag('tag-a', 'missing-rev')
++        except errors.TagsNotSupported:
++            raise TestNotApplicable('Branch format does not support tags.')
++        # Now source has a tag pointing to an absent revision.  Sprout its
++        # controldir.
++        dir = source.bzrdir
++        target = dir.sprout(self.get_url('target'), source_branch=source)
++        # The tag is present in the target
++        new_branch = target.open_branch()
++        self.assertEqual('missing-rev', new_branch.tags.lookup_tag('tag-a'))
++
++    def test_sprout_bzrdir_passing_rev_not_source_branch_copies_tags(self):
++        # dir.sprout(..., revision_id='rev1') copies rev1, and all the tags of
++        # the branch at that bzrdir, the ancestry of all of those, but no other
++        # revs (not even the tip of the source branch).
++        builder = self.make_branch_builder('source')
++        builder.build_commit(message="Base", rev_id='base-rev')
++        # Make three parallel lines of ancestry off this base.
++        source = builder.get_branch()
++        builder.build_commit(message="Rev A1", rev_id='rev-a1')
++        builder.build_commit(message="Rev A2", rev_id='rev-a2')
++        builder.build_commit(message="Rev A3", rev_id='rev-a3')
++        source.set_last_revision_info(1, 'base-rev')
++        builder.build_commit(message="Rev B1", rev_id='rev-b1')
++        builder.build_commit(message="Rev B2", rev_id='rev-b2')
++        builder.build_commit(message="Rev B3", rev_id='rev-b3')
++        source.set_last_revision_info(1, 'base-rev')
++        builder.build_commit(message="Rev C1", rev_id='rev-c1')
++        builder.build_commit(message="Rev C2", rev_id='rev-c2')
++        builder.build_commit(message="Rev C3", rev_id='rev-c3')
++        # Set the branch tip to A2
++        source.set_last_revision_info(3, 'rev-a2')
++        try:
++            # Create a tag for B2, and for an absent rev
++            source.tags.set_tag('tag-non-ancestry', 'rev-b2')
++            source.tags.set_tag('tag-absent', 'absent-rev')
++        except errors.TagsNotSupported:
++            raise TestNotApplicable('Branch format does not support tags.')
++        # And ask sprout for C2
++        dir = source.bzrdir
++        target = dir.sprout(self.get_url('target'), revision_id='rev-c2')
++        # The tags are present
++        new_branch = target.open_branch()
++        self.assertEqual(
++            {'tag-absent': 'absent-rev', 'tag-non-ancestry': 'rev-b2'},
++            new_branch.tags.get_tag_dict())
++        # And the revs for B2 and C2's ancestries are present, but no others
++        # (not even A2, the tip of the source branch).
++        self.assertEqual(
++            ['base-rev', 'rev-b1', 'rev-b2', 'rev-c1', 'rev-c2'],
++            sorted(new_branch.repository.all_revision_ids()))
++
      def test_sprout_bzrdir_tree_branch_reference(self):
          # sprouting should create a repository if needed and a sprouted branch.
          # the tree state should not be copied.
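Note on test_sprout_bzrdir_passing_rev_not_source_branch_copies_tags: the expected revision list can be checked by hand with a toy parent map. The sketch below is standalone and does not use bzrlib. The ancestry helper is hypothetical and silently skips heads that are absent from the map, mirroring the way the absent tag is expected not to block the sprout.

```python
def ancestry(parent_map, heads):
    """Return every revision reachable from heads; absent heads are skipped."""
    seen = set()
    pending = [head for head in heads if head in parent_map]
    while pending:
        rev = pending.pop()
        if rev in seen:
            continue
        seen.add(rev)
        pending.extend(p for p in parent_map[rev] if p in parent_map)
    return seen


# Three parallel lines of history off one base, as built in the test.
parent_map = {
    'base-rev': [],
    'rev-a1': ['base-rev'], 'rev-a2': ['rev-a1'], 'rev-a3': ['rev-a2'],
    'rev-b1': ['base-rev'], 'rev-b2': ['rev-b1'], 'rev-b3': ['rev-b2'],
    'rev-c1': ['base-rev'], 'rev-c2': ['rev-c1'], 'rev-c3': ['rev-c2'],
}

# The requested revision plus the two tagged revisions (one of them absent).
fetched = sorted(ancestry(parent_map, ['rev-c2', 'rev-b2', 'absent-rev']))
```

The result is base-rev, rev-b1, rev-b2, rev-c1 and rev-c2, which matches the list asserted in the test: the A line is not copied even though rev-a2 is the source branch tip.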
 === modified file 'bzrlib/tests/per_interrepository/test_interrepository.py'
 --- bzrlib/tests/per_interrepository/test_interrepository.py	2011-01-14 17:18:23 +0000
 +++ bzrlib/tests/per_interrepository/test_interrepository.py	2011-02-07 04:18:21 +0000
@@ -138,7 +138,7 @@
              find_ghosts=False)
          self.callDeprecated(
              ['search_missing_revision_ids(revision_id=...) was deprecated in '
--             '2.3.  Use revision_ids=[...] instead.'],
++             '2.4.  Use revision_ids=[...] instead.'],
              self.assertRaises, errors.NoSuchRevision,
              repo_b.search_missing_revision_ids, repo_a, revision_id='pizza',
              find_ghosts=False)
 === modified file 'bzrlib/tests/test_bzrdir.py'
 --- bzrlib/tests/test_bzrdir.py	2011-01-27 14:27:18 +0000
 +++ bzrlib/tests/test_bzrdir.py	2011-02-07 04:18:21 +0000
@@ -31,6 +31,7 @@
      help_topics,
      lock,
      repository,
++    revision as _mod_revision,
      osutils,
      remote,
      symbol_versioning,
@@ -1341,6 +1342,9 @@
      def copy_content_into(self, destination, revision_id=None):
          self.calls.append('copy_content_into')
++    def last_revision(self):
++        return _mod_revision.NULL_REVISION
++
      def get_parent(self):
          return self._parent
 === modified file 'bzrlib/tests/test_remote.py'
 --- bzrlib/tests/test_remote.py	2011-01-14 17:18:23 +0000
 +++ bzrlib/tests/test_remote.py	2011-02-07 04:18:21 +0000
@@ -3211,15 +3211,15 @@
                  override_existing=True)
      def test_fetch_everything_backwards_compat(self):
--        """Can fetch with EverythingResult even with pre 2.3 servers.
++        """Can fetch with EverythingResult even with pre 2.4 servers.
--        Pre-2.3 do not support 'everything' searches with the
++        Pre-2.4 servers do not support 'everything' searches with the
          Repository.get_stream_1.19 verb.
          """
          verb_log = []
          class OldGetStreamVerb(SmartServerRepositoryGetStream_1_19):
              """A version of the Repository.get_stream_1.19 verb patched to
--            reject 'everything' searches the way 2.2 and earlier do.
++            reject 'everything' searches the way 2.3 and earlier do.
              """
              def recreate_search(self, repository, search_bytes, discard_excess=False):
                  verb_log.append(search_bytes.split('\n', 1)[0])
 === modified file 'doc/en/release-notes/bzr-2.3.txt'
 --- doc/en/release-notes/bzr-2.3.txt	2011-02-03 05:13:46 +0000
 +++ doc/en/release-notes/bzr-2.3.txt	2011-02-07 04:18:21 +0000
@@ -275,11 +275,6 @@
    crashes when encountering private bugs (they are just displayed as such).
    (Vincent Ladeuil, #354985)
--* The ``revision_id`` parameter of
--  ``Repository.search_missing_revision_ids`` and
--  ``InterRepository.search_missing_revision_ids`` is deprecated.  It is
--  replaced by the ``revision_ids`` parameter.  (Andrew Bennetts)
--
  Internals
  *********
 === modified file 'doc/en/release-notes/bzr-2.4.txt'
 --- doc/en/release-notes/bzr-2.4.txt	2011-02-03 05:46:08 +0000
 +++ doc/en/release-notes/bzr-2.4.txt	2011-02-07 04:18:21 +0000
@@ -94,6 +94,11 @@
  * Added ``bzrlib.mergetools`` module with helper functions for working with
    the list of external merge tools. (Gordon Tyler, #489915)
++* The ``revision_id`` parameter of
++  ``Repository.search_missing_revision_ids`` and
++  ``InterRepository.search_missing_revision_ids`` is deprecated.  It is
++  replaced by the ``revision_ids`` parameter.  (Andrew Bennetts)
++
  Internals
  *********