Merge lp:~jelmer/bzr/hpss-get-inventories into lp:bzr

Proposed by Jelmer Vernooij on 2011-11-29

Status:

Superseded

Proposed branch:

lp:~jelmer/bzr/hpss-get-inventories

Merge into:

lp:bzr

Prerequisite:

lp:~jelmer/bzr/hpss-_get-checkout-format

Diff against target:

969 lines (+466/-182)

17 files modified

bzrlib/remote.py (+207/-50)
bzrlib/smart/branch.py (+0/-19)
bzrlib/smart/bzrdir.py (+19/-0)
bzrlib/smart/repository.py (+58/-1)
bzrlib/smart/request.py (+6/-3)
bzrlib/tests/blackbox/test_annotate.py (+1/-1)
bzrlib/tests/blackbox/test_branch.py (+1/-1)
bzrlib/tests/blackbox/test_cat.py (+2/-3)
bzrlib/tests/blackbox/test_checkout.py (+3/-8)
bzrlib/tests/blackbox/test_export.py (+2/-3)
bzrlib/tests/blackbox/test_log.py (+4/-6)
bzrlib/tests/blackbox/test_ls.py (+2/-3)
bzrlib/tests/blackbox/test_sign_my_commits.py (+3/-10)
bzrlib/tests/per_interbranch/test_push.py (+2/-2)
bzrlib/tests/test_remote.py (+78/-41)
bzrlib/tests/test_smart.py (+68/-31)
doc/en/release-notes/bzr-2.5.txt (+10/-0)

To merge this branch:

bzr merge lp:~jelmer/bzr/hpss-get-inventories

Reviewer	Review Type	Date Requested	Status
Vincent Ladeuil		2011-11-29	Approve on 2011-12-02
Review via email: mp+83837@code.launchpad.net

This proposal has been superseded by a proposal from 2011-12-11.

Commit message

Add HPSS call for ``Repository.iter_inventories``.

Description of the change

Add a HPSS call for ``Repository.iter_inventories``.

This massively reduces the number of roundtrips for various commands, and causes a fair number to no longer use VFS calls at all when used against a modern remote server.

In the process I removed the HistoryMissing exception, which was raised in the private Repository._get_inventory_xml method and never caught anywhere except to be converted to a NoSuchRevision exception. The private Repository._iter_inventories no longer raises NoSuchRevision directly; instead it indicates when it can't find an inventory, so that some inventories can be fetched from one repository and others from another (such as a fallback repository).
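The "report missing instead of raising" behaviour described above can be sketched as follows. This is a minimal illustration, not the code in this branch: `iter_with_fallbacks` and `dict_source` are hypothetical names, and repositories are modelled as plain dicts.

```python
def iter_with_fallbacks(sources, revision_ids):
    """Yield (inventory, revid) pairs, trying each source in turn.

    Each source is a callable that takes a list of revision ids and yields
    (inventory_or_None, revid) pairs; None means "not found in this source".
    """
    missing = list(revision_ids)
    for source in sources:
        if not missing:
            break
        still_missing = []
        for inv, revid in source(missing):
            if inv is None:
                still_missing.append(revid)
            else:
                yield inv, revid
        missing = still_missing
    # Whatever no source could supply is reported to the caller, not raised.
    for revid in missing:
        yield None, revid


def dict_source(store):
    """Hypothetical stand-in for a repository, backed by a dict."""
    def lookup(revids):
        for revid in revids:
            yield store.get(revid), revid
    return lookup


main = dict_source({"rev-1": "inv-1"})
fallback = dict_source({"rev-2": "inv-2"})
results = list(
    iter_with_fallbacks([main, fallback], ["rev-1", "rev-2", "rev-3"]))
```

A public wrapper can then turn any `(None, revid)` pair into a NoSuchRevision error, which keeps the decision about what counts as an error in one place.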

Repository.get_deltas_for_revisions() now has its own implementation for remote repositories, which means the client won't use the VFS to get at the inventories to generate the deltas from.
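As a rough sketch of the idea, assuming trees can be fetched in one batch: collect each revision plus its first parent, fetch those trees once, then compare each revision with its parent. The `Revision` tuple, `fetch_trees` and `diff` below are illustrative stand-ins rather than bzrlib APIs, and trees are modelled as sets of file names.

```python
from collections import namedtuple

Revision = namedtuple("Revision", "revision_id parent_ids")


def required_tree_ids(revisions):
    """Return the ids of every tree needed: each revision and its first parent."""
    needed = set()
    for rev in revisions:
        needed.add(rev.revision_id)
        needed.update(rev.parent_ids[:1])
    return needed


def deltas_for_revisions(revisions, fetch_trees, diff):
    """Yield one delta per revision, fetching all trees in a single batch."""
    trees = fetch_trees(required_tree_ids(revisions))
    for rev in revisions:
        # A revision without parents is compared against the empty tree.
        old = trees[rev.parent_ids[0]] if rev.parent_ids else set()
        yield diff(old, trees[rev.revision_id])


store = {"r1": {"a"}, "r2": {"a", "b"}, "r3": {"b"}}
revs = [
    Revision("r1", []),
    Revision("r2", ["r1"]),
    Revision("r3", ["r2", "rX"]),  # "rX" is a merge parent and is ignored
]
result = list(deltas_for_revisions(
    revs,
    fetch_trees=lambda ids: {i: store[i] for i in ids},
    diff=lambda old, new: (sorted(new - old), sorted(old - new)),
))
```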

Jelmer Vernooij (jelmer) wrote on 2011-11-29:

Timings when checking out bzr.dev:

~/src/bzr/hpss-get-inventories/bzr co --lightweight bzr://people.samba.org/bzr.dev 2.25s user 0.46s system 15% cpu 17.277 total
~/src/bzr/hpss-get-inventories/bzr export /tmp/bzr.dev2 bzr://people.samba.org/bzr.dev 1.54s user 0.40s system 12% cpu 15.435 total

versus against an older bzr server:

~/src/bzr/hpss-get-inventories/bzr co --lightweight bzr://people.samba.org/bzr.dev 4.91s user 1.03s system 10% cpu 58.807 total
~/src/bzr/hpss-get-inventories/bzr export /tmp/bzr.dev5 bzr://people.samba.org/bzr.dev 4.35s user 0.98s system 9% cpu 58.357 total
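For a quick comparison, the wall-clock totals above work out to roughly a 3.4x speedup for the lightweight checkout and 3.8x for the export. The snippet below only redoes that arithmetic on the figures quoted in this comment.

```python
# Wall-clock totals in seconds, copied from the timings above.
new_server = {"co --lightweight": 17.277, "export": 15.435}
old_server = {"co --lightweight": 58.807, "export": 58.357}

speedups = {cmd: old_server[cmd] / new_server[cmd] for cmd in new_server}
for cmd, factor in speedups.items():
    print("%-18s %.2fx faster" % (cmd, factor))
```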

Jelmer Vernooij (jelmer) wrote on 2011-11-29:

For comparison (though I should note it's a different server):

apt-get source bzr 2.79s user 0.54s system 27% cpu 12.338 total

so we're getting closer.

Andrew Bennetts (spiv) wrote on 2011-11-29:

Jelmer Vernooij wrote:
> Add a HPSS call for ``Repository.iter_inventories``.

I am a bit concerned that by adding verbs like this, with their own ad hoc
record stream-like wire format, that we're growing the maintenance burden
unnecessarily, and not reusing improvements everywhere we could.

I'm glad this is using inventory-deltas. But if it's worth zlib compressing
them here, shouldn't we do that for inventory-deltas from get_record_stream too?

Also, could you implement this via the get_record_stream RPC with a new
parameter that means “just inventories (rather than rev+inv+texts+chks+sigs for
given keys)”?

On one hand it might feel a bit ugly to make one RPC, get_record_stream,
do so many things, but on the other I think it provides a
useful pressure to keep the interface minimal and consistent (e.g. whether
records are zlib-compressed).

Adding this iter_inventories RPC might be the right thing (although putting
“iter” in the name of an RPC feels weird to me, that's surely a property of the
Python API rather than a property of the RPC), but I'd like to hear what you
think about the tradeoffs of doing that vs. extending get_record_stream.

-Andrew.

Martin Pool (mbp) wrote on 2011-11-30:

On 30 November 2011 09:37, Andrew Bennetts
<email address hidden> wrote:
> Jelmer Vernooij wrote:
>> Add a HPSS call for ``Repository.iter_inventories``.
>
> I am a bit concerned that by adding verbs like this, with their own ad hoc
> record stream-like wire format, that we're growing the maintenance burden
> unnecessarily, and not reusing improvements everywhere we could.

--
Martin

Vincent Ladeuil (vila) wrote on 2011-12-02:

>> I am a bit concerned that by adding verbs like this, with their own ad hoc
>> record stream-like wire format, that we're growing the maintenance burden
>> unnecessarily, and not reusing improvements everywhere we could.

> +1

I kind of had the same feeling when jelmer started adding a bunch of verbs that were basically replacing a vfs roundtrip by a smart request roundtrip.

BUT

If we stop maintaining these verbs on the server side, the clients will fall back to vfs.

So if we introduce different/better verbs, we can remove the old ones in both the server and the client, and all but the newest clients can still fall back to vfs.

The net effect is at least to get to a point where the most recent client/server do not use vfs at all.

This sounds like a good incremental step to me.
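The fallback behaviour discussed here follows the pattern visible in the attached diff: try the smart verb, and if the server does not know it, remember that and use VFS from then on. Below is a simplified sketch of that pattern; `Medium`, `call_with_fallback` and the two callables are stand-ins written for illustration, not the bzrlib classes.

```python
class UnknownSmartMethod(Exception):
    """Stand-in for the error raised when a server does not know a verb."""


class Medium:
    """Hypothetical connection state that remembers how old the server is."""

    def __init__(self):
        self.remote_is_before = None

    def is_remote_before(self, version):
        # True only if we already learned the server predates `version`.
        return (self.remote_is_before is not None
                and self.remote_is_before <= version)

    def remember_remote_is_before(self, version):
        self.remote_is_before = version


def call_with_fallback(medium, version, smart_call, vfs_call):
    """Use the smart verb when possible, otherwise fall back to VFS."""
    if medium.is_remote_before(version):
        return vfs_call()
    try:
        return smart_call()
    except UnknownSmartMethod:
        medium.remember_remote_is_before(version)
        return vfs_call()


attempts = []


def smart_call():
    attempts.append("smart")
    raise UnknownSmartMethod("Repository.get_inventories")


def vfs_call():
    attempts.append("vfs")
    return "inventories via VFS"


medium = Medium()
first = call_with_fallback(medium, (2, 5), smart_call, vfs_call)
second = call_with_fallback(medium, (2, 5), smart_call, vfs_call)
```

The second call skips the smart verb entirely, so an old server costs one wasted roundtrip per connection rather than one per request.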

'reusing improvements everywhere we could' is still valuable but doesn't need to block progress.

review: Approve

Jelmer Vernooij (jelmer) wrote on 2011-12-03:

Hi Andrew,

> Jelmer Vernooij wrote:
> > Add a HPSS call for ``Repository.iter_inventories``.
> I am a bit concerned that by adding verbs like this, with their own ad hoc
> record stream-like wire format, that we're growing the maintenance burden
> unnecessarily, and not reusing improvements everywhere we could.
>
> I'm glad this is using inventory-deltas. But if it's worth zlib compressing
> them here, shouldn't we do that for inventory-deltas from get_record_stream
> too?
It's a pretty big inventory delta in this case - for something like gcc it matters for the delta between null: and the initial revision that was requested. I haven't investigated whether it would help for record streams too, but I imagine it would have less of an effect.
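To give a feel for why compression helps with a delta from null:, here is a small experiment on synthetic data. The line layout is invented for illustration and is not bzr's real inventory-delta format; the point is only that thousands of similar "file added" lines compress well with zlib.

```python
import zlib

# Synthetic stand-in for a large delta from null: one line per added file.
# The field layout is made up for this example.
lines = [
    "None\x00/src/module_%04d.c\x00file-id-%04d\x00root-id\x00rev-1\x00file\n"
    % (i, i)
    for i in range(5000)
]
raw = "".join(lines).encode("utf-8")

compressed = zlib.compress(raw)
restored = zlib.decompress(compressed)
ratio = len(compressed) / len(raw)
print("raw=%d bytes, compressed=%d bytes" % (len(raw), len(compressed)))
```

Real inventories are less regular than this, so the actual gain would need measuring on a real repository.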

> Also, could you implement this via the get_record_stream RPC with a new
> parameter that means “just inventories (rather than rev+inv+texts+chks+sigs
> for
> given keys)”?
>
> On one hand it might feel a bit ugly to make one RPC do so many things,
> get_record_stream, do so many things, but on the other I think it provides a
> useful pressure to keep the interface minimal and consistent (e.g. whether
> records are zlib-compressed).
>
> Adding this iter_inventories RPC might be the right thing (although putting
> “iter” in the name of an RPC feels weird to me, that's surely a property of
> the Python API rather than a property of the RPC), but I'd like hear what you
> think about the tradeoffs of doing that vs. extending get_record_stream.

With regard to the name: Most of the other calls seem to be named after the equivalent method in the client / server, that's why I went with Repository.iter_inventories. I'm not particularly tied to that name, something like "Repository.get_inventories" or "Repository.stream_inventories" seems reasonable to me too.

I'm not sure whether this should be part of get_record_stream. I have a hard time understanding the get_record_stream code as it is, so I'd rather not make it even more complex by adding more parameters. For example, get_record_stream seems to depend on the repository on-disk format to an extent, whereas the inventory stream does not, as it uses inventory deltas. In other words, adding another verb was simpler, while keeping it all understandable for a mere mortal like me. :-)

Either way, I agree we should be reusing more code between the work I've done recently and the existing record stream calls. But it seems to me that would best be done by refactoring so they e.g. use the same code for sending a stream of blobs of indeterminate length, rather than by all being a part of the same verb.
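One way to picture shared code for "a stream of blobs of indeterminate length" is a small framing helper that any verb could reuse. The sketch below uses a length prefix per chunk and a zero-length terminator; this framing is an assumption made for illustration and is not bzr's actual wire format.

```python
import io
import struct


def write_chunks(chunks):
    """Frame each chunk as a 4-byte big-endian length followed by the payload.

    Empty chunks are skipped because a zero length marks the end of the stream.
    """
    for chunk in chunks:
        if chunk:
            yield struct.pack(">I", len(chunk)) + chunk
    yield struct.pack(">I", 0)


def read_chunks(fileobj):
    """Yield payloads from a framed stream until the terminator is seen."""
    while True:
        header = fileobj.read(4)
        if len(header) != 4:
            raise ValueError("truncated stream")
        (length,) = struct.unpack(">I", header)
        if length == 0:
            return
        payload = fileobj.read(length)
        if len(payload) != length:
            raise ValueError("truncated chunk")
        yield payload


# The sender does not need to know the number of chunks in advance.
wire = b"".join(write_chunks(iter([b"delta-1", b"", b"delta-22"])))
received = list(read_chunks(io.BytesIO(wire)))
```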

What do you think?

Cheers,

Jelmer

Andrew Bennetts (spiv) wrote on 2011-12-05:

Jelmer Vernooij wrote:
…
> > I'm glad this is using inventory-deltas. But if it's worth zlib compressing
> > them here, shouldn't we do that for inventory-deltas from get_record_stream
> > too?
> It's a pretty big inventory delta in this case - for something like gcc it
> matters for the delta between null: and the initial revision that was
> requested. I haven't investigated whether it would help for record streams
> too, but I imagine it would have less of an effect.

Well, I know that there are already cases that send deltas from null: —
something to do with stacking perhaps? So reusing this improvement globally
would be nice.

> > Also, could you implement this via the get_record_stream RPC with a new
> > parameter that means “just inventories (rather than rev+inv+texts+chks+sigs
> > for
> > given keys)”?
> >
> > On one hand it might feel a bit ugly to make one RPC do so many things,
> > get_record_stream, do so many things, but on the other I think it provides a
> > useful pressure to keep the interface minimal and consistent (e.g. whether
> > records are zlib-compressed).
> >
> > Adding this iter_inventories RPC might be the right thing (although putting
> > “iter” in the name of an RPC feels weird to me, that's surely a property of
> > the Python API rather than a property of the RPC), but I'd like hear what you
> > think about the tradeoffs of doing that vs. extending get_record_stream.
>
> With regard to the name: Most of the other calls seem to be named after the
> equivalent method in the client / server, that's why I went with
> Repository.iter_inventories. I'm not particularly tied to that name, something
> like "Repository.get_inventories" or "Repository.stream_inventories" seems
> reasonable to me too.

I think of using the name of the Python API as a good guide, not a strict rule.
Certainly there's no insert_stream_locked method on Repository…

“Repository.get_inventories” sounds fine to me.

> I'm not sure whether this should be part of get_record_stream. I have a hard
> time understanding the get_record_stream code as it is, so I'd rather not make
> it even more complex by adding more parameters - for example,
> get_record_streams seem to be dependent on the repository on-disk format to an
> extent - the inventory stream is not, as it's using inventory deltas. In other
> words, adding another verb was simpler, while keeping it all understandable
> for a mere mortal like me. :-)
>
> Either way, I agree we should be reusing more code between the work I've done
> recently and the existing record stream calls. But it seems to me that would
> best be done by refactoring so they e.g. use the same code for sending a
> stream of blobs of indeterminate length, rather than by all being a part of
> the same verb.

Well, the path to reuse we've developed *is* record streams. I'm quite willing
to believe that get_record_stream's parameters aren't a convenient way to
express all needs. But I'd really like it if the thing that was returned by a new
verb was a true record stream — something you could pass directly to
insert_record_stream. The idea is that record streams should be the One True
Way we have to say “here is a stream of da...

On 05/12/11 11:31, Andrew Bennetts wrote:
> Jelmer Vernooij wrote:
> …
>>> I'm glad this is using inventory-deltas.  But if it's worth zlib compressing
>>> them here, shouldn't we do that for inventory-deltas from get_record_stream
>>> too?
>> It's a pretty big inventory delta in this case - for something like gcc it
>> matters for the delta between null: and the initial revision that was
>> requested. I haven't investigated whether it would help for record streams
>> too, but I imagine it would have less of an effect.
> Well, I know that there are already cases that send deltas from null: —
> something to do with stacking perhaps?  So reusing this improvement globally
> would be nice.
I guess the best way to do this would be to add a zlib pack record kind 
for network streams? I.e. a new entry in 
NetworkRecordStream._kind_factory, and something equivalent on the 
server side?
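A registry-based design along those lines might look like the sketch below. The `KIND_FACTORY` dict, the `"fulltext-zlib"` kind name and both helper functions are hypothetical; they only illustrate how a compressed record kind could sit next to an uncompressed one without changing callers.

```python
import zlib


def fulltext_to_record(payload):
    """Uncompressed records are passed through unchanged."""
    return payload


def zlib_fulltext_to_record(payload):
    """Compressed records are inflated before use."""
    return zlib.decompress(payload)


# Maps the storage kind named on the wire to a decoding function.
KIND_FACTORY = {
    "fulltext": fulltext_to_record,
    "fulltext-zlib": zlib_fulltext_to_record,  # hypothetical new kind
}


def encode_record(text, compress):
    """Return (kind, payload) as the sending side would emit it."""
    if compress:
        return "fulltext-zlib", zlib.compress(text)
    return "fulltext", text


def decode_record(kind, payload):
    """Look up the factory for `kind` and rebuild the original bytes."""
    try:
        factory = KIND_FACTORY[kind]
    except KeyError:
        raise ValueError("unknown record kind %r" % (kind,))
    return factory(payload)


text = b"inventory delta bytes " * 100
plain = decode_record(*encode_record(text, compress=False))
packed = decode_record(*encode_record(text, compress=True))
```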
>>> Also, could you implement this via the get_record_stream RPC with a new
>>> parameter that means “just inventories (rather than rev+inv+texts+chks+sigs
>>> for
>>> given keys)”?
>>>
>>> On one hand it might feel a bit ugly to make one RPC do so many things,
>>> get_record_stream, do so many things, but on the other I think it provides a
>>> useful pressure to keep the interface minimal and consistent (e.g. whether
>>> records are zlib-compressed).
>>>
>>> Adding this iter_inventories RPC might be the right thing (although putting
>>> “iter” in the name of an RPC feels weird to me, that's surely a property of
>>> the Python API rather than a property of the RPC), but I'd like hear what you
>>> think about the tradeoffs of doing that vs. extending get_record_stream.
>> With regard to the name: Most of the other calls seem to be named after the
>> equivalent method in the client / server, that's why I went with
>> Repository.iter_inventories. I'm not particularly tied to that name, something
>> like "Repository.get_inventories" or "Repository.stream_inventories" seems
>> reasonable to me too.
> I think of using the name of the Python API as a good guide, not a strict rule.
> Certainly there's no insert_stream_locked method on Repository…
>
> “Repository.get_inventories” sounds fine to me.
I've renamed it.
>
>> I'm not sure whether this should be part of get_record_stream. I have a hard
>> time understanding the get_record_stream code as it is, so I'd rather not make
>> it even more complex by adding more parameters - for example,
>> get_record_streams seem to be dependent on the repository on-disk format to an
>> extent - the inventory stream is not, as it's using inventory deltas. In other
>> words, adding another verb was simpler, while keeping it all understandable
>> for a mere mortal like me. :-)
>>
>> Either way, I agree we should be reusing more code between the work I've done
>> recently and the existing record stream calls. But it seems to me that would
>> best be done by refactoring so they e.g. use the same code for sending a
>> stream of blobs of indeterminate length, rather than by all being a part of
>> the same verb.
> Well, the path to reuse we've developed *is* record streams.  I'm quite willing
> to believe that get_record_stream's parameters aren't a convenient way to
> express all needs.  But I'd really like it the thing that was returned by a new
> verb was a true record stream — something you could pass directly to
> insert_record_stream.  The idea is that record streams should be the One True
> Way we have to say “here is a stream of data from a repository.”  The existing
> get_record_stream stuff is pretty complex, but IIRC that's from two sources:
>
>   1. the complexity of constructing a definitely
>      ready-to-commit-perhaps-after-a-single-fixup-fetch-for-stacking-invariants
>      stream for the “give me all revisions/invs/texts/etc records for revisions
>      in this subgraph of the revision graph”.  That's mostly independent of
>      get_record_stream, it just happens to be where the logic for that is
>      encoded.
>
>   2. the abstractions/indirections to allow various different repository formats
>      to send their own native records directly via record streams.
>
> I think 2 is a strong reason to keep using records streams, rather than
> inventing something new and discovering that out of necessity you have ended up
> reinventing it.
>
> The basic structure of a record stream itself is very simple and lightweight
> (although it may well benefit from being wrapped in zlib compression over the
> wire?), and should be reused as the default decision for repository data unless
> there's a really good reason not to.
>
> The current method of extending get_record_stream by shoehorning more and more
> query types into the one kind of call perhaps isn't a good tradeoff.  I think at
> some level it will boil down to the same work though, even if the veneer is
> get_inventories(invs) rather than get_record_stream(JustInventories(invs), …).
I've done some more digging in the get_record_stream verb code today. 
Using record streams seems reasonable to me, but I would like to use a 
separate verb (rather than Repository.get_stream_1.19 and friends) to 
keep the interface simple - there's so much going on in 
get_record_stream that we don't need here.

It seems like there are two things we can do - either just send a stream 
similar to get_record_stream_1.19 with a single inventory-delta 
"substream", or we could just send the substream with inventory deltas. 
The latter is simpler and would need slightly less bandwidth, but can't 
be used with e.g. insert_record_stream. I'm not sure if we care about 
that, though. What do you think?

The body_stream method of the GetInventories server side implementation 
would look something like this if we were directly using substreams:

def body_stream(self, repository, ordering, revids):
    pack_writer = pack.ContainerSerialiser()
    yield pack_writer.begin()
    serializer = inventory_delta.InventoryDeltaSerializer(
        repository.supports_rich_root(),
        repository._format.supports_tree_reference)

    prev_inv = _mod_inventory.Inventory(root_id=None,
        revision_id=_mod_revision.NULL_REVISION)
    self._repository.lock_read()
    try:
        for inv, revid in repository._iter_inventories(revids, ordering):
            if inv is None:
                continue
            inv_delta = inv._make_delta(prev_inv)
            lines = serializer.delta_to_lines(
                prev_inv.revision_id, inv.revision_id, inv_delta)
            yield pack_writer.bytes_record("".join(lines),
                [('inventory-delta',)])
            prev_inv = inv
    finally:
        self._repository.unlock()
    yield pack_writer.end()
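The delta chaining used by body_stream (each inventory is sent as a delta against the previous one, starting from an empty inventory) can be shown with a toy model. Inventories here are plain dicts from file id to path, and `make_delta`, `apply_delta`, `serve` and `receive` are illustrative names, not bzrlib functions.

```python
def make_delta(old, new):
    """Return {file_id: new_path or None} for entries that differ."""
    delta = {}
    for file_id in old.keys() - new.keys():
        delta[file_id] = None  # entry was removed
    for file_id, path in new.items():
        if old.get(file_id) != path:
            delta[file_id] = path  # entry was added or changed
    return delta


def apply_delta(old, delta):
    """Return a new inventory with `delta` applied on top of `old`."""
    new = dict(old)
    for file_id, path in delta.items():
        if path is None:
            new.pop(file_id, None)
        else:
            new[file_id] = path
    return new


def serve(inventories):
    """Server side: emit each inventory as a delta against the previous one."""
    prev = {}
    for inv in inventories:
        yield make_delta(prev, inv)
        prev = inv


def receive(deltas):
    """Client side: rebuild full inventories by applying deltas in order."""
    prev = {}
    for delta in deltas:
        prev = apply_delta(prev, delta)
        yield prev


history = [
    {"a-id": "a.txt"},
    {"a-id": "a.txt", "b-id": "b.txt"},
    {"b-id": "renamed.txt"},
]
sent = list(serve(history))
rebuilt = list(receive(sent))
```

Only the first delta carries a whole inventory, which is why the delta from null: dominates the transfer for a large tree.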

Thanks again for reviewing this - I really appreciate your thoughts.

Cheers,

Jelmer

Preview Diff

 === modified file 'bzrlib/remote.py'
 --- bzrlib/remote.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/remote.py	2011-12-11 16:36:30 +0000
@@ -27,6 +27,7 @@
      errors,
      gpg,
      graph,
++    inventory_delta,
      lock,
      lockdir,
      osutils,
@@ -490,6 +491,45 @@
          self._next_open_branch_result = None
          return _mod_bzrdir.BzrDir.break_lock(self)
++    def _vfs_checkout_metadir(self):
++        self._ensure_real()
++        return self._real_bzrdir.checkout_metadir()
++
++    def checkout_metadir(self):
++        medium = self._client._medium
++        if medium._is_remote_before((2, 5)):
++            return self._vfs_checkout_metadir()
++        path = self._path_for_remote_call(self._client)
++        try:
++            response = self._client.call('BzrDir.checkout_metadir',
++                path)
++        except errors.UnknownSmartMethod:
++            medium._remember_remote_is_before((2, 5))
++            return self._vfs_checkout_metadir()
++        if len(response) != 3:
++            raise errors.UnexpectedSmartServerResponse(response)
++        control_name, repo_name, branch_name = response
++        try:
++            format = controldir.network_format_registry.get(control_name)
++        except KeyError:
++            raise errors.UnknownFormatError(kind='control', format=control_name)
++        if repo_name:
++            try:
++                repo_format = _mod_repository.network_format_registry.get(
++                    repo_name)
++            except KeyError:
++                raise errors.UnknownFormatError(kind='repository',
++                    format=repo_name)
++            format.repository_format = repo_format
++        if branch_name:
++            try:
++                format.set_branch_format(
++                    branch.network_format_registry.get(branch_name))
++            except KeyError:
++                raise errors.UnknownFormatError(kind='branch',
++                    format=branch_name)
++        return format
++
      def _vfs_cloning_metadir(self, require_stacking=False):
          self._ensure_real()
          return self._real_bzrdir.cloning_metadir(
@@ -1788,9 +1828,122 @@
      def get_inventory(self, revision_id):
          return list(self.iter_inventories([revision_id]))[0]
++    def _iter_inventories_rpc(self, revision_ids, ordering):
++        if ordering is None:
++            ordering = 'unordered'
++        path = self.bzrdir._path_for_remote_call(self._client)
++        body = "\n".join(revision_ids)
++        response_tuple, response_handler = (
++            self._call_with_body_bytes_expecting_body(
++                "VersionedFileRepository.get_inventories",
++                (path, ordering), body))
++        if response_tuple[0] != "ok":
++            raise errors.UnexpectedSmartServerResponse(response_tuple)
++        deserializer = inventory_delta.InventoryDeltaDeserializer()
++        byte_stream = response_handler.read_streamed_body()
++        decoded = smart_repo._byte_stream_to_stream(byte_stream)
++        if decoded is None:
++            # no results whatsoever
++            return
++        src_format, stream = decoded
++        if src_format.network_name() != self._format.network_name():
++            raise AssertionError(
++                "Mismatched RemoteRepository and stream src %r, %r" % (
++                src_format.network_name(), self._format.network_name()))
++        # ignore the src format, it's not really relevant
++        prev_inv = Inventory(root_id=None,
++            revision_id=_mod_revision.NULL_REVISION)
++        # there should be just one substream, with inventory deltas
++        substream_kind, substream = stream.next()
++        for record in substream:
++            (parent_id, new_id, versioned_root, tree_references, invdelta) = (
++                deserializer.parse_text_bytes(record.get_bytes_as("fulltext")))
++            if parent_id != prev_inv.revision_id:
++                raise AssertionError("invalid base %r != %r" % (parent_id,
++                    prev_inv.revision_id))
++            inv = prev_inv.create_by_apply_delta(invdelta, new_id)
++            yield inv, inv.revision_id
++            prev_inv = inv
++
++    def _iter_inventories_vfs(self, revision_ids, ordering=None):
++        self._ensure_real()
++        return self._real_repository._iter_inventories(revision_ids, ordering)
++
      def iter_inventories(self, revision_ids, ordering=None):
--        self._ensure_real()
--        return self._real_repository.iter_inventories(revision_ids, ordering)
++        """Get many inventories by revision_ids.
++
++        This will buffer some or all of the texts used in constructing the
++        inventories in memory, but will only parse a single inventory at a
++        time.
++
++        :param revision_ids: The expected revision ids of the inventories.
++        :param ordering: optional ordering, e.g. 'topological'.  If not
++            specified, the order of revision_ids will be preserved (by
++            buffering if necessary).
++        :return: An iterator of inventories.
++        """
++        if ((None in revision_ids)
++            or (_mod_revision.NULL_REVISION in revision_ids)):
++            raise ValueError('cannot get null revision inventory')
++        for inv, revid in self._iter_inventories(revision_ids, ordering):
++            if inv is None:
++                raise errors.NoSuchRevision(self, revid)
++            yield inv
++
++    def _iter_inventories(self, revision_ids, ordering=None):
++        if len(revision_ids) == 0:
++            return
++        missing = set(revision_ids)
++        if ordering is None:
++            order_as_requested = True
++            invs = {}
++            order = list(revision_ids)
++            order.reverse()
++            next_revid = order.pop()
++        else:
++            order_as_requested = False
++            if ordering != 'unordered' and self._fallback_repositories:
++                raise ValueError('unsupported ordering %r' % ordering)
++        iter_inv_fns = [self._iter_inventories_rpc] + [
++            fallback._iter_inventories for fallback in
++            self._fallback_repositories]
++        try:
++            for iter_inv in iter_inv_fns:
++                request = [revid for revid in revision_ids if revid in missing]
++                for inv, revid in iter_inv(request, ordering):
++                    if inv is None:
++                        continue
++                    missing.remove(inv.revision_id)
++                    if ordering != 'unordered':
++                        invs[revid] = inv
++                    else:
++                        yield inv, revid
++                if order_as_requested:
++                    # Yield as many results as we can while preserving order.
++                    while next_revid in invs:
++                        inv = invs.pop(next_revid)
++                        yield inv, inv.revision_id
++                        try:
++                            next_revid = order.pop()
++                        except IndexError:
++                            # We still want to fully consume the stream, just
++                            # in case it is not actually finished at this point
++                            next_revid = None
++                            break
++        except errors.UnknownSmartMethod:
++            for inv, revid in self._iter_inventories_vfs(revision_ids, ordering):
++                yield inv, revid
++            return
++        # Report missing
++        if order_as_requested:
++            if next_revid is not None:
++                yield None, next_revid
++            while order:
++                revid = order.pop()
++                yield invs.get(revid), revid
++        else:
++            while missing:
++                yield None, missing.pop()
      @needs_read_lock
      def get_revision(self, revision_id):
@@ -2149,6 +2302,8 @@
      @needs_read_lock
      def _get_inventory_xml(self, revision_id):
++        # This call is used by older working tree formats,
++        # which stored a serialized basis inventory.
          self._ensure_real()
          return self._real_repository._get_inventory_xml(revision_id)
@@ -2171,11 +2326,58 @@
              revids.update(set(fallback.all_revision_ids()))
          return list(revids)
++    def _filtered_revision_trees(self, revision_ids, file_ids):
++        """Return Tree for a revision on this branch with only some files.
++
++        :param revision_ids: a sequence of revision-ids;
++          a revision-id may not be None or 'null:'
++        :param file_ids: if not None, the result is filtered
++          so that only those file-ids, their parents and their
++          children are included.
++        """
++        inventories = self.iter_inventories(revision_ids)
++        for inv in inventories:
++            # Should we introduce a FilteredRevisionTree class rather
++            # than pre-filter the inventory here?
++            filtered_inv = inv.filter(file_ids)
++            yield InventoryRevisionTree(self, filtered_inv, filtered_inv.revision_id)
++
      @needs_read_lock
      def get_deltas_for_revisions(self, revisions, specific_fileids=None):
--        self._ensure_real()
--        return self._real_repository.get_deltas_for_revisions(revisions,
--            specific_fileids=specific_fileids)
++        medium = self._client._medium
++        if medium._is_remote_before((1, 2)):
++            self._ensure_real()
++            for delta in self._real_repository.get_deltas_for_revisions(
++                    revisions, specific_fileids):
++                yield delta
++            return
++        # Get the revision-ids of interest
++        required_trees = set()
++        for revision in revisions:
++            required_trees.add(revision.revision_id)
++            required_trees.update(revision.parent_ids[:1])
++
++        # Get the matching filtered trees. Note that it's more
++        # efficient to pass filtered trees to changes_from() rather
++        # than doing the filtering afterwards. changes_from() could
++        # arguably do the filtering itself but it's path-based, not
++        # file-id based, so filtering before or afterwards is
++        # currently easier.
++        if specific_fileids is None:
++            trees = dict((t.get_revision_id(), t) for
++                t in self.revision_trees(required_trees))
++        else:
++            trees = dict((t.get_revision_id(), t) for
++                t in self._filtered_revision_trees(required_trees,
++                specific_fileids))
++
++        # Calculate the deltas
++        for revision in revisions:
++            if not revision.parent_ids:
++                old_tree = self.revision_tree(_mod_revision.NULL_REVISION)
++            else:
++                old_tree = trees[revision.parent_ids[0]]
++            yield trees[revision.revision_id].changes_from(old_tree)
      @needs_read_lock
      def get_revision_delta(self, revision_id, specific_fileids=None):
@@ -3157,51 +3359,6 @@
                  self.bzrdir, self._client)
          return self._control_files
--    def _get_checkout_format_vfs(self, lightweight=False):
--        self._ensure_real()
--        if lightweight:
--            format = RemoteBzrDirFormat()
--            self.bzrdir._format._supply_sub_formats_to(format)
--            format.workingtree_format = self._real_branch._get_checkout_format(
--                lightweight=lightweight).workingtree_format
--            return format
--        else:
--            return self._real_branch._get_checkout_format(lightweight=False)
--
--    def _get_checkout_format(self, lightweight=False):
--        medium = self._client._medium
--        if medium._is_remote_before((2, 5)):
--            return self._get_checkout_format_vfs(lightweight)
--        try:
--            response = self._client.call('Branch.get_checkout_format',
--                self._remote_path(), lightweight)
--        except errors.UnknownSmartMethod:
--            medium._remember_remote_is_before((2, 5))
--            return self._get_checkout_format_vfs(lightweight)
--        if len(response) != 3:
--            raise errors.UnexpectedSmartServerResponse(response)
--        control_name, repo_name, branch_name = response
--        try:
--            format = controldir.network_format_registry.get(control_name)
--        except KeyError:
--            raise errors.UnknownFormatError(kind='control', format=control_name)
--        if repo_name:
--            try:
--                repo_format = _mod_repository.network_format_registry.get(
--                    repo_name)
--            except KeyError:
--                raise errors.UnknownFormatError(kind='repository',
--                    format=repo_name)
--            format.repository_format = repo_format
--        if branch_name:
--            try:
--                format.set_branch_format(
--                    branch.network_format_registry.get(branch_name))
--            except KeyError:
--                raise errors.UnknownFormatError(kind='branch',
--                    format=branch_name)
--        return format
--
      def get_physical_lock_status(self):
          """See Branch.get_physical_lock_status()."""
          try:
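Review note on the `get_deltas_for_revisions` change in `bzrlib/remote.py` above: the new code fetches each revision's tree plus the tree of its first (left-hand) parent only, and a revision without parents is compared against the null revision. The sketch below illustrates just that selection logic. It is not bzrlib code: `Revision` is a hypothetical stand-in that models only `revision_id` and `parent_ids`, and `required_tree_ids` and `delta_pairs` are names invented for this illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a bzrlib Revision object; only the two
# attributes read by get_deltas_for_revisions are modelled here.
Revision = namedtuple("Revision", ["revision_id", "parent_ids"])

NULL_REVISION = "null:"


def required_tree_ids(revisions):
    """Collect each revision id plus its first parent id, if it has one."""
    required = set()
    for revision in revisions:
        required.add(revision.revision_id)
        # Slicing with [:1] yields an empty list for parentless revisions.
        required.update(revision.parent_ids[:1])
    return required


def delta_pairs(revisions):
    """Yield (old_id, new_id) pairs between which a delta would be computed."""
    for revision in revisions:
        if revision.parent_ids:
            old_id = revision.parent_ids[0]
        else:
            old_id = NULL_REVISION
        yield old_id, revision.revision_id


revisions = [
    Revision("rev-1", []),
    Revision("rev-2", ["rev-1"]),
    Revision("rev-3", ["rev-2", "rev-x"]),  # merge: only rev-2 is needed
]
```

Note that the right-hand parent `rev-x` of the merge is never requested, which keeps the set of trees to fetch small.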
 === modified file 'bzrlib/smart/branch.py'
 --- bzrlib/smart/branch.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/smart/branch.py	2011-11-25 14:04:12 +0000
@@ -446,22 +446,3 @@
              return SuccessfulSmartServerResponse(('yes',))
          else:
              return SuccessfulSmartServerResponse(('no',))
--
--
--class SmartServerBranchRequestGetCheckoutFormat(SmartServerBranchRequest):
--    """Get the format to use for checkouts of a branch.
--
--    New in 2.5.
--    """
--
--    def do_with_branch(self, branch, lightweight):
--        control_format = branch._get_checkout_format(lightweight)
--        control_name = control_format.network_name()
--        if not control_format.fixed_components:
--            branch_name = control_format.get_branch_format().network_name()
--            repo_name = control_format.repository_format.network_name()
--        else:
--            branch_name = ''
--            repo_name = ''
--        return SuccessfulSmartServerResponse(
--            (control_name, repo_name, branch_name))
 === modified file 'bzrlib/smart/bzrdir.py'
 --- bzrlib/smart/bzrdir.py	2011-11-21 14:44:03 +0000
 +++ bzrlib/smart/bzrdir.py	2011-12-11 16:36:30 +0000
@@ -208,6 +208,25 @@
              branch_name))
++class SmartServerBzrDirRequestCheckoutMetaDir(SmartServerRequestBzrDir):
++    """Get the format to use for checkouts.
++
++    New in 2.5.
++    """
++
++    def do_bzrdir_request(self):
++        control_format = self._bzrdir.checkout_metadir()
++        control_name = control_format.network_name()
++        if not control_format.fixed_components:
++            branch_name = control_format.get_branch_format().network_name()
++            repo_name = control_format.repository_format.network_name()
++        else:
++            branch_name = ''
++            repo_name = ''
++        return SuccessfulSmartServerResponse(
++            (control_name, repo_name, branch_name))
++
++
  class SmartServerRequestCreateBranch(SmartServerRequestBzrDir):
      def do(self, path, network_name):
 === modified file 'bzrlib/smart/repository.py'
 --- bzrlib/smart/repository.py	2011-12-05 15:16:52 +0000
 +++ bzrlib/smart/repository.py	2011-12-11 16:36:30 +0000
@@ -14,7 +14,7 @@
  # along with this program; if not, write to the Free Software
  # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
--"""Server-side repository related request implmentations."""
++"""Server-side repository related request implementations."""
  import bz2
  import os
@@ -28,6 +28,8 @@
      bencode,
      errors,
      estimate_compressed_size,
++    inventory as _mod_inventory,
++    inventory_delta,
      osutils,
      pack,
      trace,
@@ -43,6 +45,7 @@
  from bzrlib.repository import _strip_NULL_ghosts, network_format_registry
  from bzrlib import revision as _mod_revision
  from bzrlib.versionedfile import (
++    ChunkedContentFactory,
      NetworkRecordStream,
      record_to_fulltext_bytes,
+     )
@@ -1220,3 +1223,57 @@
                  yield zlib.compress(record.get_bytes_as('fulltext'))
          finally:
              self._repository.unlock()
++
++
++class SmartServerRepositoryGetInventories(SmartServerRepositoryRequest):
++    """Get the inventory deltas for a set of revision ids.
++
++    This accepts a list of revision ids, and then sends a chain
++    of deltas for the inventories of those revisions. The first
++    delta is computed against an empty inventory.
++
++    The server writes back zlibbed serialized inventory deltas,
++    in the ordering specified. The base for each delta is the
++    inventory generated by the previous delta.
++
++    New in 2.5.
++    """
++
++    def _inventory_delta_stream(self, repository, ordering, revids):
++        prev_inv = _mod_inventory.Inventory(root_id=None,
++            revision_id=_mod_revision.NULL_REVISION)
++        serializer = inventory_delta.InventoryDeltaSerializer(
++            repository.supports_rich_root(),
++            repository._format.supports_tree_reference)
++        repository.lock_read()
++        try:
++            for inv, revid in repository._iter_inventories(revids, ordering):
++                if inv is None:
++                    continue
++                inv_delta = inv._make_delta(prev_inv)
++                lines = serializer.delta_to_lines(
++                    prev_inv.revision_id, inv.revision_id, inv_delta)
++                yield ChunkedContentFactory(inv.revision_id, None, None, lines)
++                prev_inv = inv
++        finally:
++            repository.unlock()
++
++    def body_stream(self, repository, ordering, revids):
++        substream = self._inventory_delta_stream(repository,
++            ordering, revids)
++        return _stream_to_byte_stream([('inventory-delta', substream)],
++            repository._format)
++
++    def do_body(self, body_bytes):
++        return SuccessfulSmartServerResponse(('ok', ),
++            body_stream=self.body_stream(self._repository, self._ordering,
++                body_bytes.splitlines()))
++
++    def do_repository_request(self, repository, ordering):
++        if ordering == 'unordered':
++            # inventory deltas for a topologically sorted stream
++            # are likely to be smaller
++            ordering = 'topological'
++        self._ordering = ordering
++        # Signal that we want a body
++        return None
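Review note on `SmartServerRepositoryGetInventories` above: the handler sends a chain of deltas, each one relative to the inventory produced by the previous delta, starting from an empty inventory and skipping inventories that could not be found. A minimal sketch of that chaining idea follows, with plain dicts standing in for inventories. The helpers `make_delta`, `apply_delta`, `delta_chain` and `rebuild` are hypothetical names for this illustration and are not the `inventory_delta` serializer API.

```python
def make_delta(prev, cur):
    """Return (removed_keys, changed_items) that turn ``prev`` into ``cur``."""
    removed = sorted(k for k in prev if k not in cur)
    changed = sorted(
        (k, v) for k, v in cur.items() if k not in prev or prev[k] != v)
    return removed, changed


def apply_delta(prev, delta):
    """Apply a delta produced by make_delta to a copy of ``prev``."""
    removed, changed = delta
    result = dict(prev)
    for key in removed:
        del result[key]
    result.update(changed)
    return result


def delta_chain(inventories):
    """Yield one delta per inventory, each based on the previous inventory."""
    prev = {}  # the chain starts from an empty inventory
    for inv in inventories:
        if inv is None:
            # A missing inventory is skipped, as in the handler above.
            continue
        yield make_delta(prev, inv)
        prev = inv


def rebuild(deltas):
    """Client side: replay the chain to recover every inventory in order."""
    prev = {}
    for delta in deltas:
        prev = apply_delta(prev, delta)
        yield prev


inventories = [{"a": "1"}, None, {"a": "2", "b": "1"}, {"b": "1"}]
recovered = list(rebuild(delta_chain(inventories)))
```

Because each delta depends on the one before it, the receiver has to apply them in the order sent; this is presumably why the handler prefers a topological ordering, where consecutive inventories tend to differ little.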
 === modified file 'bzrlib/smart/request.py'
 --- bzrlib/smart/request.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/smart/request.py	2011-12-11 16:36:30 +0000
@@ -547,9 +547,6 @@
      'Branch.get_stacked_on_url', 'bzrlib.smart.branch',
      'SmartServerBranchRequestGetStackedOnURL', info='read')
  request_handlers.register_lazy(
--    'Branch.get_checkout_format', 'bzrlib.smart.branch',
--    'SmartServerBranchRequestGetCheckoutFormat', info='read')
--request_handlers.register_lazy(
      'Branch.get_physical_lock_status', 'bzrlib.smart.branch',
      'SmartServerBranchRequestGetPhysicalLockStatus', info='read')
  request_handlers.register_lazy(
@@ -586,6 +583,9 @@
      'Branch.revision_id_to_revno', 'bzrlib.smart.branch',
      'SmartServerBranchRequestRevisionIdToRevno', info='read')
  request_handlers.register_lazy(
++    'BzrDir.checkout_metadir', 'bzrlib.smart.bzrdir',
++    'SmartServerBzrDirRequestCheckoutMetaDir', info='read')
++request_handlers.register_lazy(
      'BzrDir.cloning_metadir', 'bzrlib.smart.bzrdir',
      'SmartServerBzrDirRequestCloningMetaDir', info='read')
  request_handlers.register_lazy(
@@ -754,6 +754,9 @@
      'VersionedFileRepository.get_serializer_format', 'bzrlib.smart.repository',
      'SmartServerRepositoryGetSerializerFormat', info='read')
  request_handlers.register_lazy(
++    'VersionedFileRepository.get_inventories', 'bzrlib.smart.repository',
++    'SmartServerRepositoryGetInventories', info='read')
++request_handlers.register_lazy(
      'Repository.tarball', 'bzrlib.smart.repository',
      'SmartServerRepositoryTarball', info='read')
  request_handlers.register_lazy(
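Review note on the `register_lazy` calls in `bzrlib/smart/request.py` above: each call names a verb, a module and a class as strings, so the handler module only needs to be imported when the verb is actually looked up. The sketch below shows that general idea under stated assumptions. `LazyRegistry` is a hypothetical class written for this illustration, not bzrlib's registry implementation, and the registered example key points at a standard-library class purely so that the snippet is self-contained.

```python
import importlib


class LazyRegistry(object):
    """Map keys to (module name, member name) and import on first lookup."""

    def __init__(self):
        self._entries = {}

    def register_lazy(self, key, module_name, member_name, info=None):
        # Nothing is imported here; only the names are recorded.
        self._entries[key] = (module_name, member_name, info)

    def get(self, key):
        module_name, member_name, _info = self._entries[key]
        module = importlib.import_module(module_name)
        return getattr(module, member_name)

    def get_info(self, key):
        return self._entries[key][2]


handlers = LazyRegistry()
# Illustrative registration only; the key and target are made up.
handlers.register_lazy(
    "Example.ordered_dict", "collections", "OrderedDict", info="read")
handler_class = handlers.get("Example.ordered_dict")
```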
 === modified file 'bzrlib/tests/blackbox/test_annotate.py'
 --- bzrlib/tests/blackbox/test_annotate.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/tests/blackbox/test_annotate.py	2011-12-11 16:36:30 +0000
@@ -326,6 +326,6 @@
          # being too low. If rpc_count increases, more network roundtrips have
          # become necessary for this use case. Please do not adjust this number
          # upwards without agreement from bzr's network support maintainers.
--        self.assertLength(19, self.hpss_calls)
++        self.assertLength(15, self.hpss_calls)
          self.expectFailure("annotate accesses inventories, which require VFS access",
              self.assertThat, self.hpss_calls, NoVfsCalls)
 === modified file 'bzrlib/tests/blackbox/test_branch.py'
 --- bzrlib/tests/blackbox/test_branch.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/tests/blackbox/test_branch.py	2011-12-11 16:36:30 +0000
@@ -483,7 +483,7 @@
          # being too low. If rpc_count increases, more network roundtrips have
          # become necessary for this use case. Please do not adjust this number
          # upwards without agreement from bzr's network support maintainers.
--        self.assertLength(40, self.hpss_calls)
++        self.assertLength(33, self.hpss_calls)
          self.expectFailure("branching to the same branch requires VFS access",
              self.assertThat, self.hpss_calls, NoVfsCalls)
 === modified file 'bzrlib/tests/blackbox/test_cat.py'
 --- bzrlib/tests/blackbox/test_cat.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/tests/blackbox/test_cat.py	2011-12-11 16:36:30 +0000
@@ -239,6 +239,5 @@
          # being too low. If rpc_count increases, more network roundtrips have
          # become necessary for this use case. Please do not adjust this number
          # upwards without agreement from bzr's network support maintainers.
--        self.assertLength(16, self.hpss_calls)
--        self.expectFailure("cat still uses VFS calls",
--            self.assertThat, self.hpss_calls, NoVfsCalls)
++        self.assertLength(9, self.hpss_calls)
++        self.assertThat(self.hpss_calls, NoVfsCalls)
 === modified file 'bzrlib/tests/blackbox/test_checkout.py'
 --- bzrlib/tests/blackbox/test_checkout.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/tests/blackbox/test_checkout.py	2011-12-11 16:36:30 +0000
@@ -179,8 +179,7 @@
          for count in range(9):
              t.commit(message='commit %d' % count)
          self.reset_smart_call_log()
--        out, err = self.run_bzr(['checkout', self.get_url('from'),
--            'target'])
++        out, err = self.run_bzr(['checkout', self.get_url('from'), 'target'])
          # This figure represent the amount of work to perform this use case. It
          # is entirely ok to reduce this number if a test fails due to rpc_count
          # being too low. If rpc_count increases, more network roundtrips have
@@ -202,9 +201,5 @@
          # being too low. If rpc_count increases, more network roundtrips have
          # become necessary for this use case. Please do not adjust this number
          # upwards without agreement from bzr's network support maintainers.
--        if len(self.hpss_calls) < 28 or len(self.hpss_calls) > 40:
--            self.fail(
--                "Incorrect length: wanted between 28 and 40, got %d for %r" % (
--                    len(self.hpss_calls), self.hpss_calls))
--        self.expectFailure("lightweight checkouts require VFS calls",
--            self.assertThat, self.hpss_calls, NoVfsCalls)
++        self.assertLength(15, self.hpss_calls)
++        self.assertThat(self.hpss_calls, NoVfsCalls)
 === modified file 'bzrlib/tests/blackbox/test_export.py'
 --- bzrlib/tests/blackbox/test_export.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/tests/blackbox/test_export.py	2011-12-11 16:36:30 +0000
@@ -448,6 +448,5 @@
          # being too low. If rpc_count increases, more network roundtrips have
          # become necessary for this use case. Please do not adjust this number
          # upwards without agreement from bzr's network support maintainers.
--        self.assertLength(16, self.hpss_calls)
--        self.expectFailure("export requires inventory access which requires VFS",
--            self.assertThat, self.hpss_calls, NoVfsCalls)
++        self.assertLength(7, self.hpss_calls)
++        self.assertThat(self.hpss_calls, NoVfsCalls)
 === modified file 'bzrlib/tests/blackbox/test_log.py'
 --- bzrlib/tests/blackbox/test_log.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/tests/blackbox/test_log.py	2011-12-11 16:36:30 +0000
@@ -1086,9 +1086,8 @@
          # being too low. If rpc_count increases, more network roundtrips have
          # become necessary for this use case. Please do not adjust this number
          # upwards without agreement from bzr's network support maintainers.
--        self.assertLength(19, self.hpss_calls)
--        self.expectFailure("verbose log accesses inventories, which require VFS",
--            self.assertThat, self.hpss_calls, NoVfsCalls)
++        self.assertLength(11, self.hpss_calls)
++        self.assertThat(self.hpss_calls, NoVfsCalls)
      def test_per_file(self):
          self.setup_smart_server_with_call_log()
@@ -1103,6 +1102,5 @@
          # being too low. If rpc_count increases, more network roundtrips have
          # become necessary for this use case. Please do not adjust this number
          # upwards without agreement from bzr's network support maintainers.
--        self.assertLength(21, self.hpss_calls)
--        self.expectFailure("per-file graph access requires VFS",
--            self.assertThat, self.hpss_calls, NoVfsCalls)
++        self.assertLength(15, self.hpss_calls)
++        self.assertThat(self.hpss_calls, NoVfsCalls)
 === modified file 'bzrlib/tests/blackbox/test_ls.py'
 --- bzrlib/tests/blackbox/test_ls.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/tests/blackbox/test_ls.py	2011-12-11 16:36:30 +0000
@@ -262,6 +262,5 @@
          # being too low. If rpc_count increases, more network roundtrips have
          # become necessary for this use case. Please do not adjust this number
          # upwards without agreement from bzr's network support maintainers.
--        self.assertLength(15, self.hpss_calls)
--        self.expectFailure("inventories can only be accessed over VFS",
--            self.assertThat, self.hpss_calls, NoVfsCalls)
++        self.assertLength(6, self.hpss_calls)
++        self.assertThat(self.hpss_calls, NoVfsCalls)
 === modified file 'bzrlib/tests/blackbox/test_sign_my_commits.py'
 --- bzrlib/tests/blackbox/test_sign_my_commits.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/tests/blackbox/test_sign_my_commits.py	2011-12-11 16:36:30 +0000
@@ -165,7 +165,7 @@
          # being too low. If rpc_count increases, more network roundtrips have
          # become necessary for this use case. Please do not adjust this number
          # upwards without agreement from bzr's network support maintainers.
--        self.assertLength(54, self.hpss_calls)
++        self.assertLength(50, self.hpss_calls)
          self.expectFailure("signing commits requires VFS access",
              self.assertThat, self.hpss_calls, NoVfsCalls)
@@ -185,12 +185,5 @@
          # being too low. If rpc_count increases, more network roundtrips have
          # become necessary for this use case. Please do not adjust this number
          # upwards without agreement from bzr's network support maintainers.
--
--        # The number of readv requests seems to vary depending on the generated
--        # repository and how well it compresses, so allow for a bit of
--        # variation:
--        if len(self.hpss_calls) not in (18, 19):
--            self.fail("Incorrect length: wanted 18 or 19, got %d for %r" % (
--                len(self.hpss_calls), self.hpss_calls))
--        self.expectFailure("verifying commits requires VFS access",
--            self.assertThat, self.hpss_calls, NoVfsCalls)
++        self.assertLength(10, self.hpss_calls)
++        self.assertThat(self.hpss_calls, NoVfsCalls)
 === modified file 'bzrlib/tests/per_interbranch/test_push.py'
 --- bzrlib/tests/per_interbranch/test_push.py	2011-10-15 01:09:01 +0000
 +++ bzrlib/tests/per_interbranch/test_push.py	2011-12-11 16:36:30 +0000
@@ -279,10 +279,10 @@
          # remote graph any further.
          bzr_core_trace = Equals(
              ['Repository.insert_stream_1.19', 'Repository.insert_stream_1.19',
--             'get', 'Branch.set_last_revision_info', 'Branch.unlock'])
++             'Branch.set_last_revision_info', 'Branch.unlock'])
          bzr_loom_trace = Equals(
              ['Repository.insert_stream_1.19', 'Repository.insert_stream_1.19',
--             'get', 'Branch.set_last_revision_info', 'get', 'Branch.unlock'])
++             'Branch.set_last_revision_info', 'get', 'Branch.unlock'])
          self.assertThat(calls_after_insert_stream,
              MatchesAny(bzr_core_trace, bzr_loom_trace))
 === modified file 'bzrlib/tests/test_remote.py'
 --- bzrlib/tests/test_remote.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/tests/test_remote.py	2011-12-11 16:36:30 +0000
@@ -69,6 +69,7 @@
  from bzrlib.smart.repository import (
      SmartServerRepositoryGetParentMap,
      SmartServerRepositoryGetStream_1_19,
++    _stream_to_byte_stream,
+     )
  from bzrlib.symbol_versioning import deprecated_in
  from bzrlib.tests import (
@@ -503,6 +504,43 @@
          self.assertRaises(errors.UnknownFormatError, a_bzrdir.cloning_metadir)
++class TestBzrDirCheckoutMetaDir(TestRemote):
++
++    def test__get_checkout_format(self):
++        transport = MemoryTransport()
++        client = FakeClient(transport.base)
++        reference_bzrdir_format = bzrdir.format_registry.get('default')()
++        control_name = reference_bzrdir_format.network_name()
++        client.add_expected_call(
++            'BzrDir.checkout_metadir', ('quack/', ),
++            'success', (control_name, '', ''))
++        transport.mkdir('quack')
++        transport = transport.clone('quack')
++        a_bzrdir = RemoteBzrDir(transport, RemoteBzrDirFormat(),
++            _client=client)
++        result = a_bzrdir.checkout_metadir()
++        # We should have got a reference control dir with default branch and
++        # repository formats.
++        self.assertEqual(bzrdir.BzrDirMetaFormat1, type(result))
++        self.assertEqual(None, result._repository_format)
++        self.assertEqual(None, result._branch_format)
++        self.assertFinished(client)
++
++    def test_unknown_format(self):
++        transport = MemoryTransport()
++        client = FakeClient(transport.base)
++        client.add_expected_call(
++            'BzrDir.checkout_metadir', ('quack/',),
++            'success', ('dontknow', '', ''))
++        transport.mkdir('quack')
++        transport = transport.clone('quack')
++        a_bzrdir = RemoteBzrDir(transport, RemoteBzrDirFormat(),
++            _client=client)
++        self.assertRaises(errors.UnknownFormatError,
++            a_bzrdir.checkout_metadir)
++        self.assertFinished(client)
++
++
  class TestBzrDirDestroyBranch(TestRemote):
      def test_destroy_default(self):
@@ -1102,47 +1140,6 @@
          self.assertFinished(client)
--class TestBranchGetCheckoutFormat(RemoteBranchTestCase):
--
--    def test__get_checkout_format(self):
--        transport = MemoryTransport()
--        client = FakeClient(transport.base)
--        reference_bzrdir_format = bzrdir.format_registry.get('default')()
--        control_name = reference_bzrdir_format.network_name()
--        client.add_expected_call(
--            'Branch.get_stacked_on_url', ('quack/',),
--            'error', ('NotStacked',))
--        client.add_expected_call(
--            'Branch.get_checkout_format', ('quack/', False),
--            'success', (control_name, '', ''))
--        transport.mkdir('quack')
--        transport = transport.clone('quack')
--        branch = self.make_remote_branch(transport, client)
--        result = branch._get_checkout_format()
--        # We should have got a reference control dir with default branch and
--        # repository formats.
--        self.assertEqual(bzrdir.BzrDirMetaFormat1, type(result))
--        self.assertEqual(None, result._repository_format)
--        self.assertEqual(None, result._branch_format)
--        self.assertFinished(client)
--
--    def test_unknown_format(self):
--        transport = MemoryTransport()
--        client = FakeClient(transport.base)
--        client.add_expected_call(
--            'Branch.get_stacked_on_url', ('quack/',),
--            'error', ('NotStacked',))
--        client.add_expected_call(
--            'Branch.get_checkout_format', ('quack/', False),
--            'success', ('dontknow', '', ''))
--        transport.mkdir('quack')
--        transport = transport.clone('quack')
--        branch = self.make_remote_branch(transport, client)
--        self.assertRaises(errors.UnknownFormatError,
--            branch._get_checkout_format)
--        self.assertFinished(client)
--
--
  class TestBranchGetPhysicalLockStatus(RemoteBranchTestCase):
      def test_get_physical_lock_status_yes(self):
@@ -4216,3 +4213,43 @@
              'Repository.unlock', ('quack/', 'token', 'False'),
              'success', ('ok', ))
          repo.pack(['hinta', 'hintb'])
++
++
++class TestRepositoryIterInventories(TestRemoteRepository):
++    """Test Repository.iter_inventories."""
++
++    def _serialize_inv_delta(self, old_name, new_name, delta):
++        serializer = inventory_delta.InventoryDeltaSerializer(True, False)
++        return "".join(serializer.delta_to_lines(old_name, new_name, delta))
++
++    def test_single_empty(self):
++        transport_path = 'quack'
++        repo, client = self.setup_fake_client_and_repository(transport_path)
++        fmt = bzrdir.format_registry.get('2a')().repository_format
++        repo._format = fmt
++        stream = [('inventory-delta', [
++            versionedfile.FulltextContentFactory('somerevid', None, None,
++                self._serialize_inv_delta('null:', 'somerevid', []))])]
++        client.add_expected_call(
++            'VersionedFileRepository.get_inventories', ('quack/', 'unordered'),
++            'success', ('ok', ),
++            _stream_to_byte_stream(stream, fmt))
++        ret = list(repo.iter_inventories(["somerevid"]))
++        self.assertLength(1, ret)
++        inv = ret[0]
++        self.assertEquals("somerevid", inv.revision_id)
++
++    def test_empty(self):
++        transport_path = 'quack'
++        repo, client = self.setup_fake_client_and_repository(transport_path)
++        ret = list(repo.iter_inventories([]))
++        self.assertEquals(ret, [])
++
++    def test_missing(self):
++        transport_path = 'quack'
++        repo, client = self.setup_fake_client_and_repository(transport_path)
++        client.add_expected_call(
++            'VersionedFileRepository.get_inventories', ('quack/', 'unordered'),
++            'success', ('ok', ), iter([]))
++        self.assertRaises(errors.NoSuchRevision, list, repo.iter_inventories(
++            ["somerevid"]))
 === modified file 'bzrlib/tests/test_smart.py'
 --- bzrlib/tests/test_smart.py	2011-12-11 16:36:30 +0000
 +++ bzrlib/tests/test_smart.py	2011-12-11 16:36:30 +0000
@@ -32,6 +32,7 @@
      bzrdir,
      errors,
      gpg,
++    inventory_delta,
      tests,
      transport,
      urlutils,
@@ -224,6 +225,24 @@
          self.assertEqual(expected, request.execute('', 'False'))
++class TestSmartServerBzrDirRequestCheckoutMetaDir(
++    tests.TestCaseWithMemoryTransport):
++    """Tests for BzrDir.checkout_metadir."""
++
++    def test_checkout_metadir(self):
++        backing = self.get_transport()
++        request = smart_dir.SmartServerBzrDirRequestCheckoutMetaDir(
++            backing)
++        branch = self.make_branch('.', format='2a')
++        response = request.execute('')
++        self.assertEqual(
++            smart_req.SmartServerResponse(
++                ('Bazaar-NG meta directory, format 1\n',
++                 'Bazaar repository format 2a (needs bzr 1.16 or later)\n',
++                 'Bazaar Branch Format 7 (needs bzr 1.6)\n')),
++            response)
++
++
  class TestSmartServerBzrDirRequestDestroyBranch(
      tests.TestCaseWithMemoryTransport):
      """Tests for BzrDir.destroy_branch."""
@@ -1458,35 +1477,6 @@
              smart_req.SmartServerResponse(('no',)), response)
--class TestSmartServerBranchRequestGetCheckoutFormat(TestLockedBranch):
--
--    def test_lightweight(self):
--        backing = self.get_transport()
--        request = smart_branch.SmartServerBranchRequestGetCheckoutFormat(
--            backing)
--        branch = self.make_branch('.', format='2a')
--        response = request.execute('', 'True')
--        self.assertEqual(
--            smart_req.SmartServerResponse(
--                ('Bazaar-NG meta directory, format 1\n',
--                 'Bazaar repository format 2a (needs bzr 1.16 or later)\n',
--                 'Bazaar Branch Format 7 (needs bzr 1.6)\n')),
--            response)
--
--    def test_heavyweight(self):
--        backing = self.get_transport()
--        request = smart_branch.SmartServerBranchRequestGetCheckoutFormat(
--            backing)
--        branch = self.make_branch('.', format='2a')
--        response = request.execute('', 'False')
--        self.assertEqual(
--            smart_req.SmartServerResponse((
--                'Bazaar-NG meta directory, format 1\n',
--                'Bazaar repository format 2a (needs bzr 1.16 or later)\n',
--                'Bazaar Branch Format 7 (needs bzr 1.6)\n')),
--            response)
--
--
  class TestSmartServerBranchRequestUnlock(TestLockedBranch):
      def setUp(self):
@@ -2456,8 +2446,6 @@
              smart_branch.SmartServerBranchPutConfigFile)
          self.assertHandlerEqual('Branch.get_parent',
              smart_branch.SmartServerBranchGetParent)
--        self.assertHandlerEqual('Branch.get_checkout_format',
--            smart_branch.SmartServerBranchRequestGetCheckoutFormat)
          self.assertHandlerEqual('Branch.get_physical_lock_status',
              smart_branch.SmartServerBranchRequestGetPhysicalLockStatus)
          self.assertHandlerEqual('Branch.get_tags_bytes',
@@ -2492,6 +2480,8 @@
              smart_dir.SmartServerRequestInitializeBzrDir)
          self.assertHandlerEqual('BzrDirFormat.initialize_ex_1.16',
              smart_dir.SmartServerRequestBzrDirInitializeEx)
++        self.assertHandlerEqual('BzrDir.checkout_metadir',
++            smart_dir.SmartServerBzrDirRequestCheckoutMetaDir)
          self.assertHandlerEqual('BzrDir.cloning_metadir',
              smart_dir.SmartServerBzrDirRequestCloningMetaDir)
          self.assertHandlerEqual('BzrDir.get_config_file',
@@ -2558,6 +2548,8 @@
              smart_repo.SmartServerRepositoryAbortWriteGroup)
          self.assertHandlerEqual('VersionedFileRepository.get_serializer_format',
              smart_repo.SmartServerRepositoryGetSerializerFormat)
++        self.assertHandlerEqual('VersionedFileRepository.get_inventories',
++            smart_repo.SmartServerRepositoryGetInventories)
          self.assertHandlerEqual('Transport.is_readonly',
              smart_req.SmartServerIsReadonly)
@@ -2623,3 +2615,48 @@
              smart_req.SuccessfulSmartServerResponse(('ok', ), ),
              request.do_body(''))
++
++class TestSmartServerRepositoryGetInventories(tests.TestCaseWithTransport):
++
++    def _get_serialized_inventory_delta(self, repository, base_revid, revid):
++        base_inv = repository.revision_tree(base_revid).inventory
++        inv = repository.revision_tree(revid).inventory
++        inv_delta = inv._make_delta(base_inv)
++        serializer = inventory_delta.InventoryDeltaSerializer(True, False)
++        return "".join(serializer.delta_to_lines(base_revid, revid, inv_delta))
++
++    def test_single(self):
++        backing = self.get_transport()
++        request = smart_repo.SmartServerRepositoryGetInventories(backing)
++        t = self.make_branch_and_tree('.', format='2a')
++        self.addCleanup(t.lock_write().unlock)
++        self.build_tree_contents([("file", "somecontents")])
++        t.add(["file"], ["thefileid"])
++        t.commit(rev_id='somerev', message="add file")
++        self.assertIs(None, request.execute('', 'unordered'))
++        response = request.do_body("somerev\n")
++        self.assertTrue(response.is_successful())
++        self.assertEquals(response.args, ("ok", ))
++        stream = [('inventory-delta', [
++            versionedfile.FulltextContentFactory('somerev', None, None,
++                self._get_serialized_inventory_delta(
++                    t.branch.repository, 'null:', 'somerev'))])]
++        fmt = bzrdir.format_registry.get('2a')().repository_format
++        self.assertEquals(
++            "".join(response.body_stream),
++            "".join(smart_repo._stream_to_byte_stream(stream, fmt)))
++
++    def test_empty(self):
++        backing = self.get_transport()
++        request = smart_repo.SmartServerRepositoryGetInventories(backing)
++        t = self.make_branch_and_tree('.', format='2a')
++        self.addCleanup(t.lock_write().unlock)
++        self.build_tree_contents([("file", "somecontents")])
++        t.add(["file"], ["thefileid"])
++        t.commit(rev_id='somerev', message="add file")
++        self.assertIs(None, request.execute('', 'unordered'))
++        response = request.do_body("")
++        self.assertTrue(response.is_successful())
++        self.assertEquals(response.args, ("ok", ))
++        self.assertEquals("".join(response.body_stream),
++            "Bazaar pack format 1 (introduced in 0.18)\nB54\n\nBazaar repository format 2a (needs bzr 1.16 or later)\nE")
 === modified file 'doc/en/release-notes/bzr-2.5.txt'
 --- doc/en/release-notes/bzr-2.5.txt	2011-12-11 16:36:30 +0000
 +++ doc/en/release-notes/bzr-2.5.txt	2011-12-11 16:36:30 +0000
@@ -111,6 +111,9 @@
  * Plugins can now register additional "location aliases".
    (Jelmer Vernooij)
++* ``bzr status`` no longer shows shelves if files are specified.
++  (Francis Devereux)
++
  * Revision specifiers will now only browse as much history as they
    need to, rather than grabbing the whole history unnecessarily in some
    cases. (Jelmer Vernooij)
@@ -236,6 +239,13 @@
    ``Repository.get_revision_signature_text``.
    (Jelmer Vernooij)
++* Add HPSS calls for ``Repository.iter_files_bytes`` and
++  ``VersionedFileRepository.get_inventories``, speeding up
++  several commands including ``bzr export`` and ``bzr co --lightweight``.
++  (Jelmer Vernooij, #608640)
++
++* Add HPSS call for ``BzrDir.checkout_metadir``. (Jelmer Vernooij, #894459)
++
  * Add HPSS call for ``Repository.pack``. (Jelmer Vernooij, #894461)
  * Custom HPSS error handlers can now be installed in the smart server client