Merge lp:~jameinel/bzr/2.4-pull-iter-changes-780677 into lp:bzr

Proposed by John A Meinel
Status: Merged
Approved by: Andrew Bennetts
Approved revision: no longer in the source branch.
Merged at revision: 5849
Proposed branch: lp:~jameinel/bzr/2.4-pull-iter-changes-780677
Merge into: lp:bzr
Diff against target: 56 lines (+19/-5)
3 files modified
bzrlib/workingtree.py (+5/-0)
doc/en/release-notes/bzr-2.4.txt (+8/-0)
doc/en/whats-new/whats-new-in-2.4.txt (+6/-5)
To merge this branch: bzr merge lp:~jameinel/bzr/2.4-pull-iter-changes-780677
Reviewer          Review Type   Date Requested   Status
Andrew Bennetts                                  Approve
Review via email: mp+60568@code.launchpad.net

Commit message

Update WT.pull to use fast Repository iter_changes when .fast_deltas is true. (bug #780677)

Description of the change

This is another small patch to improve the performance of "bzr pull". It is one of the cases where using DirStateRevisionTree is worse than RevisionTree, because 2a-format repositories can compute iter_changes much faster.
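
As a rough, self-contained sketch of the idea (not the bzrlib code itself): the stand-in classes below are hypothetical and only mimic the attribute and method names that appear in the patch (`_format.fast_deltas`, `get_parent_ids`, `revision_tree`, `basis_tree`). They show how a repository-backed basis tree could be preferred over the dirstate-backed one when the format advertises fast deltas.

```python
class FakeFormat(object):
    """Hypothetical stand-in for a repository format object."""

    def __init__(self, fast_deltas):
        self.fast_deltas = fast_deltas


class FakeRepository(object):
    """Hypothetical stand-in for a repository; not bzrlib's Repository."""

    def __init__(self, fast_deltas):
        self._format = FakeFormat(fast_deltas)

    def revision_tree(self, revision_id):
        # A real repository would return a RevisionTree; a tuple is enough here.
        return ("RevisionTree", revision_id)


class FakeWorkingTree(object):
    """Hypothetical stand-in for a working tree."""

    def __init__(self, repository, parent_ids):
        self.repository = repository
        self._parent_ids = list(parent_ids)

    def get_parent_ids(self):
        return list(self._parent_ids)

    def basis_tree(self):
        # Stands in for the dirstate-backed basis tree.
        basis_id = self._parent_ids[0] if self._parent_ids else None
        return ("DirStateRevisionTree", basis_id)


def choose_basis_tree(tree):
    """Prefer a repository-backed tree when the format has fast deltas."""
    basis = tree.basis_tree()
    repository = tree.repository
    if repository._format.fast_deltas:
        parent_ids = tree.get_parent_ids()
        if parent_ids:
            # The first parent is the basis revision of the working tree.
            basis = repository.revision_tree(parent_ids[0])
    return basis
```

With no parents, or with a format that lacks fast deltas, the sketch falls back to the dirstate-backed tree, mirroring the guarded `if` blocks in the patch.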

This patch by itself drops "bzr pull" from about 29s down to about 17s on my machine. However, when coupled with https://code.launchpad.net/~jameinel/bzr/2.4-set-parent-trees-delta-282941/+merge/60541 it drops "bzr pull" time down to 3.7s. (I'll update the numbers in "WhatsNew" once both of them land.)

For bzr.dev on my machine, the change is less pronounced, but it is still ~1.5s down to 1.2s.
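
To put the approximate timings quoted above into ratios, a small calculation like the following could be used. The labels are illustrative only, and the numbers are the rounded figures from this description, not fresh measurements.

```python
def speedup(before_s, after_s):
    """Return how many times faster the 'after' timing is."""
    return before_s / float(after_s)


# (before, after) in seconds, copied from the description above.
timings = {
    "large tree, this patch alone": (29.0, 17.0),
    "large tree, both patches": (29.0, 3.7),
    "bzr.dev": (1.5, 1.2),
}

for label, (before_s, after_s) in sorted(timings.items()):
    print("%-30s %.2fx faster" % (label, speedup(before_s, after_s)))
```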

Next on the block is probably 'bzr update', though WT._update_tree is pretty ugly because of all the various merges that can occur.

Andrew Bennetts (spiv) wrote :

Nice!

I wonder if we should push this down into WorkingTree.basis_tree() rather than just special-casing in pull()? Or perhaps this could happen in an InterTree optimiser? Anyway, that's something we can worry about later while we enjoy this improvement landing on trunk :)

Thanks for updating the table in whats-new too.

review: Approve
John A Meinel (jameinel) wrote :

So there are times where WT.basis_tree() is faster, namely WT.iter_changes(DirStateRevisionTree). Unfortunately RevisionTree.iter_changes(DirStateRevisionTree) is slower.

However, I think we might have the logic that WT.iter_changes(RevisionTree) will notice RT.get_revision_id() is in self.get_parent_ids() and use the fast path. So we might just get rid of DirStateRevisionTree entirely for the cases where WT.branch.repository._format.fast_deltas is true. Something to look into.

John A Meinel (jameinel) wrote :

sent to pqm by email

John A Meinel (jameinel) wrote :

On 5/11/2011 5:13 AM, Andrew Bennetts wrote:
> Review: Approve
> Nice!
>
> I wonder if we should push this down into WorkingTree.basis_tree() rather than just special-casing in pull()? Or perhaps this could happen in an InterTree optimiser? Anyway, that's something we can worry about later while we enjoy this improvement landing on trunk :)

I looked into that. It certainly isn't hard to do. ATM, though, I'm not
100% sure of the ramifications. Any code that uses Tree.inventory (or
iter_entries_by_dir, etc.) is going to be a lot faster using
DirStateRevisionTree rather than RevisionTree.

I'd really have to audit the code to make sure that basis_tree is only
being used for iter_changes.

Of course, the other option is to create another InterTree optimizer.
One that looks between DirStateRevisionTree and RevisionTree and just
grabs another RevisionTree and runs iter_changes on the difference.

So for now, I'm sticking with this, but I'm willing to revisit it in the
future.

John
=:->
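
For illustration only, the "another InterTree optimizer" idea from the message above might look roughly like the sketch below. Every class here is a hypothetical stand-in written for this example, not bzrlib's real `InterTree` machinery; it assumes the comparison runs from a dirstate-backed source tree to a repository-backed target tree, and that the dirstate-backed tree can reach its repository.

```python
class RevisionTree(object):
    """Hypothetical stand-in for a repository-backed tree."""

    def __init__(self, revision_id):
        self.revision_id = revision_id


class DirStateRevisionTree(object):
    """Hypothetical stand-in for a dirstate-backed basis tree."""

    def __init__(self, revision_id, repository):
        self.revision_id = revision_id
        self.repository = repository


class Repository(object):
    """Hypothetical repository that can hand out RevisionTree objects."""

    def revision_tree(self, revision_id):
        return RevisionTree(revision_id)


class GenericInter(object):
    """Fallback comparer that accepts any pair of trees."""

    label = "generic"

    @classmethod
    def is_compatible(cls, source, target):
        return True

    def __init__(self, source, target):
        self.source = source
        self.target = target

    def describe(self):
        return "%s(%s -> %s)" % (
            self.label, type(self.source).__name__, type(self.target).__name__)


class DirStateToRevisionInter(GenericInter):
    """Swap the dirstate-backed source for a repository-backed tree."""

    label = "fast"

    @classmethod
    def is_compatible(cls, source, target):
        return (isinstance(source, DirStateRevisionTree)
                and isinstance(target, RevisionTree))

    def __init__(self, source, target):
        # Grab another RevisionTree for the same revision, so both sides
        # are repository-backed before any comparison is made.
        source = source.repository.revision_tree(source.revision_id)
        GenericInter.__init__(self, source, target)


OPTIMISERS = [DirStateToRevisionInter]


def get_inter(source, target):
    """Return the first compatible optimiser, else the generic comparer."""
    for optimiser in OPTIMISERS:
        if optimiser.is_compatible(source, target):
            return optimiser(source, target)
    return GenericInter(source, target)
```

The point of the sketch is only the dispatch: callers keep asking for a comparison between two trees, and the swap to a repository-backed tree happens in one place instead of being special-cased in pull().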

Preview Diff

=== modified file 'bzrlib/workingtree.py'
--- bzrlib/workingtree.py 2011-05-10 09:30:33 +0000
+++ bzrlib/workingtree.py 2011-05-11 11:48:38 +0000
@@ -1032,6 +1032,11 @@
             new_revision_info = self.branch.last_revision_info()
             if new_revision_info != old_revision_info:
                 repository = self.branch.repository
+                if repository._format.fast_deltas:
+                    parent_ids = self.get_parent_ids()
+                    if parent_ids:
+                        basis_id = parent_ids[0]
+                        basis_tree = repository.revision_tree(basis_id)
                 basis_tree.lock_read()
                 try:
                     new_basis_tree = self.branch.basis_tree()

=== modified file 'doc/en/release-notes/bzr-2.4.txt'
--- doc/en/release-notes/bzr-2.4.txt 2011-05-10 09:34:35 +0000
+++ doc/en/release-notes/bzr-2.4.txt 2011-05-11 11:48:38 +0000
@@ -30,6 +30,14 @@
   more efficiently. For a simple branch it reduces the number of
   round-trips by about 20%. (Andrew Bennetts)
 
+* ``bzr pull`` now properly triggers the fast
+  ``CHKInventory.iter_changes`` rather than the slow generic
+  inter-Inventory changes. It used to use a ``DirStateRevisionTree`` as
+  one of the source trees, which is faster when we have to read the whole
+  inventory anyway, but much slower when we can get just the delta out of
+  the repository. On a 70k record tree, this changes ``bzr pull`` from 28s
+  down to 17s. (John Arbash Meinel, #780677)
+
 * Slightly reduced memory consumption when fetching into a 2a repository
   by reusing existing caching a little better. (Andrew Bennetts)
 

=== modified file 'doc/en/whats-new/whats-new-in-2.4.txt'
--- doc/en/whats-new/whats-new-in-2.4.txt 2011-05-10 09:34:35 +0000
+++ doc/en/whats-new/whats-new-in-2.4.txt 2011-05-11 11:48:38 +0000
@@ -63,11 +63,12 @@
 trees. A possibly incomplete list is as follows for running commands on a
 70k file tree::
 
- bzr-2.3 bzr-2.4 action
- 3m39s 1m03s bzr co --lightweight
- 38s 6s bzr revert
- 4m47s 27s bzr merge
- 4m58s 32s bzr up
+ bzr-2.3.1 bzr-2.3.2 bzr-2.4 action
+ 3m39s 1m03s bzr co --lightweight
+ 38s 6s bzr revert
+ 4m47s 27s bzr merge
+ 4m45s 29s 17s bzr pull
+ 4m58s 32s bzr up
 
 
 Faster stacked branches