Bazaar

Merge lp:~spiv/bzr/faster-revert-593560 into lp:bzr

faster-revert-593560
Merge into bzr.dev

Proposed by Andrew Bennetts on 2010-08-30

Status:

Merged

Approved by:

Andrew Bennetts on 2010-09-01

Approved revision:

no longer in the source branch.

Merged at revision:

5401

Proposed branch:

lp:~spiv/bzr/faster-revert-593560

Merge into:

lp:bzr

Diff against target:

56 lines (+11/-3)

3 files modified

NEWS (+4/-0)
bzrlib/tree.py (+1/-1)
doc/en/whats-new/whats-new-in-2.3.txt (+6/-2)

To merge this branch:

bzr merge lp:~spiv/bzr/faster-revert-593560

High

Confirmed

Link a bug report

Reviewer	Review Type	Date Requested	Status
bzr-core		2010-08-30	Pending
Review via email: mp+34043@code.launchpad.net

Commit message

Avoid repeatedly calling self.target.all_file_ids() in InterTree.iter_changes.

Description of the change

This simple change improves bzr revert and bzr status on large trees (i.e. gcc-linaro) with many changes (i.e. reverting from lp:gcc-linaro/4.5 to tag:gcc_4_5_0_release) by about 15% on my system. Specifically bzr revert to that tag goes from 3m 49s to 3m 18s, and bzr st on the resulting tree also drops from about 1.8s to 1.6s.

Tests still pass, and avoiding unnecessary (and repeated) calls to all_file_ids seems like a clearly good idea.

Revision history for this message

Martin Pool (mbp) wrote on 2010-08-30:

That's pretty nice.

Perhaps the news entry can say a bit more about what changed so we can
understand it later: "by not repeatedly building a list of file ids."

I'd like you to add this to the whatsnew-2.3 docs too, under a
performance heading.

In some ways this patch feels incomplete not to have a test or a
prevention of regressions. I guess it is reasonable to keep
all_file_ids for people who want it. But perhaps just measuring
performance is the best way to keep it up, so +1 with those doc
tweaks.

--
Martin

Revision history for this message

John A Meinel (jameinel) wrote on 2010-08-30:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 8/30/2010 3:55 AM, Martin Pool wrote:
> That's pretty nice.
>
> Perhaps the news entry can say a bit more about what changed so we can
> understand it later: "by not repeatedly building a list of file ids."
>
> I'd like you to add this to the whatsnew-2.3 docs too, under a
> performance heading.
>
> In some ways this patch feels incomplete not to have a test or a
> prevention of regressions. I guess it is reasonable to keep
> all_file_ids for people who want it. But perhaps just measuring
> performance is the best way to keep it up, so +1 with those doc
> tweaks.
>

Can you switch it to use 'target.has_id()' instead? I think Martin has
mentioned that he doesn't really prefer the '__contains__' syntax for
trees. Certainly it is a bit unclear what 'in tree' should be passed.
(It takes a file_id, but could be a InventoryEntry, etc.)

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkx7upQACgkQJdeBCYSNAAOmUwCeOvR1CsV04QEq7gaS4co/xc4X
xEcAoL5OwV2iJLq1oqrj5DCcCCflg5gt
=LQSN
-----END PGP SIGNATURE-----

Revision history for this message

Martin Pool (mbp) wrote on 2010-08-31:

On 31 August 2010 00:05, John Arbash Meinel <email address hidden> wrote:
> Can you switch it to use 'target.has_id()' instead? I think Martin has
> mentioned that he doesn't really prefer the '__contains__' syntax for
> trees. Certainly it is a bit unclear what 'in tree' should be passed.
> (It takes a file_id, but could be a InventoryEntry, etc.)

Yes, if we have that and can use it I think it is much more explicit.

--
Martin

Revision history for this message

Andrew Bennetts (spiv) wrote on 2010-09-01:

sent to pqm by email

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk

Subscribers

People subscribed via source and target branches

to all changes:

Alejandro Cornejo2

Andrew Bennetts

Bazaar Codereview Subscribers

Benoit Pierre

Gmood

Karl Bielefeldt

Mahmoud Hassan

Matt Nordhoff

Mohd Fikri Mohd Amin

MrJOHN

Václav Haisman

bzr PQM

vincenzo

to status/vote changes:

Alexander Belchenko

amandla2023

1	=== modified file 'NEWS'
2	--- NEWS 2010-08-30 10:14:21 +0000
3	+++ NEWS 2010-08-31 07:14:41 +0000
4	@@ -149,6 +149,10 @@
5	will be backed up (adding an extention of the form .~#~).
6	(Marius Kruger, #400554)
7
8	+* ``bzr revert`` and ``bzr status`` are up to 15% faster on large trees
9	+ with many changes by not repeatedly building a list of all file-ids.
10	+ (Andrew Bennetts)
11	+
12	* Decrease memory consumption when many chk index pages are loaded. (Such
13	as during ``bzr co`` or ``bzr ls -R`` of a large tree.) Often we need to
14	read many chk pages because the individual chk map nodes will be spread
15
16	=== modified file 'bzrlib/tree.py'
17	--- bzrlib/tree.py 2010-05-25 17:27:52 +0000
18	+++ bzrlib/tree.py 2010-08-31 07:14:41 +0000
19	@@ -1130,7 +1130,7 @@
20	if file_id in to_paths:
21	# already returned
22	continue
23	- if file_id not in self.target.all_file_ids():
24	+ if not self.target.has_id(file_id):
25	# common case - paths we have not emitted are not present in
26	# target.
27	to_path = None
28
29	=== modified file 'doc/en/whats-new/whats-new-in-2.3.txt'
30	--- doc/en/whats-new/whats-new-in-2.3.txt 2010-08-23 18:59:22 +0000
31	+++ doc/en/whats-new/whats-new-in-2.3.txt 2010-08-31 07:14:41 +0000
32	@@ -8,12 +8,12 @@
33	:doc:`../release-notes/index` for a full list.
34
35	Users are encouraged to upgrade from the other stable series. This
36	-document outlines the improvements in Bazaar 2.2 vs Bazaar 2.1. As well as
37	+document outlines the improvements in Bazaar 2.3 vs Bazaar 2.2. As well as
38	summarizing improvements made to the core product, it highlights
39	enhancements within the broader Bazaar world of potential interest to
40	those upgrading.
41
42	-Bazaar 2.2.0 is fully compatible both locally and on the network with 2.0
43	+Bazaar 2.3.0 is fully compatible both locally and on the network with 2.0
44	2.1, and 2.2, and can read and write repositories generated by all
45	previous versions.
46
47	@@ -28,6 +28,10 @@
48	Performance improvements
49	************************
50
51	+* ``bzr revert`` and ``bzr status`` are up to 15% faster on large trees
52	+ with many changes by not repeatedly building a list of all file-ids.
53	+ (Andrew Bennetts)
54	+
55	* ``bzr send`` uses less memory.
56	(John Arbash Meinel, #614576)
57