Merge lp:~vila/bzr/857335-tags-revs-slow into lp:bzr

Proposed by Vincent Ladeuil
Status: Merged
Approved by: Vincent Ladeuil
Approved revision: no longer in the source branch.
Merged at revision: 6163
Proposed branch: lp:~vila/bzr/857335-tags-revs-slow
Merge into: lp:bzr
Diff against target: 83 lines (+34/-8)
2 files modified
bzrlib/builtins.py (+31/-8)
doc/en/release-notes/bzr-2.5.txt (+3/-0)
To merge this branch: bzr merge lp:~vila/bzr/857335-tags-revs-slow
Reviewer Review Type Date Requested Status
Jelmer Vernooij (community) Approve
Review via email: mp+76763@code.launchpad.net

Commit message

Stop reloading all ancestry for each tag when searching tags in a revision range

Description of the change

bzr tags was doing a costly ancestry check when the -rX..Y parameter was
specified.

In the worst case scenario (-r1..), this gave:

bzr: 23 secs
emacs: 19 mins
gcc: 74 mins

With this patch, we're down to:

bzr: 1.3 secs
emacs: 9.8 secs
gcc: 7.6 secs

I had to add a check for the case where X > Y in -rX..Y to comply with the
test suite which roughly says: if the range is invalid, don't output
anything. The timings mentioned in the bug report was without this check.

Using a range is still slower than no using a range which is... unfortunate
but seems out of scope as it would require unifying the various graphs
cached in the branch and the repo in various forms.

To post a comment you must log in.
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

s/ouput/output/

It might be nice to split the bit of code that determines the tags for a range out into a separate method. Other than that, this seems reasonable.

review: Approve
Revision history for this message
Vincent Ladeuil (vila) wrote :

sent to pqm by email

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/builtins.py'
2--- bzrlib/builtins.py 2011-09-22 19:51:18 +0000
3+++ bzrlib/builtins.py 2011-09-24 03:27:24 +0000
4@@ -2620,7 +2620,7 @@
5 match_dict['author'] = match_author
6 if match_bugs:
7 match_dict['bugs'] = match_bugs
8-
9+
10 # Build the LogRequest and execute it
11 if len(file_ids) == 0:
12 file_ids = None
13@@ -5752,12 +5752,8 @@
14
15 self.add_cleanup(branch.lock_read().unlock)
16 if revision:
17- graph = branch.repository.get_graph()
18- rev1, rev2 = _get_revision_range(revision, branch, self.name())
19- revid1, revid2 = rev1.rev_id, rev2.rev_id
20- # only show revisions between revid1 and revid2 (inclusive)
21- tags = [(tag, revid) for tag, revid in tags if
22- graph.is_between(revid, revid1, revid2)]
23+ # Restrict to the specified range
24+ tags = self._tags_for_range(branch, revision)
25 if sort is None:
26 sort = tag_sort_methods.get()
27 sort(branch, tags)
28@@ -5768,7 +5764,8 @@
29 revno = branch.revision_id_to_dotted_revno(revid)
30 if isinstance(revno, tuple):
31 revno = '.'.join(map(str, revno))
32- except (errors.NoSuchRevision, errors.GhostRevisionsHaveNoRevno):
33+ except (errors.NoSuchRevision,
34+ errors.GhostRevisionsHaveNoRevno):
35 # Bad tag data/merges can lead to tagged revisions
36 # which are not in this branch. Fail gracefully ...
37 revno = '?'
38@@ -5777,6 +5774,32 @@
39 for tag, revspec in tags:
40 self.outf.write('%-20s %s\n' % (tag, revspec))
41
42+ def _tags_for_range(self, branch, revision):
43+ range_valid = True
44+ rev1, rev2 = _get_revision_range(revision, branch, self.name())
45+ revid1, revid2 = rev1.rev_id, rev2.rev_id
46+ # _get_revision_range will always set revid2 if it's not specified.
47+ # If revid1 is None, it means we want to start from the branch
48+ # origin which is always a valid ancestor. If revid1 == revid2, the
49+ # ancestry check is useless.
50+ if revid1 and revid1 != revid2:
51+ # FIXME: We really want to use the same graph than
52+ # branch.iter_merge_sorted_revisions below, but this is not
53+ # easily available -- vila 2011-09-23
54+ if branch.repository.get_graph().is_ancestor(revid2, revid1):
55+ # We don't want to output anything in this case...
56+ return []
57+ # only show revisions between revid1 and revid2 (inclusive)
58+ tagged_revids = branch.tags.get_reverse_tag_dict()
59+ found = []
60+ for r in branch.iter_merge_sorted_revisions(
61+ start_revision_id=revid2, stop_revision_id=revid1,
62+ stop_rule='include'):
63+ revid_tags = tagged_revids.get(r[0], None)
64+ if revid_tags:
65+ found.extend([(tag, r[0]) for tag in revid_tags])
66+ return found
67+
68
69 class cmd_reconfigure(Command):
70 __doc__ = """Reconfigure the type of a bzr directory.
71
72=== modified file 'doc/en/release-notes/bzr-2.5.txt'
73--- doc/en/release-notes/bzr-2.5.txt 2011-09-23 07:51:28 +0000
74+++ doc/en/release-notes/bzr-2.5.txt 2011-09-24 03:27:24 +0000
75@@ -26,6 +26,9 @@
76 .. Improvements to existing commands, especially improved performance
77 or memory usage, or better results.
78
79+* Stop reloading all ancestry for each tag when searching tags in a revision
80+ range. (Vincent Ladeuil, #857335)
81+
82 Bug Fixes
83 *********
84