Merge lp:~vila/bzr/857335-tags-revs-slow into lp:bzr

Proposed by Vincent Ladeuil
Status: Merged
Approved by: Vincent Ladeuil
Approved revision: no longer in the source branch.
Merged at revision: 6163
Proposed branch: lp:~vila/bzr/857335-tags-revs-slow
Merge into: lp:bzr
Diff against target: 83 lines (+34/-8)
2 files modified
bzrlib/builtins.py (+31/-8)
doc/en/release-notes/bzr-2.5.txt (+3/-0)
To merge this branch: bzr merge lp:~vila/bzr/857335-tags-revs-slow
Reviewer Review Type Date Requested Status
Jelmer Vernooij (community) Approve
Review via email: mp+76763@code.launchpad.net

Commit message

Stop reloading all ancestry for each tag when searching tags in a revision range

Description of the change

bzr tags was doing a costly ancestry check when the -rX..Y parameter was
specified.

In the worst case scenario (-r1..), this gave:

bzr: 23 secs
emacs: 19 mins
gcc: 74 mins

With this patch, we're down to:

bzr: 1.3 secs
emacs: 9.8 secs
gcc: 7.6 secs

I had to add a check for the case where X > Y in -rX..Y to comply with the
test suite which roughly says: if the range is invalid, don't output
anything. The timings mentioned in the bug report was without this check.

Using a range is still slower than no using a range which is... unfortunate
but seems out of scope as it would require unifying the various graphs
cached in the branch and the repo in various forms.

To post a comment you must log in.
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

s/ouput/output/

It might be nice to split the bit of code that determines the tags for a range out into a separate method. Other than that, this seems reasonable.

review: Approve
Revision history for this message
Vincent Ladeuil (vila) wrote :

sent to pqm by email

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'bzrlib/builtins.py'
--- bzrlib/builtins.py 2011-09-22 19:51:18 +0000
+++ bzrlib/builtins.py 2011-09-24 03:27:24 +0000
@@ -2620,7 +2620,7 @@
2620 match_dict['author'] = match_author2620 match_dict['author'] = match_author
2621 if match_bugs:2621 if match_bugs:
2622 match_dict['bugs'] = match_bugs2622 match_dict['bugs'] = match_bugs
2623 2623
2624 # Build the LogRequest and execute it2624 # Build the LogRequest and execute it
2625 if len(file_ids) == 0:2625 if len(file_ids) == 0:
2626 file_ids = None2626 file_ids = None
@@ -5752,12 +5752,8 @@
57525752
5753 self.add_cleanup(branch.lock_read().unlock)5753 self.add_cleanup(branch.lock_read().unlock)
5754 if revision:5754 if revision:
5755 graph = branch.repository.get_graph()5755 # Restrict to the specified range
5756 rev1, rev2 = _get_revision_range(revision, branch, self.name())5756 tags = self._tags_for_range(branch, revision)
5757 revid1, revid2 = rev1.rev_id, rev2.rev_id
5758 # only show revisions between revid1 and revid2 (inclusive)
5759 tags = [(tag, revid) for tag, revid in tags if
5760 graph.is_between(revid, revid1, revid2)]
5761 if sort is None:5757 if sort is None:
5762 sort = tag_sort_methods.get()5758 sort = tag_sort_methods.get()
5763 sort(branch, tags)5759 sort(branch, tags)
@@ -5768,7 +5764,8 @@
5768 revno = branch.revision_id_to_dotted_revno(revid)5764 revno = branch.revision_id_to_dotted_revno(revid)
5769 if isinstance(revno, tuple):5765 if isinstance(revno, tuple):
5770 revno = '.'.join(map(str, revno))5766 revno = '.'.join(map(str, revno))
5771 except (errors.NoSuchRevision, errors.GhostRevisionsHaveNoRevno):5767 except (errors.NoSuchRevision,
5768 errors.GhostRevisionsHaveNoRevno):
5772 # Bad tag data/merges can lead to tagged revisions5769 # Bad tag data/merges can lead to tagged revisions
5773 # which are not in this branch. Fail gracefully ...5770 # which are not in this branch. Fail gracefully ...
5774 revno = '?'5771 revno = '?'
@@ -5777,6 +5774,32 @@
5777 for tag, revspec in tags:5774 for tag, revspec in tags:
5778 self.outf.write('%-20s %s\n' % (tag, revspec))5775 self.outf.write('%-20s %s\n' % (tag, revspec))
57795776
5777 def _tags_for_range(self, branch, revision):
5778 range_valid = True
5779 rev1, rev2 = _get_revision_range(revision, branch, self.name())
5780 revid1, revid2 = rev1.rev_id, rev2.rev_id
5781 # _get_revision_range will always set revid2 if it's not specified.
5782 # If revid1 is None, it means we want to start from the branch
5783 # origin which is always a valid ancestor. If revid1 == revid2, the
5784 # ancestry check is useless.
5785 if revid1 and revid1 != revid2:
5786 # FIXME: We really want to use the same graph than
5787 # branch.iter_merge_sorted_revisions below, but this is not
5788 # easily available -- vila 2011-09-23
5789 if branch.repository.get_graph().is_ancestor(revid2, revid1):
5790 # We don't want to output anything in this case...
5791 return []
5792 # only show revisions between revid1 and revid2 (inclusive)
5793 tagged_revids = branch.tags.get_reverse_tag_dict()
5794 found = []
5795 for r in branch.iter_merge_sorted_revisions(
5796 start_revision_id=revid2, stop_revision_id=revid1,
5797 stop_rule='include'):
5798 revid_tags = tagged_revids.get(r[0], None)
5799 if revid_tags:
5800 found.extend([(tag, r[0]) for tag in revid_tags])
5801 return found
5802
57805803
5781class cmd_reconfigure(Command):5804class cmd_reconfigure(Command):
5782 __doc__ = """Reconfigure the type of a bzr directory.5805 __doc__ = """Reconfigure the type of a bzr directory.
57835806
=== modified file 'doc/en/release-notes/bzr-2.5.txt'
--- doc/en/release-notes/bzr-2.5.txt 2011-09-23 07:51:28 +0000
+++ doc/en/release-notes/bzr-2.5.txt 2011-09-24 03:27:24 +0000
@@ -26,6 +26,9 @@
26.. Improvements to existing commands, especially improved performance 26.. Improvements to existing commands, especially improved performance
27 or memory usage, or better results.27 or memory usage, or better results.
2828
29* Stop reloading all ancestry for each tag when searching tags in a revision
30 range. (Vincent Ladeuil, #857335)
31
29Bug Fixes32Bug Fixes
30*********33*********
3134