Merge lp:~ian-clatworthy/bzr/log-show-custom into lp:~bzr/bzr/trunk-old

Proposed by Ian Clatworthy
Status: Rejected
Rejected by: Ian Clatworthy
Proposed branch: lp:~ian-clatworthy/bzr/log-show-custom
Merge into: lp:~bzr/bzr/trunk-old
Diff against target: 261 lines (has conflicts)
Text conflict in bzrlib/log.py
To merge this branch: bzr merge lp:~ian-clatworthy/bzr/log-show-custom
Reviewer Review Type Date Requested Status
Jelmer Vernooij (community) Needs Fixing
Review via email: mp+6770@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Ian Clatworthy (ian-clatworthy) wrote :

It turns out that checking for custom properties when logging, particularly checking for foreign VCS properties, is pretty expensive. This patch switches their display off by default and introduces a new option to enable them: --show-custom. If that's considered heavy-handed, we could find a smarter way to skip foreign VCS property calculation (I couldn't see a way given the current algorithm) or make the option --show-foreign, leaving non-foreign custom properties on by default.

On OpenOffice.org with 260K revisions, this patch reduces the time by 10% from 50 seconds to 45 seconds. This improvement complements my faster-log branch: together they reduce the overall time from 50+ seconds to 41 seconds.

Revision history for this message
Martin Pool (mbp) wrote :

2009/5/25 Ian Clatworthy <email address hidden>:
> Ian Clatworthy has proposed merging lp:~ian-clatworthy/bzr/log-show-custom into lp:bzr.
>
> Requested reviews:
>    bzr-core (bzr-core)
>
> It turns out that checking for custom properties when logging, particularly checking for foreign VCS properties, is pretty expensive. This patch switches their display off by default and introduces a new option to enable them: --show-custom. If that's considered heavy-handed, we could find a smarter way to skip foreign VCS property calculation (I couldn't see a way given the current algorithm) or make the option --show-foreign, leaving non-foreign custom properties on by default.

I think that's reasonable.

It's the kind of case where it might be nice for people to be able to
set the option on other than by aliasing the command. (Though
aliasing it is an adequate standby.)

--
Martin <http://launchpad.net/~mbp/>

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin Pool wrote:
> 2009/5/25 Ian Clatworthy <email address hidden>:
>> Ian Clatworthy has proposed merging lp:~ian-clatworthy/bzr/log-show-custom into lp:bzr.
>>
>> Requested reviews:
>> bzr-core (bzr-core)
>>
>> It turns out that checking for custom properties when logging, particularly checking for foreign VCS properties, is pretty expensive. This patch switches their display off by default and introduces a new option to enable them: --show-custom. If that's considered heavy-handed, we could find a smarter way to skip foreign VCS property calculation (I couldn't see a way given the current algorithm) or make the option --show-foreign, leaving non-foreign custom properties on by default.
>
> I think that's reasonable.
>
> It's the kind of case where it might be nice for people to be able to
> set the option on other than by aliasing the command. (Though
> aliasing it is an adequate standby.)
>

I think it would be worthwhile to figure out why this is so expensive.
In general, it seems like it should just be a single dict lookup to see
if a given property exists, which doesn't seem particularly expensive.

Is it just an inefficient check in bzr-svn, or is it something about the
API that makes it inefficient? (Such as passing the revision id, rather
than the Revision object, etc.)

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkoaSUQACgkQJdeBCYSNAANo1gCgkBu6/3hvodqv1B58Hb2rG3RN
24oAoKAFUkMLGQs3/AVnT3Ed6End04EM
=+5+n
-----END PGP SIGNATURE-----

Revision history for this message
Ian Clatworthy (ian-clatworthy) wrote :

John A Meinel wrote:
> I think it would be worthwhile to figure out why this is so expensive.
> In general, it seems like it should just be a single dict lookup to see
> if a given property exists, which doesn't seem particularly expensive.
>
> Is it just an inefficient check in bzr-svn, or is it something about the
> API that makes it inefficient? (Such as passing the revision id, rather
> than the Revision object, etc.)
>
It's slow for 2 reasons:

1. a custom property handler is *always* registered for calculating
    foreign properties. So simply testing if the registry is empty isn't
   good enough to short-circuit the normal processing loop.

2. bzrlib/foreign.show_foreign_properties() tries to pull apart
    *every* revision-id to see if it came from a foreign source.

Over 250K revisions, these effects add up with the second one being
particularly expensive. It would be nice if there was a single check we
could do (per branch or repo say) to decide whether to do the above but
I can't see how we can, given the possibility that just a few revisions
here or there (in a merge say) came from a foreign source. Hence my
suggestion that they be switched off by default, so only users
interested in them pay the cost.

BTW, hg has a nice feature where options can be configured per command
per branch (by adding sections to their branch-specific config file). I
think that sort of flexibility is useful in case like this (so that
those working with foreign branches can enable --show-custom by default
when they care without going as far as a per user alias). Just a thought.

Ian C.

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

> It turns out that checking for custom properties when logging, particularly
> checking for foreign VCS properties, is pretty expensive. This patch switches
> their display off by default and introduces a new option to enable them:
> --show-custom. If that's considered heavy-handed, we could find a smarter way
> to skip foreign VCS property calculation (I couldn't see a way given the
> current algorithm) or make the option --show-foreign, leaving non-foreign
> custom properties on by default.
>
> On OpenOffice.org with 260K revisions, this patch reduces the time by 10% from
> 50 seconds to 45 seconds. This improvement complements my faster-log branch:
> together they reduce the overall time from 50+ seconds to 41 seconds.
At least when run against a foreign repository itself, we know directly which function to use. This would mean we need some extra state in log so we couldn't just use a "standard" custom revision property display hook but we should be able to at least show foreign ids when run directly against a foreign repo.

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

I think we can improve the performance of this without introducing a new argument.

I've discussed this on IRC with igc; I'll have a look at providing an alternative implementation that is faster.

review: Needs Fixing
Revision history for this message
Ian Clatworthy (ian-clatworthy) wrote :

> I think we can improve the performance of this without introducing a new
> argument.
>
> I've discussed this on IRC with igc; I'll have a look at providing an
> alternative implementation that is faster.

Cool. I'll mark this merge proposal as 'work in progress' accordingly. After you're done, we can see whether the option still has value or not in forcing a particular behaviour.

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

> > I think we can improve the performance of this without introducing a new
> > argument.
> >
> > I've discussed this on IRC with igc; I'll have a look at providing an
> > alternative implementation that is faster.
>
> Cool. I'll mark this merge proposal as 'work in progress' accordingly. After
> you're done, we can see whether the option still has value or not in forcing a
> particular behaviour.
The performance improvements for foreign vcs fields in log have landed. Is this still a problem for log, or can this review be marked superseded?

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

> > I think we can improve the performance of this without introducing a new
> > argument.
> >
> > I've discussed this on IRC with igc; I'll have a look at providing an
> > alternative implementation that is faster.
>
> Cool. I'll mark this merge proposal as 'work in progress' accordingly. After
> you're done, we can see whether the option still has value or not in forcing a
> particular behaviour.
The performance improvements for foreign vcs fields in log have landed. Is this still a problem for log, or can this review be marked superseded?

Unmerged revisions

4380. By Ian Clatworthy

log --show-custom

4379. By Canonical.com Patch Queue Manager <email address hidden>

(tanner) merge 1.15final back to trunk

4378. By Canonical.com Patch Queue Manager <email address hidden>

(igc) faster branch in a shared repo for dev6rr format (Ian
 Clatworthy)

4377. By Canonical.com Patch Queue Manager <email address hidden>

(igc) make branch --no-tree faster (Ian Clatworthy)

4376. By Canonical.com Patch Queue Manager <email address hidden>

(Jelmer) Return BranchPushResult from InterBranch.lossy_push().

4375. By Canonical.com Patch Queue Manager <email address hidden>

(Jelmer) Add child_submit_format option used by 'bzr send'.

4374. By Canonical.com Patch Queue Manager <email address hidden>

(abentley) Enable committing from a PreviewTree.

4373. By Canonical.com Patch Queue Manager <email address hidden>

(Jelmer) Mention --default-rich-root rather than a particular format
 in the user guide.

4372. By Canonical.com Patch Queue Manager <email address hidden>

(igc) added osutils.parent_directories() (Ian Clatworthy)

4371. By Canonical.com Patch Queue Manager <email address hidden>

(Jelmer) Make dpush help/error a bit more generic.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'NEWS'
--- NEWS 2009-08-04 17:52:34 +0000
+++ NEWS 2009-08-04 20:40:40 +0000
@@ -13,6 +13,10 @@
13Compatibility Breaks13Compatibility Breaks
14********************14********************
1515
16* ``bzr log`` now hides custom/foreign properties by default.
17 To display them, use the new ``--show-custom`` option.
18 (Ian Clatworthy)
19
16New Features20New Features
17************21************
1822
1923
=== modified file 'bzrlib/builtins.py'
--- bzrlib/builtins.py 2009-07-27 06:22:57 +0000
+++ bzrlib/builtins.py 2009-08-04 20:40:40 +0000
@@ -2221,6 +2221,8 @@
2221 help='Show changes made in each revision as a patch.'),2221 help='Show changes made in each revision as a patch.'),
2222 Option('include-merges',2222 Option('include-merges',
2223 help='Show merged revisions like --levels 0 does.'),2223 help='Show merged revisions like --levels 0 does.'),
2224 Option('show-custom',
2225 help='Show custom/foreign properties.'),
2224 ]2226 ]
2225 encoding_type = 'replace'2227 encoding_type = 'replace'
22262228
@@ -2236,7 +2238,8 @@
2236 message=None,2238 message=None,
2237 limit=None,2239 limit=None,
2238 show_diff=False,2240 show_diff=False,
2239 include_merges=False):2241 include_merges=False,
2242 show_custom=False):
2240 from bzrlib.log import (2243 from bzrlib.log import (
2241 Logger,2244 Logger,
2242 make_log_request_dict,2245 make_log_request_dict,
@@ -2312,7 +2315,8 @@
2312 show_timezone=timezone,2315 show_timezone=timezone,
2313 delta_format=get_verbosity_level(),2316 delta_format=get_verbosity_level(),
2314 levels=levels,2317 levels=levels,
2315 show_advice=levels is None)2318 show_advice=levels is None,
2319 show_custom=show_custom)
23162320
2317 # Choose the algorithm for doing the logging. It's annoying2321 # Choose the algorithm for doing the logging. It's annoying
2318 # having multiple code paths like this but necessary until2322 # having multiple code paths like this but necessary until
23192323
=== modified file 'bzrlib/log.py'
--- bzrlib/log.py 2009-06-10 03:56:49 +0000
+++ bzrlib/log.py 2009-08-04 20:40:40 +0000
@@ -1291,7 +1291,8 @@
1291 preferred_levels = 01291 preferred_levels = 0
12921292
1293 def __init__(self, to_file, show_ids=False, show_timezone='original',1293 def __init__(self, to_file, show_ids=False, show_timezone='original',
1294 delta_format=None, levels=None, show_advice=False):1294 delta_format=None, levels=None, show_advice=False,
1295 show_custom=True):
1295 """Create a LogFormatter.1296 """Create a LogFormatter.
12961297
1297 :param to_file: the file to output to1298 :param to_file: the file to output to
@@ -1303,6 +1304,8 @@
1303 let the log formatter decide.1304 let the log formatter decide.
1304 :param show_advice: whether to show advice at the end of the1305 :param show_advice: whether to show advice at the end of the
1305 log or not1306 log or not
1307 :param show_custom: whether to show custom properties like
1308 foreign VCS revision ids
1306 """1309 """
1307 self.to_file = to_file1310 self.to_file = to_file
1308 # 'exact' stream used to show diff, it should print content 'as is'1311 # 'exact' stream used to show diff, it should print content 'as is'
@@ -1317,6 +1320,7 @@
1317 self.levels = levels1320 self.levels = levels
1318 self._show_advice = show_advice1321 self._show_advice = show_advice
1319 self._merge_count = 01322 self._merge_count = 0
1323 self._show_custom = show_custom
13201324
1321 def get_levels(self):1325 def get_levels(self):
1322 """Get the number of levels to display or 0 for all."""1326 """Get the number of levels to display or 0 for all."""
@@ -1397,12 +1401,19 @@
13971401
1398 If a registered handler raises an error it is propagated.1402 If a registered handler raises an error it is propagated.
1399 """1403 """
1404<<<<<<< TREE
1400 for key, handler in properties_handler_registry.iteritems():1405 for key, handler in properties_handler_registry.iteritems():
1401 self._write_properties(indent, handler(revision))1406 self._write_properties(indent, handler(revision))
14021407
1403 def _write_properties(self, indent, properties):1408 def _write_properties(self, indent, properties):
1404 for key, value in properties.items():1409 for key, value in properties.items():
1405 self.to_file.write(indent + key + ': ' + value + '\n')1410 self.to_file.write(indent + key + ': ' + value + '\n')
1411=======
1412 if self._show_custom:
1413 for key, handler in properties_handler_registry.iteritems():
1414 for key, value in handler(revision).items():
1415 self.to_file.write(indent + key + ': ' + value + '\n')
1416>>>>>>> MERGE-SOURCE
14061417
1407 def show_diff(self, to_file, diff, indent):1418 def show_diff(self, to_file, diff, indent):
1408 for l in diff.rstrip().split('\n'):1419 for l in diff.rstrip().split('\n'):
14091420
=== modified file 'bzrlib/tests/blackbox/test_log.py'
--- bzrlib/tests/blackbox/test_log.py 2009-06-10 03:56:49 +0000
+++ bzrlib/tests/blackbox/test_log.py 2009-08-04 20:40:40 +0000
@@ -329,6 +329,15 @@
329 ['log', '--timezone', 'foo'])329 ['log', '--timezone', 'foo'])
330330
331331
332class TestLogCustomProperties(TestLog):
333
334 def test_log_custom_properties(self):
335 # Test --show-custom is legal. Actually formatting tested elsewhere
336 self.make_minimal_branch()
337 out, err = self.run_bzr(['log', '--show-custom'])
338 self.assertEqual('', err)
339
340
332class TestLogVerbose(TestLog):341class TestLogVerbose(TestLog):
333342
334 def setUp(self):343 def setUp(self):
335344
=== modified file 'bzrlib/tests/test_log.py'
--- bzrlib/tests/test_log.py 2009-05-08 13:39:32 +0000
+++ bzrlib/tests/test_log.py 2009-08-04 20:40:40 +0000
@@ -685,34 +685,37 @@
685''',685''',
686 sio.getvalue())686 sio.getvalue())
687687
688 def get_log_with_properties(self, formatter):
689 wt = self.make_branch_and_tree('.')
690 b = wt.branch
691 self.build_tree(['a'])
692 wt.add('a')
693 b.nick = 'test_properties_in_log'
694 wt.commit(message='add a',
695 timestamp=1132711707,
696 timezone=36000,
697 committer='Lorem Ipsum <test@example.com>',
698 authors=['John Doe <jdoe@example.com>'])
699 try:
700 def trivial_custom_prop_handler(revision):
701 return {'test_prop':'test_value'}
702
703 log.properties_handler_registry.register(
704 'trivial_custom_prop_handler',
705 trivial_custom_prop_handler)
706 log.show_log(b, formatter)
707 finally:
708 log.properties_handler_registry.remove(
709 'trivial_custom_prop_handler')
710
688 def test_properties_in_log(self):711 def test_properties_in_log(self):
689 """Log includes the custom properties returned by the registered712 """Log includes the custom properties returned by the registered
690 handlers.713 handlers.
691 """714 """
692 wt = self.make_branch_and_tree('.')
693 b = wt.branch
694 self.build_tree(['a'])
695 wt.add('a')
696 b.nick = 'test_properties_in_log'
697 wt.commit(message='add a',
698 timestamp=1132711707,
699 timezone=36000,
700 committer='Lorem Ipsum <test@example.com>',
701 authors=['John Doe <jdoe@example.com>'])
702 sio = StringIO()715 sio = StringIO()
703 formatter = log.LongLogFormatter(to_file=sio)716 formatter = log.LongLogFormatter(to_file=sio)
704 try:717 self.get_log_with_properties(formatter)
705 def trivial_custom_prop_handler(revision):718 self.assertEqualDiff('''\
706 return {'test_prop':'test_value'}
707
708 log.properties_handler_registry.register(
709 'trivial_custom_prop_handler',
710 trivial_custom_prop_handler)
711 log.show_log(b, formatter)
712 finally:
713 log.properties_handler_registry.remove(
714 'trivial_custom_prop_handler')
715 self.assertEqualDiff('''\
716------------------------------------------------------------719------------------------------------------------------------
717revno: 1720revno: 1
718test_prop: test_value721test_prop: test_value
@@ -722,43 +725,52 @@
722timestamp: Wed 2005-11-23 12:08:27 +1000725timestamp: Wed 2005-11-23 12:08:27 +1000
723message:726message:
724 add a727 add a
725''',728''', sio.getvalue())
726 sio.getvalue())
727729
728 def test_properties_in_short_log(self):730 def test_properties_in_short_log(self):
729 """Log includes the custom properties returned by the registered731 """Log includes the custom properties returned by the registered
730 handlers.732 handlers.
731 """733 """
732 wt = self.make_branch_and_tree('.')
733 b = wt.branch
734 self.build_tree(['a'])
735 wt.add('a')
736 b.nick = 'test_properties_in_short_log'
737 wt.commit(message='add a',
738 timestamp=1132711707,
739 timezone=36000,
740 committer='Lorem Ipsum <test@example.com>',
741 authors=['John Doe <jdoe@example.com>'])
742 sio = StringIO()734 sio = StringIO()
743 formatter = log.ShortLogFormatter(to_file=sio)735 formatter = log.ShortLogFormatter(to_file=sio)
744 try:736 self.get_log_with_properties(formatter)
745 def trivial_custom_prop_handler(revision):737 self.assertEqualDiff('''\
746 return {'test_prop':'test_value'}
747
748 log.properties_handler_registry.register(
749 'trivial_custom_prop_handler',
750 trivial_custom_prop_handler)
751 log.show_log(b, formatter)
752 finally:
753 log.properties_handler_registry.remove(
754 'trivial_custom_prop_handler')
755 self.assertEqualDiff('''\
756 1 John Doe\t2005-11-23738 1 John Doe\t2005-11-23
757 test_prop: test_value739 test_prop: test_value
758 add a740 add a
759741
760''',742''', sio.getvalue())
761 sio.getvalue())743
744 def test_properties_in_log_off(self):
745 """Log excludes the custom properties returned by the registered
746 handlers.
747 """
748 sio = StringIO()
749 formatter = log.LongLogFormatter(to_file=sio, show_custom=False)
750 self.get_log_with_properties(formatter)
751 self.assertEqualDiff('''\
752------------------------------------------------------------
753revno: 1
754author: John Doe <jdoe@example.com>
755committer: Lorem Ipsum <test@example.com>
756branch nick: test_properties_in_log
757timestamp: Wed 2005-11-23 12:08:27 +1000
758message:
759 add a
760''', sio.getvalue())
761
762 def test_properties_in_short_log_off(self):
763 """Log excludes the custom properties returned by the registered
764 handlers.
765 """
766 sio = StringIO()
767 formatter = log.ShortLogFormatter(to_file=sio, show_custom=False)
768 self.get_log_with_properties(formatter)
769 self.assertEqualDiff('''\
770 1 John Doe\t2005-11-23
771 add a
772
773''', sio.getvalue())
762774
763 def test_error_in_properties_handler(self):775 def test_error_in_properties_handler(self):
764 """Log includes the custom properties returned by the registered776 """Log includes the custom properties returned by the registered