Merge lp:~ian-clatworthy/bzr/log-show-custom into lp:~bzr/bzr/trunk-old

Proposed by Ian Clatworthy
Status: Rejected
Rejected by: Ian Clatworthy
Proposed branch: lp:~ian-clatworthy/bzr/log-show-custom
Merge into: lp:~bzr/bzr/trunk-old
Diff against target: 261 lines (has conflicts)
Text conflict in bzrlib/log.py
To merge this branch: bzr merge lp:~ian-clatworthy/bzr/log-show-custom
Reviewer Review Type Date Requested Status
Jelmer Vernooij (community) Needs Fixing
Review via email: mp+6770@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Ian Clatworthy (ian-clatworthy) wrote :

It turns out that checking for custom properties when logging, particularly checking for foreign VCS properties, is pretty expensive. This patch switches their display off by default and introduces a new option to enable them: --show-custom. If that's considered heavy-handed, we could find a smarter way to skip foreign VCS property calculation (I couldn't see a way given the current algorithm) or make the option --show-foreign, leaving non-foreign custom properties on by default.

On OpenOffice.org with 260K revisions, this patch reduces the time by 10% from 50 seconds to 45 seconds. This improvement complements my faster-log branch: together they reduce the overall time from 50+ seconds to 41 seconds.

Revision history for this message
Martin Pool (mbp) wrote :

2009/5/25 Ian Clatworthy <email address hidden>:
> Ian Clatworthy has proposed merging lp:~ian-clatworthy/bzr/log-show-custom into lp:bzr.
>
> Requested reviews:
>    bzr-core (bzr-core)
>
> It turns out that checking for custom properties when logging, particularly checking for foreign VCS properties, is pretty expensive. This patch switches their display off by default and introduces a new option to enable them: --show-custom. If that's considered heavy-handed, we could find a smarter way to skip foreign VCS property calculation (I couldn't see a way given the current algorithm) or make the option --show-foreign, leaving non-foreign custom properties on by default.

I think that's reasonable.

It's the kind of case where it might be nice for people to be able to
set the option on other than by aliasing the command. (Though
aliasing it is an adequate standby.)

--
Martin <http://launchpad.net/~mbp/>

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin Pool wrote:
> 2009/5/25 Ian Clatworthy <email address hidden>:
>> Ian Clatworthy has proposed merging lp:~ian-clatworthy/bzr/log-show-custom into lp:bzr.
>>
>> Requested reviews:
>> bzr-core (bzr-core)
>>
>> It turns out that checking for custom properties when logging, particularly checking for foreign VCS properties, is pretty expensive. This patch switches their display off by default and introduces a new option to enable them: --show-custom. If that's considered heavy-handed, we could find a smarter way to skip foreign VCS property calculation (I couldn't see a way given the current algorithm) or make the option --show-foreign, leaving non-foreign custom properties on by default.
>
> I think that's reasonable.
>
> It's the kind of case where it might be nice for people to be able to
> set the option on other than by aliasing the command. (Though
> aliasing it is an adequate standby.)
>

I think it would be worthwhile to figure out why this is so expensive.
In general, it seems like it should just be a single dict lookup to see
if a given property exists, which doesn't seem particularly expensive.

Is it just an inefficient check in bzr-svn, or is it something about the
API that makes it inefficient? (Such as passing the revision id, rather
than the Revision object, etc.)

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkoaSUQACgkQJdeBCYSNAANo1gCgkBu6/3hvodqv1B58Hb2rG3RN
24oAoKAFUkMLGQs3/AVnT3Ed6End04EM
=+5+n
-----END PGP SIGNATURE-----

Revision history for this message
Ian Clatworthy (ian-clatworthy) wrote :

John A Meinel wrote:
> I think it would be worthwhile to figure out why this is so expensive.
> In general, it seems like it should just be a single dict lookup to see
> if a given property exists, which doesn't seem particularly expensive.
>
> Is it just an inefficient check in bzr-svn, or is it something about the
> API that makes it inefficient? (Such as passing the revision id, rather
> than the Revision object, etc.)
>
It's slow for 2 reasons:

1. a custom property handler is *always* registered for calculating
    foreign properties. So simply testing if the registry is empty isn't
   good enough to short-circuit the normal processing loop.

2. bzrlib/foreign.show_foreign_properties() tries to pull apart
    *every* revision-id to see if it came from a foreign source.

Over 250K revisions, these effects add up with the second one being
particularly expensive. It would be nice if there was a single check we
could do (per branch or repo say) to decide whether to do the above but
I can't see how we can, given the possibility that just a few revisions
here or there (in a merge say) came from a foreign source. Hence my
suggestion that they be switched off by default, so only users
interested in them pay the cost.

BTW, hg has a nice feature where options can be configured per command
per branch (by adding sections to their branch-specific config file). I
think that sort of flexibility is useful in case like this (so that
those working with foreign branches can enable --show-custom by default
when they care without going as far as a per user alias). Just a thought.

Ian C.

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

> It turns out that checking for custom properties when logging, particularly
> checking for foreign VCS properties, is pretty expensive. This patch switches
> their display off by default and introduces a new option to enable them:
> --show-custom. If that's considered heavy-handed, we could find a smarter way
> to skip foreign VCS property calculation (I couldn't see a way given the
> current algorithm) or make the option --show-foreign, leaving non-foreign
> custom properties on by default.
>
> On OpenOffice.org with 260K revisions, this patch reduces the time by 10% from
> 50 seconds to 45 seconds. This improvement complements my faster-log branch:
> together they reduce the overall time from 50+ seconds to 41 seconds.
At least when run against a foreign repository itself, we know directly which function to use. This would mean we need some extra state in log so we couldn't just use a "standard" custom revision property display hook but we should be able to at least show foreign ids when run directly against a foreign repo.

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

I think we can improve the performance of this without introducing a new argument.

I've discussed this on IRC with igc; I'll have a look at providing an alternative implementation that is faster.

review: Needs Fixing
Revision history for this message
Ian Clatworthy (ian-clatworthy) wrote :

> I think we can improve the performance of this without introducing a new
> argument.
>
> I've discussed this on IRC with igc; I'll have a look at providing an
> alternative implementation that is faster.

Cool. I'll mark this merge proposal as 'work in progress' accordingly. After you're done, we can see whether the option still has value or not in forcing a particular behaviour.

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

> > I think we can improve the performance of this without introducing a new
> > argument.
> >
> > I've discussed this on IRC with igc; I'll have a look at providing an
> > alternative implementation that is faster.
>
> Cool. I'll mark this merge proposal as 'work in progress' accordingly. After
> you're done, we can see whether the option still has value or not in forcing a
> particular behaviour.
The performance improvements for foreign vcs fields in log have landed. Is this still a problem for log, or can this review be marked superseded?

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

> > I think we can improve the performance of this without introducing a new
> > argument.
> >
> > I've discussed this on IRC with igc; I'll have a look at providing an
> > alternative implementation that is faster.
>
> Cool. I'll mark this merge proposal as 'work in progress' accordingly. After
> you're done, we can see whether the option still has value or not in forcing a
> particular behaviour.
The performance improvements for foreign vcs fields in log have landed. Is this still a problem for log, or can this review be marked superseded?

Unmerged revisions

4380. By Ian Clatworthy

log --show-custom

4379. By Canonical.com Patch Queue Manager <email address hidden>

(tanner) merge 1.15final back to trunk

4378. By Canonical.com Patch Queue Manager <email address hidden>

(igc) faster branch in a shared repo for dev6rr format (Ian
 Clatworthy)

4377. By Canonical.com Patch Queue Manager <email address hidden>

(igc) make branch --no-tree faster (Ian Clatworthy)

4376. By Canonical.com Patch Queue Manager <email address hidden>

(Jelmer) Return BranchPushResult from InterBranch.lossy_push().

4375. By Canonical.com Patch Queue Manager <email address hidden>

(Jelmer) Add child_submit_format option used by 'bzr send'.

4374. By Canonical.com Patch Queue Manager <email address hidden>

(abentley) Enable committing from a PreviewTree.

4373. By Canonical.com Patch Queue Manager <email address hidden>

(Jelmer) Mention --default-rich-root rather than a particular format
 in the user guide.

4372. By Canonical.com Patch Queue Manager <email address hidden>

(igc) added osutils.parent_directories() (Ian Clatworthy)

4371. By Canonical.com Patch Queue Manager <email address hidden>

(Jelmer) Make dpush help/error a bit more generic.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2009-08-04 17:52:34 +0000
3+++ NEWS 2009-08-04 20:40:40 +0000
4@@ -13,6 +13,10 @@
5 Compatibility Breaks
6 ********************
7
8+* ``bzr log`` now hides custom/foreign properties by default.
9+ To display them, use the new ``--show-custom`` option.
10+ (Ian Clatworthy)
11+
12 New Features
13 ************
14
15
16=== modified file 'bzrlib/builtins.py'
17--- bzrlib/builtins.py 2009-07-27 06:22:57 +0000
18+++ bzrlib/builtins.py 2009-08-04 20:40:40 +0000
19@@ -2221,6 +2221,8 @@
20 help='Show changes made in each revision as a patch.'),
21 Option('include-merges',
22 help='Show merged revisions like --levels 0 does.'),
23+ Option('show-custom',
24+ help='Show custom/foreign properties.'),
25 ]
26 encoding_type = 'replace'
27
28@@ -2236,7 +2238,8 @@
29 message=None,
30 limit=None,
31 show_diff=False,
32- include_merges=False):
33+ include_merges=False,
34+ show_custom=False):
35 from bzrlib.log import (
36 Logger,
37 make_log_request_dict,
38@@ -2312,7 +2315,8 @@
39 show_timezone=timezone,
40 delta_format=get_verbosity_level(),
41 levels=levels,
42- show_advice=levels is None)
43+ show_advice=levels is None,
44+ show_custom=show_custom)
45
46 # Choose the algorithm for doing the logging. It's annoying
47 # having multiple code paths like this but necessary until
48
49=== modified file 'bzrlib/log.py'
50--- bzrlib/log.py 2009-06-10 03:56:49 +0000
51+++ bzrlib/log.py 2009-08-04 20:40:40 +0000
52@@ -1291,7 +1291,8 @@
53 preferred_levels = 0
54
55 def __init__(self, to_file, show_ids=False, show_timezone='original',
56- delta_format=None, levels=None, show_advice=False):
57+ delta_format=None, levels=None, show_advice=False,
58+ show_custom=True):
59 """Create a LogFormatter.
60
61 :param to_file: the file to output to
62@@ -1303,6 +1304,8 @@
63 let the log formatter decide.
64 :param show_advice: whether to show advice at the end of the
65 log or not
66+ :param show_custom: whether to show custom properties like
67+ foreign VCS revision ids
68 """
69 self.to_file = to_file
70 # 'exact' stream used to show diff, it should print content 'as is'
71@@ -1317,6 +1320,7 @@
72 self.levels = levels
73 self._show_advice = show_advice
74 self._merge_count = 0
75+ self._show_custom = show_custom
76
77 def get_levels(self):
78 """Get the number of levels to display or 0 for all."""
79@@ -1397,12 +1401,19 @@
80
81 If a registered handler raises an error it is propagated.
82 """
83+<<<<<<< TREE
84 for key, handler in properties_handler_registry.iteritems():
85 self._write_properties(indent, handler(revision))
86
87 def _write_properties(self, indent, properties):
88 for key, value in properties.items():
89 self.to_file.write(indent + key + ': ' + value + '\n')
90+=======
91+ if self._show_custom:
92+ for key, handler in properties_handler_registry.iteritems():
93+ for key, value in handler(revision).items():
94+ self.to_file.write(indent + key + ': ' + value + '\n')
95+>>>>>>> MERGE-SOURCE
96
97 def show_diff(self, to_file, diff, indent):
98 for l in diff.rstrip().split('\n'):
99
100=== modified file 'bzrlib/tests/blackbox/test_log.py'
101--- bzrlib/tests/blackbox/test_log.py 2009-06-10 03:56:49 +0000
102+++ bzrlib/tests/blackbox/test_log.py 2009-08-04 20:40:40 +0000
103@@ -329,6 +329,15 @@
104 ['log', '--timezone', 'foo'])
105
106
107+class TestLogCustomProperties(TestLog):
108+
109+ def test_log_custom_properties(self):
110+ # Test --show-custom is legal. Actually formatting tested elsewhere
111+ self.make_minimal_branch()
112+ out, err = self.run_bzr(['log', '--show-custom'])
113+ self.assertEqual('', err)
114+
115+
116 class TestLogVerbose(TestLog):
117
118 def setUp(self):
119
120=== modified file 'bzrlib/tests/test_log.py'
121--- bzrlib/tests/test_log.py 2009-05-08 13:39:32 +0000
122+++ bzrlib/tests/test_log.py 2009-08-04 20:40:40 +0000
123@@ -685,34 +685,37 @@
124 ''',
125 sio.getvalue())
126
127+ def get_log_with_properties(self, formatter):
128+ wt = self.make_branch_and_tree('.')
129+ b = wt.branch
130+ self.build_tree(['a'])
131+ wt.add('a')
132+ b.nick = 'test_properties_in_log'
133+ wt.commit(message='add a',
134+ timestamp=1132711707,
135+ timezone=36000,
136+ committer='Lorem Ipsum <test@example.com>',
137+ authors=['John Doe <jdoe@example.com>'])
138+ try:
139+ def trivial_custom_prop_handler(revision):
140+ return {'test_prop':'test_value'}
141+
142+ log.properties_handler_registry.register(
143+ 'trivial_custom_prop_handler',
144+ trivial_custom_prop_handler)
145+ log.show_log(b, formatter)
146+ finally:
147+ log.properties_handler_registry.remove(
148+ 'trivial_custom_prop_handler')
149+
150 def test_properties_in_log(self):
151 """Log includes the custom properties returned by the registered
152 handlers.
153 """
154- wt = self.make_branch_and_tree('.')
155- b = wt.branch
156- self.build_tree(['a'])
157- wt.add('a')
158- b.nick = 'test_properties_in_log'
159- wt.commit(message='add a',
160- timestamp=1132711707,
161- timezone=36000,
162- committer='Lorem Ipsum <test@example.com>',
163- authors=['John Doe <jdoe@example.com>'])
164 sio = StringIO()
165 formatter = log.LongLogFormatter(to_file=sio)
166- try:
167- def trivial_custom_prop_handler(revision):
168- return {'test_prop':'test_value'}
169-
170- log.properties_handler_registry.register(
171- 'trivial_custom_prop_handler',
172- trivial_custom_prop_handler)
173- log.show_log(b, formatter)
174- finally:
175- log.properties_handler_registry.remove(
176- 'trivial_custom_prop_handler')
177- self.assertEqualDiff('''\
178+ self.get_log_with_properties(formatter)
179+ self.assertEqualDiff('''\
180 ------------------------------------------------------------
181 revno: 1
182 test_prop: test_value
183@@ -722,43 +725,52 @@
184 timestamp: Wed 2005-11-23 12:08:27 +1000
185 message:
186 add a
187-''',
188- sio.getvalue())
189+''', sio.getvalue())
190
191 def test_properties_in_short_log(self):
192 """Log includes the custom properties returned by the registered
193 handlers.
194 """
195- wt = self.make_branch_and_tree('.')
196- b = wt.branch
197- self.build_tree(['a'])
198- wt.add('a')
199- b.nick = 'test_properties_in_short_log'
200- wt.commit(message='add a',
201- timestamp=1132711707,
202- timezone=36000,
203- committer='Lorem Ipsum <test@example.com>',
204- authors=['John Doe <jdoe@example.com>'])
205 sio = StringIO()
206 formatter = log.ShortLogFormatter(to_file=sio)
207- try:
208- def trivial_custom_prop_handler(revision):
209- return {'test_prop':'test_value'}
210-
211- log.properties_handler_registry.register(
212- 'trivial_custom_prop_handler',
213- trivial_custom_prop_handler)
214- log.show_log(b, formatter)
215- finally:
216- log.properties_handler_registry.remove(
217- 'trivial_custom_prop_handler')
218- self.assertEqualDiff('''\
219+ self.get_log_with_properties(formatter)
220+ self.assertEqualDiff('''\
221 1 John Doe\t2005-11-23
222 test_prop: test_value
223 add a
224
225-''',
226- sio.getvalue())
227+''', sio.getvalue())
228+
229+ def test_properties_in_log_off(self):
230+ """Log excludes the custom properties returned by the registered
231+ handlers.
232+ """
233+ sio = StringIO()
234+ formatter = log.LongLogFormatter(to_file=sio, show_custom=False)
235+ self.get_log_with_properties(formatter)
236+ self.assertEqualDiff('''\
237+------------------------------------------------------------
238+revno: 1
239+author: John Doe <jdoe@example.com>
240+committer: Lorem Ipsum <test@example.com>
241+branch nick: test_properties_in_log
242+timestamp: Wed 2005-11-23 12:08:27 +1000
243+message:
244+ add a
245+''', sio.getvalue())
246+
247+ def test_properties_in_short_log_off(self):
248+ """Log excludes the custom properties returned by the registered
249+ handlers.
250+ """
251+ sio = StringIO()
252+ formatter = log.ShortLogFormatter(to_file=sio, show_custom=False)
253+ self.get_log_with_properties(formatter)
254+ self.assertEqualDiff('''\
255+ 1 John Doe\t2005-11-23
256+ add a
257+
258+''', sio.getvalue())
259
260 def test_error_in_properties_handler(self):
261 """Log includes the custom properties returned by the registered