Robert Collins wrote:
> Review: Disapprove
> On Tue, 2009-05-12 at 14:05 +0000, Ian Clatworthy wrote:
>   
>> t's possible to get further improvements here by pushing more
>> complexity down into iter_changes as bug #347649 suggests. Given
>> selective commit performance is down to 0.27 seconds for Emacs with 3K
>> files and 105K revisions, I'm not sure that's necessary yet. That path
>> is certainly a lot more work than this patch, given the multiple
>> implementations of iter_changes() around the place and the tuning that
>> has gone into them to date.
>>     
>
> This feels like a bit of a bandaid, and I don't think that the lack of
> failures is an indication of correctness - there weren't tests for the
> failures that bug 347649 refers to. For instance, simple inspection of
> the patch suggests to me that you can miss the root id - e.g. a
> pathological case would be a selective first commit with the root
> skipped.
>
>   
I'd expect bzr commit -x '' to be a pointless commit. I suspect my code
will error in exactly that way. It's a *very* pathological case as well:
no sane person would do that on the command line (though it may happen
via the API).
> I appreciate not wanting to monkey with iter_changes, but the problem
> with doing this just up at the commit level is that you have to black
> box test. I don't think its safe to put these particular changes into
> commit without specific tests.
>
>   
"bzr selftest commit" runs 2335 tests so I think it's test coverage is
pretty comprehensive. Lots of tests broke during the course of putting
this patch together fwiw, e.g. simply passing specific_files to
iter_changes breaks the test suite so bug 347649 is indeed trapped by tests.

> If you were to put the masking support into iter-changes as a optional
> decorating generator (*)you could using test iter-changes quite easily
> (using the intertree tests) - if its fast enough for commit, then as you
> say its likely fast enough in general, and by preserving the layering we
> can optimise later. And unit tests for iter_changes will actually speak
> to the core issue rather than circumstance, which is the most a test on
> commit itself could do.
>
> e.g.
> def iter_changes_exclude_paths(changes, paths)
>     for change in changes:
>         if change[0][1] is_inside_any(paths):
>             continue
>     yield change
>
> def iter_changes_include_added_parents(changes, specific_paths):
>    ....
>
>
> def iter_changes(....)
>     result = self._iter_changes(...)
>     if specific:
>         result = iter_changes_include_added_parents(result, foo)
>     if exclude:
>         result = iter_changes_exclude_paths(result, foo)
>     return result
>
>   
That still implies adding 2 new options to *multiple* implementations of
iter_changes: yield_parents and exclude_files, say. That's a far bigger
change than this for no additional performance gain. Longer term, I'd be
surprised if pushing exclude_files down could gain us much. Yielding
parents in-situ could be *slightly* faster though but it would take a
large delta before any speed difference was noticed I suspect.
>  review disapprove
This patch is an interim step - we really want to leverage the
specific_files masking at the layer below so someone running 'bzr commit
foo/bar' on the whole Debian tree gets a fast response that this patch
delivers. Having said that, I don't think this patch is a band-aid and
it meets all our criteria for approval: code quality is good, test
coverage is not reduced and performance is improved by a factor of 4(!!)
on a common operation. I think it ought to be landed as a step forward
while I work on a faster solution. But that's just me.

Ian C.