Merge lp:~maria-captains/bzr/diffoptions into lp:bzr

Proposed by Martin Packman
Status: Work in progress
Proposed branch: lp:~maria-captains/bzr/diffoptions
Merge into: lp:bzr
Diff against target: 24 lines (+12/-2)
1 file modified
bzrlib/diff.py (+12/-2)
To merge this branch: bzr merge lp:~maria-captains/bzr/diffoptions
Reviewer Review Type Date Requested Status
Andrew Bennetts Needs Fixing
Jelmer Vernooij (community) Abstain
Sergei Golubchik Pending
Review via email: mp+54139@code.launchpad.net

Description of the change

[GZ: Starting fresh review which I think was Sergei's intention in the earlier merge proposal:
<https://code.launchpad.net/~maria-captains/bzr/diffoptions/+merge/52325>
]

Thanks, John, for the discussion.

Let me summarize:
1. It's not always easy to find the "current branch" to be able to get its config.
2. having "-F regex" specified only globally is not flexible enough, as different branches (e.g. one with the C code, another with the Python code) may need different regexps.
2.1 Actually, even one regexp per branch is not enough, as one branch may have both python and C files.
3. Allowing the user to specify arbitrary diff options that affect all generated diffs is too dangerous - "diff -w" will break merge bundles (and "diff -c" breaks bzr shelve and everything that parses diffs).
3.1 But "diff -p" seems to be safe everywhere.

So, I took another approach. It's more strict and more flexible at the same time. More flexible as in the 2.1 above (regexp can be specified per file or a group of files) and more strict as in the 3.1 above (only diff -p - really, diff -F - is allowed).

The patch uses rules to find the regexp for a given tree and for a given file. And runs external diff with -F regexp command-line option.

Please, see the patch. It lacks tests and the documentation is not updated - but it works, and shows the approach.

To post a comment you must log in.
Revision history for this message
Martin Packman (gz) wrote :

+ # if the user requested function names in the diff
+ # we use the external diff

My main issue with this kind of change is that actually using an external diff rather than the bzr internal one *can* be a big behaviour change. Some people don't have an external diff installed at all, and encoding handling is completely different.

If it's happening in non-obvious circumstances, we risk confusing users over when diff works the way they expect and when it breaks. Implementing '-F' in the bzr internal differ is way more work, but would also do what you want here I think.

Revision history for this message
Sergei Golubchik (sergii) wrote :

Hi, Martin!

Thanks for looking into this!

On Mar 21, Martin [gz] wrote:
> + # if the user requested function names in the diff
> + # we use the external diff
>
> My main issue with this kind of change is that actually using an
> external diff rather than the bzr internal one *can* be a big
> behaviour change. Some people don't have an external diff installed at
> all, and encoding handling is completely different.

This is, of course, true. But whether to use this feature is a user's
decision. For example, I suspect that Windowes users (that tend to not
have external diff installed) simply won't set the regex in the
rules file. On the other hand, this feature should work quite good for
lots of users and projects.

> If it's happening in non-obvious circumstances, we risk confusing
> users over when diff works the way they expect and when it breaks.
> Implementing '-F' in the bzr internal differ is way more work, but
> would also do what you want here I think.

Yes, I looked into it too. But a proper solution would require
modifying bzr diff algorithms (both C and python versions, for
consistency) which is not only much more work but also much more risky
approach, as it affects the core functionality of bzr, affects what's
stored in the repository, bundles, everything.

My change, on the other hand, is reasonably small, simple, and safe - it
only affects what is shown to user, all the internal algorithms and data
structures are not affected,

Regards,
Sergei

Revision history for this message
Jelmer Vernooij (jelmer) :
review: Abstain
Revision history for this message
Andrew Bennetts (spiv) wrote :

I'm sorry merge proposal was stalled for so long. I think somehow it was missing a review request for bzr-core, so it wasn't showing in the “Reviews I have to do” section of <https://code.launchpad.net/bzr/+activereviews>.

This looks like a nice idea to me. Thanks for proposing it!

I agree with mgz that'd be nice to support this in the internal diff, but it would be significantly more work (although I'd guess it might not be very hard either).

I like the idea of using the rules file for this setting. We do want to take care I think to distinguish between “diff for 'bzr diff' output” and “diff used by merge directives etc”. I'm not sure that the current patch does that, although I'm not very familiar with how other parts of bzrlib invoke this module.

Some superficial comments on the code itself:

 * compare functions, like internal_diff, using the 'is' operator rather than the '==' operator.
 * don't assume that the result of iter_search_rules(…).next() is a tuple; I think at the moment that happens to be the implementation, but that could easily change to e.g. a list or generator, which would then cause your “diff_show_function_re == ()” check to give the wrong answer in some cases.

This will need tests to be mergeable. Some cases that immediately occur to me:

 * what happens if rules file specifies a diff_show_function_re for a file, but no external diff could be run? I think ideally we'd warn the user (once per bzr invocation, not once per file diffed!) and fallback to internal diff.
 * what happens if the user does “bzr diff --diff-options=…”? I think the intent of your code is to ignore any diff_show_function_re setting in rules in this case, which seems reasonable to me, but it deserves a test.

review: Needs Fixing
Revision history for this message
Martin Pool (mbp) wrote :

This is such a small diff and it's been sitting for so long. I'll see about writing some tests for it. It also needs to be mentioned in the documentation.

I filed https://bugs.launchpad.net/bzr/+bug/764108 to track this.

Revision history for this message
Martin Pool (mbp) wrote :

I'm going to pop this out of the review queue, because it definitely does need more work. But I have bug 764108 assigned to me, and I am going to finish it off under our patch pilot programme.

Revision history for this message
Martin Pool (mbp) wrote :

Unmerged revisions

5729. By Sergei Golubchik

support diff -F in all generated diffs
with per-file configuration rules

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/diff.py'
2--- bzrlib/diff.py 2010-08-20 06:49:00 +0000
3+++ bzrlib/diff.py 2011-03-21 00:31:27 +0000
4@@ -709,8 +709,18 @@
5 try:
6 from_text = _get_text(self.old_tree, from_file_id, from_path)
7 to_text = _get_text(self.new_tree, to_file_id, to_path)
8- self.text_differ(from_label, from_text, to_label, to_text,
9- self.to_file, path_encoding=self.path_encoding)
10+ # if the user requested function names in the diff
11+ # we use the external diff
12+ diff_show_function_re = ()
13+ if from_path and self.text_differ == internal_diff:
14+ diff_show_function_re = self.old_tree.iter_search_rules(
15+ [from_path], ['diff_show_function_regex']).next()
16+ if diff_show_function_re != ():
17+ external_diff(from_label, from_text, to_label, to_text,
18+ self.to_file, ['-F', diff_show_function_re[0][1]])
19+ else:
20+ self.text_differ(from_label, from_text, to_label, to_text,
21+ self.to_file, path_encoding=self.path_encoding)
22 except errors.BinaryFile:
23 self.to_file.write(
24 ("Binary files %s and %s differ\n" %