Merge lp:~ace17/bzr/fixDiff4 into lp:bzr

Proposed by Sebastien Alaiwan on 2014-06-12
Status: Needs review
Proposed branch: lp:~ace17/bzr/fixDiff4
Merge into: lp:bzr
Diff against target: 31 lines (+10/-9)
1 file modified
bzrlib/diff.py (+10/-9)
To merge this branch: bzr merge lp:~ace17/bzr/fixDiff4
Reviewer Review Type Date Requested Status
Richard Wilbur 2014-06-12 Needs Fixing on 2014-09-24
Review via email: mp+222900@code.launchpad.net

Description of the Change

Fixes a bug causing --diff-options being passed to the external GNU diff program when a custom diff tool has been specified.

ace@ANTEC bzr-test % bzr status -V
added:
  File1
modified:
  File2
ace@ANTEC bzr-test % bzr diff --using gvimdiff --diff-options "-f"
=== added file 'File1'
diff: conflicting output style options
diff: Try `diff --help' for more information.

bzr: ERROR: external diff failed with exit code 2; command: ['diff', '--label', u'File1\t1970-01-01 00:00:00 +0000', '/tmp/bzr-diff-old-x6Y5up', '--label', u'File1\t2014-06-11 17:22:29 +0000', '/tmp/bzr-diff-new-MNV8Ut', '--binary', '-u', '-f']

To post a comment you must log in.
Richard Wilbur (richard-wilbur) wrote :
Download full text (5.4 KiB)

Here's what I'm noticing now that I'm digging into this problem a little more deeply:
0. First of all, I'm testing on trunk code
rwilbur@ordinate:~/src/bzr$ ./bzr --version
Bazaar (bzr) 2.7.0dev1
  from bzr checkout /home/rwilbur/src/bzr
    revision: 6597
    revid: <email address hidden>
    branch nick: bzr
  Python interpreter: /usr/bin/python 2.7.3
  Python standard library: /usr/lib/python2.7
  Platform: Linux-3.2.0-64-generic-pae-i686-with-Ubuntu-12.04-precise
[...]
1. You have one new file. 'File1' was just added and so has no previous contents in the working tree to calculate a difference from, so bzrlib/diff.py:DiffTree._show_diff() prints the banner
'=== added file 'File1' and then runs a special diff to handle no previous file. bzr creates an empty file with the date and time of the UNIX epoch as a signal to 'patch' that this is a new file and then runs the diff'ers from a list of DiffPath objects till one of them returns a result which is not DiffPath.CANNOT_DIFF or we run off the end of the list.

As your added 'File1' is lexically earlier than your modified 'File2' and the option '-f' is not a valid option for 'diff', the command ends with the error reported by diff when invoked with '-u -f'. I created a similar test case except my modified 'test.text' is lexically before my added 'test2.text' and so when I get an error on my added file it occurs after already processing the modified file, shown below.

~/src/bzr$ ./bzr status ../test
added:
  test2.text
modified:
  test.text
~/src/bzr$ ./bzr diff ../test/
=== modified file 'test.text'
--- test.text 2014-06-15 18:31:57 +0000
+++ test.text 2014-06-15 18:32:54 +0000
@@ -1,1 +1,2 @@
 test
+testing

=== added file 'test2.text'
--- test2.text 1970-01-01 00:00:00 +0000
+++ test2.text 2014-06-15 21:25:02 +0000
@@ -0,0 +1,1 @@
+test2

Now I invoke it with a '--using' clause to test.

~/src/bzr$ ./bzr diff --using=/bin/ls ../test/
=== modified file 'test.text'
/home/rwilbur/src/test/test.text
/tmp/bzr-diff-609Ozf/old/test.text
=== added file 'test2.text'
--- test2.text 1970-01-01 00:00:00 +0000
+++ test2.text 2014-06-15 21:25:02 +0000
@@ -0,0 +1,1 @@
+test2

It is only used on the modified file as the diff is simulated for the added file. Next I tried using 'cat' instead of 'ls' and this time added a 'diff-options' clause that should number the lines.

~/src/bzr$ ./bzr diff --using=/bin/cat --diff-options='-n' ../test/
=== modified file 'test.text'
     1 test
     2 test
     3 testing
=== added file 'test2.text'
a0 1
test2

So, on the modified file the line from the previous content and the two lines from the new version are concatenated and sequentially numbered. In this case, I'm not seeing the diff options override the using clause(/bin/cat is still being invoked). The added file has different output because the '-n' diff option was used to output a RCS format diff.

~/src/bzr$ ./bzr diff --using=/bin/cat --diff-options='-f' ../test/
=== modified file 'test.text'
/bin/cat: invalid option -- 'f'
Try `/bin/cat --help' for more information.
=== added file 'test2.text'
diff: conflicting output style options
diff: Try `diff --help' for more informatio...

Read more...

review: Needs Fixing (add test)
Richard Wilbur (richard-wilbur) wrote :

@Sebastien - I finally got my branch merged that fixes the option parser. Looks like we are ready for your patch. Do you have time to add a test?

Sebastien Alaiwan (ace17) wrote :

Hi Richard,
Trying to write a test, but I'm not sure about the correctness of my patch.
Where are the "--diff-options" supposed to go?
Are they supposed to be passed to:
1) the "/usr/bin/diff" command that bazaar runs internally? (e.g when diffing an added file).
2) the program specified with "--using" ?
3) both?

Richard Wilbur (richard-wilbur) wrote :

Greetings Sebastien,
From what I have seen of 'bzr diff' behaviour, it looks like the diff_options should be sent to 'both'. Although the terminology used in bzrlib/diff.py is:
1. internal_diff is provided by bzr in concert with python difflib
2. "/usr/bin/diff" is called by external_diff
3. the program specified with "--using" is called by an extra factory provided by DiffFromTool

Looks like the comment for DiffTree.from_trees_options()
"""
[...]
        :param external_diff_options: If supplied, use the installed diff
            binary to perform file comparison, using supplied options.
[...]
"""

could use a little maintenance as you are specifically fixing this behaviour. Maybe something along the lines of the following?

        :param external_diff_options: If supplied, invoke an external
            file comparison program with these options--either the
            installed diff binary, or an alternate program if specified
            with '--using'.

review: Needs Fixing
Sebastien Alaiwan (ace17) wrote :

Hi Richard,
Thanks for your explanation.

I think we can't have both of the following:
1) bzr not using the "--using tool" for added files, falling back to /usr/bin/diff.
2) bzr passing the --diff-options to the "--using tool" *and* to the /usr/bin/diff.

If 1) and 2) are both true, then we can run into situations where the same value of --diff-options gets passed to two different tools, leading to trouble. Do you agree?

My patch has the effect of making 2) false: it prevents passing the --diff-options to the /usr/bin/diff. But one could want to do exactly that.

Indeed, one could also want to run her "--using tool" on added files. So I'm beginning to think that maybe the issue is 1). Do you know a valid reason justifying 1) ?

Sebastien Alaiwan (ace17) wrote :

Hi Richard,
Thanks for your explanation.

I think we can't have both of the following:
1) bzr not using the "--using tool" for added files, falling back to /usr/bin/diff.
2) bzr passing the --diff-options to the "--using tool" *and* to the /usr/bin/diff.

If 1) and 2) are both true, then we can run into situations where the same value of --diff-options gets passed to two different tools, leading to trouble. Do you agree?

My patch has the effect of making 2) false: it prevents passing the --diff-options to the /usr/bin/diff. But one could want to do exactly that.

Indeed, one could also want to run her "--using tool" on added files. So I'm beginning to think that maybe the issue is 1). Do you know a valid reason justifying 1) ?

Richard Wilbur (richard-wilbur) wrote :

Hello Sebastien,

I disagree with your statement that your patch prevents passing the --diff-options to the /usr/bin/diff. With your patch, if you specify --diff-options but not --using, then the --diff-options are passed to /usr/bin/diff, as desired.

I'm looking at what would be involved in making the --using tool (when specified) the first thing tried for all types of situations (including added files). If we must fall back to /usr/bin/diff in the presence of an alternate tool specified through --using, one option would simply be to ignore the --diff-options in that case.

Possibly another angle on this whole discussion occurred to me recently. Witness the following test:
rwilbur@ordinate:~/src/bzr_test$ bzr diff --using='/bin/ls' junk
=== modified file 'junk'
/home/rwilbur/src/bzr_test/junk
/tmp/bzr-diff-ZqM1vD/old/junk
rwilbur@ordinate:~/src/bzr_test$ bzr diff --using='/bin/ls -l' junk
=== modified file 'junk'
-rw-r--r-- 1 rwilbur rwilbur 16 2014-09-25 13:28 /home/rwilbur/src/bzr_test/junk
-r--r--r-- 1 rwilbur rwilbur 5 2014-09-25 13:28 /tmp/bzr-diff-lc5kzb/old/junk

If you create the tool command line with path followed by options, right in the --using declaration, it seems to work just fine (and wouldn't collide with options we specify for /usr/bin/diff in --diff-options provided we merge part of your patch).

In this case we could prioritize as follows:
1. If nothing is specified, we use internal diff.
2. If only --diff-options appears, we pass those options to /usr/bin/diff.
3. If only --using appears, we delegate the diff to the specified tool (options may appear in the --using string). If the specified tool cannot handle a particular diff, we use /usr/bin/diff or internal diff to provide a result.
4. If both --using and --diff-options appear, we delegate the diff to the specified tool as in #3, but if it cannot handle a particular diff we use /usr/bin/diff with the options from --diff-options.

This interpretation would effectively decouple the --using and --diff-options specifications. --diff-options would only ever be passed to /usr/bin/diff. --using would always make the specified tool the first delegate.

What do you think of that landscape?

Sebastien Alaiwan (ace17) wrote :

Hi Richard,

Using the value of the "--using" option as a command line part would indeed do the trick.
In fact, this is what I first tried (bzr diff --using "gvim -d -f -n").
However, I never got it to work. Here's what I get on my machine:

ace@ANTEC:~/projects/perdr2$ which ls
/bin/ls

ace@ANTEC:~/projects/perdr2$ bzr diff --using '/bin/ls -l'
=== modified file 'Makefile'
bzr: ERROR: /bin/ls -l could not be found on this machine

ace@ANTEC:~/projects/perdr2$ bzr --version
Bazaar (bzr) 2.7.0dev1

It seems the code responsible for launching the "--using" diff tool interprets the whole string as a program name, bypassing the system shell - which I believe is a good thing, for lots a reasons, including portability and security ones.

So, I'm OK with the following: passing "--diff-options" to external diff, or only to the "--using" diff tool if there's one.

Seems like you got a different result than mine on your computer; how did you do that?! :-)

BTW, is there a reason to have three ways of invoking diff tools? Could the current "external diff" (GNU diff) be an extra DiffTool, or does it play a special role?

Richard Wilbur (richard-wilbur) wrote :

I'm looking at tracking down the change that made putting the options in the '--using' declaration fail.

Unmerged revisions

6598. By Sebastien Alaiwan on 2014-06-11

Avoid calling 'external_diff' with options passed to --diff-options if --using is specified. Fixes the following from failing 'bzr add somefile.txt ; bzr diff --using gvimdiff --diff-options '-f -n -d'

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/diff.py'
2--- bzrlib/diff.py 2012-07-24 16:56:07 +0000
3+++ bzrlib/diff.py 2014-06-12 05:15:41 +0000
4@@ -930,17 +930,18 @@
5 """
6 if using is not None:
7 extra_factories = [DiffFromTool.make_from_diff_tree(using, external_diff_options)]
8+ diff_file = internal_diff
9 else:
10 extra_factories = []
11- if external_diff_options:
12- opts = external_diff_options.split()
13- def diff_file(olab, olines, nlab, nlines, to_file, path_encoding=None, context_lines=None):
14- """:param path_encoding: not used but required
15- to match the signature of internal_diff.
16- """
17- external_diff(olab, olines, nlab, nlines, to_file, opts)
18- else:
19- diff_file = internal_diff
20+ if external_diff_options:
21+ opts = external_diff_options.split()
22+ def diff_file(olab, olines, nlab, nlines, to_file, path_encoding=None, context_lines=None):
23+ """:param path_encoding: not used but required
24+ to match the signature of internal_diff.
25+ """
26+ external_diff(olab, olines, nlab, nlines, to_file, opts)
27+ else:
28+ diff_file = internal_diff
29 diff_text = DiffText(old_tree, new_tree, to_file, path_encoding,
30 old_label, new_label, diff_file, context_lines=context_lines)
31 return klass(old_tree, new_tree, to_file, path_encoding, diff_text,