Merge lp:~bialix/bzr/no-single-quote-on-win32 into lp:bzr

Proposed by Alexander Belchenko
Status: Merged
Merged at revision: not available
Proposed branch: lp:~bialix/bzr/no-single-quote-on-win32
Merge into: lp:bzr
Diff against target: 101 lines (+13/-13)
3 files modified
NEWS (+1/-2)
bzrlib/tests/test_win32utils.py (+11/-9)
bzrlib/win32utils.py (+1/-2)
To merge this branch: bzr merge lp:~bialix/bzr/no-single-quote-on-win32
Reviewer Review Type Date Requested Status
Gordon Tyler Approve
John A Meinel Abstain
Martin Packman (community) Approve
Adrian Wilkins Pending
Gary van der Merwe Pending
methane Pending
bzr-core Pending
Review via email: mp+14537@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Alexander Belchenko (bialix) wrote :

Restore standard behavior of command-line arguments parsing by removing support for single quote.

The main reason for this change is because it's not windows-way and who need single quotes and rich shell will use Cygwin/Msys anyway, so please don't break windows standards. Otherwise for any hardcore windows user it's hard to understand why simple command-line like this won't work:

bzr add let's-do-it.txt

bzr:ERROR: No file named lets-do-it.txt

Revision history for this message
Martin Packman (gz) wrote :

I think this is the right policy change.

Messing around a little with double quotes is fine: they aren't valid in filenames anyway, but single quotes are.

If the parsing is pared back to CommandLineToArgv-plus-remembering-if-it-was-quoted, UnicodeShlex could be further simplified.

review: Approve
Revision history for this message
John A Meinel (jameinel) wrote :

Obviously I prefer if we kept '', or I wouldn't have implemented it. If others feel strongly I can be overridden, so I'm not voting -1, just -0.

review: Abstain
Revision history for this message
Gordon Tyler (doxxx) wrote :

Handling of quoted commandline arguments is the responsibility of the shell.

review: Approve
Revision history for this message
Robert Collins (lifeless) wrote :

On Sat, 2009-11-07 at 16:30 +0000, Gordon Tyler wrote:
> Review: Approve
> Handling of quoted commandline arguments is the responsibility of the shell.

So is glob expansion, but we do that in bzr on windows. Why is glob
expansion something bzr should do, but not quoting?

John - does
c:\foo\> bzr commit -m "foo'bar"
work at the moment ? If not, we should fix that - on unix we'd want
"foo'bar" not "foobar" as the resulting string.

For now, I'm abstaining, as I don't use windows either. I *do* think it
would be very nice to be able to have our docs be the same for win32 and
unix as far as handling paths with spaces.

-Rob

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> On Sat, 2009-11-07 at 16:30 +0000, Gordon Tyler wrote:
>> Review: Approve
>> Handling of quoted commandline arguments is the responsibility of the shell.
>
> So is glob expansion, but we do that in bzr on windows. Why is glob
> expansion something bzr should do, but not quoting?
>
> John - does
> c:\foo\> bzr commit -m "foo'bar"
> work at the moment ? If not, we should fix that - on unix we'd want
> "foo'bar" not "foobar" as the resulting string.
>
> For now, I'm abstaining, as I don't use windows either. I *do* think it
> would be very nice to be able to have our docs be the same for win32 and
> unix as far as handling paths with spaces.
>
> -Rob
>

bzr commit -m "foo'bar" works as you would expect. The change is that:

 bzr commit -m foo'bar

is treated as "foobar". Normally your shell would hang waiting for you
to close your quotes. We get the string late, so can't really do that.
(We *could* error, though.)

Note that this patch "breaks"

 bzr commit foo'bar

as it doesn't support
 bzr commit foo\'bar
though I'll note that you could have a file named "'bar". In Unix world
you would commit that with

 bzr commit foo/\'bar

I don't really know how to express it in something similar-enough to
Windows shell...

 bzr commit foo\\\'bar ?

Note that Windows works around not having an escape character by just
not allowing characters that you would otherwise want to escape as a
valid filename. So you can't have:

 bzr commit foo"bar

" is not allowed (and neither is *?<>/\:...)

It wouldn't be hard to update the parser to allow:

 bzr commit foo\'bar

And allowing that we also need

 bzr commit foo\\\'bar

of course.
(we don't *need* foo\"bar since " isn't allowed in filenames, and we
have it already for 'bzr commit -m "foo \" bar"')

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkr12V4ACgkQJdeBCYSNAANneACdGJCsd4JdfElOr+/Zi+9u+ARu
tfgAn3C4hu+aPneEqHXqTvgDmYzL4QUp
=Tiv0
-----END PGP SIGNATURE-----

Revision history for this message
Alexander Belchenko (bialix) wrote :

I disagree with John intent to make command-line posix like (despite the glob) because it breaks rule of the least surprise. If I know how all programs in standard shell behave then I'd like to get the same behavior from bzr.

John's patch was about fixing 3 bugs/feature requests for adding glob expansion. There was no request to change command-line parsing to follow all linux rules. Why then you don't support `` syntax?

John don't use standard shell, he's using Cygwin. Robert don't use Windows either. Why you then want to beak standard windows shell behavior instead of following it? For those who need different there is always Cygwin and Msys.

More closer to some concerns about escaping quotes. I've made some testing with simple python snippet. Here is how it works in standard shell:

C:\>python -c "import sys; print sys.argv"
['-c']

C:\>python -c "import sys; print sys.argv" foo"bar
['-c', 'foobar']

C:\>python -c "import sys; print sys.argv" foobar"
['-c', 'foobar']

C:\>python -c "import sys; print sys.argv" "foo"bar"
['-c', 'foobar']

C:\>python -c "import sys; print sys.argv" foo""bar
['-c', 'foobar']

C:\>python -c "import sys; print sys.argv" foo" "bar
['-c', 'foo bar']

C:\>python -c "import sys; print sys.argv" foo\"bar
['-c', 'foo"bar']

C:\>python -c "import sys; print sys.argv" "foo""bar"
['-c', 'foo"bar']

C:\>python -c "import sys; print sys.argv" foo\'bar
['-c', "foo\\'bar"]

C:\>python -c "import sys; print sys.argv" "foo" "bar"
['-c', 'foo', 'bar']

C:\>python -c "import sys; print sys.argv" foo\" \"bar
['-c', 'foo"', '"bar']

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Alexander Belchenko wrote:
> I disagree with John intent to make command-line posix like (despite the glob) because it breaks rule of the least surprise. If I know how all programs in standard shell behave then I'd like to get the same behavior from bzr.
>
> John's patch was about fixing 3 bugs/feature requests for adding glob expansion. There was no request to change command-line parsing to follow all linux rules. Why then you don't support `` syntax?

We also have had a bug about "don't use single quotes in documentation
because it breaks on Windows". I don't have the bug # offhand.

It *has* been a source of difficulty for users that the command line
does not behave the same across different platforms. Even if we fix up
the bzr docs, people may read recipes in blog posts, etc. And then not
have them work.

I'm at the point of being willing to land your patch to make you happy,
even if it makes me a little unhappy. If you want to continue onward to
change the rest of the differences, feel free. I'll review and land it.

I also found a couple test suite failures because we no longer do
globbing of 'run_bzr()' arguments, though I consider that a good thing.
I'll be proposing a patch for that in a few minutes.

>
> John don't use standard shell, he's using Cygwin. Robert don't use Windows either. Why you then want to beak standard windows shell behavior instead of following it? For those who need different there is always Cygwin and Msys.
>
> More closer to some concerns about escaping quotes. I've made some testing with simple python snippet. Here is how it works in standard shell:
>
> C:\>python -c "import sys; print sys.argv"
> ['-c']
>
> C:\>python -c "import sys; print sys.argv" foo"bar
> ['-c', 'foobar']
>
> C:\>python -c "import sys; print sys.argv" foobar"
> ['-c', 'foobar']
>
> C:\>python -c "import sys; print sys.argv" "foo"bar"
> ['-c', 'foobar']
>
> C:\>python -c "import sys; print sys.argv" foo""bar
> ['-c', 'foobar']
>
> C:\>python -c "import sys; print sys.argv" foo" "bar
> ['-c', 'foo bar']
>
> C:\>python -c "import sys; print sys.argv" foo\"bar
> ['-c', 'foo"bar']
>
> C:\>python -c "import sys; print sys.argv" "foo""bar"
> ['-c', 'foo"bar']
>
> C:\>python -c "import sys; print sys.argv" foo\'bar
> ['-c', "foo\\'bar"]
>
> C:\>python -c "import sys; print sys.argv" "foo" "bar"
> ['-c', 'foo', 'bar']
>
> C:\>python -c "import sys; print sys.argv" foo\" \"bar
> ['-c', 'foo"', '"bar']

As mentioned, if you want to preserve this, it shouldn't be too hard.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkr2C28ACgkQJdeBCYSNAANc0wCfSZTcjOvDZU34CVgV3tn9QwOo
v7YAniCv+xfl6jGkWux/it4XtfAp8r5Y
=JHhR
-----END PGP SIGNATURE-----

Revision history for this message
Alexander Belchenko (bialix) wrote :
Download full text (3.8 KiB)

> Alexander Belchenko wrote:
> > I disagree with John intent to make command-line posix like (despite the
> glob) because it breaks rule of the least surprise. If I know how all programs
> in standard shell behave then I'd like to get the same behavior from bzr.
> >
> > John's patch was about fixing 3 bugs/feature requests for adding glob
> expansion. There was no request to change command-line parsing to follow all
> linux rules. Why then you don't support `` syntax?
>
> We also have had a bug about "don't use single quotes in documentation
> because it breaks on Windows". I don't have the bug # offhand.

I remember that bug.

> It *has* been a source of difficulty for users that the command line
> does not behave the same across different platforms. Even if we fix up
> the bzr docs, people may read recipes in blog posts, etc. And then not
> have them work.

I don't know how to handle this problem really.

Actually I more concerned about wrong habits. I have very strong fingers memory so sometimes I'm even trying to run normal windows commands as `bzr xxx` instead of just `xxx`. And I fear that intensive usage of single quotes leads me or other users to the fact we start trying to use them with standard commands. And they will fail every time.

So the problem is two-fold.

Globs are there for very long time, at least as part of bzr add. This is natural improvement. But single quote is mental block for me. Maybe I'm crazy.

> I'm at the point of being willing to land your patch to make you happy,
> even if it makes me a little unhappy. If you want to continue onward to
> change the rest of the differences, feel free. I'll review and land it.

I don't want to be in position of hard pressure on you and make you unhappy. That's why I've asked bzr-windows community to comment on this issue. You put me in the position when I can only give up. But in fact: *why* you feel unhappy?

I certainly can make myself happy very easy: I can build custom bzr.exe for myself with my patch inside. I don't need TortoiseBzr so it's possible for me.

> I also found a couple test suite failures because we no longer do
> globbing of 'run_bzr()' arguments, though I consider that a good thing.
> I'll be proposing a patch for that in a few minutes.
>
> >
> > John don't use standard shell, he's using Cygwin. Robert don't use Windows
> either. Why you then want to beak standard windows shell behavior instead of
> following it? For those who need different there is always Cygwin and Msys.
> >
> > More closer to some concerns about escaping quotes. I've made some testing
> with simple python snippet. Here is how it works in standard shell:
> >
> > C:\>python -c "import sys; print sys.argv"
> > ['-c']
> >
> > C:\>python -c "import sys; print sys.argv" foo"bar
> > ['-c', 'foobar']
> >
> > C:\>python -c "import sys; print sys.argv" foobar"
> > ['-c', 'foobar']
> >
> > C:\>python -c "import sys; print sys.argv" "foo"bar"
> > ['-c', 'foobar']
> >
> > C:\>python -c "import sys; print sys.argv" foo""bar
> > ['-c', 'foobar']
> >
> > C:\>python -c "import sys; print sys.argv" foo" "bar
> > ['-c', 'foo bar']
> >
> > C:\>python -c "import sys; print sys.argv" foo\"bar
> > ['-c', 'fo...

Read more...

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

...

> I've showed this session as example of cmd.exe behavior and maybe as possible source for extending test suite. I'm still trying to figure out how your new command-line parsing may impact on qsubprocess and bencoded command-line.

If qbzr wraps everything in "" as most programs do, then I think
everything will be ok. It may need to do:

args = [" + arg.replace('"', '\\"') + " for arg in args]

Which generally wraps any argument in "" and then escapes any embedded
quotes.

With that syntax, bzr should be the same as the standard
CommandLineToArgv parser. (caveat bugs)

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEUEARECAAYFAkr2Eo8ACgkQJdeBCYSNAANjXgCgiu3cHX0sblh0hENKO0u/nzm6
Bo0AkwVm8qdMNqfU8BuMLN/Q5CcMDj8=
=9Dql
-----END PGP SIGNATURE-----

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

...

>> It *has* been a source of difficulty for users that the command line
>> does not behave the same across different platforms. Even if we fix up
>> the bzr docs, people may read recipes in blog posts, etc. And then not
>> have them work.
>
> I don't know how to handle this problem really.
>
> Actually I more concerned about wrong habits. I have very strong fingers memory so sometimes I'm even trying to run normal windows commands as `bzr xxx` instead of just `xxx`. And I fear that intensive usage of single quotes leads me or other users to the fact we start trying to use them with standard commands. And they will fail every time.
>

I should mention that your request is in the hands of PQM as we speak.
There were more people who felt that ' should not be a quote char than
those that did. We can always revisit that decision again later when
someone who cares about it brings it up again.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkr2E8UACgkQJdeBCYSNAAOeeQCdE10PuHzfZdU98Q9bDN0LEIog
xPgAnjDEgGLye8YlYG5yLuJyPdYW0Q4G
=uqBm
-----END PGP SIGNATURE-----

Revision history for this message
Gordon Tyler (doxxx) wrote :

My point was that both bash/etc. and cmd handle quoting their own way, so bzr shouldn't be re-implementing it. However, cmd doesn't do globbing, so obviously bzr has to implement it.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2009-11-05 20:08:36 +0000
3+++ NEWS 2009-11-06 10:10:24 +0000
4@@ -28,8 +28,7 @@
5 done in bash, etc.) This means that *all* commands get glob expansion
6 (bzr status, bzr add, bzr mv, etc). It uses a custom command line
7 parser, which allows us to know if a given section was quoted. It means
8- you can now do ``bzr ignore "*.py"``. It also means that single-quotes
9- are now treated as quoted ``bzr ignore '*.py'``.
10+ you can now do ``bzr ignore "*.py"``.
11 (John Arbash Meinel, #425510, #426410, #194450)
12
13 * Sanitize commit messages that come in from the '-m' flag. We translate
14
15=== modified file 'bzrlib/tests/test_win32utils.py'
16--- bzrlib/tests/test_win32utils.py 2009-11-04 22:12:46 +0000
17+++ bzrlib/tests/test_win32utils.py 2009-11-06 10:10:24 +0000
18@@ -288,13 +288,14 @@
19
20 def test_posix_quotations(self):
21 self.assertAsTokens([(True, u'foo bar')], u'"foo bar"')
22- self.assertAsTokens([(True, u'foo bar')], u"'foo bar'")
23- self.assertAsTokens([(True, u'foo bar')], u"'fo''o b''ar'")
24+ self.assertAsTokens([(False, u"'fo''o"), (False, u"b''ar'")],
25+ u"'fo''o b''ar'")
26 self.assertAsTokens([(True, u'foo bar')], u'"fo""o b""ar"')
27- self.assertAsTokens([(True, u'foo bar')], u'"fo"\'o b\'"ar"')
28+ self.assertAsTokens([(True, u"fo'o"), (True, u"b'ar")],
29+ u'"fo"\'o b\'"ar"')
30
31 def test_nested_quotations(self):
32- self.assertAsTokens([(True, u'foo"" bar')], u"'foo\"\" bar'")
33+ self.assertAsTokens([(True, u'foo"" bar')], u"\"foo\\\"\\\" bar\"")
34 self.assertAsTokens([(True, u'foo\'\' bar')], u"\"foo'' bar\"")
35
36 def test_empty_result(self):
37@@ -303,7 +304,7 @@
38
39 def test_quoted_empty(self):
40 self.assertAsTokens([(True, '')], u'""')
41- self.assertAsTokens([(True, '')], u"''")
42+ self.assertAsTokens([(False, u"''")], u"''")
43
44 def test_unicode_chars(self):
45 self.assertAsTokens([(False, u'f\xb5\xee'), (False, u'\u1234\u3456')],
46@@ -311,18 +312,15 @@
47
48 def test_newline_in_quoted_section(self):
49 self.assertAsTokens([(True, u'foo\nbar\nbaz\n')], u'"foo\nbar\nbaz\n"')
50- self.assertAsTokens([(True, u'foo\nbar\nbaz\n')], u"'foo\nbar\nbaz\n'")
51
52 def test_escape_chars(self):
53 self.assertAsTokens([(False, u'foo\\bar')], u'foo\\bar')
54
55 def test_escape_quote(self):
56 self.assertAsTokens([(True, u'foo"bar')], u'"foo\\"bar"')
57- self.assertAsTokens([(True, u'foo\\"bar')], u"'foo\\\"bar'")
58
59 def test_double_escape(self):
60 self.assertAsTokens([(True, u'foo\\bar')], u'"foo\\\\bar"')
61- self.assertAsTokens([(True, u'foo\\\\bar')], u"'foo\\\\bar'")
62 self.assertAsTokens([(False, u'foo\\\\bar')], u"foo\\\\bar")
63
64
65@@ -344,10 +342,14 @@
66 def test_quoted_globs(self):
67 self.build_tree(['a/', 'a/b.c', 'a/c.c', 'a/c.h'])
68 self.assertCommandLine([u'a/*.c'], '"a/*.c"')
69- self.assertCommandLine([u'a/*.c'], "'a/*.c'")
70+ self.assertCommandLine([u"'a/*.c'"], "'a/*.c'")
71
72 def test_slashes_changed(self):
73 self.assertCommandLine([u'a/*.c'], '"a\\*.c"')
74 # Expands the glob, but nothing matches
75 self.assertCommandLine([u'a/*.c'], 'a\\*.c')
76 self.assertCommandLine([u'a/foo.c'], 'a\\foo.c')
77+
78+ def test_no_single_quote_supported(self):
79+ self.assertCommandLine(["add", "let's-do-it.txt"],
80+ "add let's-do-it.txt")
81
82=== modified file 'bzrlib/win32utils.py'
83--- bzrlib/win32utils.py 2009-11-04 22:26:25 +0000
84+++ bzrlib/win32utils.py 2009-11-06 10:10:24 +0000
85@@ -535,7 +535,7 @@
86 self._input_iter = iter(self._input)
87 self._whitespace_match = re.compile(u'\s').match
88 self._word_match = re.compile(u'\S').match
89- self._quote_chars = u'\'"'
90+ self._quote_chars = u'"'
91 # self._quote_match = re.compile(u'[\'"]').match
92 self._escape_match = lambda x: None # Never matches
93 self._escape = '\\'
94@@ -543,7 +543,6 @@
95 # ' ' - after whitespace, starting a new token
96 # 'a' - after text, currently working on a token
97 # '"' - after ", currently in a "-delimited quoted section
98- # "'" - after ', currently in a '-delimited quotod section
99 # "\" - after '\', checking the next char
100 self._state = ' '
101 self._token = [] # Current token being parsed