Merge lp:~bialix/bzr/no-single-quote-on-win32 into lp:bzr

Proposed by Alexander Belchenko
Status: Merged
Merged at revision: not available
Proposed branch: lp:~bialix/bzr/no-single-quote-on-win32
Merge into: lp:bzr
Diff against target: 101 lines (+13/-13)
3 files modified
NEWS (+1/-2)
bzrlib/tests/test_win32utils.py (+11/-9)
bzrlib/win32utils.py (+1/-2)
To merge this branch: bzr merge lp:~bialix/bzr/no-single-quote-on-win32
Reviewer Review Type Date Requested Status
Gordon Tyler Approve
John A Meinel Abstain
Martin Packman (community) Approve
Adrian Wilkins Pending
Gary van der Merwe Pending
methane Pending
bzr-core Pending
Review via email: mp+14537@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Alexander Belchenko (bialix) wrote :

Restore standard behavior of command-line arguments parsing by removing support for single quote.

The main reason for this change is because it's not windows-way and who need single quotes and rich shell will use Cygwin/Msys anyway, so please don't break windows standards. Otherwise for any hardcore windows user it's hard to understand why simple command-line like this won't work:

bzr add let's-do-it.txt

bzr:ERROR: No file named lets-do-it.txt

Revision history for this message
Martin Packman (gz) wrote :

I think this is the right policy change.

Messing around a little with double quotes is fine: they aren't valid in filenames anyway, but single quotes are.

If the parsing is pared back to CommandLineToArgv-plus-remembering-if-it-was-quoted, UnicodeShlex could be further simplified.

review: Approve
Revision history for this message
John A Meinel (jameinel) wrote :

Obviously I prefer if we kept '', or I wouldn't have implemented it. If others feel strongly I can be overridden, so I'm not voting -1, just -0.

review: Abstain
Revision history for this message
Gordon Tyler (doxxx) wrote :

Handling of quoted commandline arguments is the responsibility of the shell.

review: Approve
Revision history for this message
Robert Collins (lifeless) wrote :

On Sat, 2009-11-07 at 16:30 +0000, Gordon Tyler wrote:
> Review: Approve
> Handling of quoted commandline arguments is the responsibility of the shell.

So is glob expansion, but we do that in bzr on windows. Why is glob
expansion something bzr should do, but not quoting?

John - does
c:\foo\> bzr commit -m "foo'bar"
work at the moment ? If not, we should fix that - on unix we'd want
"foo'bar" not "foobar" as the resulting string.

For now, I'm abstaining, as I don't use windows either. I *do* think it
would be very nice to be able to have our docs be the same for win32 and
unix as far as handling paths with spaces.

-Rob

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> On Sat, 2009-11-07 at 16:30 +0000, Gordon Tyler wrote:
>> Review: Approve
>> Handling of quoted commandline arguments is the responsibility of the shell.
>
> So is glob expansion, but we do that in bzr on windows. Why is glob
> expansion something bzr should do, but not quoting?
>
> John - does
> c:\foo\> bzr commit -m "foo'bar"
> work at the moment ? If not, we should fix that - on unix we'd want
> "foo'bar" not "foobar" as the resulting string.
>
> For now, I'm abstaining, as I don't use windows either. I *do* think it
> would be very nice to be able to have our docs be the same for win32 and
> unix as far as handling paths with spaces.
>
> -Rob
>

bzr commit -m "foo'bar" works as you would expect. The change is that:

 bzr commit -m foo'bar

is treated as "foobar". Normally your shell would hang waiting for you
to close your quotes. We get the string late, so can't really do that.
(We *could* error, though.)

Note that this patch "breaks"

 bzr commit foo'bar

as it doesn't support
 bzr commit foo\'bar
though I'll note that you could have a file named "'bar". In Unix world
you would commit that with

 bzr commit foo/\'bar

I don't really know how to express it in something similar-enough to
Windows shell...

 bzr commit foo\\\'bar ?

Note that Windows works around not having an escape character by just
not allowing characters that you would otherwise want to escape as a
valid filename. So you can't have:

 bzr commit foo"bar

" is not allowed (and neither is *?<>/\:...)

It wouldn't be hard to update the parser to allow:

 bzr commit foo\'bar

And allowing that we also need

 bzr commit foo\\\'bar

of course.
(we don't *need* foo\"bar since " isn't allowed in filenames, and we
have it already for 'bzr commit -m "foo \" bar"')

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkr12V4ACgkQJdeBCYSNAANneACdGJCsd4JdfElOr+/Zi+9u+ARu
tfgAn3C4hu+aPneEqHXqTvgDmYzL4QUp
=Tiv0
-----END PGP SIGNATURE-----

Revision history for this message
Alexander Belchenko (bialix) wrote :

I disagree with John intent to make command-line posix like (despite the glob) because it breaks rule of the least surprise. If I know how all programs in standard shell behave then I'd like to get the same behavior from bzr.

John's patch was about fixing 3 bugs/feature requests for adding glob expansion. There was no request to change command-line parsing to follow all linux rules. Why then you don't support `` syntax?

John don't use standard shell, he's using Cygwin. Robert don't use Windows either. Why you then want to beak standard windows shell behavior instead of following it? For those who need different there is always Cygwin and Msys.

More closer to some concerns about escaping quotes. I've made some testing with simple python snippet. Here is how it works in standard shell:

C:\>python -c "import sys; print sys.argv"
['-c']

C:\>python -c "import sys; print sys.argv" foo"bar
['-c', 'foobar']

C:\>python -c "import sys; print sys.argv" foobar"
['-c', 'foobar']

C:\>python -c "import sys; print sys.argv" "foo"bar"
['-c', 'foobar']

C:\>python -c "import sys; print sys.argv" foo""bar
['-c', 'foobar']

C:\>python -c "import sys; print sys.argv" foo" "bar
['-c', 'foo bar']

C:\>python -c "import sys; print sys.argv" foo\"bar
['-c', 'foo"bar']

C:\>python -c "import sys; print sys.argv" "foo""bar"
['-c', 'foo"bar']

C:\>python -c "import sys; print sys.argv" foo\'bar
['-c', "foo\\'bar"]

C:\>python -c "import sys; print sys.argv" "foo" "bar"
['-c', 'foo', 'bar']

C:\>python -c "import sys; print sys.argv" foo\" \"bar
['-c', 'foo"', '"bar']

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Alexander Belchenko wrote:
> I disagree with John intent to make command-line posix like (despite the glob) because it breaks rule of the least surprise. If I know how all programs in standard shell behave then I'd like to get the same behavior from bzr.
>
> John's patch was about fixing 3 bugs/feature requests for adding glob expansion. There was no request to change command-line parsing to follow all linux rules. Why then you don't support `` syntax?

We also have had a bug about "don't use single quotes in documentation
because it breaks on Windows". I don't have the bug # offhand.

It *has* been a source of difficulty for users that the command line
does not behave the same across different platforms. Even if we fix up
the bzr docs, people may read recipes in blog posts, etc. And then not
have them work.

I'm at the point of being willing to land your patch to make you happy,
even if it makes me a little unhappy. If you want to continue onward to
change the rest of the differences, feel free. I'll review and land it.

I also found a couple test suite failures because we no longer do
globbing of 'run_bzr()' arguments, though I consider that a good thing.
I'll be proposing a patch for that in a few minutes.

>
> John don't use standard shell, he's using Cygwin. Robert don't use Windows either. Why you then want to beak standard windows shell behavior instead of following it? For those who need different there is always Cygwin and Msys.
>
> More closer to some concerns about escaping quotes. I've made some testing with simple python snippet. Here is how it works in standard shell:
>
> C:\>python -c "import sys; print sys.argv"
> ['-c']
>
> C:\>python -c "import sys; print sys.argv" foo"bar
> ['-c', 'foobar']
>
> C:\>python -c "import sys; print sys.argv" foobar"
> ['-c', 'foobar']
>
> C:\>python -c "import sys; print sys.argv" "foo"bar"
> ['-c', 'foobar']
>
> C:\>python -c "import sys; print sys.argv" foo""bar
> ['-c', 'foobar']
>
> C:\>python -c "import sys; print sys.argv" foo" "bar
> ['-c', 'foo bar']
>
> C:\>python -c "import sys; print sys.argv" foo\"bar
> ['-c', 'foo"bar']
>
> C:\>python -c "import sys; print sys.argv" "foo""bar"
> ['-c', 'foo"bar']
>
> C:\>python -c "import sys; print sys.argv" foo\'bar
> ['-c', "foo\\'bar"]
>
> C:\>python -c "import sys; print sys.argv" "foo" "bar"
> ['-c', 'foo', 'bar']
>
> C:\>python -c "import sys; print sys.argv" foo\" \"bar
> ['-c', 'foo"', '"bar']

As mentioned, if you want to preserve this, it shouldn't be too hard.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkr2C28ACgkQJdeBCYSNAANc0wCfSZTcjOvDZU34CVgV3tn9QwOo
v7YAniCv+xfl6jGkWux/it4XtfAp8r5Y
=JHhR
-----END PGP SIGNATURE-----

Revision history for this message
Alexander Belchenko (bialix) wrote :
Download full text (3.8 KiB)

> Alexander Belchenko wrote:
> > I disagree with John intent to make command-line posix like (despite the
> glob) because it breaks rule of the least surprise. If I know how all programs
> in standard shell behave then I'd like to get the same behavior from bzr.
> >
> > John's patch was about fixing 3 bugs/feature requests for adding glob
> expansion. There was no request to change command-line parsing to follow all
> linux rules. Why then you don't support `` syntax?
>
> We also have had a bug about "don't use single quotes in documentation
> because it breaks on Windows". I don't have the bug # offhand.

I remember that bug.

> It *has* been a source of difficulty for users that the command line
> does not behave the same across different platforms. Even if we fix up
> the bzr docs, people may read recipes in blog posts, etc. And then not
> have them work.

I don't know how to handle this problem really.

Actually I more concerned about wrong habits. I have very strong fingers memory so sometimes I'm even trying to run normal windows commands as `bzr xxx` instead of just `xxx`. And I fear that intensive usage of single quotes leads me or other users to the fact we start trying to use them with standard commands. And they will fail every time.

So the problem is two-fold.

Globs are there for very long time, at least as part of bzr add. This is natural improvement. But single quote is mental block for me. Maybe I'm crazy.

> I'm at the point of being willing to land your patch to make you happy,
> even if it makes me a little unhappy. If you want to continue onward to
> change the rest of the differences, feel free. I'll review and land it.

I don't want to be in position of hard pressure on you and make you unhappy. That's why I've asked bzr-windows community to comment on this issue. You put me in the position when I can only give up. But in fact: *why* you feel unhappy?

I certainly can make myself happy very easy: I can build custom bzr.exe for myself with my patch inside. I don't need TortoiseBzr so it's possible for me.

> I also found a couple test suite failures because we no longer do
> globbing of 'run_bzr()' arguments, though I consider that a good thing.
> I'll be proposing a patch for that in a few minutes.
>
> >
> > John don't use standard shell, he's using Cygwin. Robert don't use Windows
> either. Why you then want to beak standard windows shell behavior instead of
> following it? For those who need different there is always Cygwin and Msys.
> >
> > More closer to some concerns about escaping quotes. I've made some testing
> with simple python snippet. Here is how it works in standard shell:
> >
> > C:\>python -c "import sys; print sys.argv"
> > ['-c']
> >
> > C:\>python -c "import sys; print sys.argv" foo"bar
> > ['-c', 'foobar']
> >
> > C:\>python -c "import sys; print sys.argv" foobar"
> > ['-c', 'foobar']
> >
> > C:\>python -c "import sys; print sys.argv" "foo"bar"
> > ['-c', 'foobar']
> >
> > C:\>python -c "import sys; print sys.argv" foo""bar
> > ['-c', 'foobar']
> >
> > C:\>python -c "import sys; print sys.argv" foo" "bar
> > ['-c', 'foo bar']
> >
> > C:\>python -c "import sys; print sys.argv" foo\"bar
> > ['-c', 'fo...

Read more...

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

...

> I've showed this session as example of cmd.exe behavior and maybe as possible source for extending test suite. I'm still trying to figure out how your new command-line parsing may impact on qsubprocess and bencoded command-line.

If qbzr wraps everything in "" as most programs do, then I think
everything will be ok. It may need to do:

args = [" + arg.replace('"', '\\"') + " for arg in args]

Which generally wraps any argument in "" and then escapes any embedded
quotes.

With that syntax, bzr should be the same as the standard
CommandLineToArgv parser. (caveat bugs)

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEUEARECAAYFAkr2Eo8ACgkQJdeBCYSNAANjXgCgiu3cHX0sblh0hENKO0u/nzm6
Bo0AkwVm8qdMNqfU8BuMLN/Q5CcMDj8=
=9Dql
-----END PGP SIGNATURE-----

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

...

>> It *has* been a source of difficulty for users that the command line
>> does not behave the same across different platforms. Even if we fix up
>> the bzr docs, people may read recipes in blog posts, etc. And then not
>> have them work.
>
> I don't know how to handle this problem really.
>
> Actually I more concerned about wrong habits. I have very strong fingers memory so sometimes I'm even trying to run normal windows commands as `bzr xxx` instead of just `xxx`. And I fear that intensive usage of single quotes leads me or other users to the fact we start trying to use them with standard commands. And they will fail every time.
>

I should mention that your request is in the hands of PQM as we speak.
There were more people who felt that ' should not be a quote char than
those that did. We can always revisit that decision again later when
someone who cares about it brings it up again.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkr2E8UACgkQJdeBCYSNAAOeeQCdE10PuHzfZdU98Q9bDN0LEIog
xPgAnjDEgGLye8YlYG5yLuJyPdYW0Q4G
=uqBm
-----END PGP SIGNATURE-----

Revision history for this message
Gordon Tyler (doxxx) wrote :

My point was that both bash/etc. and cmd handle quoting their own way, so bzr shouldn't be re-implementing it. However, cmd doesn't do globbing, so obviously bzr has to implement it.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'NEWS'
--- NEWS 2009-11-05 20:08:36 +0000
+++ NEWS 2009-11-06 10:10:24 +0000
@@ -28,8 +28,7 @@
28 done in bash, etc.) This means that *all* commands get glob expansion28 done in bash, etc.) This means that *all* commands get glob expansion
29 (bzr status, bzr add, bzr mv, etc). It uses a custom command line29 (bzr status, bzr add, bzr mv, etc). It uses a custom command line
30 parser, which allows us to know if a given section was quoted. It means30 parser, which allows us to know if a given section was quoted. It means
31 you can now do ``bzr ignore "*.py"``. It also means that single-quotes31 you can now do ``bzr ignore "*.py"``.
32 are now treated as quoted ``bzr ignore '*.py'``.
33 (John Arbash Meinel, #425510, #426410, #194450)32 (John Arbash Meinel, #425510, #426410, #194450)
3433
35* Sanitize commit messages that come in from the '-m' flag. We translate34* Sanitize commit messages that come in from the '-m' flag. We translate
3635
=== modified file 'bzrlib/tests/test_win32utils.py'
--- bzrlib/tests/test_win32utils.py 2009-11-04 22:12:46 +0000
+++ bzrlib/tests/test_win32utils.py 2009-11-06 10:10:24 +0000
@@ -288,13 +288,14 @@
288288
289 def test_posix_quotations(self):289 def test_posix_quotations(self):
290 self.assertAsTokens([(True, u'foo bar')], u'"foo bar"')290 self.assertAsTokens([(True, u'foo bar')], u'"foo bar"')
291 self.assertAsTokens([(True, u'foo bar')], u"'foo bar'")291 self.assertAsTokens([(False, u"'fo''o"), (False, u"b''ar'")],
292 self.assertAsTokens([(True, u'foo bar')], u"'fo''o b''ar'")292 u"'fo''o b''ar'")
293 self.assertAsTokens([(True, u'foo bar')], u'"fo""o b""ar"')293 self.assertAsTokens([(True, u'foo bar')], u'"fo""o b""ar"')
294 self.assertAsTokens([(True, u'foo bar')], u'"fo"\'o b\'"ar"')294 self.assertAsTokens([(True, u"fo'o"), (True, u"b'ar")],
295 u'"fo"\'o b\'"ar"')
295296
296 def test_nested_quotations(self):297 def test_nested_quotations(self):
297 self.assertAsTokens([(True, u'foo"" bar')], u"'foo\"\" bar'")298 self.assertAsTokens([(True, u'foo"" bar')], u"\"foo\\\"\\\" bar\"")
298 self.assertAsTokens([(True, u'foo\'\' bar')], u"\"foo'' bar\"")299 self.assertAsTokens([(True, u'foo\'\' bar')], u"\"foo'' bar\"")
299300
300 def test_empty_result(self):301 def test_empty_result(self):
@@ -303,7 +304,7 @@
303304
304 def test_quoted_empty(self):305 def test_quoted_empty(self):
305 self.assertAsTokens([(True, '')], u'""')306 self.assertAsTokens([(True, '')], u'""')
306 self.assertAsTokens([(True, '')], u"''")307 self.assertAsTokens([(False, u"''")], u"''")
307308
308 def test_unicode_chars(self):309 def test_unicode_chars(self):
309 self.assertAsTokens([(False, u'f\xb5\xee'), (False, u'\u1234\u3456')],310 self.assertAsTokens([(False, u'f\xb5\xee'), (False, u'\u1234\u3456')],
@@ -311,18 +312,15 @@
311312
312 def test_newline_in_quoted_section(self):313 def test_newline_in_quoted_section(self):
313 self.assertAsTokens([(True, u'foo\nbar\nbaz\n')], u'"foo\nbar\nbaz\n"')314 self.assertAsTokens([(True, u'foo\nbar\nbaz\n')], u'"foo\nbar\nbaz\n"')
314 self.assertAsTokens([(True, u'foo\nbar\nbaz\n')], u"'foo\nbar\nbaz\n'")
315315
316 def test_escape_chars(self):316 def test_escape_chars(self):
317 self.assertAsTokens([(False, u'foo\\bar')], u'foo\\bar')317 self.assertAsTokens([(False, u'foo\\bar')], u'foo\\bar')
318318
319 def test_escape_quote(self):319 def test_escape_quote(self):
320 self.assertAsTokens([(True, u'foo"bar')], u'"foo\\"bar"')320 self.assertAsTokens([(True, u'foo"bar')], u'"foo\\"bar"')
321 self.assertAsTokens([(True, u'foo\\"bar')], u"'foo\\\"bar'")
322321
323 def test_double_escape(self):322 def test_double_escape(self):
324 self.assertAsTokens([(True, u'foo\\bar')], u'"foo\\\\bar"')323 self.assertAsTokens([(True, u'foo\\bar')], u'"foo\\\\bar"')
325 self.assertAsTokens([(True, u'foo\\\\bar')], u"'foo\\\\bar'")
326 self.assertAsTokens([(False, u'foo\\\\bar')], u"foo\\\\bar")324 self.assertAsTokens([(False, u'foo\\\\bar')], u"foo\\\\bar")
327325
328326
@@ -344,10 +342,14 @@
344 def test_quoted_globs(self):342 def test_quoted_globs(self):
345 self.build_tree(['a/', 'a/b.c', 'a/c.c', 'a/c.h'])343 self.build_tree(['a/', 'a/b.c', 'a/c.c', 'a/c.h'])
346 self.assertCommandLine([u'a/*.c'], '"a/*.c"')344 self.assertCommandLine([u'a/*.c'], '"a/*.c"')
347 self.assertCommandLine([u'a/*.c'], "'a/*.c'")345 self.assertCommandLine([u"'a/*.c'"], "'a/*.c'")
348346
349 def test_slashes_changed(self):347 def test_slashes_changed(self):
350 self.assertCommandLine([u'a/*.c'], '"a\\*.c"')348 self.assertCommandLine([u'a/*.c'], '"a\\*.c"')
351 # Expands the glob, but nothing matches349 # Expands the glob, but nothing matches
352 self.assertCommandLine([u'a/*.c'], 'a\\*.c')350 self.assertCommandLine([u'a/*.c'], 'a\\*.c')
353 self.assertCommandLine([u'a/foo.c'], 'a\\foo.c')351 self.assertCommandLine([u'a/foo.c'], 'a\\foo.c')
352
353 def test_no_single_quote_supported(self):
354 self.assertCommandLine(["add", "let's-do-it.txt"],
355 "add let's-do-it.txt")
354356
=== modified file 'bzrlib/win32utils.py'
--- bzrlib/win32utils.py 2009-11-04 22:26:25 +0000
+++ bzrlib/win32utils.py 2009-11-06 10:10:24 +0000
@@ -535,7 +535,7 @@
535 self._input_iter = iter(self._input)535 self._input_iter = iter(self._input)
536 self._whitespace_match = re.compile(u'\s').match536 self._whitespace_match = re.compile(u'\s').match
537 self._word_match = re.compile(u'\S').match537 self._word_match = re.compile(u'\S').match
538 self._quote_chars = u'\'"'538 self._quote_chars = u'"'
539 # self._quote_match = re.compile(u'[\'"]').match539 # self._quote_match = re.compile(u'[\'"]').match
540 self._escape_match = lambda x: None # Never matches540 self._escape_match = lambda x: None # Never matches
541 self._escape = '\\'541 self._escape = '\\'
@@ -543,7 +543,6 @@
543 # ' ' - after whitespace, starting a new token543 # ' ' - after whitespace, starting a new token
544 # 'a' - after text, currently working on a token544 # 'a' - after text, currently working on a token
545 # '"' - after ", currently in a "-delimited quoted section545 # '"' - after ", currently in a "-delimited quoted section
546 # "'" - after ', currently in a '-delimited quotod section
547 # "\" - after '\', checking the next char546 # "\" - after '\', checking the next char
548 self._state = ' '547 self._state = ' '
549 self._token = [] # Current token being parsed548 self._token = [] # Current token being parsed