Bazaar

Merge lp:~doxxx/bzr/392428 into lp:bzr

392428
Merge into bzr.dev

Proposed by Gordon Tyler on 2009-12-21

Status:

Superseded

Proposed branch:

lp:~doxxx/bzr/392428

Merge into:

lp:bzr

Diff against target:

439 lines (+182/-129)

6 files modified

bzrlib/commands.py (+6/-2)
bzrlib/tests/test_commands.py (+4/-1)
bzrlib/tests/test_diff.py (+18/-1)
bzrlib/tests/test_rules.py (+6/-2)
bzrlib/tests/test_win32utils.py (+10/-13)
bzrlib/win32utils.py (+138/-110)

To merge this branch:

bzr merge lp:~doxxx/bzr/392428

Medium

Fix Released

Reviewer	Date Requested	Status
John A Meinel	2009-12-21	Needs Fixing on 2009-12-21
Alexander Belchenko		Approve on 2009-12-21
Review via email: mp+16444@code.launchpad.net

This proposal has been superseded by a proposal from 2010-01-06.

Gordon Tyler (doxxx) wrote on 2009-12-21:

Fixed bug 392428 by changing commands.shlex_split_unicode to use win32utils.command_line_to_argv on win32.

I also cleaned up the win32utils.get_unicode_argv function. There was some cruft leftover from what I'm guessing was its previous implementation which used the CommandLineToArgvW Win32 API function. I added error handling as well.

Not sure how to write tests for this. I tested by hand using KDiff3, WinMerge and a Windows command script which calls KDiff3:

bzrdev diff --using "C:\Program Files (x86)\KDiff3\kdiff3.exe"
bzrdev diff --using "C:\Program Files (x86)\WinMerge\WinMergeU.exe"
bzrdev diff --using C:\Tools\diff.cmd

'bzrdev selftest win32utils' still passes.
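
To make the failure mode concrete, here is a minimal sketch (Python 3, standard library only; it is not code from this branch) of what POSIX-style shlex splitting does to a Windows path like the ones above. The variable names are illustrative assumptions.

```python
import shlex

# A tool path like the one in the bug report; raw string so backslashes are literal.
tool = r'C:\Tools\diff.cmd'

# POSIX-mode shlex treats an unquoted backslash as an escape character,
# so each backslash is consumed and the path separators disappear.
posix_split = shlex.split(tool)

# Non-POSIX mode does not treat backslash as an escape (shown only for contrast;
# the branch uses its own win32 splitter instead).
non_posix_split = shlex.split(tool, posix=False)

print(posix_split)      # ['C:Toolsdiff.cmd']
print(non_posix_split)  # ['C:\\Tools\\diff.cmd']
```

This is why routing win32 through a Windows-style splitter, rather than shlex, keeps `--using C:\Tools\diff.cmd` intact.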

Alexander Belchenko (bialix) wrote on 2009-12-21:

For non-windows it's better to use the helper function from commands.py: shlex_split_unicode. But even so it's OK for me.

Thanks!

review: Approve

Gordon Tyler (doxxx) wrote on 2009-12-21:

> For non-windows it's better to use the helper function from commands.py:
> shlex_split_unicode. But even so it's OK for me.

That's the function I modified to use a different codepath on Windows.

John A Meinel (jameinel) wrote on 2009-12-21:

Gordon Tyler wrote:
> Gordon Tyler has proposed merging lp:~doxxx/bzr/392428 into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
> Related bugs:
> #392428 `bzr diff --using C:\foo\bar` does not work
> https://bugs.launchpad.net/bugs/392428
>
>
> Fixed bug 392428 by changing commands.shlex_split_unicode to use win32utils.command_line_to_argv on win32.
>
> I also cleaned up the win32utils.get_unicode_argv function. There was some cruft leftover from what I'm guessing was its previous implementation which used the CommandLineToArgvW Win32 API function. I added error handling as well.
>
> Not sure how to write tests for this. I tested by hand using KDiff3, WinMerge and a Windows command script which calls KDiff3:
>
> bzrdev diff --using "C:\Program Files (x86)\KDiff3\kdiff3.exe"
> bzrdev diff --using "C:\Program Files (x86)\WinMerge\WinMergeU.exe"
> bzrdev diff --using C:\Tools\diff.cmd
>
> 'bzrdev selftest win32utils' still passes.
>
>

I think command_line_to_argv is going to glob, and you probably don't
want that. It may not matter in practice, but it's a thought.

The other changes seem gratuitous, but not specifically wrong.

As for testing...

1) Make sure any existing tests of 'shlex_split_unicode' aren't being
skipped on windows

2) Probably add a win32 specific test that when splitting "C:\foo\bar"
we preserve the '\' characters.

I'd like to see at least a smoke test added, but the code is simple
enough you don't have to be super thorough.

review: needsfixing

John
=:->
review: Needs Fixing

Gordon Tyler (doxxx) wrote on 2009-12-23:

So it would seem that UnicodeShlex in win32utils.py doesn't handle single quotes properly, which is making some tests in test_commands.py fail. Adding support for single quotes to UnicodeShlex makes a lot of the tests in test_win32utils fail because they're expecting UnicodeShlex to not handle single quotes. Once I've fixed those, I think everything should be good.

John A Meinel (jameinel) wrote on 2009-12-23:

Gordon Tyler wrote:
> So it would seem that UnicodeShlex in win32utils.py doesn't handle single quotes properly, which is making some tests in test_commands.py fail. Adding support for single quotes to UnicodeShlex makes a lot of the tests in test_win32utils fail because they're expecting UnicodeShlex to not handle single quotes. Once I've fixed those, I think everything should be good.

UnicodeShlex explicitly does not deal with ' because I was asked not to.
Alexander Belchenko feels that single quotes are not "Windows". (As
people can do:
touch don'tdothis
bzr add don'tdothis
)

*I* wanted ' to be a quote char. Just to mention that adding them in is
not just a bug, it is by design that we don't handle them.

John
=:->
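
The apostrophe concern can be illustrated with a small sketch (Python 3, standard library only; an illustration of POSIX-style quoting, not of UnicodeShlex itself). When `'` is a quote character, a filename containing an apostrophe opens a quotation that never closes.

```python
import shlex

# The command from the example above, with an apostrophe in the filename.
line = "bzr add don'tdothis"

try:
    shlex.split(line)
    error = None
except ValueError as exc:
    # POSIX-style splitting sees an unterminated single-quoted string.
    error = str(exc)

# Wrapping the name in double quotes keeps the apostrophe literal.
quoted_ok = shlex.split('bzr add "don\'tdothis"')

print(error)      # No closing quotation
print(quoted_ok)  # ['bzr', 'add', "don'tdothis"]
```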

Martin Pool (mbp) wrote on 2009-12-23:

2009/12/24 Gordon Tyler <email address hidden>:
> So it would seem that UnicodeShlex in win32utils.py doesn't handle single quotes properly, which is making some tests in test_commands.py fail. Adding support for single quotes to UnicodeShlex makes a lot of the tests in test_win32utils fail because they're expecting UnicodeShlex to not handle single quotes. Once I've fixed those, I think everything should be good.

Which ones? I don't see anything obviously relevant.

--
Martin <http://launchpad.net/~mbp/>

Gordon Tyler (doxxx) wrote on 2009-12-24:

C:\dev\bzr\392428>bzrdev selftest commands
testing: C:/dev/bzr/392428/bzr
C:\dev\bzr\392428\bzrlib
bzr-2.1.0dev5 python-2.6.4 Windows-Vista-6.0.6001-SP1

======================================================================
FAIL: test_single_quotes (bzrlib.tests.test_commands.TestGetAlias)
----------------------------------------------------------------------
Traceback (most recent call last):
File "C:\dev\bzr\392428\bzrlib\tests\test_commands.py", line 102, in test_single_quotes
commands.get_alias("diff", config=my_config))
AssertionError: not equal:
a = [u'diff', u'-r', u'-2..-1', u'--diff-options', u'--strip-trailing-cr -wp']
b = [u'diff',
u'-r',
u'-2..-1',
u'--diff-options',
u"'--strip-trailing-cr",
u"-wp'"]

======================================================================
FAIL: test_unicode (bzrlib.tests.test_commands.TestGetAlias)
----------------------------------------------------------------------
Traceback (most recent call last):
File "C:\dev\bzr\392428\bzrlib\tests\test_commands.py", line 116, in test_unicode
commands.get_alias("iam", config=my_config))
AssertionError: not equal:
a = [u'whoami', u'Erik B\xe5gfors <email address hidden>']
b = [u'whoami', u"'Erik", u'B\xe5gfors', u"<email address hidden>'"]

----------------------------------------------------------------------
Ran 41 tests in 0.547s

FAILED (failures=2)
1 test skipped
Missing feature 'FTPServer' skipped 18 tests.

Martin Pool (mbp) wrote on 2009-12-24:

OK.

So I think that unless we change the approach to quoting on Windows,
we probably need to make the tests skip on Windows, and add a comment
about why.

The fact that single quotes are accepted in configuration files (as
opposed to the command line) on unix but not on Windows is a bit ugly.

--
Martin <http://launchpad.net/~mbp/>

Gordon Tyler (doxxx) wrote on 2009-12-24:

I've pushed an update which makes test_single_quotes skip on win32 and fixes test_unicode for win32 by making it use double-quotes instead of single-quotes (the type of quote doesn't really matter for that specific test).

So now both test_win32utils and test_commands pass. I'm running full selftest now to see if anything else related fails.

Alexander Belchenko (bialix) wrote on 2009-12-24:

These two failures are wrong in themselves and should not be affected by
single quote vs double quote differences.

Gordon Tyler wrote:
> C:\dev\bzr\392428>bzrdev selftest commands
> testing: C:/dev/bzr/392428/bzr
> C:\dev\bzr\392428\bzrlib
> bzr-2.1.0dev5 python-2.6.4 Windows-Vista-6.0.6001-SP1
>
>
> ======================================================================
> FAIL: test_single_quotes (bzrlib.tests.test_commands.TestGetAlias)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> File "C:\dev\bzr\392428\bzrlib\tests\test_commands.py", line 102, in test_single_quotes
> commands.get_alias("diff", config=my_config))
> AssertionError: not equal:
> a = [u'diff', u'-r', u'-2..-1', u'--diff-options', u'--strip-trailing-cr -wp']
> b = [u'diff',
> u'-r',
> u'-2..-1',
> u'--diff-options',
> u"'--strip-trailing-cr",
> u"-wp'"]
>
>
> ======================================================================
> FAIL: test_unicode (bzrlib.tests.test_commands.TestGetAlias)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> File "C:\dev\bzr\392428\bzrlib\tests\test_commands.py", line 116, in test_unicode
> commands.get_alias("iam", config=my_config))
> AssertionError: not equal:
> a = [u'whoami', u'Erik B\xe5gfors <email address hidden>']
> b = [u'whoami', u"'Erik", u'B\xe5gfors', u"<email address hidden>'"]
>
>
> ----------------------------------------------------------------------
> Ran 41 tests in 0.547s
>
> FAILED (failures=2)
> 1 test skipped
> Missing feature 'FTPServer' skipped 18 tests.
>

Gordon Tyler (doxxx) wrote on 2009-12-24:

> These two failures are wrong in themselves and should not be affected by
> single quote vs double quote differences.

Okay, I'll see about fixing them.

Gordon Tyler (doxxx) wrote on 2009-12-24:

UnicodeShlex doesn't handle escaping a space with a backslash, so test_from_string_u5 in test_diff.TestDiffFromTool is failing with the modified shlex_split_unicode codepath for win32. So that's another thing to be fixed.
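
For reference, a minimal sketch (Python 3, standard library only) of the POSIX behaviour that test relies on: a backslash-escaped space joins two words into one argument, and double quotes give the same result without any backslash. The variable names are illustrative.

```python
import shlex

# Backslash-escaped space, as used by the POSIX variant of the test.
escaped = shlex.split(r'diff -u\ 5')

# Double-quoted form, which does not depend on backslash escaping.
quoted = shlex.split('diff "-u 5"')

print(escaped)  # ['diff', '-u 5']
print(quoted)   # ['diff', '-u 5']
```

The double-quoted spelling is the one that can also work under Windows-style splitting, where a backslash before a space is just a literal backslash.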

Gordon Tyler (doxxx) wrote on 2010-01-02:

I've written a replacement for UnicodeShlex which duplicates the way the Win32 API function CommandLineToArgvW handles splitting a commandline into arguments, but with the added functionality of indicating which arguments are quoted, like UnicodeShlex did.
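
The backslash and quote rules involved can be sketched compactly. The following is a simplified, hypothetical Python 3 illustration, not the state-machine implementation in this branch: `split_windows_command_line` is a name invented here, and it covers only the three backslash rules (2N backslashes before `"` give N backslashes, 2N+1 give N backslashes plus a literal `"`, and backslashes not followed by `"` are literal) plus whitespace splitting outside quotes.

```python
def split_windows_command_line(line):
    """Return a list of (quoted, token) pairs for a command line string."""
    tokens = []
    chars = []          # characters of the token being built
    quoted = False      # did the current token contain a quoted section?
    in_quotes = False   # are we currently inside double quotes?
    in_token = False    # have we started a token?
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == '\\':
            # Count the whole run of backslashes.
            start = i
            while i < len(line) and line[i] == '\\':
                i += 1
            count = i - start
            in_token = True
            if i < len(line) and line[i] == '"':
                # Backslashes before a quote are halved.
                chars.append('\\' * (count // 2))
                if count % 2 == 1:
                    # Odd run: the quote is escaped and becomes literal.
                    chars.append('"')
                    i += 1
                # Even run: leave the quote for the next loop iteration.
            else:
                # Backslashes not followed by a quote are literal.
                chars.append('\\' * count)
            continue
        if ch == '"':
            in_quotes = not in_quotes
            quoted = True
            in_token = True
        elif ch.isspace() and not in_quotes:
            if in_token:
                tokens.append((quoted, ''.join(chars)))
                chars, quoted, in_token = [], False, False
        else:
            chars.append(ch)
            in_token = True
        i += 1
    if in_token:
        tokens.append((quoted, ''.join(chars)))
    return tokens


print(split_windows_command_line(r'C:\Tools\Diff.exe "x x"'))
# [(False, 'C:\\Tools\\Diff.exe'), (True, 'x x')]
```

Returning the quoted flag alongside each token lets a caller skip wildcard expansion for quoted arguments, which is the property the replacement class preserves.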

Gordon Tyler (doxxx) wrote on 2010-01-02:

And I've just added a test to test_diff which actually tests the scenario in bug 392428.

Preview Diff

 === modified file 'bzrlib/commands.py'
 --- bzrlib/commands.py	2009-12-09 05:47:32 +0000
 +++ bzrlib/commands.py	2010-01-02 01:26:16 +0000
@@ -869,8 +869,12 @@
  def shlex_split_unicode(unsplit):
--    import shlex
--    return [u.decode('utf-8') for u in shlex.split(unsplit.encode('utf-8'))]
++    if sys.platform == "win32":
++        from bzrlib.win32utils import command_line_to_argv
++        return command_line_to_argv(unsplit, wildcard_expansion=False)
++    else:
++        import shlex
++        return [u.decode('utf-8') for u in shlex.split(unsplit.encode('utf-8'))]
  def get_alias(cmd, config=None):
 === modified file 'bzrlib/tests/test_commands.py'
 --- bzrlib/tests/test_commands.py	2009-05-23 21:01:51 +0000
 +++ bzrlib/tests/test_commands.py	2010-01-02 01:26:16 +0000
@@ -94,6 +94,9 @@
              commands.get_alias("diff", config=my_config))
      def test_single_quotes(self):
++        if sys.platform == 'win32':
++            raise TestSkipped("commandline parsing on win32 does not "
++                              "support single quotes")
          my_config = self._get_config("[ALIASES]\n"
              "diff=diff -r -2..-1 --diff-options "
              "'--strip-trailing-cr -wp'\n")
@@ -111,7 +114,7 @@
      def test_unicode(self):
          my_config = self._get_config("[ALIASES]\n"
--            u"iam=whoami 'Erik B\u00e5gfors <erik@bagfors.nu>'\n")
++            u'iam=whoami "Erik B\u00e5gfors <erik@bagfors.nu>"\n')
          self.assertEqual([u'whoami', u'Erik B\u00e5gfors <erik@bagfors.nu>'],
                            commands.get_alias("iam", config=my_config))
 === modified file 'bzrlib/tests/test_diff.py'
 --- bzrlib/tests/test_diff.py	2009-12-22 15:50:40 +0000
 +++ bzrlib/tests/test_diff.py	2010-01-02 01:26:16 +0000
@@ -45,6 +45,8 @@
  from bzrlib.revisiontree import RevisionTree
  from bzrlib.revisionspec import RevisionSpec
++from bzrlib.tests.test_win32utils import BackslashDirSeparatorFeature
++
  class _AttribFeature(Feature):
@@ -1292,12 +1294,27 @@
              diff_obj.command_template)
      def test_from_string_u5(self):
--        diff_obj = DiffFromTool.from_string('diff -u\\ 5', None, None, None)
++        # win32 doesn't support escaping spaces with backslashes
++        if sys.platform == 'win32':
++            tool = 'diff "-u 5"'
++        else:
++            tool = 'diff -u\\ 5'
++        diff_obj = DiffFromTool.from_string(tool, None, None, None)
          self.addCleanup(diff_obj.finish)
          self.assertEqual(['diff', '-u 5', '@old_path', '@new_path'],
                           diff_obj.command_template)
          self.assertEqual(['diff', '-u 5', 'old-path', 'new-path'],
                           diff_obj._get_command('old-path', 'new-path'))
++
++    def test_from_string_path_with_backslashes(self):
++        self.requireFeature(BackslashDirSeparatorFeature)
++        tool = 'C:\\Tools\\Diff.exe'
++        diff_obj = DiffFromTool.from_string(tool, None, None, None)
++        self.addCleanup(diff_obj.finish)
++        self.assertEqual(['C:\\Tools\\Diff.exe', '@old_path', '@new_path'],
++                         diff_obj.command_template)
++        self.assertEqual(['C:\\Tools\\Diff.exe', 'old-path', 'new-path'],
++                         diff_obj._get_command('old-path', 'new-path'))
      def test_execute(self):
          output = StringIO()
 === modified file 'bzrlib/tests/test_rules.py'
 --- bzrlib/tests/test_rules.py	2009-03-23 14:59:43 +0000
 +++ bzrlib/tests/test_rules.py	2010-01-02 01:26:16 +0000
@@ -63,8 +63,12 @@
              rs.get_selected_items('a.txt', ['foo']))
      def test_get_items_from_multiple_glob_match(self):
--        rs = self.make_searcher(
--            "[name *.txt *.py 'x x' \"y y\"]\nfoo=bar\na=True\n")
++        # win32 doesn't support single quotes for quoting args with spaces
++        if sys.platform == 'win32':
++            text = '[name *.txt *.py "x x" "y y"]\nfoo=bar\na=True\n'
++        else:
++            text = """[name *.txt *.py 'x x' "y y"]\nfoo=bar\na=True\n"""
++        rs = self.make_searcher(text)
          self.assertEquals((), rs.get_items('NEWS'))
          self.assertEquals((('foo', 'bar'), ('a', 'True')),
              rs.get_items('a.py'))
 === modified file 'bzrlib/tests/test_win32utils.py'
 --- bzrlib/tests/test_win32utils.py	2009-11-20 16:42:28 +0000
 +++ bzrlib/tests/test_win32utils.py	2010-01-02 01:26:16 +0000
@@ -291,11 +291,10 @@
          win32utils.set_file_attr_hidden(path)
--
--class TestUnicodeShlex(tests.TestCase):
++class TestUnicodeCommandLineSplitter(tests.TestCase):
      def assertAsTokens(self, expected, line):
--        s = win32utils.UnicodeShlex(line)
++        s = win32utils.UnicodeCommandLineSplitter(line)
          self.assertEqual(expected, list(s))
      def test_simple(self):
@@ -311,14 +310,6 @@
      def test_ignore_trailing_space(self):
          self.assertAsTokens([(False, u'foo'), (False, u'bar')], u'foo bar  ')
--    def test_posix_quotations(self):
--        self.assertAsTokens([(True, u'foo bar')], u'"foo bar"')
--        self.assertAsTokens([(False, u"'fo''o"), (False, u"b''ar'")],
--            u"'fo''o b''ar'")
--        self.assertAsTokens([(True, u'foo bar')], u'"fo""o b""ar"')
--        self.assertAsTokens([(True, u"fo'o"), (True, u"b'ar")],
--            u'"fo"\'o b\'"ar"')
--
      def test_nested_quotations(self):
          self.assertAsTokens([(True, u'foo"" bar')], u"\"foo\\\"\\\" bar\"")
          self.assertAsTokens([(True, u'foo\'\' bar')], u"\"foo'' bar\"")
@@ -343,10 +334,16 @@
      def test_escape_quote(self):
          self.assertAsTokens([(True, u'foo"bar')], u'"foo\\"bar"')
++        self.assertAsTokens([(True, u'foo\\"bar')], u'"foo\\\\\\"bar"')
++        self.assertAsTokens([(True, u'foo\\bar')], u'"foo\\\\"bar"')
      def test_double_escape(self):
--        self.assertAsTokens([(True, u'foo\\bar')], u'"foo\\\\bar"')
++        self.assertAsTokens([(True, u'foo\\\\bar')], u'"foo\\\\bar"')
          self.assertAsTokens([(False, u'foo\\\\bar')], u"foo\\\\bar")
++
++    def test_multiple_quoted_args(self):
++        self.assertAsTokens([(True, u'x x'), (True, u'y y')],
++            u'"x x" "y y"')
  class Test_CommandLineToArgv(tests.TestCaseInTempDir):
@@ -355,7 +352,7 @@
          # Strictly speaking we should respect parameter order versus glob
          # expansions, but it's not really worth the effort here
          self.assertEqual(expected,
--                         sorted(win32utils._command_line_to_argv(line)))
++                         sorted(win32utils.command_line_to_argv(line)))
      def test_glob_paths(self):
          self.build_tree(['a/', 'a/b.c', 'a/c.c', 'a/c.h'])
 === modified file 'bzrlib/win32utils.py'
 --- bzrlib/win32utils.py	2009-12-16 06:38:15 +0000
 +++ bzrlib/win32utils.py	2010-01-02 01:26:16 +0000
@@ -517,117 +517,147 @@
              trace.mutter('Unable to set hidden attribute on %r: %s', path, e)
--
--class UnicodeShlex(object):
--    """This is a very simplified version of shlex.shlex.
--
--    The main change is that it supports non-ascii input streams. The internal
--    structure is quite simplified relative to shlex.shlex, since we aren't
--    trying to handle multiple input streams, etc. In fact, we don't use a
--    file-like api either.
--    """
--
--    def __init__(self, uni_string):
--        self._input = uni_string
--        self._input_iter = iter(self._input)
--        self._whitespace_match = re.compile(u'\s').match
--        self._word_match = re.compile(u'\S').match
--        self._quote_chars = u'"'
--        # self._quote_match = re.compile(u'[\'"]').match
--        self._escape_match = lambda x: None # Never matches
--        self._escape = '\\'
--        # State can be
--        #   ' ' - after whitespace, starting a new token
--        #   'a' - after text, currently working on a token
--        #   '"' - after ", currently in a "-delimited quoted section
--        #   "\" - after '\', checking the next char
--        self._state = ' '
--        self._token = [] # Current token being parsed
--
--    def _get_token(self):
--        # Were there quote chars as part of this token?
--        quoted = False
--        quoted_state = None
--        for nextchar in self._input_iter:
--            if self._state == ' ':
--                if self._whitespace_match(nextchar):
--                    # if self._token: return token
--                    continue
--                elif nextchar in self._quote_chars:
--                    self._state = nextchar # quoted state
--                elif self._word_match(nextchar):
--                    self._token.append(nextchar)
--                    self._state = 'a'
--                else:
--                    raise AssertionError('wtttf?')
--            elif self._state in self._quote_chars:
--                quoted = True
--                if nextchar == self._state: # End of quote
--                    self._state = 'a' # posix allows 'foo'bar to translate to
--                                      # foobar
--                elif self._state == '"' and nextchar == self._escape:
--                    quoted_state = self._state
--                    self._state = nextchar
--                else:
--                    self._token.append(nextchar)
--            elif self._state == self._escape:
--                if nextchar == '\\':
--                    self._token.append('\\')
--                elif nextchar == '"':
--                    self._token.append(nextchar)
--                else:
--                    self._token.append('\\' + nextchar)
--                self._state = quoted_state
--            elif self._state == 'a':
--                if self._whitespace_match(nextchar):
--                    if self._token:
--                        break # emit this token
--                    else:
--                        continue # no token to emit
--                elif nextchar in self._quote_chars:
--                    # Start a new quoted section
--                    self._state = nextchar
--                # escape?
--                elif (self._word_match(nextchar)
--                      or nextchar in self._quote_chars
--                      # or whitespace_split?
--                      ):
--                    self._token.append(nextchar)
--                else:
--                    raise AssertionError('state == "a", char: %r'
--                                         % (nextchar,))
--            else:
--                raise AssertionError('unknown state: %r' % (self._state,))
--        result = ''.join(self._token)
--        self._token = []
--        if not quoted and result == '':
--            result = None
--        return quoted, result
--
--    def __iter__(self):
--        return self
--
++class _PushbackSequence(object):
++    def __init__(self, orig):
++        self._iter = iter(orig)
++        self._pushback_buffer = []
++
++    def next(self):
++        if len(self._pushback_buffer) > 0:
++            return self._pushback_buffer.pop()
++        else:
++            return self._iter.next()
++
++    def pushback(self, char):
++        self._pushback_buffer.append(char)
++
++    def __iter__(self):
++        return self
++
++class _Whitespace(object):
++    def process(self, next_char, seq, context):
++        if _whitespace_match(next_char):
++            if len(context.token) > 0:
++                return None
++            else:
++                return self
++        elif next_char == u'"':
++            context.quoted = True
++            return _Quotes(self)
++        elif next_char == u'\\':
++            return _Backslash(self)
++        else:
++            context.token.append(next_char)
++            return _Word()
++
++class _Quotes(object):
++    def __init__(self, exit_state):
++        self.exit_state = exit_state
++
++    def process(self, next_char, seq, context):
++        if next_char == u'\\':
++            return _Backslash(self)
++        elif next_char == u'"':
++            return self.exit_state
++        else:
++            context.token.append(next_char)
++            return self
++
++class _Backslash(object):
++    # See http://msdn.microsoft.com/en-us/library/bb776391(VS.85).aspx
++    def __init__(self, exit_state):
++        self.exit_state = exit_state
++        self.count = 1
++
++    def process(self, next_char, seq, context):
++        if next_char == u'\\':
++            self.count += 1
++            return self
++        elif next_char == u'"':
++            # 2N backslashes followed by '"' are N backslashes
++            context.token.append(u'\\' * (self.count/2))
++            # 2N+1 backslashes followed by '"' are N backslashes followed by '"'
++            # which should not be processed as the start or end of quoted arg
++            if self.count % 2 == 1:
++                context.token.append(next_char) # odd number of '\' escapes the '"'
++            else:
++                seq.pushback(next_char) # let exit_state handle next_char
++            self.count = 0
++            return self.exit_state
++        else:
++            # N backslashes not followed by '"' are just N backslashes
++            if self.count > 0:
++                context.token.append(u'\\' * self.count)
++                self.count = 0
++            seq.pushback(next_char) # let exit_state handle next_char
++            return self.exit_state
++
++    def finish(self, context):
++        if self.count > 0:
++            context.token.append(u'\\' * self.count)
++
++
++class _Word(object):
++    def process(self, next_char, seq, context):
++        if _whitespace_match(next_char):
++            return None
++        elif next_char == u'"':
++            return _Quotes(self)
++        elif next_char == u'\\':
++            return _Backslash(self)
++        else:
++            context.token.append(next_char)
++            return self
++
++_whitespace_match = re.compile(u'\s').match
++class UnicodeCommandLineSplitter(object):
++    def __init__(self, command_line):
++        self._seq = _PushbackSequence(command_line)
++
++    def __iter__(self):
++        return self
++
      def next(self):
          quoted, token = self._get_token()
          if token is None:
              raise StopIteration
          return quoted, token
--
--
--def _command_line_to_argv(command_line):
--    """Convert a Unicode command line into a set of argv arguments.
--
--    This does wildcard expansion, etc. It is intended to make wildcards act
--    closer to how they work in posix shells, versus how they work by default on
--    Windows.
++
++    def _get_token(self):
++        self.quoted = False
++        self.token = []
++        state = _Whitespace()
++        for next_char in self._seq:
++            state = state.process(next_char, self._seq, self)
++            if state is None:
++                break
++        if not state is None and not getattr(state, 'finish', None) is None:
++            state.finish(self)
++        result = u''.join(self.token)
++        if not self.quoted and result == '':
++            result = None
++        return self.quoted, result
++
++
++def command_line_to_argv(command_line, wildcard_expansion=True):
++    """Convert a Unicode command line into a list of argv arguments.
++
++    This optionally does wildcard expansion, etc. It is intended to make
++    wildcards act closer to how they work in posix shells, versus how they
++    work by default on Windows. Quoted arguments are left untouched.
++
++    :param command_line: The unicode string to split into an arg list.
++    :param wildcard_expansion: Whether wildcard expansion should be applied to
++                               each argument. True by default.
++    :return: A list of unicode strings.
      """
--    s = UnicodeShlex(command_line)
--    # Now that we've split the content, expand globs
++    s = UnicodeCommandLineSplitter(command_line)
++    # Now that we've split the content, expand globs if necessary
      # TODO: Use 'globbing' instead of 'glob.glob', this gives us stuff like
      #       '**/' style globs
      args = []
      for is_quoted, arg in s:
--        if is_quoted or not glob.has_magic(arg):
++        if is_quoted or not glob.has_magic(arg) or not wildcard_expansion:
              args.append(arg)
          else:
              args.extend(glob_one(arg))
@@ -636,16 +666,14 @@
  if has_ctypes and winver != 'Windows 98':
      def get_unicode_argv():
--        LPCWSTR = ctypes.c_wchar_p
--        INT = ctypes.c_int
--        POINTER = ctypes.POINTER
--        prototype = ctypes.WINFUNCTYPE(LPCWSTR)
--        GetCommandLine = prototype(("GetCommandLineW",
--                                    ctypes.windll.kernel32))
--        prototype = ctypes.WINFUNCTYPE(POINTER(LPCWSTR), LPCWSTR, POINTER(INT))
--        command_line = GetCommandLine()
++        prototype = ctypes.WINFUNCTYPE(ctypes.c_wchar_p, use_last_error=True)
++        GetCommandLineW = prototype(("GetCommandLineW",
++                                     ctypes.windll.kernel32))
++        command_line = GetCommandLineW()
++        if command_line is None:
++            raise ctypes.WinError()
          # Skip the first argument, since we only care about parameters
--        argv = _command_line_to_argv(command_line)[1:]
++        argv = command_line_to_argv(command_line)[1:]
          if getattr(sys, 'frozen', None) is None:
              # Invoked via 'python.exe' which takes the form:
              #   python.exe [PYTHON_OPTIONS] C:\Path\bzr [BZR_OPTIONS]