Status: | Superseded | ||||
---|---|---|---|---|---|
Proposed branch: | lp:~doxxx/bzr/392428 | ||||
Merge into: | lp:bzr | ||||
Diff against target: |
439 lines (+182/-129) 6 files modified
bzrlib/commands.py (+6/-2) bzrlib/tests/test_commands.py (+4/-1) bzrlib/tests/test_diff.py (+18/-1) bzrlib/tests/test_rules.py (+6/-2) bzrlib/tests/test_win32utils.py (+10/-13) bzrlib/win32utils.py (+138/-110) |
||||
To merge this branch: | bzr merge lp:~doxxx/bzr/392428 | ||||
Related bugs: |
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
John A Meinel | Needs Fixing | ||
Alexander Belchenko | Approve | ||
Review via email: mp+16444@code.launchpad.net |
This proposal has been superseded by a proposal from 2010-01-06.
Commit message
Description of the change
Gordon Tyler (doxxx) wrote : | # |
Alexander Belchenko (bialix) wrote : | # |
For non-windows it's better to use the helper function from commands.py: shlex_split_
Thanks!
Gordon Tyler (doxxx) wrote : | # |
> For non-windows it's better to use the helper function from commands.py:
> shlex_split_
That's the function I modified to use a different codepath on Windows.
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Gordon Tyler wrote:
> Gordon Tyler has proposed merging lp:~doxxx/bzr/392428 into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
> Related bugs:
> #392428 `bzr diff --using C:\foo\bar` does not work
> https:/
>
>
> Fixed bug 392428 by changing commands.
>
> I also cleaned up the win32utils.
>
> Not sure how to write tests for this. I tested by hand using KDiff3, WinMerge and a Windows command script which calls KDiff3:
>
> bzrdev diff --using "C:\Program Files (x86)\KDiff3\
> bzrdev diff --using "C:\Program Files (x86)\WinMerge\
> bzrdev diff --using C:\Tools\diff.cmd
>
> 'bzrdev selftest win32utils' still passes.
>
>
I think command_
want that. It may not matter in practice, but its a thought.
The other changes seem gratuitous, but not specifically wrong.
As for testing...
1) Make sure any existing tests of 'shlex_
skipped on windows
2) Probably add a win32 specific test that when splitting "C:\foo\bar"
we preserve the '\' characters.
I'd like to see at least a smoke test added, but the code is simple
enough you don't have to be super thorough.
review: needsfixing
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAks
uLEAoNmmJOXEPjw
=ml5g
-----END PGP SIGNATURE-----
Gordon Tyler (doxxx) wrote : | # |
So it would seem that UnicodeShlex in win32utils.py doesn't handle single quotes properly which is making some tests in test_commands.py fail. Adding support for single quites to UnicodeShlex makes a lot of the tests in test_win32utils fail because they're expecting UnicodeShlex to not handle single quotes. Once I've fixed those, I think everything should be good.
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Gordon Tyler wrote:
> So it would seem that UnicodeShlex in win32utils.py doesn't handle single quotes properly which is making some tests in test_commands.py fail. Adding support for single quites to UnicodeShlex makes a lot of the tests in test_win32utils fail because they're expecting UnicodeShlex to not handle single quotes. Once I've fixed those, I think everything should be good.
UnicodeShlex explicitly does not deal with ' because I was asked not to.
Alexander Belchenko feels that single quotes are not "Windows". (As
people can do:
touch don'tdothis
bzr add don'tdothis
)
*I* wanted ' to be a quote char. Just to mention that adding them in is
not just a bug, it is by design that we don't handle them.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAks
wq4An2wJzn9ifGl
=cadk
-----END PGP SIGNATURE-----
Martin Pool (mbp) wrote : | # |
2009/12/24 Gordon Tyler <email address hidden>:
> So it would seem that UnicodeShlex in win32utils.py doesn't handle single quotes properly which is making some tests in test_commands.py fail. Adding support for single quites to UnicodeShlex makes a lot of the tests in test_win32utils fail because they're expecting UnicodeShlex to not handle single quotes. Once I've fixed those, I think everything should be good.
Which ones? I don't see anything obviously relevant.
--
Martin <http://
Gordon Tyler (doxxx) wrote : | # |
C:\dev\
testing: C:/dev/
C:\dev\
bzr-2.1.0dev5 python-2.6.4 Windows-
=======
FAIL: test_single_quotes (bzrlib.
-------
Traceback (most recent call last):
File "C:\dev\
le_quotes
commands.
AssertionError: not equal:
a = [u'diff', u'-r', u'-2..-1', u'--diff-options', u'--strip-
b = [u'diff',
u'-r',
u'-2..-1',
u'--diff-options',
u"'--strip-
u"-wp'"]
=======
FAIL: test_unicode (bzrlib.
-------
Traceback (most recent call last):
File "C:\dev\
ode
commands.
AssertionError: not equal:
a = [u'whoami', u'Erik B\xe5gfors <email address hidden>']
b = [u'whoami', u"'Erik", u'B\xe5gfors', u"<email address hidden>'"]
-------
Ran 41 tests in 0.547s
FAILED (failures=2)
1 test skipped
Missing feature 'FTPServer' skipped 18 tests.
Martin Pool (mbp) wrote : | # |
OK.
So I think that unless we change the approach to quoting on Windows,
we probably need to make the tests skip on Windows, and add a comment
about why.
The fact that single quotes are accepted in configuration files (as
opposed to the command line) on unix but not on Windows is a bit ugly.
--
Martin <http://
Gordon Tyler (doxxx) wrote : | # |
I've pushed an update which makes test_single_quotes skip on win32 and fixes test_unicode for win32 by making it use double-quotes instead of single-quotes (the type of quote doesn't really matter for that specific test).
So now both test_win32utils and test_commands pass. I'm running full selftest now to see if anything else related fails.
Alexander Belchenko (bialix) wrote : | # |
These two failures are wrong by itself and should not be affected by
single quote vs double quote differences.
Gordon Tyler пишет:
> C:\dev\
> testing: C:/dev/
> C:\dev\
> bzr-2.1.0dev5 python-2.6.4 Windows-
>
>
> =======
> FAIL: test_single_quotes (bzrlib.
> -------
> Traceback (most recent call last):
> File "C:\dev\
> le_quotes
> commands.
> AssertionError: not equal:
> a = [u'diff', u'-r', u'-2..-1', u'--diff-options', u'--strip-
> b = [u'diff',
> u'-r',
> u'-2..-1',
> u'--diff-options',
> u"'--strip-
> u"-wp'"]
>
>
> =======
> FAIL: test_unicode (bzrlib.
> -------
> Traceback (most recent call last):
> File "C:\dev\
> ode
> commands.
> AssertionError: not equal:
> a = [u'whoami', u'Erik B\xe5gfors <email address hidden>']
> b = [u'whoami', u"'Erik", u'B\xe5gfors', u"<email address hidden>'"]
>
>
> -------
> Ran 41 tests in 0.547s
>
> FAILED (failures=2)
> 1 test skipped
> Missing feature 'FTPServer' skipped 18 tests.
>
Gordon Tyler (doxxx) wrote : | # |
> These two failures are wrong by itself and should not be affected by
> single quote vs double quote differences.
Okay, I'll see about fixing them.
Gordon Tyler (doxxx) wrote : | # |
UnicodeShlex doesn't handling escaping a space, so test_from_string_u5 in test_diff.
Gordon Tyler (doxxx) wrote : | # |
I've written a replacement for UnicodeShlex which duplicates the way the Win32 API function CommandLineToArgvW handles splitting a commandline into arguments, but with the added functionality of indicating which arguments are quoted, like UnicodeShlex did.
Gordon Tyler (doxxx) wrote : | # |
And I've just added a test to test_diff which actually tests the scenario in bug 392428.
Preview Diff
1 | === modified file 'bzrlib/commands.py' |
2 | --- bzrlib/commands.py 2009-12-09 05:47:32 +0000 |
3 | +++ bzrlib/commands.py 2010-01-02 01:26:16 +0000 |
4 | @@ -869,8 +869,12 @@ |
5 | |
6 | |
7 | def shlex_split_unicode(unsplit): |
8 | - import shlex |
9 | - return [u.decode('utf-8') for u in shlex.split(unsplit.encode('utf-8'))] |
10 | + if sys.platform == "win32": |
11 | + from bzrlib.win32utils import command_line_to_argv |
12 | + return command_line_to_argv(unsplit, wildcard_expansion=False) |
13 | + else: |
14 | + import shlex |
15 | + return [u.decode('utf-8') for u in shlex.split(unsplit.encode('utf-8'))] |
16 | |
17 | |
18 | def get_alias(cmd, config=None): |
19 | |
20 | === modified file 'bzrlib/tests/test_commands.py' |
21 | --- bzrlib/tests/test_commands.py 2009-05-23 21:01:51 +0000 |
22 | +++ bzrlib/tests/test_commands.py 2010-01-02 01:26:16 +0000 |
23 | @@ -94,6 +94,9 @@ |
24 | commands.get_alias("diff", config=my_config)) |
25 | |
26 | def test_single_quotes(self): |
27 | + if sys.platform == 'win32': |
28 | + raise TestSkipped("commandline parsing on win32 does not " |
29 | + "support single quotes") |
30 | my_config = self._get_config("[ALIASES]\n" |
31 | "diff=diff -r -2..-1 --diff-options " |
32 | "'--strip-trailing-cr -wp'\n") |
33 | @@ -111,7 +114,7 @@ |
34 | |
35 | def test_unicode(self): |
36 | my_config = self._get_config("[ALIASES]\n" |
37 | - u"iam=whoami 'Erik B\u00e5gfors <erik@bagfors.nu>'\n") |
38 | + u'iam=whoami "Erik B\u00e5gfors <erik@bagfors.nu>"\n') |
39 | self.assertEqual([u'whoami', u'Erik B\u00e5gfors <erik@bagfors.nu>'], |
40 | commands.get_alias("iam", config=my_config)) |
41 | |
42 | |
43 | === modified file 'bzrlib/tests/test_diff.py' |
44 | --- bzrlib/tests/test_diff.py 2009-12-22 15:50:40 +0000 |
45 | +++ bzrlib/tests/test_diff.py 2010-01-02 01:26:16 +0000 |
46 | @@ -45,6 +45,8 @@ |
47 | from bzrlib.revisiontree import RevisionTree |
48 | from bzrlib.revisionspec import RevisionSpec |
49 | |
50 | +from bzrlib.tests.test_win32utils import BackslashDirSeparatorFeature |
51 | + |
52 | |
53 | class _AttribFeature(Feature): |
54 | |
55 | @@ -1292,12 +1294,27 @@ |
56 | diff_obj.command_template) |
57 | |
58 | def test_from_string_u5(self): |
59 | - diff_obj = DiffFromTool.from_string('diff -u\\ 5', None, None, None) |
60 | + # win32 doesn't support escaping spaces with backslashes |
61 | + if sys.platform == 'win32': |
62 | + tool = 'diff "-u 5"' |
63 | + else: |
64 | + tool = 'diff -u\\ 5' |
65 | + diff_obj = DiffFromTool.from_string(tool, None, None, None) |
66 | self.addCleanup(diff_obj.finish) |
67 | self.assertEqual(['diff', '-u 5', '@old_path', '@new_path'], |
68 | diff_obj.command_template) |
69 | self.assertEqual(['diff', '-u 5', 'old-path', 'new-path'], |
70 | diff_obj._get_command('old-path', 'new-path')) |
71 | + |
72 | + def test_from_string_path_with_backslashes(self): |
73 | + self.requireFeature(BackslashDirSeparatorFeature) |
74 | + tool = 'C:\\Tools\\Diff.exe' |
75 | + diff_obj = DiffFromTool.from_string(tool, None, None, None) |
76 | + self.addCleanup(diff_obj.finish) |
77 | + self.assertEqual(['C:\\Tools\\Diff.exe', '@old_path', '@new_path'], |
78 | + diff_obj.command_template) |
79 | + self.assertEqual(['C:\\Tools\\Diff.exe', 'old-path', 'new-path'], |
80 | + diff_obj._get_command('old-path', 'new-path')) |
81 | |
82 | def test_execute(self): |
83 | output = StringIO() |
84 | |
85 | === modified file 'bzrlib/tests/test_rules.py' |
86 | --- bzrlib/tests/test_rules.py 2009-03-23 14:59:43 +0000 |
87 | +++ bzrlib/tests/test_rules.py 2010-01-02 01:26:16 +0000 |
88 | @@ -63,8 +63,12 @@ |
89 | rs.get_selected_items('a.txt', ['foo'])) |
90 | |
91 | def test_get_items_from_multiple_glob_match(self): |
92 | - rs = self.make_searcher( |
93 | - "[name *.txt *.py 'x x' \"y y\"]\nfoo=bar\na=True\n") |
94 | + # win32 doesn't support single quotes for quoting args with spaces |
95 | + if sys.platform == 'win32': |
96 | + text = '[name *.txt *.py "x x" "y y"]\nfoo=bar\na=True\n' |
97 | + else: |
98 | + text = """[name *.txt *.py 'x x' "y y"]\nfoo=bar\na=True\n""" |
99 | + rs = self.make_searcher(text) |
100 | self.assertEquals((), rs.get_items('NEWS')) |
101 | self.assertEquals((('foo', 'bar'), ('a', 'True')), |
102 | rs.get_items('a.py')) |
103 | |
104 | === modified file 'bzrlib/tests/test_win32utils.py' |
105 | --- bzrlib/tests/test_win32utils.py 2009-11-20 16:42:28 +0000 |
106 | +++ bzrlib/tests/test_win32utils.py 2010-01-02 01:26:16 +0000 |
107 | @@ -291,11 +291,10 @@ |
108 | win32utils.set_file_attr_hidden(path) |
109 | |
110 | |
111 | - |
112 | -class TestUnicodeShlex(tests.TestCase): |
113 | +class TestUnicodeCommandLineSplitter(tests.TestCase): |
114 | |
115 | def assertAsTokens(self, expected, line): |
116 | - s = win32utils.UnicodeShlex(line) |
117 | + s = win32utils.UnicodeCommandLineSplitter(line) |
118 | self.assertEqual(expected, list(s)) |
119 | |
120 | def test_simple(self): |
121 | @@ -311,14 +310,6 @@ |
122 | def test_ignore_trailing_space(self): |
123 | self.assertAsTokens([(False, u'foo'), (False, u'bar')], u'foo bar ') |
124 | |
125 | - def test_posix_quotations(self): |
126 | - self.assertAsTokens([(True, u'foo bar')], u'"foo bar"') |
127 | - self.assertAsTokens([(False, u"'fo''o"), (False, u"b''ar'")], |
128 | - u"'fo''o b''ar'") |
129 | - self.assertAsTokens([(True, u'foo bar')], u'"fo""o b""ar"') |
130 | - self.assertAsTokens([(True, u"fo'o"), (True, u"b'ar")], |
131 | - u'"fo"\'o b\'"ar"') |
132 | - |
133 | def test_nested_quotations(self): |
134 | self.assertAsTokens([(True, u'foo"" bar')], u"\"foo\\\"\\\" bar\"") |
135 | self.assertAsTokens([(True, u'foo\'\' bar')], u"\"foo'' bar\"") |
136 | @@ -343,10 +334,16 @@ |
137 | |
138 | def test_escape_quote(self): |
139 | self.assertAsTokens([(True, u'foo"bar')], u'"foo\\"bar"') |
140 | + self.assertAsTokens([(True, u'foo\\"bar')], u'"foo\\\\\\"bar"') |
141 | + self.assertAsTokens([(True, u'foo\\bar')], u'"foo\\\\"bar"') |
142 | |
143 | def test_double_escape(self): |
144 | - self.assertAsTokens([(True, u'foo\\bar')], u'"foo\\\\bar"') |
145 | + self.assertAsTokens([(True, u'foo\\\\bar')], u'"foo\\\\bar"') |
146 | self.assertAsTokens([(False, u'foo\\\\bar')], u"foo\\\\bar") |
147 | + |
148 | + def test_multiple_quoted_args(self): |
149 | + self.assertAsTokens([(True, u'x x'), (True, u'y y')], |
150 | + u'"x x" "y y"') |
151 | |
152 | |
153 | class Test_CommandLineToArgv(tests.TestCaseInTempDir): |
154 | @@ -355,7 +352,7 @@ |
155 | # Strictly speaking we should respect parameter order versus glob |
156 | # expansions, but it's not really worth the effort here |
157 | self.assertEqual(expected, |
158 | - sorted(win32utils._command_line_to_argv(line))) |
159 | + sorted(win32utils.command_line_to_argv(line))) |
160 | |
161 | def test_glob_paths(self): |
162 | self.build_tree(['a/', 'a/b.c', 'a/c.c', 'a/c.h']) |
163 | |
164 | === modified file 'bzrlib/win32utils.py' |
165 | --- bzrlib/win32utils.py 2009-12-16 06:38:15 +0000 |
166 | +++ bzrlib/win32utils.py 2010-01-02 01:26:16 +0000 |
167 | @@ -517,117 +517,147 @@ |
168 | trace.mutter('Unable to set hidden attribute on %r: %s', path, e) |
169 | |
170 | |
171 | - |
172 | -class UnicodeShlex(object): |
173 | - """This is a very simplified version of shlex.shlex. |
174 | - |
175 | - The main change is that it supports non-ascii input streams. The internal |
176 | - structure is quite simplified relative to shlex.shlex, since we aren't |
177 | - trying to handle multiple input streams, etc. In fact, we don't use a |
178 | - file-like api either. |
179 | - """ |
180 | - |
181 | - def __init__(self, uni_string): |
182 | - self._input = uni_string |
183 | - self._input_iter = iter(self._input) |
184 | - self._whitespace_match = re.compile(u'\s').match |
185 | - self._word_match = re.compile(u'\S').match |
186 | - self._quote_chars = u'"' |
187 | - # self._quote_match = re.compile(u'[\'"]').match |
188 | - self._escape_match = lambda x: None # Never matches |
189 | - self._escape = '\\' |
190 | - # State can be |
191 | - # ' ' - after whitespace, starting a new token |
192 | - # 'a' - after text, currently working on a token |
193 | - # '"' - after ", currently in a "-delimited quoted section |
194 | - # "\" - after '\', checking the next char |
195 | - self._state = ' ' |
196 | - self._token = [] # Current token being parsed |
197 | - |
198 | - def _get_token(self): |
199 | - # Were there quote chars as part of this token? |
200 | - quoted = False |
201 | - quoted_state = None |
202 | - for nextchar in self._input_iter: |
203 | - if self._state == ' ': |
204 | - if self._whitespace_match(nextchar): |
205 | - # if self._token: return token |
206 | - continue |
207 | - elif nextchar in self._quote_chars: |
208 | - self._state = nextchar # quoted state |
209 | - elif self._word_match(nextchar): |
210 | - self._token.append(nextchar) |
211 | - self._state = 'a' |
212 | - else: |
213 | - raise AssertionError('wtttf?') |
214 | - elif self._state in self._quote_chars: |
215 | - quoted = True |
216 | - if nextchar == self._state: # End of quote |
217 | - self._state = 'a' # posix allows 'foo'bar to translate to |
218 | - # foobar |
219 | - elif self._state == '"' and nextchar == self._escape: |
220 | - quoted_state = self._state |
221 | - self._state = nextchar |
222 | - else: |
223 | - self._token.append(nextchar) |
224 | - elif self._state == self._escape: |
225 | - if nextchar == '\\': |
226 | - self._token.append('\\') |
227 | - elif nextchar == '"': |
228 | - self._token.append(nextchar) |
229 | - else: |
230 | - self._token.append('\\' + nextchar) |
231 | - self._state = quoted_state |
232 | - elif self._state == 'a': |
233 | - if self._whitespace_match(nextchar): |
234 | - if self._token: |
235 | - break # emit this token |
236 | - else: |
237 | - continue # no token to emit |
238 | - elif nextchar in self._quote_chars: |
239 | - # Start a new quoted section |
240 | - self._state = nextchar |
241 | - # escape? |
242 | - elif (self._word_match(nextchar) |
243 | - or nextchar in self._quote_chars |
244 | - # or whitespace_split? |
245 | - ): |
246 | - self._token.append(nextchar) |
247 | - else: |
248 | - raise AssertionError('state == "a", char: %r' |
249 | - % (nextchar,)) |
250 | - else: |
251 | - raise AssertionError('unknown state: %r' % (self._state,)) |
252 | - result = ''.join(self._token) |
253 | - self._token = [] |
254 | - if not quoted and result == '': |
255 | - result = None |
256 | - return quoted, result |
257 | - |
258 | - def __iter__(self): |
259 | - return self |
260 | - |
261 | +class _PushbackSequence(object): |
262 | + def __init__(self, orig): |
263 | + self._iter = iter(orig) |
264 | + self._pushback_buffer = [] |
265 | + |
266 | + def next(self): |
267 | + if len(self._pushback_buffer) > 0: |
268 | + return self._pushback_buffer.pop() |
269 | + else: |
270 | + return self._iter.next() |
271 | + |
272 | + def pushback(self, char): |
273 | + self._pushback_buffer.append(char) |
274 | + |
275 | + def __iter__(self): |
276 | + return self |
277 | + |
278 | +class _Whitespace(object): |
279 | + def process(self, next_char, seq, context): |
280 | + if _whitespace_match(next_char): |
281 | + if len(context.token) > 0: |
282 | + return None |
283 | + else: |
284 | + return self |
285 | + elif next_char == u'"': |
286 | + context.quoted = True |
287 | + return _Quotes(self) |
288 | + elif next_char == u'\\': |
289 | + return _Backslash(self) |
290 | + else: |
291 | + context.token.append(next_char) |
292 | + return _Word() |
293 | + |
294 | +class _Quotes(object): |
295 | + def __init__(self, exit_state): |
296 | + self.exit_state = exit_state |
297 | + |
298 | + def process(self, next_char, seq, context): |
299 | + if next_char == u'\\': |
300 | + return _Backslash(self) |
301 | + elif next_char == u'"': |
302 | + return self.exit_state |
303 | + else: |
304 | + context.token.append(next_char) |
305 | + return self |
306 | + |
307 | +class _Backslash(object): |
308 | + # See http://msdn.microsoft.com/en-us/library/bb776391(VS.85).aspx |
309 | + def __init__(self, exit_state): |
310 | + self.exit_state = exit_state |
311 | + self.count = 1 |
312 | + |
313 | + def process(self, next_char, seq, context): |
314 | + if next_char == u'\\': |
315 | + self.count += 1 |
316 | + return self |
317 | + elif next_char == u'"': |
318 | + # 2N backslashes followed by '"' are N backslashes |
319 | + context.token.append(u'\\' * (self.count/2)) |
320 | + # 2N+1 backslashes follwed by '"' are N backslashes followed by '"' |
321 | + # which should not be processed as the start or end of quoted arg |
322 | + if self.count % 2 == 1: |
323 | + context.token.append(next_char) # odd number of '\' escapes the '"' |
324 | + else: |
325 | + seq.pushback(next_char) # let exit_state handle next_char |
326 | + self.count = 0 |
327 | + return self.exit_state |
328 | + else: |
329 | + # N backslashes not followed by '"' are just N backslashes |
330 | + if self.count > 0: |
331 | + context.token.append(u'\\' * self.count) |
332 | + self.count = 0 |
333 | + seq.pushback(next_char) # let exit_state handle next_char |
334 | + return self.exit_state |
335 | + |
336 | + def finish(self, context): |
337 | + if self.count > 0: |
338 | + context.token.append(u'\\' * self.count) |
339 | + |
340 | + |
341 | +class _Word(object): |
342 | + def process(self, next_char, seq, context): |
343 | + if _whitespace_match(next_char): |
344 | + return None |
345 | + elif next_char == u'"': |
346 | + return _Quotes(self) |
347 | + elif next_char == u'\\': |
348 | + return _Backslash(self) |
349 | + else: |
350 | + context.token.append(next_char) |
351 | + return self |
352 | + |
353 | +_whitespace_match = re.compile(u'\s').match |
354 | +class UnicodeCommandLineSplitter(object): |
355 | + def __init__(self, command_line): |
356 | + self._seq = _PushbackSequence(command_line) |
357 | + |
358 | + def __iter__(self): |
359 | + return self |
360 | + |
361 | def next(self): |
362 | quoted, token = self._get_token() |
363 | if token is None: |
364 | raise StopIteration |
365 | return quoted, token |
366 | - |
367 | - |
368 | -def _command_line_to_argv(command_line): |
369 | - """Convert a Unicode command line into a set of argv arguments. |
370 | - |
371 | - This does wildcard expansion, etc. It is intended to make wildcards act |
372 | - closer to how they work in posix shells, versus how they work by default on |
373 | - Windows. |
374 | + |
375 | + def _get_token(self): |
376 | + self.quoted = False |
377 | + self.token = [] |
378 | + state = _Whitespace() |
379 | + for next_char in self._seq: |
380 | + state = state.process(next_char, self._seq, self) |
381 | + if state is None: |
382 | + break |
383 | + if not state is None and not getattr(state, 'finish', None) is None: |
384 | + state.finish(self) |
385 | + result = u''.join(self.token) |
386 | + if not self.quoted and result == '': |
387 | + result = None |
388 | + return self.quoted, result |
389 | + |
390 | + |
391 | +def command_line_to_argv(command_line, wildcard_expansion=True): |
392 | + """Convert a Unicode command line into a list of argv arguments. |
393 | + |
394 | + This optionally does wildcard expansion, etc. It is intended to make |
395 | + wildcards act closer to how they work in posix shells, versus how they |
396 | + work by default on Windows. Quoted arguments are left untouched. |
397 | + |
398 | + :param command_line: The unicode string to split into an arg list. |
399 | + :param wildcard_expansion: Whether wildcard expansion should be applied to |
400 | + each argument. True by default. |
401 | + :return: A list of unicode strings. |
402 | """ |
403 | - s = UnicodeShlex(command_line) |
404 | - # Now that we've split the content, expand globs |
405 | + s = UnicodeCommandLineSplitter(command_line) |
406 | + # Now that we've split the content, expand globs if necessary |
407 | # TODO: Use 'globbing' instead of 'glob.glob', this gives us stuff like |
408 | # '**/' style globs |
409 | args = [] |
410 | for is_quoted, arg in s: |
411 | - if is_quoted or not glob.has_magic(arg): |
412 | + if is_quoted or not glob.has_magic(arg) or not wildcard_expansion: |
413 | args.append(arg) |
414 | else: |
415 | args.extend(glob_one(arg)) |
416 | @@ -636,16 +666,14 @@ |
417 | |
418 | if has_ctypes and winver != 'Windows 98': |
419 | def get_unicode_argv(): |
420 | - LPCWSTR = ctypes.c_wchar_p |
421 | - INT = ctypes.c_int |
422 | - POINTER = ctypes.POINTER |
423 | - prototype = ctypes.WINFUNCTYPE(LPCWSTR) |
424 | - GetCommandLine = prototype(("GetCommandLineW", |
425 | - ctypes.windll.kernel32)) |
426 | - prototype = ctypes.WINFUNCTYPE(POINTER(LPCWSTR), LPCWSTR, POINTER(INT)) |
427 | - command_line = GetCommandLine() |
428 | + prototype = ctypes.WINFUNCTYPE(ctypes.c_wchar_p, use_last_error=True) |
429 | + GetCommandLineW = prototype(("GetCommandLineW", |
430 | + ctypes.windll.kernel32)) |
431 | + command_line = GetCommandLineW() |
432 | + if command_line is None: |
433 | + raise ctypes.WinError() |
434 | # Skip the first argument, since we only care about parameters |
435 | - argv = _command_line_to_argv(command_line)[1:] |
436 | + argv = command_line_to_argv(command_line)[1:] |
437 | if getattr(sys, 'frozen', None) is None: |
438 | # Invoked via 'python.exe' which takes the form: |
439 | # python.exe [PYTHON_OPTIONS] C:\Path\bzr [BZR_OPTIONS] |
Fixed bug 392428 by changing commands. shlex_split_ unicode to use win32utils. command_ line_to_ argv on win32.
I also cleaned up the win32utils. get_unicode_ argv function. There was some cruft leftover from what I'm guessing was its previous implementation which used the CommandLineToArgvW Win32 API function. I added error handling as well.
Not sure how to write tests for this. I tested by hand using KDiff3, WinMerge and a Windows command script which calls KDiff3:
bzrdev diff --using "C:\Program Files (x86)\KDiff3\ kdiff3. exe" WinMergeU. exe"
bzrdev diff --using "C:\Program Files (x86)\WinMerge\
bzrdev diff --using C:\Tools\diff.cmd
'bzrdev selftest win32utils' still passes.