Code review comment for lp:~doxxx/bzr/392428

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gordon Tyler wrote:
> I've fixed various tests, restoring some to their original state since they should work now with single quotes parsing enabled.
>
> There was also a compatibility problem with Python 2.5 in the cleanup changes I made to get_unicode_argv, which I didn't notice until now because I normally use Python 2.6.

review time...

1) I'd like to get away from using 'shlex.split' on all platforms. the
'utf-8' hack is a bit ugly. It will mean losing some support for '\' in
some circumstances, but I'd rather have configuration (espec, stuff like
'rules') work consistently everywhere. (Do you think UnicodeShlex is a
reasonable replacement?)

2)
+_whitespace_match = re.compile(u'\s').match
+

^- Do we want flags=re.UNICODE? Quick testing shows that it isn't set
for u'\s':

re.match(u'\s', u'\u2028') => None
re.match(u'\s', u'\u2028', re.UNICODE) => Match

For those that don't know u'\u2028' is the Unicode "Line Separator"
(u'\N{LINE SEPARATOR}') character.
http://unicode.org/standard/reports/tr13/tr13-5.html
(of course, it doesn't help that Unicode still defines 2 separators,
with a new \N{PARAGRAPH SEPARATOR}... :(

That said, I don't really think we care about breaking command lines on
u'\2028', so maybe it doesn't matter...

The rest looks good to me. So the big question is (1)? Can we get away
with replacing shlex entirely and make things a bit more consistent?

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktPYTcACgkQJdeBCYSNAAPboACeNXvqObazzbOnMaHSkkxuBClQ
EO0AoLHKIlKWLOKSB5pPdU0RN3qO+cUK
=EoNE
-----END PGP SIGNATURE-----

« Back to merge proposal