Bazaar

Code review comment for lp:~songofacandy/bzr/fix-523746

fix-523746
Merge into 2.0

Revision history for this message

John A Meinel (jameinel) wrote on 2010-02-22:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

INADA Naoki wrote:
>>> I recall subprocess being very slow to import; it worries me that we'll
>>> be importing it all the time.
>>>
>>> A quick check on my machine took about 10ms to import, so its definitely
>>> noticable.
>>>
>>> -Rob
>>>
>> It can pretty easily be moved into the lazy_import section.
>
> osutils.Popen is a class. So subprocess imported every times
> when import osutils.
> I'll make osutils.Popen as a factory function instead of class.
>
>
>> Actually, we have a different problem. In that 'get_user_encoding()' !=
>> 'sys.get_filesystemencoding'. I'm not 100% sure what needs to be passed
>> to programs.
>
> Yes. Arguments contains file path and others texts.
> I think there are no single right way to encode process arguments.
> But I change to use filesystemencoding because file paths appears in
> arguments rather than non-ASCII text arguments.
> In most case, filesystemencoding == user_encoding(). So I think using
> filesystemencoding solves 90% case.

Only outsidef of Windows. On Windows filesystemencoding == mbcs, but
user_encoding is something like 'cp1252'. (and terminal encoding is
cp437...)

Now, I don't know whether arbitrary programs are going to use filesystem
encoding or user_encoding for decoding their arguments. I *think* that
argument decoding would be an OS sort of thing, and we use
filesystemencoding for ENV variables and thus it would make sense for
arguments. bzr itself used user_encoding before we use the Unicode api.

>
>> The real fix is to use CreateProcessW and pass it a unicode string,
>> rather than trying to get everything round-tripped across an encoding
>> barrier.
>
> On Windows, CreateProcessW is good. But this bug affects not only Windows.
> Popen encoding arguments with defaultencoding so popen can't encode utf-8
> file path on Linux, too.

Sure. And on Linux, it has pretty much standardized that everything is
in utf-8. Which is *soooo* much nicer. (file content, filenames, env
vars, terminal encoding, etc.)

>
> To use CreateProcessW, we should decode bytes strings into unicode.
> This brings another problem: "What encoding should we use when decoding".
> And to use CreateProcessW, we should change subprocess internals or reimplement
> subprocess module.
>
> So, just for now, I'd like to fix 90% case.
>

As long as we don't break other cases, I'm fine with that.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkuC3GUACgkQJdeBCYSNAAMUuwCfS28ggzvo1JNRialesoqv8Slw
U2MAn0J3tGr5HGGOPIqEz+w7CCm10hKm
=V9+v
-----END PGP SIGNATURE-----

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

INADA Naoki wrote:
>>> I recall subprocess being very slow to import; it worries me that we'll
>>> be importing it all the time.
>>>
>>> A quick check on my machine took about 10ms to import, so its definitely
>>> noticable.
>>>
>>> -Rob
>>>
>> It can pretty easily be moved into the lazy_import section.
> 
> osutils.Popen is a class. So subprocess imported every times
> when import osutils.
> I'll make osutils.Popen as a factory function instead of class.
> 
> 
>> Actually, we have a different problem. In that 'get_user_encoding()' !=
>> 'sys.get_filesystemencoding'. I'm not 100% sure what needs to be passed
>> to programs.
> 
> Yes. Arguments contains file path and others texts.
> I think there are no single right way to encode process arguments.
> But I change to use filesystemencoding because file paths appears in
> arguments rather than non-ASCII text arguments.
> In most case, filesystemencoding == user_encoding(). So I think using
> filesystemencoding solves 90% case.

Only outsidef of Windows. On Windows filesystemencoding == mbcs, but
user_encoding is something like 'cp1252'. (and terminal encoding is
cp437...)

> 
>> The real fix is to use CreateProcessW and pass it a unicode string,
>> rather than trying to get everything round-tripped across an encoding
>> barrier.
> 
> On Windows, CreateProcessW is good. But this bug affects not only Windows.
> Popen encoding arguments with defaultencoding so popen can't encode utf-8
> file path on Linux, too.

Sure. And on Linux, it has pretty much standardized that everything is
in utf-8. Which is *soooo* much nicer. (file content, filenames, env
vars, terminal encoding, etc.)

> 
> To use CreateProcessW, we should decode bytes strings into unicode.
> This brings another problem: "What encoding should we use when decoding".
> And to use CreateProcessW, we should change subprocess internals or reimplement
> subprocess module.
> 
> So, just for now, I'd like to fix 90% case.
>

As long as we don't break other cases, I'm fine with that.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkuC3GUACgkQJdeBCYSNAAMUuwCfS28ggzvo1JNRialesoqv8Slw
U2MAn0J3tGr5HGGOPIqEz+w7CCm10hKm
=V9+v
-----END PGP SIGNATURE-----

« Back to merge proposal