$ python
Python 2.5.4 (r254:67916, Sep 26 2009, 10:32:22)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> "aäa".decode("utf-8")
u'a\xe4a'
I have not mentioned latin1 anywhere (locale is utf-8), but still I see
"a\xe4a".
0xE4 is ä in latin1. Where does that come from?
"Lupa evätty" is latin1-compatible, so I guess this latin1 assumption is
not responsible for the error.
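One possible explanation for the \xe4, sketched here in Python 3 syntax (the session above is Python 2.5, where a plain "..." literal is a byte string): the escape in the repr would be the Unicode code point U+00E4, not a latin1 byte. The first 256 Unicode code points coincide with latin1, so the number looks the same even though no latin1 codec is involved. The variable names are only illustrative.

```python
# Python 3 syntax; bytes and text are separate types here.
utf8_bytes = b"a\xc3\xa4a"           # "a", a-umlaut, "a" as typed in a UTF-8 terminal

text = utf8_bytes.decode("utf-8")    # bytes -> Unicode text, using UTF-8 only

# The escape shown in the repr is the code point of the middle character.
code_point = ord(text[1])
print(ascii(text), hex(code_point))

# latin1 maps bytes 0x00-0xFF straight onto code points U+0000-U+00FF,
# which is why the same number shows up in both places.
latin1_bytes = text.encode("latin-1")
print(latin1_bytes)

# Encoding back to UTF-8 returns the original two-byte sequence.
print(text.encode("utf-8") == utf8_bytes)
```

If this reading is right, the decode itself behaves as requested and only the display of the result resembles latin1.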
>>> "ä".decode('ascii')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0:
ordinal not in range(128)
Even though the error messages are not exactly the same, it seems that
somewhere in bzr the permission-denied error message from libc is
converted to a Python unicode string on the assumption that it is ASCII.
Instead of assuming ascii, bzr should check LC_MESSAGES and choose
encoding accordingly.
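A minimal sketch of that suggestion, in Python 3 syntax. The helper name decode_os_message is hypothetical and not part of bzr, and using locale.getpreferredencoding() as a stand-in for inspecting LC_MESSAGES directly is an assumption. Undecodable bytes are replaced rather than raised, so that an error report stays printable even when the guess is wrong.

```python
import locale


def decode_os_message(raw, encoding=None):
    """Hypothetical helper (not bzr API): turn an OS error message into text."""
    if encoding is None:
        # Encoding implied by the user's locale settings; fall back to ASCII.
        encoding = locale.getpreferredencoding(False) or "ascii"
    # "replace" substitutes U+FFFD for undecodable bytes instead of raising.
    return raw.decode(encoding, "replace")


message = b"[Errno 13] Lupa ev\xc3\xa4tty"

# With the right encoding the message decodes cleanly.
print(ascii(decode_os_message(message, "utf-8")))

# With a wrong guess it degrades but never raises.
print(ascii(decode_os_message(message, "ascii")))
```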
On Thu, 2009-11-12 at 12:59 +0000, John A Meinel wrote:
> I'm pretty sure we need more context to be able to determine what is
> going on here.
mijutu@crc11-ett:~$ cd /tmp/
mijutu@crc11-ett:/tmp$ rm -rf test
mijutu@crc11-ett:/tmp$ mkdir test
mijutu@crc11-ett:/tmp$ chmod a-rx test
mijutu@crc11-ett:/tmp$ echo $LANG
fi_FI.UTF-8
mijutu@crc11-ett:/tmp$ bzr branch test/someproject newbranch
bzr: ERROR: Unprintable exception PermissionDenied: dict={'path':
u'/tmp/test/someproject/.bzr/branch-format', '_preformatted_string':
None, 'extra': ": [Errno 13] Lupa ev\xc3\xa4tty:
u'/tmp/test/someproject/.bzr/branch-format'"}, fmt='Permission denied:
"%(path)s"%(extra)s', error=UnicodeDecodeError('ascii', ": [Errno 13]
Lupa ev\xc3\xa4tty: u'/tmp/test/someproject/.bzr/branch-format'", 20,
21, 'ordinal not in range(128)')
mijutu@crc11-ett:/tmp$ LANG=C bzr branch test/someproject newbranch
bzr: ERROR: Permission denied:
"/tmp/test/someproject/.bzr/branch-format": [Errno 13] Permission
denied: u'/tmp/test/someproject/.bzr/branch-format'
mijutu@crc11-ett:/tmp$ python --version
Python 2.5.4
mijutu@crc11-ett:/tmp$ bzr --version
Bazaar (bzr) 1.16.1
Whoops, old bzr. Let's try again.
mijutu@crc11-ett:/tmp$ bzr branch test/someproject newbranch
bzr: ERROR: Unprintable exception PermissionDenied: dict={'path':
u'/tmp/test/someproject/.bzr/branch-format', '_preformatted_string':
None, 'extra': ": [Errno 13] Lupa ev\xc3\xa4tty:
u'/tmp/test/someproject/.bzr/branch-format'"}, fmt='Permission denied:
"%(path)s"%(extra)s', error=UnicodeDecodeError('ascii', ": [Errno 13]
Lupa ev\xc3\xa4tty: u'/tmp/test/someproject/.bzr/branch-format'", 20,
21, 'ordinal not in range(128)')
mijutu@crc11-ett:/tmp$ bzr --version
Bazaar (bzr) 2.0.2
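The offsets in the report can be checked without bzr. In Python 2, %-formatting a unicode value together with a byte string decodes the byte string implicitly as ASCII; presumably that is what happens to 'extra' here, although which bzr code path does it is not shown above. The sketch below uses Python 3 syntax with an explicit decode, and the byte string is only the leading part of the 'extra' value from the report.

```python
# Python 3 syntax; the explicit decode stands in for Python 2's implicit one.
extra = b": [Errno 13] Lupa ev\xc3\xa4tty"

failure = None
try:
    extra.decode("ascii")
except UnicodeDecodeError as err:
    # Same fields that appear in the UnicodeDecodeError in the bzr output.
    failure = (err.encoding, err.start, err.end, err.reason)

print(failure)
# -> ('ascii', 20, 21, 'ordinal not in range(128)')
```

The first non-ASCII byte, 0xc3, sits at offset 20, which matches the "20, 21" in the error that bzr printed.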
http://wiki.python.org/moin/UnicodeDecodeError
Python seems to default to latin1 somewhere, as the interpreter session above shows.