Merge lp:~vila/bzr/822571-bzr-home-unicode into lp:bzr

Proposed by Vincent Ladeuil
Status: Merged
Approved by: John A Meinel
Approved revision: no longer in the source branch.
Merged at revision: 6060
Proposed branch: lp:~vila/bzr/822571-bzr-home-unicode
Merge into: lp:bzr
Diff against target: 65 lines (+11/-7)
2 files modified
bzrlib/config.py (+8/-7)
doc/en/release-notes/bzr-2.5.txt (+3/-0)
To merge this branch: bzr merge lp:~vila/bzr/822571-bzr-home-unicode
Reviewer Review Type Date Requested Status
John A Meinel Approve
Jelmer Vernooij Pending
Review via email: mp+70870@code.launchpad.net

This proposal supersedes a proposal from 2011-08-08.

Commit message

Decode BZR_HOME with fs encoding to allow unicode homes.

Description of the change

Ad-hoc fix for the failing test.

It's a bit surprising that we failed this way, I don't fully
understand what triggered the failure overall.

The actual failure is related to Martin's addition of the
fdatasync config option which requires getting the path to the
config file and as such interprets BZR_HOME. It may also be
related to Jelmer's change about separating get_transport for
paths and urls.

There may be a deeper issue at work here but not worth
investigating now IMHO, just something to stay aware about.

To post a comment you must log in.
Revision history for this message
John A Meinel (jameinel) wrote : Posted in a previous version of this proposal

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 8/8/2011 11:15 AM, Vincent Ladeuil wrote:
> Vincent Ladeuil has proposed merging
> lp:~vila/bzr/822571-bzr-home-unicode into lp:bzr.
>
> Requested reviews: bzr-core (bzr-core) Related bugs: Bug #822571 in
> Bazaar:
> "bzrlib.tests.blackbox.test_version.TestVersionUnicodeOutput.test_unicode_bzr_home
>
>
fails" https://bugs.launchpad.net/bzr/+bug/822571
>
> For more details, see:
> https://code.launchpad.net/~vila/bzr/822571-bzr-home-unicode/+merge/70706
>
>
>
Ad-hoc fix for the failing test.
>
> It's a bit surprising that we failed this way, I don't fully
> understand what triggered the failure overall.
>
> The actual failure is related to Martin's addition of the fdatasync
> config option which requires getting the path to the config file and
> as such interprets BZR_HOME. It may also be related to Jelmer's
> change about separating get_transport for paths and urls.
>
> There may be a deeper issue at work here but not worth investigating
> now IMHO, just something to stay aware about.

Should we be forcing "utf-8" rather than _fs_enc? I would tend to do

1) _fs_enc if set (not None) and not ASCII

2) fallback to 'mbcs' on Windows

3) fallback to 'utf-8' everywhere else

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk4/4EAACgkQJdeBCYSNAANjXwCfQ4Z4kFfCFKfqPS09/AlFmc/Z
CgMAoMmxtnj4I/NUb9dJMGMUPu+13zXH
=fg9D
-----END PGP SIGNATURE-----

Revision history for this message
Vincent Ladeuil (vila) wrote : Posted in a previous version of this proposal

>>>>> John A Meinel <email address hidden> writes:

<snip/>

    > Should we be forcing "utf-8" rather than _fs_enc?

There are other similar bugs where we trigger unicode errors because the
fs encoding ends up being ascii (generally because the user didn't
properly specify the encoding in the env) and the end result is that we
more or less agreed to go directly to utf8.

    > I would tend to do

    > 1) _fs_enc if set (not None) and not ASCII

But in which cases will that be different than utf8 (especially for a
home directory) ? And is it worth the trouble given that we never got
bug reports about not encoding BZR_HOME ?

    > 2) fallback to 'mbcs' on Windows

mbcs is used already for decoding BZR_HOME on windows (even with this
patch).

    > 3) fallback to 'utf-8' everywhere else

Given that this test is failing on *all* slaves (not only the Ubuntu
ones), I'd rather land this than bikeshed too long for unclear benefits.

Revision history for this message
John A Meinel (jameinel) wrote : Posted in a previous version of this proposal

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 8/8/2011 4:19 PM, Vincent Ladeuil wrote:
>>>>>> John A Meinel <email address hidden> writes:
>
> <snip/>
>
>> Should we be forcing "utf-8" rather than _fs_enc?
>
> There are other similar bugs where we trigger unicode errors because
> the fs encoding ends up being ascii (generally because the user
> didn't properly specify the encoding in the env) and the end result
> is that we more or less agreed to go directly to utf8.
>
>> I would tend to do
>
>> 1) _fs_enc if set (not None) and not ASCII
>
> But in which cases will that be different than utf8 (especially for
> a home directory) ? And is it worth the trouble given that we never
> got bug reports about not encoding BZR_HOME ?

There have been people running iso-8859-1 and iso-8859-2 in the past. Is
it common? Is it becoming more or less common? I know it has happened.

>
>> 2) fallback to 'mbcs' on Windows
>
> mbcs is used already for decoding BZR_HOME on windows (even with
> this patch).
>
>> 3) fallback to 'utf-8' everywhere else
>
> Given that this test is failing on *all* slaves (not only the Ubuntu
> ones), I'd rather land this than bikeshed too long for unclear
> benefits.
>

I would actually say, default to _fs_enc, and lets fix fs_enc to be what
we want. (If ascii => utf-8, otherwise locale.*.)

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk4//uMACgkQJdeBCYSNAAOD4ACgpzVkQRhSkGCWnD2cTc8CiPLs
OCYAoKMcHB9ooyZ8W/S/QCZl4DYl6hte
=xaRV
-----END PGP SIGNATURE-----

Revision history for this message
Vincent Ladeuil (vila) wrote : Posted in a previous version of this proposal

>>>>> John A Meinel <email address hidden> writes:

<snip/>

    > There have been people running iso-8859-1 and iso-8859-2 in the past. Is
    > it common? Is it becoming more or less common? I know it has happened.

But did they override BZR_HOME ? That's what this fix is about.

<snip/>

    > I would actually say, default to _fs_enc, and lets fix fs_enc to be what
    > we want. (If ascii => utf-8, otherwise locale.*.)

That's out of scope here where I'm fire-fighting *all* babune slaves
being red.

Revision history for this message
Jelmer Vernooij (jelmer) wrote : Posted in a previous version of this proposal

I also feel a bit uncomfortable decoding from utf8 where we otherwise decode from _fs_enc.

If getting this text fixed is more important for the moment, perhaps we could revert back to using get_transport() rather than get_transport_from_path() in bzrlib.config ?

review: Needs Fixing
Revision history for this message
Vincent Ladeuil (vila) wrote : Posted in a previous version of this proposal

> I also feel a bit uncomfortable decoding from utf8 where we otherwise decode
> from _fs_enc.

That's the bug, we don't decode paths that come from os.environ because it's unclear what encoding there are using.

>
> If getting this text fixed is more important for the moment, perhaps we could
> revert back to using get_transport() rather than get_transport_from_path() in
> bzrlib.config ?

Just tried, doesn't fix the issue :-/ (And I prefer to go forward on this point anyway).

The fix now decodes with _fs_enc.

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 8/9/2011 2:51 PM, Vincent Ladeuil wrote:
> Vincent Ladeuil has proposed merging
> lp:~vila/bzr/822571-bzr-home-unicode into lp:bzr.
>
> Requested reviews: Jelmer Vernooij (jelmer) Related bugs: Bug #822571
> in Bazaar:
> "bzrlib.tests.blackbox.test_version.TestVersionUnicodeOutput.test_unicode_bzr_home
> fails" https://bugs.launchpad.net/bzr/+bug/822571
>
> For more details, see:
> https://code.launchpad.net/~vila/bzr/822571-bzr-home-unicode/+merge/70870
>
> Ad-hoc fix for the failing test.
>
> It's a bit surprising that we failed this way, I don't fully
> understand what triggered the failure overall.
>
> The actual failure is related to Martin's addition of the fdatasync
> config option which requires getting the path to the config file and
> as such interprets BZR_HOME. It may also be related to Jelmer's
> change about separating get_transport for paths and urls.
>
> There may be a deeper issue at work here but not worth investigating
> now IMHO, just something to stay aware about.

 merge: approve

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk5BLdMACgkQJdeBCYSNAANw2QCfUUR4Fg0Eqd2qhqBIyjicFXWs
WIEAoLHA20+kSKdDCYCpS8kINEoJ+9Sj
=ChOd
-----END PGP SIGNATURE-----

review: Approve
Revision history for this message
Vincent Ladeuil (vila) wrote :

sent to pqm by email

Revision history for this message
Martin Packman (gz) wrote :

Using the filesystem encoding at least is correct, there should be a bug filed for a complete fix though. Any code dealing with the environment on nix needs to be robust against it not being decodable.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/config.py'
2--- bzrlib/config.py 2011-08-09 10:02:34 +0000
3+++ bzrlib/config.py 2011-08-09 12:52:17 +0000
4@@ -1513,14 +1513,16 @@
5 raise errors.BzrError('You must have one of BZR_HOME, APPDATA,'
6 ' or HOME set')
7 return osutils.pathjoin(base, 'bazaar', '2.0')
8- elif sys.platform == 'darwin':
9+ else:
10+ if base is not None:
11+ base = base.decode(osutils._fs_enc)
12+ if sys.platform == 'darwin':
13 if base is None:
14 # this takes into account $HOME
15 base = os.path.expanduser("~")
16 return osutils.pathjoin(base, '.bazaar')
17 else:
18 if base is None:
19-
20 xdg_dir = os.environ.get('XDG_CONFIG_HOME', None)
21 if xdg_dir is None:
22 xdg_dir = osutils.pathjoin(os.path.expanduser("~"), ".config")
23@@ -1529,7 +1531,6 @@
24 trace.mutter(
25 "Using configuration in XDG directory %s." % xdg_dir)
26 return xdg_dir
27-
28 base = os.path.expanduser("~")
29 return osutils.pathjoin(base, ".bazaar")
30
31@@ -2630,16 +2631,16 @@
32 class GlobalStore(LockableIniFileStore):
33
34 def __init__(self, possible_transports=None):
35- t = transport.get_transport_from_path(config_dir(),
36- possible_transports=possible_transports)
37+ t = transport.get_transport_from_path(
38+ config_dir(), possible_transports=possible_transports)
39 super(GlobalStore, self).__init__(t, 'bazaar.conf')
40
41
42 class LocationStore(LockableIniFileStore):
43
44 def __init__(self, possible_transports=None):
45- t = transport.get_transport_from_path(config_dir(),
46- possible_transports=possible_transports)
47+ t = transport.get_transport_from_path(
48+ config_dir(), possible_transports=possible_transports)
49 super(LocationStore, self).__init__(t, 'locations.conf')
50
51
52
53=== modified file 'doc/en/release-notes/bzr-2.5.txt'
54--- doc/en/release-notes/bzr-2.5.txt 2011-08-09 12:34:00 +0000
55+++ doc/en/release-notes/bzr-2.5.txt 2011-08-09 12:52:17 +0000
56@@ -74,6 +74,9 @@
57 that subfolder.
58 (Bastian Bowe, #809901)
59
60+* Decode ``BZR_HOME`` with fs encoding on posix platforms to avoid unicode
61+ errors. (Vincent Ladeuil, #822571)
62+
63 * Fix i18n use when no environment variables are set. (Jelmer Vernooij, #810701)
64
65 * TreeTransformBase.fixup_new_roots no longer forces trees to have a root, so