Merge lp:~gz/bzr/undecodable_argv_745712 into lp:bzr

Proposed by Martin Packman on 2011-04-15
Status: Merged
Approved by: Martin Pool on 2011-04-18
Approved revision: 5791
Merged at revision: 5797
Proposed branch: lp:~gz/bzr/undecodable_argv_745712
Merge into: lp:bzr
Diff against target: 145 lines (+27/-28)
6 files modified
bzrlib/commands.py (+4/-3)
bzrlib/osutils.py (+2/-2)
bzrlib/tests/blackbox/test_command_encoding.py (+2/-3)
bzrlib/tests/blackbox/test_commit.py (+0/-20)
bzrlib/tests/blackbox/test_exceptions.py (+16/-0)
doc/en/release-notes/bzr-2.4.txt (+3/-0)
To merge this branch: bzr merge lp:~gz/bzr/undecodable_argv_745712
Reviewer Review Type Date Requested Status
Martin Pool 2011-04-15 Approve on 2011-04-15
Review via email: mp+57799@code.launchpad.net

Commit message

better message when argv can't be decoded in the application locale

Description of the change

When sys.argv can't be decoded according to the application locale encoding, Bazaar needs to report a nice clear error. The code nearly does that already, apart from the fact the BzrError raised from osutils.get_unicode_argv goes uncaught as commands._specified_or_unicode_argv which calls it is outside the normal exception catching and reporting code.

To fix this the argv decoding is now done inside commands.run_bzr rather than commands.main as it was previously. This is something of a subtle behaviour change, but the other alternative is changing the signatures of all the run_* methods, which strikes me as worse.

Should the error message also give more advice about how to fix locale issues, like the locale.setlocale except clause in the main bzr script?

Hm... just seen bb.test_commit.TestCommit.test_unsupported_encoding_commit_message fail which is testing for the old buggy behaviour, will sort that out that tomorrow.

To post a comment you must log in.
Martin Pool (mbp) wrote :

strictly that should be testnotapplicable.

Martin Pool (mbp) wrote :

- argv = _specified_or_unicode_argv(argv)
+ if argv is not None:
+ argv = argv[1:]

It looks like you have one too few spaces there.

The whole topic of locales is probably a bit too complex to explain in an error message. We could mention a URL about it.

+++ bzrlib/osutils.py 2011-04-15 00:49:24 +0000
@@ -96,8 +96,8 @@
         user_encoding = get_user_encoding()
         return [a.decode(user_encoding) for a in sys.argv[1:]]
     except UnicodeDecodeError:
- raise errors.BzrError(("Parameter '%r' is unsupported by the current "
- "encoding." % a))
+ raise errors.BzrError("Parameter %r encoding is unsupported by the "
+ "application locale." % (a,))

One thing I would do here is make the error message include the detected encoding to give people an idea where to start. (It may already be in the crash report but perhaps that's not so visible.)

Aside from that and the point about skip it looks good. Thanks for tackling it.

review: Needs Fixing
lp:~gz/bzr/undecodable_argv_745712 updated on 2011-04-15
5787. By Martin Packman on 2011-04-15

Move _specified_or_unicode_argv call inside run_bzr as it may raise an error that should be reported

5788. By Martin Packman on 2011-04-15

Tweak argv encoding error message

Martin Packman (gz) wrote :

> It looks like you have one too few spaces there.

Worse, a tab. I've rewritten history to hide that embarrassment, shouldn't try and finish branches late.

> The whole topic of locales is probably a bit too complex to explain in an
> error message. We could mention a URL about it.

Along the lines of "See `bzr help locale` for more information."?

> One thing I would do here is make the error message include the detected
> encoding to give people an idea where to start. (It may already be in the
> crash report but perhaps that's not so visible.)

Will do, I nearly did this but the spelling of ascii as 'ANSI_X3.4-1968' looked needlessly scary to me.

> Aside from that and the point about skip it looks good. Thanks for tackling
> it.

Skip change also in the new-old revisions.

Left to do is the bb.test_commit failure, and also one in bb.test_command_encoding which does something dodgy (with a comment saying as much).

lp:~gz/bzr/undecodable_argv_745712 updated on 2011-04-15
5789. By Martin Packman on 2011-04-15

Remove testing that expected argv decoding to be done outside run_bzr

5790. By Martin Packman on 2011-04-15

Add user encoding name to argv decoding error message per poolie in review

5791. By Martin Packman on 2011-04-15

Add release notes

Martin Packman (gz) wrote :

I removed the two problematic testing bits I found, there may be more fallout if plugins are doing dodgy things with TestCase.run_bzr (which they are in my experience), I'll follow up if that's the case.

I deleted test_unsupported_encoding_commit_message outright which loses the blame but not the coverage as it does the same thing as the new test in a less generic manner. Should really have found it before making the branch, poolie had linked it from the bug. The TestCommandEncoding.test_exact assertion might be useful to find a way to keep, though I'm not convinced writing arbitrary bytes to the text output is a generally good idea.

Martin Pool (mbp) wrote :

great, thankyou.

review: Approve
Martin Pool (mbp) wrote :

sent to pqm by email

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/commands.py'
2--- bzrlib/commands.py 2011-04-05 01:12:15 +0000
3+++ bzrlib/commands.py 2011-04-15 21:25:54 +0000
4@@ -1031,7 +1031,7 @@
5 Specify the number of processes that can be run concurrently (selftest).
6 """
7 trace.mutter("bazaar version: " + bzrlib.__version__)
8- argv = list(argv)
9+ argv = _specified_or_unicode_argv(argv)
10 trace.mutter("bzr arguments: %r", argv)
11
12 opt_lsprof = opt_profile = opt_no_plugins = opt_builtin = \
13@@ -1177,7 +1177,7 @@
14 new_argv = []
15 try:
16 # ensure all arguments are unicode strings
17- for a in argv[1:]:
18+ for a in argv:
19 if isinstance(a, unicode):
20 new_argv.append(a)
21 else:
22@@ -1199,7 +1199,8 @@
23
24 :return: exit code of bzr command.
25 """
26- argv = _specified_or_unicode_argv(argv)
27+ if argv is not None:
28+ argv = argv[1:]
29 _register_builtin_commands()
30 ret = run_bzr_catch_errors(argv)
31 trace.mutter("return code %d", ret)
32
33=== modified file 'bzrlib/osutils.py'
34--- bzrlib/osutils.py 2011-04-04 13:37:25 +0000
35+++ bzrlib/osutils.py 2011-04-15 21:25:54 +0000
36@@ -96,8 +96,8 @@
37 user_encoding = get_user_encoding()
38 return [a.decode(user_encoding) for a in sys.argv[1:]]
39 except UnicodeDecodeError:
40- raise errors.BzrError(("Parameter '%r' is unsupported by the current "
41- "encoding." % a))
42+ raise errors.BzrError("Parameter %r encoding is unsupported by %s "
43+ "application locale." % (a, user_encoding))
44
45
46 def make_readonly(filename):
47
48=== modified file 'bzrlib/tests/blackbox/test_command_encoding.py'
49--- bzrlib/tests/blackbox/test_command_encoding.py 2009-03-23 14:59:43 +0000
50+++ bzrlib/tests/blackbox/test_command_encoding.py 2011-04-15 21:25:54 +0000
51@@ -54,13 +54,12 @@
52 register_command(cmd_echo_exact)
53 try:
54 self.assertEqual('foo', bzr('echo-exact foo'))
55- # This is cheating a little bit, because 'foo\xb5' shouldn't
56- # get past main()
57- self.assertEqual('foo\xb5', bzr('echo-exact foo\xb5'))
58 # Exact should fail to decode the string
59 self.assertRaises(UnicodeEncodeError,
60 bzr,
61 ['echo-exact', u'foo\xb5'])
62+ # Previously a non-ascii bytestring was also tested, as 'exact'
63+ # outputs bytes untouched, but needed buggy argv parsing to work
64 finally:
65 plugin_cmds.remove('echo-exact')
66
67
68=== modified file 'bzrlib/tests/blackbox/test_commit.py'
69--- bzrlib/tests/blackbox/test_commit.py 2011-04-07 13:08:39 +0000
70+++ bzrlib/tests/blackbox/test_commit.py 2011-04-15 21:25:54 +0000
71@@ -321,26 +321,6 @@
72 tree.add('foo.c')
73 self.run_bzr('commit -m ""', retcode=3)
74
75- def test_unsupported_encoding_commit_message(self):
76- if sys.platform == 'win32':
77- raise tests.TestNotApplicable('Win32 parses arguments directly'
78- ' as Unicode, so we can\'t pass invalid non-ascii')
79- tree = self.make_branch_and_tree('.')
80- self.build_tree_contents([('foo.c', 'int main() {}')])
81- tree.add('foo.c')
82- # LANG env variable has no effect on Windows
83- # but some characters anyway cannot be represented
84- # in default user encoding
85- char = probe_bad_non_ascii(osutils.get_user_encoding())
86- if char is None:
87- raise TestSkipped('Cannot find suitable non-ascii character'
88- 'for user_encoding (%s)' % osutils.get_user_encoding())
89- out,err = self.run_bzr_subprocess('commit -m "%s"' % char,
90- retcode=1,
91- env_changes={'LANG': 'C'})
92- self.assertContainsRe(err, r'bzrlib.errors.BzrError: Parameter.*is '
93- 'unsupported by the current encoding.')
94-
95 def test_other_branch_commit(self):
96 # this branch is to ensure consistent behaviour, whether we're run
97 # inside a branch, or not.
98
99=== modified file 'bzrlib/tests/blackbox/test_exceptions.py'
100--- bzrlib/tests/blackbox/test_exceptions.py 2011-03-05 01:37:14 +0000
101+++ bzrlib/tests/blackbox/test_exceptions.py 2011-04-15 21:25:54 +0000
102@@ -16,6 +16,7 @@
103
104 """Tests for display of exceptions."""
105
106+import os
107 import sys
108
109 from bzrlib import (
110@@ -45,6 +46,21 @@
111 r'exceptions\.AssertionError: always fails\n')
112 self.assertContainsRe(err, r'Bazaar has encountered an internal error')
113
114+ def test_undecodable_argv(self):
115+ """A user error must be reported if argv is not in the locale encoding
116+
117+ A subprocess with an environment ascii-only setting is used so the test
118+ can run without worrying about the locale the test suite is using.
119+ """
120+ if os.name != "posix":
121+ raise tests.TestNotApplicable("Needs system beholden to C locales")
122+ out, err = self.run_bzr_subprocess(["\xa0"],
123+ env_changes={"LANG": "C"},
124+ universal_newlines=True,
125+ retcode=errors.EXIT_ERROR)
126+ self.assertContainsRe(err, r"^bzr: ERROR: .*'\\xa0'.* unsupported")
127+ self.assertEquals(out, "")
128+
129
130 class TestOptParseBugHandling(TestCase):
131 "Test that we handle http://bugs.python.org/issue2931"
132
133=== modified file 'doc/en/release-notes/bzr-2.4.txt'
134--- doc/en/release-notes/bzr-2.4.txt 2011-04-15 10:31:18 +0000
135+++ doc/en/release-notes/bzr-2.4.txt 2011-04-15 21:25:54 +0000
136@@ -53,6 +53,9 @@
137 .. Fixes for situations where bzr would previously crash or give incorrect
138 or undesirable results.
139
140+* Arguments that can't be decoded to unicode in the current posix locale give
141+ a clearer error message without a traceback. (Martin [gz], #745712)
142+
143 * ``bzrlib.log._DEFAULT_REQUEST_PARAMS`` is no longer accidentally
144 mutated by ``bzrlib.log._apply_log_request_defaults``. In practice
145 these default values aren't relied on very often so this probably