Merge lp:~gz/bzr/undecodable_argv_745712 into lp:bzr

Proposed by Martin Packman
Status: Merged
Approved by: Martin Pool
Approved revision: no longer in the source branch.
Merged at revision: 5797
Proposed branch: lp:~gz/bzr/undecodable_argv_745712
Merge into: lp:bzr
Diff against target: 145 lines (+27/-28)
6 files modified
bzrlib/commands.py (+4/-3)
bzrlib/osutils.py (+2/-2)
bzrlib/tests/blackbox/test_command_encoding.py (+2/-3)
bzrlib/tests/blackbox/test_commit.py (+0/-20)
bzrlib/tests/blackbox/test_exceptions.py (+16/-0)
doc/en/release-notes/bzr-2.4.txt (+3/-0)
To merge this branch: bzr merge lp:~gz/bzr/undecodable_argv_745712
Reviewer Review Type Date Requested Status
Martin Pool Approve
Review via email: mp+57799@code.launchpad.net

Commit message

better message when argv can't be decoded in the application locale

Description of the change

When sys.argv can't be decoded according to the application locale encoding, Bazaar needs to report a nice clear error. The code nearly does that already, apart from the fact the BzrError raised from osutils.get_unicode_argv goes uncaught as commands._specified_or_unicode_argv which calls it is outside the normal exception catching and reporting code.

To fix this the argv decoding is now done inside commands.run_bzr rather than commands.main as it was previously. This is something of a subtle behaviour change, but the other alternative is changing the signatures of all the run_* methods, which strikes me as worse.

Should the error message also give more advice about how to fix locale issues, like the locale.setlocale except clause in the main bzr script?

Hm... just seen bb.test_commit.TestCommit.test_unsupported_encoding_commit_message fail which is testing for the old buggy behaviour, will sort that out that tomorrow.

To post a comment you must log in.
Revision history for this message
Martin Pool (mbp) wrote :

strictly that should be testnotapplicable.

Revision history for this message
Martin Pool (mbp) wrote :

- argv = _specified_or_unicode_argv(argv)
+ if argv is not None:
+ argv = argv[1:]

It looks like you have one too few spaces there.

The whole topic of locales is probably a bit too complex to explain in an error message. We could mention a URL about it.

+++ bzrlib/osutils.py 2011-04-15 00:49:24 +0000
@@ -96,8 +96,8 @@
         user_encoding = get_user_encoding()
         return [a.decode(user_encoding) for a in sys.argv[1:]]
     except UnicodeDecodeError:
- raise errors.BzrError(("Parameter '%r' is unsupported by the current "
- "encoding." % a))
+ raise errors.BzrError("Parameter %r encoding is unsupported by the "
+ "application locale." % (a,))

One thing I would do here is make the error message include the detected encoding to give people an idea where to start. (It may already be in the crash report but perhaps that's not so visible.)

Aside from that and the point about skip it looks good. Thanks for tackling it.

review: Needs Fixing
Revision history for this message
Martin Packman (gz) wrote :

> It looks like you have one too few spaces there.

Worse, a tab. I've rewritten history to hide that embarrassment, shouldn't try and finish branches late.

> The whole topic of locales is probably a bit too complex to explain in an
> error message. We could mention a URL about it.

Along the lines of "See `bzr help locale` for more information."?

> One thing I would do here is make the error message include the detected
> encoding to give people an idea where to start. (It may already be in the
> crash report but perhaps that's not so visible.)

Will do, I nearly did this but the spelling of ascii as 'ANSI_X3.4-1968' looked needlessly scary to me.

> Aside from that and the point about skip it looks good. Thanks for tackling
> it.

Skip change also in the new-old revisions.

Left to do is the bb.test_commit failure, and also one in bb.test_command_encoding which does something dodgy (with a comment saying as much).

Revision history for this message
Martin Packman (gz) wrote :

I removed the two problematic testing bits I found, there may be more fallout if plugins are doing dodgy things with TestCase.run_bzr (which they are in my experience), I'll follow up if that's the case.

I deleted test_unsupported_encoding_commit_message outright which loses the blame but not the coverage as it does the same thing as the new test in a less generic manner. Should really have found it before making the branch, poolie had linked it from the bug. The TestCommandEncoding.test_exact assertion might be useful to find a way to keep, though I'm not convinced writing arbitrary bytes to the text output is a generally good idea.

Revision history for this message
Martin Pool (mbp) wrote :

great, thankyou.

review: Approve
Revision history for this message
Martin Pool (mbp) wrote :

sent to pqm by email

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'bzrlib/commands.py'
--- bzrlib/commands.py 2011-04-05 01:12:15 +0000
+++ bzrlib/commands.py 2011-04-15 21:25:54 +0000
@@ -1031,7 +1031,7 @@
1031 Specify the number of processes that can be run concurrently (selftest).1031 Specify the number of processes that can be run concurrently (selftest).
1032 """1032 """
1033 trace.mutter("bazaar version: " + bzrlib.__version__)1033 trace.mutter("bazaar version: " + bzrlib.__version__)
1034 argv = list(argv)1034 argv = _specified_or_unicode_argv(argv)
1035 trace.mutter("bzr arguments: %r", argv)1035 trace.mutter("bzr arguments: %r", argv)
10361036
1037 opt_lsprof = opt_profile = opt_no_plugins = opt_builtin = \1037 opt_lsprof = opt_profile = opt_no_plugins = opt_builtin = \
@@ -1177,7 +1177,7 @@
1177 new_argv = []1177 new_argv = []
1178 try:1178 try:
1179 # ensure all arguments are unicode strings1179 # ensure all arguments are unicode strings
1180 for a in argv[1:]:1180 for a in argv:
1181 if isinstance(a, unicode):1181 if isinstance(a, unicode):
1182 new_argv.append(a)1182 new_argv.append(a)
1183 else:1183 else:
@@ -1199,7 +1199,8 @@
11991199
1200 :return: exit code of bzr command.1200 :return: exit code of bzr command.
1201 """1201 """
1202 argv = _specified_or_unicode_argv(argv)1202 if argv is not None:
1203 argv = argv[1:]
1203 _register_builtin_commands()1204 _register_builtin_commands()
1204 ret = run_bzr_catch_errors(argv)1205 ret = run_bzr_catch_errors(argv)
1205 trace.mutter("return code %d", ret)1206 trace.mutter("return code %d", ret)
12061207
=== modified file 'bzrlib/osutils.py'
--- bzrlib/osutils.py 2011-04-04 13:37:25 +0000
+++ bzrlib/osutils.py 2011-04-15 21:25:54 +0000
@@ -96,8 +96,8 @@
96 user_encoding = get_user_encoding()96 user_encoding = get_user_encoding()
97 return [a.decode(user_encoding) for a in sys.argv[1:]]97 return [a.decode(user_encoding) for a in sys.argv[1:]]
98 except UnicodeDecodeError:98 except UnicodeDecodeError:
99 raise errors.BzrError(("Parameter '%r' is unsupported by the current "99 raise errors.BzrError("Parameter %r encoding is unsupported by %s "
100 "encoding." % a))100 "application locale." % (a, user_encoding))
101101
102102
103def make_readonly(filename):103def make_readonly(filename):
104104
=== modified file 'bzrlib/tests/blackbox/test_command_encoding.py'
--- bzrlib/tests/blackbox/test_command_encoding.py 2009-03-23 14:59:43 +0000
+++ bzrlib/tests/blackbox/test_command_encoding.py 2011-04-15 21:25:54 +0000
@@ -54,13 +54,12 @@
54 register_command(cmd_echo_exact)54 register_command(cmd_echo_exact)
55 try:55 try:
56 self.assertEqual('foo', bzr('echo-exact foo'))56 self.assertEqual('foo', bzr('echo-exact foo'))
57 # This is cheating a little bit, because 'foo\xb5' shouldn't
58 # get past main()
59 self.assertEqual('foo\xb5', bzr('echo-exact foo\xb5'))
60 # Exact should fail to decode the string57 # Exact should fail to decode the string
61 self.assertRaises(UnicodeEncodeError,58 self.assertRaises(UnicodeEncodeError,
62 bzr,59 bzr,
63 ['echo-exact', u'foo\xb5'])60 ['echo-exact', u'foo\xb5'])
61 # Previously a non-ascii bytestring was also tested, as 'exact'
62 # outputs bytes untouched, but needed buggy argv parsing to work
64 finally:63 finally:
65 plugin_cmds.remove('echo-exact')64 plugin_cmds.remove('echo-exact')
6665
6766
=== modified file 'bzrlib/tests/blackbox/test_commit.py'
--- bzrlib/tests/blackbox/test_commit.py 2011-04-07 13:08:39 +0000
+++ bzrlib/tests/blackbox/test_commit.py 2011-04-15 21:25:54 +0000
@@ -321,26 +321,6 @@
321 tree.add('foo.c')321 tree.add('foo.c')
322 self.run_bzr('commit -m ""', retcode=3)322 self.run_bzr('commit -m ""', retcode=3)
323323
324 def test_unsupported_encoding_commit_message(self):
325 if sys.platform == 'win32':
326 raise tests.TestNotApplicable('Win32 parses arguments directly'
327 ' as Unicode, so we can\'t pass invalid non-ascii')
328 tree = self.make_branch_and_tree('.')
329 self.build_tree_contents([('foo.c', 'int main() {}')])
330 tree.add('foo.c')
331 # LANG env variable has no effect on Windows
332 # but some characters anyway cannot be represented
333 # in default user encoding
334 char = probe_bad_non_ascii(osutils.get_user_encoding())
335 if char is None:
336 raise TestSkipped('Cannot find suitable non-ascii character'
337 'for user_encoding (%s)' % osutils.get_user_encoding())
338 out,err = self.run_bzr_subprocess('commit -m "%s"' % char,
339 retcode=1,
340 env_changes={'LANG': 'C'})
341 self.assertContainsRe(err, r'bzrlib.errors.BzrError: Parameter.*is '
342 'unsupported by the current encoding.')
343
344 def test_other_branch_commit(self):324 def test_other_branch_commit(self):
345 # this branch is to ensure consistent behaviour, whether we're run325 # this branch is to ensure consistent behaviour, whether we're run
346 # inside a branch, or not.326 # inside a branch, or not.
347327
=== modified file 'bzrlib/tests/blackbox/test_exceptions.py'
--- bzrlib/tests/blackbox/test_exceptions.py 2011-03-05 01:37:14 +0000
+++ bzrlib/tests/blackbox/test_exceptions.py 2011-04-15 21:25:54 +0000
@@ -16,6 +16,7 @@
1616
17"""Tests for display of exceptions."""17"""Tests for display of exceptions."""
1818
19import os
19import sys20import sys
2021
21from bzrlib import (22from bzrlib import (
@@ -45,6 +46,21 @@
45 r'exceptions\.AssertionError: always fails\n')46 r'exceptions\.AssertionError: always fails\n')
46 self.assertContainsRe(err, r'Bazaar has encountered an internal error')47 self.assertContainsRe(err, r'Bazaar has encountered an internal error')
4748
49 def test_undecodable_argv(self):
50 """A user error must be reported if argv is not in the locale encoding
51
52 A subprocess with an environment ascii-only setting is used so the test
53 can run without worrying about the locale the test suite is using.
54 """
55 if os.name != "posix":
56 raise tests.TestNotApplicable("Needs system beholden to C locales")
57 out, err = self.run_bzr_subprocess(["\xa0"],
58 env_changes={"LANG": "C"},
59 universal_newlines=True,
60 retcode=errors.EXIT_ERROR)
61 self.assertContainsRe(err, r"^bzr: ERROR: .*'\\xa0'.* unsupported")
62 self.assertEquals(out, "")
63
4864
49class TestOptParseBugHandling(TestCase):65class TestOptParseBugHandling(TestCase):
50 "Test that we handle http://bugs.python.org/issue2931"66 "Test that we handle http://bugs.python.org/issue2931"
5167
=== modified file 'doc/en/release-notes/bzr-2.4.txt'
--- doc/en/release-notes/bzr-2.4.txt 2011-04-15 10:31:18 +0000
+++ doc/en/release-notes/bzr-2.4.txt 2011-04-15 21:25:54 +0000
@@ -53,6 +53,9 @@
53.. Fixes for situations where bzr would previously crash or give incorrect53.. Fixes for situations where bzr would previously crash or give incorrect
54 or undesirable results.54 or undesirable results.
5555
56* Arguments that can't be decoded to unicode in the current posix locale give
57 a clearer error message without a traceback. (Martin [gz], #745712)
58
56* ``bzrlib.log._DEFAULT_REQUEST_PARAMS`` is no longer accidentally59* ``bzrlib.log._DEFAULT_REQUEST_PARAMS`` is no longer accidentally
57 mutated by ``bzrlib.log._apply_log_request_defaults``. In practice60 mutated by ``bzrlib.log._apply_log_request_defaults``. In practice
58 these default values aren't relied on very often so this probably61 these default values aren't relied on very often so this probably