Merge lp:~jameinel/bzr/2.1.0b4-win32-unsupported_encoding_commit_message into lp:bzr

Proposed by John A Meinel on 2009-11-08
Status: Merged
Merged at revision: not available
Proposed branch: lp:~jameinel/bzr/2.1.0b4-win32-unsupported_encoding_commit_message
Merge into: lp:bzr
Diff against target: 32 lines (+4/-4)
1 file modified
bzrlib/tests/blackbox/test_commit.py (+4/-4)
To merge this branch: bzr merge lp:~jameinel/bzr/2.1.0b4-win32-unsupported_encoding_commit_message
Reviewer Review Type Date Requested Status
Martin Packman (community) Approve on 2009-11-11
bzr-core 2009-11-08 Pending
Review via email: mp+14603@code.launchpad.net
To post a comment you must log in.
John A Meinel (jameinel) wrote :

Another small update to get the test suite passing on windows. From the commit message:

test_unsupported_encoding_commit_message no longer applies for Windows.

We now parse a Unicode string directly, so it isn't really possible to pass
a non-ascii string that will fail to decode.

I'm slowly building these up into a separate 'win32 test suite' branch, but I figured it would be a lot easier to review them one-at-a-time.

Then if anything ends up being controversial, it doesn't have to block the rest of them.

Martin Packman (gz) wrote :

It is possible to give bazaar commit messages in the wrong encoding under windows using subprocess, but as what happens depends on the codepage, it's not very interesting to test. Again, would prefer if this didn't check sys.platform but confirmed that it does turn the failure into a n/a here.

review: Approve
John A Meinel (jameinel) wrote :

> It is possible to give bazaar commit messages in the wrong encoding under
> windows using subprocess, but as what happens depends on the codepage, it's
> not very interesting to test. Again, would prefer if this didn't check
> sys.platform but confirmed that it does turn the failure into a n/a here.

>>> p = subprocess.call([sys.executable, '-c',
                         'from bzrlib.win32utils import get_unicode_argv;'
                         'print get_unicode_argv()',
    'foo', 'ba\x99'])
[u'foo', u'ba\u2122']

This code is testing that passing something to the process will cause it to
raise an appropriate *error* when decoding the command line arguments. However,
since we now use "get_unicode_argv()" Windows itself is decoding the command
line arguments, and I'm pretty sure all of its codepages support all 8-bit
characters.

I'm not positive to that fact, as I haven't tried different OS installs
(Russian Windows, etc.)

I can say that the meaning of 'ba\x99' will probably depend on the current
codepage, but I haven't found a way for the subprocess to think it has failed
to read its arguments. (A user can't force the arguments in such a way that the
client can tell the arguments aren't meaningful.)

That would be different if you could, for example, set your code-page to a
multi-byte sequence like UTF-8.

However, this *isn't* about getting commit message encodings wrong, this is
about failing to decode command line arguments.

Martin Packman (gz) wrote :

Right, it's not interesting to test because it's just mangled message (here, malformed multibyte characters get discarded or replaced with a generic character).

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'bzrlib/tests/blackbox/test_commit.py'
--- bzrlib/tests/blackbox/test_commit.py 2009-11-03 20:24:25 +0000
+++ bzrlib/tests/blackbox/test_commit.py 2009-11-08 01:20:24 +0000
@@ -25,6 +25,7 @@
25 ignores,25 ignores,
26 msgeditor,26 msgeditor,
27 osutils,27 osutils,
28 tests,
28 )29 )
29from bzrlib.bzrdir import BzrDir30from bzrlib.bzrdir import BzrDir
30from bzrlib.tests import (31from bzrlib.tests import (
@@ -263,6 +264,9 @@
263 self.run_bzr('commit -m ""', retcode=3)264 self.run_bzr('commit -m ""', retcode=3)
264265
265 def test_unsupported_encoding_commit_message(self):266 def test_unsupported_encoding_commit_message(self):
267 if sys.platform == 'win32':
268 raise tests.TestNotApplicable('Win32 parses arguments directly'
269 ' as Unicode, so we can\'t pass invalid non-ascii')
266 tree = self.make_branch_and_tree('.')270 tree = self.make_branch_and_tree('.')
267 self.build_tree_contents([('foo.c', 'int main() {}')])271 self.build_tree_contents([('foo.c', 'int main() {}')])
268 tree.add('foo.c')272 tree.add('foo.c')
@@ -273,10 +277,6 @@
273 if char is None:277 if char is None:
274 raise TestSkipped('Cannot find suitable non-ascii character'278 raise TestSkipped('Cannot find suitable non-ascii character'
275 'for user_encoding (%s)' % osutils.get_user_encoding())279 'for user_encoding (%s)' % osutils.get_user_encoding())
276 # TODO: jam 2009-07-23 This test seems to fail on Windows now. My best
277 # guess is that the change to use Unicode command lines means
278 # that we no longer pay any attention to LANG=C when decoding the
279 # commandline arguments.
280 out,err = self.run_bzr_subprocess('commit -m "%s"' % char,280 out,err = self.run_bzr_subprocess('commit -m "%s"' % char,
281 retcode=1,281 retcode=1,
282 env_changes={'LANG': 'C'})282 env_changes={'LANG': 'C'})