Merge lp:~jml/testtools/unprintable-assertThat-804127 into lp:~testtools-committers/testtools/trunk

Proposed by Jonathan Lange
Status: Merged
Merged at revision: 229
Proposed branch: lp:~jml/testtools/unprintable-assertThat-804127
Merge into: lp:~testtools-committers/testtools/trunk
Diff against target: 309 lines (+158/-15)
7 files modified
NEWS (+10/-1)
scripts/all-pythons (+6/-3)
testtools/compat.py (+7/-2)
testtools/matchers.py (+27/-0)
testtools/testcase.py (+3/-8)
testtools/tests/test_matchers.py (+58/-0)
testtools/tests/test_testcase.py (+47/-1)
To merge this branch: bzr merge lp:~jml/testtools/unprintable-assertThat-804127
Reviewer Review Type Date Requested Status
Martin Packman Approve
testtools committers Pending
Review via email: mp+70530@code.launchpad.net

Description of the change

This branch addresses bug 804127, and I believe fixes it. The approach is to raise our own kind of exception, one that actually works with extended unicode information.

In addition, I've added a to_text() method to compat, that is a way of saying 'unicode' in Python 2 and 3.

Points of concern:
 * Using to_text() in test_verbose_unicode but str() in all the other tests feels wrong, but I'm not sure what's right.
 * I haven't changed or better defined the contract for describe(). Is this really a problem?
 * I was surprised that _exception_to_text could have been so broken for Python 3, but it turns out that it wasn't used at all for Python 3. This makes me wonder how we should be actually testing this bug.

To post a comment you must log in.
Martin Packman (gz) wrote :

This looks like really the best option, thanks.

+try:
+ to_text = unicode
+except NameError:
+ to_text = str
...
- return unicode(evalue)
+ return to_text(evalue)

I think this is abstraction at the wrong level, Python 3 should just be able to do `str(e)` rather than `_exception_to_text(e)` where you use it, or if we really want the exception catching (doesn't seem needed in the test), add a Python 3 version which would basically be:

    def _exception_to_text(evalue):
        try:
            return str(evalue)
        except Exception:
            return ...some fallback...

+ def __init__(self, matchee, matcher, mismatch, verbose=False):

I'm not sure about having verbosity as an attribute of the exception instance, but guess that's a natural consequence given the way the option was added earlier. Otherwise, the traceback formatting code would be complicated by trying to call something other than unicode()/str() on the exception so a verbose argument could be passed. Which would get very ugly.

+ def __str__(self):
+ difference = self.mismatch.describe()

If we are strict about describe methods always returning unicode, and don't expect MismatchError instances to be handled outside the testtools module, this is okay. Otherwise, it would be better to have both a __unicode__ and __str__ method, with the __str__ method returning ascii on Python 2. I can put up some testcases demonstrating the issues involved.

+ # XXX: Using to_text rather than str because, on Python 2, str will
+ # raise UnicodeEncodeError.

As mentioned above, this may be worth avoiding by having __str__ fall back to ascii on Python 2.

+ if ``self.failureException`` has been set to a non-default value, then
+ mismatches will become test errors rather than test failures.

Not that I've ever seen anyone use this unittest feature, but could the testtools runners at least avoid this by doing `except (self.failureException, MismatchError):` or similar?

review: Approve
Jonathan Lange (jml) wrote :

On Fri, Aug 5, 2011 at 5:03 PM, Martin [gz] <email address hidden> wrote:
> Review: Approve
> This looks like really the best option, thanks.
>
> +try:
> +    to_text = unicode
> +except NameError:
> +    to_text = str
> ...
> -        return unicode(evalue)
> +        return to_text(evalue)
>
> I think this is abstraction at the wrong level, Python 3 should just be able to do `str(e)` rather than `_exception_to_text(e)` where you use it, or if we really want the exception catching (doesn't seem needed in the test), add a Python 3 version which would basically be:
>
>    def _exception_to_text(evalue):
>        try:
>            return str(evalue)
>        except Exception:
>            return ...some fallback...
>

Yeah, I see your point. I think having that wrapper is the best way,
since that allows us to write tests that verify behaviour. Currently
_exception_to_text is only used in Python 2 anyway.

I still think to_text is useful though :)

> +    def __init__(self, matchee, matcher, mismatch, verbose=False):
>
> I'm not sure about having verbosity as an attribute of the exception instance, but guess that's a natural consequence given the way the option was added earlier. Otherwise, the traceback formatting code would be complicated by trying to call something other than unicode()/str() on the exception so a verbose argument could be passed. Which would get very ugly.

Perhaps I could raise a standard failure in assertThat for
verbose=False. I agree it's a bit of a violation abstraction as is.

>
> +    def __str__(self):
> +        difference = self.mismatch.describe()
>
> If we are strict about describe methods always returning unicode, and don't expect MismatchError instances to be handled outside the testtools module, this is okay. Otherwise, it would be better to have both a __unicode__ and __str__ method, with the __str__ method returning ascii on Python 2. I can put up some testcases demonstrating the issues involved.

Those test cases would be useful, thanks. Even if they are just
copy-and-paste from the REPL.

> +  if ``self.failureException`` has been set to a non-default value, then
> +  mismatches will become test errors rather than test failures.
>
> Not that I've ever seen anyone use this unittest feature, but could the testtools runners at least avoid this by doing `except (self.failureException, MismatchError):` or similar?

Good idea!

Thanks for the great review!

jml

Martin Packman (gz) wrote :

I've got a little module with tests, the summary of interesting ones is:

    assertThat(_b("\a7"), Equals(_b("\b1")), verbose=True)
Results in a non-ascii byte string for the assertion on Python 2, worse when the bytes include control characters of various sorts. On Python 3 the bytes type ends up in an escaped form when stringified which is much better. All verbose matching with bytes has this issue.

    assertThat(_b("\a7"), MatchesRegex(_b("\b1")))
    assertThat(_b("\a7"), StartsWith(_b("\b1")))
The reverse problem here on Python 3, the matchers are printed as /b'\xb1'/ and 'b'\xb1'' which is not ideal. Python 2 again ends up with a non-ascii assertion.

    assertThat(_u("\a7"), MatchesRegex(_b("\b1")), verbose=True)
Results in an unprintable MismatchError on Python 2.

Martin Packman (gz) wrote :

See <lp:~gz/+junk/testtools_that> for examples of what failing tests look like.

225. By Jonathan Lange on 2011-08-09

Use the right level API to check the error output.

226. By Jonathan Lange on 2011-08-09

Docstring.

227. By Jonathan Lange on 2011-08-11

Take parameters.

228. By Jonathan Lange on 2011-08-11

Merge trunk

Jonathan Lange (jml) wrote :

IRC discussion about this branch::

<jml> mgz: this bytes/unicode thing is doing my head in
<mgz> ehhee
<mgz> jml: and if you've not read PEP 383 yet, which poolie linked the other day, know that it only gets messiuer
<mgz> -u
<mgz> I think testtools is basically there, on a hard problem, apart from two things
<jml> I haven't.
<mgz> 1) Matcher and Mismatch classes need to not use %s, specifically StartsWith and MatchesRegex
<mgz> 2) __str__ on Python 2 should mangle to ascii rather than return unicode and blow things up
<jml> Ahhh, you *say* that.
<mgz> testtools can be smart and try __unicode__ first, and other if the objects leak into other less careful libraries they won't be harmed
<jml> mgz: you mean something like http://paste.ubuntu.com/663608/?
<mgz> in fact, testtools already is smart.
<mgz> I'd have spelled it differently, but yup, looks about what I was thinking
<mgz> did it cause issues?
<mgz> the only issue remaining from there is to not let non-ascii bytestrings in, which is why StartsWith etc needs chaning
<jml> mgz: yeah
<mgz> *changing
<jml> mgz: even on Equals(), you get an unprintable assertion.
<mgz> that's the describe() return issue.
<mgz> I think it might be best to check the type in that __unicode__ function and do something similar to how bytes() get repr-ed in Python 3, rather than trying to fix every method...
<mgz> hm.
<jml> http://paste.ubuntu.com/663616/ for something to muck around with.
<mgz> yup, that's the kind of assertion that we want to work for people.
<mgz> okay, I'll think a bit and throw some code up
<mgz> ...not as in vomit, though I don't guarentee it won't look like that
<jml> http://paste.ubuntu.com/663619/
<jml> more data
<jml> mgz: heh heh
<jml> mgz: also, I didn't really know how to turn your demo tests into actual tests.
<jml> mgz: just pushed up my latest changes, fixes the _exception_to_text abstraction violation.
<jml> mgz: but, however you choose to project your code onto the internet, it would be much appreciated.
<mgz> yeah, I left the demo file as was because getting into another layer of assertions gets really confusing, it's clearer to just see the output as a testtools user would
<jml> fair enough
<mgz> tests for the things that then make those cases work as intended are writable
<mgz> (what type the describe() method returns on non-ascii mismatches etc)
<jml> cool. I tried naively turning them into tests, and they all passed, which isn't really what we want.
<jml> yeah
<jml> fwiw, https://bugs.launchpad.net/testtools/+bug/686807 might be relevant
* mgz looks
<mgz> yeah, that change could probably just be made, the __str__ methods already return things that should work as object creation code, and str() falls back to repr()

Am now waiting on mgz to provide some code suggestions.

Martin Packman (gz) wrote :

An update on where I am, as we've failed to connect on IRC a couple of times.

Fixing problem 1) above with %s without changing the current behaviour is a little complex as there are several different variations on the output desired, and also changes across Python implementations even in the trivial case due to different prefixes. So, I got a little stuck trying to write a does-everything fancy repr without breaking the current behaviour. Simply changing everything to %r would work* but lose the breaking up of multiline strings and new regexp backslash behaviour. I'm nearly there with a proper fix, it's just fiddly.

*Mostly. One kind of problem currently is like:

    from testtools import TestCase
    from testtools.matchers import Equals

    class Test(TestCase):

        def test_lost_control(self):
            self.assertThat("\x1b[31;1m", Equals("CSI 32 ; 1 m"), verbose=True)

    if __name__ == "__main__":
        import unittest
        unittest.main()

This sort is particularly fun as C0 codes are perfectly valid in returns from __repr__ functions, so even with strings fixed other objects can still give surprising test result output.

Jonathan Lange (jml) wrote :

On Fri, Aug 19, 2011 at 10:02 PM, Martin [gz] <email address hidden> wrote:
> An update on where I am, as we've failed to connect on IRC a couple of times.
>

Thanks for the update!

jml

229. By Martin Packman on 2011-08-21

Tests for repr function to replace broken stringification schemes in matchers

230. By Martin Packman on 2011-08-22

Correct a couple of the examples for text_repr tests

231. By Martin Packman on 2011-08-22

Messy first pass at implementing text_repr

232. By Martin Packman on 2011-08-22

Use (possibly extended) repr rather than "%s" for matchee in verbose form of MismatchError

233. By Martin Packman on 2011-08-22

Make MismatchError stringification return appropriate types on Python 2

234. By Martin Packman on 2011-08-22

Remove now unused to_text alias for unicode string type from compat

235. By Martin Packman on 2011-08-24

Test and fix _BinaryMismatch.describe long forms using text_repr

236. By Martin Packman on 2011-08-24

Hack over bump with bytes till things can be rewritten in a better way

237. By Martin Packman on 2011-08-24

Use text_repr for DoesNotStartWith and DoesNotEndWith describe methods

238. By Martin Packman on 2011-08-24

Make StartsWith and EndsWith stringify more like other matchers

239. By Martin Packman on 2011-08-24

Fix MatchesRegex mismatch description with a little transcoding dance

240. By Martin Packman on 2011-08-24

Avoid potential bytes and unicode mixing with DocTestMismatch on Python 2

241. By Martin Packman on 2011-09-13

Extra tests to ensure multiline quoting is correct on Python 3

242. By Martin Packman on 2011-09-13

Fix spelling error noted in review by jml

243. By Martin Packman on 2011-09-13

Add cautions about unprintable strings to Mismatch documentation

244. By Martin Packman on 2011-09-13

Add third state to multiline argument of text_repr and comment the internal logic

245. By Martin Packman on 2011-09-13

Test that not passing multiline to text_repr defaults based on input

246. By Jonathan Lange on 2011-09-14

Merge trunk.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2011-08-08 11:16:01 +0000
3+++ NEWS 2011-08-11 18:28:12 +0000
4@@ -14,6 +14,12 @@
5 now deprecated. Please stop using it.
6 (Jonathan Lange, #813460)
7
8+* ``assertThat`` raises ``MismatchError`` instead of
9+ ``TestCase.failureException``. ``MismatchError`` is a subclass of
10+ ``AssertionError``, so in most cases this change will not matter. However,
11+ if ``self.failureException`` has been set to a non-default value, then
12+ mismatches will become test errors rather than test failures.
13+
14 * ``gather_details`` takes two dicts, rather than two detailed objects.
15 (Jonathan Lange, #801027)
16
17@@ -30,7 +36,10 @@
18 * All public matchers are now in ``testtools.matchers.__all__``.
19 (Jonathan Lange, #784859)
20
21-* assertThat output is much less verbose, displaying only what the mismatch
22+* ``assertThat`` can actually display mismatches and matchers that contain
23+ extended unicode characters. (Jonathan Lange, Martin [gz], #804127)
24+
25+* ``assertThat`` output is much less verbose, displaying only what the mismatch
26 tells us to display. Old-style verbose output can be had by passing
27 ``verbose=True`` to assertThat. (Jonathan Lange, #675323, #593190)
28
29
30=== modified file 'scripts/all-pythons'
31--- scripts/all-pythons 2011-07-26 22:28:21 +0000
32+++ scripts/all-pythons 2011-08-11 18:28:12 +0000
33@@ -29,7 +29,9 @@
34 ROOT = os.path.dirname(os.path.dirname(__file__))
35
36
37-def run_for_python(version, result):
38+def run_for_python(version, result, tests):
39+ if not tests:
40+ tests = ['testtools.tests.test_suite']
41 # XXX: This could probably be broken up and put into subunit.
42 python = 'python%s' % (version,)
43 # XXX: Correct API, but subunit doesn't support it. :(
44@@ -58,7 +60,8 @@
45 cmd = [
46 python,
47 '-W', 'ignore:Module testtools was already imported',
48- subunit_path, 'testtools.tests.test_suite']
49+ subunit_path]
50+ cmd.extend(tests)
51 process = subprocess.Popen(
52 cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, env=env)
53 _make_stream_binary(process.stdout)
54@@ -87,4 +90,4 @@
55 sys.path.append(ROOT)
56 result = TestProtocolClient(sys.stdout)
57 for version in '2.4 2.5 2.6 2.7 3.0 3.1 3.2'.split():
58- run_for_python(version, result)
59+ run_for_python(version, result, sys.argv[1:])
60
61=== modified file 'testtools/compat.py'
62--- testtools/compat.py 2011-07-26 23:08:51 +0000
63+++ testtools/compat.py 2011-08-11 18:28:12 +0000
64@@ -143,8 +143,13 @@
65 stream.newlines, stream.line_buffering)
66 except AttributeError:
67 pass
68- return writer(stream, "replace")
69-
70+ return writer(stream, "replace")
71+
72+
73+try:
74+ to_text = unicode
75+except NameError:
76+ to_text = str
77
78 # The default source encoding is actually "iso-8859-1" until Python 2.5 but
79 # using non-ascii causes a deprecation warning in 2.4 and it's cleaner to
80
81=== modified file 'testtools/matchers.py'
82--- testtools/matchers.py 2011-08-08 11:14:01 +0000
83+++ testtools/matchers.py 2011-08-11 18:28:12 +0000
84@@ -129,6 +129,33 @@
85 id(self), self.__dict__)
86
87
88+class MismatchError(AssertionError):
89+ """Raised when a mismatch occurs."""
90+
91+ # This class exists to work around
92+ # <https://bugs.launchpad.net/testtools/+bug/804127>. It provides a
93+ # guaranteed way of getting a readable exception, no matter what crazy
94+ # characters are in the matchee, matcher or mismatch.
95+
96+ def __init__(self, matchee, matcher, mismatch, verbose=False):
97+ # Have to use old-style upcalling for Python 2.4 and 2.5
98+ # compatibility.
99+ AssertionError.__init__(self)
100+ self.matchee = matchee
101+ self.matcher = matcher
102+ self.mismatch = mismatch
103+ self.verbose = verbose
104+
105+ def __str__(self):
106+ difference = self.mismatch.describe()
107+ if self.verbose:
108+ return (
109+ 'Match failed. Matchee: "%s"\nMatcher: %s\nDifference: %s\n'
110+ % (self.matchee, self.matcher, difference))
111+ else:
112+ return difference
113+
114+
115 class MismatchDecorator(object):
116 """Decorate a ``Mismatch``.
117
118
119=== modified file 'testtools/testcase.py'
120--- testtools/testcase.py 2011-07-27 19:47:22 +0000
121+++ testtools/testcase.py 2011-08-11 18:28:12 +0000
122@@ -29,6 +29,7 @@
123 Annotate,
124 Equals,
125 MatchesException,
126+ MismatchError,
127 Is,
128 Not,
129 )
130@@ -391,7 +392,7 @@
131
132 :param matchee: An object to match with matcher.
133 :param matcher: An object meeting the testtools.Matcher protocol.
134- :raises self.failureException: When matcher does not match thing.
135+ :raises MismatchError: When matcher does not match thing.
136 """
137 # XXX: Should this take an optional 'message' parameter? Would kind of
138 # make sense. The hamcrest one does.
139@@ -406,13 +407,7 @@
140 full_name = "%s-%d" % (name, suffix)
141 suffix += 1
142 self.addDetail(full_name, content)
143- if verbose:
144- message = (
145- 'Match failed. Matchee: "%s"\nMatcher: %s\nDifference: %s\n'
146- % (matchee, matcher, mismatch.describe()))
147- else:
148- message = mismatch.describe()
149- self.fail(message)
150+ raise MismatchError(matchee, matcher, mismatch, verbose)
151
152 def defaultTestResult(self):
153 return TestResult()
154
155=== modified file 'testtools/tests/test_matchers.py'
156--- testtools/tests/test_matchers.py 2011-08-08 11:14:01 +0000
157+++ testtools/tests/test_matchers.py 2011-08-11 18:28:12 +0000
158@@ -12,6 +12,7 @@
159 )
160 from testtools.compat import (
161 StringIO,
162+ to_text,
163 _u,
164 )
165 from testtools.matchers import (
166@@ -37,6 +38,7 @@
167 MatchesStructure,
168 Mismatch,
169 MismatchDecorator,
170+ MismatchError,
171 Not,
172 NotEquals,
173 Raises,
174@@ -62,6 +64,62 @@
175 self.assertEqual({}, mismatch.get_details())
176
177
178+class TestMismatchError(TestCase):
179+
180+ def test_is_assertion_error(self):
181+ # MismatchError is an AssertionError, so that most of the time, it
182+ # looks like a test failure, rather than an error.
183+ def raise_mismatch_error():
184+ raise MismatchError(2, Equals(3), Equals(3).match(2))
185+ self.assertRaises(AssertionError, raise_mismatch_error)
186+
187+ def test_default_description_is_mismatch(self):
188+ mismatch = Equals(3).match(2)
189+ e = MismatchError(2, Equals(3), mismatch)
190+ self.assertEqual(mismatch.describe(), str(e))
191+
192+ def test_default_description_unicode(self):
193+ matchee = _u('\xa7')
194+ matcher = Equals(_u('a'))
195+ mismatch = matcher.match(matchee)
196+ e = MismatchError(matchee, matcher, mismatch)
197+ self.assertEqual(mismatch.describe(), str(e))
198+
199+ def test_verbose_description(self):
200+ matchee = 2
201+ matcher = Equals(3)
202+ mismatch = matcher.match(2)
203+ e = MismatchError(matchee, matcher, mismatch, True)
204+ expected = (
205+ 'Match failed. Matchee: "%s"\n'
206+ 'Matcher: %s\n'
207+ 'Difference: %s\n' % (
208+ matchee,
209+ matcher,
210+ matcher.match(matchee).describe(),
211+ ))
212+ self.assertEqual(expected, str(e))
213+
214+ def test_verbose_unicode(self):
215+ # When assertThat is given matchees or matchers that contain non-ASCII
216+ # unicode strings, we can still provide a meaningful error.
217+ matchee = _u('\xa7')
218+ matcher = Equals(_u('a'))
219+ mismatch = matcher.match(matchee)
220+ expected = (
221+ 'Match failed. Matchee: "%s"\n'
222+ 'Matcher: %s\n'
223+ 'Difference: %s\n' % (
224+ matchee,
225+ matcher,
226+ mismatch.describe(),
227+ ))
228+ e = MismatchError(matchee, matcher, mismatch, True)
229+ # XXX: Using to_text rather than str because, on Python 2, str will
230+ # raise UnicodeEncodeError.
231+ self.assertEqual(expected, to_text(e))
232+
233+
234 class TestMatchersInterface(object):
235
236 def test_matches_match(self):
237
238=== modified file 'testtools/tests/test_testcase.py'
239--- testtools/tests/test_testcase.py 2011-07-26 23:48:48 +0000
240+++ testtools/tests/test_testcase.py 2011-08-11 18:28:12 +0000
241@@ -18,7 +18,10 @@
242 skipUnless,
243 testcase,
244 )
245-from testtools.compat import _b
246+from testtools.compat import (
247+ _b,
248+ _u,
249+ )
250 from testtools.matchers import (
251 Equals,
252 MatchesException,
253@@ -29,6 +32,7 @@
254 Python27TestResult,
255 ExtendedTestResult,
256 )
257+from testtools.testresult.real import TestResult
258 from testtools.tests.helpers import (
259 an_exc_info,
260 LoggingResult,
261@@ -482,6 +486,48 @@
262 self.assertFails(
263 expected, self.assertThat, matchee, matcher, verbose=True)
264
265+ def get_error_string(self, e):
266+ """Get the string showing how 'e' would be formatted in test output.
267+
268+ This is a little bit hacky, since it's designed to give consistent
269+ output regardless of Python version.
270+
271+ In testtools, TestResult._exc_info_to_unicode is the point of dispatch
272+ between various different implementations of methods that format
273+ exceptions, so that's what we have to call. However, that method cares
274+ about stack traces and formats the exception class. We don't care
275+ about either of these, so we take its output and parse it a little.
276+ """
277+ error = TestResult()._exc_info_to_unicode((e.__class__, e, None), self)
278+ # We aren't at all interested in the traceback.
279+ if error.startswith('Traceback (most recent call last):\n'):
280+ lines = error.splitlines(True)[1:]
281+ for i, line in enumerate(lines):
282+ if not line.startswith(' '):
283+ break
284+ error = ''.join(lines[i:])
285+ # We aren't interested in how the exception type is formatted.
286+ exc_class, error = error.split(': ', 1)
287+ return error
288+
289+ def test_assertThat_verbose_unicode(self):
290+ # When assertThat is given matchees or matchers that contain non-ASCII
291+ # unicode strings, we can still provide a meaningful error.
292+ matchee = _u('\xa7')
293+ matcher = Equals(_u('a'))
294+ expected = (
295+ 'Match failed. Matchee: "%s"\n'
296+ 'Matcher: %s\n'
297+ 'Difference: %s\n\n' % (
298+ matchee,
299+ matcher,
300+ matcher.match(matchee).describe(),
301+ ))
302+ e = self.assertRaises(
303+ self.failureException, self.assertThat, matchee, matcher,
304+ verbose=True)
305+ self.assertEqual(expected, self.get_error_string(e))
306+
307 def test_assertEqual_nice_formatting(self):
308 message = "These things ought not be equal."
309 a = ['apple', 'banana', 'cherry']

Subscribers

People subscribed via source and target branches