Merge lp:~gz/testtools/unprintable-assertThat-804127 into lp:~testtools-committers/testtools/trunk
- unprintable-assertThat-804127
- Merge into trunk
Status: | Merged |
---|---|
Approved by: | Jonathan Lange |
Approved revision: | 245 |
Merged at revision: | 229 |
Proposed branch: | lp:~gz/testtools/unprintable-assertThat-804127 |
Merge into: | lp:~testtools-committers/testtools/trunk |
Prerequisite: | lp:~jml/testtools/unprintable-assertThat-804127 |
Diff against target: |
754 lines (+439/-32) (has conflicts) 6 files modified
doc/for-test-authors.rst (+10/-2) testtools/compat.py (+93/-5) testtools/matchers.py (+40/-13) testtools/tests/test_compat.py (+127/-0) testtools/tests/test_matchers.py (+151/-9) testtools/tests/test_testcase.py (+18/-3) Text conflict in testtools/tests/test_matchers.py |
To merge this branch: | bzr merge lp:~gz/testtools/unprintable-assertThat-804127 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Jonathan Lange | Approve | ||
Review via email: mp+72641@code.launchpad.net |
Commit message
Description of the change
Resolve issues related to stringifying in the matcher code, by replacing several custom repr-like schemes using %s on strings (which may upcase the result to unicode or prevent later upcasting, or leak control codes, or...) with a big ugly method that tries to do what the callers wanted.
This deserves a longer description of the reasoning behind the changes, but there's some discussion of the reasoning in the prerequisite branch and I need some feedback before I go crazy\bier.
Some random notes so I don't forget them later:
* Switched from """...""" to '''...''' so if I see output pasted in bug reports I'll know if I broke it.
* Being clever with the escaping of single quotes in multiline strings is probably more trouble than it's worth, just putting a backslash in front of every occurrence would simplify the logic a lot.
* Astral characters break, as per usual. I'm amused to note that upstream just fixed one more issue their end[1] (but without exposing a usable iterator to python level code still), and on Python 3 only of course.
* If the decision is to dump all these attempts at fancy formatting and just live with the normal repr, I would not mind that at all.
Jonathan Lange (jml) wrote : | # |
Robert Collins (lifeless) wrote : | # |
* What are we expecting Matcher.__str__ to return? Text or bytes?
-- text. Fugly I know. So python 2.x: use __unicode__, 3.x use
__str__, and we expect text all the time. We use it to get something
looking like what the user asked for after all.
....
* Our docs say that match() should return a Mismatch, so we can assume
subclassing of that.
Please no, duck typing should be all we need. (And None is not a
subclass of Mismatch anyway :P).
-Rob
Martin Packman (gz) wrote : | # |
Thanks, this is is a useful review that covers all the stuff I needed to mention but didn't get in the proposal.
> To summarize, although the bug was originally filed against assertThat, it
> turns out that it's actually an issue in the matchers themselves as well as
> assertThat. Right?
Yes. The initial thoughts were about adding more escaping to assertThat, but by then most of the formatting has been done. By moving the onus onto Matchers to escape more carefully, it's possible to preserve more information.
> If so, we have to be thinking about:
> * what are our failure modes with this code and existing broken matchers
> * how to make it easy to define matchers and mismatches correctly
These are exactly the two primary concerns here. We don't want to punish test case authors with spurious errors if they specify an assert incorrectly, but give a clear enough failure for them to fix the problem. We also don't want learning about Python string quirks to be a prerequisite for writing a Matcher.
> * Can you elaborate on your comment on istext()?
The result of the function is generally not enough information to actually use your string safely.
Two of the current callers in texttools are just trying to do special handling on a string, and otherwise assume the object is already a specific correct type. In both cases, the call is (harmlessly) subtly wrong, __import__ doesn't take non-ascii strings on Python 2, and Python 3 bytes can be used for regular expressions.
Code wanting to do actual string handling has it worse, as it will always risk getting a string that can't be safely interpolated, or written stream, or otherwise given as output.
In short, it's something of an attractive nuisance. In using it, I realised the text_repr interface wasn't very good.
> * Is there a way we can make it so matchers never have to use _isbytes?
> Perhaps by adding an "is_multiline" function to compat.
Something like this would be much better, possibly moving the multiline logic into the repr function (so True means triple quote if there's a newline, or by adding a third state). This was me lowering my standards and keeping coding, in order to get something that would work. :)
> * text_repr is a bit hard to follow, maybe add some internal comments?
Yup, and some general cleanup would help, I left the first version that passed the tests alone so we could see if it actually worked as an interface before buffing it.
> * Some more documentation on the intent & API of text_repr would help,
> saying that it attempts to mimic repr() except returning unicode,
> giving some examples, explaining why you might want to use it and also what
> multiline does.
Will write this. As is probably clear from the tests, it aims to use the Python 3 logic for displaying literal unicode characters, and to maintain the `eval(repr(s)) == s` expectation.
> * I don't get why multiline needs three different ways of escaping lines
> whereas the non-multiline case only has two branches
So, there are a cube of relevant combinations:
F T
/ | | R | bytes
/ ...
Jonathan Lange (jml) wrote : | # |
Wow, thanks for the reply. Sorry its taken so long to get back to you.
In general, your replies to my review seem to be either agreeing with me or informing me of something I didn't know. There are a few cleanup things that need to be done before this lands (identified above). Where do you want to go from here?
- 241. By Martin Packman
-
Extra tests to ensure multiline quoting is correct on Python 3
- 242. By Martin Packman
-
Fix spelling error noted in review by jml
- 243. By Martin Packman
-
Add cautions about unprintable strings to Mismatch documentation
- 244. By Martin Packman
-
Add third state to multiline argument of text_repr and comment the internal logic
- 245. By Martin Packman
-
Test that not passing multiline to text_repr defaults based on input
Martin Packman (gz) wrote : | # |
Okay, this branch should now be tidied up enough to land, can you see anything I've missed?
I tried out switching the repr-like function to take any object, and it does seem to be a better interface but complicates the logic even further, so should probably be done separately.
There's still a bunch of todo all over the place, and I haven't managed to squish _isbytes yet, though it's not strictly necessary in the matchers code bar some new tests I added that expect fancy behaviour.
There are a couple more interesting things that could do with handling later. Either Annotate or AnnotateMismatch (or both, see the earlier complaint about string typing) should check the message they are given is printable unicode. Also the regular expression display form could do with some extra testing, I'm not totally convinced it always produces correct expressions (and certainly doesn't with some on Python 2.4 where the unicode-escape codec fails to escape backslashes).
Jonathan Lange (jml) wrote : | # |
Can't see anything you've missed. I'm going to merge this, and will file wishlist bugs for the things that you say you've left out.
Jonathan Lange (jml) : | # |
Jonathan Lange (jml) wrote : | # |
Filed bug 849846 and bug 849843.
Martin Packman (gz) wrote : | # |
Great, thanks jml, I'll subscribe to those bugs.
Preview Diff
1 | === modified file 'doc/for-test-authors.rst' | |||
2 | --- doc/for-test-authors.rst 2011-08-15 16:14:42 +0000 | |||
3 | +++ doc/for-test-authors.rst 2011-09-13 23:54:14 +0000 | |||
4 | @@ -717,7 +717,7 @@ | |||
5 | 717 | self.remainder = remainder | 717 | self.remainder = remainder |
6 | 718 | 718 | ||
7 | 719 | def describe(self): | 719 | def describe(self): |
9 | 720 | return "%s is not divisible by %s, %s remains" % ( | 720 | return "%r is not divisible by %r, %r remains" % ( |
10 | 721 | self.number, self.divider, self.remainder) | 721 | self.number, self.divider, self.remainder) |
11 | 722 | 722 | ||
12 | 723 | def get_details(self): | 723 | def get_details(self): |
13 | @@ -738,11 +738,19 @@ | |||
14 | 738 | remainder = actual % self.divider | 738 | remainder = actual % self.divider |
15 | 739 | if remainder != 0: | 739 | if remainder != 0: |
16 | 740 | return Mismatch( | 740 | return Mismatch( |
18 | 741 | "%s is not divisible by %s, %s remains" % ( | 741 | "%r is not divisible by %r, %r remains" % ( |
19 | 742 | actual, self.divider, remainder)) | 742 | actual, self.divider, remainder)) |
20 | 743 | else: | 743 | else: |
21 | 744 | return None | 744 | return None |
22 | 745 | 745 | ||
23 | 746 | When writing a ``describe`` method or constructing a ``Mismatch`` object the | ||
24 | 747 | code should ensure it only emits printable unicode. As this output must be | ||
25 | 748 | combined with other text and forwarded for presentation, letting through | ||
26 | 749 | non-ascii bytes of ambiguous encoding or control characters could throw an | ||
27 | 750 | exception or mangle the display. In most cases simply avoiding the ``%s`` | ||
28 | 751 | format specifier and using ``%r`` instead will be enough. For examples of | ||
29 | 752 | more complex formatting see the ``testtools.matchers`` implementatons. | ||
30 | 753 | |||
31 | 746 | 754 | ||
32 | 747 | Details | 755 | Details |
33 | 748 | ======= | 756 | ======= |
34 | 749 | 757 | ||
35 | === modified file 'testtools/compat.py' | |||
36 | --- testtools/compat.py 2011-09-13 23:54:14 +0000 | |||
37 | +++ testtools/compat.py 2011-09-13 23:54:14 +0000 | |||
38 | @@ -25,6 +25,7 @@ | |||
39 | 25 | import re | 25 | import re |
40 | 26 | import sys | 26 | import sys |
41 | 27 | import traceback | 27 | import traceback |
42 | 28 | import unicodedata | ||
43 | 28 | 29 | ||
44 | 29 | from testtools.helpers import try_imports | 30 | from testtools.helpers import try_imports |
45 | 30 | 31 | ||
46 | @@ -52,6 +53,7 @@ | |||
47 | 52 | """ | 53 | """ |
48 | 53 | 54 | ||
49 | 54 | if sys.version_info > (3, 0): | 55 | if sys.version_info > (3, 0): |
50 | 56 | import builtins | ||
51 | 55 | def _u(s): | 57 | def _u(s): |
52 | 56 | return s | 58 | return s |
53 | 57 | _r = ascii | 59 | _r = ascii |
54 | @@ -59,12 +61,14 @@ | |||
55 | 59 | """A byte literal.""" | 61 | """A byte literal.""" |
56 | 60 | return s.encode("latin-1") | 62 | return s.encode("latin-1") |
57 | 61 | advance_iterator = next | 63 | advance_iterator = next |
58 | 64 | # GZ 2011-08-24: Seems istext() is easy to misuse and makes for bad code. | ||
59 | 62 | def istext(x): | 65 | def istext(x): |
60 | 63 | return isinstance(x, str) | 66 | return isinstance(x, str) |
61 | 64 | def classtypes(): | 67 | def classtypes(): |
62 | 65 | return (type,) | 68 | return (type,) |
63 | 66 | str_is_unicode = True | 69 | str_is_unicode = True |
64 | 67 | else: | 70 | else: |
65 | 71 | import __builtin__ as builtins | ||
66 | 68 | def _u(s): | 72 | def _u(s): |
67 | 69 | # The double replace mangling going on prepares the string for | 73 | # The double replace mangling going on prepares the string for |
68 | 70 | # unicode-escape - \foo is preserved, \u and \U are decoded. | 74 | # unicode-escape - \foo is preserved, \u and \U are decoded. |
69 | @@ -112,6 +116,95 @@ | |||
70 | 112 | return isinstance(exception, (KeyboardInterrupt, SystemExit)) | 116 | return isinstance(exception, (KeyboardInterrupt, SystemExit)) |
71 | 113 | 117 | ||
72 | 114 | 118 | ||
73 | 119 | # GZ 2011-08-24: Using isinstance checks like this encourages bad interfaces, | ||
74 | 120 | # there should be better ways to write code needing this. | ||
75 | 121 | if not issubclass(getattr(builtins, "bytes", str), str): | ||
76 | 122 | def _isbytes(x): | ||
77 | 123 | return isinstance(x, bytes) | ||
78 | 124 | else: | ||
79 | 125 | # Never return True on Pythons that provide the name but not the real type | ||
80 | 126 | def _isbytes(x): | ||
81 | 127 | return False | ||
82 | 128 | |||
83 | 129 | |||
84 | 130 | def _slow_escape(text): | ||
85 | 131 | """Escape unicode `text` leaving printable characters unmodified | ||
86 | 132 | |||
87 | 133 | The behaviour emulates the Python 3 implementation of repr, see | ||
88 | 134 | unicode_repr in unicodeobject.c and isprintable definition. | ||
89 | 135 | |||
90 | 136 | Because this iterates over the input a codepoint at a time, it's slow, and | ||
91 | 137 | does not handle astral characters correctly on Python builds with 16 bit | ||
92 | 138 | rather than 32 bit unicode type. | ||
93 | 139 | """ | ||
94 | 140 | output = [] | ||
95 | 141 | for c in text: | ||
96 | 142 | o = ord(c) | ||
97 | 143 | if o < 256: | ||
98 | 144 | if o < 32 or 126 < o < 161: | ||
99 | 145 | output.append(c.encode("unicode-escape")) | ||
100 | 146 | elif o == 92: | ||
101 | 147 | # Separate due to bug in unicode-escape codec in Python 2.4 | ||
102 | 148 | output.append("\\\\") | ||
103 | 149 | else: | ||
104 | 150 | output.append(c) | ||
105 | 151 | else: | ||
106 | 152 | # To get correct behaviour would need to pair up surrogates here | ||
107 | 153 | if unicodedata.category(c)[0] in "CZ": | ||
108 | 154 | output.append(c.encode("unicode-escape")) | ||
109 | 155 | else: | ||
110 | 156 | output.append(c) | ||
111 | 157 | return "".join(output) | ||
112 | 158 | |||
113 | 159 | |||
114 | 160 | def text_repr(text, multiline=None): | ||
115 | 161 | """Rich repr for `text` returning unicode, triple quoted if `multiline`""" | ||
116 | 162 | is_py3k = sys.version_info > (3, 0) | ||
117 | 163 | nl = _isbytes(text) and bytes((0xA,)) or "\n" | ||
118 | 164 | if multiline is None: | ||
119 | 165 | multiline = nl in text | ||
120 | 166 | if not multiline and (is_py3k or not str_is_unicode and type(text) is str): | ||
121 | 167 | # Use normal repr for single line of unicode on Python 3 or bytes | ||
122 | 168 | return repr(text) | ||
123 | 169 | prefix = repr(text[:0])[:-2] | ||
124 | 170 | if multiline: | ||
125 | 171 | # To escape multiline strings, split and process each line in turn, | ||
126 | 172 | # making sure that quotes are not escaped. | ||
127 | 173 | if is_py3k: | ||
128 | 174 | offset = len(prefix) + 1 | ||
129 | 175 | lines = [] | ||
130 | 176 | for l in text.split(nl): | ||
131 | 177 | r = repr(l) | ||
132 | 178 | q = r[-1] | ||
133 | 179 | lines.append(r[offset:-1].replace("\\" + q, q)) | ||
134 | 180 | elif not str_is_unicode and isinstance(text, str): | ||
135 | 181 | lines = [l.encode("string-escape").replace("\\'", "'") | ||
136 | 182 | for l in text.split("\n")] | ||
137 | 183 | else: | ||
138 | 184 | lines = [_slow_escape(l) for l in text.split("\n")] | ||
139 | 185 | # Combine the escaped lines and append two of the closing quotes, | ||
140 | 186 | # then iterate over the result to escape triple quotes correctly. | ||
141 | 187 | _semi_done = "\n".join(lines) + "''" | ||
142 | 188 | p = 0 | ||
143 | 189 | while True: | ||
144 | 190 | p = _semi_done.find("'''", p) | ||
145 | 191 | if p == -1: | ||
146 | 192 | break | ||
147 | 193 | _semi_done = "\\".join([_semi_done[:p], _semi_done[p:]]) | ||
148 | 194 | p += 2 | ||
149 | 195 | return "".join([prefix, "'''\\\n", _semi_done, "'"]) | ||
150 | 196 | escaped_text = _slow_escape(text) | ||
151 | 197 | # Determine which quote character to use and if one gets prefixed with a | ||
152 | 198 | # backslash following the same logic Python uses for repr() on strings | ||
153 | 199 | quote = "'" | ||
154 | 200 | if "'" in text: | ||
155 | 201 | if '"' in text: | ||
156 | 202 | escaped_text = escaped_text.replace("'", "\\'") | ||
157 | 203 | else: | ||
158 | 204 | quote = '"' | ||
159 | 205 | return "".join([prefix, quote, escaped_text, quote]) | ||
160 | 206 | |||
161 | 207 | |||
162 | 115 | def unicode_output_stream(stream): | 208 | def unicode_output_stream(stream): |
163 | 116 | """Get wrapper for given stream that writes any unicode without exception | 209 | """Get wrapper for given stream that writes any unicode without exception |
164 | 117 | 210 | ||
165 | @@ -146,11 +239,6 @@ | |||
166 | 146 | return writer(stream, "replace") | 239 | return writer(stream, "replace") |
167 | 147 | 240 | ||
168 | 148 | 241 | ||
169 | 149 | try: | ||
170 | 150 | to_text = unicode | ||
171 | 151 | except NameError: | ||
172 | 152 | to_text = str | ||
173 | 153 | |||
174 | 154 | # The default source encoding is actually "iso-8859-1" until Python 2.5 but | 242 | # The default source encoding is actually "iso-8859-1" until Python 2.5 but |
175 | 155 | # using non-ascii causes a deprecation warning in 2.4 and it's cleaner to | 243 | # using non-ascii causes a deprecation warning in 2.4 and it's cleaner to |
176 | 156 | # treat all versions the same way | 244 | # treat all versions the same way |
177 | 157 | 245 | ||
178 | === modified file 'testtools/matchers.py' | |||
179 | --- testtools/matchers.py 2011-09-13 23:54:14 +0000 | |||
180 | +++ testtools/matchers.py 2011-09-13 23:54:14 +0000 | |||
181 | @@ -49,7 +49,10 @@ | |||
182 | 49 | classtypes, | 49 | classtypes, |
183 | 50 | _error_repr, | 50 | _error_repr, |
184 | 51 | isbaseexception, | 51 | isbaseexception, |
185 | 52 | _isbytes, | ||
186 | 52 | istext, | 53 | istext, |
187 | 54 | str_is_unicode, | ||
188 | 55 | text_repr | ||
189 | 53 | ) | 56 | ) |
190 | 54 | 57 | ||
191 | 55 | 58 | ||
192 | @@ -102,6 +105,8 @@ | |||
193 | 102 | """Describe the mismatch. | 105 | """Describe the mismatch. |
194 | 103 | 106 | ||
195 | 104 | This should be either a human-readable string or castable to a string. | 107 | This should be either a human-readable string or castable to a string. |
196 | 108 | In particular, is should either be plain ascii or unicode on Python 2, | ||
197 | 109 | and care should be taken to escape control characters. | ||
198 | 105 | """ | 110 | """ |
199 | 106 | try: | 111 | try: |
200 | 107 | return self._description | 112 | return self._description |
201 | @@ -151,12 +156,25 @@ | |||
202 | 151 | def __str__(self): | 156 | def __str__(self): |
203 | 152 | difference = self.mismatch.describe() | 157 | difference = self.mismatch.describe() |
204 | 153 | if self.verbose: | 158 | if self.verbose: |
205 | 159 | # GZ 2011-08-24: Smelly API? Better to take any object and special | ||
206 | 160 | # case text inside? | ||
207 | 161 | if istext(self.matchee) or _isbytes(self.matchee): | ||
208 | 162 | matchee = text_repr(self.matchee, multiline=False) | ||
209 | 163 | else: | ||
210 | 164 | matchee = repr(self.matchee) | ||
211 | 154 | return ( | 165 | return ( |
214 | 155 | 'Match failed. Matchee: "%s"\nMatcher: %s\nDifference: %s\n' | 166 | 'Match failed. Matchee: %s\nMatcher: %s\nDifference: %s\n' |
215 | 156 | % (self.matchee, self.matcher, difference)) | 167 | % (matchee, self.matcher, difference)) |
216 | 157 | else: | 168 | else: |
217 | 158 | return difference | 169 | return difference |
218 | 159 | 170 | ||
219 | 171 | if not str_is_unicode: | ||
220 | 172 | |||
221 | 173 | __unicode__ = __str__ | ||
222 | 174 | |||
223 | 175 | def __str__(self): | ||
224 | 176 | return self.__unicode__().encode("ascii", "backslashreplace") | ||
225 | 177 | |||
226 | 160 | 178 | ||
227 | 161 | class MismatchDecorator(object): | 179 | class MismatchDecorator(object): |
228 | 162 | """Decorate a ``Mismatch``. | 180 | """Decorate a ``Mismatch``. |
229 | @@ -268,7 +286,12 @@ | |||
230 | 268 | self.with_nl = with_nl | 286 | self.with_nl = with_nl |
231 | 269 | 287 | ||
232 | 270 | def describe(self): | 288 | def describe(self): |
234 | 271 | return self.matcher._describe_difference(self.with_nl) | 289 | s = self.matcher._describe_difference(self.with_nl) |
235 | 290 | if str_is_unicode or isinstance(s, unicode): | ||
236 | 291 | return s | ||
237 | 292 | # GZ 2011-08-24: This is actually pretty bogus, most C0 codes should | ||
238 | 293 | # be escaped, in addition to non-ascii bytes. | ||
239 | 294 | return s.decode("latin1").encode("ascii", "backslashreplace") | ||
240 | 272 | 295 | ||
241 | 273 | 296 | ||
242 | 274 | class DoesNotContain(Mismatch): | 297 | class DoesNotContain(Mismatch): |
243 | @@ -298,8 +321,8 @@ | |||
244 | 298 | self.expected = expected | 321 | self.expected = expected |
245 | 299 | 322 | ||
246 | 300 | def describe(self): | 323 | def describe(self): |
249 | 301 | return "'%s' does not start with '%s'." % ( | 324 | return "%s does not start with %s." % ( |
250 | 302 | self.matchee, self.expected) | 325 | text_repr(self.matchee), text_repr(self.expected)) |
251 | 303 | 326 | ||
252 | 304 | 327 | ||
253 | 305 | class DoesNotEndWith(Mismatch): | 328 | class DoesNotEndWith(Mismatch): |
254 | @@ -314,8 +337,8 @@ | |||
255 | 314 | self.expected = expected | 337 | self.expected = expected |
256 | 315 | 338 | ||
257 | 316 | def describe(self): | 339 | def describe(self): |
260 | 317 | return "'%s' does not end with '%s'." % ( | 340 | return "%s does not end with %s." % ( |
261 | 318 | self.matchee, self.expected) | 341 | text_repr(self.matchee), text_repr(self.expected)) |
262 | 319 | 342 | ||
263 | 320 | 343 | ||
264 | 321 | class _BinaryComparison(object): | 344 | class _BinaryComparison(object): |
265 | @@ -347,8 +370,8 @@ | |||
266 | 347 | def _format(self, thing): | 370 | def _format(self, thing): |
267 | 348 | # Blocks of text with newlines are formatted as triple-quote | 371 | # Blocks of text with newlines are formatted as triple-quote |
268 | 349 | # strings. Everything else is pretty-printed. | 372 | # strings. Everything else is pretty-printed. |
271 | 350 | if istext(thing) and '\n' in thing: | 373 | if istext(thing) or _isbytes(thing): |
272 | 351 | return '"""\\\n%s"""' % (thing,) | 374 | return text_repr(thing) |
273 | 352 | return pformat(thing) | 375 | return pformat(thing) |
274 | 353 | 376 | ||
275 | 354 | def describe(self): | 377 | def describe(self): |
276 | @@ -359,7 +382,7 @@ | |||
277 | 359 | self._mismatch_string, self._format(self.expected), | 382 | self._mismatch_string, self._format(self.expected), |
278 | 360 | self._format(self.other)) | 383 | self._format(self.other)) |
279 | 361 | else: | 384 | else: |
281 | 362 | return "%s %s %s" % (left, self._mismatch_string,right) | 385 | return "%s %s %s" % (left, self._mismatch_string, right) |
282 | 363 | 386 | ||
283 | 364 | 387 | ||
284 | 365 | class Equals(_BinaryComparison): | 388 | class Equals(_BinaryComparison): |
285 | @@ -599,7 +622,7 @@ | |||
286 | 599 | self.expected = expected | 622 | self.expected = expected |
287 | 600 | 623 | ||
288 | 601 | def __str__(self): | 624 | def __str__(self): |
290 | 602 | return "Starts with '%s'." % self.expected | 625 | return "StartsWith(%r)" % (self.expected,) |
291 | 603 | 626 | ||
292 | 604 | def match(self, matchee): | 627 | def match(self, matchee): |
293 | 605 | if not matchee.startswith(self.expected): | 628 | if not matchee.startswith(self.expected): |
294 | @@ -618,7 +641,7 @@ | |||
295 | 618 | self.expected = expected | 641 | self.expected = expected |
296 | 619 | 642 | ||
297 | 620 | def __str__(self): | 643 | def __str__(self): |
299 | 621 | return "Ends with '%s'." % self.expected | 644 | return "EndsWith(%r)" % (self.expected,) |
300 | 622 | 645 | ||
301 | 623 | def match(self, matchee): | 646 | def match(self, matchee): |
302 | 624 | if not matchee.endswith(self.expected): | 647 | if not matchee.endswith(self.expected): |
303 | @@ -875,8 +898,12 @@ | |||
304 | 875 | 898 | ||
305 | 876 | def match(self, value): | 899 | def match(self, value): |
306 | 877 | if not re.match(self.pattern, value, self.flags): | 900 | if not re.match(self.pattern, value, self.flags): |
307 | 901 | pattern = self.pattern | ||
308 | 902 | if not isinstance(pattern, str_is_unicode and str or unicode): | ||
309 | 903 | pattern = pattern.decode("latin1") | ||
310 | 904 | pattern = pattern.encode("unicode_escape").decode("ascii") | ||
311 | 878 | return Mismatch("%r does not match /%s/" % ( | 905 | return Mismatch("%r does not match /%s/" % ( |
313 | 879 | value, self.pattern)) | 906 | value, pattern.replace("\\\\", "\\"))) |
314 | 880 | 907 | ||
315 | 881 | 908 | ||
316 | 882 | class MatchesSetwise(object): | 909 | class MatchesSetwise(object): |
317 | 883 | 910 | ||
318 | === modified file 'testtools/tests/test_compat.py' | |||
319 | --- testtools/tests/test_compat.py 2011-07-04 18:03:28 +0000 | |||
320 | +++ testtools/tests/test_compat.py 2011-09-13 23:54:14 +0000 | |||
321 | @@ -16,6 +16,7 @@ | |||
322 | 16 | _get_source_encoding, | 16 | _get_source_encoding, |
323 | 17 | _u, | 17 | _u, |
324 | 18 | str_is_unicode, | 18 | str_is_unicode, |
325 | 19 | text_repr, | ||
326 | 19 | unicode_output_stream, | 20 | unicode_output_stream, |
327 | 20 | ) | 21 | ) |
328 | 21 | from testtools.matchers import ( | 22 | from testtools.matchers import ( |
329 | @@ -262,6 +263,132 @@ | |||
330 | 262 | self.assertEqual("pa???n", sout.getvalue()) | 263 | self.assertEqual("pa???n", sout.getvalue()) |
331 | 263 | 264 | ||
332 | 264 | 265 | ||
333 | 266 | class TestTextRepr(testtools.TestCase): | ||
334 | 267 | """Ensure in extending repr, basic behaviours are not being broken""" | ||
335 | 268 | |||
336 | 269 | ascii_examples = ( | ||
337 | 270 | # Single character examples | ||
338 | 271 | # C0 control codes should be escaped except multiline \n | ||
339 | 272 | ("\x00", "'\\x00'", "'''\\\n\\x00'''"), | ||
340 | 273 | ("\b", "'\\x08'", "'''\\\n\\x08'''"), | ||
341 | 274 | ("\t", "'\\t'", "'''\\\n\\t'''"), | ||
342 | 275 | ("\n", "'\\n'", "'''\\\n\n'''"), | ||
343 | 276 | ("\r", "'\\r'", "'''\\\n\\r'''"), | ||
344 | 277 | # Quotes and backslash should match normal repr behaviour | ||
345 | 278 | ('"', "'\"'", "'''\\\n\"'''"), | ||
346 | 279 | ("'", "\"'\"", "'''\\\n\\''''"), | ||
347 | 280 | ("\\", "'\\\\'", "'''\\\n\\\\'''"), | ||
348 | 281 | # DEL is also unprintable and should be escaped | ||
349 | 282 | ("\x7F", "'\\x7f'", "'''\\\n\\x7f'''"), | ||
350 | 283 | |||
351 | 284 | # Character combinations that need double checking | ||
352 | 285 | ("\r\n", "'\\r\\n'", "'''\\\n\\r\n'''"), | ||
353 | 286 | ("\"'", "'\"\\''", "'''\\\n\"\\''''"), | ||
354 | 287 | ("'\"", "'\\'\"'", "'''\\\n'\"'''"), | ||
355 | 288 | ("\\n", "'\\\\n'", "'''\\\n\\\\n'''"), | ||
356 | 289 | ("\\\n", "'\\\\\\n'", "'''\\\n\\\\\n'''"), | ||
357 | 290 | ("\\' ", "\"\\\\' \"", "'''\\\n\\\\' '''"), | ||
358 | 291 | ("\\'\n", "\"\\\\'\\n\"", "'''\\\n\\\\'\n'''"), | ||
359 | 292 | ("\\'\"", "'\\\\\\'\"'", "'''\\\n\\\\'\"'''"), | ||
360 | 293 | ("\\'''", "\"\\\\'''\"", "'''\\\n\\\\\\'\\'\\''''"), | ||
361 | 294 | ) | ||
362 | 295 | |||
363 | 296 | # Bytes with the high bit set should always be escaped | ||
364 | 297 | bytes_examples = ( | ||
365 | 298 | (_b("\x80"), "'\\x80'", "'''\\\n\\x80'''"), | ||
366 | 299 | (_b("\xA0"), "'\\xa0'", "'''\\\n\\xa0'''"), | ||
367 | 300 | (_b("\xC0"), "'\\xc0'", "'''\\\n\\xc0'''"), | ||
368 | 301 | (_b("\xFF"), "'\\xff'", "'''\\\n\\xff'''"), | ||
369 | 302 | (_b("\xC2\xA7"), "'\\xc2\\xa7'", "'''\\\n\\xc2\\xa7'''"), | ||
370 | 303 | ) | ||
371 | 304 | |||
372 | 305 | # Unicode doesn't escape printable characters as per the Python 3 model | ||
373 | 306 | unicode_examples = ( | ||
374 | 307 | # C1 codes are unprintable | ||
375 | 308 | (_u("\x80"), "'\\x80'", "'''\\\n\\x80'''"), | ||
376 | 309 | (_u("\x9F"), "'\\x9f'", "'''\\\n\\x9f'''"), | ||
377 | 310 | # No-break space is unprintable | ||
378 | 311 | (_u("\xA0"), "'\\xa0'", "'''\\\n\\xa0'''"), | ||
379 | 312 | # Letters latin alphabets are printable | ||
380 | 313 | (_u("\xA1"), _u("'\xa1'"), _u("'''\\\n\xa1'''")), | ||
381 | 314 | (_u("\xFF"), _u("'\xff'"), _u("'''\\\n\xff'''")), | ||
382 | 315 | (_u("\u0100"), _u("'\u0100'"), _u("'''\\\n\u0100'''")), | ||
383 | 316 | # Line and paragraph seperators are unprintable | ||
384 | 317 | (_u("\u2028"), "'\\u2028'", "'''\\\n\\u2028'''"), | ||
385 | 318 | (_u("\u2029"), "'\\u2029'", "'''\\\n\\u2029'''"), | ||
386 | 319 | # Unpaired surrogates are unprintable | ||
387 | 320 | (_u("\uD800"), "'\\ud800'", "'''\\\n\\ud800'''"), | ||
388 | 321 | (_u("\uDFFF"), "'\\udfff'", "'''\\\n\\udfff'''"), | ||
389 | 322 | # Unprintable general categories not fully tested: Cc, Cf, Co, Cn, Zs | ||
390 | 323 | ) | ||
391 | 324 | |||
392 | 325 | b_prefix = repr(_b(""))[:-2] | ||
393 | 326 | u_prefix = repr(_u(""))[:-2] | ||
394 | 327 | |||
395 | 328 | def test_ascii_examples_oneline_bytes(self): | ||
396 | 329 | for s, expected, _ in self.ascii_examples: | ||
397 | 330 | b = _b(s) | ||
398 | 331 | actual = text_repr(b, multiline=False) | ||
399 | 332 | # Add self.assertIsInstance check? | ||
400 | 333 | self.assertEqual(actual, self.b_prefix + expected) | ||
401 | 334 | self.assertEqual(eval(actual), b) | ||
402 | 335 | |||
403 | 336 | def test_ascii_examples_oneline_unicode(self): | ||
404 | 337 | for s, expected, _ in self.ascii_examples: | ||
405 | 338 | u = _u(s) | ||
406 | 339 | actual = text_repr(u, multiline=False) | ||
407 | 340 | self.assertEqual(actual, self.u_prefix + expected) | ||
408 | 341 | self.assertEqual(eval(actual), u) | ||
409 | 342 | |||
410 | 343 | def test_ascii_examples_multiline_bytes(self): | ||
411 | 344 | for s, _, expected in self.ascii_examples: | ||
412 | 345 | b = _b(s) | ||
413 | 346 | actual = text_repr(b, multiline=True) | ||
414 | 347 | self.assertEqual(actual, self.b_prefix + expected) | ||
415 | 348 | self.assertEqual(eval(actual), b) | ||
416 | 349 | |||
417 | 350 | def test_ascii_examples_multiline_unicode(self): | ||
418 | 351 | for s, _, expected in self.ascii_examples: | ||
419 | 352 | u = _u(s) | ||
420 | 353 | actual = text_repr(u, multiline=True) | ||
421 | 354 | self.assertEqual(actual, self.u_prefix + expected) | ||
422 | 355 | self.assertEqual(eval(actual), u) | ||
423 | 356 | |||
424 | 357 | def test_ascii_examples_defaultline_bytes(self): | ||
425 | 358 | for s, one, multi in self.ascii_examples: | ||
426 | 359 | expected = "\n" in s and multi or one | ||
427 | 360 | self.assertEqual(text_repr(_b(s)), self.b_prefix + expected) | ||
428 | 361 | |||
429 | 362 | def test_ascii_examples_defaultline_unicode(self): | ||
430 | 363 | for s, one, multi in self.ascii_examples: | ||
431 | 364 | expected = "\n" in s and multi or one | ||
432 | 365 | self.assertEqual(text_repr(_u(s)), self.u_prefix + expected) | ||
433 | 366 | |||
434 | 367 | def test_bytes_examples_oneline(self): | ||
435 | 368 | for b, expected, _ in self.bytes_examples: | ||
436 | 369 | actual = text_repr(b, multiline=False) | ||
437 | 370 | self.assertEqual(actual, self.b_prefix + expected) | ||
438 | 371 | self.assertEqual(eval(actual), b) | ||
439 | 372 | |||
440 | 373 | def test_bytes_examples_multiline(self): | ||
441 | 374 | for b, _, expected in self.bytes_examples: | ||
442 | 375 | actual = text_repr(b, multiline=True) | ||
443 | 376 | self.assertEqual(actual, self.b_prefix + expected) | ||
444 | 377 | self.assertEqual(eval(actual), b) | ||
445 | 378 | |||
446 | 379 | def test_unicode_examples_oneline(self): | ||
447 | 380 | for u, expected, _ in self.unicode_examples: | ||
448 | 381 | actual = text_repr(u, multiline=False) | ||
449 | 382 | self.assertEqual(actual, self.u_prefix + expected) | ||
450 | 383 | self.assertEqual(eval(actual), u) | ||
451 | 384 | |||
452 | 385 | def test_unicode_examples_multiline(self): | ||
453 | 386 | for u, _, expected in self.unicode_examples: | ||
454 | 387 | actual = text_repr(u, multiline=True) | ||
455 | 388 | self.assertEqual(actual, self.u_prefix + expected) | ||
456 | 389 | self.assertEqual(eval(actual), u) | ||
457 | 390 | |||
458 | 391 | |||
459 | 265 | def test_suite(): | 392 | def test_suite(): |
460 | 266 | from unittest import TestLoader | 393 | from unittest import TestLoader |
461 | 267 | return TestLoader().loadTestsFromName(__name__) | 394 | return TestLoader().loadTestsFromName(__name__) |
462 | 268 | 395 | ||
463 | === modified file 'testtools/tests/test_matchers.py' | |||
464 | --- testtools/tests/test_matchers.py 2011-09-13 23:54:14 +0000 | |||
465 | +++ testtools/tests/test_matchers.py 2011-09-13 23:54:14 +0000 | |||
466 | @@ -12,7 +12,9 @@ | |||
467 | 12 | ) | 12 | ) |
468 | 13 | from testtools.compat import ( | 13 | from testtools.compat import ( |
469 | 14 | StringIO, | 14 | StringIO, |
471 | 15 | to_text, | 15 | str_is_unicode, |
472 | 16 | text_repr, | ||
473 | 17 | _b, | ||
474 | 16 | _u, | 18 | _u, |
475 | 17 | ) | 19 | ) |
476 | 18 | from testtools.matchers import ( | 20 | from testtools.matchers import ( |
477 | @@ -20,7 +22,11 @@ | |||
478 | 20 | AllMatch, | 22 | AllMatch, |
479 | 21 | Annotate, | 23 | Annotate, |
480 | 22 | AnnotatedMismatch, | 24 | AnnotatedMismatch, |
481 | 25 | <<<<<<< TREE | ||
482 | 23 | Contains, | 26 | Contains, |
483 | 27 | ======= | ||
484 | 28 | _BinaryMismatch, | ||
485 | 29 | >>>>>>> MERGE-SOURCE | ||
486 | 24 | Equals, | 30 | Equals, |
487 | 25 | DocTestMatches, | 31 | DocTestMatches, |
488 | 26 | DoesNotEndWith, | 32 | DoesNotEndWith, |
489 | @@ -96,7 +102,7 @@ | |||
490 | 96 | mismatch = matcher.match(2) | 102 | mismatch = matcher.match(2) |
491 | 97 | e = MismatchError(matchee, matcher, mismatch, True) | 103 | e = MismatchError(matchee, matcher, mismatch, True) |
492 | 98 | expected = ( | 104 | expected = ( |
494 | 99 | 'Match failed. Matchee: "%s"\n' | 105 | 'Match failed. Matchee: %r\n' |
495 | 100 | 'Matcher: %s\n' | 106 | 'Matcher: %s\n' |
496 | 101 | 'Difference: %s\n' % ( | 107 | 'Difference: %s\n' % ( |
497 | 102 | matchee, | 108 | matchee, |
498 | @@ -112,17 +118,80 @@ | |||
499 | 112 | matcher = Equals(_u('a')) | 118 | matcher = Equals(_u('a')) |
500 | 113 | mismatch = matcher.match(matchee) | 119 | mismatch = matcher.match(matchee) |
501 | 114 | expected = ( | 120 | expected = ( |
503 | 115 | 'Match failed. Matchee: "%s"\n' | 121 | 'Match failed. Matchee: %s\n' |
504 | 116 | 'Matcher: %s\n' | 122 | 'Matcher: %s\n' |
505 | 117 | 'Difference: %s\n' % ( | 123 | 'Difference: %s\n' % ( |
507 | 118 | matchee, | 124 | text_repr(matchee), |
508 | 119 | matcher, | 125 | matcher, |
509 | 120 | mismatch.describe(), | 126 | mismatch.describe(), |
510 | 121 | )) | 127 | )) |
511 | 122 | e = MismatchError(matchee, matcher, mismatch, True) | 128 | e = MismatchError(matchee, matcher, mismatch, True) |
515 | 123 | # XXX: Using to_text rather than str because, on Python 2, str will | 129 | if str_is_unicode: |
516 | 124 | # raise UnicodeEncodeError. | 130 | actual = str(e) |
517 | 125 | self.assertEqual(expected, to_text(e)) | 131 | else: |
518 | 132 | actual = unicode(e) | ||
519 | 133 | # Using str() should still work, and return ascii only | ||
520 | 134 | self.assertEqual( | ||
521 | 135 | expected.replace(matchee, matchee.encode("unicode-escape")), | ||
522 | 136 | str(e).decode("ascii")) | ||
523 | 137 | self.assertEqual(expected, actual) | ||
524 | 138 | |||
525 | 139 | |||
526 | 140 | class Test_BinaryMismatch(TestCase): | ||
527 | 141 | """Mismatches from binary comparisons need useful describe output""" | ||
528 | 142 | |||
529 | 143 | _long_string = "This is a longish multiline non-ascii string\n\xa7" | ||
530 | 144 | _long_b = _b(_long_string) | ||
531 | 145 | _long_u = _u(_long_string) | ||
532 | 146 | |||
533 | 147 | def test_short_objects(self): | ||
534 | 148 | o1, o2 = object(), object() | ||
535 | 149 | mismatch = _BinaryMismatch(o1, "!~", o2) | ||
536 | 150 | self.assertEqual(mismatch.describe(), "%r !~ %r" % (o1, o2)) | ||
537 | 151 | |||
538 | 152 | def test_short_mixed_strings(self): | ||
539 | 153 | b, u = _b("\xa7"), _u("\xa7") | ||
540 | 154 | mismatch = _BinaryMismatch(b, "!~", u) | ||
541 | 155 | self.assertEqual(mismatch.describe(), "%r !~ %r" % (b, u)) | ||
542 | 156 | |||
543 | 157 | def test_long_bytes(self): | ||
544 | 158 | one_line_b = self._long_b.replace(_b("\n"), _b(" ")) | ||
545 | 159 | mismatch = _BinaryMismatch(one_line_b, "!~", self._long_b) | ||
546 | 160 | self.assertEqual(mismatch.describe(), | ||
547 | 161 | "%s:\nreference = %s\nactual = %s\n" % ("!~", | ||
548 | 162 | text_repr(one_line_b), | ||
549 | 163 | text_repr(self._long_b, multiline=True))) | ||
550 | 164 | |||
551 | 165 | def test_long_unicode(self): | ||
552 | 166 | one_line_u = self._long_u.replace("\n", " ") | ||
553 | 167 | mismatch = _BinaryMismatch(one_line_u, "!~", self._long_u) | ||
554 | 168 | self.assertEqual(mismatch.describe(), | ||
555 | 169 | "%s:\nreference = %s\nactual = %s\n" % ("!~", | ||
556 | 170 | text_repr(one_line_u), | ||
557 | 171 | text_repr(self._long_u, multiline=True))) | ||
558 | 172 | |||
559 | 173 | def test_long_mixed_strings(self): | ||
560 | 174 | mismatch = _BinaryMismatch(self._long_b, "!~", self._long_u) | ||
561 | 175 | self.assertEqual(mismatch.describe(), | ||
562 | 176 | "%s:\nreference = %s\nactual = %s\n" % ("!~", | ||
563 | 177 | text_repr(self._long_b, multiline=True), | ||
564 | 178 | text_repr(self._long_u, multiline=True))) | ||
565 | 179 | |||
566 | 180 | def test_long_bytes_and_object(self): | ||
567 | 181 | obj = object() | ||
568 | 182 | mismatch = _BinaryMismatch(self._long_b, "!~", obj) | ||
569 | 183 | self.assertEqual(mismatch.describe(), | ||
570 | 184 | "%s:\nreference = %s\nactual = %s\n" % ("!~", | ||
571 | 185 | text_repr(self._long_b, multiline=True), | ||
572 | 186 | repr(obj))) | ||
573 | 187 | |||
574 | 188 | def test_long_unicode_and_object(self): | ||
575 | 189 | obj = object() | ||
576 | 190 | mismatch = _BinaryMismatch(self._long_u, "!~", obj) | ||
577 | 191 | self.assertEqual(mismatch.describe(), | ||
578 | 192 | "%s:\nreference = %s\nactual = %s\n" % ("!~", | ||
579 | 193 | text_repr(self._long_u, multiline=True), | ||
580 | 194 | repr(obj))) | ||
581 | 126 | 195 | ||
582 | 127 | 196 | ||
583 | 128 | class TestMatchersInterface(object): | 197 | class TestMatchersInterface(object): |
584 | @@ -208,6 +277,23 @@ | |||
585 | 208 | self.assertEqual("bar\n", matcher.want) | 277 | self.assertEqual("bar\n", matcher.want) |
586 | 209 | self.assertEqual(doctest.ELLIPSIS, matcher.flags) | 278 | self.assertEqual(doctest.ELLIPSIS, matcher.flags) |
587 | 210 | 279 | ||
588 | 280 | def test_describe_non_ascii_bytes(self): | ||
589 | 281 | """Even with bytestrings, the mismatch should be coercible to unicode | ||
590 | 282 | |||
591 | 283 | DocTestMatches is intended for text, but the Python 2 str type also | ||
592 | 284 | permits arbitrary binary inputs. This is a slightly bogus thing to do, | ||
593 | 285 | and under Python 3 using bytes objects will reasonably raise an error. | ||
594 | 286 | """ | ||
595 | 287 | header = _b("\x89PNG\r\n\x1a\n...") | ||
596 | 288 | if str_is_unicode: | ||
597 | 289 | self.assertRaises(TypeError, | ||
598 | 290 | DocTestMatches, header, doctest.ELLIPSIS) | ||
599 | 291 | return | ||
600 | 292 | matcher = DocTestMatches(header, doctest.ELLIPSIS) | ||
601 | 293 | mismatch = matcher.match(_b("GIF89a\1\0\1\0\0\0\0;")) | ||
602 | 294 | # Must be treatable as unicode text, the exact output matters less | ||
603 | 295 | self.assertTrue(unicode(mismatch.describe())) | ||
604 | 296 | |||
605 | 211 | 297 | ||
606 | 212 | class TestEqualsInterface(TestCase, TestMatchersInterface): | 298 | class TestEqualsInterface(TestCase, TestMatchersInterface): |
607 | 213 | 299 | ||
608 | @@ -610,6 +696,21 @@ | |||
609 | 610 | mismatch = DoesNotStartWith("fo", "bo") | 696 | mismatch = DoesNotStartWith("fo", "bo") |
610 | 611 | self.assertEqual("'fo' does not start with 'bo'.", mismatch.describe()) | 697 | self.assertEqual("'fo' does not start with 'bo'.", mismatch.describe()) |
611 | 612 | 698 | ||
612 | 699 | def test_describe_non_ascii_unicode(self): | ||
613 | 700 | string = _u("A\xA7") | ||
614 | 701 | suffix = _u("B\xA7") | ||
615 | 702 | mismatch = DoesNotStartWith(string, suffix) | ||
616 | 703 | self.assertEqual("%s does not start with %s." % ( | ||
617 | 704 | text_repr(string), text_repr(suffix)), | ||
618 | 705 | mismatch.describe()) | ||
619 | 706 | |||
620 | 707 | def test_describe_non_ascii_bytes(self): | ||
621 | 708 | string = _b("A\xA7") | ||
622 | 709 | suffix = _b("B\xA7") | ||
623 | 710 | mismatch = DoesNotStartWith(string, suffix) | ||
624 | 711 | self.assertEqual("%r does not start with %r." % (string, suffix), | ||
625 | 712 | mismatch.describe()) | ||
626 | 713 | |||
627 | 613 | 714 | ||
628 | 614 | class StartsWithTests(TestCase): | 715 | class StartsWithTests(TestCase): |
629 | 615 | 716 | ||
630 | @@ -617,7 +718,17 @@ | |||
631 | 617 | 718 | ||
632 | 618 | def test_str(self): | 719 | def test_str(self): |
633 | 619 | matcher = StartsWith("bar") | 720 | matcher = StartsWith("bar") |
635 | 620 | self.assertEqual("Starts with 'bar'.", str(matcher)) | 721 | self.assertEqual("StartsWith('bar')", str(matcher)) |
636 | 722 | |||
637 | 723 | def test_str_with_bytes(self): | ||
638 | 724 | b = _b("\xA7") | ||
639 | 725 | matcher = StartsWith(b) | ||
640 | 726 | self.assertEqual("StartsWith(%r)" % (b,), str(matcher)) | ||
641 | 727 | |||
642 | 728 | def test_str_with_unicode(self): | ||
643 | 729 | u = _u("\xA7") | ||
644 | 730 | matcher = StartsWith(u) | ||
645 | 731 | self.assertEqual("StartsWith(%r)" % (u,), str(matcher)) | ||
646 | 621 | 732 | ||
647 | 622 | def test_match(self): | 733 | def test_match(self): |
648 | 623 | matcher = StartsWith("bar") | 734 | matcher = StartsWith("bar") |
649 | @@ -646,6 +757,21 @@ | |||
650 | 646 | mismatch = DoesNotEndWith("fo", "bo") | 757 | mismatch = DoesNotEndWith("fo", "bo") |
651 | 647 | self.assertEqual("'fo' does not end with 'bo'.", mismatch.describe()) | 758 | self.assertEqual("'fo' does not end with 'bo'.", mismatch.describe()) |
652 | 648 | 759 | ||
653 | 760 | def test_describe_non_ascii_unicode(self): | ||
654 | 761 | string = _u("A\xA7") | ||
655 | 762 | suffix = _u("B\xA7") | ||
656 | 763 | mismatch = DoesNotEndWith(string, suffix) | ||
657 | 764 | self.assertEqual("%s does not end with %s." % ( | ||
658 | 765 | text_repr(string), text_repr(suffix)), | ||
659 | 766 | mismatch.describe()) | ||
660 | 767 | |||
661 | 768 | def test_describe_non_ascii_bytes(self): | ||
662 | 769 | string = _b("A\xA7") | ||
663 | 770 | suffix = _b("B\xA7") | ||
664 | 771 | mismatch = DoesNotEndWith(string, suffix) | ||
665 | 772 | self.assertEqual("%r does not end with %r." % (string, suffix), | ||
666 | 773 | mismatch.describe()) | ||
667 | 774 | |||
668 | 649 | 775 | ||
669 | 650 | class EndsWithTests(TestCase): | 776 | class EndsWithTests(TestCase): |
670 | 651 | 777 | ||
671 | @@ -653,7 +779,17 @@ | |||
672 | 653 | 779 | ||
673 | 654 | def test_str(self): | 780 | def test_str(self): |
674 | 655 | matcher = EndsWith("bar") | 781 | matcher = EndsWith("bar") |
676 | 656 | self.assertEqual("Ends with 'bar'.", str(matcher)) | 782 | self.assertEqual("EndsWith('bar')", str(matcher)) |
677 | 783 | |||
678 | 784 | def test_str_with_bytes(self): | ||
679 | 785 | b = _b("\xA7") | ||
680 | 786 | matcher = EndsWith(b) | ||
681 | 787 | self.assertEqual("EndsWith(%r)" % (b,), str(matcher)) | ||
682 | 788 | |||
683 | 789 | def test_str_with_unicode(self): | ||
684 | 790 | u = _u("\xA7") | ||
685 | 791 | matcher = EndsWith(u) | ||
686 | 792 | self.assertEqual("EndsWith(%r)" % (u,), str(matcher)) | ||
687 | 657 | 793 | ||
688 | 658 | def test_match(self): | 794 | def test_match(self): |
689 | 659 | matcher = EndsWith("arf") | 795 | matcher = EndsWith("arf") |
690 | @@ -770,11 +906,17 @@ | |||
691 | 770 | ("MatchesRegex('a|b')", MatchesRegex('a|b')), | 906 | ("MatchesRegex('a|b')", MatchesRegex('a|b')), |
692 | 771 | ("MatchesRegex('a|b', re.M)", MatchesRegex('a|b', re.M)), | 907 | ("MatchesRegex('a|b', re.M)", MatchesRegex('a|b', re.M)), |
693 | 772 | ("MatchesRegex('a|b', re.I|re.M)", MatchesRegex('a|b', re.I|re.M)), | 908 | ("MatchesRegex('a|b', re.I|re.M)", MatchesRegex('a|b', re.I|re.M)), |
694 | 909 | ("MatchesRegex(%r)" % (_b("\xA7"),), MatchesRegex(_b("\xA7"))), | ||
695 | 910 | ("MatchesRegex(%r)" % (_u("\xA7"),), MatchesRegex(_u("\xA7"))), | ||
696 | 773 | ] | 911 | ] |
697 | 774 | 912 | ||
698 | 775 | describe_examples = [ | 913 | describe_examples = [ |
699 | 776 | ("'c' does not match /a|b/", 'c', MatchesRegex('a|b')), | 914 | ("'c' does not match /a|b/", 'c', MatchesRegex('a|b')), |
700 | 777 | ("'c' does not match /a\d/", 'c', MatchesRegex(r'a\d')), | 915 | ("'c' does not match /a\d/", 'c', MatchesRegex(r'a\d')), |
701 | 916 | ("%r does not match /\\s+\\xa7/" % (_b('c'),), | ||
702 | 917 | _b('c'), MatchesRegex(_b("\\s+\xA7"))), | ||
703 | 918 | ("%r does not match /\\s+\\xa7/" % (_u('c'),), | ||
704 | 919 | _u('c'), MatchesRegex(_u("\\s+\xA7"))), | ||
705 | 778 | ] | 920 | ] |
706 | 779 | 921 | ||
707 | 780 | 922 | ||
708 | 781 | 923 | ||
709 | === modified file 'testtools/tests/test_testcase.py' | |||
710 | --- testtools/tests/test_testcase.py 2011-09-13 23:54:14 +0000 | |||
711 | +++ testtools/tests/test_testcase.py 2011-09-13 23:54:14 +0000 | |||
712 | @@ -488,7 +488,7 @@ | |||
713 | 488 | matchee = 'foo' | 488 | matchee = 'foo' |
714 | 489 | matcher = Equals('bar') | 489 | matcher = Equals('bar') |
715 | 490 | expected = ( | 490 | expected = ( |
717 | 491 | 'Match failed. Matchee: "%s"\n' | 491 | 'Match failed. Matchee: %r\n' |
718 | 492 | 'Matcher: %s\n' | 492 | 'Matcher: %s\n' |
719 | 493 | 'Difference: %s\n' % ( | 493 | 'Difference: %s\n' % ( |
720 | 494 | matchee, | 494 | matchee, |
721 | @@ -528,10 +528,10 @@ | |||
722 | 528 | matchee = _u('\xa7') | 528 | matchee = _u('\xa7') |
723 | 529 | matcher = Equals(_u('a')) | 529 | matcher = Equals(_u('a')) |
724 | 530 | expected = ( | 530 | expected = ( |
726 | 531 | 'Match failed. Matchee: "%s"\n' | 531 | 'Match failed. Matchee: %s\n' |
727 | 532 | 'Matcher: %s\n' | 532 | 'Matcher: %s\n' |
728 | 533 | 'Difference: %s\n\n' % ( | 533 | 'Difference: %s\n\n' % ( |
730 | 534 | matchee, | 534 | repr(matchee).replace("\\xa7", matchee), |
731 | 535 | matcher, | 535 | matcher, |
732 | 536 | matcher.match(matchee).describe(), | 536 | matcher.match(matchee).describe(), |
733 | 537 | )) | 537 | )) |
734 | @@ -565,6 +565,21 @@ | |||
735 | 565 | self.assertFails(expected_error, self.assertEquals, a, b) | 565 | self.assertFails(expected_error, self.assertEquals, a, b) |
736 | 566 | self.assertFails(expected_error, self.failUnlessEqual, a, b) | 566 | self.assertFails(expected_error, self.failUnlessEqual, a, b) |
737 | 567 | 567 | ||
738 | 568 | def test_assertEqual_non_ascii_str_with_newlines(self): | ||
739 | 569 | message = _u("Be careful mixing unicode and bytes") | ||
740 | 570 | a = "a\n\xa7\n" | ||
741 | 571 | b = "Just a longish string so the more verbose output form is used." | ||
742 | 572 | expected_error = '\n'.join([ | ||
743 | 573 | '!=:', | ||
744 | 574 | "reference = '''\\", | ||
745 | 575 | 'a', | ||
746 | 576 | repr('\xa7')[1:-1], | ||
747 | 577 | "'''", | ||
748 | 578 | 'actual = %r' % (b,), | ||
749 | 579 | ': ' + message, | ||
750 | 580 | ]) | ||
751 | 581 | self.assertFails(expected_error, self.assertEqual, a, b, message) | ||
752 | 582 | |||
753 | 568 | def test_assertIsNone(self): | 583 | def test_assertIsNone(self): |
754 | 569 | self.assertIsNone(None) | 584 | self.assertIsNone(None) |
755 | 570 | 585 |
Heroic effort. I can only imagine what this must have been like to write.
What follows is initial feedback and thoughts. Answer & fix as much as you
like. I'm happy to fix up the minor stuff as I land.
To summarize, although the bug was originally filed against assertThat, it
turns out that it's actually an issue in the matchers themselves as well as
assertThat. Right?
If so, we have to be thinking about:
* what are our failure modes with this code and existing broken matchers
* how to make it easy to define matchers and mismatches correctly
Will return to those in a bit. General code stuff:
* Can you elaborate on your comment on istext()?
* Is there a way we can make it so matchers never have to use _isbytes?
Perhaps by adding an "is_multiline" function to compat.
* I don't care about astral characters
* In compat, "Seperate" should be spelled "Separate"
* text_repr is a bit hard to follow, maybe add some internal comments?
* Some more documentation on the intent & API of text_repr would help,
saying that it attempts to mimic repr() except returning unicode,
giving some examples, explaining why you might want to use it and also what
multiline does.
* I don't get why multiline needs three different ways of escaping lines
whereas the non-multiline case only has two branches
* I think having a repr()-like function that takes any object and
special-cases text would be good
* Just to be clear, we are expecting Mismatch.describe() to return unicode,
right?
* What are we expecting Matcher.__str__ to return? Text or bytes?
On the original points:
* We should add something to the docs explaining best practice for describe( ) methods. i.e., do "foo %s" % (text_repr(x),) describe` `
Mismatch.
* We should update the docstring for ``Mismatch.
* Our docs say that match() should return a Mismatch, so we can assume
subclassing of that.
* These docs need to bubble up, to AnnotatedMismatch, to assertThat etc.
e.g. The 'message' parameter of assertThat needs to have its type
documented.
Thanks so much for persevering.
jml