Merge lp:~gz/pyjunitxml/ensure_well_formed_xml_625589 into lp:pyjunitxml

Proposed by Martin Packman
Status: Merged
Merged at revision: 21
Proposed branch: lp:~gz/pyjunitxml/ensure_well_formed_xml_625589
Merge into: lp:pyjunitxml
Diff against target: 275 lines (+181/-9) (has conflicts)
2 files modified
junitxml/__init__.py (+57/-9)
junitxml/tests/test_junitxml.py (+124/-0)
Text conflict in junitxml/__init__.py
Text conflict in junitxml/tests/test_junitxml.py
To merge this branch: bzr merge lp:~gz/pyjunitxml/ensure_well_formed_xml_625589
Reviewer Review Type Date Requested Status
Robert Collins Needs Fixing
Vincent Ladeuil Approve
Review via email: mp+34573@code.launchpad.net

Description of the change

Fixes the escaping function to be properly robust against arbitrary input, and to output utf-8 bytestrings on Python 2.

There are a couple of judgement calls in how this escaping is written. Firstly, invalid cdata is stripped rather than included in some escaped form in the output. Without resorting to clever things that would require the junit xml format there's always the potential for confusing output like "AssertionError: '' != ''" but that's not much worse than "AssertionError: '\x00' != '\x00'" or similar. Also, \r is stripped even though it may appear in xml content, this is largely because of how parsing handles it anyway.

This branch builds on lp:~gz/pyjunitxml/unexpected_expectations_python_2.7 and p:~gz/pyjunitxml/source_compat_python_3 but launchpad doesn't like multiple prerequisite branches.

To post a comment you must log in.
Revision history for this message
Vincent Ladeuil (vila) wrote :

Woohoo ! Makes babune happy :-)

First time we get an actual Test Result, compare:
  http://babune.ladeuil.net:24842/job/selftest-windows/153/
with
  http://babune.ladeuil.net:24842/job/selftest-windows/154/

I will be running with this branch for all other platforms to check
against regressions.

review: Approve
Revision history for this message
Martin Packman (gz) wrote :

> Woohoo ! Makes babune happy :-)

Ah, great, I was going to ask you to give this a go tomorrow.

> First time we get an actual Test Result, compare:
> http://babune.ladeuil.net:24842/job/selftest-windows/153/
> with
> http://babune.ladeuil.net:24842/job/selftest-windows/154/
>
> I will be running with this branch for all other platforms to check
> against regressions.

Went to check if maverick was now working as well, and apparently it was already made happy by fixing the elementtree thing. The testing is better for this when there are lots of different things going wrong with the test suite. :)

Revision history for this message
Vincent Ladeuil (vila) wrote :

@Robert: ping, this has been working well on babune for the last days. Anything else needed to land ?

Revision history for this message
Robert Collins (lifeless) wrote :

Fails selftest:

:!python -m testtools.run junitxml.test_suite
Tests running...
======================================================================
ERROR: junitxml.tests.test_junitxml.TestWellFormedXml.test_error_with_surrogates
----------------------------------------------------------------------
Traceback (most recent call last):
  File "junitxml/tests/test_junitxml.py", line 313, in test_error_with_surrogates
    self.assertTrue(unichr(0x201A2) in traceback)
UnboundLocalError: local variable 'unichr' referenced before assignment
Ran 18 tests in 0.007s

Also has conflicts in trunk, but they are shallow.

review: Needs Fixing
Revision history for this message
Robert Collins (lifeless) wrote :

And needs NEWS

21. By Martin Packman

Don't create a local unichr for test in Python 3 instead use expected string

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'junitxml/__init__.py'
2--- junitxml/__init__.py 2010-09-11 22:25:07 +0000
3+++ junitxml/__init__.py 2010-09-11 22:28:41 +0000
4@@ -8,9 +8,9 @@
5
6
7 import datetime
8+import re
9 import time
10 import unittest
11-from xml.sax.saxutils import escape
12
13 # same format as sys.version_info: "A tuple containing the five components of
14 # the version number: major, minor, micro, releaselevel, and serial. All
15@@ -52,12 +52,49 @@
16 return None
17
18
19-def _error_name(eclass):
20- module = eclass.__module__
21- if module not in ("__main__", "builtins", "exceptions"):
22- return ".".join([module, eclass.__name__])
23- return eclass.__name__
24-
25+<<<<<<< TREE
26+def _error_name(eclass):
27+ module = eclass.__module__
28+ if module not in ("__main__", "builtins", "exceptions"):
29+ return ".".join([module, eclass.__name__])
30+ return eclass.__name__
31+
32+=======
33+def _error_name(eclass):
34+ module = eclass.__module__
35+ if module not in ("__main__", "builtins", "exceptions"):
36+ return ".".join([module, eclass.__name__])
37+ return eclass.__name__
38+
39+
40+_non_cdata = "[\0-\b\x0B-\x1F\uD800-\uDFFF\uFFFE\uFFFF]+"
41+if "\\u" in _non_cdata:
42+ _non_cdata = _non_cdata.decode("unicode-escape")
43+ def _strip_invalid_chars(s, _sub=re.compile(_non_cdata, re.UNICODE).sub):
44+ if not isinstance(s, unicode):
45+ try:
46+ s = s.decode("utf-8")
47+ except UnicodeDecodeError:
48+ s = s.decode("ascii", "replace")
49+ return _sub("", s).encode("utf-8")
50+else:
51+ def _strip_invalid_chars(s, _sub=re.compile(_non_cdata, re.UNICODE).sub):
52+ return _sub("", s)
53+def _escape_content(s):
54+ return (_strip_invalid_chars(s)
55+ .replace("&", "&amp;")
56+ .replace("<", "&lt;")
57+ .replace("]]>", "]]&gt;"))
58+def _escape_attr(s):
59+ return (_strip_invalid_chars(s)
60+ .replace("&", "&amp;")
61+ .replace("<", "&lt;")
62+ .replace("]]>", "]]&gt;")
63+ .replace('"', "&quot;")
64+ .replace("\t", "&#x9;")
65+ .replace("\n", "&#xA;"))
66+
67+>>>>>>> MERGE-SOURCE
68
69 class JUnitXmlResult(unittest.TestResult):
70 """A TestResult which outputs JUnit compatible XML."""
71@@ -71,6 +108,9 @@
72 """
73 self.__super = super(JUnitXmlResult, self)
74 self.__super.__init__()
75+ # GZ 2010-09-03: We have a problem if passed a text stream in Python 3
76+ # as really we want to write raw UTF-8 to ensure that
77+ # the encoding is not mangled later
78 self._stream = stream
79 self._results = []
80 self._set_time = None
81@@ -122,7 +162,7 @@
82 else:
83 classname, name = test_id[:class_end], test_id[class_end+1:]
84 self._results.append('<testcase classname="%s" name="%s" '
85- 'time="%0.3f"' % (escape(classname), escape(name), duration))
86+ 'time="%0.3f"' % (_escape_attr(classname), _escape_attr(name), duration))
87
88 def stopTestRun(self):
89 """Stop a test run.
90@@ -142,13 +182,21 @@
91 self.__super.addError(test, error)
92 self._test_case_string(test)
93 self._results.append('>\n')
94+<<<<<<< TREE
95 self._results.append('<error type="%s">%s</error>\n</testcase>\n'% (escape(_error_name(error[0])), escape(self._exc_info_to_string(error, test))))
96+=======
97+ self._results.append('<error type="%s">%s</error>\n</testcase>\n'% (_escape_attr(_error_name(error[0])), _escape_content(self._exc_info_to_string(error, test))))
98+>>>>>>> MERGE-SOURCE
99
100 def addFailure(self, test, error):
101 self.__super.addFailure(test, error)
102 self._test_case_string(test)
103 self._results.append('>\n')
104+<<<<<<< TREE
105 self._results.append('<failure type="%s">%s</failure>\n</testcase>\n'% (escape(_error_name(error[0])), escape(self._exc_info_to_string(error, test))))
106+=======
107+ self._results.append('<failure type="%s">%s</failure>\n</testcase>\n'% (_escape_attr(_error_name(error[0])), _escape_content(self._exc_info_to_string(error, test))))
108+>>>>>>> MERGE-SOURCE
109
110 def addSuccess(self, test):
111 self.__super.addSuccess(test)
112@@ -163,7 +211,7 @@
113 pass
114 self._test_case_string(test)
115 self._results.append('>\n')
116- self._results.append('<skip>%s</skip>\n</testcase>\n'% escape(reason))
117+ self._results.append('<skip>%s</skip>\n</testcase>\n'% _escape_attr(reason))
118
119 def addUnexpectedSuccess(self, test):
120 try:
121
122=== modified file 'junitxml/tests/test_junitxml.py'
123--- junitxml/tests/test_junitxml.py 2010-09-11 22:25:07 +0000
124+++ junitxml/tests/test_junitxml.py 2010-09-11 22:28:41 +0000
125@@ -11,7 +11,9 @@
126 from io import StringIO
127 import datetime
128 import re
129+import sys
130 import unittest
131+import xml.dom.minidom
132
133 import junitxml
134
135@@ -188,12 +190,20 @@
136 expected_failure_support = [True]
137 class ExpectedFail(unittest.TestCase):
138 def test_me(self):
139+<<<<<<< TREE
140 self.fail("fail")
141 try:
142 test_me = unittest.expectedFailure(test_me)
143 except AttributeError:
144 # Older python - just let the test fail
145 expected_failure_support[0] = False
146+=======
147+ self.fail("fail")
148+ try:
149+ test_me = unittest.expectedFailure(test_me)
150+ except AttributeError:
151+ pass # Older python - just let the test fail
152+>>>>>>> MERGE-SOURCE
153 self.result.startTestRun()
154 ExpectedFail("test_me").run(self.result)
155 self.result.stopTestRun()
156@@ -210,5 +220,119 @@
157 """
158 if expected_failure_support[0]:
159 self.assertEqual(expected, output)
160+<<<<<<< TREE
161 else:
162 self.assertEqual(expected_old, output)
163+=======
164+
165+
166+class TestWellFormedXml(unittest.TestCase):
167+ """XML created should always be well formed even with odd test cases"""
168+
169+ def _run_and_parse_test(self, case):
170+ output = StringIO()
171+ result = junitxml.JUnitXmlResult(output)
172+ result.startTestRun()
173+ case.run(result)
174+ result.stopTestRun()
175+ return xml.dom.minidom.parseString(output.getvalue())
176+
177+ def test_failure_with_amp(self):
178+ """Check the failure element content is escaped"""
179+ class FailWithAmp(unittest.TestCase):
180+ def runTest(self):
181+ self.fail("& should be escaped as &amp;")
182+ doc = self._run_and_parse_test(FailWithAmp())
183+ self.assertTrue(
184+ doc.getElementsByTagName("failure")[0].firstChild.nodeValue
185+ .endswith("AssertionError: & should be escaped as &amp;\n"))
186+
187+ def test_quotes_in_test_case_id(self):
188+ """Check that quotes in an attribute are escaped"""
189+ class QuoteId(unittest.TestCase):
190+ def id(self):
191+ return unittest.TestCase.id(self) + '("quotes")'
192+ def runTest(self):
193+ pass
194+ doc = self._run_and_parse_test(QuoteId())
195+ self.assertEqual('runTest("quotes")',
196+ doc.getElementsByTagName("testcase")[0].getAttribute("name"))
197+
198+ def test_skip_reason(self):
199+ """Check the skip element content is escaped"""
200+ class SkipWithLt(unittest.TestCase):
201+ def runTest(self):
202+ self.fail("version < 2.7")
203+ try:
204+ runTest = unittest.skip("2.7 <= version")(runTest)
205+ except AttributeError:
206+ self.has_skip = False
207+ else:
208+ self.has_skip = True
209+ doc = self._run_and_parse_test(SkipWithLt())
210+ if self.has_skip:
211+ self.assertEqual('2.7 <= version',
212+ doc.getElementsByTagName("skip")[0].firstChild.nodeValue)
213+ else:
214+ self.assertTrue(
215+ doc.getElementsByTagName("failure")[0].firstChild.nodeValue
216+ .endswith("AssertionError: version < 2.7\n"))
217+
218+ def test_error_with_control_characters(self):
219+ """Check C0 control characters are stripped rather than output"""
220+ class ErrorWithC0(unittest.TestCase):
221+ def runTest(self):
222+ raise ValueError("\x1F\x0E\x0C\x0B\x08\x01\x00lost control")
223+ doc = self._run_and_parse_test(ErrorWithC0())
224+ self.assertTrue(
225+ doc.getElementsByTagName("error")[0].firstChild.nodeValue
226+ .endswith("ValueError: lost control\n"))
227+
228+ def test_error_with_invalid_cdata(self):
229+ """Check unicode outside the valid cdata range is stripped"""
230+ if len("\uffff") == 1:
231+ # Basic str type supports unicode
232+ exception = ValueError("\ufffe\uffffEOF")
233+ else:
234+ class UTF8_Error(Exception):
235+ def __unicode__(self):
236+ return str(self).decode("UTF-8")
237+ exception = UTF8_Error("\xef\xbf\xbe\xef\xbf\xbfEOF")
238+ class ErrorWithBadUnicode(unittest.TestCase):
239+ def runTest(self):
240+ raise exception
241+ doc = self._run_and_parse_test(ErrorWithBadUnicode())
242+ self.assertTrue(
243+ doc.getElementsByTagName("error")[0].firstChild.nodeValue
244+ .endswith("Error: EOF\n"))
245+
246+ def test_error_with_surrogates(self):
247+ """Check unicode surrogates are handled properly, paired or otherwise
248+
249+ This is a pain due to suboptimal unicode support in Python and the
250+ various changes in Python 3. On UCS-2 builds there is no easy way of
251+ getting rid of unpaired surrogates while leaving valid pairs alone, so
252+ this test doesn't require astral characters are kept there.
253+ """
254+ if len("\uffff") == 1:
255+ exception = ValueError("paired: \U000201a2"
256+ " unpaired: "+chr(0xD800)+"-"+chr(0xDFFF))
257+ astral_char = "\U000201a2"
258+ else:
259+ class UTF8_Error(Exception):
260+ def __unicode__(self):
261+ return str(self).decode("UTF-8")
262+ exception = UTF8_Error("paired: \xf0\xa0\x86\xa2"
263+ " unpaired: \xed\xa0\x80-\xed\xbf\xbf")
264+ astral_char = "\U000201a2".decode("unicode-escape")
265+ class ErrorWithSurrogates(unittest.TestCase):
266+ def runTest(self):
267+ raise exception
268+ doc = self._run_and_parse_test(ErrorWithSurrogates())
269+ traceback = doc.getElementsByTagName("error")[0].firstChild.nodeValue
270+ if sys.maxunicode == 0xFFFF:
271+ pass # would be nice to handle astral characters properly even so
272+ else:
273+ self.assertTrue(astral_char in traceback)
274+ self.assertTrue(traceback.endswith(" unpaired: -\n"))
275+>>>>>>> MERGE-SOURCE

Subscribers

People subscribed via source and target branches

to all changes: