Merge lp:~mbp/subunit/505078-chunked into lp:~subunit/subunit/trunk

Proposed by Martin Pool
Status: Merged
Merged at revision: 139
Proposed branch: lp:~mbp/subunit/505078-chunked
Merge into: lp:~subunit/subunit/trunk
Diff against target: 94 lines (+39/-2)
2 files modified
python/subunit/chunked.py (+15/-2)
python/subunit/tests/test_chunked.py (+24/-0)
To merge this branch: bzr merge lp:~mbp/subunit/505078-chunked
Reviewer Review Type Date Requested Status
Jonathan Lange Approve
Martin Pool Pending
Review via email: mp+47754@code.launchpad.net

This proposal supersedes a proposal from 2011-01-10.

Description of the change

This makes subunit not barf if a \r is missing from the input. You can choose strict or non-strict parsing.

To post a comment you must log in.
Revision history for this message
Martin Pool (mbp) wrote : Posted in a previous version of this proposal

Robert said offline: he would like a 'strict' mode flag, defaulting to on, in the chunked parser that controls whether the parser knowingly accepts noncompliant data.

The presence of this flag ought not to imply that it strictly validates the input. There are many things that are incorrect that it will accept at the moment. But it could go towards that in future.

review: Needs Fixing
Revision history for this message
Jonathan Lange (jml) wrote :

Thanks Martin.

I don't like raising ValueError, but that's essentially what happens now anyway, so it's fine by me.

I've merged a version of your patch that changes 'cr' to 'CR' in comments (sometimes a man must take a stand) and that updates NEWS.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'python/subunit/chunked.py'
2--- python/subunit/chunked.py 2009-10-13 04:52:06 +0000
3+++ python/subunit/chunked.py 2011-01-28 01:08:53 +0000
4@@ -1,6 +1,7 @@
5 #
6 # subunit: extensions to python unittest to get test results from subprocesses.
7 # Copyright (C) 2005 Robert Collins <robertc@robertcollins.net>
8+# Copyright (C) 2011 Martin Pool <mbp@sourcefrog.net>
9 #
10 # Licensed under either the Apache License, Version 2.0 or the BSD 3-clause
11 # license at the users choice. A copy of both licenses are available in the
12@@ -19,7 +20,7 @@
13 class Decoder(object):
14 """Decode chunked content to a byte stream."""
15
16- def __init__(self, output):
17+ def __init__(self, output, strict=True):
18 """Create a decoder decoding to output.
19
20 :param output: A file-like object. Bytes written to the Decoder are
21@@ -29,11 +30,18 @@
22 when no more data is available, to detect short streams; the
23 write method will return none-None when the end of a stream is
24 detected.
25+
26+ :param strict: If True (the default), the decoder will not knowingly
27+ accept input that is not conformant to the HTTP specification.
28+ (This does not imply that it will catch every nonconformance.)
29+ If False, it will accept incorrect input that is still
30+ unambiguous.
31 """
32 self.output = output
33 self.buffered_bytes = []
34 self.state = self._read_length
35 self.body_length = 0
36+ self.strict = strict
37
38 def close(self):
39 """Close the decoder.
40@@ -87,7 +95,12 @@
41 if count_chars[-1][-1] != '\n':
42 return
43 count_str = ''.join(count_chars)
44- self.body_length = int(count_str[:-2], 16)
45+ if self.strict:
46+ if count_str[-2:] != '\r\n':
47+ raise ValueError("chunk header invalid: %r" % count_str)
48+ if '\r' in count_str[:-2]:
49+ raise ValueError("too many crs in chunk header %r" % count_str)
50+ self.body_length = int(count_str.rstrip('\n\r'), 16)
51 excess_bytes = len(count_str)
52 while excess_bytes:
53 if excess_bytes >= len(self.buffered_bytes[0]):
54
55=== modified file 'python/subunit/tests/test_chunked.py'
56--- python/subunit/tests/test_chunked.py 2009-10-13 04:52:06 +0000
57+++ python/subunit/tests/test_chunked.py 2011-01-28 01:08:53 +0000
58@@ -1,6 +1,7 @@
59 #
60 # subunit: extensions to python unittest to get test results from subprocesses.
61 # Copyright (C) 2005 Robert Collins <robertc@robertcollins.net>
62+# Copyright (C) 2011 Martin Pool <mbp@sourcefrog.net>
63 #
64 # Licensed under either the Apache License, Version 2.0 or the BSD 3-clause
65 # license at the users choice. A copy of both licenses are available in the
66@@ -86,6 +87,29 @@
67 self.assertEqual('', self.decoder.write('0\r\n'))
68 self.assertEqual('1' * 65536 + '2' * 65536, self.output.getvalue())
69
70+ def test_decode_newline_nonstrict(self):
71+ """Tolerate chunk markers with no cr character."""
72+ # From <http://pad.lv/505078>
73+ self.decoder = subunit.chunked.Decoder(self.output, strict=False)
74+ self.assertEqual(None, self.decoder.write('a\n'))
75+ self.assertEqual(None, self.decoder.write('abcdeabcde'))
76+ self.assertEqual('', self.decoder.write('0\n'))
77+ self.assertEqual('abcdeabcde', self.output.getvalue())
78+
79+ def test_decode_strict_newline_only(self):
80+ """Reject chunk markers with no cr character in strict mode."""
81+ # From <http://pad.lv/505078>
82+ self.assertRaises(ValueError,
83+ self.decoder.write, 'a\n')
84+
85+ def test_decode_strict_multiple_crs(self):
86+ self.assertRaises(ValueError,
87+ self.decoder.write, 'a\r\r\n')
88+
89+ def test_decode_short_header(self):
90+ self.assertRaises(ValueError,
91+ self.decoder.write, '\n')
92+
93
94 class TestEncode(unittest.TestCase):
95

Subscribers

People subscribed via source and target branches