Merge lp:~lifeless/python-oops-tools/bug-881400 into lp:python-oops-tools

Proposed by Robert Collins
Status: Merged
Approved by: Robert Collins
Approved revision: 9
Merged at revision: 8
Proposed branch: lp:~lifeless/python-oops-tools/bug-881400
Merge into: lp:python-oops-tools
Diff against target: 73 lines (+30/-2)
4 files modified
src/oopstools/NEWS.txt (+8/-1)
src/oopstools/oops/models.py (+10/-0)
src/oopstools/oops/test/test_dboopsloader.py (+11/-0)
src/oopstools/version.txt (+1/-1)
To merge this branch: bzr merge lp:~lifeless/python-oops-tools/bug-881400
Reviewer Review Type Date Requested Status
Robert Collins (community) Approve
Review via email: mp+80769@code.launchpad.net

Commit message

Handle bad url bytestrings by url quoting. Also release 0.6.1.

Description of the change

A followon-from https://code.launchpad.net/~rye/python-oops-tools/local-charset-in-url/+merge/80330 -- I've implemented StevenK's points, and mine.

To post a comment you must log in.
Revision history for this message
Robert Collins (lifeless) wrote :

Also includes a release.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/oopstools/NEWS.txt'
2--- src/oopstools/NEWS.txt 2011-10-27 03:40:45 +0000
3+++ src/oopstools/NEWS.txt 2011-10-31 02:12:23 +0000
4@@ -5,6 +5,9 @@
5 NEXT
6 ====
7
8+0.6.1
9+=====
10+
11 * Added AMQP support via the bin/amqp2disk script. (Robert Collins)
12
13 * Bumped oops-amqp rev to 0.0.3 for bugfixes. (Robert Collins)
14@@ -15,7 +18,11 @@
15 * amqp2disk -v will print the received OOPS ids on the console, for
16 entertainment and delight. (Robert Collins)
17
18-* OOPS reports with unicode url values are now handled during oops loading: the
19+* OOPS reports with non-ascii URL values are handled by url escaping the URL
20+ bytestring (this is separate to handling of unicode URL values).
21+ (Roman Yepishev, Robert Collins, #881400)
22+
23+* OOPS reports with unicode URL values are now handled during oops loading: the
24 unicode string is utf8 encoded (an arbitrary choice) and url escaped.
25 (Robert Collins, #879309)
26
27
28=== modified file 'src/oopstools/oops/models.py'
29--- src/oopstools/oops/models.py 2011-10-25 02:25:14 +0000
30+++ src/oopstools/oops/models.py 2011-10-31 02:12:23 +0000
31@@ -371,6 +371,16 @@
32 # We have gotten a ringer, URL's are bytestrings. Encode to UTF8 to get
33 # a bytestring and urllib.quote to get a url.
34 url = urllib.quote(url.encode('utf8'))
35+ else:
36+ try:
37+ url.decode('ascii')
38+ except UnicodeDecodeError:
39+ # The URL is not ASCII and thus definitely not a URL.
40+ # There may be a byte in local encoding or it may be UTF8 - we need
41+ # to quote it to make it a valid URL - this is better than
42+ # rejecting the OOPS, or having a URL that isn't in the DB.
43+ url = urllib.quote(url)
44+
45 informational = oops.get('informational', 'False').lower() == 'true'
46 oops_date = oops.get('time')
47 if oops_date is None:
48
49=== modified file 'src/oopstools/oops/test/test_dboopsloader.py'
50--- src/oopstools/oops/test/test_dboopsloader.py 2011-10-24 05:21:01 +0000
51+++ src/oopstools/oops/test/test_dboopsloader.py 2011-10-31 02:12:23 +0000
52@@ -154,3 +154,14 @@
53 report = { 'url': 'foo', 'id': 'testnotopichandling'}
54 oops = parsed_oops_to_model_oops(report, 'bug_880641')
55 self.assertEqual('', oops.pageid)
56+
57+ def test_broken_url_handling(self):
58+ # This URL is not a valid URL - URL's are a subset of ASCII.
59+ broken_url = '/somep\xe1th'
60+ report = { 'url': broken_url, 'id': 'testbrokenurl'}
61+ # We handle such URL's by url quoting them, failing to do so being a
62+ # (not uncommon) producer mistake.
63+ expected_url = urllib.quote(broken_url)
64+ oops = parsed_oops_to_model_oops(report, 'test_broken_url_handling')
65+ self.assertEqual(expected_url, oops.url)
66+
67
68=== modified file 'src/oopstools/version.txt'
69--- src/oopstools/version.txt 2011-10-13 20:18:51 +0000
70+++ src/oopstools/version.txt 2011-10-31 02:12:23 +0000
71@@ -1,1 +1,1 @@
72-0.6
73+0.6.1

Subscribers

People subscribed via source and target branches

to all changes: