Merge lp:~lifeless/python-oops-tools/bug-881400 into lp:python-oops-tools

Proposed by Robert Collins
Status: Merged
Approved by: Robert Collins
Approved revision: 9
Merged at revision: 8
Proposed branch: lp:~lifeless/python-oops-tools/bug-881400
Merge into: lp:python-oops-tools
Diff against target: 73 lines (+30/-2)
4 files modified
src/oopstools/NEWS.txt (+8/-1)
src/oopstools/oops/models.py (+10/-0)
src/oopstools/oops/test/test_dboopsloader.py (+11/-0)
src/oopstools/version.txt (+1/-1)
To merge this branch: bzr merge lp:~lifeless/python-oops-tools/bug-881400
Reviewer Review Type Date Requested Status
Robert Collins (community) Approve
Review via email: mp+80769@code.launchpad.net

Commit message

Handle bad url bytestrings by url quoting. Also release 0.6.1.

Description of the change

A followon-from https://code.launchpad.net/~rye/python-oops-tools/local-charset-in-url/+merge/80330 -- I've implemented StevenK's points, and mine.

To post a comment you must log in.
Revision history for this message
Robert Collins (lifeless) wrote :

Also includes a release.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'src/oopstools/NEWS.txt'
--- src/oopstools/NEWS.txt 2011-10-27 03:40:45 +0000
+++ src/oopstools/NEWS.txt 2011-10-31 02:12:23 +0000
@@ -5,6 +5,9 @@
5NEXT5NEXT
6====6====
77
80.6.1
9=====
10
8* Added AMQP support via the bin/amqp2disk script. (Robert Collins)11* Added AMQP support via the bin/amqp2disk script. (Robert Collins)
912
10* Bumped oops-amqp rev to 0.0.3 for bugfixes. (Robert Collins)13* Bumped oops-amqp rev to 0.0.3 for bugfixes. (Robert Collins)
@@ -15,7 +18,11 @@
15* amqp2disk -v will print the received OOPS ids on the console, for18* amqp2disk -v will print the received OOPS ids on the console, for
16 entertainment and delight. (Robert Collins)19 entertainment and delight. (Robert Collins)
1720
18* OOPS reports with unicode url values are now handled during oops loading: the21* OOPS reports with non-ascii URL values are handled by url escaping the URL
22 bytestring (this is separate to handling of unicode URL values).
23 (Roman Yepishev, Robert Collins, #881400)
24
25* OOPS reports with unicode URL values are now handled during oops loading: the
19 unicode string is utf8 encoded (an arbitrary choice) and url escaped.26 unicode string is utf8 encoded (an arbitrary choice) and url escaped.
20 (Robert Collins, #879309)27 (Robert Collins, #879309)
2128
2229
=== modified file 'src/oopstools/oops/models.py'
--- src/oopstools/oops/models.py 2011-10-25 02:25:14 +0000
+++ src/oopstools/oops/models.py 2011-10-31 02:12:23 +0000
@@ -371,6 +371,16 @@
371 # We have gotten a ringer, URL's are bytestrings. Encode to UTF8 to get371 # We have gotten a ringer, URL's are bytestrings. Encode to UTF8 to get
372 # a bytestring and urllib.quote to get a url.372 # a bytestring and urllib.quote to get a url.
373 url = urllib.quote(url.encode('utf8'))373 url = urllib.quote(url.encode('utf8'))
374 else:
375 try:
376 url.decode('ascii')
377 except UnicodeDecodeError:
378 # The URL is not ASCII and thus definitely not a URL.
379 # There may be a byte in local encoding or it may be UTF8 - we need
380 # to quote it to make it a valid URL - this is better than
381 # rejecting the OOPS, or having a URL that isn't in the DB.
382 url = urllib.quote(url)
383
374 informational = oops.get('informational', 'False').lower() == 'true'384 informational = oops.get('informational', 'False').lower() == 'true'
375 oops_date = oops.get('time')385 oops_date = oops.get('time')
376 if oops_date is None:386 if oops_date is None:
377387
=== modified file 'src/oopstools/oops/test/test_dboopsloader.py'
--- src/oopstools/oops/test/test_dboopsloader.py 2011-10-24 05:21:01 +0000
+++ src/oopstools/oops/test/test_dboopsloader.py 2011-10-31 02:12:23 +0000
@@ -154,3 +154,14 @@
154 report = { 'url': 'foo', 'id': 'testnotopichandling'}154 report = { 'url': 'foo', 'id': 'testnotopichandling'}
155 oops = parsed_oops_to_model_oops(report, 'bug_880641')155 oops = parsed_oops_to_model_oops(report, 'bug_880641')
156 self.assertEqual('', oops.pageid)156 self.assertEqual('', oops.pageid)
157
158 def test_broken_url_handling(self):
159 # This URL is not a valid URL - URL's are a subset of ASCII.
160 broken_url = '/somep\xe1th'
161 report = { 'url': broken_url, 'id': 'testbrokenurl'}
162 # We handle such URL's by url quoting them, failing to do so being a
163 # (not uncommon) producer mistake.
164 expected_url = urllib.quote(broken_url)
165 oops = parsed_oops_to_model_oops(report, 'test_broken_url_handling')
166 self.assertEqual(expected_url, oops.url)
167
157168
=== modified file 'src/oopstools/version.txt'
--- src/oopstools/version.txt 2011-10-13 20:18:51 +0000
+++ src/oopstools/version.txt 2011-10-31 02:12:23 +0000
@@ -1,1 +1,1 @@
10.610.6.1

Subscribers

People subscribed via source and target branches

to all changes: