Merge lp:~rye/python-oops-tools/local-charset-in-url into lp:python-oops-tools

Proposed by Roman Yepishev
Status: Merged
Merged at revision: 8
Proposed branch: lp:~rye/python-oops-tools/local-charset-in-url
Merge into: lp:python-oops-tools
Diff against target: 35 lines (+17/-0)
2 files modified
src/oopstools/oops/models.py (+9/-0)
src/oopstools/oops/test/test_dboopsloader.py (+8/-0)
To merge this branch: bzr merge lp:~rye/python-oops-tools/local-charset-in-url
Reviewer Review Type Date Requested Status
Robert Collins (community) Needs Fixing
Steve Kowalik (community) code Approve
Review via email: mp+80330@code.launchpad.net

Description of the change

Force quoting the URL if the string cannot be presented as ascii during loading.

To post a comment you must log in.
Revision history for this message
Steve Kowalik (stevenk) wrote :

This looks like excellent work, thank you.

My two niggling comments about this is we have a coding convention that comments are full sentences, so please end your comment with a full stop, and please add a comment to the start of your test case.

Revision history for this message
Steve Kowalik (stevenk) wrote :

This looks like excellent work, thank you.

My two niggling comments about this is we have a coding convention that comments are full sentences, so please end your comment with a full stop, and please add a comment to the start of your test case.

review: Approve (code)
Revision history for this message
Robert Collins (lifeless) wrote :

This looks like it should be an else path to the prior case: if we get a bytestring, check that its ascii. (And the right way to do that would be decode('ascii') not an encode.

review: Needs Fixing
Revision history for this message
Robert Collins (lifeless) wrote :

I've tidied this up in a branch of mine, which I'm going to self-approve (the only functional change was to make this code the else: clause of the if (type) is unicode: condition.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'src/oopstools/oops/models.py'
--- src/oopstools/oops/models.py 2011-10-25 02:25:14 +0000
+++ src/oopstools/oops/models.py 2011-10-25 12:59:23 +0000
@@ -371,6 +371,15 @@
371 # We have gotten a ringer, URL's are bytestrings. Encode to UTF8 to get371 # We have gotten a ringer, URL's are bytestrings. Encode to UTF8 to get
372 # a bytestring and urllib.quote to get a url.372 # a bytestring and urllib.quote to get a url.
373 url = urllib.quote(url.encode('utf8'))373 url = urllib.quote(url.encode('utf8'))
374
375 try:
376 url.encode('ascii')
377 except UnicodeDecodeError:
378 # The URL cannot be represented as ascii but it is not unicode
379 # either. There may be a byte in local encoding and we need to quote
380 # it without any conversion
381 url = urllib.quote(url)
382
374 informational = oops.get('informational', 'False').lower() == 'true'383 informational = oops.get('informational', 'False').lower() == 'true'
375 oops_date = oops.get('time')384 oops_date = oops.get('time')
376 if oops_date is None:385 if oops_date is None:
377386
=== modified file 'src/oopstools/oops/test/test_dboopsloader.py'
--- src/oopstools/oops/test/test_dboopsloader.py 2011-10-24 05:21:01 +0000
+++ src/oopstools/oops/test/test_dboopsloader.py 2011-10-25 12:59:23 +0000
@@ -154,3 +154,11 @@
154 report = { 'url': 'foo', 'id': 'testnotopichandling'}154 report = { 'url': 'foo', 'id': 'testnotopichandling'}
155 oops = parsed_oops_to_model_oops(report, 'bug_880641')155 oops = parsed_oops_to_model_oops(report, 'bug_880641')
156 self.assertEqual('', oops.pageid)156 self.assertEqual('', oops.pageid)
157
158 def test_broken_url_handling(self):
159 broken_url = '/somep\xe1th'
160 report = { 'url': broken_url, 'id': 'testbrokenurl'}
161 expected_url = urllib.quote(broken_url)
162 oops = parsed_oops_to_model_oops(report, 'test_broken_url_handling')
163 self.assertEqual(expected_url, oops.url)
164

Subscribers

People subscribed via source and target branches

to all changes: