Merge lp:~rye/python-oops-tools/local-charset-in-url into lp:python-oops-tools

Proposed by Roman Yepishev
Status: Merged
Merged at revision: 8
Proposed branch: lp:~rye/python-oops-tools/local-charset-in-url
Merge into: lp:python-oops-tools
Diff against target: 35 lines (+17/-0)
2 files modified
src/oopstools/oops/models.py (+9/-0)
src/oopstools/oops/test/test_dboopsloader.py (+8/-0)
To merge this branch: bzr merge lp:~rye/python-oops-tools/local-charset-in-url
Reviewer Review Type Date Requested Status
Robert Collins (community) Needs Fixing
Steve Kowalik (community) code Approve
Review via email: mp+80330@code.launchpad.net

Description of the change

Force quoting the URL if the string cannot be presented as ascii during loading.

To post a comment you must log in.
Revision history for this message
Steve Kowalik (stevenk) wrote :

This looks like excellent work, thank you.

My two niggling comments about this is we have a coding convention that comments are full sentences, so please end your comment with a full stop, and please add a comment to the start of your test case.

Revision history for this message
Steve Kowalik (stevenk) wrote :

This looks like excellent work, thank you.

My two niggling comments about this is we have a coding convention that comments are full sentences, so please end your comment with a full stop, and please add a comment to the start of your test case.

review: Approve (code)
Revision history for this message
Robert Collins (lifeless) wrote :

This looks like it should be an else path to the prior case: if we get a bytestring, check that its ascii. (And the right way to do that would be decode('ascii') not an encode.

review: Needs Fixing
Revision history for this message
Robert Collins (lifeless) wrote :

I've tidied this up in a branch of mine, which I'm going to self-approve (the only functional change was to make this code the else: clause of the if (type) is unicode: condition.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/oopstools/oops/models.py'
2--- src/oopstools/oops/models.py 2011-10-25 02:25:14 +0000
3+++ src/oopstools/oops/models.py 2011-10-25 12:59:23 +0000
4@@ -371,6 +371,15 @@
5 # We have gotten a ringer, URL's are bytestrings. Encode to UTF8 to get
6 # a bytestring and urllib.quote to get a url.
7 url = urllib.quote(url.encode('utf8'))
8+
9+ try:
10+ url.encode('ascii')
11+ except UnicodeDecodeError:
12+ # The URL cannot be represented as ascii but it is not unicode
13+ # either. There may be a byte in local encoding and we need to quote
14+ # it without any conversion
15+ url = urllib.quote(url)
16+
17 informational = oops.get('informational', 'False').lower() == 'true'
18 oops_date = oops.get('time')
19 if oops_date is None:
20
21=== modified file 'src/oopstools/oops/test/test_dboopsloader.py'
22--- src/oopstools/oops/test/test_dboopsloader.py 2011-10-24 05:21:01 +0000
23+++ src/oopstools/oops/test/test_dboopsloader.py 2011-10-25 12:59:23 +0000
24@@ -154,3 +154,11 @@
25 report = { 'url': 'foo', 'id': 'testnotopichandling'}
26 oops = parsed_oops_to_model_oops(report, 'bug_880641')
27 self.assertEqual('', oops.pageid)
28+
29+ def test_broken_url_handling(self):
30+ broken_url = '/somep\xe1th'
31+ report = { 'url': broken_url, 'id': 'testbrokenurl'}
32+ expected_url = urllib.quote(broken_url)
33+ oops = parsed_oops_to_model_oops(report, 'test_broken_url_handling')
34+ self.assertEqual(expected_url, oops.url)
35+

Subscribers

People subscribed via source and target branches

to all changes: