Merge lp:~rye/python-oops-tools/quote-req-vars into lp:python-oops-tools

Proposed by Roman Yepishev
Status: Merged
Merged at revision: 13
Proposed branch: lp:~rye/python-oops-tools/quote-req-vars
Merge into: lp:python-oops-tools
Diff against target: 59 lines (+22/-3) (has conflicts)
2 files modified
src/oopstools/oops/models.py (+12/-0)
src/oopstools/oops/test/test_dboopsloader.py (+10/-3)
Text conflict in src/oopstools/oops/models.py
To merge this branch: bzr merge lp:~rye/python-oops-tools/quote-req-vars
Reviewer Review Type Date Requested Status
Robert Collins (community) Approve
Review via email: mp+81714@code.launchpad.net

Description of the change

req_vars may contain weird content so we will want to sanitize it a bit before putting some values in the database.

To post a comment you must log in.
Revision history for this message
Aaron Bentley (abentley) wrote :

I don't understand why you're changing the input you get before storing it. Isn't it better to store the true values you received, and then sanitize your output if necessary?

Revision history for this message
Robert Collins (lifeless) wrote :

The problem is data that doesn't fit the schema. The schema is varchar rather than bytea, so some form of coercion is needed. The quoting is at least lossless.... We might want to change the schema as well, but I think that can wait for us to at least get the U1 oops-tools stack live with the new code, which should be soon.

Wgrant also had a fix in this area for LP itself, I'll merge them all together and either land or propose for review depending how much fix up is needed.

review: Approve
Revision history for this message
Robert Collins (lifeless) wrote :

Follow on MP is at https://code.launchpad.net/~lifeless/python-oops-tools/bug-884265, if you have any further questions (or suggestions [short of schema changes :)] on how this might be done now).

If we think a schema change should be done - and perhaps it should be - I think we should do that in 2-3 releases - once all the deploys in Canonical are successfully running unmodified releases.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'src/oopstools/oops/models.py'
--- src/oopstools/oops/models.py 2011-11-02 20:25:00 +0000
+++ src/oopstools/oops/models.py 2011-11-09 11:09:40 +0000
@@ -305,6 +305,7 @@
305 is_local_referrer = False)305 is_local_referrer = False)
306 # Grab data needed by the Oops database model from the req_vars.306 # Grab data needed by the Oops database model from the req_vars.
307 req_vars = oops.get('req_vars') or ()307 req_vars = oops.get('req_vars') or ()
308<<<<<<< TREE
308 # Some badly written OOPSes have single item tuples. (field.blob was seen).309 # Some badly written OOPSes have single item tuples. (field.blob was seen).
309 def ensure_tuple(iterable):310 def ensure_tuple(iterable):
310 for item in iterable:311 for item in iterable:
@@ -315,6 +316,17 @@
315 value = ''316 value = ''
316 yield key, value317 yield key, value
317 for key, value in ensure_tuple(req_vars):318 for key, value in ensure_tuple(req_vars):
319=======
320 for key, value in req_vars:
321 try:
322 # We can get anything in HTTP headers
323 # Trying to squeeze the bytes into ASCII plane
324 value.decode('ascii')
325 except UnicodeDecodeError:
326 # If that fails, percent-quote them
327 value = urllib.quote(value)
328
329>>>>>>> MERGE-SOURCE
318 if key == 'REQUEST_METHOD':330 if key == 'REQUEST_METHOD':
319 # Truncate the REQUEST_METHOD if it's longer than 10 characters331 # Truncate the REQUEST_METHOD if it's longer than 10 characters
320 # since a str longer than that is not allowed in the DB.332 # since a str longer than that is not allowed in the DB.
321333
=== modified file 'src/oopstools/oops/test/test_dboopsloader.py'
--- src/oopstools/oops/test/test_dboopsloader.py 2011-11-02 20:25:00 +0000
+++ src/oopstools/oops/test/test_dboopsloader.py 2011-11-09 11:09:40 +0000
@@ -155,15 +155,22 @@
155 oops = parsed_oops_to_model_oops(report, 'bug_880641')155 oops = parsed_oops_to_model_oops(report, 'bug_880641')
156 self.assertEqual('', oops.pageid)156 self.assertEqual('', oops.pageid)
157157
158 def test_broken_url_handling(self):158 def test_broken_header_handling(self):
159 # This URL is not a valid URL - URL's are a subset of ASCII.
160 broken_url = '/somep\xe1th'159 broken_url = '/somep\xe1th'
161 report = { 'url': broken_url, 'id': 'testbrokenurl'}160 user_agent = '\xf8\x07\x9e'
161 report = { 'url': broken_url,
162 'id': 'testbrokenheader',
163 'req_vars': [
164 [ 'HTTP_USER_AGENT', user_agent ]
165 ]
166 }
162 # We handle such URL's by url quoting them, failing to do so being a167 # We handle such URL's by url quoting them, failing to do so being a
163 # (not uncommon) producer mistake.168 # (not uncommon) producer mistake.
164 expected_url = urllib.quote(broken_url)169 expected_url = urllib.quote(broken_url)
170 expected_user_agent = urllib.quote(user_agent)
165 oops = parsed_oops_to_model_oops(report, 'test_broken_url_handling')171 oops = parsed_oops_to_model_oops(report, 'test_broken_url_handling')
166 self.assertEqual(expected_url, oops.url)172 self.assertEqual(expected_url, oops.url)
173 self.assertEqual(expected_user_agent, oops.user_agent)
167174
168 def test_broken_reqvars_bug_885416(self):175 def test_broken_reqvars_bug_885416(self):
169 # If the tuples in req_vars have only one element the oops still needs176 # If the tuples in req_vars have only one element the oops still needs

Subscribers

People subscribed via source and target branches

to all changes: