Merge lp:~sinzui/launchpad/path-info-bad-request into lp:launchpad

Proposed by Curtis Hovey
Status: Merged
Approved by: Curtis Hovey
Approved revision: no longer in the source branch.
Merged at revision: 16384
Proposed branch: lp:~sinzui/launchpad/path-info-bad-request
Merge into: lp:launchpad
Diff against target: 39 lines (+18/-0)
2 files modified
lib/lp/services/webapp/servers.py (+9/-0)
lib/lp/services/webapp/tests/test_servers.py (+9/-0)
To merge this branch: bzr merge lp:~sinzui/launchpad/path-info-bad-request
Reviewer Review Type Date Requested Status
William Grant code Approve
Review via email: mp+140698@code.launchpad.net

Commit message

Recode PATH_INFO to UTF-8 th ensure a UnicodeDecodeError is not raised before oops reporting is available.

Description of the change

Launchpad/Zope fails without raising an error if PATH_INFO is not UTF-8
encoded. The error is deep in Zope and happens before any error handling
is setup

RULES

    Pre-implementation: flacoste, wgrant
    * We want to make the creation of the request object robust to carry
      on for a little bit longer. Something need to indicate that setup
      is incomplete and that an error needs to be reported
    * After error reporting is setup, and before traversal starts, something
      needs to raise a real error. The Error is a generic 400, but it would
      be nice to explain what went wrong since Lp does know for this
      case that the PATH_INFO must be UTF-8 encoded.
    * The New error is exempt from oopses when Lp is not the referrer. Like
      404s and 410s, offsite requests are not an issue with Lp's code.

    Addendum:
    * The very-plain-oops template cannot be used because it claims that
      something when wrong with Lp. In this case though, it was the other
      site, browser, or user.
    * Since the request was not properly setup, the templates in lp.app
      fail when the require a proper request.
    * We need a template that looks like an Lp page, but does not ever
      use the request.

QA

    * Visit https://bugs.qastaging.launchpad.net/mahara/+bug/630891%E4
    * Verify the page:
      * Uses Lp's styles
      * Explains Lp did not understand the request
      * Shows you the traceback when you are logged in as an Lp developer

LINT

    lib/lp/app/browser/configure.zcml
    lib/lp/app/browser/tests/test_launchpad.py
    lib/lp/app/templates/launchpad-badrequest.pt
    lib/lp/services/webapp/error.py
    lib/lp/services/webapp/errorlog.py
    lib/lp/services/webapp/interfaces.py
    lib/lp/services/webapp/publication.py
    lib/lp/services/webapp/servers.py
    lib/lp/services/webapp/tests/test_errorlog.py
    lib/lp/services/webapp/tests/test_publication.py
    lib/lp/services/webapp/tests/test_servers.py

LoC

    I have more than 10,000 lines of credit this week.

TEST

    ./bin/test -vc -t errorlog -t publication -t servers lp.services.webapp.tests
    ./bin/test -vc -t TestErrorViews lp.app.browser.tests

IMPLEMENTATION

I updated BasicLaunchpadRequest to handle the UnicodeDecodeError
exception. The request is made with a sanitised PATH_INFO. The response
status is immediately set to 400 in case something looks at the
response. The X-Launchpad-Bad-Request header is added so that something
can make a proper error for it when oops reports is available.
    lib/lp/services/webapp/servers.py
    lib/lp/services/webapp/tests/test_servers.py

I added BadRequestError. LaunchpadBrowserPublication.beforeTraversal
already raises errors when it is known that Lp cannot continue. I add a
rule to check for the X-Launchpad-Bad-Request header and make a
BadRequestError from it.
    lib/lp/services/webapp/interfaces.py
    lib/lp/services/webapp/publication.py
    lib/lp/services/webapp/tests/test_publication.py

Added BadRequestError to the list of error to ignore when Lp is not the
referrer.
    lib/lp/services/webapp/errorlog.py
    lib/lp/services/webapp/tests/test_errorlog.py

I redefined InvalidBatchSizeView to extend the new InvalidBatchSizeView
because both views are 400s and we want to explain what the user did. I
configured the Error view to use the Lp specific template. Since the
error is about the request, we cannot use standard templates. The
very-plain-oops.pt template from webapp was not appropriate because it
claimed there was something wrong with Lp, which was not true. I
provided a page that looks like Lp's error pages, but avoids anything
that requires a sane request object.
    lib/lp/services/webapp/error.py
    lib/lp/app/browser/configure.zcml
    lib/lp/app/browser/tests/test_launchpad.py
    lib/lp/app/templates/launchpad-badrequest.pt

To post a comment you must log in.
Revision history for this message
William Grant (wgrant) wrote :

Did you consider a less invasive fix along the lines of http://paste.ubuntu.com/1451678/? 'replace' will replace the invalid UTF-8 sequences with U+FFFD, which is not ideal, but it's an uncommon case and probably less risky than your proposal.

review: Needs Information
Revision history for this message
Curtis Hovey (sinzui) wrote :

I did indeed try sanitising the path_info without knowing if it was bad. Lp will raise a 404. There is no indication that the request was not understood. I can think of 1 reason we really want 400, maybe you can cry YAGNI. If Lp is the referrer of a badly encoded request, we will see it in the oops reports with the oopses that most concern us. If the error is reported as a 404, the encoding problem is grouped with a large listing of moved or removed user data. If you don't want this separation, then I will simplify this branch.

Revision history for this message
Curtis Hovey (sinzui) wrote :

We agreed to recode the PATH_INFO and accept a 404 when this happens.

Revision history for this message
William Grant (wgrant) wrote :

Thanks. Looks good now except for the reindented line.

review: Approve (code)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'lib/lp/services/webapp/servers.py'
--- lib/lp/services/webapp/servers.py 2012-07-06 16:39:38 +0000
+++ lib/lp/services/webapp/servers.py 2012-12-20 23:06:22 +0000
@@ -574,6 +574,15 @@
574 def __init__(self, body_instream, environ, response=None):574 def __init__(self, body_instream, environ, response=None):
575 self.traversed_objects = []575 self.traversed_objects = []
576 self._wsgi_keys = set()576 self._wsgi_keys = set()
577 if 'PATH_INFO' in environ:
578 # Zope's sane_environment assumes that PATH_INFO is UTF-8 encoded.
579 # This next step replaces problems with U+FFFD to ensure
580 # a UnicodeDecodeError is not raised before OOPS error handling
581 # is available. This change will convert a 400 error to a 404
582 # because tranversal will raise NotFound when it encounters a
583 # non-ascii path part.
584 environ['PATH_INFO'] = environ['PATH_INFO'].decode(
585 'utf-8', 'replace').encode('utf-8')
577 super(BasicLaunchpadRequest, self).__init__(586 super(BasicLaunchpadRequest, self).__init__(
578 body_instream, environ, response)587 body_instream, environ, response)
579588
580589
=== modified file 'lib/lp/services/webapp/tests/test_servers.py'
--- lib/lp/services/webapp/tests/test_servers.py 2012-02-29 10:35:12 +0000
+++ lib/lp/services/webapp/tests/test_servers.py 2012-12-20 23:06:22 +0000
@@ -380,6 +380,15 @@
380 response.getHeader(380 response.getHeader(
381 'Strict-Transport-Security'), 'max-age=2592000')381 'Strict-Transport-Security'), 'max-age=2592000')
382382
383 def test_baserequest_recovers_from_bad_path_info_encoding(self):
384 # The request object recodes PATH_INFO to ensure sane_environment
385 # does not raise a UnicodeDecodeError when LaunchpadBrowserRequest
386 # is instantiated.
387 bad_path = 'fnord/trunk\xE4'
388 env = {'PATH_INFO': bad_path}
389 request = LaunchpadBrowserRequest(StringIO.StringIO(''), env)
390 self.assertEquals(u'fnord/trunk\ufffd', request.getHeader('PATH_INFO'))
391
383392
384class TestFeedsBrowserRequest(TestCase):393class TestFeedsBrowserRequest(TestCase):
385 """Tests for `FeedsBrowserRequest`."""394 """Tests for `FeedsBrowserRequest`."""