Merge lp:~sinzui/launchpad/path-info-bad-request into lp:launchpad

Proposed by Curtis Hovey
Status: Merged
Approved by: Curtis Hovey
Approved revision: no longer in the source branch.
Merged at revision: 16384
Proposed branch: lp:~sinzui/launchpad/path-info-bad-request
Merge into: lp:launchpad
Diff against target: 39 lines (+18/-0)
2 files modified
lib/lp/services/webapp/servers.py (+9/-0)
lib/lp/services/webapp/tests/test_servers.py (+9/-0)
To merge this branch: bzr merge lp:~sinzui/launchpad/path-info-bad-request
Reviewer Review Type Date Requested Status
William Grant code Approve
Review via email: mp+140698@code.launchpad.net

Commit message

Recode PATH_INFO to UTF-8 th ensure a UnicodeDecodeError is not raised before oops reporting is available.

Description of the change

Launchpad/Zope fails without raising an error if PATH_INFO is not UTF-8
encoded. The error is deep in Zope and happens before any error handling
is setup

RULES

    Pre-implementation: flacoste, wgrant
    * We want to make the creation of the request object robust to carry
      on for a little bit longer. Something need to indicate that setup
      is incomplete and that an error needs to be reported
    * After error reporting is setup, and before traversal starts, something
      needs to raise a real error. The Error is a generic 400, but it would
      be nice to explain what went wrong since Lp does know for this
      case that the PATH_INFO must be UTF-8 encoded.
    * The New error is exempt from oopses when Lp is not the referrer. Like
      404s and 410s, offsite requests are not an issue with Lp's code.

    Addendum:
    * The very-plain-oops template cannot be used because it claims that
      something when wrong with Lp. In this case though, it was the other
      site, browser, or user.
    * Since the request was not properly setup, the templates in lp.app
      fail when the require a proper request.
    * We need a template that looks like an Lp page, but does not ever
      use the request.

QA

    * Visit https://bugs.qastaging.launchpad.net/mahara/+bug/630891%E4
    * Verify the page:
      * Uses Lp's styles
      * Explains Lp did not understand the request
      * Shows you the traceback when you are logged in as an Lp developer

LINT

    lib/lp/app/browser/configure.zcml
    lib/lp/app/browser/tests/test_launchpad.py
    lib/lp/app/templates/launchpad-badrequest.pt
    lib/lp/services/webapp/error.py
    lib/lp/services/webapp/errorlog.py
    lib/lp/services/webapp/interfaces.py
    lib/lp/services/webapp/publication.py
    lib/lp/services/webapp/servers.py
    lib/lp/services/webapp/tests/test_errorlog.py
    lib/lp/services/webapp/tests/test_publication.py
    lib/lp/services/webapp/tests/test_servers.py

LoC

    I have more than 10,000 lines of credit this week.

TEST

    ./bin/test -vc -t errorlog -t publication -t servers lp.services.webapp.tests
    ./bin/test -vc -t TestErrorViews lp.app.browser.tests

IMPLEMENTATION

I updated BasicLaunchpadRequest to handle the UnicodeDecodeError
exception. The request is made with a sanitised PATH_INFO. The response
status is immediately set to 400 in case something looks at the
response. The X-Launchpad-Bad-Request header is added so that something
can make a proper error for it when oops reports is available.
    lib/lp/services/webapp/servers.py
    lib/lp/services/webapp/tests/test_servers.py

I added BadRequestError. LaunchpadBrowserPublication.beforeTraversal
already raises errors when it is known that Lp cannot continue. I add a
rule to check for the X-Launchpad-Bad-Request header and make a
BadRequestError from it.
    lib/lp/services/webapp/interfaces.py
    lib/lp/services/webapp/publication.py
    lib/lp/services/webapp/tests/test_publication.py

Added BadRequestError to the list of error to ignore when Lp is not the
referrer.
    lib/lp/services/webapp/errorlog.py
    lib/lp/services/webapp/tests/test_errorlog.py

I redefined InvalidBatchSizeView to extend the new InvalidBatchSizeView
because both views are 400s and we want to explain what the user did. I
configured the Error view to use the Lp specific template. Since the
error is about the request, we cannot use standard templates. The
very-plain-oops.pt template from webapp was not appropriate because it
claimed there was something wrong with Lp, which was not true. I
provided a page that looks like Lp's error pages, but avoids anything
that requires a sane request object.
    lib/lp/services/webapp/error.py
    lib/lp/app/browser/configure.zcml
    lib/lp/app/browser/tests/test_launchpad.py
    lib/lp/app/templates/launchpad-badrequest.pt

To post a comment you must log in.
Revision history for this message
William Grant (wgrant) wrote :

Did you consider a less invasive fix along the lines of http://paste.ubuntu.com/1451678/? 'replace' will replace the invalid UTF-8 sequences with U+FFFD, which is not ideal, but it's an uncommon case and probably less risky than your proposal.

review: Needs Information
Revision history for this message
Curtis Hovey (sinzui) wrote :

I did indeed try sanitising the path_info without knowing if it was bad. Lp will raise a 404. There is no indication that the request was not understood. I can think of 1 reason we really want 400, maybe you can cry YAGNI. If Lp is the referrer of a badly encoded request, we will see it in the oops reports with the oopses that most concern us. If the error is reported as a 404, the encoding problem is grouped with a large listing of moved or removed user data. If you don't want this separation, then I will simplify this branch.

Revision history for this message
Curtis Hovey (sinzui) wrote :

We agreed to recode the PATH_INFO and accept a 404 when this happens.

Revision history for this message
William Grant (wgrant) wrote :

Thanks. Looks good now except for the reindented line.

review: Approve (code)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/lp/services/webapp/servers.py'
2--- lib/lp/services/webapp/servers.py 2012-07-06 16:39:38 +0000
3+++ lib/lp/services/webapp/servers.py 2012-12-20 23:06:22 +0000
4@@ -574,6 +574,15 @@
5 def __init__(self, body_instream, environ, response=None):
6 self.traversed_objects = []
7 self._wsgi_keys = set()
8+ if 'PATH_INFO' in environ:
9+ # Zope's sane_environment assumes that PATH_INFO is UTF-8 encoded.
10+ # This next step replaces problems with U+FFFD to ensure
11+ # a UnicodeDecodeError is not raised before OOPS error handling
12+ # is available. This change will convert a 400 error to a 404
13+ # because tranversal will raise NotFound when it encounters a
14+ # non-ascii path part.
15+ environ['PATH_INFO'] = environ['PATH_INFO'].decode(
16+ 'utf-8', 'replace').encode('utf-8')
17 super(BasicLaunchpadRequest, self).__init__(
18 body_instream, environ, response)
19
20
21=== modified file 'lib/lp/services/webapp/tests/test_servers.py'
22--- lib/lp/services/webapp/tests/test_servers.py 2012-02-29 10:35:12 +0000
23+++ lib/lp/services/webapp/tests/test_servers.py 2012-12-20 23:06:22 +0000
24@@ -380,6 +380,15 @@
25 response.getHeader(
26 'Strict-Transport-Security'), 'max-age=2592000')
27
28+ def test_baserequest_recovers_from_bad_path_info_encoding(self):
29+ # The request object recodes PATH_INFO to ensure sane_environment
30+ # does not raise a UnicodeDecodeError when LaunchpadBrowserRequest
31+ # is instantiated.
32+ bad_path = 'fnord/trunk\xE4'
33+ env = {'PATH_INFO': bad_path}
34+ request = LaunchpadBrowserRequest(StringIO.StringIO(''), env)
35+ self.assertEquals(u'fnord/trunk\ufffd', request.getHeader('PATH_INFO'))
36+
37
38 class TestFeedsBrowserRequest(TestCase):
39 """Tests for `FeedsBrowserRequest`."""