Merge lp:~salgado/launchpad/bug-422552 into lp:launchpad

Proposed by Guilherme Salgado
Status: Merged
Approved by: Brad Crittenden
Approved revision: not available
Merged at revision: not available
Proposed branch: lp:~salgado/launchpad/bug-422552
Merge into: lp:launchpad
Diff against target: 29 lines (+7/-1)
2 files modified
lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py (+6/-0)
lib/lp/services/apachelogparser/base.py (+1/-1)
To merge this branch: bzr merge lp:~salgado/launchpad/bug-422552
Reviewer Review Type Date Requested Status
Brad Crittenden (community) code Approve
Review via email: mp+16483@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Guilherme Salgado (salgado) wrote :

= Summary =

As seen in the logs of the script which parses librarian log files to
count downloads, there are errors when parsing some lines. These errors
prevent the script from parsing the remaining of the files.

The error happens when the request string has multiple consecutive white
spaces.

== Proposed fix ==

Fix the bug by normalizing white spaces in the request string

= Launchpad lint =

Checking for conflicts. and issues in doctests and templates.
Running jslint, xmllint, pyflakes, and pylint.
Using normal rules.

Linting changed files:
  lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py
  lib/lp/services/apachelogparser/base.py

== Pylint notices ==

lib/lp/services/apachelogparser/base.py
    11: [F0401] Unable to import 'lazr.uri' (No module named uri)

Revision history for this message
Brad Crittenden (bac) wrote :

As discussed on IRC I think you can replace request.split(' ') with request.split() and do away with the normalization. Having the test in place is great, too.

review: Approve (code)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py'
2--- lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py 2009-08-31 15:01:54 +0000
3+++ lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py 2009-12-22 16:13:16 +0000
4@@ -47,6 +47,12 @@
5 request = 'GET //8196569//foo HTTP/1.1'
6 self.assertMethodAndFileIDAreCorrect(request)
7
8+ def test_multiple_consecutive_white_spaces(self):
9+ # Some request strings might have multiple consecutive white spaces,
10+ # but they're parsed just like if they didn't have the extra spaces.
11+ request = 'GET /8196569/mediumubuntulogo.png HTTP/1.1'
12+ self.assertMethodAndFileIDAreCorrect(request)
13+
14 def test_return_value_for_https_path(self):
15 request = ('GET https://launchpadlibrarian.net/8196569/'
16 'mediumubuntulogo.png HTTP/1.1')
17
18=== modified file 'lib/lp/services/apachelogparser/base.py'
19--- lib/lp/services/apachelogparser/base.py 2009-11-17 18:36:20 +0000
20+++ lib/lp/services/apachelogparser/base.py 2009-12-22 16:13:16 +0000
21@@ -176,7 +176,7 @@
22
23 def get_method_and_path(request):
24 """Extract the method of the request and path of the requested file."""
25- L = request.split(' ')
26+ L = request.split()
27 # HTTP 1.0 requests might omit the HTTP version so we must cope with them.
28 if len(L) == 2:
29 method, path = L