Merge lp:~benji/launchpad/bug-622765 into lp:launchpad
| Status: | Merged | ||||
|---|---|---|---|---|---|
| Approved by: | Benji York on 2010-08-30 | ||||
| Approved revision: | no longer in the source branch. | ||||
| Merged at revision: | 11477 | ||||
| Proposed branch: | lp:~benji/launchpad/bug-622765 | ||||
| Merge into: | lp:launchpad | ||||
| Diff against target: |
250 lines (+73/-24) 4 files modified
lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py (+2/-2) lib/lp/services/apachelogparser/base.py (+7/-3) lib/lp/services/apachelogparser/script.py (+9/-1) lib/lp/services/apachelogparser/tests/test_apachelogparser.py (+55/-18) |
||||
| To merge this branch: | bzr merge lp:~benji/launchpad/bug-622765 | ||||
| Related bugs: |
|
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Henning Eggers (community) | code | 2010-08-30 | Approve on 2010-08-30 |
|
Review via email:
|
|||
Description of the Change
The config option "logparser_
This branch fixes the problem by keeping track of the number of lines parsed across calls to parse_file.
The related tests can be run with
bin/test -c test_apachelogp
No lint was reported by make lint.
| Benji York (benji) wrote : | # |
On Mon, Aug 30, 2010 at 11:54 AM, Henning Eggers
<email address hidden> wrote:
> Review: Approve code
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Benji,
> thank you for this nice fix. It is basically good to land after you have
> considered a few comments of mine.
>
> review approve code
>
> Cheers,
> Henning
>
> Am 30.08.2010 17:03, schrieb Benji York:
>>
>> === modified file 'lib/canonical/
>> --- lib/canonical/
>> +++ lib/canonical/
>> @@ -102,7 +102,7 @@
>> '69.233.136.42 - - [13/Jun/
>> '/15018215/
>> '"https:/
>> - downloads, parsed_bytes = parse_file(
>> + downloads, parsed_bytes, ignored = parse_file(
>
> I am not sure "ignored" is the best choice here. I wondered at first if it
> means "ignored_bytes" or "ignored_lines". Would it hurt to call it what it is
> ("parsed_lines") and then simply ignore it?
Done.
>> === modified file 'lib/lp/
>> --- lib/lp/
>> +++ lib/lp/
>> @@ -8,6 +8,7 @@
>>
>> from zope.component import getUtility
>>
>> +from canonical.config import config
>> from lp.app.errors import NotFoundError
>> from lp.services.
>> create_
>> @@ -65,8 +66,15 @@
>>
>> self.setUpUtil
>> country_set = getUtility(
>> + parsed_lines = 0
>> + max_parsed_lines = getattr(
>> + config.launchpad, 'logparser_
>> for fd, position in files_to_parse:
>> - downloads, parsed_bytes = parse_file(
>> + # If we've used up our budget of lines to process, stop.
>> + if (max_parsed_lines is not None
>> + and parsed_lines >= max_parsed_lines):
>
> This line must be indented and the "and" should probably go on the previous
> line but multi-line "if" conditions are always bad to read because the
> dangling lines align with the body of the statement. Storing the boolean in a
> variable first and checking that in the statement often works better.
Done.
>> + # If we have already parsed some lines, then the number of lines
>> + # parsed will be passed in (parsed_lines argument) and parse_file will
>> + # take that number into account when determining if the maximum number
>> + # of lines to parse has been reached.
>
> Hm, this is a functional test and not documentation. This description should
> probably be found in the docstring of parse_file, should it not? If found
> there, it need not be explained in such detail here.
My intent was to explain the functioning of the test. I'm sure I would
appreciate a nice explanation of what's going on if I was reading the
test....

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi Benji,
thank you for this nice fix. It is basically good to land after you have
considered a few comments of mine.
review approve code
Cheers,
Henning
Am 30.08.2010 17:03, schrieb Benji York: launchpad/ scripts/ tests/test_ librarian_ apache_ log_parser. py' launchpad/ scripts/ tests/test_ librarian_ apache_ log_parser. py 2010-08-20 20:31:18 +0000 launchpad/ scripts/ tests/test_ librarian_ apache_ log_parser. py 2010-08-30 15:03:44 +0000 2008:14: 55:22 +0100] "GET ' ul_logo_ 64x64.png HTTP/1.1" 200 2261 ' /launchpad. net/~ubuntulite /+archive" "Mozilla"')
>
> === modified file 'lib/canonical/
> --- lib/canonical/
> +++ lib/canonical/
> @@ -102,7 +102,7 @@
> '69.233.136.42 - - [13/Jun/
> '/15018215/
> '"https:/
> - downloads, parsed_bytes = parse_file(
> + downloads, parsed_bytes, ignored = parse_file(
I am not sure "ignored" is the best choice here. I wondered at first if it
means "ignored_bytes" or "ignored_lines". Would it hurt to call it what it is
("parsed_lines") and then simply ignore it?
There are multiple uses of "ignored" in here, as you know. ;-)
> fd, start_position=0, logger=self.logger, key=get_ library_ file_id) 2008:14: 55:22 +0100] "GET / HTTP/1.1" ' /launchpad. net/~ubuntulite /+archive" "Mozilla"') key=get_ library_ file_id) services/ apachelogparser /base.py' services/ apachelogparser /base.py 2010-08-27 18:41:01 +0000 services/ apachelogparser /base.py 2010-08-30 15:03:44 +0000
> get_download_
> self.assertEqual(
> @@ -121,7 +121,7 @@
> fd = StringIO(
> '69.233.136.42 - - [13/Jun/
> '200 2261 "https:/
> - downloads, parsed_bytes = parse_file(
> + downloads, parsed_bytes, ignored = parse_file(
> fd, start_position=0, logger=self.logger,
> get_download_
> self.assertEqual(
>
> === modified file 'lib/lp/
> --- lib/lp/
> +++ lib/lp/
Very nice. Thank you.
[...]
> === modified file 'lib/lp/ services/ apachelogparser /script. py' services/ apachelogparser /script. py 2010-08-27 18:41:01 +0000 services/ apachelogparser /script. py 2010-08-30 15:03:44 +0000 apachelogparser .base import ( or_update_ parsedlog_ entry, ties() ICountrySet) max_parsed_ lines', None)
> --- lib/lp/
> +++ lib/lp/
> @@ -8,6 +8,7 @@
>
> from zope.component import getUtility
>
> +from canonical.config import config
> from lp.app.errors import NotFoundError
> from lp.services.
> create_
> @@ -65,8 +66,15 @@
>
> self.setUpUtili
> country_set = getUtility(
> + parsed_lines = 0
> + max_parsed_lines = getattr(
> + config.launchpad, 'logparser_
> for fd, position in files_to_parse:
> - downloads, parsed_bytes = parse_file(
> + # If we've used up our budget of lines to process, stop.
> + if (max_parsed_lines is not None
> + and parsed_lines >= max_parsed_lines):
This line must be indented and the "and" should probably go on the previous
line but multi-line "if" conditions are always bad to read because the
dangling lines align with the body of the statement. Storing the boolean in a
variable fi...