Merge lp:~jtv/launchpad/bug-994650-scrub-in-batches into lp:launchpad
Status: | Merged
---|---
Approved by: | Jeroen T. Vermeulen
Approved revision: | no longer in the source branch.
Merge reported by: | Jeroen T. Vermeulen
Merged at revision: | not available
Proposed branch: | lp:~jtv/launchpad/bug-994650-scrub-in-batches
Merge into: | lp:launchpad
Prerequisite: | lp:~jtv/launchpad/bug-994650-scrub-faster
Diff against target: | 609 lines (+571/-0), 4 files modified: database/schema/security.cfg (+1/-0); lib/lp/scripts/garbo.py (+4/-0); lib/lp/translations/scripts/scrub_pofiletranslator.py (+278/-0); lib/lp/translations/scripts/tests/test_scrub_pofiletranslator.py (+288/-0)
To merge this branch: | bzr merge lp:~jtv/launchpad/bug-994650-scrub-in-batches
Related bugs: |

Reviewer | Review Type | Date Requested | Status
---|---|---|---
j.c.sackett (community) | | | Approve

Review via email: mp+105189@code.launchpad.net
Commit message
Further speed up scrubbing of POFileTranslator.
Description of the change
This is one of two further optimizations for POFileTranslator scrubbing as mentioned in the preceding MP: https:/
Here you see two parts of the scrubbing process separated further: finding which POFiles in a batch need their POFileTranslator entries fixed, and actually doing so. The benefit is in a new, intervening step: bulk-load all objects needed for doing this work. Disappointingly, about half of the relevant object graph is needed for log output. But we probably should have it anyway, because tweaking such processes without helpful logs can be highly demotivating.
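The separation described above (find what needs fixing, bulk-load, then fix) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this branch: the function names `find_pofiles_needing_fix`, `bulk_load`, `fix_batch` and `scrub_batch`, and the dict-based stand-ins for database tables, are all assumptions made for the example.

```python
# Hypothetical in-memory stand-ins for database tables; not Launchpad code.
# POFile id -> ids of people who actually contributed translations.
CONTRIBUTORS = {1: {10, 11}, 2: {12}, 3: set()}
# POFile id -> ids of people who currently have a POFileTranslator record.
POFT_RECORDS = {1: {10}, 2: {12}, 3: {13}}
# Data that is only needed for log output (assumed shape).
POFILE_NAMES = {1: "foo/nl", 2: "foo/de", 3: "bar/fr"}


def find_pofiles_needing_fix(pofile_ids):
    """Step 1: compare ids only, and return the POFiles that are out of sync."""
    return [
        pofile_id for pofile_id in pofile_ids
        if CONTRIBUTORS[pofile_id] != POFT_RECORDS[pofile_id]]


def bulk_load(pofile_ids):
    """Step 2: fetch everything the fixing step needs in a single pass."""
    return {pofile_id: POFILE_NAMES[pofile_id] for pofile_id in pofile_ids}


def fix_batch(pofile_ids, log):
    """Step 3: add missing records and drop stale ones, logging each fix."""
    names = bulk_load(pofile_ids)
    for pofile_id in pofile_ids:
        wanted = CONTRIBUTORS[pofile_id]
        have = POFT_RECORDS[pofile_id]
        log.append("Fixing %s: +%d/-%d" % (
            names[pofile_id], len(wanted - have), len(have - wanted)))
        POFT_RECORDS[pofile_id] = set(wanted)


def scrub_batch(pofile_ids):
    """Run the three steps for one batch and return the log lines."""
    log = []
    fix_batch(find_pofiles_needing_fix(pofile_ids), log)
    return log
```

In this sketch only the POFiles that step 1 flags ever reach the bulk-load, so a batch that is already consistent costs no more than the initial comparison.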
As an added bonus, it turns out we don't really need to load full POFileTranslator records, let alone cache them across these steps. That's one big memory load off my mind. I deliberately didn't make the scrubber check and correct dates on existing POFileTranslator records; a bit of imprecision is fine.
As a next step, which should further reduce memory load as well as DB querying, we can cache sets of POTMsgSet ids per template across items in a batch (but not outside batches, since they may change between transactions).
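A per-batch cache of that kind might look something like the sketch below. It is only an illustration of the idea, not an implementation from this branch: `process_batches`, the `fetch_ids` callable, and the `(pofile_id, template_id)` tuples are hypothetical names chosen for the example.

```python
def process_batches(batches, fetch_ids):
    """Process batches of (pofile_id, template_id) pairs.

    `fetch_ids` is an assumed callable that maps a template id to its set
    of POTMsgSet ids; in a real script it would query the database.
    """
    results = []
    for batch in batches:
        # Template id -> set of POTMsgSet ids.  Created fresh for every
        # batch, because the sets may change between transactions.
        cache = {}
        for pofile_id, template_id in batch:
            if template_id not in cache:
                cache[template_id] = fetch_ids(template_id)
            results.append((pofile_id, len(cache[template_id])))
        # A real script would commit its transaction here; the cache goes
        # out of use together with the batch.
    return results
```

POFiles that share a template within a batch trigger one fetch between them, while the same template appearing in a later batch is fetched again.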
The tests don't go into the new components. They already cover the aggregate behaviour of fix_pofile() and the main loop; there'd be a lot of duplication and I'd hate to lose integration test coverage as a side effect of moving things into more fine-grained unit tests. Also I'm trying to correct for a past of focusing too much on fine-grained tests. But, dear reviewer, if you see anything that you do want tested then please say so.
Jeroen
Jeroen--
This looks good. Thanks.