Merge lp:~jtv/launchpad/bug-994650-scrub-faster into lp:launchpad

Proposed by Jeroen T. Vermeulen
Status: Merged
Approved by: William Grant
Approved revision: no longer in the source branch.
Merge reported by: Jeroen T. Vermeulen
Merged at revision: not available
Proposed branch: lp:~jtv/launchpad/bug-994650-scrub-faster
Merge into: lp:launchpad
Prerequisite: lp:~jtv/launchpad/bug-994650-scrub-pofiletranslator
Diff against target: 547 lines (+509/-0)
4 files modified
database/schema/security.cfg (+1/-0)
lib/lp/scripts/garbo.py (+4/-0)
lib/lp/translations/scripts/scrub_pofiletranslator.py (+212/-0)
lib/lp/translations/scripts/tests/test_scrub_pofiletranslator.py (+292/-0)
To merge this branch: bzr merge lp:~jtv/launchpad/bug-994650-scrub-faster
Reviewer        Review Type   Date Requested   Status
William Grant   code          -                Approve
Review via email: mp+105169@code.launchpad.net

Commit message

Speed up POFileTranslator scrubbing.

Description of the change

The garbo job I'm building to update POFileTranslator records for us was painfully slow. William pointed out the main reason: the query that fetches the POFiles that the outer loop iterates over. Each time the outer loop takes a small slice from that query's result, Storm re-executes the query.
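To illustrate the cost, here is a toy sketch in plain Python. `LazyResult` is a hypothetical stand-in for a lazy result set, not Storm's actual API; it only counts how often the underlying "query" runs when a loop slices it directly versus when the result is materialized once.

```python
class LazyResult:
    """Toy stand-in for a lazy ORM result set (hypothetical, not Storm)."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.executions = 0

    def _execute(self):
        # Pretend this is a round trip to the database.
        self.executions += 1
        return list(self._rows)

    def __getitem__(self, index):
        # Every slice triggers a fresh execution of the query.
        return self._execute()[index]

    def __iter__(self):
        return iter(self._execute())


def count_executions_when_slicing(result, chunk_size):
    """Walk the result in chunks by slicing the lazy result itself."""
    offset = 0
    while True:
        batch = result[offset:offset + chunk_size]
        if len(batch) == 0:
            break
        offset += chunk_size
    return result.executions


def count_executions_when_materialized(result, chunk_size):
    """Run the query once, then slice the in-memory tuple."""
    ids = tuple(result)
    offset = 0
    while True:
        batch = ids[offset:offset + chunk_size]
        if len(batch) == 0:
            break
        offset += chunk_size
    return result.executions
```

In this toy model the number of executions grows with the number of batches in the first function and stays at one in the second, which is the motivation for loading the ids up front.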

So we needed to keep that list in memory but, as William also pointed out, there are rather too many POFiles to keep comfortably in memory. This branch loads the POFiles' ids up front, and then postpones the loading of further details for as long as possible. It does mean handing ids around for some things that were objects before.

In fact, you see three levels of POFile retrieval here:

1. Setup fetches only ids, so that only the ids have to be kept in memory.

2. Per batch, the loop loads just the details needed to find out if the POFile needs updating: its POTemplate id, its Language id, and its POTMsgSet ids.

3. Any POFile that actually needs fixing (hopefully a minority) gets loaded as a proper object.
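The three levels above can be sketched with in-memory data. Everything here (the dicts, `load_ids`, `load_details`, `load_full`, `scrub`) is a hypothetical simplification for illustration; the branch itself runs Storm queries against the database.

```python
# Hypothetical in-memory "tables"; the real code queries the database.
POFILES = {
    1: {"template_id": 10, "language_id": 100},
    2: {"template_id": 10, "language_id": 200},
    3: {"template_id": 20, "language_id": 100},
}
# Person ids that really contributed, keyed by (template id, language id).
CONTRIBUTORS = {(10, 100): {7}, (10, 200): {8, 9}, (20, 100): set()}
# Person ids currently recorded per POFile id (the data being scrubbed).
RECORDED = {1: {7}, 2: {8}, 3: {5}}


def load_ids():
    """Level 1: keep only the ids in memory."""
    return tuple(sorted(POFILES))


def load_details(ids):
    """Level 2: per batch, fetch just the template and language ids."""
    return {
        pofile_id: (
            POFILES[pofile_id]["template_id"],
            POFILES[pofile_id]["language_id"],
        )
        for pofile_id in ids
    }


def load_full(pofile_id):
    """Level 3: load a full record, only for a POFile that needs fixing."""
    return dict(POFILES[pofile_id], id=pofile_id)


def scrub(chunk_size=2):
    """Return the ids of the POFiles whose records had to be fixed."""
    ids = load_ids()
    fixed = []
    for start in range(0, len(ids), chunk_size):
        batch = ids[start:start + chunk_size]
        details = load_details(batch)
        for pofile_id in batch:
            contributors = CONTRIBUTORS[details[pofile_id]]
            if RECORDED[pofile_id] != contributors:
                pofile = load_full(pofile_id)
                RECORDED[pofile["id"]] = set(contributors)
                fixed.append(pofile["id"])
    return fixed
```

Calling `scrub()` on this sample data fixes only the POFiles whose recorded contributors differ from the real ones; the consistent one never gets past level 2.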

This kind of thing is never fast enough, so there are obvious optimizations that I may pursue next: cache the POTMsgSet ids as we iterate over POFiles of the same POTemplate, to help the no-change case even more. And to speed up the case where there are changes, such as on the initial run, first gather those POFiles in the batch that need fixing, and batch-load them. The batch-loading can also cover templates, languages, distroseries, distributions, productseries, and products. Most of these are needed only for the purpose of logging which POFile is being updated.
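One possible shape for the POTMsgSet id cache mentioned above is a small bounded cache keyed by template id. This is only a sketch under assumptions: `fetch_potmsgset_ids`, `TEMPLATE_ITEMS` and `FETCH_LOG` are hypothetical stand-ins for the real query, and a real implementation would need to consider stale entries across commits.

```python
from functools import lru_cache

# Hypothetical data: POTMsgSet ids per POTemplate id.
TEMPLATE_ITEMS = {1: [11, 12], 2: [21], 3: []}
# Records which template ids actually hit the (pretend) database.
FETCH_LOG = []


def fetch_potmsgset_ids(template_id):
    """Stand-in for the real query; logs every call."""
    FETCH_LOG.append(template_id)
    return TEMPLATE_ITEMS.get(template_id, [])


@lru_cache(maxsize=32)
def cached_potmsgset_ids(template_id):
    """Return the ids as a tuple, fetching each template at most once
    while it remains in the cache."""
    return tuple(fetch_potmsgset_ids(template_id))
```

Because the POFile ids are ordered by template name, consecutive POFiles often share a template or alternate between a few sharing templates, so even a small `maxsize` should avoid most repeated fetches in the no-change case.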

To test:
{{{
# Get database schema set up for this branch.
bzr branch lp:~jtv/launchpad/db-994410
cd db-994410
make schema

cd ..

# Get the prerequisite branch.
bzr branch lp:~jtv/launchpad/bug-994650-scrub-pofiletranslator jtv-scrubber
cd jtv-scrubber

# Merge in the branch under review here.
bzr merge lp:~jtv/launchpad/bug-994650-scrub-faster
make

# Test.
./bin/test -vvc lp.translations.scripts.tests.test_scrub_pofiletranslator
}}}

Alternatively, get the branch that combines all these changes:

{{{
bzr branch lp:~jtv/launchpad/combined-async-pofiletranslator
cd combined-async-pofiletranslator
make
make schema
./bin/test -vvc lp.translations.scripts.tests.test_scrub_pofiletranslator
}}}

Jeroen

William Grant (wgrant) :
review: Approve (code)

Preview Diff

1=== modified file 'database/schema/security.cfg'
2--- database/schema/security.cfg 2012-05-09 13:50:03 +0000
3+++ database/schema/security.cfg 2012-05-24 05:13:32 +0000
4@@ -2181,6 +2181,7 @@
5 public.openidconsumerassociation = SELECT, DELETE
6 public.openidconsumernonce = SELECT, DELETE
7 public.person = SELECT, DELETE
8+public.pofiletranslator = SELECT, INSERT, UPDATE, DELETE
9 public.potranslation = SELECT, DELETE
10 public.potmsgset = SELECT, DELETE
11 public.product = SELECT
12
13=== modified file 'lib/lp/scripts/garbo.py'
14--- lib/lp/scripts/garbo.py 2012-05-09 01:35:41 +0000
15+++ lib/lp/scripts/garbo.py 2012-05-24 05:13:32 +0000
16@@ -105,6 +105,9 @@
17 from lp.translations.model.translationtemplateitem import (
18 TranslationTemplateItem,
19 )
20+from lp.translations.scripts.scrub_pofiletranslator import (
21+ ScrubPOFileTranslator,
22+ )
23
24
25 ONE_DAY_IN_SECONDS = 24 * 60 * 60
26@@ -1418,6 +1421,7 @@
27 ObsoleteBugAttachmentPruner,
28 OldTimeLimitedTokenDeleter,
29 RevisionAuthorEmailLinker,
30+ ScrubPOFileTranslator,
31 SuggestiveTemplatesCacheUpdater,
32 POTranslationPruner,
33 UnusedPOTMsgSetPruner,
34
35=== added file 'lib/lp/translations/scripts/scrub_pofiletranslator.py'
36--- lib/lp/translations/scripts/scrub_pofiletranslator.py 1970-01-01 00:00:00 +0000
37+++ lib/lp/translations/scripts/scrub_pofiletranslator.py 2012-05-24 05:13:32 +0000
38@@ -0,0 +1,212 @@
39+# Copyright 2012 Canonical Ltd. This software is licensed under the
40+# GNU Affero General Public License version 3 (see the file LICENSE).
41+
42+"""Keep `POFileTranslator` more or less consistent with the real data."""
43+
44+__metaclass__ = type
45+__all__ = [
46+ 'ScrubPOFileTranslator',
47+ ]
48+
49+from storm.expr import (
50+ Coalesce,
51+ Desc,
52+ )
53+import transaction
54+
55+from lp.services.database.lpstorm import IStore
56+from lp.services.looptuner import TunableLoop
57+from lp.translations.model.pofile import POFile
58+from lp.translations.model.pofiletranslator import POFileTranslator
59+from lp.translations.model.potemplate import POTemplate
60+from lp.translations.model.translationmessage import TranslationMessage
61+from lp.translations.model.translationtemplateitem import (
62+ TranslationTemplateItem,
63+ )
64+
65+
66+def get_pofile_ids():
67+ """Retrieve ids of POFiles to scrub.
68+
69+ The result's ordering is aimed at maximizing cache effectiveness:
70+ by POTemplate name for locality of shared POTMsgSets, and by language
71+ for locality of shared TranslationMessages.
72+ """
73+ store = IStore(POFile)
74+ query = store.find(
75+ POFile.id,
76+ POFile.potemplateID == POTemplate.id,
77+ POTemplate.iscurrent == True)
78+ return query.order_by(POTemplate.name, POFile.languageID)
79+
80+
81+def get_pofile_details(pofile_ids):
82+ """Retrieve relevant parts of `POFile`s with given ids.
83+
84+ :param pofile_ids: Iterable of `POFile` ids.
85+ :return: Dict mapping each id in `pofile_ids` to a duple of
86+ `POTemplate` id and `Language` id for the associated `POFile`.
87+ """
88+ store = IStore(POFile)
89+ rows = store.find(
90+ (POFile.id, POFile.potemplateID, POFile.languageID),
91+ POFile.id.is_in(pofile_ids))
92+ return dict((row[0], row[1:]) for row in rows)
93+
94+
95+def get_potmsgset_ids(potemplate_id):
96+ """Get the ids for each current `POTMsgSet` in a `POTemplate`."""
97+ store = IStore(POTemplate)
98+ return store.find(
99+ TranslationTemplateItem.potmsgsetID,
100+ TranslationTemplateItem.potemplateID == potemplate_id,
101+ TranslationTemplateItem.sequence > 0)
102+
103+
104+def summarize_contributors(potemplate_id, language_id, potmsgset_ids):
105+ """Return the set of ids of persons who contributed to a `POFile`.
106+
107+ This is a limited version of `get_contributions` that is easier to
108+ compute.
109+ """
110+ store = IStore(POFile)
111+ contribs = store.find(
112+ TranslationMessage.submitterID,
113+ TranslationMessage.potmsgsetID.is_in(potmsgset_ids),
114+ TranslationMessage.languageID == language_id,
115+ TranslationMessage.msgstr0 != None,
116+ Coalesce(TranslationMessage.potemplateID, potemplate_id) ==
117+ potemplate_id)
118+ contribs.config(distinct=True)
119+ return set(contribs)
120+
121+
122+def get_contributions(pofile, potmsgset_ids):
123+ """Map all users' most recent contributions to a `POFile`.
124+
125+ Returns a dict mapping `Person` id to the creation time of their most
126+ recent `TranslationMessage` in `POFile`.
127+
128+ This leaves some small room for error: a contribution that is masked by
129+ a diverged entry in this POFile will nevertheless produce a
130+ POFileTranslator record. Fixing that would complicate the work more than
131+ it is probably worth.
132+
133+ :param pofile: The `POFile` to find contributions for.
134+ :param potmsgset_ids: The ids of the `POTMsgSet`s to look for, as returned
135+ by `get_potmsgset_ids`.
136+ """
137+ store = IStore(pofile)
138+ language_id = pofile.language.id
139+ template_id = pofile.potemplate.id
140+ contribs = store.find(
141+ (TranslationMessage.submitterID, TranslationMessage.date_created),
142+ TranslationMessage.potmsgsetID.is_in(potmsgset_ids),
143+ TranslationMessage.languageID == language_id,
144+ TranslationMessage.msgstr0 != None,
145+ Coalesce(TranslationMessage.potemplateID, template_id) ==
146+ template_id)
147+ contribs = contribs.config(distinct=(TranslationMessage.submitterID,))
148+ contribs = contribs.order_by(
149+ TranslationMessage.submitterID, Desc(TranslationMessage.date_created))
150+ return dict(contribs)
151+
152+
153+def get_pofiletranslators(pofile_id):
154+ """Get `POFileTranslator` entries for a `POFile`.
155+
156+ Returns a dict mapping each contributor's person id to their
157+ `POFileTranslator` record.
158+ """
159+ store = IStore(POFileTranslator)
160+ pofts = store.find(
161+ POFileTranslator, POFileTranslator.pofileID == pofile_id)
162+ return dict((poft.personID, poft) for poft in pofts)
163+
164+
165+def remove_pofiletranslators(logger, pofile, person_ids):
166+ """Delete `POFileTranslator` records."""
167+ logger.debug(
168+ "Removing %d POFileTranslator(s) for %s.",
169+ len(person_ids), pofile.title)
170+ store = IStore(pofile)
171+ pofts = store.find(
172+ POFileTranslator,
173+ POFileTranslator.pofileID == pofile.id,
174+ POFileTranslator.personID.is_in(person_ids))
175+ pofts.remove()
176+
177+
178+def remove_unwarranted_pofiletranslators(logger, pofile, pofts, contribs):
179+ """Delete `POFileTranslator` records that shouldn't be there."""
180+ excess = set(pofts) - set(contribs)
181+ if len(excess) > 0:
182+ remove_pofiletranslators(logger, pofile, excess)
183+
184+
185+def create_missing_pofiletranslators(logger, pofile, pofts, contribs):
186+ """Create `POFileTranslator` records that were missing."""
187+ shortage = set(contribs) - set(pofts)
188+ if len(shortage) == 0:
189+ return
190+ logger.debug(
191+ "Adding %d POFileTranslator(s) for %s.",
192+ len(shortage), pofile.title)
193+ store = IStore(pofile)
194+ for missing_contributor in shortage:
195+ store.add(POFileTranslator(
196+ pofile=pofile, personID=missing_contributor,
197+ date_last_touched=contribs[missing_contributor]))
198+
199+
200+def fix_pofile(logger, pofile_id, potmsgset_ids, pofiletranslators):
201+ """This `POFile` needs fixing. Load its data & fix it."""
202+ pofile = IStore(POFile).get(POFile, pofile_id)
203+ contribs = get_contributions(pofile, potmsgset_ids)
204+ remove_unwarranted_pofiletranslators(
205+ logger, pofile, pofiletranslators, contribs)
206+ create_missing_pofiletranslators(
207+ logger, pofile, pofiletranslators, contribs)
208+
209+
210+def scrub_pofile(logger, pofile_id, template_id, language_id):
211+ """Scrub `POFileTranslator` entries for one `POFile`.
212+
213+ Removes inappropriate entries and adds missing ones.
214+ """
215+ pofiletranslators = get_pofiletranslators(pofile_id)
216+ potmsgset_ids = get_potmsgset_ids(template_id)
217+ contributors = summarize_contributors(
218+ template_id, language_id, potmsgset_ids)
219+ if set(pofiletranslators) != set(contributors):
220+ fix_pofile(logger, pofile_id, potmsgset_ids, pofiletranslators)
221+
222+
223+class ScrubPOFileTranslator(TunableLoop):
224+ """Tunable loop, meant for running from inside Garbo."""
225+
226+ maximum_chunk_size = 500
227+
228+ def __init__(self, *args, **kwargs):
229+ super(ScrubPOFileTranslator, self).__init__(*args, **kwargs)
230+ self.pofile_ids = tuple(get_pofile_ids())
231+ self.next_offset = 0
232+
233+ def __call__(self, chunk_size):
234+ """See `ITunableLoop`."""
235+ start_offset = self.next_offset
236+ self.next_offset = start_offset + int(chunk_size)
237+ batch = self.pofile_ids[start_offset:self.next_offset]
238+ if len(batch) == 0:
239+ self.next_offset = None
240+ return
241+
242+ pofile_details = get_pofile_details(batch)
243+ for pofile_id in batch:
244+ template_id, language_id = pofile_details[pofile_id]
245+ scrub_pofile(self.log, pofile_id, template_id, language_id)
246+ transaction.commit()
247+
248+ def isDone(self):
249+ """See `ITunableLoop`."""
250+ return self.next_offset is None
251
252=== added file 'lib/lp/translations/scripts/tests/test_scrub_pofiletranslator.py'
253--- lib/lp/translations/scripts/tests/test_scrub_pofiletranslator.py 1970-01-01 00:00:00 +0000
254+++ lib/lp/translations/scripts/tests/test_scrub_pofiletranslator.py 2012-05-24 05:13:32 +0000
255@@ -0,0 +1,292 @@
256+# Copyright 2012 Canonical Ltd. This software is licensed under the
257+# GNU Affero General Public License version 3 (see the file LICENSE).
258+
259+"""Test scrubbing of `POFileTranslator`."""
260+
261+__metaclass__ = type
262+
263+from datetime import (
264+ datetime,
265+ timedelta,
266+ )
267+
268+import pytz
269+import transaction
270+
271+from lp.services.database.constants import UTC_NOW
272+from lp.services.database.lpstorm import IStore
273+from lp.services.log.logger import DevNullLogger
274+from lp.testing import TestCaseWithFactory
275+from lp.testing.layers import ZopelessDatabaseLayer
276+from lp.translations.model.pofiletranslator import POFileTranslator
277+from lp.translations.scripts.scrub_pofiletranslator import (
278+ get_contributions,
279+ get_pofile_details,
280+ get_pofile_ids,
281+ get_pofiletranslators,
282+ get_potmsgset_ids,
283+ scrub_pofile,
284+ ScrubPOFileTranslator,
285+ summarize_contributors,
286+ )
287+
288+
289+fake_logger = DevNullLogger()
290+
291+
292+def size_distance(sequence, item1, item2):
293+ """Return the absolute distance between items in a sequence."""
294+ container = list(sequence)
295+ return abs(container.index(item2) - container.index(item1))
296+
297+
298+class TestScrubPOFileTranslator(TestCaseWithFactory):
299+
300+ layer = ZopelessDatabaseLayer
301+
302+ def query_pofiletranslator(self, pofile, person):
303+ """Query `POFileTranslator` for a specific record.
304+
305+ :return: Storm result set.
306+ """
307+ store = IStore(pofile)
308+ return store.find(POFileTranslator, pofile=pofile, person=person)
309+
310+ def make_message_with_pofiletranslator(self, pofile=None):
311+ """Create a normal `TranslationMessage` with `POFileTranslator`."""
312+ if pofile is None:
313+ pofile = self.factory.makePOFile()
314+ potmsgset = self.factory.makePOTMsgSet(
315+ potemplate=pofile.potemplate, sequence=1)
316+ # A database trigger on TranslationMessage automatically creates
317+ # a POFileTranslator record for each new TranslationMessage.
318+ return self.factory.makeSuggestion(pofile=pofile, potmsgset=potmsgset)
319+
320+ def make_message_without_pofiletranslator(self, pofile=None):
321+ """Create a `TranslationMessage` without `POFileTranslator`."""
322+ tm = self.make_message_with_pofiletranslator(pofile)
323+ IStore(pofile).flush()
324+ self.becomeDbUser('postgres')
325+ self.query_pofiletranslator(pofile, tm.submitter).remove()
326+ return tm
327+
328+ def make_pofiletranslator_without_message(self, pofile=None):
329+ """Create a `POFileTranslator` without `TranslationMessage`."""
330+ if pofile is None:
331+ pofile = self.factory.makePOFile()
332+ poft = POFileTranslator(
333+ pofile=pofile, person=self.factory.makePerson(),
334+ date_last_touched=UTC_NOW)
335+ IStore(poft.pofile).add(poft)
336+ return poft
337+
338+ def test_get_pofile_ids_gets_pofiles_for_active_templates(self):
339+ pofile = self.factory.makePOFile()
340+ self.assertIn(pofile.id, get_pofile_ids())
341+
342+ def test_get_pofile_ids_skips_inactive_templates(self):
343+ pofile = self.factory.makePOFile()
344+ pofile.potemplate.iscurrent = False
345+ self.assertNotIn(pofile.id, get_pofile_ids())
346+
347+ def test_get_pofile_ids_clusters_by_template_name(self):
348+ # POFiles for templates with the same name are bunched together
349+ # in the get_pofile_ids() output.
350+ templates = [
351+ self.factory.makePOTemplate(name='shared'),
352+ self.factory.makePOTemplate(name='other'),
353+ self.factory.makePOTemplate(name='andanother'),
354+ self.factory.makePOTemplate(
355+ name='shared', distroseries=self.factory.makeDistroSeries()),
356+ ]
357+ pofiles = [
358+ self.factory.makePOFile(potemplate=template)
359+ for template in templates]
360+ ordering = get_pofile_ids()
361+ self.assertEqual(
362+ 1, size_distance(ordering, pofiles[0].id, pofiles[-1].id))
363+
364+ def test_get_pofile_ids_clusters_by_language(self):
365+ # POFiles for sharing templates and the same language are
366+ # bunched together in the get_pofile_ids() output.
367+ templates = [
368+ self.factory.makePOTemplate(
369+ name='shared', distroseries=self.factory.makeDistroSeries())
370+ for counter in range(2)]
371+ # POFiles per language & template. We create these in a strange
372+ # way to avoid the risk of accidental orderings, such as per-id,
373+ # being mistaken for the proper order.
374+ languages = ['nl', 'fr']
375+ pofiles_per_language = dict((language, []) for language in languages)
376+ for language, pofiles in pofiles_per_language.items():
377+ for template in templates:
378+ pofiles.append(
379+ self.factory.makePOFile(language, potemplate=template))
380+
381+ ordering = get_pofile_ids()
382+ for pofiles in pofiles_per_language.values():
383+ self.assertEqual(
384+ 1, size_distance(ordering, pofiles[0].id, pofiles[1].id))
385+
386+ def test_get_pofile_details_maps_id_to_template_and_language_ids(self):
387+ pofile = self.factory.makePOFile()
388+ self.assertEqual(
389+ {pofile.id: (pofile.potemplate.id, pofile.language.id)},
390+ get_pofile_details([pofile.id]))
391+
392+ def test_get_potmsgset_ids_returns_potmsgset_ids(self):
393+ pofile = self.factory.makePOFile()
394+ potmsgset = self.factory.makePOTMsgSet(
395+ potemplate=pofile.potemplate, sequence=1)
396+ self.assertContentEqual(
397+ [potmsgset.id], get_potmsgset_ids(pofile.potemplate.id))
398+
399+ def test_get_potmsgset_ids_ignores_inactive_messages(self):
400+ pofile = self.factory.makePOFile()
401+ self.factory.makePOTMsgSet(
402+ potemplate=pofile.potemplate, sequence=0)
403+ self.assertContentEqual([], get_potmsgset_ids(pofile.potemplate.id))
404+
405+ def test_summarize_contributors_gets_contributors(self):
406+ pofile = self.factory.makePOFile()
407+ tm = self.factory.makeSuggestion(pofile=pofile)
408+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
409+ self.assertContentEqual(
410+ [tm.submitter.id],
411+ summarize_contributors(
412+ pofile.potemplate.id, pofile.language.id, potmsgset_ids))
413+
414+ def test_summarize_contributors_ignores_inactive_potmsgsets(self):
415+ pofile = self.factory.makePOFile()
416+ potmsgset = self.factory.makePOTMsgSet(
417+ potemplate=pofile.potemplate, sequence=0)
418+ self.factory.makeSuggestion(pofile=pofile, potmsgset=potmsgset)
419+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
420+ self.assertContentEqual(
421+ [],
422+ summarize_contributors(
423+ pofile.potemplate.id, pofile.language.id, potmsgset_ids))
424+
425+ def test_summarize_contributors_includes_diverged_msgs_for_template(self):
426+ pofile = self.factory.makePOFile()
427+ tm = self.factory.makeSuggestion(pofile=pofile)
428+ tm.potemplate = pofile.potemplate
429+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
430+ self.assertContentEqual(
431+ [tm.submitter.id],
432+ summarize_contributors(
433+ pofile.potemplate.id, pofile.language.id, potmsgset_ids))
434+
435+ def test_summarize_contributors_excludes_other_diverged_messages(self):
436+ pofile = self.factory.makePOFile()
437+ tm = self.factory.makeSuggestion(pofile=pofile)
438+ tm.potemplate = self.factory.makePOTemplate()
439+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
440+ self.assertContentEqual(
441+ [],
442+ summarize_contributors(
443+ pofile.potemplate.id, pofile.language.id, potmsgset_ids))
444+
445+ def test_get_contributions_gets_contributions(self):
446+ pofile = self.factory.makePOFile()
447+ tm = self.factory.makeSuggestion(pofile=pofile)
448+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
449+ self.assertEqual(
450+ {tm.submitter.id: tm.date_created},
451+ get_contributions(pofile, potmsgset_ids))
452+
453+ def test_get_contributions_uses_latest_contribution(self):
454+ pofile = self.factory.makePOFile()
455+ today = datetime.now(pytz.UTC)
456+ yesterday = today - timedelta(1, 1, 1)
457+ old_tm = self.factory.makeSuggestion(
458+ pofile=pofile, date_created=yesterday)
459+ new_tm = self.factory.makeSuggestion(
460+ translator=old_tm.submitter, pofile=pofile, date_created=today)
461+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
462+ self.assertNotEqual(old_tm.date_created, new_tm.date_created)
463+ self.assertContentEqual(
464+ [new_tm.date_created],
465+ get_contributions(pofile, potmsgset_ids).values())
466+
467+ def test_get_contributions_ignores_inactive_potmsgsets(self):
468+ pofile = self.factory.makePOFile()
469+ potmsgset = self.factory.makePOTMsgSet(
470+ potemplate=pofile.potemplate, sequence=0)
471+ self.factory.makeSuggestion(pofile=pofile, potmsgset=potmsgset)
472+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
473+ self.assertEqual({}, get_contributions(pofile, potmsgset_ids))
474+
475+ def test_get_contributions_includes_diverged_messages_for_template(self):
476+ pofile = self.factory.makePOFile()
477+ tm = self.factory.makeSuggestion(pofile=pofile)
478+ tm.potemplate = pofile.potemplate
479+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
480+ self.assertContentEqual(
481+ [tm.submitter.id], get_contributions(pofile, potmsgset_ids))
482+
483+ def test_get_contributions_excludes_other_diverged_messages(self):
484+ pofile = self.factory.makePOFile()
485+ tm = self.factory.makeSuggestion(pofile=pofile)
486+ tm.potemplate = self.factory.makePOTemplate()
487+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
488+ self.assertEqual({}, get_contributions(pofile, potmsgset_ids))
489+
490+ def test_get_pofiletranslators_gets_pofiletranslators_for_pofile(self):
491+ pofile = self.factory.makePOFile()
492+ tm = self.make_message_with_pofiletranslator(pofile)
493+ pofts = get_pofiletranslators(pofile.id)
494+ self.assertContentEqual([tm.submitter.id], pofts.keys())
495+ poft = pofts[tm.submitter.id]
496+ self.assertEqual(pofile, poft.pofile)
497+
498+ def test_scrub_pofile_leaves_good_pofiletranslator_in_place(self):
499+ pofile = self.factory.makePOFile()
500+ tm = self.make_message_with_pofiletranslator(pofile)
501+ old_poft = self.query_pofiletranslator(pofile, tm.submitter).one()
502+
503+ scrub_pofile(
504+ fake_logger, pofile.id, pofile.potemplate.id, pofile.language.id)
505+
506+ new_poft = self.query_pofiletranslator(pofile, tm.submitter).one()
507+ self.assertEqual(old_poft, new_poft)
508+
509+ def test_scrub_pofile_deletes_unwarranted_entries(self):
510+ # Deleting POFileTranslator records is not something the app
511+ # server ever does, so it requires special privileges.
512+ self.becomeDbUser('postgres')
513+ poft = self.make_pofiletranslator_without_message()
514+ (pofile, person) = (poft.pofile, poft.person)
515+ scrub_pofile(
516+ fake_logger, pofile.id, pofile.potemplate.id, pofile.language.id)
517+ self.assertIsNone(self.query_pofiletranslator(pofile, person).one())
518+
519+ def test_scrub_pofile_adds_missing_entries(self):
520+ pofile = self.factory.makePOFile()
521+ tm = self.make_message_without_pofiletranslator(pofile)
522+
523+ scrub_pofile(
524+ fake_logger, pofile.id, pofile.potemplate.id, pofile.language.id)
525+
526+ new_poft = self.query_pofiletranslator(pofile, tm.submitter).one()
527+ self.assertEqual(tm.submitter, new_poft.person)
528+ self.assertEqual(pofile, new_poft.pofile)
529+
530+ def test_tunable_loop(self):
531+ pofile = self.factory.makePOFile()
532+ tm = self.make_message_without_pofiletranslator(pofile)
533+ bad_poft = self.make_pofiletranslator_without_message(pofile)
534+ noncontributor = bad_poft.person
535+ transaction.commit()
536+
537+ self.becomeDbUser('garbo')
538+ ScrubPOFileTranslator(fake_logger).run()
539+ # Try to break the loop if it failed to commit its changes.
540+ transaction.abort()
541+
542+ # The missing POFileTranslator has been created.
543+ self.assertIsNotNone(
544+ self.query_pofiletranslator(pofile, tm.submitter).one())
545+ # The unwarranted POFileTranslator record has been deleted.
546+ self.assertIsNone(
547+ self.query_pofiletranslator(pofile, noncontributor).one())