Merge lp:~jtv/launchpad/bug-994650-cache-potmsgsets into lp:launchpad

Proposed by Jeroen T. Vermeulen on 2012-05-09
Status: Merged
Approved by: Jeroen T. Vermeulen on 2012-05-11
Approved revision: no longer in the source branch.
Merged at revision: 15299
Proposed branch: lp:~jtv/launchpad/bug-994650-cache-potmsgsets
Merge into: lp:launchpad
Prerequisite: lp:~jtv/launchpad/bug-994650-scrub-in-batches
Diff against target: 637 lines (+599/-0)
4 files modified
database/schema/security.cfg (+1/-0)
lib/lp/scripts/garbo.py (+4/-0)
lib/lp/translations/scripts/scrub_pofiletranslator.py (+281/-0)
lib/lp/translations/scripts/tests/test_scrub_pofiletranslator.py (+313/-0)
To merge this branch: bzr merge lp:~jtv/launchpad/bug-994650-cache-potmsgsets
Reviewer Review Type Date Requested Status
j.c.sackett (community) 2012-05-09 Approve on 2012-05-10
Review via email: mp+105193@code.launchpad.net

Commit Message

Cache POTMsgSets per template in POFileTranslator scrubber.

Description of the Change

This is one of two further optimizations to the POFileTranslator scrubber as mentioned in this MP: https://code.launchpad.net/~jtv/launchpad/bug-994650-scrub-faster

It's a very simple change. For each POFile that the scrubbing loop processes, it needs to know the ids of all POTMsgSets that participate in the template. But that set may well be the same for two or more POFiles in the batch, so this caches them.

There is no caching across batches, because one batch is handled in one transaction, and a concurrent template import could update the set of POTMsgSets participating in a template between transactions.

To test, use my combined branch which also includes schema changes: lp:~jtv/launchpad/combined-async-pofiletranslator

Run this test:
{{{
./bin/test -vvc lp.translations.scripts.tests.test_scrub_pofiletranslator
}}}

No lint.

Jeroen

To post a comment you must log in.
j.c.sackett (jcsackett) wrote :

Jeroen--

Looks good. One suggestion:

Instead of

13 + if template_id not in cached_potmsgsets:
14 + cached_potmsgsets[template_id] = get_potmsgset_ids(template_id)
15 + potmsgset_ids = cached_potmsgsets[template_id]

I think you can just do

13 + potmsgset_ids = cached_potmsgsets.setdefault(template_id, get_potmsgset_ids(template_id))

dict.setdefault returns the value if the key exists, otherwise it function like dict.get(), returning the second parameter, but *also* setting it for the supplied key.

It's really just an LoC reduction, so not a major improvement, but it's a bit cleaner than the if clause set up. I'll leave it to you whether to make that change or not. I'll leave it to you whether or not you want to implement the change.

It doesn't look like this total arc of work reduces LoC, but I'm not familiar enough with everything you've been doing in the LP code base lately to know if you've got credit or not. I don't want to block a fix for a critical, so I'll just leave this as a reminder to please try and reduce the overall LoC going forward if you haven't already.

Thanks!

review: Approve
Francis J. Lacoste (flacoste) wrote :

On 12-05-10 11:13 AM, j.c.sackett wrote:
> It doesn't look like this total arc of work reduces LoC, but I'm not familiar enough with everything you've been doing in the LP code base lately to know if you've got credit or not. I don't want to block a fix for a critical, so I'll just leave this as a reminder to please try and reduce the overall LoC going forward if you haven't already.

Jeroen gets a waiver because even though this doesn't reduce LoC, it
does reduce maintenance burden a lot, as it will allow us to open new
translation series without the multi-day fuss that is happening right
now :-)

--
Francis J. Lacoste
<email address hidden>

Jeroen T. Vermeulen (jtv) wrote :

Thanks for the review! I was wondering when the LoC thing would come up. :) In defense of Francis' note, not entirely without self-interest, I'm also obviating considerable command-lining and human-scripting (“disable X, click Y”) for every Ubuntu cycle.

> Looks good. One suggestion:
>
> Instead of
>
> 13 + if template_id not in cached_potmsgsets:
> 14 + cached_potmsgsets[template_id] = get_potmsgset_ids(template_id)
> 15 + potmsgset_ids = cached_potmsgsets[template_id]
>
> I think you can just do
>
> 13 + potmsgset_ids = cached_potmsgsets.setdefault(template_id,
> get_potmsgset_ids(template_id))

That would evaluate get_potmsgset_ids(template_id) before the setdefault() call. There's not a very great distance between the call constructing a query that will be executed only when needed, and it also executing the query in order to return some other data structure. So it would make the caching a bit brittle!

Does make me wonder if dict.setdefault shouldn't accept a callable plus arguments so that it can evaluate lazily.

Jeroen

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'database/schema/security.cfg'
2--- database/schema/security.cfg 2012-05-09 13:50:03 +0000
3+++ database/schema/security.cfg 2012-05-24 05:53:33 +0000
4@@ -2181,6 +2181,7 @@
5 public.openidconsumerassociation = SELECT, DELETE
6 public.openidconsumernonce = SELECT, DELETE
7 public.person = SELECT, DELETE
8+public.pofiletranslator = SELECT, INSERT, UPDATE, DELETE
9 public.potranslation = SELECT, DELETE
10 public.potmsgset = SELECT, DELETE
11 public.product = SELECT
12
13=== modified file 'lib/lp/scripts/garbo.py'
14--- lib/lp/scripts/garbo.py 2012-05-09 01:35:41 +0000
15+++ lib/lp/scripts/garbo.py 2012-05-24 05:53:33 +0000
16@@ -105,6 +105,9 @@
17 from lp.translations.model.translationtemplateitem import (
18 TranslationTemplateItem,
19 )
20+from lp.translations.scripts.scrub_pofiletranslator import (
21+ ScrubPOFileTranslator,
22+ )
23
24
25 ONE_DAY_IN_SECONDS = 24 * 60 * 60
26@@ -1418,6 +1421,7 @@
27 ObsoleteBugAttachmentPruner,
28 OldTimeLimitedTokenDeleter,
29 RevisionAuthorEmailLinker,
30+ ScrubPOFileTranslator,
31 SuggestiveTemplatesCacheUpdater,
32 POTranslationPruner,
33 UnusedPOTMsgSetPruner,
34
35=== added file 'lib/lp/translations/scripts/scrub_pofiletranslator.py'
36--- lib/lp/translations/scripts/scrub_pofiletranslator.py 1970-01-01 00:00:00 +0000
37+++ lib/lp/translations/scripts/scrub_pofiletranslator.py 2012-05-24 05:53:33 +0000
38@@ -0,0 +1,281 @@
39+# Copyright 2012 Canonical Ltd. This software is licensed under the
40+# GNU Affero General Public License version 3 (see the file LICENSE).
41+
42+"""Keep `POFileTranslator` more or less consistent with the real data."""
43+
44+__metaclass__ = type
45+__all__ = [
46+ 'ScrubPOFileTranslator',
47+ ]
48+
49+from collections import namedtuple
50+
51+from storm.expr import (
52+ Coalesce,
53+ Desc,
54+ )
55+import transaction
56+
57+from lp.registry.model.distribution import Distribution
58+from lp.registry.model.distroseries import DistroSeries
59+from lp.registry.model.product import Product
60+from lp.registry.model.productseries import ProductSeries
61+from lp.services.database.bulk import (
62+ load,
63+ load_related,
64+ )
65+from lp.services.database.lpstorm import IStore
66+from lp.services.looptuner import TunableLoop
67+from lp.services.worlddata.model.language import Language
68+from lp.translations.model.pofile import POFile
69+from lp.translations.model.pofiletranslator import POFileTranslator
70+from lp.translations.model.potemplate import POTemplate
71+from lp.translations.model.translationmessage import TranslationMessage
72+from lp.translations.model.translationtemplateitem import (
73+ TranslationTemplateItem,
74+ )
75+
76+
77+def get_pofile_ids():
78+ """Retrieve ids of POFiles to scrub.
79+
80+ The result's ordering is aimed at maximizing cache effectiveness:
81+ by POTemplate name for locality of shared POTMsgSets, and by language
82+ for locality of shared TranslationMessages.
83+ """
84+ store = IStore(POFile)
85+ query = store.find(
86+ POFile.id,
87+ POFile.potemplateID == POTemplate.id,
88+ POTemplate.iscurrent == True)
89+ return query.order_by(POTemplate.name, POFile.languageID)
90+
91+
92+def summarize_pofiles(pofile_ids):
93+ """Retrieve relevant parts of `POFile`s with given ids.
94+
95+ This gets just enough information to determine whether any of the
96+ `POFile`s need their `POFileTranslator` records fixed.
97+
98+ :param pofile_ids: Iterable of `POFile` ids.
99+ :return: Dict mapping each id in `pofile_ids` to a duple of
100+ `POTemplate` id and `Language` id for the associated `POFile`.
101+ """
102+ store = IStore(POFile)
103+ rows = store.find(
104+ (POFile.id, POFile.potemplateID, POFile.languageID),
105+ POFile.id.is_in(pofile_ids))
106+ return dict((row[0], row[1:]) for row in rows)
107+
108+
109+def get_potmsgset_ids(potemplate_id):
110+ """Get the ids for each current `POTMsgSet` in a `POTemplate`."""
111+ store = IStore(POTemplate)
112+ return store.find(
113+ TranslationTemplateItem.potmsgsetID,
114+ TranslationTemplateItem.potemplateID == potemplate_id,
115+ TranslationTemplateItem.sequence > 0)
116+
117+
118+def summarize_contributors(potemplate_id, language_id, potmsgset_ids):
119+ """Return the set of ids of persons who contributed to a `POFile`.
120+
121+ This is a limited version of `get_contributions` that is easier to
122+ compute.
123+ """
124+ store = IStore(POFile)
125+ contribs = store.find(
126+ TranslationMessage.submitterID,
127+ TranslationMessage.potmsgsetID.is_in(potmsgset_ids),
128+ TranslationMessage.languageID == language_id,
129+ TranslationMessage.msgstr0 != None,
130+ Coalesce(TranslationMessage.potemplateID, potemplate_id) ==
131+ potemplate_id)
132+ contribs.config(distinct=True)
133+ return set(contribs)
134+
135+
136+def get_contributions(pofile, potmsgset_ids):
137+ """Map all users' most recent contributions to a `POFile`.
138+
139+ Returns a dict mapping `Person` id to the creation time of their most
140+ recent `TranslationMessage` in `POFile`.
141+
142+ This leaves some small room for error: a contribution that is masked by
143+ a diverged entry in this POFile will nevertheless produce a
144+ POFileTranslator record. Fixing that would complicate the work more than
145+ it is probably worth.
146+
147+ :param pofile: The `POFile` to find contributions for.
148+ :param potmsgset_ids: The ids of the `POTMsgSet`s to look for, as returned
149+ by `get_potmsgset_ids`.
150+ """
151+ store = IStore(pofile)
152+ language_id = pofile.language.id
153+ template_id = pofile.potemplate.id
154+ contribs = store.find(
155+ (TranslationMessage.submitterID, TranslationMessage.date_created),
156+ TranslationMessage.potmsgsetID.is_in(potmsgset_ids),
157+ TranslationMessage.languageID == language_id,
158+ TranslationMessage.msgstr0 != None,
159+ Coalesce(TranslationMessage.potemplateID, template_id) ==
160+ template_id)
161+ contribs = contribs.config(distinct=(TranslationMessage.submitterID,))
162+ contribs = contribs.order_by(
163+ TranslationMessage.submitterID, Desc(TranslationMessage.date_created))
164+ return dict(contribs)
165+
166+
167+def get_pofiletranslators(pofile_id):
168+ """Get `Person` ids from `POFileTranslator` entries for a `POFile`.
169+
170+ Returns a `set` of `Person` ids.
171+ """
172+ store = IStore(POFileTranslator)
173+ return set(store.find(
174+ POFileTranslator.personID,
175+ POFileTranslator.pofileID == pofile_id))
176+
177+
178+def remove_pofiletranslators(logger, pofile, person_ids):
179+ """Delete `POFileTranslator` records."""
180+ logger.debug(
181+ "Removing %d POFileTranslator(s) for %s.",
182+ len(person_ids), pofile.title)
183+ store = IStore(pofile)
184+ pofts = store.find(
185+ POFileTranslator,
186+ POFileTranslator.pofileID == pofile.id,
187+ POFileTranslator.personID.is_in(person_ids))
188+ pofts.remove()
189+
190+
191+def remove_unwarranted_pofiletranslators(logger, pofile, pofts, contribs):
192+ """Delete `POFileTranslator` records that shouldn't be there."""
193+ excess = pofts - set(contribs)
194+ if len(excess) > 0:
195+ remove_pofiletranslators(logger, pofile, excess)
196+
197+
198+def create_missing_pofiletranslators(logger, pofile, pofts, contribs):
199+ """Create `POFileTranslator` records that were missing."""
200+ shortage = set(contribs) - pofts
201+ if len(shortage) == 0:
202+ return
203+ logger.debug(
204+ "Adding %d POFileTranslator(s) for %s.",
205+ len(shortage), pofile.title)
206+ store = IStore(pofile)
207+ for missing_contributor in shortage:
208+ store.add(POFileTranslator(
209+ pofile=pofile, personID=missing_contributor,
210+ date_last_touched=contribs[missing_contributor]))
211+
212+
213+def fix_pofile(logger, pofile, potmsgset_ids, pofiletranslators):
214+ """This `POFile` needs fixing. Load its data & fix it."""
215+ contribs = get_contributions(pofile, potmsgset_ids)
216+ remove_unwarranted_pofiletranslators(
217+ logger, pofile, pofiletranslators, contribs)
218+ create_missing_pofiletranslators(
219+ logger, pofile, pofiletranslators, contribs)
220+
221+
222+def needs_fixing(template_id, language_id, potmsgset_ids, pofiletranslators):
223+ """Does the `POFile` with given details need `POFileTranslator` changes?
224+
225+ :param template_id: id of the `POTemplate` for the `POFile`.
226+ :param language_id: id of the `Language` the `POFile` translates to.
227+ :param potmsgset_ids: ids of the `POTMsgSet` items participating in the
228+ template.
229+ :param pofiletranslators: `POFileTranslator` objects for the `POFile`.
230+ :return: Bool: does the existing set of `POFileTranslator` need fixing?
231+ """
232+ contributors = summarize_contributors(
233+ template_id, language_id, potmsgset_ids)
234+ return pofiletranslators != set(contributors)
235+
236+
237+# A tuple describing a POFile that needs its POFileTranslators fixed.
238+WorkItem = namedtuple("WorkItem", [
239+ 'pofile_id',
240+ 'potmsgset_ids',
241+ 'pofiletranslators',
242+ ])
243+
244+
245+def gather_work_items(pofile_ids):
246+ """Produce `WorkItem`s for those `POFile`s that need fixing.
247+
248+ :param pofile_ids: An iterable of `POFile` ids to check.
249+ :param pofile_summaries: Dict as returned by `summarize_pofiles`.
250+ :return: A sequence of `WorkItem`s for those `POFile`s that need fixing.
251+ """
252+ pofile_summaries = summarize_pofiles(pofile_ids)
253+ cached_potmsgsets = {}
254+ work_items = []
255+ for pofile_id in pofile_ids:
256+ template_id, language_id = pofile_summaries[pofile_id]
257+ if template_id not in cached_potmsgsets:
258+ cached_potmsgsets[template_id] = get_potmsgset_ids(template_id)
259+ potmsgset_ids = cached_potmsgsets[template_id]
260+ pofts = get_pofiletranslators(pofile_id)
261+ if needs_fixing(template_id, language_id, potmsgset_ids, pofts):
262+ work_items.append(WorkItem(pofile_id, potmsgset_ids, pofts))
263+
264+ return work_items
265+
266+
267+def preload_work_items(work_items):
268+ """Bulk load data that will be needed to process `work_items`.
269+
270+ :param work_items: A sequence of `WorkItem` records.
271+ :return: A dict mapping `POFile` ids from `work_items` to their
272+ respective `POFile` objects.
273+ """
274+ pofiles = load(POFile, [work_item.pofile_id for work_item in work_items])
275+ load_related(Language, pofiles, ['languageID'])
276+ templates = load_related(POTemplate, pofiles, ['potemplateID'])
277+ distroseries = load_related(DistroSeries, templates, ['distroseriesID'])
278+ load_related(Distribution, distroseries, ['distributionID'])
279+ productseries = load_related(
280+ ProductSeries, templates, ['productseriesID'])
281+ load_related(Product, productseries, ['productID'])
282+ return dict((pofile.id, pofile) for pofile in pofiles)
283+
284+
285+def process_work_items(logger, work_items, pofiles):
286+ """Fix the `POFileTranslator` records covered by `work_items`."""
287+ for work_item in work_items:
288+ pofile = pofiles[work_item.pofile_id]
289+ fix_pofile(
290+ logger, pofile, work_item.potmsgset_ids,
291+ work_item.pofiletranslators)
292+
293+
294+class ScrubPOFileTranslator(TunableLoop):
295+ """Tunable loop, meant for running from inside Garbo."""
296+
297+ maximum_chunk_size = 500
298+
299+ def __init__(self, *args, **kwargs):
300+ super(ScrubPOFileTranslator, self).__init__(*args, **kwargs)
301+ self.pofile_ids = tuple(get_pofile_ids())
302+ self.next_offset = 0
303+
304+ def __call__(self, chunk_size):
305+ """See `ITunableLoop`."""
306+ start_offset = self.next_offset
307+ self.next_offset = start_offset + int(chunk_size)
308+ batch = self.pofile_ids[start_offset:self.next_offset]
309+ if len(batch) == 0:
310+ self.next_offset = None
311+ else:
312+ work_items = gather_work_items(batch)
313+ pofiles = preload_work_items(work_items)
314+ process_work_items(self.log, work_items, pofiles)
315+ transaction.commit()
316+
317+ def isDone(self):
318+ """See `ITunableLoop`."""
319+ return self.next_offset is None
320
321=== added file 'lib/lp/translations/scripts/tests/test_scrub_pofiletranslator.py'
322--- lib/lp/translations/scripts/tests/test_scrub_pofiletranslator.py 1970-01-01 00:00:00 +0000
323+++ lib/lp/translations/scripts/tests/test_scrub_pofiletranslator.py 2012-05-24 05:53:33 +0000
324@@ -0,0 +1,313 @@
325+# Copyright 2012 Canonical Ltd. This software is licensed under the
326+# GNU Affero General Public License version 3 (see the file LICENSE).
327+
328+"""Test scrubbing of `POFileTranslator`."""
329+
330+__metaclass__ = type
331+
332+from datetime import (
333+ datetime,
334+ timedelta,
335+ )
336+
337+import pytz
338+import transaction
339+
340+from lp.services.database.constants import UTC_NOW
341+from lp.services.database.lpstorm import IStore
342+from lp.services.log.logger import DevNullLogger
343+from lp.testing import TestCaseWithFactory
344+from lp.testing.layers import ZopelessDatabaseLayer
345+from lp.translations.model.pofiletranslator import POFileTranslator
346+from lp.translations.scripts.scrub_pofiletranslator import (
347+ fix_pofile,
348+ gather_work_items,
349+ get_contributions,
350+ get_pofile_ids,
351+ get_pofiletranslators,
352+ get_potmsgset_ids,
353+ ScrubPOFileTranslator,
354+ summarize_contributors,
355+ summarize_pofiles,
356+ )
357+
358+
359+fake_logger = DevNullLogger()
360+
361+
362+def size_distance(sequence, item1, item2):
363+ """Return the absolute distance between items in a sequence."""
364+ container = list(sequence)
365+ return abs(container.index(item2) - container.index(item1))
366+
367+
368+class TestScrubPOFileTranslator(TestCaseWithFactory):
369+
370+ layer = ZopelessDatabaseLayer
371+
372+ def query_pofiletranslator(self, pofile, person):
373+ """Query `POFileTranslator` for a specific record.
374+
375+ :return: Storm result set.
376+ """
377+ store = IStore(pofile)
378+ return store.find(POFileTranslator, pofile=pofile, person=person)
379+
380+ def make_message_with_pofiletranslator(self, pofile=None):
381+ """Create a normal `TranslationMessage` with `POFileTranslator`."""
382+ if pofile is None:
383+ pofile = self.factory.makePOFile()
384+ potmsgset = self.factory.makePOTMsgSet(
385+ potemplate=pofile.potemplate,
386+ sequence=self.factory.getUniqueInteger())
387+ # A database trigger on TranslationMessage automatically creates
388+ # a POFileTranslator record for each new TranslationMessage.
389+ return self.factory.makeSuggestion(pofile=pofile, potmsgset=potmsgset)
390+
391+ def make_message_without_pofiletranslator(self, pofile=None):
392+ """Create a `TranslationMessage` without `POFileTranslator`."""
393+ tm = self.make_message_with_pofiletranslator(pofile)
394+ IStore(pofile).flush()
395+ self.becomeDbUser('postgres')
396+ self.query_pofiletranslator(pofile, tm.submitter).remove()
397+ return tm
398+
399+ def make_pofiletranslator_without_message(self, pofile=None):
400+ """Create a `POFileTranslator` without `TranslationMessage`."""
401+ if pofile is None:
402+ pofile = self.factory.makePOFile()
403+ poft = POFileTranslator(
404+ pofile=pofile, person=self.factory.makePerson(),
405+ date_last_touched=UTC_NOW)
406+ IStore(poft.pofile).add(poft)
407+ return poft
408+
409+ def test_get_pofile_ids_gets_pofiles_for_active_templates(self):
410+ pofile = self.factory.makePOFile()
411+ self.assertIn(pofile.id, get_pofile_ids())
412+
413+ def test_get_pofile_ids_skips_inactive_templates(self):
414+ pofile = self.factory.makePOFile()
415+ pofile.potemplate.iscurrent = False
416+ self.assertNotIn(pofile.id, get_pofile_ids())
417+
418+ def test_get_pofile_ids_clusters_by_template_name(self):
419+ # POFiles for templates with the same name are bunched together
420+ # in the get_pofile_ids() output.
421+ templates = [
422+ self.factory.makePOTemplate(name='shared'),
423+ self.factory.makePOTemplate(name='other'),
424+ self.factory.makePOTemplate(name='andanother'),
425+ self.factory.makePOTemplate(
426+ name='shared', distroseries=self.factory.makeDistroSeries()),
427+ ]
428+ pofiles = [
429+ self.factory.makePOFile(potemplate=template)
430+ for template in templates]
431+ ordering = get_pofile_ids()
432+ self.assertEqual(
433+ 1, size_distance(ordering, pofiles[0].id, pofiles[-1].id))
434+
435+ def test_get_pofile_ids_clusters_by_language(self):
436+ # POFiles for sharing templates and the same language are
437+ # bunched together in the get_pofile_ids() output.
438+ templates = [
439+ self.factory.makePOTemplate(
440+ name='shared', distroseries=self.factory.makeDistroSeries())
441+ for counter in range(2)]
442+ # POFiles per language & template. We create these in a strange
443+ # way to avoid the risk of mistaking accidental orderings such
444+ # as per-id from being mistaken for the proper order.
445+ languages = ['nl', 'fr']
446+ pofiles_per_language = dict((language, []) for language in languages)
447+ for language, pofiles in pofiles_per_language.items():
448+ for template in templates:
449+ pofiles.append(
450+ self.factory.makePOFile(language, potemplate=template))
451+
452+ ordering = get_pofile_ids()
453+ for pofiles in pofiles_per_language.values():
454+ self.assertEqual(
455+ 1, size_distance(ordering, pofiles[0].id, pofiles[1].id))
456+
457+ def test_summarize_pofiles_maps_id_to_template_and_language_ids(self):
458+ pofile = self.factory.makePOFile()
459+ self.assertEqual(
460+ {pofile.id: (pofile.potemplate.id, pofile.language.id)},
461+ summarize_pofiles([pofile.id]))
462+
463+ def test_get_potmsgset_ids_returns_potmsgset_ids(self):
464+ pofile = self.factory.makePOFile()
465+ potmsgset = self.factory.makePOTMsgSet(
466+ potemplate=pofile.potemplate, sequence=1)
467+ self.assertContentEqual(
468+ [potmsgset.id], get_potmsgset_ids(pofile.potemplate.id))
469+
470+ def test_get_potmsgset_ids_ignores_inactive_messages(self):
471+ pofile = self.factory.makePOFile()
472+ self.factory.makePOTMsgSet(
473+ potemplate=pofile.potemplate, sequence=0)
474+ self.assertContentEqual([], get_potmsgset_ids(pofile.potemplate.id))
475+
476+ def test_summarize_contributors_gets_contributors(self):
477+ pofile = self.factory.makePOFile()
478+ tm = self.factory.makeSuggestion(pofile=pofile)
479+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
480+ self.assertContentEqual(
481+ [tm.submitter.id],
482+ summarize_contributors(
483+ pofile.potemplate.id, pofile.language.id, potmsgset_ids))
484+
485+ def test_summarize_contributors_ignores_inactive_potmsgsets(self):
486+ pofile = self.factory.makePOFile()
487+ potmsgset = self.factory.makePOTMsgSet(
488+ potemplate=pofile.potemplate, sequence=0)
489+ self.factory.makeSuggestion(pofile=pofile, potmsgset=potmsgset)
490+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
491+ self.assertContentEqual(
492+ [],
493+ summarize_contributors(
494+ pofile.potemplate.id, pofile.language.id, potmsgset_ids))
495+
496+ def test_summarize_contributors_includes_diverged_msgs_for_template(self):
497+ pofile = self.factory.makePOFile()
498+ tm = self.factory.makeSuggestion(pofile=pofile)
499+ tm.potemplate = pofile.potemplate
500+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
501+ self.assertContentEqual(
502+ [tm.submitter.id],
503+ summarize_contributors(
504+ pofile.potemplate.id, pofile.language.id, potmsgset_ids))
505+
506+ def test_summarize_contributors_excludes_other_diverged_messages(self):
507+ pofile = self.factory.makePOFile()
508+ tm = self.factory.makeSuggestion(pofile=pofile)
509+ tm.potemplate = self.factory.makePOTemplate()
510+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
511+ self.assertContentEqual(
512+ [],
513+ summarize_contributors(
514+ pofile.potemplate.id, pofile.language.id, potmsgset_ids))
515+
516+ def test_get_contributions_gets_contributions(self):
517+ pofile = self.factory.makePOFile()
518+ tm = self.factory.makeSuggestion(pofile=pofile)
519+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
520+ self.assertEqual(
521+ {tm.submitter.id: tm.date_created},
522+ get_contributions(pofile, potmsgset_ids))
523+
524+ def test_get_contributions_uses_latest_contribution(self):
525+ pofile = self.factory.makePOFile()
526+ today = datetime.now(pytz.UTC)
527+ yesterday = today - timedelta(1, 1, 1)
528+ old_tm = self.factory.makeSuggestion(
529+ pofile=pofile, date_created=yesterday)
530+ new_tm = self.factory.makeSuggestion(
531+ translator=old_tm.submitter, pofile=pofile, date_created=today)
532+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
533+ self.assertNotEqual(old_tm.date_created, new_tm.date_created)
534+ self.assertContentEqual(
535+ [new_tm.date_created],
536+ get_contributions(pofile, potmsgset_ids).values())
537+
538+ def test_get_contributions_ignores_inactive_potmsgsets(self):
539+ pofile = self.factory.makePOFile()
540+ potmsgset = self.factory.makePOTMsgSet(
541+ potemplate=pofile.potemplate, sequence=0)
542+ self.factory.makeSuggestion(pofile=pofile, potmsgset=potmsgset)
543+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
544+ self.assertEqual({}, get_contributions(pofile, potmsgset_ids))
545+
546+ def test_get_contributions_includes_diverged_messages_for_template(self):
547+ pofile = self.factory.makePOFile()
548+ tm = self.factory.makeSuggestion(pofile=pofile)
549+ tm.potemplate = pofile.potemplate
550+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
551+ self.assertContentEqual(
552+ [tm.submitter.id], get_contributions(pofile, potmsgset_ids))
553+
554+ def test_get_contributions_excludes_other_diverged_messages(self):
555+ pofile = self.factory.makePOFile()
556+ tm = self.factory.makeSuggestion(pofile=pofile)
557+ tm.potemplate = self.factory.makePOTemplate()
558+ potmsgset_ids = get_potmsgset_ids(pofile.potemplate.id)
559+ self.assertEqual({}, get_contributions(pofile, potmsgset_ids))
560+
561+ def test_get_pofiletranslators_gets_translators_for_pofile(self):
562+ pofile = self.factory.makePOFile()
563+ tm = self.make_message_with_pofiletranslator(pofile)
564+ self.assertContentEqual(
565+ [tm.submitter.id], get_pofiletranslators(pofile.id))
566+
567+ def test_fix_pofile_leaves_good_pofiletranslator_in_place(self):
568+ pofile = self.factory.makePOFile()
569+ tm = self.make_message_with_pofiletranslator(pofile)
570+ old_poft = self.query_pofiletranslator(pofile, tm.submitter).one()
571+
572+ fix_pofile(
573+ fake_logger, pofile, [tm.potmsgset.id], set([tm.submitter.id]))
574+
575+ new_poft = self.query_pofiletranslator(pofile, tm.submitter).one()
576+ self.assertEqual(old_poft, new_poft)
577+
578+ def test_fix_pofile_deletes_unwarranted_entries(self):
579+ # Deleting POFileTranslator records is not something the app
580+ # server ever does, so it requires special privileges.
581+ self.becomeDbUser('postgres')
582+ poft = self.make_pofiletranslator_without_message()
583+ (pofile, person) = (poft.pofile, poft.person)
584+ fix_pofile(fake_logger, pofile, [], set([person.id]))
585+ self.assertIsNone(self.query_pofiletranslator(pofile, person).one())
586+
587+ def test_fix_pofile_adds_missing_entries(self):
588+ pofile = self.factory.makePOFile()
589+ tm = self.make_message_without_pofiletranslator(pofile)
590+
591+ fix_pofile(fake_logger, pofile, [tm.potmsgset.id], set())
592+
593+ new_poft = self.query_pofiletranslator(pofile, tm.submitter).one()
594+ self.assertEqual(tm.submitter, new_poft.person)
595+ self.assertEqual(pofile, new_poft.pofile)
596+
597+ def test_gather_work_items_caches_potmsgset_ids_for_same_template(self):
598+ template = self.factory.makePOTemplate()
599+ pofiles = [
600+ self.factory.makePOFile(potemplate=template)
601+ for counter in range(2)]
602+ for pofile in pofiles:
603+ self.make_message_without_pofiletranslator(pofile)
604+ work_items = gather_work_items([pofile.id for pofile in pofiles])
605+ # The potmsgset_ids entries are references to one and the same
606+ # object.
607+ self.assertIs(
608+ work_items[0].potmsgset_ids, work_items[1].potmsgset_ids)
609+
610+ def test_gather_work_items_does_not_cache_across_templates(self):
611+ pofiles = [self.factory.makePOFile() for counter in range(2)]
612+ for pofile in pofiles:
613+ self.make_message_without_pofiletranslator(pofile)
614+ work_items = gather_work_items([pofile.id for pofile in pofiles])
615+ # The POFiles are for different templates, so they do not share
616+ # the same potmsgset_ids.
617+ self.assertNotEqual(
618+ work_items[0].potmsgset_ids, work_items[1].potmsgset_ids)
619+
620+ def test_tunable_loop(self):
621+ pofile = self.factory.makePOFile()
622+ tm = self.make_message_without_pofiletranslator(pofile)
623+ bad_poft = self.make_pofiletranslator_without_message(pofile)
624+ noncontributor = bad_poft.person
625+ transaction.commit()
626+
627+ self.becomeDbUser('garbo')
628+ ScrubPOFileTranslator(fake_logger).run()
629+ # Try to break the loop if it failed to commit its changes.
630+ transaction.abort()
631+
632+ # An unwarranted POFileTranslator record has been deleted.
633+ self.assertIsNotNone(
634+ self.query_pofiletranslator(pofile, tm.submitter).one())
635+ # A missing POFileTranslator has been created.
636+ self.assertIsNone(
637+ self.query_pofiletranslator(pofile, noncontributor).one())