> Jeroen T. Vermeulen wrote:
> > Actually this branch resolves more long-standing gripes with the
> > translations import auto-approver than just that one bug.
> 
> Was there a preimplementation call?

Not as such.  The approach is pretty straightforward and we've been
mulling some of these points for years.  I don't particularly see anything
that raises questions of approach or direction.  We did go into what I'm
doing in this branch a few times in our meetings.


> > == Other gripes ==
> >
> > The approver's LaunchpadCronScript still lived in the same module as the
> > script it was originally spliced out of.  Bit of a Zeus-and-Athena
> > scenario there.  I gave it its own module in lp.translations.scripts.
> 
> 
> > Also, the class in there now inherits from LaunchpadCronScript instead
> > of having it instantiated from a separate LaunchpadCronScript-derived
> > class in the main script file.
> 
> Do you think AutoApproveProcess is still a good name, since it's also
> responsible for other queue maintenance?  What about ProcessImportQueue
> or something?

It wasn't a good name when I first saw it, to be honest.  You're right, and
I'm changing this.  See below.


> > Unfortunately this did move some debug
> > output into the doctest; I switched from the FakeLogger to the
> > MockLogger so I could set a higher debug level and avoid this.
> >
> > You may noticed that where transaction managers are passed around, I
> > changed their names from ztm (Zopeless Transaction Manager) to txn.  We
> > don't use the ztm any more.  Actually txn is just an alias for the
> > transaction module, so the argument isn't needed at all now.  But this
> > is a relatively elegant way of telling a method whether you want it to
> > commit transactions.  Passing a boolean for "please commit" is just too
> > ugly.
> 
> But functionally equivalent?  (Would we ever pass in a different
> transaction manager?)

Functionally equivalent to a boolean, yes; if we ever pass a different
transaction manager then it'll be hidden somewhere under these interfaces
anyway.

Operational note: I've checked with Stuart and it seems that there are no
big replication problems to be expected from deleting our entire backlog of
Failed entries.  Otherwise we'd have to clean up the queue a bit before
this change could roll out.


> > === modified file 'cronscripts/rosetta-approve-imports.py'
> > --- cronscripts/rosetta-approve-imports.py    2009-07-17 00:26:05 +0000
> > +++ cronscripts/rosetta-approve-imports.py    2009-09-18 07:39:51 +0000
> > @@ -10,26 +10,10 @@
> >  import _pythonpath
> >
> >  from canonical.config import config
> > -from canonical.database.sqlbase import ISOLATION_LEVEL_READ_COMMITTED
> > -from lp.translations.scripts.po_import import AutoApproveProcess
> > -from lp.services.scripts.base import LaunchpadCronScript
> > -
> > -
> > -class RosettaImportApprover(LaunchpadCronScript):
> > -    def main(self):
> > -        self.txn.set_isolation_level(ISOLATION_LEVEL_READ_COMMITTED)
> > -        process = AutoApproveProcess(self.txn, self.logger)
> > -        self.logger.debug('Starting auto-approval of translation imports')
> > -        process.run()
> > -        self.logger.debug('Completed auto-approval of translation imports')
> > +from lp.translations.scripts.import_approval import AutoApproveProcess
> >
> >
> >  if __name__ == '__main__':
> > -    script = RosettaImportApprover('rosetta-approve-imports',
> > -        dbuser=config.poimport.dbuser)
> > -    script.lock_or_quit()
> > -    try:
> > -        script.run()
> > -    finally:
> > -        script.unlock()
> > -
> > +    script = AutoApproveProcess(
> > +        'rosetta-approve-imports', dbuser=config.poimport.dbuser)
> > +    script.lock_and_run()
> 
> It seems a bit odd to be wrapping the script in if __name__ = '__main__'
> blocks.  It can't be loaded as a module because of its name, and even if
> it could, I don't see any value in it.

It's standard practice; I believe some Python syntax checkers will fool
themselves into thinking the script has an importable name.  I know
that at least one—don't remember which—does its work by importing even
a main executable Python file as a module.


> > === modified file 'lib/lp/translations/model/translationimportqueue.py'
> > --- lib/lp/translations/model/translationimportqueue.py       2009-09-03
> 05:14:07 +0000
> > +++ lib/lp/translations/model/translationimportqueue.py       2009-09-18
> 14:11:35 +0000

> > @@ -61,9 +64,13 @@
> >  from lp.registry.interfaces.person import validate_public_person
> >
> >
> > -# Number of days when the DELETED and IMPORTED entries are removed from the
> > +# Number of days when entries with terminal statuses are removed from the
> >  # queue.
> > -DAYS_TO_KEEP = 3
> > +entry_gc_age = {
> > +    RosettaImportStatus.DELETED: datetime.timedelta(days=3),
> > +    RosettaImportStatus.IMPORTED: datetime.timedelta(days=3),
> > +    RosettaImportStatus.FAILED: datetime.timedelta(days=30),
> > +}
> 
> Does it make sense to specify a timedelta here?  If you always specify
> your durations in days, it's less repetitive convert to a timedelta in
> _cleanUpObsoleteEntries

I thought it was nice and explicit.  Storing number-of-days struck me as
unnecessarily C-like; timedeltas let me store a nondescript "time
interval" that you can just add to or subtract from dates.  And who
knows, maybe someday we'll want other kinds of intervals.


> If you're specifying a timedelta, shouldn't the comment say "Length of
> time" instead of "Number of days"?  Is FAILED really a terminal status?
>  It seems like it's possibly not terminal, and that's why you're
> delaying 30 days.

Absolutely.  I should have updated that part of the comment as well now.


> > @@ -1143,40 +1151,79 @@
> >                  # blocked, so we can block it too.
> >                  entry.setStatus(RosettaImportStatus.BLOCKED)
> >                  num_blocked += 1
> > -                if ztm is not None:
> > -                    # Do the commit to save the changes.
> > -                    ztm.commit()
> > +                if txn is not None:
> > +                    txn.commit()
> >
> >          return num_blocked
> >
> > +    def _cleanUpObsoleteEntries(self, store):
> > +        """Delete obsolete queue entries.
> > +
> > +        :param store: The Store to delete from.
> > +        :return: Number of entries deleted.
> > +        """
> > +        now = datetime.datetime.now(pytz.UTC)
> > +        deletion_criteria = False
> > +        for status, gc_age in entry_gc_age.iteritems():
> > +            cutoff = now - gc_age
> > +            deletion_criteria = Or(
> > +                deletion_criteria, And(
> > +                    TranslationImportQueueEntry.status == status,
> > +                    TranslationImportQueueEntry.date_status_changed <
> cutoff))
> > +
> > +        entries = store.find(TranslationImportQueueEntry,
> deletion_criteria)
> 
> I find the way the deletion_criteria is constructed a bit surprising.
> What do you think about building up a list of clauses first, and then
> ORing them?  For example:
> 
>         deletion_clauses = []
>         for status, gc_age in entry_gc_age.iteritems():
>             cutoff = now - gc_age
>             deletion_clauses.append(And(
>                     TranslationImportQueueEntry.status == status,
>                     TranslationImportQueueEntry.date_status_changed <
>                     cutoff))
>         entries = store.find(
>             TranslationImportQueueEntry, Or(*deletion_clauses))

I considered that and felt that it was a toss-up.  But I'm used my ways
of doing things, and now that you mention it, your way is clearer.
Changed.


> > === added file 'lib/lp/translations/scripts/import_approval.py'
> 
> Since this does more than approving imports, is import_approval a good
> name?  Maybe queue_gardener?

I renamed it import_queue_gardener.py, and the class ImportQueueGardener.

While I was at it I also gave the script its own database user, which
requires a lot fewer database privileges than the original one.

Attaching incremental diff.


Jeroen