Recife migration script unusably slow

Bug #682933 reported by Jeroen T. Vermeulen
Affects: Launchpad itself
Status: Fix Released
Importance: High
Assigned to: Jeroen T. Vermeulen

Bug Description

The migrate-current-flag script on staging has not completed even its first update (83 TMs) after more than an hour of runtime. We urgently need a dramatic speedup.

The problem seems to be that all our indexes are partial. The ideal index for this script would probably be one on TranslationMessage(potmsgset, language, potemplate) where is_current_upstream is true, and we have indexes covering all of that, but they're split up into separate partial indexes where potemplate is null and where potemplate is not null. An index on (potemplate, potmsgset, language) would probably have done just as well as two partial indexes, but not left us with the current problem.
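A rough SQL sketch of the difference (the index names and exact definitions here are illustrative guesses, not the actual Launchpad schema; only the column names come from the description above):

    -- Roughly what we have now: the useful columns are covered, but split
    -- across two partial indexes keyed on whether potemplate is NULL.
    CREATE INDEX tm__current_upstream__shared__idx
        ON TranslationMessage (potmsgset, language)
        WHERE is_current_upstream AND potemplate IS NULL;

    CREATE INDEX tm__current_upstream__diverged__idx
        ON TranslationMessage (potemplate, potmsgset, language)
        WHERE is_current_upstream AND potemplate IS NOT NULL;

    -- Roughly what this script would want: one index covering both the
    -- diverged and the shared case.
    CREATE INDEX tm__current_upstream__idx
        ON TranslationMessage (potmsgset, language, potemplate)
        WHERE is_current_upstream;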

As it is, we'll have to try to speed up the script without index changes.

Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

The real culprit may be Storm bug 682989.

Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

Got a test run prepared for Tom to execute in a few minutes.

tags: added: recife
Changed in rosetta:
status: New → In Progress
importance: Undecided → Critical
assignee: nobody → Jeroen T. Vermeulen (jtv)
milestone: none → 10.12
Changed in rosetta:
importance: Critical → High
tags: added: upstream-translations-sharing
removed: recife
Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

Working around the Storm bug did fix things, but the script is still not as fast as we'd like: it kicked off at a rate that suggested it would complete in a bit over 7 hours, but then went to sleep to accommodate replication lag (AIUI).

I expected a lot of the time to go into finding current translationmessages that need to be deactivated to "make room" for the newly activated ones, but it looks to be only a fraction of the time spent. It's not clear to me where the time does go—unless it's the updating and replication itself, in which case there's not much more we can do.
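For reference, the "make room" step amounts to something like this (an illustrative sketch with hypothetical ids, not the script's actual query), and it is exactly the kind of lookup the partial-index situation described above makes awkward:

    -- Deactivate whatever message currently holds the flag for the same
    -- potmsgset/language/potemplate slot as the message being activated.
    UPDATE TranslationMessage
    SET is_current_upstream = FALSE
    WHERE is_current_upstream
      AND potmsgset = 12345       -- hypothetical potmsgset id
      AND language = 67           -- hypothetical language id
      AND potemplate IS NULL      -- or "= <template id>" for a diverged message
      AND id <> 99999;            -- the message being promoted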

Revision history for this message
Данило Шеган (danilo) wrote :

It's OK if the first run of the script takes, e.g., the entire weekend. It will progressively have less data to process, and that's exactly what we need to aim for. Since it's DBLoopTuner-based, we do need to make sure that no slaves are being rebuilt at the time, because that would completely stall the script.

Do note that TranslationMessage constraints are slow to check, so that might be why updating is slow.

The first run can basically take whatever time there is before the rollout. If we start the script on Friday, that gives it 4 full days, which should be enough. Then, along with the rollout, we can do another, much shorter run while LP is read-only (we should time it before the rollout to make sure it runs in, e.g., less than 5 minutes, which I expect it will).
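As a rough illustration of what DBLoopTuner-based chunking means in practice: the script commits bounded batches, and the tuner sleeps between them when replication lag builds up. The batch below is only a sketch of that pattern; the flag, predicate, and ids are placeholders, not the script's real query.

    -- One chunk of a throttled run.  DBLoopTuner (in Python) drives many of
    -- these, adjusting the chunk size and sleeping while the slaves catch up.
    UPDATE TranslationMessage
    SET is_current_upstream = TRUE        -- placeholder for the flag being migrated
    WHERE id IN (
        SELECT id
        FROM TranslationMessage
        WHERE NOT is_current_upstream     -- placeholder "still to migrate" test
          AND id > 500000                 -- hypothetical start of this chunk
        ORDER BY id
        LIMIT 1000                        -- chunk size, tuned per iteration
    );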

Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

Thanks for the explanation. I was mostly disappointed with the performance (after the fix) because of your references to "a few minutes." Now I understand that that would just be an incremental "patch-up" run after a prior migration of the bulk of the data.

Revision history for this message
Launchpad QA Bot (lpqabot) wrote : Bug fixed by a commit
tags: added: qa-needstesting
Changed in rosetta:
status: In Progress → Fix Committed
Revision history for this message
Launchpad QA Bot (lpqabot) wrote :
tags: added: qa-ok
removed: qa-needstesting
Curtis Hovey (sinzui)
Changed in rosetta:
status: Fix Committed → Fix Released