Person:+delete timeouts : Person merging needs to be done asynchronously

Bug #162510 reported by Christian Reis
This bug affects 5 people
Affects: Launchpad itself
Status: Fix Released
Importance: Critical
Assigned to: Gavin Panella

Bug Description

The Person merge code is robust but will time out for records with lots of references to update.

The best way of addressing this is to queue merges to be done asynchronously.

This should be done for both user requested and admin merges.

See bug 728471 about the UI work needed to solve this issue.

Diogo Matsubara (matsubara) wrote :

OOPS-759D2129 might be related. POSubmission and POMsgSet tables are gone since the Translations DB refactoring, and TranslationMessage might have inherited the problem.

Stuart Bishop (stub) wrote : Re: [Bug 162510] Merging people times out updating POSubmission and POMsgSet

Christian Reis wrote:
> Public bug reported:
>
> When merging two users on production, I ran into consistent timeouts on
> these two queries:
>
> UPDATE POSubmission SET person=1457193 WHERE person=299986
> (2218 rows)
>
> UPDATE pomsgset SET reviewer=1457193 WHERE reviewer=299986
> (1473 rows)
>
> I'm wondering if the DB schema change that will be done to fix bug 30602
> will address this.

With the current database design I think we have to accept that there will
always be some cases where the people merge will fail. We could fix this
with a fairly radical db schema refactoring, adding a layer of indirection,
but I'm not sure that would be a step forward.

I think we need to customize the timeout page for people merge, so if it
times out the user is requested to open a support request so we can do it
manually.

--
Stuart Bishop <email address hidden> http://www.canonical.com/
Canonical Ltd. http://www.ubuntu.com/

Stuart Bishop (stub) wrote :

On 2/1/08, Stuart Bishop <email address hidden> wrote:
>
> I think we need to customize the timeout page for people merge, so if it
> times out the user is requested to open a support request so we can do it
> manually.

Or perhaps just put the merge requests in a queue and process them
asynchronously.
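
A minimal sketch of the queueing idea, for illustration only. `MergeJob`, `MergeQueue`, and the status strings are hypothetical names, not Launchpad's actual job system; the point is that the web request only records the job, and a background worker does the slow work outside any request timeout.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class MergeJob:
    # Hypothetical job record; not Launchpad's real job schema.
    from_person: int
    to_person: int
    status: str = "WAITING"


class MergeQueue:
    def __init__(self):
        self._pending = deque()

    def request_merge(self, from_person, to_person):
        """Called from the web request: record the job and return at once."""
        job = MergeJob(from_person, to_person)
        self._pending.append(job)
        return job

    def run_pending(self, do_merge):
        """Called by a background worker, outside any request timeout."""
        finished = []
        while self._pending:
            job = self._pending.popleft()
            try:
                do_merge(job.from_person, job.to_person)
                job.status = "COMPLETED"
            except Exception:
                # A failed merge is recorded rather than crashing the worker.
                job.status = "FAILED"
            finished.append(job)
        return finished
```

A worker would call `run_pending` with the real merge function; the status field is what a notification (for example an email on completion) could be driven from.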

Jeroen T. Vermeulen (jtv) wrote : Re: Merging people times out updating POSubmission and POMsgSet

Here's another case: OOPS-760A2160 (while copying TranslationMessages, the children and heirs of POMsgSet and POSubmission).

Asynchronous merging, if done right, would solve this problem at the possible cost of the translations moving tomically. Or whatever the opposite of atomically is.
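
To make the atomicity trade-off concrete, here is a rough sketch that reassigns rows in small batches and commits after each one. It uses SQLite and a stand-in `posubmission` table with assumed `id` and `person` columns; the batch size is arbitrary, and none of this is the actual Launchpad merge code. Between commits, some rows belong to the old person and some to the new one.

```python
import sqlite3


def reassign_in_batches(conn, old_person, new_person, batch_size=500):
    """Move rows from old_person to new_person, committing each batch."""
    if old_person == new_person:
        raise ValueError("old_person and new_person must differ")
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE posubmission SET person = ? WHERE id IN ("
            " SELECT id FROM posubmission WHERE person = ? LIMIT ?)",
            (new_person, old_person, batch_size),
        )
        # Each batch is its own short transaction, so the merge as a
        # whole is not atomic: readers can observe a half-moved state.
        conn.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
```

Each statement stays short enough to avoid a per-statement timeout, at the cost of the intermediate states described above.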

Diogo Matsubara (matsubara) wrote :

Another case: OOPS-776B1759

Curtis Hovey (sinzui)
Changed in launchpad-registry:
importance: Undecided → Low
status: New → Triaged
Diogo Matsubara (matsubara) wrote :

Bug 364924 suggests that we should increase the timeout threshold for admins, because this bug bit the reporter a few times. Could the importance of this one be increased?

tags: added: registry-people
removed: registry
tags: added: chr
Jeroen T. Vermeulen (jtv) wrote :

We just deleted a lot of obsolete TranslationMessages, and we're about to cut down the number by a whole lot more with the message-sharing data migration. If we put much effort into this part of the timeouts right now, the effort might be wasted and not even fix all the timeouts.

Stuart Bishop (stub)
description: updated
summary: - Merging people times out updating POSubmission and POMsgSet
+ Person merging needs to be done asynchronously
description: updated
Curtis Hovey (sinzui)
tags: added: featire tech-debt
removed: registry-people
tags: added: feature
removed: featire
Curtis Hovey (sinzui)
Changed in launchpad-registry:
milestone: none → series-3.1
Sense Egbert Hofstede (sense) wrote : Re: Person merging needs to be done asynchronously

Three fresh error IDs from the latest duplicate:
OOPS-1664ED1753
OOPS-1661ED1419
OOPS-1661ED1019

Robert Collins (lifeless) wrote : Re: [Bug 162510] Re: Person merging needs to be done asynchronously

Just as a heads up - we have a jobs system now - documented on the
wiki - so it's very feasible to do this asynchronously. If someone
wants to tackle it, no new infrastructure is needed.

-Rob

Curtis Hovey (sinzui)
tags: added: merge-deactivate
Robert Collins (lifeless) wrote : Re: Person merging needs to be done asynchronously

High as per zero-oops-policy. Also, I think Edwin may be working on it?

Changed in launchpad-registry:
importance: Low → High
Stuart Bishop (stub) wrote :

Bug #627701 may provide temporary relief, allowing us to keep the timeout high for merges and very high for admin merges.

Steve McInerney (spm)
tags: added: canonical-losa-lp
Sense Egbert Hofstede (sense) wrote :

As I'm writing this, I have resent the POST submission of the final confirmation button more than one hundred and twenty times, but the merge still times out and doesn't seem to complete at all. Is this because of the large amount of data associated with the other account (translations), or is this because of a bug somewhere?

Note: I haven't pressed the Confirm button manually each time, but instead pressed refresh and made Firefox resend the POST data (which should equal pressing the Confirm button).

Curtis Hovey (sinzui) wrote :

It is because of the large amount of data. Merging was expected to be used for unused/little accounts. Launchpad did not consider that users would want multiple accounts.

We are adding infrastructure to support long merges next week. We can begin moving the merge code into an async process that will send an email when it is complete. We do not expect the work to be complete for many months, but it will be usable by November.

Sense Egbert Hofstede (sense) wrote : Re: [Bug 162510] Re: Person merging needs to be done asynchronously

On 5 October 2010 18:44, Curtis Hovey <email address hidden> wrote:
> It is because of the large amount of data. Merging was expected to be
> used for unused/little accounts. Launchpad did not consider that users
> would want multiple accounts.
>
> We are adding infrastructure to support long merges next week. We can
> begin moving the merge code into an async process that will send an
> email when it is complete. We do not expect the work to be complete for
> many months, but it will be usable by November.
>

Thank you for the quick follow-up.

The second account I want to merge didn't come into existence because
I was using two, but because Launchpad was just a tad bit quicker with
importing translations from GNOME into Launchpad than I could validate
my new email address. Since then, many new translations and Bazaar
commits have been made using that email address, so data which I
consider important to associate with my account has been piling up a
lot.

Regards,
Sense Hofstede

summary: - Person merging needs to be done asynchronously
+ Person:+delete timeouts : Person merging needs to be done asynchronously
tags: added: pg83
Stuart Bishop (stub)
tags: removed: pg83
Changed in launchpad:
importance: High → Critical
Gavin Panella (allenap)
Changed in launchpad:
assignee: nobody → Gavin Panella (allenap)
Jeroen T. Vermeulen (jtv) wrote :

It's just a secondary thing, but according to those oopses we could save an extra 1-2 seconds on listReferences. Its recursion looks a bit naïve; my gut feeling is that we could get it down from 194 to maybe a dozen queries by iterating over batches.
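
A rough sketch of what "iterating over batches" could look like, under assumptions: this is not the real listReferences, and `fetch_referrers` is a hypothetical callable that takes a list of objects and returns everything referring to any of them in a single lookup. The walk issues one lookup per batch instead of one per object.

```python
def list_references_batched(start, fetch_referrers, batch_size=100):
    """Breadth-first walk of referrers; one lookup per batch of nodes."""
    seen = {start}
    frontier = [start]
    while frontier:
        next_frontier = []
        for i in range(0, len(frontier), batch_size):
            batch = frontier[i:i + batch_size]
            # fetch_referrers is hypothetical: one query for the whole batch.
            for ref in fetch_referrers(batch):
                if ref not in seen:
                    seen.add(ref)
                    next_frontier.append(ref)
        frontier = next_frontier
    seen.discard(start)
    return seen
```

With this shape the number of lookups grows with the depth of the reference graph (and the batch count per level) rather than with the number of objects visited.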

Gavin Panella (allenap)
Changed in launchpad:
status: Triaged → In Progress
Launchpad QA Bot (lpqabot) wrote :

Fixed in stable r12511 (http://bazaar.launchpad.net/~launchpad-pqm/launchpad/stable/revision/12511) by a commit, but not testable.

Changed in launchpad:
milestone: none → 11.03
tags: added: qa-untestable
Changed in launchpad:
status: In Progress → Fix Committed
Gavin Panella (allenap) wrote :

Some explanation for the benefit of people who simply want to merge or delete their teams:

The branch that has landed, and caused this bug to move to Fix Committed, provides only the back-end work to fix this bug. Some more work to integrate this into the UI is needed. Curtis's squad has agreed to do that. Because of the way we track QA, that work will be done under the banner of a separate bug. This bug will therefore not be fully fixed until both this bug *and* the integration bug are complete.

Robert Collins (lifeless) wrote : Re: [Bug 162510] Re: Person:+delete timeouts : Person merging needs to be done asynchronously

What's the number for the new bug? (And please put the bug ID in its title too.)

Curtis Hovey (sinzui)
description: updated
Curtis Hovey (sinzui)
Changed in launchpad:
status: Fix Committed → Fix Released