Merge lp:~stub/launchpad/replication into lp:launchpad/db-devel
Proposed by Stuart Bishop on 2010-01-15
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Henning Eggers (community) | code | 2010-01-15 | Approve on 2010-01-18 |
Commit Message
Don't let replication lag checks by the appserver block for long
Stuart Bishop (stub) wrote:
Henning Eggers (henninge) wrote:
Hi stub,
this sounds like a smart thing to do and the branch looks good. The only thing that confuses me is the ".get_one()[0]" construct, which suggests strange semantics for the get_one method: I'd expect it *not* to return a list. But you did not introduce that, so I'll just let it go.
Cheers,
Henning
review: Approve (code)

Address Bug #504696. This is a cherry pick candidate, after manual tests on staging and edge demonstrate things are working as intended.
When things get busy, it can be slow to query the Slony-I tables to determine how lagged the slave database is. This is something we need to deal with, as we are sort of abusing this facility and I doubt it was intended for the sl_status view to be queried 20-30 times per second.
Rather than just letting the replication lag checks go slow, making users cry and timeouts rise, we put a timeout on the check. If we can't get the information we need in 250 ms, assume lag is bad and proceed.
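The pattern described above can be sketched in a few lines of Python. This is only an illustration, not the code in this branch: `slave_is_usable`, `fetch_lag`, `simulated_slow_query` and the two-minute `MAX_ACCEPTABLE_LAG` threshold are all hypothetical names and values, and the only figure taken from the description is the 250 ms budget. The sketch enforces the budget on the client side with a worker thread, so a slow query keeps running in the background after the caller gives up; a real implementation would more likely bound the query on the database side.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from datetime import timedelta

LAG_CHECK_TIMEOUT = 0.25                   # seconds; the 250 ms budget from the description
MAX_ACCEPTABLE_LAG = timedelta(minutes=2)  # hypothetical threshold, not from the branch

# Shared pool so each check does not pay for thread start-up.
_executor = ThreadPoolExecutor(max_workers=4)


def slave_is_usable(fetch_lag, timeout=LAG_CHECK_TIMEOUT,
                    max_lag=MAX_ACCEPTABLE_LAG):
    """Return True only if the lag is known in time and is small enough.

    fetch_lag is a hypothetical zero-argument callable that runs the lag
    query and returns a timedelta.
    """
    future = _executor.submit(fetch_lag)
    try:
        lag = future.result(timeout=timeout)
    except FutureTimeout:
        # No answer within the budget: assume the lag is bad and proceed
        # as if the slave were too far behind.
        return False
    return lag <= max_lag


def simulated_slow_query():
    """Stand-in for a lag query against a busy database."""
    time.sleep(1.0)
    return timedelta(0)
```

With this shape, a fast check that reports a small lag lets the caller use the slave, while a slow check costs the request at most about 250 ms before it falls back to treating the slave as lagged.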
This change cannot be tested by our test suite. It has been tested locally against a replicated environment and should be tested on staging next.