Comment 8 for bug 702024

Andrew Bennetts (spiv) wrote :

So I think the status is currently:

 * code is deployed
 * production is running essentially the same configuration as before: one SSH server instance listening directly on the SSH port
 * LOSA testing on staging (or qastaging?) with instances running behind HAProxy (rather than listening directly on the SSH port) failed: the HTTP service always returns 200 OK, even after SIGTERM, when it should return 503
 * my own testing in a development environment (in a VM) works as expected, and I have no idea why staging behaves differently
 * (less importantly, there's an outstanding merge proposal to fix a cosmetic glitch in the debug text of the HTTP service: <https://code.launchpad.net/~spiv/launchpad/haproxy-for-twisted-services/+merge/48427>)
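
For reference, the behaviour expected from the HTTP service (200 OK while the instance is available, 503 once it has received SIGTERM) can be sketched roughly as below. This is only an illustration using the Python standard library, not the actual Twisted code in the branch; the names `shutting_down`, `health_status`, `HealthHandler` and `on_sigterm`, and port 8022, are made up for the example. Which HTTP method HAProxy polls with depends on its configuration, so the sketch answers GET, HEAD and OPTIONS alike.

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set once SIGTERM arrives; the handler reads it on every request.
shutting_down = threading.Event()


def health_status() -> int:
    """Status a poller should see: 200 while available, 503 once draining."""
    return 503 if shutting_down.is_set() else 200


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the real status resource."""

    def do_GET(self):
        # No body: only the status code matters to the poller.
        self.send_response(health_status())
        self.send_header("Content-Length", "0")
        self.end_headers()

    # Answer HEAD and OPTIONS polls the same way as GET.
    do_HEAD = do_GET
    do_OPTIONS = do_GET

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def on_sigterm(signum, frame):
    """Flip to 'unavailable'; existing connections are left alone."""
    shutting_down.set()


if __name__ == "__main__":
    signal.signal(signal.SIGTERM, on_sigterm)
    # Port 8022 is an arbitrary choice for this example.
    HTTPServer(("127.0.0.1", 8022), HealthHandler).serve_forever()
```

If staging keeps answering 200 after SIGTERM, one thing a sketch like this suggests checking is whether the signal handler is actually reached in that environment.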

I guess the next step is to spend more time with a LOSA, debugging why their tests on staging fail when the same thing works in a development environment.

Also, it's probably worth trying to configure HAProxy to ignore the HTTP service and just do TCP proxying, without polling the HTTP service first. After an instance receives SIGTERM it should immediately stop listening on its TCP port (even though it will continue to service its existing connections), which AIUI should interact nicely with HAProxy: if it can't connect to the first instance it tries, it can try another instance instead, without any disruption to the client.
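
The connect-and-fall-back idea can be illustrated with a small sketch. This is not how HAProxy is implemented; it only shows the principle, under the assumption that a draining instance refuses new TCP connections. The function name `connect_first_available` and the backend addresses are hypothetical. In HAProxy itself this would be a matter of configuration (I believe `retries` and `option redispatch` are the relevant settings, but that needs checking against the HAProxy documentation).

```python
import socket


def connect_first_available(backends, timeout=1.0):
    """Return (socket, address) for the first backend that accepts a TCP connection.

    `backends` is a list of (host, port) pairs, tried in order.
    """
    last_error = None
    for address in backends:
        try:
            return socket.create_connection(address, timeout=timeout), address
        except OSError as exc:
            # e.g. ConnectionRefusedError from an instance that has
            # stopped listening after SIGTERM; move on to the next one.
            last_error = exc
    raise ConnectionError("no backend accepted the connection") from last_error


if __name__ == "__main__":
    # Hypothetical instance addresses, for illustration only.
    conn, chosen = connect_first_available([("127.0.0.1", 2201), ("127.0.0.1", 2202)])
    print("connected to", chosen)
    conn.close()
```

The client only ever sees one successful connection, which is why no HTTP polling is needed for this to work.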