Comment 2 for bug 819604

Jelmer Vernooij (jelmer) wrote : Re: [Bug 819604] Re: when an idle ssh transport is interrupted, bzrlib errors

On 05/08/11 16:43, John A Meinel wrote:
> On 8/2/2011 12:45 PM, Jelmer Vernooij wrote:
>> ** Changed in: bzr Importance: High => Critical
>>
>> ** Changed in: bzr Assignee: (unassigned) => canonical-bazaar
>> (canonical-bazaar)
>>
> Is this actually critical?
I'm not sure if it is critical anymore. The timeout for codehosting
haproxy has now been increased, which prevents this from being a big
problem at the moment.
>
> Also, the initial report was "idle ssh transport". Monty's traceback is
> certainly not 'idle' given that we are actively fetching content.
I think the problem is that the connection was idle for a while, long
enough for it to time out. Trying to reuse it later then triggered this
exception; at least that is my understanding of it.

> And while the transfer-of-content is roughly stateless, I'm not 100%
> sure we want to default-reconnect SSH connections.
>
> I suppose we could use a reasonable try-again-but-not-forever, or
> try-once-but-fail-if-last-connect-failed. People with flaky connections
> would get slow-but-useful connections. Restarting codehosting would
> allow things to continue roughly uninterrupted, but the network going down
> wouldn't cause us to spin indefinitely.
>
That sounds reasonable.
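
A rough sketch of the try-again-but-not-forever idea, to make it concrete. All names here (ConnectionLost, call_with_reconnect, max_retries, delay) are hypothetical and are not bzrlib's API; it also assumes the operation is safe to repeat, which seems to hold for the roughly stateless content transfer mentioned above.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical stand-in for the error raised when the transport drops."""


def call_with_reconnect(operation, reconnect, max_retries=2, delay=0.0):
    """Run operation(); on ConnectionLost, reconnect and retry, but not forever.

    operation   -- callable performing the (repeatable) request
    reconnect   -- callable that re-establishes the connection
    max_retries -- upper bound on retries, so a dead network cannot make us spin
    delay       -- seconds to wait before each reconnect attempt
    """
    attempts = 0
    while True:
        try:
            return operation()
        except ConnectionLost:
            if attempts >= max_retries:
                # Retry budget exhausted: re-raise the original error.
                raise
            attempts += 1
            time.sleep(delay)
            # If reconnect() itself raises, the error propagates to the
            # caller immediately (fail if the last connect failed).
            reconnect()
```

With this shape, a connection that timed out while idle costs one reconnect, while a network that stays down fails after max_retries attempts.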

Cheers,

Jelmer