Merge lp:~axwalk/juju-core/state-dialtimeout into lp:~go-bot/juju-core/trunk

Proposed by Andrew Wilkins
Status: Merged
Approved by: Andrew Wilkins
Approved revision: no longer in the source branch.
Merged at revision: 2702
Proposed branch: lp:~axwalk/juju-core/state-dialtimeout
Merge into: lp:~go-bot/juju-core/trunk
Diff against target: 26 lines (+6/-3)
1 file modified
state/open.go (+6/-3)
To merge this branch: bzr merge lp:~axwalk/juju-core/state-dialtimeout
Reviewer Review Type Date Requested Status
Juju Engineering Pending
Review via email: mp+218568@code.launchpad.net

Commit message

state: reduce default dial timeout to 30s

Previously we could have been dialling state connections
across the Internet, since the CLI would connect to Mongo
directly. Now, with the CLI going to API only, we can
safely reduce the timeout.

It is possible that the timeout will be exceeded if the
peers in an HA setup are not all ready. In this case the
dial operation will fail and the caller (state server agent)
will restart.

This change will reduce the time to failure in tests, and
allow us to capture some information about the failed test
without exceeding the Go test timeout.

https://codereview.appspot.com/97100045/

Description of the change

state: reduce default dial timeout to 30s

Previously we could have been dialling state connections
across the Internet, since the CLI would connect to Mongo
directly. Now, with the CLI going to API only, we can
safely reduce the timeout.

It is possible that the timeout will be exceeded if the
peers in an HA setup are not all ready. In this case the
dial operation will fail and the caller (state server agent)
will restart.

This change will reduce the time to failure in tests, and
allow us to capture some information about the failed test
without exceeding the Go test timeout.

https://codereview.appspot.com/97100045/

To post a comment you must log in.
Revision history for this message
Andrew Wilkins (axwalk) wrote :

Reviewers: mp+218568_code.launchpad.net,

Message:
Please take a look.

Description:
state: reduce default dial timeout to 30s

Previously we could have been dialling state connections
across the Internet, since the CLI would connect to Mongo
directly. Now, with the CLI going to API only, we can
safely reduce the timeout.

It is possible that the timeout will be exceeded if the
peers in an HA setup are not all ready. In this case the
dial operation will fail and the caller (state server agent)
will restart.

This change will reduce the time to failure in tests, and
allow us to capture some information about the failed test
without exceeding the Go test timeout.

https://code.launchpad.net/~axwalk/juju-core/state-dialtimeout/+merge/218568

(do not edit description out of merge proposal)

Please review this at https://codereview.appspot.com/97100045/

Affected files (+8, -3 lines):
   A [revision details]
   M state/open.go

Index: [revision details]
=== added file '[revision details]'
--- [revision details] 2012-01-01 00:00:00 +0000
+++ [revision details] 2012-01-01 00:00:00 +0000
@@ -0,0 +1,2 @@
+Old revision: tarmac-20140506233919-52y6gttu34ucqsho
+New revision: <email address hidden>

Index: state/open.go
=== modified file 'state/open.go'
--- state/open.go 2014-04-18 11:46:22 +0000
+++ state/open.go 2014-05-07 08:10:22 +0000
@@ -32,6 +32,11 @@
  // default.
  const mongoSocketTimeout = 10 * time.Second

+// defaultDialTimeout should be representative of
+// the upper bound of time taken to dial a mongo
+// server from within the same cloud/private network.
+const defaultDialTimeout = 30 * time.Second
+
  // Info encapsulates information about cluster of
  // servers holding juju state and can be used to make a
  // connection to that cluster.
@@ -63,9 +68,7 @@
  // DefaultDialOpts returns a DialOpts representing the default
  // parameters for contacting a state server.
  func DefaultDialOpts() DialOpts {
- return DialOpts{
- Timeout: 10 * time.Minute,
- }
+ return DialOpts{Timeout: defaultDialTimeout}
  }

  // Open connects to the server described by the given

Revision history for this message
John A Meinel (jameinel) wrote :

I'm pretty sure that most of our test suite code uses fastDialOpts =
DialOptions{}, don't they? It is possible that we have a test that is
waiting the 10 min to connect, which is why we get occasional 'timeout
exceeded' bugs, but I would actually guess that is a different deadlock.

Anyway, I'm happy enough reducing this, since we shouldn't ever be trying
to connect to Mongo directly before the machine is up, which is the only
reason we were waiting 10 minutes before.

LGTM

On Wed, May 7, 2014 at 12:25 PM, Andrew Wilkins <
<email address hidden>> wrote:

> The proposal to merge lp:~axwalk/juju-core/state-dialtimeout into
> lp:juju-core has been updated.
>
> Description changed to:
>
> state: reduce default dial timeout to 30s
>
> Previously we could have been dialling state connections
> across the Internet, since the CLI would connect to Mongo
> directly. Now, with the CLI going to API only, we can
> safely reduce the timeout.
>
> It is possible that the timeout will be exceeded if the
> peers in an HA setup are not all ready. In this case the
> dial operation will fail and the caller (state server agent)
> will restart.
>
> This change will reduce the time to failure in tests, and
> allow us to capture some information about the failed test
> without exceeding the Go test timeout.
>
> https://codereview.appspot.com/97100045/
>
>
> For more details, see:
>
> https://code.launchpad.net/~axwalk/juju-core/state-dialtimeout/+merge/218568
> --
>
> https://code.launchpad.net/~axwalk/juju-core/state-dialtimeout/+merge/218568
> You are subscribed to branch lp:juju-core.
>

Revision history for this message
Andrew Wilkins (axwalk) wrote :

On Wed, May 7, 2014 at 5:39 PM, John A Meinel <email address hidden>wrote:

> I'm pretty sure that most of our test suite code uses fastDialOpts =
> DialOptions{}, don't they? It is possible that we have a test that is
> waiting the 10 min to connect, which is why we get occasional 'timeout
> exceeded' bugs, but I would actually guess that is a different deadlock.
>

Probably most of them (but I don't know for sure without trawling the
code), but there definitely are a bunch that use DefaultDialOpts. I guess
they should be changed.

> Anyway, I'm happy enough reducing this, since we shouldn't ever be trying
> to connect to Mongo directly before the machine is up, which is the only
> reason we were waiting 10 minutes before.
>
> LGTM
>
>
>
> On Wed, May 7, 2014 at 12:25 PM, Andrew Wilkins <
> <email address hidden>> wrote:
>
> > The proposal to merge lp:~axwalk/juju-core/state-dialtimeout into
> > lp:juju-core has been updated.
> >
> > Description changed to:
> >
> > state: reduce default dial timeout to 30s
> >
> > Previously we could have been dialling state connections
> > across the Internet, since the CLI would connect to Mongo
> > directly. Now, with the CLI going to API only, we can
> > safely reduce the timeout.
> >
> > It is possible that the timeout will be exceeded if the
> > peers in an HA setup are not all ready. In this case the
> > dial operation will fail and the caller (state server agent)
> > will restart.
> >
> > This change will reduce the time to failure in tests, and
> > allow us to capture some information about the failed test
> > without exceeding the Go test timeout.
> >
> > https://codereview.appspot.com/97100045/
> >
> >
> > For more details, see:
> >
> >
> https://code.launchpad.net/~axwalk/juju-core/state-dialtimeout/+merge/218568
> > --
> >
> >
> https://code.launchpad.net/~axwalk/juju-core/state-dialtimeout/+merge/218568
> > You are subscribed to branch lp:juju-core.
> >
>
> --
>
> https://code.launchpad.net/~axwalk/juju-core/state-dialtimeout/+merge/218568
> You are the owner of lp:~axwalk/juju-core/state-dialtimeout.
>

Revision history for this message
Andrew Wilkins (axwalk) wrote :

On 2014/05/07 08:25:25, axw wrote:
> Please take a look.

jam has LGTM'd on the MP

https://codereview.appspot.com/97100045/

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'state/open.go'
2--- state/open.go 2014-04-18 11:46:22 +0000
3+++ state/open.go 2014-05-07 08:21:28 +0000
4@@ -32,6 +32,11 @@
5 // default.
6 const mongoSocketTimeout = 10 * time.Second
7
8+// defaultDialTimeout should be representative of
9+// the upper bound of time taken to dial a mongo
10+// server from within the same cloud/private network.
11+const defaultDialTimeout = 30 * time.Second
12+
13 // Info encapsulates information about cluster of
14 // servers holding juju state and can be used to make a
15 // connection to that cluster.
16@@ -63,9 +68,7 @@
17 // DefaultDialOpts returns a DialOpts representing the default
18 // parameters for contacting a state server.
19 func DefaultDialOpts() DialOpts {
20- return DialOpts{
21- Timeout: 10 * time.Minute,
22- }
23+ return DialOpts{Timeout: defaultDialTimeout}
24 }
25
26 // Open connects to the server described by the given

Subscribers

People subscribed via source and target branches

to status/vote changes: