Merge lp:~axwalk/juju-core/replicaset-retry-replsetinitiate into lp:~go-bot/juju-core/trunk

Proposed by Andrew Wilkins
Status: Merged
Approved by: Andrew Wilkins
Approved revision: no longer in the source branch.
Merged at revision: 2735
Proposed branch: lp:~axwalk/juju-core/replicaset-retry-replsetinitiate
Merge into: lp:~go-bot/juju-core/trunk
Diff against target: 47 lines (+27/-3)
1 file modified
replicaset/replicaset.go (+27/-3)
To merge this branch: bzr merge lp:~axwalk/juju-core/replicaset-retry-replsetinitiate
Reviewer            Review Type    Date Requested    Status
Juju Engineering                                     Pending
Review via email: mp+219776@code.launchpad.net

Commit message

replicaset: retry replSetInitiate

It would seem there's a race in mongo,
where it may think that the one and only
local replicaset member is not reachable
shortly after starting up. I was able to
reliably reproduce this error on the bot
machine by running TestInitiateReplica
in worker/peergrouper in a loop. With the
retry loop, no problems.

Fixes lp:1319617

https://codereview.appspot.com/97520046/

Andrew Wilkins (axwalk) wrote :

Reviewers: mp+219776_code.launchpad.net,

Message:
Please take a look.

Description:
replicaset: retry replSetInitiate

It would seem there's a race in mongo,
where it may think that the one and only
local replicaset member is not reachable
shortly after starting up. I was able to
reliably reproduce this error on the bot
machine by running TestInitiateReplica
in worker/peergrouper in a loop. With the
retry loop, no problems.

Fixes lp:1319617

https://code.launchpad.net/~axwalk/juju-core/replicaset-retry-replsetinitiate/+merge/219776

(do not edit description out of merge proposal)

Please review this at https://codereview.appspot.com/97520046/

Affected files (+29, -3 lines):
   A [revision details]
   M replicaset/replicaset.go

Index: [revision details]
=== added file '[revision details]'
--- [revision details] 2012-01-01 00:00:00 +0000
+++ [revision details] 2012-01-01 00:00:00 +0000
@@ -0,0 +1,2 @@
+Old revision: tarmac-20140515024948-mdinmvuq3nkxrxxi
+New revision: <email address hidden>

Index: replicaset/replicaset.go
=== modified file 'replicaset/replicaset.go'
--- replicaset/replicaset.go 2014-04-15 16:37:08 +0000
+++ replicaset/replicaset.go 2014-05-16 04:09:11 +0000
@@ -11,8 +11,23 @@
   "labix.org/v2/mgo/bson"
  )

-// MaxPeers defines the maximum number of peers that mongo supports.
-const MaxPeers = 7
+const (
+ // MaxPeers defines the maximum number of peers that mongo supports.
+ MaxPeers = 7
+
+ // maxInitiateAttempts is the maximum number of times to attempt
+ // replSetInitiate for each call to Initiate.
+ maxInitiateAttempts = 10
+
+ // initiateAttemptDelay is the amount of time to sleep between failed
+ // attempts to replSetInitiate.
+ initiateAttemptDelay = 100 * time.Millisecond
+
+ // rsMembersUnreachableError is the error message returned from mongo
+ // when it thinks that replicaset members are unreachable. This can
+ // occur if replSetInitiate is executed shortly after starting up mongo.
+ rsMembersUnreachableError = "all members and seeds must be reachable to initiate set"
+)

  var logger = loggo.GetLogger("juju.replicaset")

@@ -39,7 +54,16 @@
    }},
   }
   logger.Infof("Initiating replicaset with config %#v", cfg)
- return monotonicSession.Run(bson.D{{"replSetInitiate", cfg}}, nil)
+ var err error
+ for i := 0; i < maxInitiateAttempts; i++ {
+ err = monotonicSession.Run(bson.D{{"replSetInitiate", cfg}}, nil)
+ if err != nil && err.Error() == rsMembersUnreachableError {
+ time.Sleep(initiateAttemptDelay)
+ continue
+ }
+ break
+ }
+ return err
  }

  // Member holds configuration information for a replica set member.
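
The retry loop in the diff above can be tried out in isolation, without a running mongo. The following is a minimal sketch, not code from juju-core: retryOnMessage and failNTimes are hypothetical names introduced here for illustration, and the stand-in operation replaces the real monotonicSession.Run call. It keeps the same policy as the patch: retry only when the error text equals the "unreachable" message, and return immediately on success or on any other error.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// unreachableMsg mirrors rsMembersUnreachableError from the diff above.
const unreachableMsg = "all members and seeds must be reachable to initiate set"

// retryOnMessage is a hypothetical helper (not part of juju-core). It calls
// op up to attempts times, sleeping for delay after each failure whose
// message equals retryMsg. Success or any other error returns immediately.
func retryOnMessage(attempts int, delay time.Duration, retryMsg string, op func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = op()
		if err == nil || err.Error() != retryMsg {
			return err
		}
		time.Sleep(delay)
	}
	return err
}

// failNTimes returns a stand-in for the replSetInitiate call that fails n
// times with msg and then succeeds, plus a pointer to its call counter.
func failNTimes(n int, msg string) (func() error, *int) {
	calls := 0
	op := func() error {
		calls++
		if calls <= n {
			return errors.New(msg)
		}
		return nil
	}
	return op, &calls
}

func main() {
	// Simulate mongo reporting its only member as unreachable twice.
	op, calls := failNTimes(2, unreachableMsg)
	err := retryOnMessage(10, 10*time.Millisecond, unreachableMsg, op)
	fmt.Printf("err=%v calls=%d\n", err, *calls) // prints "err=<nil> calls=3"
}
```

With the constants chosen in the patch (10 attempts, 100 ms delay), a call that never succeeds sleeps after every failed attempt, so the worst case adds roughly one second before the error is returned.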

Ian Booth (wallyworld) wrote :

Preview Diff

=== modified file 'replicaset/replicaset.go'
--- replicaset/replicaset.go 2014-04-15 16:37:08 +0000
+++ replicaset/replicaset.go 2014-05-16 04:14:34 +0000
@@ -11,8 +11,23 @@
   "labix.org/v2/mgo/bson"
  )

-// MaxPeers defines the maximum number of peers that mongo supports.
-const MaxPeers = 7
+const (
+ // MaxPeers defines the maximum number of peers that mongo supports.
+ MaxPeers = 7
+
+ // maxInitiateAttempts is the maximum number of times to attempt
+ // replSetInitiate for each call to Initiate.
+ maxInitiateAttempts = 10
+
+ // initiateAttemptDelay is the amount of time to sleep between failed
+ // attempts to replSetInitiate.
+ initiateAttemptDelay = 100 * time.Millisecond
+
+ // rsMembersUnreachableError is the error message returned from mongo
+ // when it thinks that replicaset members are unreachable. This can
+ // occur if replSetInitiate is executed shortly after starting up mongo.
+ rsMembersUnreachableError = "all members and seeds must be reachable to initiate set"
+)

  var logger = loggo.GetLogger("juju.replicaset")

@@ -39,7 +54,16 @@
    }},
   }
   logger.Infof("Initiating replicaset with config %#v", cfg)
- return monotonicSession.Run(bson.D{{"replSetInitiate", cfg}}, nil)
+ var err error
+ for i := 0; i < maxInitiateAttempts; i++ {
+ err = monotonicSession.Run(bson.D{{"replSetInitiate", cfg}}, nil)
+ if err != nil && err.Error() == rsMembersUnreachableError {
+ time.Sleep(initiateAttemptDelay)
+ continue
+ }
+ break
+ }
+ return err
  }

  // Member holds configuration information for a replica set member.
