Merge lp:~axwalk/juju-core/lp1271504-ensureavailability-agentalive into lp:~go-bot/juju-core/trunk

Proposed by Andrew Wilkins
Status: Merged
Approved by: Andrew Wilkins
Approved revision: no longer in the source branch.
Merged at revision: 2624
Proposed branch: lp:~axwalk/juju-core/lp1271504-ensureavailability-agentalive
Merge into: lp:~go-bot/juju-core/trunk
Diff against target: 565 lines (+373/-87)
5 files modified
cmd/juju/ensureavailability_test.go (+13/-0)
state/addmachine.go (+154/-50)
state/apiserver/client/client_test.go (+24/-20)
state/export_test.go (+2/-0)
state/state_test.go (+180/-17)
To merge this branch: bzr merge lp:~axwalk/juju-core/lp1271504-ensureavailability-agentalive
Reviewer Review Type Date Requested Status
Juju Engineering Pending
Review via email: mp+215172@code.launchpad.net

Commit message

state: (demote)promote (un)available state servers

When EnsureAvailability is called, we now check the
availability of existing state servers. If a state server
is unavailable, then we either set WantsVote=false, or
remove it entirely if it doesn't want a vote.

If we need to create new state servers, we will first
promote existing, available, non-voting state servers
before creating new ones.

Availability is currently defined by the state.Machine's
agent presence. We may want to extend this later to also
consider the associated mongo's replicaset member health.

https://codereview.appspot.com/86490043/

Description of the change

state: (demote)promote (un)available state servers

When EnsureAvailability is called, we now check the
availability of existing state servers. If a state server
is unavailable, then we either set WantsVote=false, or
remove it entirely if it doesn't want a vote.

If we need to create new state servers, we will first
promote existing, available, non-voting state servers
before creating new ones.

Availability is currently defined by the state.Machine's
agent presence. We may want to extend this later to also
consider the associated mongo's replicaset member health.

https://codereview.appspot.com/86490043/

To post a comment you must log in.
Revision history for this message
Andrew Wilkins (axwalk) wrote :

Reviewers: mp+215172_code.launchpad.net,

Message:
Please take a look.

Description:
state: (demote)promote (un)available state servers

When EnsureAvailability is called, we now check the
availability of existing state servers. If a state server
is unavailable, then we either set WantsVote=false, or
remove it entirely if it doesn't want a vote.

If we need to create new state servers, we will first
promote existing, available, non-voting state servers
before creating new ones.

Availability is currently defined by the state.Machine's
agent presence. We may want to extend this later to also
consider the associated mongo's replicaset member health.

https://code.launchpad.net/~axwalk/juju-core/lp1271504-ensureavailability-agentalive/+merge/215172

(do not edit description out of merge proposal)

Please review this at https://codereview.appspot.com/86490043/

Affected files (+272, -61 lines):
   A [revision details]
   M cmd/juju/ensureavailability_test.go
   M state/addmachine.go
   M state/apiserver/client/client_test.go
   M state/export_test.go
   M state/state_test.go

Revision history for this message
Roger Peppe (rogpeppe) wrote :

LGTM modulo the following.
Very nice.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go
File state/addmachine.go (right):

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode531
state/addmachine.go:531: mdocs := make([]*machineDoc,
numStateServers-len(info.VotingMachineIds))
nice clear logic here.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode555
state/addmachine.go:555: if err != nil {
If err==txn.ErrAborted, I think it's only going to be because someone
else has run EnsureAvailability concurrently. Perhaps we should check if
the voting server count has changed - if it has, return an error,
otherwise return nil because our work will already have been done.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode578
state/addmachine.go:578: var newVotingMachineIds, newMachineIds []string
rather than two variables, it might be clearer if we did:

var newInfo StateServerInfo

and then, e.g.
newInfo.MachineIds = append(newInfo.MachineIds, ...)

then it's more obvious when we're just building
up ids in the new info.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode589
state/addmachine.go:589: if !m.WantsVote() {
switch the sense of the condition to avoid the negative?

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode598
state/addmachine.go:598: // If the machine does want to vote, we simply
set novote and
Since the comments are inside the if statement, we can phrase them
unconditionally, which might read slightly more clearly.

// The machine wants to vote, so we simply set novote etc

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode605
state/addmachine.go:605: } else if !m.HasVote() {
perhaps invert the condition here to save negatives?

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode606
state/addmachine.go:606: // If the machine does not want to vote and
doesn't have a vote,
again, we can phrase it unconditionally.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode625
state/addmachine.go:625: Assert: bson.D{{"novote", bson.D{{"$ne",
false}}}},
I *think* this should be {{"$eq", true}}, because nil!=false
and legacy machines may have a nil novote field.

similarly below.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode655
state/addmachine.go:655: Update: bson.D{{"$pull", bson.D{{"jobs",
JobManageEnviron}}}},
perhaps set novote=false here too?

https://codereview.appspot.com/86490043/

Revision history for this message
Andrew Wilkins (axwalk) wrote :

Please take a look.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go
File state/addmachine.go (right):

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode555
state/addmachine.go:555: if err != nil {
On 2014/04/11 10:09:17, rog wrote:
> If err==txn.ErrAborted, I think it's only going to be because someone
> else has run EnsureAvailability concurrently. Perhaps we should check
if the
> voting server count has changed - if it has, return an error,
otherwise return
> nil because our work will already have been done.

Done, I made it a transaction loop and added a few tests.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode578
state/addmachine.go:578: var newVotingMachineIds, newMachineIds []string
On 2014/04/11 10:09:17, rog wrote:
> rather than two variables, it might be clearer if we did:

> var newInfo StateServerInfo

> and then, e.g.
> newInfo.MachineIds = append(newInfo.MachineIds, ...)

> then it's more obvious when we're just building
> up ids in the new info.

Done.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode589
state/addmachine.go:589: if !m.WantsVote() {
On 2014/04/11 10:09:17, rog wrote:
> switch the sense of the condition to avoid the negative?

Done.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode598
state/addmachine.go:598: // If the machine does want to vote, we simply
set novote and
On 2014/04/11 10:09:17, rog wrote:
> Since the comments are inside the if statement, we can phrase them
> unconditionally, which might read slightly more clearly.

> // The machine wants to vote, so we simply set novote etc

Done.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode605
state/addmachine.go:605: } else if !m.HasVote() {
On 2014/04/11 10:09:17, rog wrote:
> perhaps invert the condition here to save negatives?

Done.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode606
state/addmachine.go:606: // If the machine does not want to vote and
doesn't have a vote,
On 2014/04/11 10:09:17, rog wrote:
> again, we can phrase it unconditionally.

Done.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode625
state/addmachine.go:625: Assert: bson.D{{"novote", bson.D{{"$ne",
false}}}},
On 2014/04/11 10:09:17, rog wrote:
> I *think* this should be {{"$eq", true}}, because nil!=false
> and legacy machines may have a nil novote field.

> similarly below.

Done.

https://codereview.appspot.com/86490043/diff/1/state/addmachine.go#newcode655
state/addmachine.go:655: Update: bson.D{{"$pull", bson.D{{"jobs",
JobManageEnviron}}}},
On 2014/04/11 10:09:17, rog wrote:
> perhaps set novote=false here too?

Done.

https://codereview.appspot.com/86490043/

Revision history for this message
Go Bot (go-bot) wrote :
Download full text (107.9 KiB)

The attempt to merge lp:~axwalk/juju-core/lp1271504-ensureavailability-agentalive into lp:juju-core failed. Below is the output from the failed tests.

ok launchpad.net/juju-core 0.014s
ok launchpad.net/juju-core/agent 1.205s
ok launchpad.net/juju-core/agent/mongo 0.919s
ok launchpad.net/juju-core/agent/tools 0.174s
ok launchpad.net/juju-core/bzr 5.064s
ok launchpad.net/juju-core/cert 2.691s
ok launchpad.net/juju-core/charm 0.350s
? launchpad.net/juju-core/charm/hooks [no test files]
? launchpad.net/juju-core/charm/testing [no test files]
ok launchpad.net/juju-core/cloudinit 0.032s
ok launchpad.net/juju-core/cloudinit/sshinit 0.782s
ok launchpad.net/juju-core/cmd 0.197s
ok launchpad.net/juju-core/cmd/charm-admin 0.759s
? launchpad.net/juju-core/cmd/charmd [no test files]
? launchpad.net/juju-core/cmd/charmload [no test files]
ok launchpad.net/juju-core/cmd/envcmd 0.211s
ok launchpad.net/juju-core/cmd/juju 226.333s
ok launchpad.net/juju-core/cmd/jujud 77.802s
ok launchpad.net/juju-core/cmd/plugins/juju-metadata 9.276s
? launchpad.net/juju-core/cmd/plugins/juju-restore [no test files]
ok launchpad.net/juju-core/cmd/plugins/local 0.224s
? launchpad.net/juju-core/cmd/plugins/local/juju-local [no test files]
ok launchpad.net/juju-core/constraints 0.029s
ok launchpad.net/juju-core/container 0.039s
ok launchpad.net/juju-core/container/factory 0.054s
ok launchpad.net/juju-core/container/kvm 0.174s
ok launchpad.net/juju-core/container/kvm/mock 0.042s
? launchpad.net/juju-core/container/kvm/testing [no test files]
ok launchpad.net/juju-core/container/lxc 4.261s
? launchpad.net/juju-core/container/lxc/mock [no test files]
? launchpad.net/juju-core/container/lxc/testing [no test files]
? launchpad.net/juju-core/container/testing [no test files]
ok launchpad.net/juju-core/downloader 5.264s
ok launchpad.net/juju-core/environs 2.706s
ok launchpad.net/juju-core/environs/bootstrap 11.064s
ok launchpad.net/juju-core/environs/cloudinit 0.520s
ok launchpad.net/juju-core/environs/config 1.741s
ok launchpad.net/juju-core/environs/configstore 0.028s
ok launchpad.net/juju-core/environs/filestorage 0.027s
ok launchpad.net/juju-core/environs/httpstorage 0.706s
ok launchpad.net/juju-core/environs/imagemetadata 0.421s
? launchpad.net/juju-core/environs/imagemetadata/testing [no test files]
ok launchpad.net/juju-core/environs/instances 0.044s
ok launchpad.net/juju-core/environs/jujutest 0.176s
ok launchpad.net/juju-core/environs/manual 11.771s
ok launchpad.net/juju-core/environs/simplestreams 0.257s
? launchpad.net/juju-core/environs/simplestreams/testing [no test files]
ok launchpad.net/juju-core/environs/sshstorage 0.902s
ok launchpad.net/juju-core/environs/storage 0.968s
ok launchpad.net/juju-core/environs/sync 49.180s
ok launchpad.net/juju-core/environs/testing 0.123s
ok launchpad.net/juju-core/environs/tools 4.880s
? launchpad.net/juju-core/environs/tools/testing [no test files]
ok launchpad.net/juju-core/errors 0.017s
ok launchpad.net/juju-core/instance 0.021s
? launchpad.net/juju-core/instance/testing [no test files]
ok launchpad.net/juju...

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'cmd/juju/ensureavailability_test.go'
--- cmd/juju/ensureavailability_test.go 2014-03-28 09:55:10 +0000
+++ cmd/juju/ensureavailability_test.go 2014-04-14 03:42:21 +0000
@@ -83,6 +83,19 @@
83func (s *EnsureAvailabilitySuite) TestEnsureAvailabilityMultiple(c *gc.C) {83func (s *EnsureAvailabilitySuite) TestEnsureAvailabilityMultiple(c *gc.C) {
84 err := runEnsureAvailability(c, "-n", "1")84 err := runEnsureAvailability(c, "-n", "1")
85 c.Assert(err, gc.IsNil)85 c.Assert(err, gc.IsNil)
86
87 // make sure machine-0 remains alive for the second call to
88 // EnsureAvailability, or machine-0 will get bumped down to
89 // non-voting.
90 m0, err := s.BackingState.Machine("0")
91 c.Assert(err, gc.IsNil)
92 pinger, err := m0.SetAgentAlive()
93 c.Assert(err, gc.IsNil)
94 defer pinger.Kill()
95 s.BackingState.StartSync()
96 err = m0.WaitAgentAlive(testing.LongWait)
97 c.Assert(err, gc.IsNil)
98
86 err = runEnsureAvailability(c, "-n", "3", "--constraints", "mem=4G")99 err = runEnsureAvailability(c, "-n", "3", "--constraints", "mem=4G")
87 c.Assert(err, gc.IsNil)100 c.Assert(err, gc.IsNil)
88101
89102
=== modified file 'state/addmachine.go'
--- state/addmachine.go 2014-04-10 14:40:34 +0000
+++ state/addmachine.go 2014-04-14 03:42:21 +0000
@@ -497,14 +497,6 @@
497// EnsureAvailability adds state server machines as necessary to make497// EnsureAvailability adds state server machines as necessary to make
498// the number of live state servers equal to numStateServers. The given498// the number of live state servers equal to numStateServers. The given
499// constraints and series will be attached to any new machines.499// constraints and series will be attached to any new machines.
500//
501// TODO(rog):
502// If any current state servers are down, they will be
503// removed from the current set of voting replica set
504// peers (although the machines themselves will remain
505// and they will still remain part of the replica set).
506// Once a machine's voting status has been removed,
507// the machine itself may be removed.
508func (st *State) EnsureAvailability(numStateServers int, cons constraints.Value, series string) error {500func (st *State) EnsureAvailability(numStateServers int, cons constraints.Value, series string) error {
509 if numStateServers%2 != 1 || numStateServers <= 0 {501 if numStateServers%2 != 1 || numStateServers <= 0 {
510 return fmt.Errorf("number of state servers must be odd and greater than zero")502 return fmt.Errorf("number of state servers must be odd and greater than zero")
@@ -512,46 +504,158 @@
512 if numStateServers > replicaset.MaxPeers {504 if numStateServers > replicaset.MaxPeers {
513 return fmt.Errorf("state server count is too large (allowed %d)", replicaset.MaxPeers)505 return fmt.Errorf("state server count is too large (allowed %d)", replicaset.MaxPeers)
514 }506 }
515 info, err := st.StateServerInfo()507 for i := 0; i < 5; i++ {
516 if err != nil {508 currentInfo, err := st.StateServerInfo()
517 return err509 if err != nil {
518 }510 return err
519 if len(info.VotingMachineIds) == numStateServers {511 }
520 // TODO(rog) #1271504 2014-01-22512 if len(currentInfo.VotingMachineIds) > numStateServers {
521 // Find machines which are down, set513 return fmt.Errorf("cannot reduce state server count")
522 // their NoVote flag and add new machines to514 }
523 // replace them.515 removeOps, info, promotableMachines, err := st.updateAvailableStateServersOps(currentInfo)
524 return nil516 if err != nil {
525 }517 return err
526 if len(info.VotingMachineIds) > numStateServers {518 }
527 return fmt.Errorf("cannot reduce state server count")519 if len(info.VotingMachineIds) == numStateServers && len(removeOps) == 0 {
528 }520 return nil
529 mdocs := make([]*machineDoc, 0, numStateServers-len(info.MachineIds))521 }
530 var ops []txn.Op522 // Promote existing machines first.
531 for i := len(info.MachineIds); i < numStateServers; i++ {523 var ops []txn.Op
532 template := MachineTemplate{524 if n := numStateServers - len(info.VotingMachineIds); n < len(promotableMachines) {
533 Series: series,525 promotableMachines = promotableMachines[:n]
534 Jobs: []MachineJob{526 }
535 JobHostUnits,527 for _, m := range promotableMachines {
536 JobManageEnviron,528 ops = append(ops, promoteStateServerOps(m)...)
537 },529 info.VotingMachineIds = append(info.VotingMachineIds, m.Id())
538 Constraints: cons,530 }
539 }531 // Create new machines to make up the shortfall.
540 mdoc, addOps, err := st.addMachineOps(template)532 mdocs := make([]*machineDoc, numStateServers-len(info.VotingMachineIds))
541 if err != nil {533 for i := range mdocs {
542 return err534 template := MachineTemplate{
543 }535 Series: series,
544 mdocs = append(mdocs, mdoc)536 Jobs: []MachineJob{
545 ops = append(ops, addOps...)537 JobHostUnits,
546 }538 JobManageEnviron,
547 ssOps, err := st.maintainStateServersOps(mdocs, info)539 },
548 if err != nil {540 Constraints: cons,
549 return fmt.Errorf("cannot prepare machine add operations: %v", err)541 }
550 }542 mdoc, addOps, err := st.addMachineOps(template)
551 ops = append(ops, ssOps...)543 if err != nil {
552 err = st.runTransaction(ops)544 return err
553 if err != nil {545 }
554 return fmt.Errorf("failed to create new state server machines: %v", err)546 mdocs[i] = mdoc
555 }547 ops = append(ops, addOps...)
556 return nil548 }
549 ssOps, err := st.maintainStateServersOps(mdocs, currentInfo)
550 if err != nil {
551 return fmt.Errorf("cannot prepare machine add operations: %v", err)
552 }
553 ops = append(ops, removeOps...)
554 ops = append(ops, ssOps...)
555 err = st.runTransaction(ops)
556 if err == nil {
557 return nil
558 } else if err != txn.ErrAborted {
559 return fmt.Errorf("failed to create new state server machines: %v", err)
560 }
561 // The transaction will only be aborted if another call to
562 // EnsureAvailability completed. Loop back around and try again.
563 }
564 return ErrExcessiveContention
565}
566
567// stateServerAvailable returns true if the specified state server machine is
568// available.
569var stateServerAvailable = func(m *Machine) (bool, error) {
570 // TODO(axw) #1271504 2014-01-22
571 // Check the state server's associated mongo health;
572 // requires coordination with worker/peergrouper.
573 return m.AgentAlive()
574}
575
576// updateAvailableStateServersOps checks the availability of state servers,
577// demoting unavailable, voting machines;
578// removing unavailable, non-voting, non-vote-holding machines;
579// gathering available, non-voting machines that may be promoted;
580// updating StateServerInfo to reflect reality.
581func (st *State) updateAvailableStateServersOps(info *StateServerInfo) (ops []txn.Op, newInfo *StateServerInfo, promotableMachines []*Machine, err error) {
582 newInfo = &StateServerInfo{}
583 for _, mid := range info.MachineIds {
584 m, err := st.Machine(mid)
585 if err != nil {
586 return nil, nil, nil, err
587 }
588 available, err := stateServerAvailable(m)
589 if err != nil {
590 return nil, nil, nil, err
591 }
592 if available {
593 if m.WantsVote() {
594 newInfo.VotingMachineIds = append(newInfo.VotingMachineIds, mid)
595 } else {
596 promotableMachines = append(promotableMachines, m)
597 }
598 newInfo.MachineIds = append(newInfo.MachineIds, mid)
599 continue
600 }
601 if m.WantsVote() {
602 // The machine wants to vote, so we simply set novote and allow it
603 // to run its course to have its vote removed by the worker that
604 // maintains the replicaset. We will replace it with an existing
605 // non-voting state server if there is one, starting a new one if
606 // not.
607 ops = append(ops, demoteStateServerOps(m)...)
608 newInfo.MachineIds = append(newInfo.MachineIds, mid)
609 } else if m.HasVote() {
610 // The machine still has a vote, so keep it in MachineIds for now.
611 newInfo.MachineIds = append(newInfo.MachineIds, mid)
612 } else {
613 // The machine neither wants to nor has a vote, so remove its
614 // JobManageEnviron job immediately.
615 ops = append(ops, removeStateServerOps(m)...)
616 }
617 }
618 return ops, newInfo, promotableMachines, nil
619}
620
621func promoteStateServerOps(m *Machine) []txn.Op {
622 return []txn.Op{{
623 C: m.st.machines.Name,
624 Id: m.doc.Id,
625 Assert: bson.D{{"novote", true}},
626 Update: bson.D{{"$set", bson.D{{"novote", false}}}},
627 }, {
628 C: m.st.stateServers.Name,
629 Id: environGlobalKey,
630 Update: bson.D{{"$addToSet", bson.D{{"votingmachineids", m.doc.Id}}}},
631 }}
632}
633
634func demoteStateServerOps(m *Machine) []txn.Op {
635 return []txn.Op{{
636 C: m.st.machines.Name,
637 Id: m.doc.Id,
638 Assert: bson.D{{"novote", false}},
639 Update: bson.D{{"$set", bson.D{{"novote", true}}}},
640 }, {
641 C: m.st.stateServers.Name,
642 Id: environGlobalKey,
643 Update: bson.D{{"$pull", bson.D{{"votingmachineids", m.doc.Id}}}},
644 }}
645}
646
647func removeStateServerOps(m *Machine) []txn.Op {
648 return []txn.Op{{
649 C: m.st.machines.Name,
650 Id: m.doc.Id,
651 Assert: bson.D{{"novote", true}, {"hasvote", false}},
652 Update: bson.D{
653 {"$pull", bson.D{{"jobs", JobManageEnviron}}},
654 {"$set", bson.D{{"novote", false}}},
655 },
656 }, {
657 C: m.st.stateServers.Name,
658 Id: environGlobalKey,
659 Update: bson.D{{"$pull", bson.D{{"machineids", m.doc.Id}}}},
660 }}
557}661}
558662
=== modified file 'state/apiserver/client/client_test.go'
--- state/apiserver/client/client_test.go 2014-04-11 15:16:00 +0000
+++ state/apiserver/client/client_test.go 2014-04-14 03:42:21 +0000
@@ -29,6 +29,7 @@
29 "launchpad.net/juju-core/state/api/params"29 "launchpad.net/juju-core/state/api/params"
30 "launchpad.net/juju-core/state/api/usermanager"30 "launchpad.net/juju-core/state/api/usermanager"
31 "launchpad.net/juju-core/state/apiserver/client"31 "launchpad.net/juju-core/state/apiserver/client"
32 "launchpad.net/juju-core/state/presence"
32 "launchpad.net/juju-core/state/statecmd"33 "launchpad.net/juju-core/state/statecmd"
33 coretesting "launchpad.net/juju-core/testing"34 coretesting "launchpad.net/juju-core/testing"
34 "launchpad.net/juju-core/utils"35 "launchpad.net/juju-core/utils"
@@ -2132,17 +2133,24 @@
2132 c.Assert(data["transient"], gc.Equals, true)2133 c.Assert(data["transient"], gc.Equals, true)
2133}2134}
21342135
2136func (s *clientSuite) setAgentAlive(c *gc.C, machineId string) *presence.Pinger {
2137 m, err := s.BackingState.Machine(machineId)
2138 c.Assert(err, gc.IsNil)
2139 pinger, err := m.SetAgentAlive()
2140 c.Assert(err, gc.IsNil)
2141 s.BackingState.StartSync()
2142 err = m.WaitAgentAlive(coretesting.LongWait)
2143 c.Assert(err, gc.IsNil)
2144 return pinger
2145}
2146
2135func (s *clientSuite) TestClientEnsureAvailabilitySeries(c *gc.C) {2147func (s *clientSuite) TestClientEnsureAvailabilitySeries(c *gc.C) {
2136 apiParams := []params.EnsureAvailability{{2148 err := s.APIState.Client().EnsureAvailability(1, constraints.Value{}, "")
2137 NumStateServers: 1,2149 c.Assert(err, gc.IsNil)
2138 }, {2150 pinger := s.setAgentAlive(c, "0")
2139 NumStateServers: 3,2151 defer pinger.Kill()
2140 Series: "non-default",2152 err = s.APIState.Client().EnsureAvailability(3, constraints.Value{}, "non-default")
2141 }}2153 c.Assert(err, gc.IsNil)
2142 for _, p := range apiParams {
2143 err := s.APIState.Client().EnsureAvailability(p.NumStateServers, p.Constraints, p.Series)
2144 c.Assert(err, gc.IsNil)
2145 }
2146 machines, err := s.State.AllMachines()2154 machines, err := s.State.AllMachines()
2147 c.Assert(err, gc.IsNil)2155 c.Assert(err, gc.IsNil)
2148 c.Assert(machines, gc.HasLen, 3)2156 c.Assert(machines, gc.HasLen, 3)
@@ -2152,16 +2160,12 @@
2152}2160}
21532161
2154func (s *clientSuite) TestClientEnsureAvailabilityConstraints(c *gc.C) {2162func (s *clientSuite) TestClientEnsureAvailabilityConstraints(c *gc.C) {
2155 apiParams := []params.EnsureAvailability{{2163 err := s.APIState.Client().EnsureAvailability(1, constraints.Value{}, "")
2156 NumStateServers: 1,2164 c.Assert(err, gc.IsNil)
2157 }, {2165 pinger := s.setAgentAlive(c, "0")
2158 NumStateServers: 3,2166 defer pinger.Kill()
2159 Constraints: constraints.MustParse("mem=4G"),2167 err = s.APIState.Client().EnsureAvailability(3, constraints.MustParse("mem=4G"), "")
2160 }}2168 c.Assert(err, gc.IsNil)
2161 for _, p := range apiParams {
2162 err := s.APIState.Client().EnsureAvailability(p.NumStateServers, p.Constraints, p.Series)
2163 c.Assert(err, gc.IsNil)
2164 }
2165 machines, err := s.State.AllMachines()2169 machines, err := s.State.AllMachines()
2166 c.Assert(err, gc.IsNil)2170 c.Assert(err, gc.IsNil)
2167 c.Assert(machines, gc.HasLen, 3)2171 c.Assert(machines, gc.HasLen, 3)
21682172
=== modified file 'state/export_test.go'
--- state/export_test.go 2014-03-25 15:06:31 +0000
+++ state/export_test.go 2014-04-14 03:42:21 +0000
@@ -259,3 +259,5 @@
259func CheckUserExists(st *State, name string) (bool, error) {259func CheckUserExists(st *State, name string) (bool, error) {
260 return st.checkUserExists(name)260 return st.checkUserExists(name)
261}261}
262
263var StateServerAvailable = &stateServerAvailable
262264
=== modified file 'state/state_test.go'
--- state/state_test.go 2014-04-10 14:40:34 +0000
+++ state/state_test.go 2014-04-14 03:42:21 +0000
@@ -6,7 +6,6 @@
6import (6import (
7 "fmt"7 "fmt"
8 "net/url"8 "net/url"
9 "sort"
10 "strconv"9 "strconv"
11 "strings"10 "strings"
12 "time"11 "time"
@@ -1735,6 +1734,10 @@
1735 VotingMachineIds: []string{"0"},1734 VotingMachineIds: []string{"0"},
1736 })1735 })
17371736
1737 s.PatchValue(state.StateServerAvailable, func(m *state.Machine) (bool, error) {
1738 return true, nil
1739 })
1740
1738 err = s.State.EnsureAvailability(3, constraints.Value{}, "quantal")1741 err = s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
1739 c.Assert(err, gc.IsNil)1742 c.Assert(err, gc.IsNil)
17401743
@@ -2889,6 +2892,11 @@
2889}2892}
28902893
2891func (s *StateSuite) TestEnsureAvailabilityAddsNewMachines(c *gc.C) {2894func (s *StateSuite) TestEnsureAvailabilityAddsNewMachines(c *gc.C) {
2895 // Don't use agent presence to decide on machine availability.
2896 s.PatchValue(state.StateServerAvailable, func(m *state.Machine) (bool, error) {
2897 return true, nil
2898 })
2899
2892 ids := make([]string, 3)2900 ids := make([]string, 3)
2893 m0, err := s.State.AddMachine("quantal", state.JobHostUnits, state.JobManageEnviron)2901 m0, err := s.State.AddMachine("quantal", state.JobHostUnits, state.JobManageEnviron)
2894 c.Assert(err, gc.IsNil)2902 c.Assert(err, gc.IsNil)
@@ -2898,12 +2906,7 @@
2898 _, err = s.State.AddMachine("quantal", state.JobHostUnits)2906 _, err = s.State.AddMachine("quantal", state.JobHostUnits)
2899 c.Assert(err, gc.IsNil)2907 c.Assert(err, gc.IsNil)
29002908
2901 info, err := s.State.StateServerInfo()2909 s.assertStateServerInfo(c, []string{"0"}, []string{"0"})
2902 c.Assert(err, gc.IsNil)
2903 c.Assert(info, gc.DeepEquals, &state.StateServerInfo{
2904 MachineIds: []string{m0.Id()},
2905 VotingMachineIds: []string{m0.Id()},
2906 })
29072910
2908 cons := constraints.Value{2911 cons := constraints.Value{
2909 Mem: newUint64(100),2912 Mem: newUint64(100),
@@ -2924,22 +2927,182 @@
2924 c.Assert(m.WantsVote(), jc.IsTrue)2927 c.Assert(m.WantsVote(), jc.IsTrue)
2925 ids[i] = m.Id()2928 ids[i] = m.Id()
2926 }2929 }
2927 sort.Strings(ids)2930 s.assertStateServerInfo(c, ids, ids)
2928
2929 info, err = s.State.StateServerInfo()
2930 c.Assert(err, gc.IsNil)
2931 sort.Strings(info.MachineIds)
2932 sort.Strings(info.VotingMachineIds)
2933 c.Assert(info, gc.DeepEquals, &state.StateServerInfo{
2934 MachineIds: ids,
2935 VotingMachineIds: ids,
2936 })
2937}2931}
29382932
2939func newUint64(i uint64) *uint64 {2933func newUint64(i uint64) *uint64 {
2940 return &i2934 return &i
2941}2935}
29422936
2937func (s *StateSuite) assertStateServerInfo(c *gc.C, machineIds []string, votingMachineIds []string) {
2938 info, err := s.State.StateServerInfo()
2939 c.Assert(err, gc.IsNil)
2940 c.Assert(info.MachineIds, jc.SameContents, machineIds)
2941 c.Assert(info.VotingMachineIds, jc.SameContents, votingMachineIds)
2942}
2943
2944func (s *StateSuite) TestEnsureAvailabilityDemotesUnavailableMachines(c *gc.C) {
2945 err := s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
2946 c.Assert(err, gc.IsNil)
2947 s.assertStateServerInfo(c, []string{"0", "1", "2"}, []string{"0", "1", "2"})
2948 s.PatchValue(state.StateServerAvailable, func(m *state.Machine) (bool, error) {
2949 return m.Id() != "0", nil
2950 })
2951 err = s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
2952 c.Assert(err, gc.IsNil)
2953
2954 // New state server machine "3" is created; "0" still exists in MachineIds,
2955 // but no longer in VotingMachineIds.
2956 s.assertStateServerInfo(c, []string{"0", "1", "2", "3"}, []string{"1", "2", "3"})
2957 m0, err := s.State.Machine("0")
2958 c.Assert(err, gc.IsNil)
2959 c.Assert(m0.WantsVote(), jc.IsFalse)
2960 c.Assert(m0.IsManager(), jc.IsTrue) // job still intact for now
2961 m3, err := s.State.Machine("3")
2962 c.Assert(err, gc.IsNil)
2963 c.Assert(m3.WantsVote(), jc.IsTrue)
2964 c.Assert(m3.IsManager(), jc.IsTrue)
2965}
2966
2967func (s *StateSuite) TestEnsureAvailabilityPromotesAvailableMachines(c *gc.C) {
2968 err := s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
2969 c.Assert(err, gc.IsNil)
2970 s.assertStateServerInfo(c, []string{"0", "1", "2"}, []string{"0", "1", "2"})
2971 s.PatchValue(state.StateServerAvailable, func(m *state.Machine) (bool, error) {
2972 return m.Id() != "0", nil
2973 })
2974 err = s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
2975 c.Assert(err, gc.IsNil)
2976
2977 // New state server machine "3" is created; "0" still exists in MachineIds,
2978 // but no longer in VotingMachineIds.
2979 s.assertStateServerInfo(c, []string{"0", "1", "2", "3"}, []string{"1", "2", "3"})
2980 m0, err := s.State.Machine("0")
2981 c.Assert(err, gc.IsNil)
2982 c.Assert(m0.WantsVote(), jc.IsFalse)
2983
2984 // Mark machine 0 as having a vote, so it doesn't get removed, and make it
2985 // available once more.
2986 err = m0.SetHasVote(true)
2987 c.Assert(err, gc.IsNil)
2988 s.PatchValue(state.StateServerAvailable, func(m *state.Machine) (bool, error) {
2989 return true, nil
2990 })
2991 err = s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
2992 c.Assert(err, gc.IsNil)
2993 // No change; we've got as many voting machines as we need.
2994 s.assertStateServerInfo(c, []string{"0", "1", "2", "3"}, []string{"1", "2", "3"})
2995
2996 // Make machine 3 unavailable; machine 0 should be promoted, and two new
2997 // machines created.
2998 s.PatchValue(state.StateServerAvailable, func(m *state.Machine) (bool, error) {
2999 return m.Id() != "3", nil
3000 })
3001 err = s.State.EnsureAvailability(5, constraints.Value{}, "quantal")
3002 c.Assert(err, gc.IsNil)
3003 s.assertStateServerInfo(c, []string{"0", "1", "2", "3", "4", "5"}, []string{"0", "1", "2", "4", "5"})
3004 err = m0.Refresh()
3005 c.Assert(err, gc.IsNil)
3006 c.Assert(m0.WantsVote(), jc.IsTrue)
3007}
3008
3009func (s *StateSuite) TestEnsureAvailabilityRemovesUnavailableMachines(c *gc.C) {
3010 err := s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
3011 c.Assert(err, gc.IsNil)
3012 s.assertStateServerInfo(c, []string{"0", "1", "2"}, []string{"0", "1", "2"})
3013 s.PatchValue(state.StateServerAvailable, func(m *state.Machine) (bool, error) {
3014 return m.Id() != "0", nil
3015 })
3016 err = s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
3017 c.Assert(err, gc.IsNil)
3018 s.assertStateServerInfo(c, []string{"0", "1", "2", "3"}, []string{"1", "2", "3"})
3019 // machine 0 does not have a vote, so another call to EnsureAvailability
3020 // will remove machine 0's JobEnvironManager job.
3021 err = s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
3022 c.Assert(err, gc.IsNil)
3023 s.assertStateServerInfo(c, []string{"1", "2", "3"}, []string{"1", "2", "3"})
3024 m0, err := s.State.Machine("0")
3025 c.Assert(err, gc.IsNil)
3026 c.Assert(m0.IsManager(), jc.IsFalse)
3027}
3028
3029func (s *StateSuite) TestEnsureAvailabilityConcurrentSame(c *gc.C) {
3030 s.PatchValue(state.StateServerAvailable, func(m *state.Machine) (bool, error) {
3031 return true, nil
3032 })
3033
3034 defer state.SetBeforeHooks(c, s.State, func() {
3035 err := s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
3036 c.Assert(err, gc.IsNil)
3037 // The outer EnsureAvailability call will allocate IDs 0..2,
3038 // and the inner one 3..5.
3039 expected := []string{"3", "4", "5"}
3040 s.assertStateServerInfo(c, expected, expected)
3041 }).Check()
3042
3043 err := s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
3044 c.Assert(err, gc.IsNil)
3045 s.assertStateServerInfo(c, []string{"3", "4", "5"}, []string{"3", "4", "5"})
3046
3047 // Machine 0 should never have been created.
3048 _, err = s.State.Machine("0")
3049 c.Assert(err, jc.Satisfies, errors.IsNotFoundError)
3050}
3051
3052func (s *StateSuite) TestEnsureAvailabilityConcurrentLess(c *gc.C) {
3053 s.PatchValue(state.StateServerAvailable, func(m *state.Machine) (bool, error) {
3054 return true, nil
3055 })
3056
3057 defer state.SetBeforeHooks(c, s.State, func() {
3058 err := s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
3059 c.Assert(err, gc.IsNil)
3060 // The outer EnsureAvailability call will initially allocate IDs 0..4,
3061 // and the inner one 5..7.
3062 expected := []string{"5", "6", "7"}
3063 s.assertStateServerInfo(c, expected, expected)
3064 }).Check()
3065
3066 // This call to EnsureAvailability will initially attempt to allocate
3067 // machines 0..4, and fail due to the concurrent change. It will then
3068 // allocate machines 8..9 to make up the difference from the concurrent
3069 // EnsureAvailability call.
3070 err := s.State.EnsureAvailability(5, constraints.Value{}, "quantal")
3071 c.Assert(err, gc.IsNil)
3072 expected := []string{"5", "6", "7", "8", "9"}
3073 s.assertStateServerInfo(c, expected, expected)
3074
3075 // Machine 0 should never have been created.
3076 _, err = s.State.Machine("0")
3077 c.Assert(err, jc.Satisfies, errors.IsNotFoundError)
3078}
3079
3080func (s *StateSuite) TestEnsureAvailabilityConcurrentMore(c *gc.C) {
3081 s.PatchValue(state.StateServerAvailable, func(m *state.Machine) (bool, error) {
3082 return true, nil
3083 })
3084
3085 defer state.SetBeforeHooks(c, s.State, func() {
3086 err := s.State.EnsureAvailability(5, constraints.Value{}, "quantal")
3087 c.Assert(err, gc.IsNil)
3088 // The outer EnsureAvailability call will allocate IDs 0..2,
3089 // and the inner one 3..7.
3090 expected := []string{"3", "4", "5", "6", "7"}
3091 s.assertStateServerInfo(c, expected, expected)
3092 }).Check()
3093
3094 // This call to EnsureAvailability will initially attempt to allocate
3095 // machines 0..2, and fail due to the concurrent change. It will then
3096 // find that the number of voting machines in state is greater than
3097 // what we're attempting to ensure, and fail.
3098 err := s.State.EnsureAvailability(3, constraints.Value{}, "quantal")
3099 c.Assert(err, gc.ErrorMatches, "cannot reduce state server count")
3100
3101 // Machine 0 should never have been created.
3102 _, err = s.State.Machine("0")
3103 c.Assert(err, jc.Satisfies, errors.IsNotFoundError)
3104}
3105
2943func (s *StateSuite) TestStateServingInfo(c *gc.C) {3106func (s *StateSuite) TestStateServingInfo(c *gc.C) {
2944 info, err := s.State.StateServingInfo()3107 info, err := s.State.StateServingInfo()
2945 c.Assert(info, jc.DeepEquals, params.StateServingInfo{})3108 c.Assert(info, jc.DeepEquals, params.StateServingInfo{})

Subscribers

People subscribed via source and target branches

to status/vote changes: