Merge lp:~thumper/juju-core/provisioner-step-1 into lp:~juju/juju-core/trunk
Status: Merged
Approved by: William Reade
Approved revision: no longer in the source branch.
Merged at revision: 1260
Proposed branch: lp:~thumper/juju-core/provisioner-step-1
Merge into: lp:~juju/juju-core/trunk
Diff against target: 948 lines (+490/-310), 5 files modified
  worker/provisioner/broker.go (+27/-0)
  worker/provisioner/environ_broker.go (+21/-0)
  worker/provisioner/provisioner.go (+29/-286)
  worker/provisioner/provisioner_task.go (+373/-0)
  worker/provisioner/provisioner_test.go (+40/-24)
To merge this branch: bzr merge lp:~thumper/juju-core/provisioner-step-1
Related bugs:
Reviewer | Review Type | Date Requested | Status
---|---|---|---
Juju Engineering | | | Pending
Review via email: mp+167684@code.launchpad.net
Commit message
Description of the change
First part of the provisioner refactoring.
This branch breaks up the current provisioner and defines some interfaces that
we'll be using for containers.
The broker interface is what we use for starting and stopping instances, and
listing instances and related machines.
An environment broker is written that defers the actual calls to the environ.
The provisioner now creates a provisioning task for the environment provider
using a machine watcher and the environment broker. The common provisioning
methods are now in provisioner_task.go.
I refactored some of the methods because we were duplicating a lot of calls,
like getting all instances and then getting instances for individual ids, or
looking up specific machines as well as getting all machines. Now we just do
the "all" gets and keep a map around for the duration of the checks.
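For reference, the Broker interface introduced by this branch, as it appears in worker/provisioner/broker.go in the preview diff below:

    package provisioner

    import (
        "launchpad.net/juju-core/constraints"
        "launchpad.net/juju-core/environs"
        "launchpad.net/juju-core/state"
        "launchpad.net/juju-core/state/api"
    )

    type Broker interface {
        // StartInstance asks for a new instance to be created, associated with
        // the provided machine identifier. The given info describes the juju
        // state for the new instance to connect to.
        StartInstance(machineId, machineNonce string, series string, cons constraints.Value,
            info *state.Info, apiInfo *api.Info) (environs.Instance, error)

        // StopInstances shuts down the given instances.
        StopInstances([]environs.Instance) error

        // AllInstances returns all instances currently known to the broker.
        AllInstances() ([]environs.Instance, error)
    }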
Tim Penhey (thumper) wrote:
Ian Booth (wallyworld) wrote:
I think you have essentially copied a bunch of methods from provisioner
to provisioner_task, so mainly a mechanical change. LGTM
https:/
File worker/provisioner/broker.go
https:/
worker/provisioner/broker.go: StopInstances([]environs.Instance) error
I think the signature here was done to match the existing Environs
StopInstances, but I'd rather see StopInstances() take just the ids of the
instances to stop. There's no need to pass in more info than is necessary and
callers typically should be designed to pass in just the ids. Both ec2 and
openstack just extract the ids anyway and maas can be tweaked as needed.
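A minimal sketch of the alternative wallyworld is suggesting (hypothetical; the interface name here is illustrative, and the branch as proposed keeps the []environs.Instance form shown in the diff):

    package provisioner

    import "launchpad.net/juju-core/state"

    // Hypothetical variant from the review: the broker is handed only the
    // instance ids to stop, rather than full environs.Instance values.
    type idOnlyBroker interface {
        StopInstances(ids []state.InstanceId) error
    }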
https:/
File worker/provisioner/environ_broker.go
https:/
worker/provisioner/environ_broker.go: func newEnvironmentBroker(environ environs.Environ) Broker {
newEnvironBroker I think is more idiomatic
https:/
File worker/provisioner/provisioner.go
https:/
worker/provisioner/provisioner.go: p.environmentProvisioner = newProvisionerTask(
environBroker perhaps, especially if newEnvironmentBroker is renamed.
Frank Mueller (themue) wrote:
Still interested in how the provisioning of containers on possibly each
machine, and there without an environBroker, will look. Or is this
provisioner only intended as an environment provisioner?
https:/
File worker/provisioner/broker.go
https:/
worker/provisioner/broker.go: StopInstances([]environs.Instance) error
On 2013/06/06 04:12:43, wallyworld wrote:
> I think the signature here was done to match the existing Environs
> StopInstances, but I'd rather see StopInstances() take just the ids of the
> instances to stop. There's no need to pass in more info than is necessary
> and callers typically should be designed to pass in just the ids. Both ec2
> and openstack just extract the ids anyway and maas can be tweaked as needed.
+1
William Reade (fwereade) wrote:
Only serious issue is the AllMachines stuff that I don't think we need.
I'll try to swing by tonight for a chat; if not, I'm fine calling
progress-
resolve it first.
https:/
File worker/provisioner/broker.go
https:/
worker/provisioner/broker.go: StartInstance(machineId, machineNonce string, series string, cons constraints.Value, info *state.Info, apiInfo *api.Info) (environs.Instance, error)
maybe that'll be an instance.Instance one day... but not today,
certainly.
https:/
worker/provisioner/broker.go: StopInstances([]environs.Instance) error
On 2013/06/06 07:30:30, mue wrote:
> On 2013/06/06 04:12:43, wallyworld wrote:
> > I think the signature here was done to match the existing Environs
> > StopInstances, but I'd rather see StopInstances() take just the ids of the
> > instances to stop. There's no need to pass in more info than is necessary
> > and callers typically should be designed to pass in just the ids. Both ec2
> > and openstack just extract the ids anyway and maas can be tweaked as needed.
> +1
+0.5, but I'd like to keep this CL focused.
https:/
worker/
Not quite sure about this.
https:/
File worker/provisioner/environ_broker.go
https:/
worker/
I think we can keep state out of here.
https:/
File worker/provisioner/provisioner.go
https:/
worker/
loop?
yeah, select on its Dying()?
https:/
worker/
defer a Stop just after creation, surely?
https:/
File worker/provisioner/provisioner_task.go
https:/
worker/provisioner/provisioner_task.go: // Call processMachines to stop any unknown instances before watching machines.
I *think* this whole block is redundant. The first event out of the
watcher will contain all relevant machine ids, and if we pass that
through to populateMachineMaps we can just use that... right?
https:/
worker/
machine ID.
I think this comment is wrong, but I'm still in driveby mode...
Roger Peppe (rogpeppe) wrote:
https:/
File worker/provisioner/broker.go
https:/
worker/provisioner/broker.go: StopInstances([]environs.Instance) error
On 2013/06/06 04:12:43, wallyworld wrote:
> I think the signature here was done to match the existing Environs
> StopInstances, but I'd rather see StopInstances() take just the ids of the
> instances to stop. There's no need to pass in more info than is necessary
> and callers typically should be designed to pass in just the ids. Both ec2
> and openstack just extract the ids anyway and maas can be tweaked as needed.
the reason the StopInstances call was designed the way it was is so you
cannot stop an instance that is not part of the current environment
(you have to call Environ.Instances to get the instance first, which can
make sure that the instance id corresponds to one in the environment).
perhaps it's not a problem to relax that restriction, but i think it's
worth pointing out.
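A rough sketch of the pattern rog describes, written against the Environ methods used elsewhere in this diff; the ErrPartialInstances handling and the nil-entry convention for unknown ids are assumptions about the environs package of the time:

    package provisioner

    import (
        "launchpad.net/juju-core/environs"
        "launchpad.net/juju-core/state"
    )

    // Sketch only: resolve the ids through the environment first, so an id that
    // does not belong to this environment can never be stopped by mistake.
    func stopByIds(env environs.Environ, ids []state.InstanceId) error {
        insts, err := env.Instances(ids)
        if err != nil && err != environs.ErrPartialInstances {
            return err
        }
        var known []environs.Instance
        for _, inst := range insts {
            if inst != nil { // nil marks ids the environment does not know about
                known = append(known, inst)
            }
        }
        return env.StopInstances(known)
    }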
Tim Penhey (thumper) wrote:
https:/
File worker/provisioner/broker.go
https:/
worker/provisioner/broker.go: StartInstance(machineId, machineNonce string, series string, cons constraints.Value, info *state.Info, apiInfo *api.Info) (environs.Instance, error)
On 2013/06/06 14:50:55, fwereade wrote:
> maybe that'll be an instance.Instance one day... but not today, certainly.
Perhaps *state.Machine? instance.Instance is for started instances, isn't it?
https:/
worker/provisioner/broker.go: StopInstances([]environs.Instance) error
On 2013/06/06 16:07:35, rog wrote:
> On 2013/06/06 04:12:43, wallyworld wrote:
> > I think the signature here was done to match the existing Environs
> > StopInstances, but I'd rather see StopInstances() take just the ids of the
> > instances to stop. There's no need to pass in more info than is necessary
> > and callers typically should be designed to pass in just the ids. Both ec2
> > and openstack just extract the ids anyway and maas can be tweaked as needed.
> the reason the StopInstances call was designed the way it was is so you
> cannot stop an instance that is not part of the current environment
> (you have to call Environ.Instances to get the instance first, which can make
> sure that the instance id corresponds to one in the environment).
> perhaps it's not a problem to relax that restriction, but i think it's worth
> pointing out.
Perhaps []state.InstanceId? this is a string type under the covers, but
better matches intent.
The provisioner has done the "lookup the instances" for the machine ids
it gets, so we know, at least from the provisioner side, that all of the
values being passed through should be valid.
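For context, a minimal sketch of what "a string type under the covers" amounts to (paraphrased; the real doc comment in the state package may differ):

    package state

    // InstanceId identifies a provider instance. As noted above, it is just a
    // string type under the covers, so using it in the broker signature only
    // documents intent; nothing changes at runtime.
    type InstanceId string

With that, the suggested broker signature would read StopInstances(ids []state.InstanceId), matching the sketch earlier in the thread, without changing what actually crosses to the provider.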
https:/
worker/
On 2013/06/06 14:50:55, fwereade wrote:
> Not quite sure about this.
I thought quite a bit about this, and ended up putting it here. The
provisioner task uses it when trying to identify the unknown instances.
It made sense to me that the broker would supply this.
Otherwise it is a global call to State, and we certainly don't care
about "AllMachines" when dealing with a container on machine 4.
https:/
File worker/provisioner/environ_broker.go
https:/
worker/provisioner/environ_broker.go: func newEnvironmentBroker(environ environs.Environ) Broker {
On 2013/06/06 04:12:43, wallyworld wrote:
> newEnvironBroker I think is more idiomatic
probably
https:/
worker/
On 2013/06/06 14:50:55, fwereade wrote:
> I think we can keep state out of here.
It is used only for the AllMachines call, which I commented on th...
Tim Penhey (thumper) wrote:
Please take a look.
Tim Penhey (thumper) wrote:
On 2013/06/07 03:44:19, thumper wrote:
> Please take a look.
FYI - I bootstrapped EC2 with this code; after getting the status below, I
destroyed the environment and was able to see all the instances shutting
down from the AWS console.
$ juju status
machines:
"0":
agent-state: started
agent-version: 1.11.1.1
dns-name: ec2-23-
instance-id: i-83b05aec
series: precise
"1":
agent-state: started
agent-version: 1.11.1.1
dns-name: ec2-54-
instance-id: i-55f5b63a
series: precise
"2":
agent-state: started
agent-version: 1.11.1.1
dns-name: ec2-54-
instance-id: i-d9f4b7b6
series: precise
services:
mysql:
charm: cs:precise/mysql-23
exposed: false
relations:
cluster:
- mysql
units:
mysql/0:
machine: "2"
wordpress:
charm: cs:precise/wordpress
exposed: false
relations:
- wordpress
units:
wordpress/0:
machine: "1"
William Reade (fwereade) wrote:
Huuuge pile of comments; the majority are not conveniently addressable in
this CL, but I would appreciate consideration of those that are. I'm
happy to LGTM on the understanding that you will give them due
consideration before merging.
https:/
File worker/provisioner/broker.go
https:/
worker/provisioner/broker.go: StartInstance(machineId, machineNonce string, series string, cons constraints.Value, info *state.Info, apiInfo *api.Info) (environs.Instance, error)
On 2013/06/06 22:14:53, thumper wrote:
> On 2013/06/06 14:50:55, fwereade wrote:
> > maybe that'll be an instance.Instance one day... but not today, certainly.
> Perhaps *state.Machine? instance.Instance is for started instances, isn't it?
There's a firm distinction between machines and instances: a machine
exists purely in state, and an instance is completely independent of
state. What I'm getting at is that Instance is going to stop being an
environs-specific concept, and should probably move to its own package
at some point.
https:/
worker/provisioner/broker.go: StopInstances([]environs.Instance) error
On 2013/06/06 22:14:53, thumper wrote:
> On 2013/06/06 16:07:35, rog wrote:
> > On 2013/06/06 04:12:43, wallyworld wrote:
> > > I think the signature here was done to match the existing Environs
> > > StopInstances, but I'd rather see StopInstances() take just the ids of
> > > the instances to stop. There's no need to pass in more info than is
> > > necessary and callers typically should be designed to pass in just the
> > > ids. Both ec2 and openstack just extract the ids anyway and maas can be
> > > tweaked as needed.
> >
> > the reason the StopInstances call was designed the way it was is so you
> > cannot stop an instance that is not part of the current environment
> > (you have to call Environ.Instances to get the instance first, which can
> > make sure that the instance id corresponds to one in the environment).
> >
> > perhaps it's not a problem to relax that restriction, but i think it's
> > worth pointing out.
> Perhaps []state.InstanceId? this is a string type under the covers, but
> better matches intent.
> The provisioner has done the "lookup the instances" for the machine ids it
> gets, so we know, at least from the provisioner side, that all of the values
> being passed through should be valid.
I'd be ok with InstanceId fwiw.
https:/
File worker/provisioner/provisioner_task.go
https:/
worker/
task.broker.
On 2013/06/06 22:14:53, thumper wrote:
> On 2013/06/06 14:50:55, fwereade wrote:
> > Yeah, there's no call for AllMachines() at all. We just need to pass the
> > list of changed machine ids down from the watcher, get st.Machine() for any
> > that are not already known (inefficient, s...
Tim Penhey (thumper) wrote:
*** Submitted:
First part of the provisioner refactoring.
This branch breaks up the current provisioner and defines some interfaces that
we'll be using for containers.
The broker interface is what we use for starting and stopping instances, and
listing instances and related machines.
An environment broker is written that defers the actual calls to the environ.
The provisioner now creates a provisioning task for the environment provider
using a machine watcher and the environment broker. The common provisioning
methods are now in provisioner_task.go.
I refactored some of the methods because we were duplicating a lot of calls,
like getting all instances and then getting instances for individual ids, or
looking up specific machines as well as getting all machines. Now we just do
the "all" gets and keep a map around for the duration of the checks.
R=wallyworld, mue, fwereade, rog
CC=
https://codereview.appspot.com/9937046/
https:/
File worker/
https:/
worker/
ProvisionerTask
On 2013/06/07 22:40:45, fwereade wrote:
> Only used in one method, no need to put it on the type.
Done.
https:/
worker/provisioner/provisioner.go: stateInfo, apiInfo, err := p.environ.StateInfo()
On 2013/06/07 22:40:45, fwereade wrote:
> Hmm. This should probably come from state instead (one day the API). It has
> Addresses() and APIAddresses() methods, and a CACert() method required by
> both, and you can build perfectly good *Infos from scratch inside the task
> so long as they have access to that information.
> And, it should definitely not be done just once in the lifetime of the
> provisioner: the set of addresses is theoretically subject to change and I'd
> prefer to get "fresh" data at least once for every group of machines we
> start.
> But for now it's moot wrt observable behaviour, so leave them out of this CL.
Should have a follow up to do this.
https:/
worker/provisioner/provisioner.go: defer p.environmentProvisioner.Stop()
On 2013/06/07 22:40:45, fwereade wrote:
> `defer watcher.Stop(...)`, so shutdown errors go
> into this type's tomb.
I had a look at this. I think we should move the method. I inferred by
the name that it expects a watcher, but it doesn't, just a Stopper.
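For reference, a paraphrased sketch of the state/watcher helper being discussed; treat the exact shape as an assumption, the point is just that it only needs something with a Stop method:

    package watcher

    import "launchpad.net/tomb"

    // Stopper is anything that can be stopped; a watcher qualifies, but so
    // does the provisioner task discussed here.
    type Stopper interface {
        Stop() error
    }

    // Stop stops the Stopper and funnels any error into the given tomb.
    func Stop(stopper Stopper, t *tomb.Tomb) {
        if err := stopper.Stop(); err != nil {
            t.Kill(err)
        }
    }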
https:/
worker/provisioner/provisioner.go: case cfg, ok := <-environWatcher.Changes():
On 2013/06/07 22:40:45, fwereade wrote:
> Soon (ie before we write an API that poos this whole thing over the wire
> every time) I'd like to make this an EntityWatcher, and to get EnvironConfig
> directly and explicitly in response to the empty notifications. Not this CL
> though.
poos this whole thing? Sounds messy.
Preview Diff
1 | === added file 'worker/provisioner/broker.go' |
2 | --- worker/provisioner/broker.go 1970-01-01 00:00:00 +0000 |
3 | +++ worker/provisioner/broker.go 2013-06-07 03:44:25 +0000 |
4 | @@ -0,0 +1,27 @@ |
5 | +// Copyright 2013 Canonical Ltd. |
6 | +// Licensed under the AGPLv3, see LICENCE file for details. |
7 | + |
8 | +package provisioner |
9 | + |
10 | +import ( |
11 | + "launchpad.net/juju-core/constraints" |
12 | + "launchpad.net/juju-core/environs" |
13 | + "launchpad.net/juju-core/state" |
14 | + "launchpad.net/juju-core/state/api" |
15 | +) |
16 | + |
17 | +type Broker interface { |
18 | + // StartInstance asks for a new instance to be created, associated with |
19 | + // the provided machine identifier. The given info describes the juju |
20 | + // state for the new instance to connect to. The nonce, which must be |
21 | + // unique within an environment, is used by juju to protect against the |
22 | + // consequences of multiple instances being started with the same machine |
23 | + // id. |
24 | + StartInstance(machineId, machineNonce string, series string, cons constraints.Value, info *state.Info, apiInfo *api.Info) (environs.Instance, error) |
25 | + |
26 | + // StopInstances shuts down the given instances. |
27 | + StopInstances([]environs.Instance) error |
28 | + |
29 | + // AllInstances returns all instances currently known to the broker. |
30 | + AllInstances() ([]environs.Instance, error) |
31 | +} |
32 | |
33 | === added file 'worker/provisioner/environ_broker.go' |
34 | --- worker/provisioner/environ_broker.go 1970-01-01 00:00:00 +0000 |
35 | +++ worker/provisioner/environ_broker.go 2013-06-07 03:44:25 +0000 |
36 | @@ -0,0 +1,21 @@ |
37 | +// Copyright 2013 Canonical Ltd. |
38 | +// Licensed under the AGPLv3, see LICENCE file for details. |
39 | + |
40 | +package provisioner |
41 | + |
42 | +import ( |
43 | + "launchpad.net/juju-core/environs" |
44 | +) |
45 | + |
46 | +func newEnvironBroker(environ environs.Environ) Broker { |
47 | + return &environBroker{environ} |
48 | +} |
49 | + |
50 | +type environBroker struct { |
51 | + environs.Environ |
52 | +} |
53 | + |
54 | +// Defer to the Environ for: |
55 | +// StartInstance |
56 | +// StopInstances |
57 | +// AllInstances |
58 | |
59 | === modified file 'worker/provisioner/provisioner.go' |
60 | --- worker/provisioner/provisioner.go 2013-05-31 01:02:53 +0000 |
61 | +++ worker/provisioner/provisioner.go 2013-06-07 03:44:25 +0000 |
62 | @@ -4,35 +4,26 @@ |
63 | package provisioner |
64 | |
65 | import ( |
66 | - stderrors "errors" |
67 | - "fmt" |
68 | + "sync" |
69 | + |
70 | "launchpad.net/juju-core/environs" |
71 | "launchpad.net/juju-core/environs/config" |
72 | - "launchpad.net/juju-core/errors" |
73 | - "launchpad.net/juju-core/log" |
74 | "launchpad.net/juju-core/state" |
75 | - "launchpad.net/juju-core/state/api" |
76 | - "launchpad.net/juju-core/state/api/params" |
77 | "launchpad.net/juju-core/state/watcher" |
78 | - "launchpad.net/juju-core/utils" |
79 | "launchpad.net/juju-core/worker" |
80 | + "launchpad.net/loggo" |
81 | "launchpad.net/tomb" |
82 | - "sync" |
83 | ) |
84 | |
85 | +var logger = loggo.GetLogger("juju.provisioner") |
86 | + |
87 | // Provisioner represents a running provisioning worker. |
88 | type Provisioner struct { |
89 | - st *state.State |
90 | - machineId string // Which machine runs the provisioner. |
91 | - stateInfo *state.Info |
92 | - apiInfo *api.Info |
93 | - environ environs.Environ |
94 | - tomb tomb.Tomb |
95 | - |
96 | - // machine.Id => environs.Instance |
97 | - instances map[string]environs.Instance |
98 | - // instance.Id => machine id |
99 | - machines map[state.InstanceId]string |
100 | + st *state.State |
101 | + machineId string // Which machine runs the provisioner. |
102 | + environ environs.Environ |
103 | + tomb tomb.Tomb |
104 | + environmentProvisioner ProvisionerTask |
105 | |
106 | configObserver |
107 | } |
108 | @@ -58,8 +49,6 @@ |
109 | p := &Provisioner{ |
110 | st: st, |
111 | machineId: machineId, |
112 | - instances: make(map[string]environs.Instance), |
113 | - machines: make(map[state.InstanceId]string), |
114 | } |
115 | go func() { |
116 | defer p.tomb.Done() |
117 | @@ -81,19 +70,26 @@ |
118 | // Get a new StateInfo from the environment: the one used to |
119 | // launch the agent may refer to localhost, which will be |
120 | // unhelpful when attempting to run an agent on a new machine. |
121 | - if p.stateInfo, p.apiInfo, err = p.environ.StateInfo(); err != nil { |
122 | + stateInfo, apiInfo, err := p.environ.StateInfo() |
123 | + if err != nil { |
124 | return err |
125 | } |
126 | |
127 | - // Call processMachines to stop any unknown instances before watching machines. |
128 | - if err := p.processMachines(nil); err != nil { |
129 | - return err |
130 | - } |
131 | + // Start a new worker for the environment provider. |
132 | |
133 | // Start responding to changes in machines, and to any further updates |
134 | // to the environment config. |
135 | machinesWatcher := p.st.WatchMachines() |
136 | - defer watcher.Stop(machinesWatcher, &p.tomb) |
137 | + environmentBroker := newEnvironBroker(p.environ) |
138 | + p.environmentProvisioner = newProvisionerTask( |
139 | + p.machineId, |
140 | + p.st, |
141 | + machinesWatcher, |
142 | + environmentBroker, |
143 | + stateInfo, |
144 | + apiInfo) |
145 | + defer p.environmentProvisioner.Stop() |
146 | + |
147 | for { |
148 | select { |
149 | case <-p.tomb.Dying(): |
150 | @@ -103,17 +99,12 @@ |
151 | return watcher.MustErr(environWatcher) |
152 | } |
153 | if err := p.setConfig(cfg); err != nil { |
154 | - log.Errorf("worker/provisioner: loaded invalid environment configuration: %v", err) |
155 | - } |
156 | - case ids, ok := <-machinesWatcher.Changes(): |
157 | - if !ok { |
158 | - return watcher.MustErr(machinesWatcher) |
159 | - } |
160 | - // TODO(dfc; lp:1042717) fire process machines periodically to shut down unknown |
161 | - // instances. |
162 | - if err := p.processMachines(ids); err != nil { |
163 | - return err |
164 | - } |
165 | + logger.Error("loaded invalid environment configuration: %v", err) |
166 | + } |
167 | + case <-p.environmentProvisioner.Dying(): |
168 | + err := p.environmentProvisioner.Err() |
169 | + logger.Error("environment provisioner died: %v", err) |
170 | + return err |
171 | } |
172 | } |
173 | panic("not reached") |
174 | @@ -155,251 +146,3 @@ |
175 | p.tomb.Kill(nil) |
176 | return p.tomb.Wait() |
177 | } |
178 | - |
179 | -func (p *Provisioner) processMachines(ids []string) error { |
180 | - // Find machines without an instance id or that are dead |
181 | - pending, dead, err := p.pendingOrDead(ids) |
182 | - if err != nil { |
183 | - return err |
184 | - } |
185 | - |
186 | - // Find running instances that have no machines associated |
187 | - unknown, err := p.findUnknownInstances() |
188 | - if err != nil { |
189 | - return err |
190 | - } |
191 | - |
192 | - // Stop all machines that are dead |
193 | - stopping, err := p.instancesForMachines(dead) |
194 | - if err != nil { |
195 | - return err |
196 | - } |
197 | - |
198 | - // It's important that we stop unknown instances before starting |
199 | - // pending ones, because if we start an instance and then fail to |
200 | - // set its InstanceId on the machine we don't want to start a new |
201 | - // instance for the same machine ID. |
202 | - if err := p.stopInstances(append(stopping, unknown...)); err != nil { |
203 | - return err |
204 | - } |
205 | - |
206 | - // Start an instance for the pending ones |
207 | - return p.startMachines(pending) |
208 | -} |
209 | - |
210 | -// findUnknownInstances finds instances which are not associated with a machine. |
211 | -func (p *Provisioner) findUnknownInstances() ([]environs.Instance, error) { |
212 | - all, err := p.environ.AllInstances() |
213 | - if err != nil { |
214 | - return nil, err |
215 | - } |
216 | - instances := make(map[state.InstanceId]environs.Instance) |
217 | - for _, i := range all { |
218 | - instances[i.Id()] = i |
219 | - } |
220 | - // TODO(dfc) this is very inefficient. |
221 | - machines, err := p.st.AllMachines() |
222 | - if err != nil { |
223 | - return nil, err |
224 | - } |
225 | - for _, m := range machines { |
226 | - if instId, ok := m.InstanceId(); ok { |
227 | - delete(instances, instId) |
228 | - } |
229 | - } |
230 | - var unknown []environs.Instance |
231 | - for _, i := range instances { |
232 | - unknown = append(unknown, i) |
233 | - } |
234 | - return unknown, nil |
235 | -} |
236 | - |
237 | -// pendingOrDead looks up machines with ids and retuns those that do not |
238 | -// have an instance id assigned yet, and also those that are dead. |
239 | -func (p *Provisioner) pendingOrDead(ids []string) (pending, dead []*state.Machine, err error) { |
240 | - // TODO(niemeyer): ms, err := st.Machines(alive) |
241 | - for _, id := range ids { |
242 | - m, err := p.st.Machine(id) |
243 | - if errors.IsNotFoundError(err) { |
244 | - log.Infof("worker/provisioner: machine %q not found in state", m) |
245 | - continue |
246 | - } |
247 | - if err != nil { |
248 | - return nil, nil, err |
249 | - } |
250 | - switch m.Life() { |
251 | - case state.Dying: |
252 | - if _, ok := m.InstanceId(); ok { |
253 | - continue |
254 | - } |
255 | - log.Infof("worker/provisioner: killing dying, unprovisioned machine %q", m) |
256 | - if err := m.EnsureDead(); err != nil { |
257 | - return nil, nil, err |
258 | - } |
259 | - fallthrough |
260 | - case state.Dead: |
261 | - dead = append(dead, m) |
262 | - log.Infof("worker/provisioner: removing dead machine %q", m) |
263 | - if err := m.Remove(); err != nil { |
264 | - return nil, nil, err |
265 | - } |
266 | - continue |
267 | - } |
268 | - if instId, hasInstId := m.InstanceId(); !hasInstId { |
269 | - status, _, err := m.Status() |
270 | - if err != nil { |
271 | - log.Infof("worker/provisioner: cannot get machine %q status: %v", m, err) |
272 | - continue |
273 | - } |
274 | - if status == params.StatusPending { |
275 | - pending = append(pending, m) |
276 | - log.Infof("worker/provisioner: found machine %q pending provisioning", m) |
277 | - continue |
278 | - } |
279 | - } else { |
280 | - log.Infof("worker/provisioner: machine %v already started as instance %q", m, instId) |
281 | - } |
282 | - } |
283 | - return |
284 | -} |
285 | - |
286 | -func (p *Provisioner) startMachines(machines []*state.Machine) error { |
287 | - for _, m := range machines { |
288 | - if err := p.startMachine(m); err != nil { |
289 | - return fmt.Errorf("cannot start machine %v: %v", m, err) |
290 | - } |
291 | - } |
292 | - return nil |
293 | -} |
294 | - |
295 | -func (p *Provisioner) startMachine(m *state.Machine) error { |
296 | - // TODO(dfc) the state.Info passed to environ.StartInstance remains contentious |
297 | - // however as the PA only knows one state.Info, and that info is used by MAs and |
298 | - // UAs to locate the state for this environment, it is logical to use the same |
299 | - // state.Info as the PA. |
300 | - stateInfo, apiInfo, err := p.setupAuthentication(m) |
301 | - if err != nil { |
302 | - return err |
303 | - } |
304 | - cons, err := m.Constraints() |
305 | - if err != nil { |
306 | - return err |
307 | - } |
308 | - // Generate a unique nonce for the new instance. |
309 | - uuid, err := utils.NewUUID() |
310 | - if err != nil { |
311 | - return err |
312 | - } |
313 | - // Generated nonce has the format: "machine-#:UUID". The first |
314 | - // part is a badge, specifying the tag of the machine the provisioner |
315 | - // is running on, while the second part is a random UUID. |
316 | - nonce := fmt.Sprintf("%s:%s", state.MachineTag(p.machineId), uuid.String()) |
317 | - inst, err := p.environ.StartInstance(m.Id(), nonce, m.Series(), cons, stateInfo, apiInfo) |
318 | - if err != nil { |
319 | - // Set the state to error, so the machine will be skipped next |
320 | - // time until the error is resolved, but don't return an |
321 | - // error; just keep going with the other machines. |
322 | - log.Errorf("worker/provisioner: cannot start instance for machine %q: %v", m, err) |
323 | - if err1 := m.SetStatus(params.StatusError, err.Error()); err1 != nil { |
324 | - // Something is wrong with this machine, better report it back. |
325 | - log.Errorf("worker/provisioner: cannot set error status for machine %q: %v", m, err1) |
326 | - return err1 |
327 | - } |
328 | - return nil |
329 | - } |
330 | - if err := m.SetProvisioned(inst.Id(), nonce); err != nil { |
331 | - // The machine is started, but we can't record the mapping in |
332 | - // state. It'll keep running while we fail out and restart, |
333 | - // but will then be detected by findUnknownInstances and |
334 | - // killed again. |
335 | - // |
336 | - // TODO(dimitern) Stop the instance right away here. |
337 | - // |
338 | - // Multiple instantiations of a given machine (with the same |
339 | - // machine ID) cannot coexist, because findUnknownInstances is |
340 | - // called before startMachines. However, if the first machine |
341 | - // had started to do work before being replaced, we may |
342 | - // encounter surprising problems. |
343 | - return err |
344 | - } |
345 | - // populate the local cache |
346 | - p.instances[m.Id()] = inst |
347 | - p.machines[inst.Id()] = m.Id() |
348 | - log.Noticef("worker/provisioner: started machine %s as instance %s", m, inst.Id()) |
349 | - return nil |
350 | -} |
351 | - |
352 | -func (p *Provisioner) setupAuthentication(m *state.Machine) (*state.Info, *api.Info, error) { |
353 | - password, err := utils.RandomPassword() |
354 | - if err != nil { |
355 | - return nil, nil, fmt.Errorf("cannot make password for machine %v: %v", m, err) |
356 | - } |
357 | - if err := m.SetMongoPassword(password); err != nil { |
358 | - return nil, nil, fmt.Errorf("cannot set password for machine %v: %v", m, err) |
359 | - } |
360 | - stateInfo := *p.stateInfo |
361 | - stateInfo.Tag = m.Tag() |
362 | - stateInfo.Password = password |
363 | - apiInfo := *p.apiInfo |
364 | - apiInfo.Tag = m.Tag() |
365 | - apiInfo.Password = password |
366 | - return &stateInfo, &apiInfo, nil |
367 | -} |
368 | - |
369 | -func (p *Provisioner) stopInstances(instances []environs.Instance) error { |
370 | - // Although calling StopInstance with an empty slice should produce no change in the |
371 | - // provider, environs like dummy do not consider this a noop. |
372 | - if len(instances) == 0 { |
373 | - return nil |
374 | - } |
375 | - if err := p.environ.StopInstances(instances); err != nil { |
376 | - return err |
377 | - } |
378 | - |
379 | - // cleanup cache |
380 | - for _, i := range instances { |
381 | - if id, ok := p.machines[i.Id()]; ok { |
382 | - delete(p.machines, i.Id()) |
383 | - delete(p.instances, id) |
384 | - } |
385 | - } |
386 | - return nil |
387 | -} |
388 | - |
389 | -var errNotProvisioned = stderrors.New("machine has no instance id set") |
390 | - |
391 | -// instanceForMachine returns the environs.Instance that represents this machine's instance. |
392 | -func (p *Provisioner) instanceForMachine(m *state.Machine) (environs.Instance, error) { |
393 | - inst, ok := p.instances[m.Id()] |
394 | - if ok { |
395 | - return inst, nil |
396 | - } |
397 | - instId, ok := m.InstanceId() |
398 | - if !ok { |
399 | - return nil, errNotProvisioned |
400 | - } |
401 | - // TODO(dfc): Ask for all instances at once. |
402 | - insts, err := p.environ.Instances([]state.InstanceId{instId}) |
403 | - if err != nil { |
404 | - return nil, err |
405 | - } |
406 | - inst = insts[0] |
407 | - return inst, nil |
408 | -} |
409 | - |
410 | -// instancesForMachines returns a list of environs.Instance that represent |
411 | -// the list of machines running in the provider. Missing machines are |
412 | -// omitted from the list. |
413 | -func (p *Provisioner) instancesForMachines(ms []*state.Machine) ([]environs.Instance, error) { |
414 | - var insts []environs.Instance |
415 | - for _, m := range ms { |
416 | - switch inst, err := p.instanceForMachine(m); err { |
417 | - case nil: |
418 | - insts = append(insts, inst) |
419 | - case errNotProvisioned, environs.ErrNoInstances: |
420 | - default: |
421 | - return nil, err |
422 | - } |
423 | - } |
424 | - return insts, nil |
425 | -} |
426 | |
427 | === added file 'worker/provisioner/provisioner_task.go' |
428 | --- worker/provisioner/provisioner_task.go 1970-01-01 00:00:00 +0000 |
429 | +++ worker/provisioner/provisioner_task.go 2013-06-07 03:44:25 +0000 |
430 | @@ -0,0 +1,373 @@ |
431 | +// Copyright 2012, 2013 Canonical Ltd. |
432 | +// Licensed under the AGPLv3, see LICENCE file for details. |
433 | + |
434 | +package provisioner |
435 | + |
436 | +import ( |
437 | + "fmt" |
438 | + |
439 | + "launchpad.net/juju-core/environs" |
440 | + "launchpad.net/juju-core/errors" |
441 | + "launchpad.net/juju-core/state" |
442 | + "launchpad.net/juju-core/state/api" |
443 | + "launchpad.net/juju-core/state/api/params" |
444 | + "launchpad.net/juju-core/state/watcher" |
445 | + "launchpad.net/juju-core/utils" |
446 | + "launchpad.net/juju-core/worker" |
447 | + "launchpad.net/tomb" |
448 | +) |
449 | + |
450 | +type ProvisionerTask interface { |
451 | + worker.Worker |
452 | + Stop() error |
453 | + Dying() <-chan struct{} |
454 | + Err() error |
455 | +} |
456 | + |
457 | +type Watcher interface { |
458 | + watcher.Errer |
459 | + watcher.Stopper |
460 | + Changes() <-chan []string |
461 | +} |
462 | + |
463 | +type MachineGetter interface { |
464 | + Machine(id string) (*state.Machine, error) |
465 | +} |
466 | + |
467 | +func newProvisionerTask( |
468 | + machineId string, |
469 | + machineGetter MachineGetter, |
470 | + watcher Watcher, |
471 | + broker Broker, |
472 | + stateInfo *state.Info, |
473 | + apiInfo *api.Info, |
474 | +) ProvisionerTask { |
475 | + task := &provisionerTask{ |
476 | + machineId: machineId, |
477 | + machineGetter: machineGetter, |
478 | + machineWatcher: watcher, |
479 | + broker: broker, |
480 | + stateInfo: stateInfo, |
481 | + apiInfo: apiInfo, |
482 | + machines: make(map[string]*state.Machine), |
483 | + } |
484 | + go func() { |
485 | + defer task.tomb.Done() |
486 | + task.tomb.Kill(task.loop()) |
487 | + }() |
488 | + return task |
489 | +} |
490 | + |
491 | +type provisionerTask struct { |
492 | + machineId string |
493 | + machineGetter MachineGetter |
494 | + machineWatcher Watcher |
495 | + broker Broker |
496 | + tomb tomb.Tomb |
497 | + stateInfo *state.Info |
498 | + apiInfo *api.Info |
499 | + |
500 | + // instance id -> instance |
501 | + instances map[state.InstanceId]environs.Instance |
502 | + // machine id -> machine |
503 | + machines map[string]*state.Machine |
504 | +} |
505 | + |
506 | +// Kill implements worker.Worker.Kill. |
507 | +func (task *provisionerTask) Kill() { |
508 | + task.tomb.Kill(nil) |
509 | +} |
510 | + |
511 | +// Wait implements worker.Worker.Wait. |
512 | +func (task *provisionerTask) Wait() error { |
513 | + return task.tomb.Wait() |
514 | +} |
515 | + |
516 | +func (task *provisionerTask) Stop() error { |
517 | + task.Kill() |
518 | + return task.Wait() |
519 | +} |
520 | + |
521 | +func (task *provisionerTask) Dying() <-chan struct{} { |
522 | + return task.tomb.Dying() |
523 | +} |
524 | + |
525 | +func (task *provisionerTask) Err() error { |
526 | + return task.tomb.Err() |
527 | +} |
528 | + |
529 | +func (task *provisionerTask) loop() error { |
530 | + logger.Info("Starting up provisioner task %s", task.machineId) |
531 | + defer watcher.Stop(task.machineWatcher, &task.tomb) |
532 | + |
533 | + // When the watcher is started, it will have the initial changes be all |
534 | + // the machines that are relevant. Also, since this is available straight |
535 | + // away, we know there will be some changes right off the bat. |
536 | + for { |
537 | + select { |
538 | + case <-task.tomb.Dying(): |
539 | + logger.Info("Shutting down provisioner task %s", task.machineId) |
540 | + return tomb.ErrDying |
541 | + case ids, ok := <-task.machineWatcher.Changes(): |
542 | + if !ok { |
543 | + return watcher.MustErr(task.machineWatcher) |
544 | + } |
545 | + // TODO(dfc; lp:1042717) fire process machines periodically to shut down unknown |
546 | + // instances. |
547 | + if err := task.processMachines(ids); err != nil { |
548 | + logger.Error("Process machines failed: %v", err) |
549 | + return err |
550 | + } |
551 | + } |
552 | + } |
553 | + panic("not reached") |
554 | +} |
555 | + |
556 | +func (task *provisionerTask) processMachines(ids []string) error { |
557 | + logger.Trace("processMachines(%v)", ids) |
558 | + // Populate the tasks maps of current instances and machines. |
559 | + err := task.populateMachineMaps(ids) |
560 | + if err != nil { |
561 | + return err |
562 | + } |
563 | + |
564 | + // Find machines without an instance id or that are dead |
565 | + pending, dead, err := task.pendingOrDead(ids) |
566 | + if err != nil { |
567 | + return err |
568 | + } |
569 | + |
570 | + // Find running instances that have no machines associated |
571 | + unknown, err := task.findUnknownInstances() |
572 | + if err != nil { |
573 | + return err |
574 | + } |
575 | + |
576 | + // Stop all machines that are dead |
577 | + stopping := task.instancesForMachines(dead) |
578 | + |
579 | + // It's important that we stop unknown instances before starting |
580 | + // pending ones, because if we start an instance and then fail to |
581 | + // set its InstanceId on the machine we don't want to start a new |
582 | + // instance for the same machine ID. |
583 | + if err := task.stopInstances(append(stopping, unknown...)); err != nil { |
584 | + return err |
585 | + } |
586 | + |
587 | + // Start an instance for the pending ones |
588 | + return task.startMachines(pending) |
589 | +} |
590 | + |
591 | +func (task *provisionerTask) populateMachineMaps(ids []string) error { |
592 | + task.instances = make(map[state.InstanceId]environs.Instance) |
593 | + |
594 | + instances, err := task.broker.AllInstances() |
595 | + if err != nil { |
596 | + logger.Error("failed to get all instances from broker: %v", err) |
597 | + return err |
598 | + } |
599 | + for _, i := range instances { |
600 | + task.instances[i.Id()] = i |
601 | + } |
602 | + |
603 | + // Update the machines map with new data for each of the machines in the |
604 | + // change list. |
605 | + // TODO(thumper): update for API server later to get all machines in one go. |
606 | + for _, id := range ids { |
607 | + machine, err := task.machineGetter.Machine(id) |
608 | + switch { |
609 | + case errors.IsNotFoundError(err): |
610 | + logger.Debug("machine %q not found in state", id) |
611 | + delete(task.machines, id) |
612 | + case err == nil: |
613 | + task.machines[id] = machine |
614 | + default: |
615 | + logger.Error("failed to get machine: %v", err) |
616 | + } |
617 | + } |
618 | + return nil |
619 | +} |
620 | + |
621 | +// pendingOrDead looks up machines with ids and retuns those that do not |
622 | +// have an instance id assigned yet, and also those that are dead. |
623 | +func (task *provisionerTask) pendingOrDead(ids []string) (pending, dead []*state.Machine, err error) { |
624 | + for _, id := range ids { |
625 | + machine, found := task.machines[id] |
626 | + if !found { |
627 | + logger.Info("machine %q not found", id) |
628 | + continue |
629 | + } |
630 | + switch machine.Life() { |
631 | + case state.Dying: |
632 | + if _, ok := machine.InstanceId(); ok { |
633 | + continue |
634 | + } |
635 | + logger.Info("killing dying, unprovisioned machine %q", machine) |
636 | + if err := machine.EnsureDead(); err != nil { |
637 | + logger.Error("failed to ensure machine dead %q: %v", machine, err) |
638 | + return nil, nil, err |
639 | + } |
640 | + fallthrough |
641 | + case state.Dead: |
642 | + dead = append(dead, machine) |
643 | + logger.Info("removing dead machine %q", machine) |
644 | + if err := machine.Remove(); err != nil { |
645 | + logger.Error("failed to remove dead machine %q", machine) |
646 | + return nil, nil, err |
647 | + } |
648 | + continue |
649 | + } |
650 | + if instId, hasInstId := machine.InstanceId(); !hasInstId { |
651 | + status, _, err := machine.Status() |
652 | + if err != nil { |
653 | + logger.Info("cannot get machine %q status: %v", machine, err) |
654 | + continue |
655 | + } |
656 | + if status == params.StatusPending { |
657 | + pending = append(pending, machine) |
658 | + logger.Info("found machine %q pending provisioning", machine) |
659 | + continue |
660 | + } |
661 | + } else { |
662 | + logger.Info("machine %v already started as instance %q", machine, instId) |
663 | + } |
664 | + } |
665 | + return |
666 | +} |
667 | + |
668 | +// findUnknownInstances finds instances which are not associated with a machine. |
669 | +func (task *provisionerTask) findUnknownInstances() ([]environs.Instance, error) { |
670 | + // Make a copy of the instances we know about. |
671 | + instances := make(map[state.InstanceId]environs.Instance) |
672 | + for k, v := range task.instances { |
673 | + instances[k] = v |
674 | + } |
675 | + |
676 | + for _, m := range task.machines { |
677 | + if instId, ok := m.InstanceId(); ok { |
678 | + delete(instances, instId) |
679 | + } |
680 | + } |
681 | + var unknown []environs.Instance |
682 | + for _, i := range instances { |
683 | + unknown = append(unknown, i) |
684 | + } |
685 | + logger.Trace("unknown: %v", unknown) |
686 | + return unknown, nil |
687 | +} |
688 | + |
689 | +// instancesForMachines returns a list of environs.Instance that represent |
690 | +// the list of machines running in the provider. Missing machines are |
691 | +// omitted from the list. |
692 | +func (task *provisionerTask) instancesForMachines(machines []*state.Machine) []environs.Instance { |
693 | + var instances []environs.Instance |
694 | + for _, machine := range machines { |
695 | + instId, ok := machine.InstanceId() |
696 | + if ok { |
697 | + instance, found := task.instances[instId] |
698 | + // If the instance is not found, it means that the underlying |
699 | + // instance is already dead, and we don't need to stop it. |
700 | + if found { |
701 | + instances = append(instances, instance) |
702 | + // now remove it from the machines map |
703 | + delete(task.machines, machine.Id()) |
704 | + } |
705 | + } |
706 | + } |
707 | + return instances |
708 | +} |
709 | + |
710 | +func (task *provisionerTask) stopInstances(instances []environs.Instance) error { |
711 | + // Although calling StopInstance with an empty slice should produce no change in the |
712 | + // provider, environs like dummy do not consider this a noop. |
713 | + if len(instances) == 0 { |
714 | + return nil |
715 | + } |
716 | + logger.Debug("Stopping instances: %v", instances) |
717 | + if err := task.broker.StopInstances(instances); err != nil { |
718 | + logger.Error("broker failed to stop instances: %v", err) |
719 | + return err |
720 | + } |
721 | + return nil |
722 | +} |
723 | + |
724 | +func (task *provisionerTask) startMachines(machines []*state.Machine) error { |
725 | + for _, m := range machines { |
726 | + if err := task.startMachine(m); err != nil { |
727 | + return fmt.Errorf("cannot start machine %v: %v", m, err) |
728 | + } |
729 | + } |
730 | + return nil |
731 | +} |
732 | + |
733 | +func (task *provisionerTask) startMachine(machine *state.Machine) error { |
734 | + // TODO(dfc) the state.Info passed to environ.StartInstance remains contentious |
735 | + // however as the PA only knows one state.Info, and that info is used by MAs and |
736 | + // UAs to locate the state for this environment, it is logical to use the same |
737 | + // state.Info as the PA. |
738 | + stateInfo, apiInfo, err := task.setupAuthentication(machine) |
739 | + if err != nil { |
740 | + logger.Error("failed to setup authentication: %v", err) |
741 | + return err |
742 | + } |
743 | + cons, err := machine.Constraints() |
744 | + if err != nil { |
745 | + return err |
746 | + } |
747 | + // Generate a unique nonce for the new instance. |
748 | + uuid, err := utils.NewUUID() |
749 | + if err != nil { |
750 | + return err |
751 | + } |
752 | + // Generated nonce has the format: "machine-#:UUID". The first |
753 | + // part is a badge, specifying the tag of the machine the provisioner |
754 | + // is running on, while the second part is a random UUID. |
755 | + nonce := fmt.Sprintf("%s:%s", state.MachineTag(task.machineId), uuid.String()) |
756 | + inst, err := task.broker.StartInstance(machine.Id(), nonce, machine.Series(), cons, stateInfo, apiInfo) |
757 | + if err != nil { |
758 | + // Set the state to error, so the machine will be skipped next |
759 | + // time until the error is resolved, but don't return an |
760 | + // error; just keep going with the other machines. |
761 | + logger.Error("cannot start instance for machine %q: %v", machine, err) |
762 | + if err1 := machine.SetStatus(params.StatusError, err.Error()); err1 != nil { |
763 | + // Something is wrong with this machine, better report it back. |
764 | + logger.Error("cannot set error status for machine %q: %v", machine, err1) |
765 | + return err1 |
766 | + } |
767 | + return nil |
768 | + } |
769 | + if err := machine.SetProvisioned(inst.Id(), nonce); err != nil { |
770 | + // The machine is started, but we can't record the mapping in |
771 | + // state. It'll keep running while we fail out and restart, |
772 | + // but will then be detected by findUnknownInstances and |
773 | + // killed again. |
774 | + // |
775 | + // TODO(dimitern) Stop the instance right away here. |
776 | + // |
777 | + // Multiple instantiations of a given machine (with the same |
778 | + // machine ID) cannot coexist, because findUnknownInstances is |
779 | + // called before startMachines. However, if the first machine |
780 | + // had started to do work before being replaced, we may |
781 | + // encounter surprising problems. |
782 | + return err |
783 | + } |
784 | + logger.Info("started machine %s as instance %s", machine, inst.Id()) |
785 | + return nil |
786 | +} |
787 | + |
788 | +func (task *provisionerTask) setupAuthentication(machine *state.Machine) (*state.Info, *api.Info, error) { |
789 | + password, err := utils.RandomPassword() |
790 | + if err != nil { |
791 | + return nil, nil, fmt.Errorf("cannot make password for machine %v: %v", machine, err) |
792 | + } |
793 | + if err := machine.SetMongoPassword(password); err != nil { |
794 | + return nil, nil, fmt.Errorf("cannot set password for machine %v: %v", machine, err) |
795 | + } |
796 | + stateInfo := *task.stateInfo |
797 | + stateInfo.Tag = machine.Tag() |
798 | + stateInfo.Password = password |
799 | + apiInfo := *task.apiInfo |
800 | + apiInfo.Tag = machine.Tag() |
801 | + apiInfo.Password = password |
802 | + return &stateInfo, &apiInfo, nil |
803 | +} |
804 | |
805 | === modified file 'worker/provisioner/provisioner_test.go' |
806 | --- worker/provisioner/provisioner_test.go 2013-05-31 00:16:39 +0000 |
807 | +++ worker/provisioner/provisioner_test.go 2013-06-07 03:44:25 +0000 |
808 | @@ -5,9 +5,14 @@ |
809 | |
810 | import ( |
811 | "fmt" |
812 | + "strings" |
813 | + stdtesting "testing" |
814 | + "time" |
815 | + |
816 | "labix.org/v2/mgo/bson" |
817 | . "launchpad.net/gocheck" |
818 | "launchpad.net/juju-core/constraints" |
819 | + "launchpad.net/juju-core/environs" |
820 | "launchpad.net/juju-core/environs/config" |
821 | "launchpad.net/juju-core/environs/dummy" |
822 | "launchpad.net/juju-core/errors" |
823 | @@ -16,11 +21,9 @@ |
824 | "launchpad.net/juju-core/state/api/params" |
825 | coretesting "launchpad.net/juju-core/testing" |
826 | "launchpad.net/juju-core/utils" |
827 | + "launchpad.net/juju-core/utils/set" |
828 | "launchpad.net/juju-core/worker" |
829 | "launchpad.net/juju-core/worker/provisioner" |
830 | - "strings" |
831 | - stdtesting "testing" |
832 | - "time" |
833 | ) |
834 | |
835 | func TestPackage(t *stdtesting.T) { |
836 | @@ -97,18 +100,19 @@ |
837 | c.Assert(s.Stop(), IsNil) |
838 | } |
839 | |
840 | -func (s *ProvisionerSuite) checkStartInstance(c *C, m *state.Machine) { |
841 | - s.checkStartInstanceCustom(c, m, "pork", constraints.Value{}) |
842 | +func (s *ProvisionerSuite) checkStartInstance(c *C, m *state.Machine) environs.Instance { |
843 | + return s.checkStartInstanceCustom(c, m, "pork", constraints.Value{}) |
844 | } |
845 | |
846 | -func (s *ProvisionerSuite) checkStartInstanceCustom(c *C, m *state.Machine, secret string, cons constraints.Value) { |
847 | +func (s *ProvisionerSuite) checkStartInstanceCustom(c *C, m *state.Machine, secret string, cons constraints.Value) (instance environs.Instance) { |
848 | s.State.StartSync() |
849 | for { |
850 | select { |
851 | case o := <-s.op: |
852 | switch o := o.(type) { |
853 | case dummy.OpStartInstance: |
854 | - s.waitInstanceId(c, m, o.Instance.Id()) |
855 | + instance = o.Instance |
856 | + s.waitInstanceId(c, m, instance.Id()) |
857 | |
858 | // Check the instance was started with the expected params. |
859 | c.Assert(o.MachineId, Equals, m.Id()) |
860 | @@ -141,6 +145,7 @@ |
861 | return |
862 | } |
863 | } |
864 | + return |
865 | } |
866 | |
867 | // checkNoOperations checks that the environ was not operated upon. |
868 | @@ -154,19 +159,31 @@ |
869 | } |
870 | } |
871 | |
872 | -// checkStopInstance checks that an instance has been stopped. |
873 | -func (s *ProvisionerSuite) checkStopInstance(c *C) { |
874 | +// checkStopInstances checks that an instance has been stopped. |
875 | +func (s *ProvisionerSuite) checkStopInstances(c *C, instances ...environs.Instance) { |
876 | s.State.StartSync() |
877 | - select { |
878 | - case o := <-s.op: |
879 | - switch o.(type) { |
880 | - case dummy.OpStopInstances: |
881 | - default: |
882 | - c.Fatalf("unexpected operation %#v", o) |
883 | + instanceIds := set.NewStrings() |
884 | + for _, instance := range instances { |
885 | + instanceIds.Add(string(instance.Id())) |
886 | + } |
887 | + // Continue checking for stop instance calls until all the instances we |
888 | + // are waiting on to finish, actually finish, or we time out. |
889 | + for !instanceIds.IsEmpty() { |
890 | + select { |
891 | + case o := <-s.op: |
892 | + switch o := o.(type) { |
893 | + case dummy.OpStopInstances: |
894 | + for _, stoppedInstance := range o.Instances { |
895 | + instanceIds.Remove(string(stoppedInstance.Id())) |
896 | + } |
897 | + default: |
898 | + c.Fatalf("unexpected operation %#v", o) |
899 | + return |
900 | + } |
901 | + case <-time.After(2 * time.Second): |
902 | + c.Fatalf("provisioner did not stop an instance") |
903 | + return |
904 | } |
905 | - case <-time.After(2 * time.Second): |
906 | - c.Fatalf("provisioner did not stop an instance") |
907 | - return |
908 | } |
909 | } |
910 | |
911 | @@ -230,11 +247,11 @@ |
912 | // Check that an instance is provisioned when the machine is created... |
913 | m, err := s.State.AddMachine(config.DefaultSeries, state.JobHostUnits) |
914 | c.Assert(err, IsNil) |
915 | - s.checkStartInstance(c, m) |
916 | + instance := s.checkStartInstance(c, m) |
917 | |
918 | // ...and removed, along with the machine, when the machine is Dead. |
919 | c.Assert(m.EnsureDead(), IsNil) |
920 | - s.checkStopInstance(c) |
921 | + s.checkStopInstances(c, instance) |
922 | s.waitRemoved(c, m) |
923 | } |
924 | |
925 | @@ -366,12 +383,12 @@ |
926 | // create a machine |
927 | m0, err := s.State.AddMachine(config.DefaultSeries, state.JobHostUnits) |
928 | c.Assert(err, IsNil) |
929 | - s.checkStartInstance(c, m0) |
930 | + i0 := s.checkStartInstance(c, m0) |
931 | |
932 | // create a second machine |
933 | m1, err := s.State.AddMachine(config.DefaultSeries, state.JobHostUnits) |
934 | c.Assert(err, IsNil) |
935 | - s.checkStartInstance(c, m1) |
936 | + i1 := s.checkStartInstance(c, m1) |
937 | stop(c, p) |
938 | |
939 | // mark the first machine as dead |
940 | @@ -384,8 +401,7 @@ |
941 | // start a new provisioner to shut them both down |
942 | p = provisioner.NewProvisioner(s.State, "0") |
943 | defer stop(c, p) |
944 | - s.checkStopInstance(c) |
945 | - s.checkStopInstance(c) |
946 | + s.checkStopInstances(c, i0, i1) |
947 | s.waitRemoved(c, m0) |
948 | } |
949 |
Reviewers: mp+167684_code.launchpad.net,
Message:
Please take a look.
Description:
First part of the provisioner refactoring.
This branch breaks up the current provisioner and defines some interfaces that
we'll be using for containers.
The broker interface is what we use for starting and stopping instances, and
listing instances and related machines.
An environment broker is written that defers the actual calls to the environ
or state.
The provisioner now creates a provisioning task for the environment provider
using a machine watcher and the environment broker. The common provisioning
methods are now in provisioner_task.go.
I refactored some of the methods because we were duplicating a lot of calls,
like getting all instances and then getting instances for individual ids, or
looking up specific machines as well as getting all machines. Now we just do
the "all" gets and keep a map around for the duration of the checks.
There are a couple of open questions around what to do if the environment
provider task stops, and I'm a little unclear on the provisioner dying bit;
the tests pass, but I feel that something is missing here.
https://code.launchpad.net/~thumper/juju-core/provisioner-step-1/+merge/167684
(do not edit description out of merge proposal)
Please review this at https://codereview.appspot.com/9937046/
Affected files:
  A [revision details]
  A worker/provisioner/broker.go
  A worker/provisioner/environ_broker.go
  M worker/provisioner/provisioner.go
  A worker/provisioner/provisioner_task.go