Merge lp:~jimbaker/pyjuju/expose-cleanup into lp:pyjuju
- expose-cleanup
- Merge into trunk
Status: Merged
Approved by: Gustavo Niemeyer
Approved revision: 331
Merged at revision: 335
Proposed branch: lp:~jimbaker/pyjuju/expose-cleanup
Merge into: lp:pyjuju
Diff against target: 835 lines (+510/-100), 9 files modified
- ensemble/control/shutdown.py (+2/-1)
- ensemble/providers/ec2/__init__.py (+30/-5)
- ensemble/providers/ec2/machine.py (+0/-2)
- ensemble/providers/ec2/securitygroup.py (+110/-1)
- ensemble/providers/ec2/tests/common.py (+48/-2)
- ensemble/providers/ec2/tests/test_connect.py (+1/-1)
- ensemble/providers/ec2/tests/test_provider.py (+1/-0)
- ensemble/providers/ec2/tests/test_securitygroup.py (+178/-6)
- ensemble/providers/ec2/tests/test_shutdown.py (+140/-82)
To merge this branch: bzr merge lp:~jimbaker/pyjuju/expose-cleanup
Related bugs: none
Reviewer | Review Type | Date Requested | Status
---|---|---|---
Gustavo Niemeyer | | | Approve
William Reade (community) | | | Approve
Review via email: mp+72132@code.launchpad.net
Commit message
Description of the change
When shutting down EC2 machines, this branch deletes the associated machine security groups (identified from the Reservation metadata returned when describing instances) and, if possible, the environment security group.
This code necessarily polls EC2, via describe_instances, to determine when a machine has moved from shutting-down to terminated, since only then can its security groups be deleted. In practice, the polling used here should be generous, but we may want to consider fine-tuning the sleep interval (necessary to avoid hammering EC2 unnecessarily) and the maximum number of polls. It might also be desirable to expose the latter as an option on the shutdown command.
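The polling described above can be sketched in plain Python. This is a minimal, synchronous sketch: `describe_instances` here is a hypothetical callable standing in for the EC2 API, and the Twisted deferred machinery the branch actually uses is omitted.

```python
def wait_until_terminated(describe_instances, instance_ids, max_polls=200):
    """Poll until every instance reports 'terminated', or the poll
    budget runs out. Returns the set of instance ids seen terminated."""
    wait_on = set(instance_ids)
    terminated = set()
    for _ in range(max_polls):
        if not wait_on:
            break
        # describe_instances returns (instance_id, state) pairs
        for instance_id, state in describe_instances(wait_on):
            if state == "terminated":
                wait_on.discard(instance_id)
                terminated.add(instance_id)
    return terminated

# A fake EC2 endpoint: each instance flips to "terminated" after a
# configured number of polls.
def make_fake_describe(rounds_until_done):
    counts = dict.fromkeys(rounds_until_done, 0)
    def describe(ids):
        result = []
        for i in ids:
            counts[i] += 1
            state = ("terminated" if counts[i] >= rounds_until_done[i]
                     else "shutting-down")
            result.append((i, state))
        return result
    return describe

describe = make_fake_describe({"i-foo": 2, "i-bar": 4})
print(wait_until_terminated(describe, ["i-foo", "i-bar"]))  # both ids terminate
```

Note that instances already seen as terminated are dropped from `wait_on`, so later polls describe only the stragglers, mirroring the branch's behavior.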
Jim Baker (jimbaker) wrote:
On Fri, Aug 19, 2011 at 2:27 PM, Gustavo Niemeyer <email address hidden> wrote:
> Review: Needs Fixing
> The logic looks mostly good, besides for [8] maybe. Some organizational
> issues, though.
>
> [1]
>
> + def _get_ensemble_security_group(self):
> + return "ensemble-%s" % self.environment_name
> +
> + def _get_machine_security_group_from_instance(self, instance):
> + ensemble_security_group = self._get_ensemble_security_group()
> + for group in instance.reservation.groups:
> + if group != ensemble_security_group:
> + return group
> +
> + # Ignore if no such group exists; this allows some limited
> + # backwards compatibility with old setups without machine
> + # security group
> + log.debug("No associated group for instance %r",
> instance.instance_id)
> + return None
> +
> + @inlineCallbacks
> + def _delete_security_group(self, group):
> + yield self.ec2.delete_security_group(group)
> + log.debug("Deleted security group %r", group)
>
> These utility functions also have little relation to the provider interface
> itself. Please move them to top levels under
> ensemble.providers.ec2.securitygroup.
>
Moved
>
> [2]
>
> + # Repeatedly poll EC2 until instances are in terminated state;
> (...)
> + except EC2Error:
> + pass # Ignore, cannot delete if other machines are using
> +
> + returnValue(killable_machines)
>
> Same thing. Big algorithm polluting the interface of the provider with
> lots of domain-specific knowledge about a minor aspect.
>
> Please create a function named remove_security_groups in the securitygroup
> module as suggested above, and test it in isolation.
>
Moved and tested now also in test_securitygroup. Note there are still tests in test_shutdown.
> [3]
>
> + i = 0
> + while wait_on and i < 60:
> (...)
> + i += 1
> + if wait_on:
> + yield self._sleep(1) # Reasonable sleep to continue
> polling
>
> Minor hint:
>
> for i in range(60):
> (...)
> if not wait_on:
> break
> yield self._sleep(1)
>
Changed to this form
>
> [4]
>
> Try dropping the sleep in that loop entirely. The describe_instances
> operation is a natural limiter.
>
Removed the sleep
>
> When you do that, how many iterations do you get in practice before
> the operation completes, and how many seconds it took?
>
and empirically observed approx 500ms per op
> [5]
>
> + def _sleep(self, delay):
> + """Non-blocking sleep."""
> + from twisted.internet import reactor
> +
> + deferred = Deferred()
> + reactor.callLater(delay, deferred.callback, None)
> + return deferred
>
> If you dropped it in the point above, remove this. If not, please move
> this to ensemble.lib.twistutils.
>
Removed
>
> [6]
>
> + log.debug("%d Info on instance: %s, %s",
> + i, instance.instance_id,
> instance.instance_state)
>
> Slightly better debug message:
>
> "[iteration %d] Instance %s state: %s"
>
>
This was not intended to be in the branch pushed, so removed
> [...
William Reade (fwereade) wrote:
Just a few notes:
[0]
Sorry, you've got a potentially nasty merge inbound :(. Give me a shout if there's anything I can help with.
[1]
The security_group methods on EC2's MachineProvider, which moved to securitygroup.py, haven't been deleted from MachineProvider yet (but they do seem to be unused now)
[2]
+ returnValue(killable_machines)
I think it would be slightly more technically correct (the best kind of correct!) to construct terminated_machines from the list of terminated_ids, since you have them available.
[3]
+def remove_security_groups(provider, killable_ids):
Consider renaming killable_ids to terminated_ids, or perhaps instance_ids -- I can think of arguments either way, but I don't think "killable" is the right word to describe them at this stage.
[4]
+ # Delete the security group for the environment as a whole.
I think it would be more sensible to override MachineProvider.destroy_environment there, rather than trying and failing on shutdown of any other machine. Note that the base destroy_environment can raise, so you might want something like this:
def destroy_environment(self):
try:
super(MachineProvider, self).destroy_environment()
finally:
destroy_environment_security_group(self)
...which would at least ensure we made a good-faith attempt to do everything required to destroy the full environment, although the current logging in the face of multiple errors is a bit nonexistent.
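William's try/finally suggestion relies on a standard Python guarantee: the finally block runs even when the body raises, so the environment-group cleanup is always attempted. A minimal sketch, with hypothetical stand-in callables in place of the real provider methods:

```python
def destroy_environment(terminate_machines, delete_env_group):
    """Sketch of the suggested override: always attempt the group
    deletion, even when terminating machines raises."""
    try:
        return terminate_machines()
    finally:
        delete_env_group()

calls = []

def failing_terminate():
    calls.append("terminate")
    raise RuntimeError("simulated EC2 timeout")

try:
    destroy_environment(failing_terminate, lambda: calls.append("delete-group"))
except RuntimeError:
    pass  # the original error still propagates to the caller

print(calls)  # ['terminate', 'delete-group'] — cleanup ran despite the error
```

As William notes, the one weakness is error reporting: if both the body and the cleanup fail, only one of the errors survives.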
All that said, LGTM; I'm very happy to see a clean API again. Thanks, and +1.
Gustavo Niemeyer (niemeyer) wrote:
This looks good. A couple of comments on top of the changes:
[11]
You got [9] reversed, as I pointed out on IRC:
<niemeyer> jimbaker: I was pointing out that you don't have to wait for deletion to finish on all of the removed groups before running another round
<niemeyer> jimbaker: since it kills parallelism unnecessarily
<niemeyer> jimbaker: You can wait for them to finish after _all_ of the groups have been deleted
<niemeyer> jimbaker: Instead, you've moved backwards, and made it wait on each individual group's deletion
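Gustavo's point is the fire-then-gather pattern: start every deletion before waiting on any of them, so the round trips overlap. A sketch using asyncio as an analogue of the Twisted gather_results used in the branch (group names and delays are illustrative):

```python
import asyncio
import time

async def delete_group(name):
    await asyncio.sleep(0.05)  # stand-in for one EC2 delete round trip
    return name

async def sequential(groups):
    # Anti-pattern Gustavo flags: wait for each deletion before the next.
    return [await delete_group(g) for g in groups]

async def parallel(groups):
    # Fire every deletion first, then wait for all of them together.
    return await asyncio.gather(*(delete_group(g) for g in groups))

groups = ["ensemble-moon-0", "ensemble-moon-1", "ensemble-moon-2"]

t0 = time.monotonic()
asyncio.run(sequential(groups))
t_seq = time.monotonic() - t0

t0 = time.monotonic()
result = asyncio.run(parallel(groups))
t_par = time.monotonic() - t0

print(result)         # all three group names, in input order
print(t_par < t_seq)  # parallel finishes in roughly one round trip
```

With three simulated 50 ms round trips, the sequential version takes about 150 ms while the gathered version takes about 50 ms, which is the parallelism the review asks to preserve.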
[12]
+def _get_ensemble_security_group(provider):
+ """Get EC2 security group name for environment of `provider`."""
+ return "ensemble-%s" % provider.environment_name
+ def _get_ensemble_security_group(self):
+ return "ensemble-%s" % self.environment_name
There's no point in having that in a function if it's going to be
duplicated either way. Either unify, or just inline.
- 332. By Jim Baker: Delete security groups in the background
- 333. By Jim Baker: Removed debugging
- 334. By Jim Baker: Minor cleanup
- 335. By Jim Baker: Only remove env security group in context of provider destroy_environment
- 336. By Jim Baker: Reworked remaining mocks
- 337. By Jim Baker: Test new securitygroup function
- 338. By Jim Baker: Merged trunk & resolved conflicts
- 339. By Jim Baker: Fixed merge issues
- 340. By Jim Baker: Merged trunk
- 341. By Jim Baker: PEP8 & PyFlakes
Jim Baker (jimbaker) wrote:
On Wed, Aug 24, 2011 at 2:20 AM, William Reade <email address hidden> wrote:
> Review: Approve
> Just a few notes:
>
> [0]
>
> Sorry, you've got a potentially nasty merge inbound :(. Give me a shout if
> there's anything I can help with.
>
It wasn't so bad, but definitely it wasn't trivial.
>
> [1]
>
> The security_group methods on EC2's MachineProvider, which moved to
> securitygroup.py, haven't been deleted from MachineProvider yet (but they do
> seem to be unused now)
>
Removed these (Gustavo pointed out one in his review too)
>
> [2]
>
> + returnValue(killable_machines)
>
> I think it would be slightly more technically correct (the best kind of
> correct!) to construct terminated_machines from the list of terminated_ids,
> since you have them available.
>
Sorry, that was going to make it, but I just realized it wasn't done. I
could do that as a trivial.
> [3]
>
> +def remove_security_groups(provider, killable_ids):
>
> Consider renaming killable_ids to terminated_ids, or perhaps instance_ids
> -- I can think of arguments either way, but I don't think "killable" is the
> right word to describe them at this stage.
>
Renamed.
>
> [4]
>
> + # Delete the security group for the environment as a whole.
>
> I think it would be more sensible to override
> MachineProvider.destroy_environment
> there, rather than trying and failing on shutdown of any other machine. Note
> that the base destroy_environment can raise, so you might want something
> like this:
>
> def destroy_environment(self):
> try:
> super(MachineProvider, self).destroy_environment()
> finally:
> destroy_environment_security_group(self)
>
> ...which would at least ensure we made a good-faith attempt to do
> everything required to destroy the full environment, although the current
> logging in the face of multiple errors is a bit nonexistent.
>
Changed as requested and a number of tests suitably adjusted in their
mocking expectations.
Jim Baker (jimbaker) wrote:
On Thu, Aug 25, 2011 at 2:24 PM, Gustavo Niemeyer <email address hidden> wrote:
> Review: Approve
> This looks good. A couple of comments on top of the changes:
>
> [11]
>
> You got [9] reversed, as I pointed out on IRC:
>
> <niemeyer> jimbaker: I was pointing out that you don't have to wait for
> deletion to finish on all of the removed groups before running another round
> <niemeyer> jimbaker: since it kills parallelism unnecessarily
> <niemeyer> jimbaker: You can wait for them to finish after _all_ of the
> groups have been deleted
> <niemeyer> jimbaker: Instead, you've moved backwards, and made it wait on
> each individual group's deletion
>
Machine security groups are now deleted in the background. The environment security group deletion is performed in a separate step, due to William's point that it should be done from destroy_environment, which makes sense.
>
> [12]
>
>
> +def _get_ensemble_security_group(provider):
> + """Get EC2 security group name for environment of `provider`."""
> + return "ensemble-%s" % provider.environment_name
>
> + def _get_ensemble_security_group(self):
> + return "ensemble-%s" % self.environment_name
>
> There's no point in having that in a function if it's going to be
> duplicated either way. Either unify, or just inline.
>
Removed; this was code that wasn't deleted as part of the move to securitygroup.py (a move operation can be thought of as add + delete :)). William noticed the same thing.
Preview Diff
1 | === modified file 'ensemble/control/shutdown.py' |
2 | --- ensemble/control/shutdown.py 2011-08-11 19:49:32 +0000 |
3 | +++ ensemble/control/shutdown.py 2011-08-29 18:36:25 +0000 |
4 | @@ -22,7 +22,8 @@ |
5 | |
6 | value = raw_input( |
7 | "Warning, this will destroy all machines and services in the\n" |
8 | - "environment. Continue [y/N]") |
9 | + "%r (type: %s) environment. Continue [y/N]" % ( |
10 | + environment.name, environment.type)) |
11 | |
12 | if value.strip().lower() not in ("y", "yes"): |
13 | options.log.info("Shutdown aborted") |
14 | |
15 | === modified file 'ensemble/providers/ec2/__init__.py' |
16 | --- ensemble/providers/ec2/__init__.py 2011-08-26 12:24:00 +0000 |
17 | +++ ensemble/providers/ec2/__init__.py 2011-08-29 18:36:25 +0000 |
18 | @@ -2,7 +2,6 @@ |
19 | import re |
20 | |
21 | from twisted.internet.defer import inlineCallbacks, returnValue |
22 | - |
23 | from txaws.ec2.exception import EC2Error |
24 | from txaws.service import AWSServiceRegion |
25 | |
26 | @@ -14,7 +13,8 @@ |
27 | from .launch import EC2LaunchMachine |
28 | from .machine import EC2ProviderMachine, machine_from_instance |
29 | from .securitygroup import ( |
30 | - open_provider_port, close_provider_port, get_provider_opened_ports) |
31 | + open_provider_port, close_provider_port, get_provider_opened_ports, |
32 | + remove_security_groups, destroy_environment_security_group) |
33 | from .utils import get_region_uri |
34 | |
35 | |
36 | @@ -104,6 +104,21 @@ |
37 | returnValue(machines) |
38 | |
39 | @inlineCallbacks |
40 | + def destroy_environment(self): |
41 | + """Terminate all associated machines and security groups. |
42 | + |
43 | + The super defintion of this method terminates each machine in |
44 | + the environment; this needs to be augmented here by also |
45 | + removing the security group for the environment. |
46 | + """ |
47 | + try: |
48 | + killed_machines = yield super(MachineProvider, self).\ |
49 | + destroy_environment() |
50 | + returnValue(killed_machines) |
51 | + finally: |
52 | + yield destroy_environment_security_group(self) |
53 | + |
54 | + @inlineCallbacks |
55 | def shutdown_machines(self, machines): |
56 | """Terminate machines associated with this provider. |
57 | |
58 | @@ -119,9 +134,19 @@ |
59 | |
60 | ids = [m.instance_id for m in machines] |
61 | killable_machines = yield self.get_machines(ids) |
62 | - if killable_machines: |
63 | - killable_ids = [m.instance_id for m in killable_machines] |
64 | - yield self.ec2.terminate_instances(*killable_ids) |
65 | + if not killable_machines: |
66 | + returnValue([]) # Nothing to do |
67 | + |
68 | + killable_ids = [m.instance_id for m in killable_machines] |
69 | + terminated = yield self.ec2.terminate_instances(*killable_ids) |
70 | + |
71 | + # Pass on what was actually terminated, in the case the |
72 | + # machine has somehow disappeared since get_machines |
73 | + # above. This is to avoid getting EC2Error: Error Message: |
74 | + # Invalid id when running ec2.describe_instances in |
75 | + # remove_security_groups |
76 | + terminated_ids = [info[0] for info in terminated] |
77 | + yield remove_security_groups(self, terminated_ids) |
78 | returnValue(killable_machines) |
79 | |
80 | def open_port(self, machine, machine_id, port, protocol="tcp"): |
81 | |
82 | === modified file 'ensemble/providers/ec2/machine.py' |
83 | --- ensemble/providers/ec2/machine.py 2011-08-17 13:34:45 +0000 |
84 | +++ ensemble/providers/ec2/machine.py 2011-08-29 18:36:25 +0000 |
85 | @@ -1,5 +1,3 @@ |
86 | -import datetime |
87 | - |
88 | from ensemble.machine import ProviderMachine |
89 | |
90 | |
91 | |
92 | === modified file 'ensemble/providers/ec2/securitygroup.py' |
93 | --- ensemble/providers/ec2/securitygroup.py 2011-08-23 09:59:59 +0000 |
94 | +++ ensemble/providers/ec2/securitygroup.py 2011-08-29 18:36:25 +0000 |
95 | @@ -1,11 +1,17 @@ |
96 | -from twisted.internet.defer import inlineCallbacks, returnValue |
97 | +from twisted.internet.defer import Deferred, inlineCallbacks, returnValue |
98 | from txaws.ec2.exception import EC2Error |
99 | |
100 | from ensemble.errors import ProviderInteractionError |
101 | +from ensemble.lib.twistutils import gather_results |
102 | |
103 | from .utils import log |
104 | |
105 | |
106 | +def _get_ensemble_security_group(provider): |
107 | + """Get EC2 security group name for environment of `provider`.""" |
108 | + return "ensemble-%s" % provider.environment_name |
109 | + |
110 | + |
111 | def _get_machine_group_name(provider, machine_id): |
112 | """Get EC2 security group name associated just with `machine_id`.""" |
113 | return "ensemble-%s-%s" % (provider.environment_name, machine_id) |
114 | @@ -87,3 +93,106 @@ |
115 | # Ensemble (at this time at least) |
116 | opened_ports.add((from_port, ip_permission.ip_protocol)) |
117 | returnValue(opened_ports) |
118 | + |
119 | + |
120 | +def _get_machine_security_group_from_instance(provider, instance): |
121 | + """Parses the `reservation` of `instance` to get assoc machine group.""" |
122 | + ensemble_security_group = _get_ensemble_security_group(provider) |
123 | + for group in instance.reservation.groups: |
124 | + if group != ensemble_security_group: |
125 | + return group |
126 | + |
127 | + # Ignore if no such group exists; this allows some limited |
128 | + # backwards compatibility with old setups without machine |
129 | + # security group |
130 | + log.info("Ignoring missing machine security group for instance %r", |
131 | + instance.instance_id) |
132 | + return None |
133 | + |
134 | + |
135 | +@inlineCallbacks |
136 | +def _delete_security_group(provider, group, completed, deletion_required=True): |
137 | + """Wrap EC2 delete_security_group.""" |
138 | + try: |
139 | + yield provider.ec2.delete_security_group(group) |
140 | + log.debug("Deleted security group %r", group) |
141 | + completed.callback(None) |
142 | + except EC2Error, e: |
143 | + if deletion_required: |
144 | + completed.errback(ProviderInteractionError( |
145 | + "EC2 error when attempting to delete group %s: %s" % ( |
146 | + group, e))) |
147 | + else: |
148 | + completed.callback(None) |
149 | + |
150 | + |
151 | +def _background_delete_security_group(provider, group, deletion_required=True): |
152 | + """Runs deletion task in the background. |
153 | + |
154 | + Wait on this returned deferred for completion of the deletion. |
155 | + """ |
156 | + from twisted.internet import reactor |
157 | + |
158 | + completed = Deferred() |
159 | + reactor.callLater( |
160 | + 0, _delete_security_group, |
161 | + provider, group, completed, deletion_required=deletion_required) |
162 | + return completed |
163 | + |
164 | + |
165 | +@inlineCallbacks |
166 | +def remove_security_groups(provider, instance_ids): |
167 | + """Remove security groups associated with `instance_ids` for `provider`""" |
168 | + log.info( |
169 | + "Waiting on %d EC2 instances to transition to terminated state, " |
170 | + "this may take a while", len(instance_ids)) |
171 | + |
172 | + # Repeatedly poll EC2 until instances are in terminated state; |
173 | + # upon reaching that state, delete associated machine security |
174 | + # groups. The limit of 200 polls is arbitrary and could be |
175 | + # specified by a command line option (and/or an overall |
176 | + # timeout). It's based on an observed ~500 ms roundtrip time per |
177 | + # call of the describe_instances web service, along with typically |
178 | + # taking about 40s to move all instances to a terminated state. |
179 | + wait_on = set(instance_ids) |
180 | + pending_deletions = [] |
181 | + for i in xrange(200): |
182 | + if not wait_on: |
183 | + break |
184 | + instances = yield provider.ec2.describe_instances(*wait_on) |
185 | + for instance in instances: |
186 | + if instance.instance_state == "terminated": |
187 | + log.debug("Instance %r was terminated", |
188 | + instance.instance_id) |
189 | + wait_on.discard(instance.instance_id) |
190 | + group = _get_machine_security_group_from_instance( |
191 | + provider, instance) |
192 | + if group: |
193 | + pending_deletions.append( |
194 | + _background_delete_security_group(provider, group)) |
195 | + if wait_on: |
196 | + outstanding = [ |
197 | + _get_machine_security_group_from_instance(provider, instance) |
198 | + for instance in instances] |
199 | + log.error("Instance shutdown taking too long, " |
200 | + "could not delete groups %s", |
201 | + ", ".join(sorted(outstanding))) |
202 | + |
203 | + # Wait for all pending deletions to complete |
204 | + yield gather_results(pending_deletions) |
205 | + |
206 | + |
207 | +@inlineCallbacks |
208 | +def destroy_environment_security_group(provider): |
209 | + """Delete the security group for the environment of `provider`""" |
210 | + group = _get_ensemble_security_group(provider) |
211 | + try: |
212 | + yield provider.ec2.delete_security_group(group) |
213 | + log.debug("Deleted environment security group %r", group) |
214 | + returnValue(True) |
215 | + except EC2Error, e: |
216 | + # Ignore, since this is only attempting to cleanup |
217 | + log.debug( |
218 | + "Ignoring EC2 error when attempting to delete group %s: %s" % ( |
219 | + group, e)) |
220 | + returnValue(False) |
221 | |
222 | === modified file 'ensemble/providers/ec2/tests/common.py' |
223 | --- ensemble/providers/ec2/tests/common.py 2011-08-23 13:01:13 +0000 |
224 | +++ ensemble/providers/ec2/tests/common.py 2011-08-29 18:36:25 +0000 |
225 | @@ -14,6 +14,7 @@ |
226 | |
227 | |
228 | MATCH_AMI = MATCH(lambda image_id: image_id.startswith("ami-")) |
229 | +MATCH_GROUP = MATCH(lambda x: x.startswith("ensemble-moon")) |
230 | |
231 | |
232 | class EC2TestMixin(object): |
233 | @@ -36,8 +37,11 @@ |
234 | """ |
235 | return MachineProvider(self.env_name, self.get_config()) |
236 | |
237 | - def get_instance(self, instance_id, state="running", **kwargs): |
238 | - groups = kwargs.pop("groups", ["ensemble-%s" % self.env_name]) |
239 | + def get_instance(self, |
240 | + instance_id, state="running", machine_id=42, **kwargs): |
241 | + groups = kwargs.pop("groups", |
242 | + ["ensemble-%s" % self.env_name, |
243 | + "ensemble-%s-%s" % (self.env_name, machine_id)]) |
244 | reservation = Reservation("x", "y", groups=groups) |
245 | return Instance(instance_id, state, reservation=reservation, **kwargs) |
246 | |
247 | @@ -168,3 +172,45 @@ |
248 | # connect grabs the first host of a set. |
249 | self.ec2.describe_instances(hosts[0].instance_id) |
250 | self.mocker.result(succeed([hosts[0]])) |
251 | + |
252 | + |
253 | +class MockInstanceState(object): |
254 | + """Mock the result of ec2_describe_instances when called successively. |
255 | + |
256 | + Each call of :method:`get_round` returns a list of mock `Instance` |
257 | + objects, using the state for that round. Instance IDs not used in |
258 | + the round (and passed in from ec2_describe_instances) are |
259 | + automatically skipped.""" |
260 | + |
261 | + def __init__(self, tester, instance_ids, machine_ids, states): |
262 | + self.tester = tester |
263 | + self.instance_ids = instance_ids |
264 | + self.machine_ids = machine_ids |
265 | + self.states = states |
266 | + self.round = 0 |
267 | + |
268 | + def get_round(self, *current_instance_ids): |
269 | + result = [] |
270 | + for instance_id, machine_id, state in zip( |
271 | + self.instance_ids, self.machine_ids, self.states[self.round]): |
272 | + if instance_id not in current_instance_ids: |
273 | + # Ignore instance_ids that are no longer being |
274 | + # described, because they have since moved into a |
275 | + # terminated state |
276 | + continue |
277 | + result.append(self.tester.get_instance(instance_id, |
278 | + machine_id=machine_id, |
279 | + state=state)) |
280 | + self.round += 1 |
281 | + return succeed(result) |
282 | + |
283 | + |
284 | +class Observed(object): |
285 | + """Minimal wrapper just to ensure :method:`add` returns a `Deferred`.""" |
286 | + |
287 | + def __init__(self): |
288 | + self.items = set() |
289 | + |
290 | + def add(self, item): |
291 | + self.items.add(item) |
292 | + return succeed(True) |
293 | |
294 | === modified file 'ensemble/providers/ec2/tests/test_connect.py' |
295 | --- ensemble/providers/ec2/tests/test_connect.py 2011-08-19 16:21:13 +0000 |
296 | +++ ensemble/providers/ec2/tests/test_connect.py 2011-08-29 18:36:25 +0000 |
297 | @@ -113,7 +113,7 @@ |
298 | # Hand back a real connected client to test the wait on initialization. |
299 | instance = self.get_instance("i-foobar", dns_name="foo.example.com") |
300 | connected_client = ZookeeperClient() |
301 | - self.client = connected_client # for poke_zk |
302 | + self.client = connected_client # for poke_zk |
303 | self.mock_connect(False, instance, succeed(connected_client)) |
304 | |
305 | self.mocker.replay() |
306 | |
307 | === modified file 'ensemble/providers/ec2/tests/test_provider.py' |
308 | --- ensemble/providers/ec2/tests/test_provider.py 2011-08-18 12:19:22 +0000 |
309 | +++ ensemble/providers/ec2/tests/test_provider.py 2011-08-29 18:36:25 +0000 |
310 | @@ -115,6 +115,7 @@ |
311 | serialized.pop("authorized-keys", None) |
312 | self.assertEqual(config, serialized) |
313 | |
314 | + |
315 | class FailCreateTest(TestCase): |
316 | |
317 | def test_conflicting_authorized_keys_options(self): |
318 | |
319 | === modified file 'ensemble/providers/ec2/tests/test_securitygroup.py' |
320 | --- ensemble/providers/ec2/tests/test_securitygroup.py 2011-08-18 12:19:22 +0000 |
321 | +++ ensemble/providers/ec2/tests/test_securitygroup.py 2011-08-29 18:36:25 +0000 |
322 | @@ -7,12 +7,13 @@ |
323 | from ensemble.lib.testing import TestCase |
324 | from ensemble.machine import ProviderMachine |
325 | from ensemble.providers.ec2.securitygroup import ( |
326 | - open_provider_port, close_provider_port, get_provider_opened_ports) |
327 | - |
328 | -from .common import EC2TestMixin |
329 | - |
330 | - |
331 | -class EC2SecurityGroupTest(EC2TestMixin, TestCase): |
332 | + open_provider_port, close_provider_port, get_provider_opened_ports, |
333 | + remove_security_groups, destroy_environment_security_group) |
334 | +from ensemble.providers.ec2.tests.common import ( |
335 | + EC2TestMixin, MATCH_GROUP, Observed, MockInstanceState) |
336 | + |
337 | + |
338 | +class EC2PortMgmtTest(EC2TestMixin, TestCase): |
339 | |
340 | @inlineCallbacks |
341 | def test_open_provider_port(self): |
342 | @@ -127,3 +128,174 @@ |
343 | str(ex), |
344 | "Unexpected EC2Error getting open ports on machine i-foobar: " |
345 | "The instance ID 'i-foobar' does not exist") |
346 | + |
347 | + |
348 | +class EC2RemoveGroupsTest(EC2TestMixin, TestCase): |
349 | + |
350 | + @inlineCallbacks |
351 | + def test_remove_security_groups(self): |
352 | + """Test normal running seen in polling instances for their state.""" |
353 | + system_state = MockInstanceState( |
354 | + self, |
355 | + ["i-amkillable", "i-amkillabletoo"], [0, 2], |
356 | + [["shutting-down", "shutting-down"], |
357 | + ["shutting-down", "shutting-down"], |
358 | + ["shutting-down", "terminated"], |
359 | + ["terminated"]]) |
360 | + |
361 | + self.ec2.describe_instances("i-amkillable", "i-amkillabletoo") |
362 | + self.mocker.call(system_state.get_round) |
363 | + self.mocker.count(3) |
364 | + |
365 | + self.ec2.describe_instances("i-amkillable") |
366 | + self.mocker.call(system_state.get_round) |
367 | + |
368 | + self.ec2.delete_security_group(MATCH_GROUP) |
369 | + deleted_groups = Observed() |
370 | + self.mocker.call(deleted_groups.add) |
371 | + self.mocker.count(2) |
372 | + self.mocker.replay() |
373 | + |
374 | + provider = self.get_provider() |
375 | + yield remove_security_groups( |
376 | + provider, ["i-amkillable", "i-amkillabletoo"]) |
377 | + self.assertEquals( |
378 | + deleted_groups.items, |
379 | + set(["ensemble-moon-0", "ensemble-moon-2"])) |
380 | + |
381 | + @inlineCallbacks |
382 | + def test_machine_not_in_machine_security_group(self): |
383 | + """Old environments do not have machines in security groups. |
384 | + |
385 | + TODO It might be desirable to change this behavior to log as an |
386 | + error, or even raise as a provider exception. |
387 | + """ |
388 | + log = self.capture_logging() |
389 | + |
390 | + # Instance is not in a machine security group, just its env |
391 | + # security group (ensemble-moon) |
392 | + self.ec2.describe_instances("i-amkillable") |
393 | + self.mocker.result(succeed([ |
394 | + self.get_instance("i-amkillable", "terminated", groups=[])])) |
395 | + self.mocker.replay() |
396 | + |
397 | + provider = self.get_provider() |
398 | + yield remove_security_groups(provider, ["i-amkillable"]) |
399 | + self.assertIn( |
400 | + "Ignoring missing machine security group for instance " |
401 | + "'i-amkillable'", |
402 | + log.getvalue()) |
403 | + |
404 | + @inlineCallbacks |
405 | + def test_cannot_delete_machine_security_group(self): |
406 | + """Verify this raises error. |
407 | + |
408 | + Note it is possible for a malicious admin to start other |
409 | + machines with an existing machine security group, so need to |
410 | + test this removal is robust in this case. Another scenario |
411 | + that might cause this is that there's an EC2 timeout. |
412 | + """ |
413 | + system_state = MockInstanceState( |
414 | + self, ["i-amkillable"], [0], |
415 | + [["shutting-down"], |
416 | + ["shutting-down"], |
417 | + ["shutting-down"], |
418 | + ["terminated"]]) |
419 | + |
420 | + self.ec2.describe_instances("i-amkillable") |
421 | + self.mocker.call(system_state.get_round) |
422 | + self.mocker.count(4) |
423 | + |
424 | + self.ec2.delete_security_group("ensemble-moon-0") |
425 | + self.mocker.result(fail( |
426 | + self.get_ec2_error( |
427 | + "ensemble-moon-0", |
428 | + format="There are active instances using security group %r" |
429 | + ))) |
430 | + |
431 | + self.mocker.replay() |
432 | + |
433 | + provider = self.get_provider() |
434 | + ex = yield self.assertFailure( |
435 | + remove_security_groups(provider, ["i-amkillable"]), |
436 | + ProviderInteractionError) |
437 | + self.assertEquals( |
438 | + str(ex), |
439 | + "EC2 error when attempting to delete group ensemble-moon-0: " |
440 | + "Error Message: There are active instances using security group " |
441 | + "'ensemble-moon-0'") |
442 | + |
443 | + @inlineCallbacks |
444 | + def test_remove_security_groups_takes_too_long(self): |
445 | + """Verify that removing security groups doesn't continue indefinitely. |
446 | + |
447 | + This scenario might happen if the instance is taking too long |
448 | + to report as going into the terminated state, although not |
449 | + likely after 1000 polls! |
450 | + """ |
451 | + log = self.capture_logging() |
452 | + states = [["shutting-down", "shutting-down"], |
453 | + ["shutting-down", "shutting-down"], |
454 | + ["shutting-down", "terminated"]] |
455 | + keeps_on_going = [["shutting-down"] for i in xrange(1000)] |
456 | + states.extend(keeps_on_going) |
457 | + |
458 | + system_state = MockInstanceState( |
459 | + self, ["i-amkillable", "i-amkillabletoo"], [0, 2], states) |
460 | + |
461 | + self.ec2.describe_instances("i-amkillable", "i-amkillabletoo") |
462 | + self.mocker.call(system_state.get_round) |
463 | + self.mocker.count(3) |
464 | + |
465 | + # Keep the rounds going until it exits with the error message |
466 | + self.ec2.describe_instances("i-amkillable") |
467 | + self.mocker.call(system_state.get_round) |
468 | + self.mocker.count(197) |
469 | + |
470 | + # Verify that at least this machine security group was deleted |
471 | + deleted_groups = Observed() |
472 | + self.ec2.delete_security_group("ensemble-moon-2") |
473 | + self.mocker.call(deleted_groups.add) |
474 | + |
475 | + self.mocker.replay() |
476 | + |
477 | + provider = self.get_provider() |
478 | + yield remove_security_groups( |
479 | + provider, ["i-amkillable", "i-amkillabletoo"]) |
480 | + self.assertEquals(deleted_groups.items, set(["ensemble-moon-2"])) |
481 | + self.assertIn( |
482 | + "Instance shutdown taking too long, " |
483 | + "could not delete groups ensemble-moon-0", |
484 | + log.getvalue()) |
485 | + |
486 | + @inlineCallbacks |
487 | + def test_destroy_environment_security_group(self): |
488 | + """Verify the deletion of the security group for the environment""" |
489 | + self.ec2.delete_security_group("ensemble-moon") |
490 | + self.mocker.result(succeed(True)) |
491 | + self.mocker.replay() |
492 | + |
493 | + provider = self.get_provider() |
494 | + destroyed = yield destroy_environment_security_group(provider) |
495 | + self.assertTrue(destroyed) |
496 | + |
497 | + @inlineCallbacks |
498 | + def test_destroy_environment_security_group_missing(self): |
499 | + """Verify ignores errors in deleting the env security group""" |
500 | + log = self.capture_logging(level=logging.DEBUG) |
501 | + self.ec2.delete_security_group("ensemble-moon") |
502 | + self.mocker.result(fail( |
503 | + self.get_ec2_error( |
504 | + "ensemble-moon", |
505 | + format="The security group %r does not exist" |
506 | + ))) |
507 | + self.mocker.replay() |
508 | + |
509 | + provider = self.get_provider() |
510 | + destroyed = yield destroy_environment_security_group(provider) |
511 | + self.assertFalse(destroyed) |
512 | + self.assertIn( |
513 | + "Ignoring EC2 error when attempting to delete group " |
514 | + "ensemble-moon: Error Message: The security group " |
515 | + "'ensemble-moon' does not exist", |
516 | + log.getvalue()) |
517 | |
518 | === modified file 'ensemble/providers/ec2/tests/test_shutdown.py' |
519 | --- ensemble/providers/ec2/tests/test_shutdown.py 2011-08-16 08:33:47 +0000 |
520 | +++ ensemble/providers/ec2/tests/test_shutdown.py 2011-08-29 18:36:25 +0000 |
521 | @@ -1,7 +1,8 @@ |
522 | -from twisted.internet.defer import fail, succeed |
523 | +from twisted.internet.defer import succeed, fail, inlineCallbacks |
524 | |
525 | from ensemble.lib.testing import TestCase |
526 | -from ensemble.providers.ec2.tests.common import EC2TestMixin |
527 | +from ensemble.providers.ec2.tests.common import ( |
528 | + EC2TestMixin, MATCH_GROUP, Observed, MockInstanceState) |
529 | |
530 | from ensemble.machine import ProviderMachine |
531 | |
532 | @@ -15,23 +16,36 @@ |
533 | |
534 | class EC2ShutdownMachineTest(EC2TestMixin, TestCase): |
535 | |
536 | + @inlineCallbacks |
537 | def test_shutdown_machine(self): |
538 | - instance = self.get_instance("i-foobar") |
539 | + system_state = MockInstanceState( |
540 | + self, ["i-foobar"], [0], |
541 | + [["running"], |
542 | + ["shutting-down"], |
543 | + ["shutting-down"], |
544 | + ["shutting-down"], |
545 | + ["terminated"]]) |
546 | + |
547 | self.ec2.describe_instances("i-foobar") |
548 | - self.mocker.result(succeed([instance])) |
549 | + self.mocker.call(system_state.get_round) |
550 | + self.mocker.count(5) |
551 | + |
552 | self.ec2.terminate_instances("i-foobar") |
553 | self.mocker.result(succeed([("i-foobar", "running", "shutting-down")])) |
554 | + |
555 | + self.ec2.delete_security_group("ensemble-moon-0") |
556 | + deleted_groups = Observed() |
557 | + self.mocker.call(deleted_groups.add) |
558 | + |
559 | self.mocker.replay() |
560 | |
561 | machine = EC2ProviderMachine("i-foobar") |
562 | provider = self.get_provider() |
563 | - d = provider.shutdown_machine(machine) |
564 | - |
565 | - def verify(machine): |
566 | - self.assertTrue(isinstance(machine, EC2ProviderMachine)) |
567 | - self.assertEquals(machine.instance_id, "i-foobar") |
568 | - d.addCallback(verify) |
569 | - return d |
570 | + machine = yield provider.shutdown_machine(machine) |
571 | + self.assertTrue(isinstance(machine, EC2ProviderMachine)) |
572 | + self.assertEquals(machine.instance_id, "i-foobar") |
573 | + # Notably, ensemble-moon is not mock deleted |
574 | + self.assertEquals(deleted_groups.items, set(["ensemble-moon-0"])) |
575 | |
576 | def test_shutdown_machine_invalid_group(self): |
577 | """ |
578 | @@ -56,7 +70,7 @@ |
579 | def test_shutdown_machine_invalid_machine(self): |
580 | """ |
581 | Attempting to shutdown a machine that from a different provider |
582 | - type will raise a syntaxerror. |
583 | + type will raise a `ProviderError`. |
584 | """ |
585 | self.mocker.replay() |
586 | machine = ProviderMachine("i-foobar") |
587 | @@ -71,16 +85,14 @@ |
588 | d.addCallback(check_error) |
589 | return d |
590 | |
591 | + @inlineCallbacks |
592 | def test_shutdown_machines_none(self): |
593 | self.mocker.replay() |
594 | provider = self.get_provider() |
595 | - d = provider.shutdown_machines([]) |
596 | - |
597 | - def verify(result): |
598 | - self.assertEquals(result, []) |
599 | - d.addCallback(verify) |
600 | - return d |
601 | - |
602 | + result = yield provider.shutdown_machines([]) |
603 | + self.assertEquals(result, []) |
604 | + |
605 | + @inlineCallbacks |
606 | def test_shutdown_machines_some_invalid(self): |
607 | self.ec2.describe_instances("i-amkillable", "i-amalien", "i-amdead") |
608 | self.mocker.result(succeed([ |
609 | @@ -90,104 +102,148 @@ |
610 | self.mocker.replay() |
611 | |
612 | provider = self.get_provider() |
613 | - d = provider.shutdown_machines([ |
614 | - EC2ProviderMachine("i-amkillable"), |
615 | - EC2ProviderMachine("i-amalien"), |
616 | - EC2ProviderMachine("i-amdead")]) |
617 | - self.failUnlessFailure(d, MachinesNotFound) |
618 | - |
619 | - def verify(error): |
620 | - self.assertEquals(str(error), |
621 | - "Cannot find machines: i-amalien, i-amdead") |
622 | - d.addCallback(verify) |
623 | - return d |
624 | - |
625 | + ex = yield self.assertFailure( |
626 | + provider.shutdown_machines([ |
627 | + EC2ProviderMachine("i-amkillable"), |
628 | + EC2ProviderMachine("i-amalien"), |
629 | + EC2ProviderMachine("i-amdead")]), |
630 | + MachinesNotFound) |
631 | + self.assertEquals(str(ex), |
632 | + "Cannot find machines: i-amalien, i-amdead") |
633 | + |
634 | + @inlineCallbacks |
635 | def test_shutdown_machines_some_success(self): |
636 | + """Verify that shutting down some machines works. |
637 | + |
638 | + In particular, the environment as a whole is not removed |
639 | + because there's still the environment security group left.""" |
640 | + system_state = MockInstanceState( |
641 | + self, |
642 | + ["i-amkillable", "i-amkillabletoo"], [0, 2], |
643 | + [["running", "running"], |
644 | + ["shutting-down", "shutting-down"], |
645 | + ["shutting-down", "shutting-down"], |
646 | + ["shutting-down", "terminated"], |
647 | + ["terminated"]]) |
648 | + |
649 | self.ec2.describe_instances("i-amkillable", "i-amkillabletoo") |
650 | - self.mocker.result(succeed([ |
651 | - self.get_instance("i-amkillable"), |
652 | - self.get_instance("i-amkillabletoo")])) |
653 | + self.mocker.call(system_state.get_round) |
654 | + self.mocker.count(4) |
655 | + |
656 | + self.ec2.describe_instances("i-amkillable") |
657 | + self.mocker.call(system_state.get_round) |
658 | + |
659 | self.ec2.terminate_instances("i-amkillable", "i-amkillabletoo") |
660 | self.mocker.result(succeed([ |
661 | ("i-amkillable", "running", "shutting-down"), |
662 | ("i-amkillabletoo", "running", "shutting-down")])) |
663 | + |
664 | + deleted_groups = Observed() |
665 | + self.ec2.delete_security_group("ensemble-moon-0") |
666 | + self.mocker.call(deleted_groups.add) |
667 | + |
668 | + self.ec2.delete_security_group("ensemble-moon-2") |
669 | + self.mocker.call(deleted_groups.add) |
670 | + |
671 | self.mocker.replay() |
672 | |
673 | provider = self.get_provider() |
674 | - d = provider.shutdown_machines([ |
675 | - EC2ProviderMachine("i-amkillable"), |
676 | - EC2ProviderMachine("i-amkillabletoo")]) |
677 | - |
678 | - def verify(result): |
679 | - (machine_1, machine_2) = result |
680 | - self.assertTrue(isinstance(machine_1, EC2ProviderMachine)) |
681 | - self.assertEquals(machine_1.instance_id, "i-amkillable") |
682 | - self.assertTrue(isinstance(machine_2, EC2ProviderMachine)) |
683 | - self.assertEquals(machine_2.instance_id, "i-amkillabletoo") |
684 | - d.addCallback(verify) |
685 | - return d |
686 | + machine_1, machine_2 = yield provider.shutdown_machines([ |
687 | + EC2ProviderMachine("i-amkillable"), |
688 | + EC2ProviderMachine("i-amkillabletoo")]) |
689 | + self.assertTrue(isinstance(machine_1, EC2ProviderMachine)) |
690 | + self.assertEquals(machine_1.instance_id, "i-amkillable") |
691 | + self.assertTrue(isinstance(machine_2, EC2ProviderMachine)) |
692 | + self.assertEquals(machine_2.instance_id, "i-amkillabletoo") |
693 | + self.assertEquals( |
694 | + deleted_groups.items, |
695 | + set(["ensemble-moon-0", "ensemble-moon-2"])) |
696 | |
697 | |
698 | class EC2DestroyTest(EC2TestMixin, TestCase): |
699 | |
700 | + @inlineCallbacks |
701 | def test_destroy_environment(self): |
702 | """ |
703 | The destroy_environment operation terminates all running and pending |
704 | - instances associated to the C{MachineProvider} instance. |
705 | + instances associated to the `MachineProvider` instance. |
706 | """ |
707 | self.s3.put_object("moon", "provider-state", "{}\n") |
708 | self.mocker.result(succeed(None)) |
709 | self.ec2.describe_instances() |
710 | - self.mocker.result(succeed([ |
711 | - self.get_instance("i-canbekilled"), |
712 | - self.get_instance("i-amdead", state="terminated"), |
713 | + instances = [ |
714 | + self.get_instance("i-canbekilled", machine_id=0), |
715 | + self.get_instance("i-amdead", machine_id=1, state="terminated"), |
716 | self.get_instance("i-dontbelong", groups=["unknown"]), |
717 | - self.get_instance("i-canbekilledtoo", state="pending")])) |
718 | + self.get_instance( |
719 | + "i-canbekilledtoo", machine_id=2, state="pending")] |
720 | + self.mocker.result(succeed(instances)) |
721 | self.ec2.describe_instances("i-canbekilled", "i-canbekilledtoo") |
722 | self.mocker.result(succeed([ |
723 | - self.get_instance("i-canbekilled"), |
724 | - self.get_instance("i-canbekilledtoo", state="pending")])) |
725 | + self.get_instance("i-canbekilled"), |
726 | + self.get_instance("i-canbekilledtoo", state="pending")])) |
727 | self.ec2.terminate_instances("i-canbekilled", "i-canbekilledtoo") |
728 | self.mocker.result(succeed([ |
729 | - ("i-canbekilled", "running", "shutting-down"), |
730 | - ("i-canbekilledtoo", "pending", "shutting-down")])) |
731 | + ("i-canbekilled", "running", "shutting-down"), |
732 | + ("i-canbekilledtoo", "pending", "shutting-down")])) |
733 | + |
734 | + self.ec2.describe_instances("i-canbekilled", "i-canbekilledtoo") |
735 | + states = [ |
736 | + self.get_instance( |
737 | + "i-canbekilled", state="terminated", machine_id=0), |
738 | + self.get_instance( |
739 | + "i-canbekilledtoo", state="terminated", machine_id=2)] |
740 | + self.mocker.result(succeed(states)) |
741 | + |
742 | + self.ec2.delete_security_group(MATCH_GROUP) |
743 | + deleted_groups = Observed() |
744 | + self.mocker.call(deleted_groups.add) |
745 | + self.mocker.count(3) |
746 | + |
747 | self.mocker.replay() |
748 | |
749 | provider = self.get_provider() |
750 | - d = provider.destroy_environment() |
751 | - |
752 | - def verify(result): |
753 | - (machine_1, machine_2) = result |
754 | - self.assertTrue(isinstance(machine_1, EC2ProviderMachine)) |
755 | - self.assertEquals(machine_1.instance_id, "i-canbekilled") |
756 | - self.assertTrue(isinstance(machine_2, EC2ProviderMachine)) |
757 | - self.assertEquals(machine_2.instance_id, "i-canbekilledtoo") |
758 | - d.addCallback(verify) |
759 | - return d |
760 | - |
761 | + machine_1, machine_2 = yield provider.destroy_environment() |
762 | + self.assertTrue(isinstance(machine_1, EC2ProviderMachine)) |
763 | + self.assertEquals(machine_1.instance_id, "i-canbekilled") |
764 | + self.assertTrue(isinstance(machine_2, EC2ProviderMachine)) |
765 | + self.assertEquals(machine_2.instance_id, "i-canbekilledtoo") |
766 | + self.assertEquals( |
767 | + deleted_groups.items, |
768 | + set(["ensemble-moon-0", "ensemble-moon-2", "ensemble-moon"])) |
769 | + |
770 | + @inlineCallbacks |
771 | def test_s3_failure(self): |
772 | """Failing to store empty state should not stop us killing machines""" |
773 | self.s3.put_object("moon", "provider-state", "{}\n") |
774 | self.mocker.result(fail(SomeError())) |
775 | self.ec2.describe_instances() |
776 | self.mocker.result(succeed([self.get_instance("i-canbekilled")])) |
777 | + system_state = MockInstanceState( |
778 | + self, ["i-canbekilled"], [0], |
779 | + [["running"], |
780 | + ["shutting-down"], |
781 | + ["shutting-down"], |
782 | + ["shutting-down"], |
783 | + ["terminated"]]) |
784 | self.ec2.describe_instances("i-canbekilled") |
785 | - self.mocker.result(succeed([self.get_instance("i-canbekilled")])) |
786 | + self.mocker.call(system_state.get_round) |
787 | + self.mocker.count(5) |
788 | self.ec2.terminate_instances("i-canbekilled") |
789 | self.mocker.result(succeed([ |
790 | ("i-canbekilled", "running", "shutting-down")])) |
791 | + self.ec2.delete_security_group("ensemble-moon-0") |
792 | + self.mocker.result(succeed(True)) |
793 | + self.ec2.delete_security_group("ensemble-moon") |
794 | + self.mocker.result(succeed(True)) |
795 | self.mocker.replay() |
796 | |
797 | provider = self.get_provider() |
798 | - d = provider.destroy_environment() |
799 | - |
800 | - def verify(result): |
801 | - (machine,) = result |
802 | - self.assertTrue(isinstance(machine, EC2ProviderMachine)) |
803 | - self.assertEquals(machine.instance_id, "i-canbekilled") |
804 | - d.addCallback(verify) |
805 | - return d |
806 | - |
807 | + machine, = yield provider.destroy_environment() |
808 | + self.assertTrue(isinstance(machine, EC2ProviderMachine)) |
809 | + self.assertEquals(machine.instance_id, "i-canbekilled") |
810 | + |
811 | + @inlineCallbacks |
812 | def test_shutdown_no_instances(self): |
813 | """ |
814 | If there are no instances to shutdown, running the destroy_environment |
815 | @@ -197,12 +253,14 @@ |
816 | self.mocker.result(succeed(None)) |
817 | self.ec2.describe_instances() |
818 | self.mocker.result(succeed([])) |
819 | + self.ec2.delete_security_group("ensemble-moon") |
820 | + self.mocker.result(fail( |
821 | + self.get_ec2_error( |
822 | + "ensemble-moon", |
823 | + format="The security group %r does not exist" |
824 | + ))) |
825 | self.mocker.replay() |
826 | |
827 | provider = self.get_provider() |
828 | - d = provider.destroy_environment() |
829 | - |
830 | - def verify(result): |
831 | - self.assertEquals(result, []) |
832 | - d.addCallback(verify) |
833 | - return d |
834 | + result = yield provider.destroy_environment() |
835 | + self.assertEquals(result, []) |
The logic looks mostly good, besides for [8] maybe. Some organizational issues, though.
[1]
+    def _get_ensemble_security_group(self):
+        return "ensemble-%s" % self.environment_name
+
+    def _get_machine_security_group(self, instance):
+        ensemble_security_group = self._get_ensemble_security_group()
+        for group in instance.reservation.groups:
+            if group != ensemble_security_group:
+                return group
+
+        # Ignore if no such group exists; this allows some limited
+        # backwards compatibility with old setups without machine
+        # security group
+        log.debug("No associated group for instance %r", instance.instance_id)
+        return None
+
+    @inlineCallbacks
+    def _delete_security_group(self, group):
+        yield self.ec2.delete_security_group(group)
+        log.debug("Deleted security group %r", group)
These utility functions also have little relation to the provider interface
itself. Please move them to top levels under ensemble.providers.ec2.securitygroup.
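As a rough illustration of that suggestion, the helpers could become plain functions in the securitygroup module. The names and the explicit `environment_name` parameter below are assumptions for the sketch, not the merged API:

```python
# Hypothetical module-level versions of the suggested helpers; names
# and signatures are illustrative only.
def ensemble_security_group_name(environment_name):
    """Name of the per-environment security group."""
    return "ensemble-%s" % environment_name


def machine_security_group_name(environment_name, reservation_groups):
    """Pick the per-machine group out of an instance's reservation
    groups, i.e. any group that is not the environment group."""
    env_group = ensemble_security_group_name(environment_name)
    for group in reservation_groups:
        if group != env_group:
            return group
    # No machine group found: tolerate old setups without one
    return None
```

Passing the environment name in explicitly is what lets these be tested in isolation, without constructing a provider.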
[2]
+    # Repeatedly poll EC2 until instances are in terminated state;
(...)
+    except EC2Error:
+        pass  # Ignore, cannot delete if other machines are using
+
+    returnValue(killable_machines)
Same thing. Big algorithm polluting the interface of the provider with
lots of domain-specific knowledge about a minor aspect.
Please create a function named remove_security_groups under the same
module as suggested above, and test it in isolation.
[3]
+    i = 0
+    while wait_on and i < 60:
(...)
+        i += 1
+        if wait_on:
+            yield self._sleep(1)  # Reasonable sleep to continue polling
Minor hint:
for i in range(60):
    (...)
    if not wait_on:
        break
    yield self._sleep(1)
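The bounded loop being suggested can be exercised outside Twisted. This standalone sketch assumes a synchronous `describe`/`sleep` pair standing in for describe_instances and the deferred sleep:

```python
def wait_for_termination(describe, sleep, instance_ids, max_polls=60):
    """Poll until every instance reports 'terminated', giving up after
    max_polls rounds.  `describe` maps a set of ids to (id, state)
    pairs; `sleep` stands in for the non-blocking sleep."""
    wait_on = set(instance_ids)
    for _ in range(max_polls):
        for instance_id, state in describe(wait_on):
            if state == "terminated":
                wait_on.discard(instance_id)
        if not wait_on:
            return True   # everything terminated
        sleep(1)          # back off before the next round
    return False          # gave up; some instances still shutting down
```

With this shape, dropping the sleep (point [4] below) is a one-line change, and the cap on rounds can be tested by injecting a `describe` that never reaches "terminated".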
[4]
Try dropping the sleep in that loop entirely. The describe_instances
operation is a natural limiter.
When you do that, how many iterations do you get in practice before
the operation completes, and how many seconds does it take?
[5]
+    def _sleep(self, delay):
+        """Non-blocking sleep."""
+        from twisted.internet import reactor
+
+        deferred = Deferred()
+        reactor.callLater(delay, deferred.callback, None)
+        return deferred
If you dropped it in the point above, remove this. If not, please move
this to ensemble.lib.twistedutils and test it accordingly.
[6]
+    log.debug("%d Info on instance: %s, %s",
+              i, instance.instance_id, instance.instance_state)
Slightly better debug message:
"[iteration %d] Instance %s state: %s"
[7]
+    log.debug("Instance %r was terminated", instance.instance_id)
Drop this.. the exact same information was just logged.
[8]
My understanding is that describe_instances would blow up if some of the
instances were not found. If that's the case, the algorithm won't work
in practice.
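If describe_instances does indeed fail on unknown ids, one way to keep the poll robust is to treat a not-found error as "already terminated". The error class below is a stand-in for illustration, not the real EC2Error handling:

```python
class InstanceNotFound(Exception):
    """Stand-in for the provider error raised for unknown instance ids."""


def poll_survivors(describe_one, wait_on):
    """Return the subset of wait_on that is not yet terminated,
    treating an instance the API no longer reports as already gone."""
    survivors = set()
    for instance_id in wait_on:
        try:
            state = describe_one(instance_id)
        except InstanceNotFound:
            continue  # no longer reported: assume terminated
        if state != "terminated":
            survivors.add(instance_id)
    return survivors
```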
[9]
+    # Wait for the security groups deleted in this round
+    yield DeferredList(deletions)
Why? It's killing parallelism unnecessarily, it seems.
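The parallelism point can be seen with asyncio as a stand-in for Twisted's Deferred machinery: gathering all deletions (the analogue of firing a DeferredList once at the end) keeps every call in flight at once, whereas awaiting each deletion in turn serializes the round-trips.

```python
import asyncio


async def delete_group(name, deleted):
    # Stand-in for the EC2 delete_security_group round-trip.
    await asyncio.sleep(0)
    deleted.append(name)


async def delete_all(groups):
    deleted = []
    # gather() plays the role of DeferredList here: every deletion is
    # started before any is awaited, so they proceed concurrently.
    await asyncio.gather(*(delete_group(g, deleted) for g in groups))
    return deleted


result = asyncio.run(delete_all(["ensemble-moon-0", "ensemble-moon-2"]))
```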
[10]
+ log.info("Shutdo...