Merge lp:~fwereade/pyjuju/fix-charm-upgrade into lp:pyjuju

Proposed by William Reade
Status: Merged
Approved by: Kapil Thangavelu
Approved revision: 458
Merged at revision: 458
Proposed branch: lp:~fwereade/pyjuju/fix-charm-upgrade
Merge into: lp:pyjuju
Prerequisite: lp:~fwereade/pyjuju/dont-stop-workflows
Diff against target: 3535 lines (+1424/-808)
15 files modified
docs/source/internals/unit-agent-persistence.rst (+138/-0)
juju/agents/tests/test_unit.py (+79/-141)
juju/agents/unit.py (+59/-136)
juju/control/tests/test_resolved.py (+16/-8)
juju/control/tests/test_status.py (+11/-37)
juju/control/tests/test_upgrade_charm.py (+4/-2)
juju/errors.py (+11/-1)
juju/lib/statemachine.py (+105/-28)
juju/lib/tests/test_statemachine.py (+214/-76)
juju/state/service.py (+10/-10)
juju/tests/test_errors.py (+31/-9)
juju/unit/lifecycle.py (+122/-24)
juju/unit/tests/test_lifecycle.py (+121/-50)
juju/unit/tests/test_workflow.py (+435/-227)
juju/unit/workflow.py (+68/-59)
To merge this branch: bzr merge lp:~fwereade/pyjuju/fix-charm-upgrade
Reviewer Review Type Date Requested Status
Kapil Thangavelu (community) Approve
Jim Baker (community) Approve
Review via email: mp+85271@code.launchpad.net

Description of the change

I'm pretty sure that charm upgrades will now:

* fail silently, as before, on early errors (when no changes have been made to any state apart from the upgrade flag)
* error out on early errors if recovering from an earlier failed operation (the *whole* *thing* needs to complete successfully...)
* induce charm_upgrade_error workflow states, when the workflow comes up in a "started" state but midway through a charm upgrade
* do the Right Thing re fire_hooks, which is to *ignore* fire_hooks in any invocation in which the charm is actually written to disk; in this case, it is vital to *always* fire the upgraded-charm hook, because we guarantee it's the *first* one fired after the operation completes.
* actually overwrite old charms, instead of just unpacking into the same directory (and thus leaving droppings).

To post a comment you must log in.
Revision history for this message
Jim Baker (jimbaker) wrote :

+1, looks good to me. The only thing:

$ pep8 juju/unit
juju/unit/tests/test_lifecycle.py:830:1: E303 too many blank lines (3)
juju/unit/tests/test_workflow.py:619:80: E501 line too long (80 characters)

review: Approve
Revision history for this message
Kapil Thangavelu (hazmat) wrote :

[0]

+ def catch_up(self):
+ # There's nothing we can wait on to determine when all the consequences
+ # of the cb_watch_upgrade_flag have come to pass; this seems to be a
+ # "reliable workaround"...
+ for _ in range(10):
+ yield self.poke_zk()

eek, sleep and pray is not kosher ;-). tests need a definitive observation point, either using the existing wait_on_state helper or create a wait on log message helper, to track the done ness of cb watch upgrade.

[1]

charm directories can contain charm state, ie. their not read only. So they can't be overwritten wholesale. they'd need a manifest on install to delta compute the upgrade for files to delete, i'd go ahead and punt on this one for now, it can be handled separately.

review: Needs Fixing
Revision history for this message
William Reade (fwereade) wrote :

[0]

Replaced with a wait_for_log method.

[1]

Fixed.

446. By William Reade

merge parent

447. By William Reade

merge parent

Revision history for this message
Kapil Thangavelu (hazmat) wrote :

[2]

+ returnValue(state_dict.get("state"))

this seems suspect, we have some state data, but its invalid, and we transparently assume its a new/starting workflow .. what if this was a previously initialized system, we'd be transparently resetting it without warning.

[3]

    @property
    def _upgrade_flag(self):
        return "/units/%s/upgrade" % self._internal_id

for clarity, should be _path suffixed.

[4]

There don't appear to be any tests for charm upgrade prepare failing in the upgrade charm lifecycle method, but irrelevant given the following.

[5]

as discussed in person, let's remove the first_attempt logic, any error in the upgrade transition should result in an error state.

448. By William Reade

added upgrade_charm_ready and charm_replaced states, along with suitable transitions and error states

Revision history for this message
William Reade (fwereade) wrote :

[2]

Hmm, very sensible, not sure why I did that :-/.

[3]

Sounds good, done.

[4/5]

OK, following discussion, I've broken the upgrade process into 3 basic transitions (plus associated error/retry) ones: "begin_charm_upgrade", "replace_charm", "finish_charm_upgrade", with 2 new (non-error) states: "charm_upgrade_ready" and "charm_replaced".

There's no more first_attempt malarkey, BUT the "begin_charm_upgrade" transition will raise an error if there is no new charm available, hence aborting the upgrade operation (and keeping the workflow in state "started"); and thereby ensuring that any upgrade operation that makes it to the "charm_upgrade_ready" state represents a real upgrade, and must therefore pass through "replace_charm" and "finish_charm_upgrade" before it returns to the "started" state.

449. By William Reade

address review point

450. By William Reade

address final review point

451. By William Reade

merge parent

452. By William Reade

merge parent

Revision history for this message
William Reade (fwereade) wrote :

Forces in play/further justifications for chosen approach:

1) The use of a state variable for the critical section was criticised as being a bit surprising/magical, so it seemed better to store states as, well, states.
2) first_attempt was also criticised as too complex, so I dropped the silent-early-failure mode (ie once we know an upgrade is a sensible thing to do, we're locked into it and can't escape).
3) I was reluctant to move too much logic into the top-level callback for the upgrade flag watch. It is true that putting the check of service-charm-id/unit-charm-id in there would indeed eliminate the first state transition, but it seemed cleanest to just try the transition and allow the state machinery to handle backing out in response to the lifecycle's complaints.
4) Splitting upgrade-charm and run-hook into two distinct transitions is definitely a win, because (i) the possible errors in the two states are very different, and it's good to distinguish them; (ii) breaking the process into a pipeline like this eliminated further complexity in the original UL.upgrade_charm method (we handle all state tracking with explicit state machine states
5) The repeated check for service/unit charm ids in UL.replace_charm is not *necessary*, but it seemed helpful to avoid redownloading and reinstalling exactly the same bits. We could drop it without doing anything any actual *harm*.

Summary: we agreed on having 2 transitions to handle the process of upgrading; farming the initial can-we-actually-upgrade check out to a third new transition, which fails back out to "started", keeps responsibility for state-tracking in the workflow where, IMO, it should be anyway. I don't consider the additional workflow states to represent serious complexity: the only other state it interacts with is "started".

Aside: rereading IRC, I may have missed something re: reentrancy concerns. Each of the transition methods on UL is, AFAICT, safe for restart midway through; and the workflow is responsible for ensuring we only run them at sensible times.

Does any of this help at all?

Revision history for this message
Kapil Thangavelu (hazmat) wrote :
Download full text (4.2 KiB)

Excerpts from William Reade's message of Tue Jan 31 09:11:25 UTC 2012:
> Forces in play/further justifications for chosen approach:
>
> 1) The use of a state variable for the critical section was criticised as being a bit surprising/magical, so it seemed better to store states as, well, states.

Indeed, it does, but as gustavo pointed out in irc, its really for lack of a WAL
transition (either disk or zk) that its needed at all, and a better fit to model
it as such than as additional state/transitions. Its main purpose is to
signal the transition that's currently in play to allow for recovery and
re-execution of that transition instead of merely entering the last recorded
state.

> 2) first_attempt was also criticised as too complex, so I dropped the silent-early-failure mode (ie once we know an upgrade is a sensible thing to do, we're locked into it and can't escape).

yeah.. the distinction between first_attempt was ever checked by the tests.

> 3) I was reluctant to move too much logic into the top-level callback for the upgrade flag watch. It is true that putting the check of service-charm-id/unit-charm-id in there would indeed eliminate the first state transition, but it seemed cleanest to just try the transition and allow the state machinery to handle backing out in response to the lifecycle's complaints.

i think the refactoring thats been done to subsume some of the callback
into the transition is good. Capturing the entire operation into the
workflow machinery has some benefits wrt to status reporting/recording failures.

> 4) Splitting upgrade-charm and run-hook into two distinct transitions is definitely a win, because (i) the possible errors in the two states are very different, and it's good to distinguish them; (ii) breaking the process into a pipeline like this eliminated further complexity in the original UL.upgrade_charm method (we handle all state tracking with explicit state machine states

The question is the recovery harmed by rexecuting the whole transition? Ie.
what's the value from an error recovery point of view of the separate states.

> 5) The repeated check for service/unit charm ids in UL.replace_charm is not *necessary*, but it seemed helpful to avoid redownloading and reinstalling exactly the same bits. We could drop it without doing anything any actual *harm*.
>

I don't think feel this is really a big deal, given its simplicity. There's an
open bug out for creating a formula cache that could alleviate the redownload.

> Summary: we agreed on having 2 transitions to handle the process of upgrading; farming the initial can-we-actually-upgrade check out to a third new transition, which fails back out to "started", keeps responsibility for state-tracking in the workflow where, IMO, it should be anyway. I don't consider the additional workflow states to represent serious complexity: the only other state it interacts with is "started".
>

Yeah.. i've back-tracked on this looking over the implementation and given
given gustavo's WAL insight. It feels like the additional states and
transitions are just adding complexity esp while juggling the executor. Its also
doesn't seems like a benefit to a user h...

Read more...

453. By William Reade

manually back out r448 without screwing up future merges, I hope

454. By William Reade

remove UL.upgrade_charm's first_attempt kwarg

Revision history for this message
William Reade (fwereade) wrote :

Still prefer the old way, but done now :).

455. By William Reade

merge parent

Revision history for this message
Kapil Thangavelu (hazmat) wrote :

[6]

As we discussed over g+, this is looking good The main thing is that the callback/deferred passing should be removed in favor of having lib/statemachine.py do saving/clearing of transitions in progress against state variables. This will make generic recovery of any inflight transition and perhaps special casing in synchronize.

It should also obviate the need for _notify_upgrading, and upgrade_charm's cb_upgrading.

    @inlineCallbacks
    def _notify_upgrading(self, is_upgrading):

[7] just a comment, looking at upgrade_charm's implementation.

The distinction between upgrade.prepare, upgrade.run, and upgrade.ready looks a little superficial. The upgrade.prepare should work or raise an error. On a retry of an upgrade-charm error state we should always ..
This also has the interesting side effect of firing hooks on juju resolved if we're just now downloading the charm successfully, but that seems correct.

456. By William Reade

WorkflowState now (1) has a synchronize method that re-runs inflight

transitions after restart, and (2) requires external locking on state-
changing methods.

(1) is a required change due to the verdict on fix-charm-upgrade (that is:
we don't want additional states for the upgrade process, and we should use
write-ahead logging to maintain state across process death); (2) is
consequently required because of the interaction between the upgrade flag
watch and the resolved watch, both of which take `state == "started"` to
mean "it's safe to fire a transition". This was a pre-existing bug, but was
IMO exacerbated severely by the fact that we now never leave the "started"
state while upgrading a charm -- enough so that I'm not comfortable
proposing a patch that fails to address this issue.

This diff is *big*, but the vast majority of the changes are as a result of
the WorkflowState locking; in particular, many many tests set workflow state
in one way or another, and the necessary changes add a *lot* of noise. In
hindsight, the locking changes could have been made independently, but I
don't think the resulting pair of branches would actually be significantly
easier to deal with.

Various explanatory notes follow:

* External WorkflowState locking (rather than automagic internal locking) was
  chosen for simplicity's sake on the part of the client as well: when one
  to fire a transition conditional on some state being active, it makes
  sense to grab the lock, check, and fire the transition, in the certain
  knowledge that nobody can have changed state underneath you.

* The obvious problem with DeferredLock (that you can tell it's locked, but
  not who owns the lock) has minimal impact, thanks to the unit tests.
  Consider the following scenarios:

  * Code tries to lock, test not holding lock: we're fine.
  * Code tries to lock, test is holding lock: bad test; deadlocks.
  * Code fails to lock, test not holding lock: bad code: asserts.
  * Code fails to lock, test is holding lock: bad code AND bad test.

  The final scenario is the only dangerous one, but it's just a special
  case of the fact that you can *always* write a bad test that passes bad
  code; I think attempting to solve this is out of scope for this universe.

* WorkflowState.fire_transition sets the inflight transition once it knows
  the requested transition is valid, and only clears it explicitly if the
  transition fails without an error transition to follow up (when a
  transition succeeds, set_state implicitly clears inflight).

* WorkflowState.synchronize (1) detects and re-runs inflight transitions and
  (2) has taken over responsibility for the automatic transitions (eg
  None->installed->started) which had previously been handled in UWS/RWS.
  The overlapping concept of success_transition_id has been dropped;
  transitions can now be marked as "automatic".

* WorkflowState.set_inflight really wants to be private, but is exposed for
  testing's sake; get_inflight needs to be exposed so that UWS.synchronize
  doesn't inappropriately start the executor when recovering from mid-
  upgrade process death.

* Er...

* That's it.

R=
CC=
https://codereview.appspot.com/5647064

457. By William Reade

break up synchronize test to avoid occasional timeout during full test runs

Revision history for this message
William Reade (fwereade) wrote :

OK, state-machine-sync has been merged in; are we good to go?

458. By William Reade

merge parent

Revision history for this message
Kapil Thangavelu (hazmat) wrote :

woohoo! make the magic happen :-)

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== added file 'docs/source/internals/unit-agent-persistence.rst'
2--- docs/source/internals/unit-agent-persistence.rst 1970-01-01 00:00:00 +0000
3+++ docs/source/internals/unit-agent-persistence.rst 2012-02-21 10:23:29 +0000
4@@ -0,0 +1,138 @@
5+Notes on unit agent persistence
6+===============================
7+
8+Introduction
9+------------
10+
11+This was first written to explain the extensive changes made in the branch
12+lp:~fwereade/juju/restart-transitions; that branch has been split out into
13+four separate branches, but this discussion should remain a useful guide to
14+the changes made to the unit and relation workflows and lifecycles in the
15+course of making the unit agent resumable.
16+
17+
18+Glossary
19+--------
20+
21+UA = UnitAgent
22+UL = UnitLifecycle
23+UWS = UnitWorkflowState
24+URL = UnitRelationLifecycle
25+RWS = RelationWorkflowState
26+URS = UnitRelationState
27+SRS = ServiceRelationState
28+
29+
30+Technical discussion
31+--------------------
32+
33+Probably the most fundamental change is the addition of a "synchronize" method
34+to both UWS and RWS. Calling "synchronize" should generally be *all* you need
35+to do to put the workflow and associated components into "the right state"; ie
36+ZK state will be restored, the appropriate lifecycle will be started (or not),
37+and any initial transitons will automatically be fired ("start" for RWS;
38+"install", "start" for UWS).
39+
40+The synchronize method keeps responsibility for the lifecycle's state purely in
41+the hands of the workflow; once a workflow is synced, the *only* necessary
42+interactions with it should be in response to changes in ZK.
43+
44+The disadvantage is that lifecycle "start" and "stop" methods have become a
45+touch overloaded:
46+
47+* UL.stop(): now takes "stop_relations" in addition to "fire_hooks", in which
48+ "stop_relations" being True causes the orginal behaviour (transition "up"
49+ RWSs to "down", as when transitioning the UWS to a "stopped" or "error"
50+ state), but False simply causes them to stop watching for changes (in
51+ preparation for an orderly shutdown, for example).
52+
53+* UL.start(): now takes "start_relations" in addition to "fire_hooks", in which
54+ the "start_relations" flag being True causes the original behaviour
55+ (automatically transition "down" RWSs to "up", as when restarting/resolving
56+ the UWS), while False causes the RWSs only to be synced.
57+
58+* URL.start(): now takes "scheduler" in addition to "watches", allowing the
59+ watching and the contained HookScheduler to be controlled separately
60+ (allowing us to actually perform the RWS synchronise correctly).
61+
62+* URL.stop(): still just takes "watches", because there wasn't a scenario in
63+ which I wanted to stop the watches but not the HookScheduler.
64+
65+I still think it's a win, though: and I don't think that turning them into
66+separate methods is the right way to go; "start" and "stop" remain perfectly
67+decent and appropriate names for what they do.
68+
69+Now this has been done, we can always launch directly into whatever state we
70+shut down in, and that's great, because sudden process death doesn't hurt us
71+any more [0] [1]. Except... when we're upgrading a charm. It emerges that the
72+charm upgrade state transition only covers the process of firing the hook, and
73+not the process of actually upgrading the charm.
74+
75+In short, we had a mechanism, completely outside the workflow's purview, for
76+potentially *brutal* modifications of state (both in terms of the charm itself,
77+on disk, and also in that the hook executor should remain stopped forever while
78+in "charm_upgrade_error" state); and this rather scuppered the "restart in the
79+same state" goal. The obvious thing to do was to move the charm upgrade
80+operation into the "charm_upgrade" transition, so we had a *chance* of being
81+able to start in the correct state.
82+
83+UL.upgrade_charm, called by UWS, does itself have subtleties, but it should be
84+reasonably clear when examined in context; the most important point is that it
85+will call back at the start and end of the risky period, and that the UWS's
86+handler for this callback sets a flag in "started"'s state_vars for the
87+duration of the upgrade. If that flag is set when we subsequently start up
88+again and synchronize the UWS, then we know to immediately force the
89+charm_upgrade_error state and work from there.
90+
91+[0] Well, it does, because we need to persist more than just the (already-
92+persisted) workflow state. This branch includes RWS persistence in the UL, as
93+requested in this branch's first pre-review (back in the day...), but does not
94+include HookScheduler persistence in the URLs, so it remains possible for
95+relation hooks which have been queued, but not yet executed, to be lost if the
96+process executes before the queue empties. That will be coming in another
97+branch (resolve-unit-relation-diffs).
98+
99+[1] This seems like a good time to mention the UL's relation-broken handling
100+for relations that went away while the process was stopped: every time
101+._relations is changed, it writes out enough state to recreate a Frankenstein's
102+URS object, which it can then use on load to reconstruct the necessary URL and
103+hence RWS.
104+
105+We don't strictly need to *reconstruct* it in every case -- we can just use
106+SRS.get_unit_state if the relation still exists -- but given that sometimes we
107+do, it seemed senseless to have two code paths for the same operations. Of the
108+RWSs we reconstruct, those with existing SRSs will be synchronized (because we
109+know it's safe to do so), and the remainder will be stored untouched (because
110+we know that _process_service_changes will fire the "depart" transition for us
111+before doing anything else... and the "relation-broken" hook will be executed
112+in a DepartedRelationHookContext, which is rather restricted, and so shouldn't
113+cause the Frankenstein's URS to hit state we can't be sure exists).
114+
115+
116+Appendix: a rough history of changes to restart-transitions
117+-----------------------------------------------------------
118+
119+* Add UWS transitions from "stopped" to "started", so that process restarts can
120+ be made to restart UWSs.
121+* Upon review, add RWS persistence to UL, to ensure we can't miss
122+ relation-broken hooks; as part of this, as discussed, add
123+ DepartedRelationHookContext in which to execute them.
124+* Upon discussion, discover that original UWS "started" -> "stopped" behaviour
125+ on process shutdown is not actually the desired behaviour (and that the
126+ associated RWS "up" -> "down" shouldn't happen either.
127+* Make changes to UL.start/stop, and add UWS/RWS.synchronize, to allow us to
128+ shut down workflows cleanly without transitions and bring them up again in
129+ the same state.
130+* Discover that we don't have any other reason to transition UWS to "stopped";
131+ to actually fire stop hooks at the right time, we need a more sophisticated
132+ system (possibly involving the machine agent telling the unit agent to shut
133+ itself down). Remove the newly-added "restart" transitions, because they're
134+ meaningless now; ponder what good it does us to have a "stopped" state that
135+ we never actually enter; chicken out of actually removing it.
136+* Realise that charm upgrades do an end-run around the whole UWS mechanism, and
137+ resolve to intergate them so I can actually detect upgrades left incomplete
138+ due to process death.
139+* Move charm upgrade operation from agent into UL; come to appreciate the
140+ subtleties of the charm upgrade process; make necessary tweaks to
141+ UL.upgrade_charm, and UWS, to allow for synchronization of incomplete
142+ upgrades.
143
144=== modified file 'juju/agents/tests/test_unit.py'
145--- juju/agents/tests/test_unit.py 2011-12-16 09:23:31 +0000
146+++ juju/agents/tests/test_unit.py 2012-02-21 10:23:29 +0000
147@@ -3,10 +3,9 @@
148 import os
149 import yaml
150
151-from twisted.internet.defer import (
152- inlineCallbacks, returnValue, fail, Deferred)
153+from twisted.internet.defer import inlineCallbacks, returnValue
154
155-from juju.agents.unit import UnitAgent, CharmUpgradeOperation
156+from juju.agents.unit import UnitAgent
157 from juju.agents.base import TwistedOptionNamespace
158 from juju.charm import get_charm_from_path
159 from juju.charm.url import CharmURL
160@@ -74,8 +73,10 @@
161 "stop", "#!/bin/bash\necho stop >> %s" % output_file)
162
163 for k in kw.keys():
164- self.write_hook(k.replace("_", "-"),
165- "#!/bin/bash\necho $0 >> %s" % output_file)
166+ hook_name = k.replace("_", "-")
167+ self.write_hook(
168+ hook_name,
169+ "#!/bin/bash\necho %s >> %s" % (hook_name, output_file))
170
171 return output_file
172
173@@ -136,7 +137,8 @@
174 self.client, self.states["unit"], lifecycle,
175 os.path.join(self.juju_directory, "state"))
176
177- yield workflow.fire_transition("install")
178+ with (yield workflow.lock()):
179+ yield workflow.fire_transition("install")
180 yield lifecycle.stop(fire_hooks=False, stop_relations=False)
181
182 yield self.agent.startService()
183@@ -154,9 +156,10 @@
184 self.client, self.states["unit"], lifecycle,
185 os.path.join(self.juju_directory, "state"))
186
187- yield workflow.fire_transition("install")
188- self.write_exit_hook("stop", 1)
189- yield workflow.fire_transition("stop")
190+ with (yield workflow.lock()):
191+ yield workflow.fire_transition("install")
192+ self.write_exit_hook("stop", 1)
193+ yield workflow.fire_transition("stop")
194
195 yield self.agent.startService()
196 current_state = yield self.agent.workflow.get_state()
197@@ -516,7 +519,8 @@
198 yield hook_deferred
199
200 hook_deferred = self.wait_on_hook("stop", executor=self.agent.executor)
201- yield self.agent.workflow.fire_transition("stop")
202+ with (yield self.agent.workflow.lock()):
203+ yield self.agent.workflow.fire_transition("stop")
204 yield hook_deferred
205
206 self.assertEqual("stop_error", (yield self.agent.workflow.get_state()))
207@@ -544,7 +548,8 @@
208 yield hook_deferred
209
210 hook_deferred = self.wait_on_hook("stop", executor=self.agent.executor)
211- yield self.agent.workflow.fire_transition("stop")
212+ with (yield self.agent.workflow.lock()):
213+ yield self.agent.workflow.fire_transition("stop")
214 yield hook_deferred
215
216 self.assertEqual("stop_error", (yield self.agent.workflow.get_state()))
217@@ -568,6 +573,12 @@
218 self.makeDir(path=os.path.join(self.juju_directory, "charms"))
219
220 @inlineCallbacks
221+ def wait_for_log(self, logger_name, message, level=logging.DEBUG):
222+ output = self.capture_logging(logger_name, level=level)
223+ while message not in output.getvalue():
224+ yield self.sleep(0.1)
225+
226+ @inlineCallbacks
227 def mark_charm_upgrade(self):
228 # Create a new version of the charm
229 repository = self.increment_charm(self.charm)
230@@ -592,158 +603,85 @@
231 yield self.assertState(self.agent.workflow, "started")
232
233 @inlineCallbacks
234- def test_agent_upgrade_watch_continues_on_unexpected_error(self):
235- """The agent watches for unit upgrades and continues if there is an
236- unexpected error."""
237- yield self.mark_charm_upgrade()
238- self.agent.set_watch_enabled(True)
239-
240- output = self.capture_logging(
241- "juju.agents.unit", level=logging.DEBUG)
242-
243- upgrade_done = Deferred()
244-
245- def operation_has_run():
246- upgrade_done.callback(True)
247-
248- operation = self.mocker.patch(CharmUpgradeOperation)
249- operation.run()
250-
251- self.mocker.call(operation_has_run)
252- self.mocker.result(fail(ValueError("magic mouse")))
253- self.mocker.replay()
254-
255- yield self.agent.startService()
256-
257- yield upgrade_done
258- self.assertIn("Error while upgrading", output.getvalue())
259- self.assertIn("magic mouse", output.getvalue())
260-
261- yield self.agent.workflow.fire_transition("stop")
262-
263- @inlineCallbacks
264 def test_agent_upgrade(self):
265 """The agent can succesfully upgrade its charm."""
266- self.agent.set_watch_enabled(False)
267- yield self.agent.startService()
268-
269- yield self.mark_charm_upgrade()
270-
271+ log_written = self.wait_for_log("juju.agents.unit", "Upgrade complete")
272 hook_done = self.wait_on_hook(
273 "upgrade-charm", executor=self.agent.executor)
274- self.write_hook("upgrade-charm", "#!/bin/bash\nexit 0")
275- output = self.capture_logging("unit.upgrade", level=logging.DEBUG)
276-
277- # Do the upgrade
278- upgrade = CharmUpgradeOperation(self.agent)
279- value = yield upgrade.run()
280-
281- # Verify the upgrade.
282- self.assertIdentical(value, True)
283- self.assertIn("Unit upgraded", output.getvalue())
284+
285+ self.agent.set_watch_enabled(True)
286+ yield self.agent.startService()
287+ yield self.mark_charm_upgrade()
288 yield hook_done
289+ yield log_written
290
291+ self.assertIdentical(
292+ (yield self.states["unit"].get_upgrade_flag()),
293+ False)
294 new_charm = get_charm_from_path(
295 os.path.join(self.agent.unit_directory, "charm"))
296-
297 self.assertEqual(
298 self.charm.get_revision() + 1, new_charm.get_revision())
299
300 @inlineCallbacks
301+ def test_agent_upgrade_version_current(self):
302+ """If the unit is running the latest charm, do nothing."""
303+ log_written = self.wait_for_log(
304+ "juju.agents.unit",
305+ "Upgrade ignored: already running latest charm")
306+
307+ old_charm_id = yield self.states["unit"].get_charm_id()
308+ self.agent.set_watch_enabled(True)
309+ yield self.agent.startService()
310+ yield self.states["unit"].set_upgrade_flag()
311+ yield log_written
312+
313+ self.assertIdentical(
314+ (yield self.states["unit"].get_upgrade_flag()), False)
315+ self.assertEquals(
316+ (yield self.states["unit"].get_charm_id()), old_charm_id)
317+
318+
319+ @inlineCallbacks
320 def test_agent_upgrade_bad_unit_state(self):
321- """The an upgrade fails if the unit is in a bad state."""
322- self.agent.set_watch_enabled(False)
323- yield self.agent.startService()
324-
325+ """The upgrade fails if the unit is in a bad state."""
326 # Upload a new version of the unit's charm
327 repository = self.increment_charm(self.charm)
328 charm = yield repository.find(CharmURL.parse("local:series/mysql"))
329 charm, charm_state = yield self.publish_charm(charm.path)
330+ old_charm_id = yield self.states["unit"].get_charm_id()
331+
332+ log_written = self.wait_for_log(
333+ "juju.agents.unit",
334+ "Cannot upgrade: unit is in non-started state configure_error. "
335+ "Reissue upgrade command to try again.")
336+ self.agent.set_watch_enabled(True)
337+ yield self.agent.startService()
338
339 # Mark the unit for upgrade, with an invalid state.
340+ with (yield self.agent.workflow.lock()):
341+ yield self.agent.workflow.fire_transition("error_configure")
342 yield self.states["service"].set_charm_id(charm_state.id)
343 yield self.states["unit"].set_upgrade_flag()
344- yield self.agent.workflow.set_state("start_error")
345-
346- output = self.capture_logging("unit.upgrade", level=logging.DEBUG)
347-
348- # Do the upgrade
349- upgrade = CharmUpgradeOperation(self.agent)
350- value = yield upgrade.run()
351-
352- # Verify the upgrade.
353- self.assertIdentical(value, False)
354- self.assertIn("Unit not in an upgradeable state: start_error",
355- output.getvalue())
356+ yield log_written
357+
358 self.assertIdentical(
359- (yield self.states["unit"].get_upgrade_flag()),
360- False)
361+ (yield self.states["unit"].get_upgrade_flag()), False)
362+ self.assertEquals(
363+ (yield self.states["unit"].get_charm_id()), old_charm_id)
364
365 @inlineCallbacks
366 def test_agent_upgrade_no_flag(self):
367- """An upgrade fails if there is no upgrade flag set."""
368- self.agent.set_watch_enabled(False)
369- yield self.agent.startService()
370- output = self.capture_logging("unit.upgrade", level=logging.DEBUG)
371- upgrade = CharmUpgradeOperation(self.agent)
372- value = yield upgrade.run()
373- self.assertIdentical(value, False)
374- self.assertIn("No upgrade flag set", output.getvalue())
375- yield self.agent.workflow.fire_transition("stop")
376-
377- @inlineCallbacks
378- def test_agent_upgrade_version_current(self):
379- """An upgrade fails if the unit is running the latest charm."""
380- self.agent.set_watch_enabled(False)
381- yield self.agent.startService()
382- yield self.states["unit"].set_upgrade_flag()
383- output = self.capture_logging("unit.upgrade", level=logging.DEBUG)
384- upgrade = CharmUpgradeOperation(self.agent)
385- value = yield upgrade.run()
386- self.assertIdentical(value, True)
387- self.assertIn("Unit already running latest charm", output.getvalue())
388- self.assertFalse((yield self.states["unit"].get_upgrade_flag()))
389-
390- @inlineCallbacks
391- def test_agent_upgrade_hook_failure(self):
392- """An upgrade fails if the upgrade hook errors."""
393- self.agent.set_watch_enabled(False)
394- yield self.agent.startService()
395-
396- # Upload a new version of the unit's charm
397- repository = self.increment_charm(self.charm)
398- charm = yield repository.find(CharmURL.parse("local:series/mysql"))
399- charm, charm_state = yield self.publish_charm(charm.path)
400-
401- # Mark the unit for upgrade
402- yield self.states["service"].set_charm_id(charm_state.id)
403- yield self.states["unit"].set_upgrade_flag()
404-
405- hook_done = self.wait_on_hook(
406- "upgrade-charm", executor=self.agent.executor)
407- self.write_hook("upgrade-charm", "#!/bin/bash\nexit 1")
408- output = self.capture_logging("unit.upgrade", level=logging.DEBUG)
409-
410- # Do the upgrade
411- upgrade = CharmUpgradeOperation(self.agent)
412- value = yield upgrade.run()
413-
414- # Verify the failed upgrade.
415- self.assertIdentical(value, False)
416- self.assertIn("Invoking upgrade transition", output.getvalue())
417- self.assertIn("Upgrade failed.", output.getvalue())
418- yield hook_done
419-
420- # Verify state
421- workflow_state = yield self.agent.workflow.get_state()
422- self.assertEqual("charm_upgrade_error", workflow_state)
423-
424- # Verify new charm is in place
425- new_charm = get_charm_from_path(
426- os.path.join(self.agent.unit_directory, "charm"))
427-
428- self.assertEqual(
429- self.charm.get_revision() + 1, new_charm.get_revision())
430-
431- # Verify upgrade flag is cleared.
432- self.assertFalse((yield self.states["unit"].get_upgrade_flag()))
433+ """An upgrade stops if there is no upgrade flag set."""
434+ log_written = self.wait_for_log(
435+ "juju.agents.unit", "No upgrade flag set")
436+ old_charm_id = yield self.states["unit"].get_charm_id()
437+ self.agent.set_watch_enabled(True)
438+ yield self.agent.startService()
439+ yield log_written
440+
441+ self.assertIdentical(
442+ (yield self.states["unit"].get_upgrade_flag()),
443+ False)
444+ new_charm_id = yield self.states["unit"].get_charm_id()
445+ self.assertEquals(new_charm_id, old_charm_id)
446
447=== modified file 'juju/agents/unit.py'
448--- juju/agents/unit.py 2012-01-10 14:14:28 +0000
449+++ juju/agents/unit.py 2012-02-21 10:23:29 +0000
450@@ -1,7 +1,5 @@
451 import os
452 import logging
453-import shutil
454-import tempfile
455
456 from twisted.internet.defer import inlineCallbacks, returnValue
457
458@@ -14,8 +12,6 @@
459 from juju.unit.lifecycle import UnitLifecycle, HOOK_SOCKET_FILE
460 from juju.unit.workflow import UnitWorkflowState
461
462-from juju.unit.charm import download_charm
463-
464 from juju.agents.base import BaseAgent
465
466
467@@ -66,14 +62,14 @@
468 @inlineCallbacks
469 def start(self):
470 """Start the unit agent process."""
471- self.service_state_manager = ServiceStateManager(self.client)
472+ service_state_manager = ServiceStateManager(self.client)
473
474 # Retrieve our unit and configure working directories.
475 service_name = self.unit_name.split("/")[0]
476- service_state = yield self.service_state_manager.get_service_state(
477+ self.service_state = yield service_state_manager.get_service_state(
478 service_name)
479
480- self.unit_state = yield service_state.get_unit_state(
481+ self.unit_state = yield self.service_state.get_unit_state(
482 self.unit_name)
483 self.unit_directory = os.path.join(
484 self.config["juju_directory"], "units",
485@@ -101,19 +97,20 @@
486 yield self.unit_state.connect_agent()
487
488 self.lifecycle = UnitLifecycle(
489- self.client, self.unit_state, service_state, self.unit_directory,
490- self.state_directory, self.executor)
491+ self.client, self.unit_state, self.service_state,
492+ self.unit_directory, self.state_directory, self.executor)
493
494 self.workflow = UnitWorkflowState(
495 self.client, self.unit_state, self.lifecycle, self.state_directory)
496
497 # Set up correct lifecycle and executor state given the persistent
498 # unit workflow state, and fire any starting transitions if necessary.
499- yield self.workflow.synchronize(self.executor)
500+ with (yield self.workflow.lock()):
501+ yield self.workflow.synchronize(self.executor)
502
503 if self.get_watch_enabled():
504 yield self.unit_state.watch_resolved(self.cb_watch_resolved)
505- yield service_state.watch_config_state(
506+ yield self.service_state.watch_config_state(
507 self.cb_watch_config_changed)
508 yield self.unit_state.watch_upgrade_flag(
509 self.cb_watch_upgrade_flag)
510@@ -147,20 +144,17 @@
511 # Clear out the setting
512 yield self.unit_state.clear_resolved()
513
514- # Verify its not already running
515- if (yield self.workflow.get_state()) == "started":
516- returnValue(None)
517-
518- log.info("Resolved detected, firing retry transition")
519-
520- # Fire a resolved transition
521- try:
522- if resolved["retry"] == RETRY_HOOKS:
523- yield self.workflow.fire_transition_alias("retry_hook")
524- else:
525- yield self.workflow.fire_transition_alias("retry")
526- except Exception:
527- log.exception("Unknown error while transitioning for resolved")
528+ with (yield self.workflow.lock()):
529+ if (yield self.workflow.get_state()) == "started":
530+ returnValue(None)
531+ try:
532+ log.info("Resolved detected, firing retry transition")
533+ if resolved["retry"] == RETRY_HOOKS:
534+ yield self.workflow.fire_transition_alias("retry_hook")
535+ else:
536+ yield self.workflow.fire_transition_alias("retry")
537+ except Exception:
538+ log.exception("Unknown error while transitioning for resolved")
539
540 @inlineCallbacks
541 def cb_watch_hook_debug(self, change):
542@@ -175,122 +169,51 @@
543 """Update the unit's charm when requested.
544 """
545 upgrade_flag = yield self.unit_state.get_upgrade_flag()
546- if upgrade_flag:
547- log.info("Upgrade detected, starting upgrade")
548- upgrade = CharmUpgradeOperation(self)
549- try:
550- yield upgrade.run()
551- except Exception:
552- log.exception("Error while upgrading")
553+ if not upgrade_flag:
554+ log.info("No upgrade flag set.")
555+ return
556+
557+ log.info("Upgrade detected")
558+ # Clear the flag immediately; this means that upgrade requests will
559+ # be *ignored* by units which are not "started", and will need to be
560+ # reissued when the units are in acceptable states.
561+ yield self.unit_state.clear_upgrade_flag()
562+
563+ new_id = yield self.service_state.get_charm_id()
564+ old_id = yield self.unit_state.get_charm_id()
565+ if new_id == old_id:
566+ log.info("Upgrade ignored: already running latest charm")
567+ return
568+
569+ with (yield self.workflow.lock()):
570+ state = yield self.workflow.get_state()
571+ if state != "started":
572+ log.warning(
573+ "Cannot upgrade: unit is in non-started state %s. Reissue "
574+ "upgrade command to try again.", state)
575+ return
576+
577+ log.info("Starting upgrade")
578+ if (yield self.workflow.fire_transition("upgrade_charm")):
579+ log.info("Upgrade complete")
580+ else:
581+ log.info("Upgrade failed")
582
583 @inlineCallbacks
584 def cb_watch_config_changed(self, change):
585 """Trigger hook on configuration change"""
586 # Verify it is running
587- current_state = yield self.workflow.get_state()
588- log.debug("Configuration Changed")
589-
590- if current_state != "started":
591- log.debug(
592- "Configuration updated on service in a non-started state")
593- returnValue(None)
594-
595- yield self.workflow.fire_transition("reconfigure")
596-
597-
598-class CharmUpgradeOperation(object):
599- """A unit agent charm upgrade operation."""
600-
601- def __init__(self, agent):
602- self._agent = agent
603- self._log = logging.getLogger("unit.upgrade")
604- self._charm_directory = tempfile.mkdtemp(
605- suffix="charm-upgrade", prefix="tmp")
606-
607- def retrieve_charm(self, charm_id):
608- return download_charm(
609- self._agent.client, charm_id, self._charm_directory)
610-
611- def _remove_tree(self, result):
612- if os.path.exists(self._charm_directory):
613- shutil.rmtree(self._charm_directory)
614- return result
615-
616- def run(self):
617- d = self._run()
618- d.addBoth(self._remove_tree)
619- return d
620-
621- @inlineCallbacks
622- def _run(self):
623- self._log.info("Starting charm upgrade...")
624-
625- # Verify the workflow state
626- workflow_state = yield self._agent.workflow.get_state()
627- if workflow_state != "started":
628- self._log.warning(
629- "Unit not in an upgradeable state: %s", workflow_state)
630- # Upgrades can only be supported while the unit is
631- # running, we clear the flag because we don't support
632- # persistent upgrade requests across unit starts. The
633- # upgrade request will need to be reissued, after
634- # resolving or restarting the unit.
635- yield self._agent.unit_state.clear_upgrade_flag()
636- returnValue(False)
637-
638- # Get, check, and clear the flag. Do it first so a second upgrade
639- # will restablish the upgrade request.
640- upgrade_flag = yield self._agent.unit_state.get_upgrade_flag()
641- if not upgrade_flag:
642- self._log.warning("No upgrade flag set.")
643- returnValue(False)
644-
645- self._log.debug("Clearing upgrade flag.")
646- yield self._agent.unit_state.clear_upgrade_flag()
647-
648- # Retrieve the service state
649- service_state_manager = ServiceStateManager(self._agent.client)
650- service_state = yield service_state_manager.get_service_state(
651- self._agent.unit_name.split("/")[0])
652-
653- # Verify unit state, upgrade flag, and newer version requested.
654- service_charm_id = yield service_state.get_charm_id()
655- unit_charm_id = yield self._agent.unit_state.get_charm_id()
656-
657- if service_charm_id == unit_charm_id:
658- self._log.debug("Unit already running latest charm")
659- yield self._agent.unit_state.clear_upgrade_flag()
660- returnValue(True)
661-
662- # Retrieve charm
663- self._log.debug("Retrieving charm %s", service_charm_id)
664- charm = yield self.retrieve_charm(service_charm_id)
665-
666- # Stop hook executions
667- self._log.debug("Stopping hook execution.")
668- yield self._agent.executor.stop()
669-
670- # Note the current charm version
671- self._log.debug("Setting unit charm id to %s", service_charm_id)
672- yield self._agent.unit_state.set_charm_id(service_charm_id)
673-
674- # Extract charm
675- self._log.debug("Extracting new charm.")
676- charm.extract_to(
677- os.path.join(self._agent.unit_directory, "charm"))
678-
679- # Upgrade
680- self._log.debug("Invoking upgrade transition.")
681-
682- success = yield self._agent.workflow.fire_transition(
683- "upgrade_charm")
684-
685- if success:
686- self._log.debug("Unit upgraded.")
687- else:
688- self._log.warning("Upgrade failed.")
689-
690- returnValue(success)
691+ with (yield self.workflow.lock()):
692+ current_state = yield self.workflow.get_state()
693+ log.debug("Configuration Changed")
694+
695+ if current_state != "started":
696+ log.debug(
697+ "Configuration updated on service in a non-started state")
698+ returnValue(None)
699+
700+ yield self.workflow.fire_transition("configure")
701+
702
703 if __name__ == '__main__':
704 UnitAgent.run()
705
706=== modified file 'juju/control/tests/test_resolved.py'
707--- juju/control/tests/test_resolved.py 2012-01-12 10:18:07 +0000
708+++ juju/control/tests/test_resolved.py 2012-02-21 10:23:29 +0000
709@@ -36,7 +36,8 @@
710
711 self.unit1_workflow = UnitWorkflowState(
712 self.client, self.service_unit1, None, self.makeDir())
713- yield self.unit1_workflow.set_state("started")
714+ with (yield self.unit1_workflow.lock()):
715+ yield self.unit1_workflow.set_state("started")
716
717 self.environment = self.config.get_default()
718 self.provider = self.environment.get_machine_provider()
719@@ -95,7 +96,8 @@
720 workflow_state = RelationWorkflowState(
721 self.client, unit_relation, service_relation.relation_name,
722 lifecycle, self.makeDir())
723- yield workflow_state.set_state(state)
724+ with (yield workflow_state.lock()):
725+ yield workflow_state.set_state(state)
726
727 @inlineCallbacks
728 def test_resolved(self):
729@@ -104,7 +106,8 @@
730 retrying from an error state.
731 """
732 # Push the unit into an error state
733- yield self.unit1_workflow.set_state("start_error")
734+ with (yield self.unit1_workflow.lock()):
735+ yield self.unit1_workflow.set_state("start_error")
736 self.setup_exit(0)
737 finished = self.setup_cli_reactor()
738 self.mocker.replay()
739@@ -128,7 +131,8 @@
740 for retrying from an error state with a retry of hooks
741 executions.
742 """
743- yield self.unit1_workflow.set_state("start_error")
744+ with (yield self.unit1_workflow.lock()):
745+ yield self.unit1_workflow.set_state("start_error")
746 self.setup_exit(0)
747 finished = self.setup_cli_reactor()
748 self.mocker.replay()
749@@ -159,7 +163,8 @@
750 (self.service_unit1, "down"),
751 (self.service_unit2, "up"))
752
753- yield self.unit1_workflow.set_state("start_error")
754+ with (yield self.unit1_workflow.lock()):
755+ yield self.unit1_workflow.set_state("start_error")
756 self.setup_exit(0)
757 finished = self.setup_cli_reactor()
758 self.mocker.replay()
759@@ -309,7 +314,8 @@
760 # Just verify we don't accidentally mark up another unit of the service
761 unit2_workflow = UnitWorkflowState(
762 self.client, self.service_unit2, None, self.makeDir())
763- unit2_workflow.set_state("start_error")
764+ with (yield unit2_workflow.lock()):
765+ unit2_workflow.set_state("start_error")
766
767 self.setup_exit(0)
768 finished = self.setup_cli_reactor()
769@@ -335,11 +341,13 @@
770 """
771 # Mark the unit as resolved and as in an error state.
772 yield self.service_unit1.set_resolved(RETRY_HOOKS)
773- yield self.unit1_workflow.set_state("start_error")
774+ with (yield self.unit1_workflow.lock()):
775+ yield self.unit1_workflow.set_state("start_error")
776
777 unit2_workflow = UnitWorkflowState(
778 self.client, self.service_unit1, None, self.makeDir())
779- unit2_workflow.set_state("start_error")
780+ with (yield unit2_workflow.lock()):
781+ unit2_workflow.set_state("start_error")
782
783 self.assertEqual(
784 (yield self.service_unit2.get_resolved()), None)
785
786=== modified file 'juju/control/tests/test_status.py'
787--- juju/control/tests/test_status.py 2011-12-07 18:29:12 +0000
788+++ juju/control/tests/test_status.py 2012-02-21 10:23:29 +0000
789@@ -10,7 +10,6 @@
790
791 from juju.agents.base import TwistedOptionNamespace
792 from juju.agents.machine import MachineAgent
793-from juju.agents.unit import UnitAgent
794 from juju.environment.environment import Environment
795 from juju.control import status
796 from juju.control import tests
797@@ -39,7 +38,7 @@
798 # Status tests setup a large tree every time, make allowances for it.
799 # TODO: create minimal trees needed per test.
800 timeout = 10
801-
802+
803 @inlineCallbacks
804 def setUp(self):
805 yield super(StatusTestBase, self).setUp()
806@@ -59,22 +58,14 @@
807 self.provider = self.environment.get_machine_provider()
808
809 self.output = StringIO()
810- self.agents = []
811-
812- @inlineCallbacks
813- def tearDown(self):
814- for agent in self.agents:
815- if getattr(agent, "api_socket", None):
816- yield agent.api_socket.stopListening()
817- agent.api_socket = None
818- yield super(StatusTestBase, self).tearDown()
819
820 @inlineCallbacks
821 def set_unit_state(self, unit_state, state, port_protos=()):
822 unit_state.set_public_address(
823 "%s.example.com" % unit_state.unit_name.replace("/", "-"))
824 workflow_client = ZookeeperWorkflowState(self.client, unit_state)
825- yield workflow_client.set_state(state)
826+ with (yield workflow_client.lock()):
827+ yield workflow_client.set_state(state)
828 for port_proto in port_protos:
829 yield unit_state.open_port(*port_proto)
830
831@@ -85,7 +76,8 @@
832 unit_state)
833 workflow_client = ZookeeperWorkflowState(
834 self.client, relation_unit_state)
835- yield workflow_client.set_state(state)
836+ with (yield workflow_client.lock()):
837+ yield workflow_client.set_state(state)
838
839 @inlineCallbacks
840 def add_relation_with_relation_units(
841@@ -102,30 +94,11 @@
842 dest_relation_state, dest_units, dest_states)
843
844 @inlineCallbacks
845- def create_agent(self, agent_cls, path, **extra_options):
846- agent = agent_cls()
847- options = TwistedOptionNamespace()
848- options["juju_directory"] = path
849- options["zookeeper_servers"] = get_test_zookeeper_address()
850- for k, v in extra_options.items():
851- options[k] = v
852- agent.configure(options)
853- agent.set_watch_enabled(False)
854- agent.client = self.client
855- yield agent.start()
856- self.agents.append(agent)
857-
858- @inlineCallbacks
859 def add_unit(self, service, machine, with_agent=lambda _: True):
860 unit = yield service.add_unit_state()
861 yield unit.assign_to_machine(machine)
862- name = unit.unit_name
863- if with_agent(name):
864- juju_dir = self.makeDir()
865- os.makedirs(os.path.join(juju_dir, "state"))
866- os.makedirs(os.path.join(juju_dir, "units",
867- name.replace("/", "-")))
868- yield self.create_agent(UnitAgent, juju_dir, unit_name=name)
869+ if with_agent(unit.unit_name):
870+ yield unit.connect_agent()
871 returnValue(unit)
872
873 @inlineCallbacks
874@@ -274,9 +247,10 @@
875 _, (peer_rel,) = yield self.relation_state_manager.add_relation_state(
876 RelationEndpoint("riak", "peer", "ring", "peer"))
877
878- yield ZookeeperWorkflowState(
879- self.client,
880- (yield peer_rel.add_unit_state(riak_u1))).set_state("up")
881+ riak_u1_relation = yield peer_rel.add_unit_state(riak_u1)
882+ riak_u1_workflow = ZookeeperWorkflowState(self.client, riak_u1_relation)
883+ with (yield riak_u1_workflow.lock()):
884+ yield riak_u1_workflow.set_state("up")
885 yield peer_rel.add_unit_state(riak_u2)
886
887 state = yield status.collect(
888
889=== modified file 'juju/control/tests/test_upgrade_charm.py'
890--- juju/control/tests/test_upgrade_charm.py 2011-12-07 02:26:56 +0000
891+++ juju/control/tests/test_upgrade_charm.py 2012-02-21 10:23:29 +0000
892@@ -66,7 +66,8 @@
893
894 self.unit1_workflow = UnitWorkflowState(
895 self.client, self.service_unit1, None, self.makeDir())
896- yield self.unit1_workflow.set_state("started")
897+ with (yield self.unit1_workflow.lock()):
898+ yield self.unit1_workflow.set_state("started")
899
900 self.environment = self.config.get_default()
901 self.provider = self.environment.get_machine_provider()
902@@ -444,7 +445,8 @@
903
904 self.unit1_workflow = UnitWorkflowState(
905 self.client, self.service_unit1, None, self.makeDir())
906- yield self.unit1_workflow.set_state("started")
907+ with (yield self.unit1_workflow.lock()):
908+ yield self.unit1_workflow.set_state("started")
909
910 self.environment = self.config.get_default()
911 self.provider = self.environment.get_machine_provider()
912
913=== modified file 'juju/errors.py'
914--- juju/errors.py 2011-09-24 22:21:23 +0000
915+++ juju/errors.py 2012-02-21 10:23:29 +0000
916@@ -62,7 +62,7 @@
917 return "Error processing %r: %s" % (self.path, self.message)
918
919
920-class CharmInvocationError(JujuError):
921+class CharmInvocationError(CharmError):
922 """A charm's hook invocation exited with an error"""
923
924 def __init__(self, path, exit_code):
925@@ -74,6 +74,16 @@
926 self.path, self.exit_code)
927
928
929+class CharmUpgradeError(CharmError):
930+ """Something went wrong trying to upgrade a charm"""
931+
932+ def __init__(self, message):
933+ self.message = message
934+
935+ def __str__(self):
936+ return "Cannot upgrade charm: %s" % self.message
937+
938+
939 class FileAlreadyExists(JujuError):
940 """Raised when something refuses to overwrite an existing file.
941
942
943=== modified file 'juju/lib/statemachine.py'
944--- juju/lib/statemachine.py 2011-05-04 19:40:59 +0000
945+++ juju/lib/statemachine.py 2012-02-21 10:23:29 +0000
946@@ -17,7 +17,7 @@
947
948 import logging
949
950-from twisted.internet.defer import inlineCallbacks, returnValue
951+from twisted.internet.defer import DeferredLock, inlineCallbacks, returnValue
952
953
954 class WorkflowError(Exception):
955@@ -43,15 +43,44 @@
956 return instance.__class__.__name__.lower()
957
958
959+class _ExitCaller(object):
960+
961+ def __init__(self, func):
962+ self._func = func
963+
964+ def __enter__(self):
965+ pass
966+
967+ def __exit__(self, *exc_info):
968+ self._func()
969+
970+
971 class WorkflowState(object):
972
973 _workflow = None
974
975 def __init__(self, workflow=None):
976-
977 if workflow:
978 self._workflow = workflow
979 self._observer = None
980+ self._lock = DeferredLock()
981+
982+ @inlineCallbacks
983+ def lock(self):
984+ yield self._lock.acquire()
985+ returnValue(_ExitCaller(self._lock.release))
986+
987+ def _assert_locked(self):
988+ """Should be called at the start of any method which changes state.
989+
990+ This is a frankly pitiful hack that should (handwave handwave) help
991+ people to use this correctly; it doesn't stop anyone from calling
992+ write methods on this object while someone *else* holds a lock, but
993+ hopefully it will help us catch these situations when unit testing.
994+
995+ This method only exists as a place to put this documentation.
996+ """
997+ assert self._lock.locked
998
999 @inlineCallbacks
1000 def get_available_transitions(self):
1001@@ -82,6 +111,7 @@
1002 Ambigious (multiple) or no matching transitions cause an exception
1003 InvalidTransition to be raised.
1004 """
1005+ self._assert_locked()
1006
1007 found = []
1008 for t in (yield self.get_available_transitions()):
1009@@ -113,26 +143,25 @@
1010 Returns a boolean value based on whether the state
1011 was achieved.
1012 """
1013- # verify its a valid state id
1014+ self._assert_locked()
1015+
1016+ # verify it's a valid state id
1017 if not self._workflow.has_state(state_id):
1018 raise InvalidStateError(state_id)
1019
1020 transitions = yield self.get_available_transitions()
1021- found_transition = False
1022 for transition in transitions:
1023 if transition.destination == state_id:
1024- found_transition = True
1025 break
1026-
1027- if found_transition:
1028- log.debug("%s: transition state (%s -> %s)",
1029- class_name(self),
1030- transition.source,
1031- transition.destination)
1032- result = yield self.fire_transition(transition.transition_id)
1033- returnValue(result)
1034-
1035- returnValue(False)
1036+ else:
1037+ returnValue(False)
1038+
1039+ log.debug("%s: transition state (%s -> %s)",
1040+ class_name(self),
1041+ transition.source,
1042+ transition.destination)
1043+ result = yield self.fire_transition(transition.transition_id)
1044+ returnValue(result)
1045
1046 @inlineCallbacks
1047 def fire_transition(self, transition_id, **state_variables):
1048@@ -141,6 +170,8 @@
1049 Invokes any transition actions, saves state and state variables, and
1050 error transitions as needed.
1051 """
1052+ self._assert_locked()
1053+
1054 # Verify and retrieve the transition.
1055 available = yield self.get_available_transitions()
1056 available_ids = [t.transition_id for t in available]
1057@@ -149,6 +180,7 @@
1058 raise InvalidTransitionError(
1059 "%r not a valid transition for state %s" % (
1060 transition_id, current_state))
1061+ yield self.set_inflight(transition_id)
1062 transition = self._workflow.get_transition(transition_id)
1063
1064 log.debug("%s: transition %s (%s -> %s) %r",
1065@@ -181,23 +213,30 @@
1066 yield self.fire_transition(
1067 transition.error_transition_id)
1068 else:
1069+ yield self.set_inflight(None)
1070 log.debug("%s: transition %s failed %s",
1071 class_name(self), transition_id, e)
1072 # Bail, and note the error as a return value.
1073 returnValue(False)
1074
1075- # Set the state with state variables
1076+ # Set the state with state variables (and implicitly clear inflight)
1077 yield self.set_state(transition.destination, **state_variables)
1078 log.debug("%s: transition complete %s (state %s) %r",
1079 class_name(self), transition_id,
1080 transition.destination, state_variables)
1081- if transition.success_transition_id:
1082- log.debug("%s: initiating success transition: %s",
1083- class_name(self), transition.success_transition_id)
1084- yield self.fire_transition(transition.success_transition_id)
1085+ yield self._fire_automatic_transitions()
1086 returnValue(True)
1087
1088 @inlineCallbacks
1089+ def _fire_automatic_transitions(self):
1090+ self._assert_locked()
1091+ available = yield self.get_available_transitions()
1092+ for t in available:
1093+ if t.automatic:
1094+ yield self.fire_transition(t.transition_id)
1095+ return
1096+
1097+ @inlineCallbacks
1098 def get_state(self):
1099 """Get the current workflow state.
1100 """
1101@@ -230,10 +269,48 @@
1102 def set_state(self, state, **variables):
1103 """Set the current workflow state, optionally setting state variables.
1104 """
1105+ self._assert_locked()
1106 yield self._store(dict(state=state, state_variables=variables))
1107 if self._observer:
1108 self._observer(state, variables)
1109
1110+ @inlineCallbacks
1111+ def set_inflight(self, transition_id):
1112+ """Record intent to perform a transition, or completion of same.
1113+
1114+ Ideally, this would not be exposed to the public, but it's necessary
1115+ for writing sane tests.
1116+ """
1117+ self._assert_locked()
1118+ state = yield self._load() or {}
1119+ state.setdefault("state", None)
1120+ state.setdefault("state_variables", {})
1121+ if transition_id is not None:
1122+ state["transition_id"] = transition_id
1123+ else:
1124+ state.pop("transition_id", None)
1125+ yield self._store(state)
1126+
1127+ @inlineCallbacks
1128+ def get_inflight(self):
1129+ """Get the id of the transition that is currently executing.
1130+
1131+ (Or which was abandoned due to unexpected process death.)
1132+ """
1133+ state = yield self._load() or {}
1134+ returnValue(state.get("transition_id"))
1135+
1136+ @inlineCallbacks
1137+ def synchronize(self):
1138+ """Rerun inflight transition, if any, and any default transitions."""
1139+ self._assert_locked()
1140+ # First of all, complete any abandoned transition.
1141+ transition_id = yield self.get_inflight()
1142+ if transition_id is not None:
1143+ yield self.fire_transition(transition_id)
1144+ else:
1145+ yield self._fire_automatic_transitions()
1146+
1147 def _load(self):
1148 """ Load the state and variables from persistent storage.
1149 """
1150@@ -280,25 +357,25 @@
1151 class Transition(object):
1152 """A transition encapsulates an edge in the statemachine graph.
1153
1154- :attr:`transition_id` The identity fo the transition.
1155+ :attr:`transition_id` The identity of the transition.
1156 :attr:`label` A human readable label of the transition's purpose.
1157 :attr:`source` The origin/source state of the transition.
1158 :attr:`destination` The target/destination state of the transition.
1159 :attr:`action_id` The name of the action method to use for this transition.
1160 :attr:`error_transition_id`: A transition to fire if the action fails.
1161- :attr:`success_transition_id`: A transition to fire if the action succeeds.
1162+ :attr:`automatic`: If true, always try to fire this transition whenever in
1163+ `source` state.
1164 :attr:`alias` See :meth:`WorkflowState.fire_transition_alias`
1165 """
1166 def __init__(self, transition_id, label, source, destination,
1167- error_transition_id=None, success_transition_id=None,
1168- alias=None):
1169+ error_transition_id=None, automatic=False, alias=None):
1170
1171 self._transition_id = transition_id
1172 self._label = label
1173 self._source = source
1174 self._destination = destination
1175 self._error_transition_id = error_transition_id
1176- self._success_transition_id = success_transition_id
1177+ self._automatic = automatic
1178 self._alias = alias
1179
1180 @property
1181@@ -334,7 +411,7 @@
1182 return self._error_transition_id
1183
1184 @property
1185- def success_transition_id(self):
1186- """The id of a transition to fire upon the success of this transition.
1187+ def automatic(self):
1188+ """Should this transition always fire whenever possible?
1189 """
1190- return self._success_transition_id
1191+ return self._automatic
1192
1193=== modified file 'juju/lib/tests/test_statemachine.py'
1194--- juju/lib/tests/test_statemachine.py 2011-09-15 18:50:23 +0000
1195+++ juju/lib/tests/test_statemachine.py 2012-02-21 10:23:29 +0000
1196@@ -1,6 +1,6 @@
1197 import logging
1198
1199-from twisted.internet.defer import succeed, fail, inlineCallbacks
1200+from twisted.internet.defer import succeed, fail, inlineCallbacks, Deferred
1201
1202 from juju.lib.testing import TestCase
1203 from juju.lib.statemachine import (
1204@@ -8,7 +8,7 @@
1205 InvalidTransitionError, TransitionError)
1206
1207
1208-class AttributeWorkflowState(WorkflowState):
1209+class TestWorkflowState(WorkflowState):
1210
1211 _workflow_state = None
1212
1213@@ -20,6 +20,9 @@
1214 def _load(self):
1215 return self._workflow_state
1216
1217+
1218+class AttributeWorkflowState(TestWorkflowState):
1219+
1220 # transition handlers.
1221 def do_jump_puddle(self):
1222 self._jumped = True
1223@@ -86,30 +89,35 @@
1224 yield self.assertEqual(
1225 transitions, [workflow.get_transition("init_workflow")])
1226
1227+ @inlineCallbacks
1228 def test_fire_transition_alias_multiple(self):
1229 workflow = Workflow(
1230 Transition("init", "", None, "initialized", alias="init"),
1231 Transition("init_start", "", None, "started", alias="init"))
1232 workflow_state = AttributeWorkflowState(workflow)
1233- return self.assertFailure(
1234- workflow_state.fire_transition_alias("init"),
1235- InvalidTransitionError)
1236+ with (yield workflow_state.lock()):
1237+ yield self.assertFailure(
1238+ workflow_state.fire_transition_alias("init"),
1239+ InvalidTransitionError)
1240
1241+ @inlineCallbacks
1242 def test_fire_transition_alias_none(self):
1243 workflow = Workflow(
1244 Transition("init_workflow", "", None, "initialized"),
1245 Transition("start", "", "initialized", "started"))
1246 workflow_state = AttributeWorkflowState(workflow)
1247- return self.assertFailure(
1248- workflow_state.fire_transition_alias("dog"),
1249- InvalidTransitionError)
1250+ with (yield workflow_state.lock()):
1251+ yield self.assertFailure(
1252+ workflow_state.fire_transition_alias("dog"),
1253+ InvalidTransitionError)
1254
1255 @inlineCallbacks
1256 def test_fire_transition_alias(self):
1257 workflow = Workflow(
1258 Transition("init_magic", "", None, "initialized", alias="init"))
1259 workflow_state = AttributeWorkflowState(workflow)
1260- value = yield workflow_state.fire_transition_alias("init")
1261+ with (yield workflow_state.lock()):
1262+ value = yield workflow_state.fire_transition_alias("init")
1263 self.assertEqual(value, True)
1264
1265 @inlineCallbacks
1266@@ -121,10 +129,16 @@
1267 workflow_state = AttributeWorkflowState(workflow)
1268 current_state = yield workflow_state.get_state()
1269 self.assertEqual(current_state, None)
1270-
1271- yield workflow_state.set_state("started")
1272+ current_vars = yield workflow_state.get_state_variables()
1273+ self.assertEqual(current_vars, {})
1274+
1275+ with (yield workflow_state.lock()):
1276+ yield workflow_state.set_state("started")
1277+
1278 current_state = yield workflow_state.get_state()
1279 self.assertEqual(current_state, "started")
1280+ current_vars = yield workflow_state.get_state_variables()
1281+ self.assertEqual(current_vars, {})
1282
1283 @inlineCallbacks
1284 def test_state_fire_transition(self):
1285@@ -132,14 +146,15 @@
1286 Transition("init_workflow", "", None, "initialized"),
1287 Transition("start", "", "initialized", "started"))
1288 workflow_state = AttributeWorkflowState(workflow)
1289- yield workflow_state.fire_transition("init_workflow")
1290- current_state = yield workflow_state.get_state()
1291- self.assertEqual(current_state, "initialized")
1292- yield workflow_state.fire_transition("start")
1293- current_state = yield workflow_state.get_state()
1294- self.assertEqual(current_state, "started")
1295- yield self.assertFailure(workflow_state.fire_transition("stop"),
1296- InvalidTransitionError)
1297+ with (yield workflow_state.lock()):
1298+ yield workflow_state.fire_transition("init_workflow")
1299+ current_state = yield workflow_state.get_state()
1300+ self.assertEqual(current_state, "initialized")
1301+ yield workflow_state.fire_transition("start")
1302+ current_state = yield workflow_state.get_state()
1303+ self.assertEqual(current_state, "started")
1304+ yield self.assertFailure(workflow_state.fire_transition("stop"),
1305+ InvalidTransitionError)
1306
1307 name = "attributeworkflowstate"
1308 output = (
1309@@ -159,7 +174,8 @@
1310 Transition("jump_puddle", "", None, "dry"))
1311
1312 workflow_state = AttributeWorkflowState(workflow)
1313- yield workflow_state.fire_transition("jump_puddle")
1314+ with (yield workflow_state.lock()):
1315+ yield workflow_state.fire_transition("jump_puddle")
1316 current_state = yield workflow_state.get_state()
1317 self.assertEqual(current_state, "dry")
1318 self.assertEqual(
1319@@ -170,13 +186,14 @@
1320 def test_transition_action_workflow_error(self):
1321 """If a transition action callback raises a transitionerror, the
1322 transition does not complete, and the state remains the same.
1323- The fire_transition method in this cae returns False.
1324+ The fire_transition method in this case returns False.
1325 """
1326 workflow = Workflow(
1327 Transition("raises_transition_error", "", None, "next-state"))
1328 workflow_state = AttributeWorkflowState(workflow)
1329- result = yield workflow_state.fire_transition(
1330- "raises_transition_error")
1331+ with (yield workflow_state.lock()):
1332+ result = yield workflow_state.fire_transition(
1333+ "raises_transition_error")
1334 self.assertEqual(result, False)
1335 current_state = yield workflow_state.get_state()
1336 self.assertEqual(current_state, None)
1337@@ -189,6 +206,7 @@
1338 self.assertEqual(self.log_stream.getvalue(),
1339 "\n".join([line % name for line in output]))
1340
1341+ @inlineCallbacks
1342 def test_transition_action_unknown_error(self):
1343 """If an unknown error is raised by a transition action, it
1344 is raised from the fire transition method.
1345@@ -197,9 +215,10 @@
1346 Transition("error_unknown", "", None, "next-state"))
1347
1348 workflow_state = AttributeWorkflowState(workflow)
1349- return self.assertFailure(
1350- workflow_state.fire_transition("error_unknown"),
1351- AttributeError)
1352+ with (yield workflow_state.lock()):
1353+ yield self.assertFailure(
1354+ workflow_state.fire_transition("error_unknown"),
1355+ AttributeError)
1356
1357 @inlineCallbacks
1358 def test_transition_resets_state_variables(self):
1359@@ -214,17 +233,18 @@
1360 state_variables = yield workflow_state.get_state_variables()
1361 self.assertEqual(state_variables, {})
1362
1363- yield workflow_state.fire_transition("transition_variables")
1364- current_state = yield workflow_state.get_state()
1365- self.assertEqual(current_state, "next-state")
1366- state_variables = yield workflow_state.get_state_variables()
1367- self.assertEqual(state_variables, {"hello": "world"})
1368+ with (yield workflow_state.lock()):
1369+ yield workflow_state.fire_transition("transition_variables")
1370+ current_state = yield workflow_state.get_state()
1371+ self.assertEqual(current_state, "next-state")
1372+ state_variables = yield workflow_state.get_state_variables()
1373+ self.assertEqual(state_variables, {"hello": "world"})
1374
1375- yield workflow_state.fire_transition("some_transition")
1376- current_state = yield workflow_state.get_state()
1377- self.assertEqual(current_state, "final-state")
1378- state_variables = yield workflow_state.get_state_variables()
1379- self.assertEqual(state_variables, {})
1380+ yield workflow_state.fire_transition("some_transition")
1381+ current_state = yield workflow_state.get_state()
1382+ self.assertEqual(current_state, "final-state")
1383+ state_variables = yield workflow_state.get_state_variables()
1384+ self.assertEqual(state_variables, {})
1385
1386 @inlineCallbacks
1387 def test_transition_success_transition(self):
1388@@ -233,16 +253,13 @@
1389 action handler are executed.
1390 """
1391 workflow = Workflow(
1392- Transition("initialized", "", None, "next-state",
1393- success_transition_id="markup"),
1394- Transition("markup", "", "next-state", "final-state"),
1395+ Transition("initialized", "", None, "next"),
1396+ Transition("markup", "", "next", "final", automatic=True),
1397 )
1398 workflow_state = AttributeWorkflowState(workflow)
1399- yield workflow_state.fire_transition("initialized")
1400- self.assertEqual((yield workflow_state.get_state()), "final-state")
1401- self.assertIn(
1402- "initiating success transition: markup",
1403- self.log_stream.getvalue())
1404+ with (yield workflow_state.lock()):
1405+ yield workflow_state.fire_transition("initialized")
1406+ self.assertEqual((yield workflow_state.get_state()), "final")
1407
1408 @inlineCallbacks
1409 def test_transition_error_transition(self):
1410@@ -256,7 +273,8 @@
1411 Transition("error_transition", "", None, "error-state"))
1412
1413 workflow_state = AttributeWorkflowState(workflow)
1414- yield workflow_state.fire_transition("raises_transition_error")
1415+ with (yield workflow_state.lock()):
1416+ yield workflow_state.fire_transition("raises_transition_error")
1417
1418 current_state = yield workflow_state.get_state()
1419 self.assertEqual(current_state, "error-state")
1420@@ -278,10 +296,10 @@
1421 Transition("continue", "", "next-state", "final-state"))
1422
1423 workflow_state = AttributeWorkflowState(workflow)
1424- workflow_state.set_observer(observer)
1425-
1426- yield workflow_state.fire_transition("begin")
1427- yield workflow_state.fire_transition("continue")
1428+ with (yield workflow_state.lock()):
1429+ workflow_state.set_observer(observer)
1430+ yield workflow_state.fire_transition("begin")
1431+ yield workflow_state.fire_transition("continue")
1432
1433 self.assertEqual(results,
1434 [("next-state", {}), ("final-state", {})])
1435@@ -295,19 +313,19 @@
1436 Transition("continue", "", "next-state", "final-state"))
1437
1438 workflow_state = AttributeWorkflowState(workflow)
1439-
1440- yield workflow_state.fire_transition(
1441- "begin", rabbit="moon", hello=True)
1442- current_state = yield workflow_state.get_state()
1443- self.assertEqual(current_state, "next-state")
1444- variables = yield workflow_state.get_state_variables()
1445- self.assertEqual({"rabbit": "moon", "hello": True}, variables)
1446-
1447- yield workflow_state.fire_transition("continue")
1448- current_state = yield workflow_state.get_state()
1449- self.assertEqual(current_state, "final-state")
1450- variables = yield workflow_state.get_state_variables()
1451- self.assertEqual({}, variables)
1452+ with (yield workflow_state.lock()):
1453+ yield workflow_state.fire_transition(
1454+ "begin", rabbit="moon", hello=True)
1455+ current_state = yield workflow_state.get_state()
1456+ self.assertEqual(current_state, "next-state")
1457+ variables = yield workflow_state.get_state_variables()
1458+ self.assertEqual({"rabbit": "moon", "hello": True}, variables)
1459+
1460+ yield workflow_state.fire_transition("continue")
1461+ current_state = yield workflow_state.get_state()
1462+ self.assertEqual(current_state, "final-state")
1463+ variables = yield workflow_state.get_state_variables()
1464+ self.assertEqual({}, variables)
1465
1466 @inlineCallbacks
1467 def test_transition_state(self):
1468@@ -319,17 +337,137 @@
1469 Transition("to_house", "", "trail", "house"))
1470
1471 workflow_state = AttributeWorkflowState(workflow)
1472-
1473- result = yield workflow_state.transition_state("trail")
1474- self.assertEqual(result, True)
1475- current_state = yield workflow_state.get_state()
1476- self.assertEqual(current_state, "trail")
1477-
1478- result = yield workflow_state.transition_state("cabin")
1479- self.assertEqual(result, True)
1480-
1481- result = yield workflow_state.transition_state("house")
1482- self.assertEqual(result, False)
1483-
1484- self.assertFailure(workflow_state.transition_state("unknown"),
1485- InvalidStateError)
1486+ with (yield workflow_state.lock()):
1487+ result = yield workflow_state.transition_state("trail")
1488+ self.assertEqual(result, True)
1489+ current_state = yield workflow_state.get_state()
1490+ self.assertEqual(current_state, "trail")
1491+
1492+ result = yield workflow_state.transition_state("cabin")
1493+ self.assertEqual(result, True)
1494+
1495+ result = yield workflow_state.transition_state("house")
1496+ self.assertEqual(result, False)
1497+
1498+ self.assertFailure(workflow_state.transition_state("unknown"),
1499+ InvalidStateError)
1500+
1501+ @inlineCallbacks
1502+ def test_load_bad_state(self):
1503+ class BadLoadWorkflowState(WorkflowState):
1504+ def _load(self):
1505+ return succeed({"some": "other-data"})
1506+
1507+ workflow = BadLoadWorkflowState(Workflow())
1508+ yield self.assertFailure(workflow.get_state(), KeyError)
1509+ yield self.assertFailure(workflow.get_state_variables(), KeyError)
1510+
1511+
1512+SyncWorkflow = Workflow(
1513+ Transition("init", "", None, "inited", error_transition_id="error_init"),
1514+ Transition("error_init", "", None, "borken"),
1515+ Transition("start", "", "inited", "started", automatic=True),
1516+
1517+ # Disjoint states for testing default transition synchronize.
1518+ Transition("predefault", "", "default_init", "default_start"),
1519+ Transition("default", "", "default_start", "default_end", automatic=True),
1520+)
1521+
1522+class SyncWorkflowState(TestWorkflowState):
1523+
1524+ _workflow = SyncWorkflow
1525+
1526+ def __init__(self):
1527+ super(SyncWorkflowState, self).__init__()
1528+ self.started = {
1529+ "init": Deferred(), "error_init": Deferred(), "start": Deferred()}
1530+ self.blockers = {
1531+ "init": Deferred(), "error_init": Deferred(), "start": Deferred()}
1532+
1533+ def do(self, transition):
1534+ self.started[transition].callback(None)
1535+ return self.blockers[transition]
1536+
1537+ def do_init(self):
1538+ return self.do("init")
1539+
1540+ def do_error_init(self):
1541+ return self.do("error_init")
1542+
1543+ def do_start(self):
1544+ return self.do("start")
1545+
1546+
1547+class StateMachineSynchronizeTest(TestCase):
1548+
1549+ @inlineCallbacks
1550+ def setUp(self):
1551+ yield super(StateMachineSynchronizeTest, self).setUp()
1552+ self.workflow = SyncWorkflowState()
1553+
1554+ @inlineCallbacks
1555+ def assert_state(self, state, inflight):
1556+ self.assertEquals((yield self.workflow.get_state()), state)
1557+ self.assertEquals((yield self.workflow.get_inflight()), inflight)
1558+
1559+ @inlineCallbacks
1560+ def test_plain_synchronize(self):
1561+ """synchronize does nothing when no inflight transitions or applicable
1562+ default transitions"""
1563+ yield self.assert_state(None, None)
1564+ with (yield self.workflow.lock()):
1565+ yield self.workflow.synchronize()
1566+ yield self.assert_state(None, None)
1567+
1568+ @inlineCallbacks
1569+ def test_synchronize_default_transition(self):
1570+ """synchronize runs default transitions after inflight recovery"""
1571+ with (yield self.workflow.lock()):
1572+ yield self.workflow.set_state("default_init")
1573+ yield self.workflow.set_inflight("predefault")
1574+ yield self.workflow.synchronize()
1575+ yield self.assert_state("default_end", None)
1576+
1577+ @inlineCallbacks
1578+ def test_synchronize_inflight_success(self):
1579+ """synchronize will complete an unfinished transition and run the
1580+ success transition where warranted"""
1581+ with (yield self.workflow.lock()):
1582+ yield self.workflow.set_inflight("init")
1583+ d = self.workflow.synchronize()
1584+ yield self.workflow.started["init"]
1585+ yield self.assert_state(None, "init")
1586+ self.workflow.blockers["init"].callback(None)
1587+ yield self.workflow.started["start"]
1588+ yield self.assert_state("inited", "start")
1589+ self.workflow.blockers["start"].callback(None)
1590+ yield d
1591+ yield self.assert_state("started", None)
1592+
1593+ @inlineCallbacks
1594+ def test_synchronize_inflight_error(self):
1595+ """synchronize will complete an unfinished transition and run the
1596+ error transition where warranted"""
1597+ with (yield self.workflow.lock()):
1598+ yield self.workflow.set_inflight("init")
1599+ d = self.workflow.synchronize()
1600+ yield self.workflow.started["init"]
1601+ yield self.assert_state(None, "init")
1602+ self.workflow.blockers["init"].errback(TransitionError())
1603+ yield self.workflow.started["error_init"]
1604+ yield self.assert_state(None, "error_init")
1605+ self.workflow.blockers["error_init"].callback(None)
1606+ yield d
1607+ yield self.assert_state("borken", None)
1608+
1609+ @inlineCallbacks
1610+ def test_error_without_transition_clears_inflight(self):
1611+ """when a transition fails, it should no longer be marked inflight"""
1612+ with (yield self.workflow.lock()):
1613+ yield self.workflow.set_state("inited")
1614+ d = self.workflow.fire_transition("start")
1615+ yield self.workflow.started["start"]
1616+ yield self.assert_state("inited", "start")
1617+ self.workflow.blockers["start"].errback(TransitionError())
1618+ yield d
1619+ yield self.assert_state("inited", None)
1620
1621=== modified file 'juju/state/service.py'
1622--- juju/state/service.py 2012-02-04 00:01:06 +0000
1623+++ juju/state/service.py 2012-02-21 10:23:29 +0000
1624@@ -1020,13 +1020,16 @@
1625 # Wait on the first callback, reflecting present state, not a zk watch
1626 yield callback_d
1627
1628+ @property
1629+ def _upgrade_flag_path(self):
1630+ return "/units/%s/upgrade" % self._internal_id
1631+
1632 @inlineCallbacks
1633 def set_upgrade_flag(self):
1634 """Inform the unit it should perform an upgrade.
1635 """
1636- upgrade_path = "/units/%s/upgrade" % self._internal_id
1637 try:
1638- yield self._client.create(upgrade_path)
1639+ yield self._client.create(self._upgrade_flag_path)
1640 except zookeeper.NodeExistsException:
1641 # We get to the same end state
1642 pass
1643@@ -1035,8 +1038,7 @@
1644 def get_upgrade_flag(self):
1645 """Returns a boolean denoting if the upgrade flag is set.
1646 """
1647- upgrade_path = "/units/%s/upgrade" % self._internal_id
1648- stat = yield self._client.exists(upgrade_path)
1649+ stat = yield self._client.exists(self._upgrade_flag_path)
1650 returnValue(bool(stat))
1651
1652 @inlineCallbacks
1653@@ -1046,9 +1048,8 @@
1654 Typically done by the unit agent before beginning the
1655 upgrade.
1656 """
1657- upgrade_path = "/units/%s/upgrade" % self._internal_id
1658 try:
1659- yield self._client.delete(upgrade_path)
1660+ yield self._client.delete(self._upgrade_flag_path)
1661 except zookeeper.NoNodeException:
1662 # We get to the same end state.
1663 pass
1664@@ -1070,20 +1071,19 @@
1665 happening, the callback should fetch the current value via
1666 the API, if needed.
1667 """
1668- upgrade_path = "/units/%s/upgrade" % self._internal_id
1669-
1670 @inlineCallbacks
1671 def watcher(change_event):
1672
1673 if permanent and self._client.connected:
1674- exists_d, watch_d = self._client.exists_and_watch(upgrade_path)
1675+ exists_d, watch_d = self._client.exists_and_watch(
1676+ self._upgrade_flag_path)
1677
1678 yield callback(change_event)
1679
1680 if permanent:
1681 watch_d.addCallback(watcher)
1682
1683- exists_d, watch_d = self._client.exists_and_watch(upgrade_path)
1684+ exists_d, watch_d = self._client.exists_and_watch(self._upgrade_flag_path)
1685
1686 exists = yield exists_d
1687
1688
1689=== modified file 'juju/tests/test_errors.py'
1690--- juju/tests/test_errors.py 2011-09-24 22:21:23 +0000
1691+++ juju/tests/test_errors.py 2012-02-21 10:23:29 +0000
1692@@ -1,9 +1,9 @@
1693 from juju.errors import (
1694- JujuError, FileNotFound, FileAlreadyExists,
1695- NoConnection, InvalidHost, InvalidUser, ProviderError, CloudInitError,
1696- ProviderInteractionError, CannotTerminateMachine, MachinesNotFound,
1697- EnvironmentPending, EnvironmentNotFound, IncompatibleVersion,
1698- InvalidPlacementPolicy)
1699+ JujuError, FileNotFound, FileAlreadyExists, CharmError,
1700+ CharmInvocationError, CharmUpgradeError, NoConnection, InvalidHost,
1701+ InvalidUser, ProviderError, CloudInitError, ProviderInteractionError,
1702+ CannotTerminateMachine, MachinesNotFound, EnvironmentPending,
1703+ EnvironmentNotFound, IncompatibleVersion, InvalidPlacementPolicy)
1704
1705 from juju.lib.testing import TestCase
1706
1707@@ -73,34 +73,56 @@
1708
1709 def test_MachinesNotFoundSingular(self):
1710 error = MachinesNotFound(("i-sublimed",))
1711+ self.assertIsJujuError(error)
1712 self.assertEquals(error.instance_ids, ["i-sublimed"])
1713 self.assertEquals(str(error),
1714 "Cannot find machine: i-sublimed")
1715
1716 def test_MachinesNotFoundPlural(self):
1717 error = MachinesNotFound(("i-disappeared", "i-exploded"))
1718+ self.assertIsJujuError(error)
1719 self.assertEquals(error.instance_ids, ["i-disappeared", "i-exploded"])
1720 self.assertEquals(str(error),
1721 "Cannot find machines: i-disappeared, i-exploded")
1722
1723- def testEnvironmentNotFoundWithInfo(self):
1724+ def test_EnvironmentNotFoundWithInfo(self):
1725 error = EnvironmentNotFound("problem")
1726+ self.assertIsJujuError(error)
1727 self.assertEquals(str(error),
1728 "juju environment not found: problem")
1729
1730- def testEnvironmentNotFoundNoInfo(self):
1731+ def test_EnvironmentNotFoundNoInfo(self):
1732 error = EnvironmentNotFound()
1733+ self.assertIsJujuError(error)
1734 self.assertEquals(str(error),
1735 "juju environment not found: no details "
1736 "available")
1737
1738- def testEnvironmentPendingWithInfo(self):
1739+ def test_EnvironmentPendingWithInfo(self):
1740 error = EnvironmentPending("problem")
1741+ self.assertIsJujuError(error)
1742 self.assertEquals(str(error), "problem")
1743
1744- def testInvalidPlacementPolicy(self):
1745+ def test_InvalidPlacementPolicy(self):
1746 error = InvalidPlacementPolicy("x", "foobar", ["a", "b", "c"])
1747+ self.assertIsJujuError(error)
1748 self.assertEquals(
1749 str(error),
1750 ("Unsupported placement policy: 'x' for provider: 'foobar', "
1751 "supported policies a, b, c"))
1752+
1753+ def test_CharmError(self):
1754+ error = CharmError("/foo/bar", "blah blah")
1755+ self.assertIsJujuError(error)
1756+ self.assertEquals(str(error), "Error processing '/foo/bar': blah blah")
1757+
1758+ def test_CharmInvocationError(self):
1759+ error = CharmInvocationError("/foo/bar", 1)
1760+ self.assertIsJujuError(error)
1761+ self.assertEquals(
1762+ str(error), "Error processing '/foo/bar': exit code 1.")
1763+
1764+ def test_CharmUpgradeError(self):
1765+ error = CharmUpgradeError("blah blah")
1766+ self.assertIsJujuError(error)
1767+ self.assertEquals(str(error), "Cannot upgrade charm: blah blah")
1768
1769=== modified file 'juju/unit/lifecycle.py'
1770--- juju/unit/lifecycle.py 2012-01-28 08:45:10 +0000
1771+++ juju/unit/lifecycle.py 2012-02-21 10:23:29 +0000
1772@@ -1,17 +1,21 @@
1773 import os
1774 import logging
1775+import shutil
1776+import tempfile
1777 import yaml
1778
1779 from twisted.internet.defer import (
1780 inlineCallbacks, DeferredLock, DeferredList, returnValue)
1781
1782+from juju.errors import CharmUpgradeError
1783 from juju.hooks.invoker import Invoker
1784 from juju.hooks.scheduler import HookScheduler
1785 from juju.state.hook import (
1786- DepartedRelationHookContext, RelationChange, HookContext)
1787+ DepartedRelationHookContext, HookContext, RelationChange)
1788 from juju.state.errors import StopWatcher, UnitRelationStateNotFound
1789 from juju.state.relation import RelationStateManager, UnitRelationState
1790
1791+from juju.unit.charm import download_charm
1792 from juju.unit.workflow import RelationWorkflowState
1793
1794
1795@@ -19,6 +23,67 @@
1796
1797 hook_log = logging.getLogger("hook.output")
1798
1799+# This is used as `client_id` when constructing Invokers
1800+_EVIL_CONSTANT = "constant"
1801+
1802+
1803+class _CharmUpgradeOperation(object):
1804+ """Helper class dealing only with the bare mechanics of upgrading"""
1805+
1806+ def __init__(self, client, service, unit, unit_dir):
1807+ self._client = client
1808+ self._service = service
1809+ self._unit = unit
1810+ self._old_id = None
1811+ self._new_id = None
1812+ self._download_dir = tempfile.mkdtemp()
1813+ self._bundle = None
1814+ self._charm_dir = os.path.join(unit_dir, "charm")
1815+ self._log = logging.getLogger("charm.upgrade")
1816+
1817+ @inlineCallbacks
1818+ def prepare(self):
1819+ self._log.debug("Checking for newer charm...")
1820+ try:
1821+ self._new_id = yield self._service.get_charm_id()
1822+ self._old_id = yield self._unit.get_charm_id()
1823+ if self._new_id != self._old_id:
1824+ self._log.debug("Downloading %s...", self._new_id)
1825+ self._bundle = yield download_charm(
1826+ self._client, self._new_id, self._download_dir)
1827+ else:
1828+ self._log.debug("Latest charm is already present.")
1829+ except Exception as e:
1830+ self._log.exception("Charm upgrade preparation failed.")
1831+ raise CharmUpgradeError(str(e))
1832+
1833+ @property
1834+ def ready(self):
1835+ return self._bundle is not None
1836+
1837+ @inlineCallbacks
1838+ def run(self):
1839+ assert self.ready
1840+ self._log.debug(
1841+ "Replacing charm %s with %s.", self._old_id, self._new_id)
1842+ try:
1843+ # TODO this will leave droppings from the old charm; but we can't
1844+ # delete the whole charm dir and replace it, because some charms
1845+ # store state within their directories. See lp:791035
1846+ self._bundle.extract_to(self._charm_dir)
1847+ self._log.debug(
1848+ "Charm has been upgraded to %s.", self._new_id)
1849+
1850+ yield self._unit.set_charm_id(self._new_id)
1851+ self._log.debug("Upgrade recorded.")
1852+ except Exception as e:
1853+ self._log.exception("Charm upgrade failed.")
1854+ raise CharmUpgradeError(str(e))
1855+
1856+ def cleanup(self):
1857+ if os.path.exists(self._download_dir):
1858+ shutil.rmtree(self._download_dir)
1859+
1860
1861 class UnitLifecycle(object):
1862 """Manager for a unit lifecycle.
1863@@ -64,15 +129,6 @@
1864 yield self._execute_hook("install")
1865
1866 @inlineCallbacks
1867- def upgrade_charm(self, fire_hooks=True):
1868- """Invoke the unit's upgrade-charm hook.
1869- """
1870- if fire_hooks:
1871- yield self._execute_hook("upgrade-charm", now=True)
1872- # Restart hook queued hook execution.
1873- self._executor.start()
1874-
1875- @inlineCallbacks
1876 def start(self, fire_hooks=True, start_relations=True):
1877 """Invoke the start hook, and setup relation watching.
1878
1879@@ -104,9 +160,10 @@
1880 # We actually want to transition from "down" to "up" where
1881 # applicable (ie a stopped unit is starting up again)
1882 for workflow in self._relations.values():
1883- state = yield workflow.get_state()
1884- if state == "down":
1885- yield workflow.transition_state("up")
1886+ with (yield workflow.lock()):
1887+ state = yield workflow.get_state()
1888+ if state == "down":
1889+ yield workflow.transition_state("up")
1890
1891 # Establish a watch on the existing relations.
1892 if not self._watching_relation_memberships:
1893@@ -157,7 +214,8 @@
1894 # We actually want to transition relation states
1895 # (probably because the unit workflow state is stopped/error)
1896 for workflow in self._relations.values():
1897- yield workflow.transition_state("down")
1898+ with (yield workflow.lock()):
1899+ yield workflow.transition_state("down")
1900 else:
1901 # We just want to stop the relations from acting
1902 # (probably because the process is going down)
1903@@ -192,6 +250,42 @@
1904 self._log.debug("configured unit")
1905
1906 @inlineCallbacks
1907+ def upgrade_charm(self, fire_hooks=True):
1908+ """Upgrade the charm and invoke the upgrade-charm hook if requested.
1909+
1910+ :param fire_hooks: if False, *and* the actual upgrade operation is not
1911+ necessary, skip the upgrade-charm hook. When the actual charm has
1912+ changed during this invocation, this flag is ignored: hooks will
1913+ always be fired.
1914+ """
1915+ self._log.debug("Upgrading charm")
1916+ upgrade = _CharmUpgradeOperation(
1917+ self._client, self._service, self._unit, self._unit_dir)
1918+ yield self._run_lock.acquire()
1919+ try:
1920+ yield upgrade.prepare()
1921+
1922+ # Executor may already be stopped if we're retrying.
1923+ if self._executor.running:
1924+ self._log.debug("Pausing normal hook execution")
1925+ yield self._executor.stop()
1926+
1927+ if upgrade.ready:
1928+ yield upgrade.run()
1929+ # We changed the charm just now: we *must* fire hooks.
1930+ fire_hooks = True
1931+ if fire_hooks:
1932+ yield self._execute_hook("upgrade-charm", now=True)
1933+
1934+ # Always restart executor on success; charm upgrade operations and
1935+ # errors are the only reasons for the executor to be stopped.
1936+ self._log.debug("Resuming normal hook execution.")
1937+ self._executor.start()
1938+ finally:
1939+ self._run_lock.release()
1940+ upgrade.cleanup()
1941+
1942+ @inlineCallbacks
1943 def _on_relation_resolved_changes(self, event):
1944 """Callback for unit relation resolved watching.
1945
1946@@ -228,11 +322,11 @@
1947
1948 keys = set(relation_resolved).intersection(self._relations)
1949 for rel_id in keys:
1950- relation_workflow = self._relations[rel_id]
1951- relation_state = yield relation_workflow.get_state()
1952- if relation_state == "up":
1953- continue
1954- yield relation_workflow.transition_state("up")
1955+ workflow = self._relations[rel_id]
1956+ with (yield workflow.lock()):
1957+ state = yield workflow.get_state()
1958+ if state != "up":
1959+ yield workflow.transition_state("up")
1960
1961 @inlineCallbacks
1962 def _on_service_relation_changes(self, old_relations, new_relations):
1963@@ -294,7 +388,8 @@
1964 # Actually depart old relations.
1965 for relation_id in removed:
1966 workflow = self._relations.pop(relation_id)
1967- yield workflow.transition_state("departed")
1968+ with (yield workflow.lock()):
1969+ yield workflow.transition_state("departed")
1970 self._store_relations()
1971
1972 # Process new relations.
1973@@ -336,7 +431,8 @@
1974 # (according to latest stored state) they will try to start, and
1975 # it won't go well (no way to watch related units).
1976 if relation_id in current_ids:
1977- yield workflow.synchronize()
1978+ with (yield workflow.lock()):
1979+ yield workflow.synchronize()
1980
1981 # Put everything into self._relations; adds/departs will be handled
1982 # as usual in the first call to _process_service_changes.
1983@@ -399,7 +495,8 @@
1984 lifecycle, self._state_dir)
1985
1986 self._relations[service_relation.internal_relation_id] = workflow
1987- yield workflow.synchronize()
1988+ with (yield workflow.lock()):
1989+ yield workflow.synchronize()
1990
1991 @inlineCallbacks
1992 def _execute_hook(self, hook_name, now=False):
1993@@ -412,7 +509,7 @@
1994 socket_path = os.path.join(self._unit_dir, HOOK_SOCKET_FILE)
1995 invoker = Invoker(
1996 HookContext(self._client, self._unit.unit_name), None,
1997- "constant", socket_path, self._unit_dir, hook_log)
1998+ _EVIL_CONSTANT, socket_path, self._unit_dir, hook_log)
1999
2000 if now:
2001 yield self._executor.run_priority_hook(invoker, hook_path)
2002@@ -482,7 +579,8 @@
2003 def set_hook_error_handler(self, handler):
2004 """Set an error handler to be invoked if a hook errors.
2005
2006- The handler should accept one parameter, the exception instance.
2007+ The handler should accept two parameters, the RelationChange that
2008+ triggered the hook, and the exception instance.
2009 """
2010 self._error_handler = handler
2011
2012
2013=== modified file 'juju/unit/tests/test_lifecycle.py'
2014--- juju/unit/tests/test_lifecycle.py 2012-01-11 09:37:48 +0000
2015+++ juju/unit/tests/test_lifecycle.py 2012-02-21 10:23:29 +0000
2016@@ -9,25 +9,37 @@
2017
2018 from twisted.internet.defer import inlineCallbacks, Deferred, fail, returnValue
2019
2020-from juju.unit.lifecycle import (
2021- UnitLifecycle, UnitRelationLifecycle, RelationInvoker)
2022-
2023+from juju.charm.url import CharmURL
2024+from juju.control.tests.test_upgrade_charm import CharmUpgradeTestBase
2025+from juju.errors import CharmInvocationError, CharmError, CharmUpgradeError
2026 from juju.hooks.invoker import Invoker
2027 from juju.hooks.executor import HookExecutor
2028-
2029-from juju.errors import CharmInvocationError, CharmError
2030-
2031 from juju.state.endpoint import RelationEndpoint
2032 from juju.state.relation import ClientServerUnitWatcher
2033 from juju.state.service import NO_HOOKS
2034 from juju.state.tests.test_relation import RelationTestBase
2035 from juju.state.hook import RelationChange
2036-
2037+from juju.unit.lifecycle import (
2038+ UnitLifecycle, UnitRelationLifecycle, RelationInvoker)
2039+from juju.unit.tests.test_charm import CharmPublisherTestBase
2040
2041 from juju.lib.testing import TestCase
2042 from juju.lib.mocker import MATCH
2043
2044
2045+class UnwriteablePath(object):
2046+
2047+ def __init__(self, path):
2048+ self.path = path
2049+
2050+ def __enter__(self):
2051+ self.mode = os.stat(self.path).st_mode
2052+ os.chmod(self.path, 0000)
2053+
2054+ def __exit__(self, *exc_info):
2055+ os.chmod(self.path, self.mode)
2056+
2057+
2058 class LifecycleTestBase(RelationTestBase):
2059
2060 juju_directory = None
2061@@ -64,6 +76,9 @@
2062 self.state_directory = os.path.join(self.juju_directory, "state")
2063 os.makedirs(self.state_directory)
2064
2065+ def frozen_charm(self):
2066+ return UnwriteablePath(os.path.join(self.unit_directory, "charm"))
2067+
2068 def write_hook(self, name, text, no_exec=False, hooks_dir=None):
2069 if hooks_dir is None:
2070 hooks_dir = os.path.join(self.unit_directory, "charm", "hooks")
2071@@ -218,7 +233,8 @@
2072 self.states["unit_relation"].internal_relation_id)
2073 self.assertEqual("up", (yield workflow.get_state()))
2074
2075- yield workflow.transition_state("down")
2076+ with (yield workflow.lock()):
2077+ yield workflow.transition_state("down")
2078 resolved = self.wait_on_state(workflow, "up")
2079
2080 # Stop the unit lifecycle
2081@@ -235,14 +251,6 @@
2082 {self.states["unit_relation"].internal_relation_id: NO_HOOKS},
2083 (yield self.states["unit"].get_relation_resolved()))
2084
2085- # If the unit is restarted start, we currently have the
2086- # behavior that the unit relation workflow will automatically
2087- # be transitioned back to running, as part of the normal state
2088- # transition. Sigh.. we should have a separate error
2089- # state for relation hooks then down with state variable usage.
2090- # The current end behavior though seems like the best outcome, ie.
2091- # automatically restart relations.
2092-
2093 @inlineCallbacks
2094 def test_resolved_relation_watch_relation_up(self):
2095 """If a relation marked as to be resolved is already running,
2096@@ -285,7 +293,8 @@
2097 workflow = self.lifecycle.get_relation_workflow(
2098 self.states["unit_relation"].internal_relation_id)
2099 self.assertEqual("up", (yield workflow.get_state()))
2100- yield workflow.fire_transition("error")
2101+ with (yield workflow.lock()):
2102+ yield workflow.fire_transition("error")
2103
2104 resolved = self.wait_on_state(workflow, "up")
2105
2106@@ -321,7 +330,8 @@
2107 workflow = self.lifecycle.get_relation_workflow(
2108 self.states["unit_relation"].internal_relation_id)
2109 self.assertEqual("up", (yield workflow.get_state()))
2110- yield workflow.transition_state("down")
2111+ with (yield workflow.lock()):
2112+ yield workflow.transition_state("down")
2113
2114 resolved = self.wait_on_state(workflow, "up")
2115
2116@@ -410,16 +420,6 @@
2117 self.assertFalse(install_executed.called)
2118
2119 @inlineCallbacks
2120- def test_upgrade_sans_hook(self):
2121- """The lifecycle upgrade can be invoked without firing hooks."""
2122- self.executor.stop()
2123- self.write_hook("upgrade-charm", "#!/bin/sh\n exit 1")
2124- upgrade_executed = self.wait_on_hook("upgrade-charm")
2125- yield self.lifecycle.upgrade_charm(fire_hooks=False)
2126- self.assertFalse(upgrade_executed.called)
2127- self.assertTrue(self.executor.running)
2128-
2129- @inlineCallbacks
2130 def test_running(self):
2131 self.assertFalse(self.lifecycle.running)
2132 yield self.lifecycle.install()
2133@@ -472,23 +472,6 @@
2134 return file_path
2135
2136 @inlineCallbacks
2137- def test_upgrade_hook_invoked_on_upgrade_charm(self):
2138- """Invoking the upgrade_charm lifecycle method executes the
2139- upgrade-charm hook.
2140- """
2141- file_path = self.makeFile("")
2142- self.write_hook(
2143- "upgrade-charm",
2144- ("#!/bin/bash\n" "echo upgraded >> %s\n" % file_path))
2145-
2146- # upgrade requires the external actor that extracts the charm
2147- # to stop the hook executor, prior to extraction so the
2148- # upgrade is the first hook run.
2149- yield self.executor.stop()
2150- yield self.lifecycle.upgrade_charm()
2151- self.assertEqual(open(file_path).read().strip(), "upgraded")
2152-
2153- @inlineCallbacks
2154 def test_config_hook_invoked_on_configure(self):
2155 """Invoke the configure lifecycle method will execute the
2156 config-changed hook.
2157@@ -765,7 +748,8 @@
2158 workflow = self.lifecycle.get_relation_workflow(
2159 self.states["unit_relation"].internal_relation_id)
2160 self.assertEqual("up", (yield workflow.get_state()))
2161- yield workflow.transition_state("down")
2162+ with (yield workflow.lock()):
2163+ yield workflow.transition_state("down")
2164 resolved = self.wait_on_state(workflow, "up")
2165
2166 # Stop the unit lifecycle
2167@@ -854,6 +838,93 @@
2168 yield new_lifecycle.stop()
2169
2170
2171+class UnitLifecycleUpgradeTest(
2172+ LifecycleTestBase, CharmPublisherTestBase, CharmUpgradeTestBase):
2173+
2174+ @inlineCallbacks
2175+ def setUp(self):
2176+ yield super(UnitLifecycleUpgradeTest, self).setUp()
2177+ yield self.setup_default_test_relation()
2178+ self.lifecycle = UnitLifecycle(
2179+ self.client, self.states["unit"], self.states["service"],
2180+ self.unit_directory, self.state_directory, self.executor)
2181+
2182+ @inlineCallbacks
2183+ def test_no_actual_upgrade_bad_hook(self):
2184+ self.write_hook("upgrade-charm", "#!/bin/bash\nexit 1\n")
2185+ done = self.wait_on_hook("upgrade-charm")
2186+ yield self.assertFailure(
2187+ self.lifecycle.upgrade_charm(), CharmInvocationError)
2188+ yield done
2189+ self.assertFalse(self.executor.running)
2190+
2191+ @inlineCallbacks
2192+ def test_no_actual_upgrade_good_hook(self):
2193+ self.write_hook("upgrade-charm", "#!/bin/bash\nexit 0\n")
2194+ # Ensure we don't actually upgrade
2195+ with self.frozen_charm():
2196+ done = self.wait_on_hook("upgrade-charm")
2197+ yield self.lifecycle.upgrade_charm()
2198+ yield done
2199+ self.assertTrue(self.executor.running)
2200+
2201+ @inlineCallbacks
2202+ def test_no_actual_upgrade_dont_fire_hooks(self):
2203+ self.write_hook("upgrade-charm", "#!/bin/bash\nexit 1\n")
2204+ with self.frozen_charm():
2205+ done = self.wait_on_hook("upgrade-charm")
2206+ yield self.lifecycle.upgrade_charm(fire_hooks=False)
2207+ yield self.sleep(0.1)
2208+ self.assertFalse(done.called)
2209+
2210+ @inlineCallbacks
2211+ def prepare_real_upgrade(self, hook_exit):
2212+ repo = self.increment_charm(self.charm)
2213+ hooks_dir = os.path.join(repo.path, "series", "mysql", "hooks")
2214+ self.write_hook(
2215+ "upgrade-charm",
2216+ "#!/bin/bash\nexit %s\n" % hook_exit,
2217+ hooks_dir=hooks_dir)
2218+ charm = yield repo.find(CharmURL.parse("local:series/mysql"))
2219+ charm, charm_state = yield self.publish_charm(charm.path)
2220+ yield self.states["service"].set_charm_id(charm_state.id)
2221+
2222+ @inlineCallbacks
2223+ def test_full_run_bad_write(self):
2224+ yield self.prepare_real_upgrade(0)
2225+ with self.frozen_charm():
2226+ yield self.assertFailure(
2227+ self.lifecycle.upgrade_charm(), CharmUpgradeError)
2228+ self.assertFalse(self.executor.running)
2229+
2230+ @inlineCallbacks
2231+ def test_full_run_bad_hook(self):
2232+ yield self.prepare_real_upgrade(1)
2233+ done = self.wait_on_hook("upgrade-charm")
2234+ yield self.assertFailure(
2235+ self.lifecycle.upgrade_charm(), CharmInvocationError)
2236+ yield done
2237+ self.assertFalse(self.executor.running)
2238+
2239+ @inlineCallbacks
2240+ def test_full_run_good_hook(self):
2241+ yield self.prepare_real_upgrade(0)
2242+ done = self.wait_on_hook("upgrade-charm")
2243+ yield self.lifecycle.upgrade_charm()
2244+ yield done
2245+ self.assertTrue(self.executor.running)
2246+
2247+ @inlineCallbacks
2248+ def test_full_run_dont_fire_hooks_ignored(self):
2249+ """Hooks must always be fired if the charm version actually changed
2250+ in the course of the upgrade"""
2251+ yield self.prepare_real_upgrade(0)
2252+ done = self.wait_on_hook("upgrade-charm")
2253+ yield self.lifecycle.upgrade_charm(fire_hooks=False)
2254+ yield done
2255+ self.assertTrue(self.executor.running)
2256+
2257+
2258 class RelationInvokerTest(TestCase):
2259
2260 def test_relation_invoker_environment(self):
2261@@ -875,10 +946,10 @@
2262 class UnitRelationLifecycleTest(LifecycleTestBase):
2263
2264 hook_template = (
2265- "#!/bin/bash\n"
2266- "echo %(change_type)s >> %(file_path)s\n"
2267- "echo JUJU_RELATION=$JUJU_RELATION >> %(file_path)s\n"
2268- "echo JUJU_REMOTE_UNIT=$JUJU_REMOTE_UNIT >> %(file_path)s")
2269+ "#!/bin/bash\n"
2270+ "echo %(change_type)s >> %(file_path)s\n"
2271+ "echo JUJU_RELATION=$JUJU_RELATION >> %(file_path)s\n"
2272+ "echo JUJU_REMOTE_UNIT=$JUJU_REMOTE_UNIT >> %(file_path)s")
2273
2274 @inlineCallbacks
2275 def setUp(self):
2276
2277=== modified file 'juju/unit/tests/test_workflow.py'
2278--- juju/unit/tests/test_workflow.py 2012-01-11 09:37:48 +0000
2279+++ juju/unit/tests/test_workflow.py 2012-02-21 10:23:29 +0000
2280@@ -1,14 +1,19 @@
2281+import csv
2282 import itertools
2283 import logging
2284+import os
2285 import yaml
2286-import csv
2287-import os
2288
2289 from twisted.internet.defer import inlineCallbacks, returnValue
2290
2291+from juju.control.tests.test_upgrade_charm import CharmUpgradeTestBase
2292+from juju.unit.tests.test_charm import CharmPublisherTestBase
2293 from juju.unit.tests.test_lifecycle import LifecycleTestBase
2294+
2295+from juju.charm.directory import CharmDirectory
2296+from juju.charm.url import CharmURL
2297+from juju.lib.statemachine import WorkflowState
2298 from juju.unit.lifecycle import UnitLifecycle, UnitRelationLifecycle
2299-
2300 from juju.unit.workflow import (
2301 UnitWorkflowState, RelationWorkflowState, WorkflowStateClient,
2302 is_unit_running, is_relation_running)
2303@@ -40,6 +45,26 @@
2304 [yaml.load(r[0]) for r in csv.reader(history)],
2305 yaml.load(zk_state[history_id])))
2306
2307+ @inlineCallbacks
2308+ def assert_history(self, expected, **kwargs):
2309+ f_state, history, zk_state = yield self.read_persistent_state(**kwargs)
2310+ self.assertEquals(f_state, zk_state)
2311+ self.assertEquals(f_state, history[-1])
2312+ self.assertEquals(history, expected)
2313+
2314+ def assert_history_concise(self, *chunks, **kwargs):
2315+ state = None
2316+ history = []
2317+ for chunk in chunks:
2318+ for transition in chunk[:-1]:
2319+ history.append({
2320+ "state": state,
2321+ "state_variables": {},
2322+ "transition_id": transition})
2323+ state = chunk[-1]
2324+ history.append({"state": state, "state_variables": {}})
2325+ return self.assert_history(history, **kwargs)
2326+
2327 def write_exit_hook(self, name, code=0, hooks_dir=None):
2328 self.write_hook(
2329 name,
2330@@ -68,12 +93,14 @@
2331
2332 @inlineCallbacks
2333 def assert_transition(self, transition, success=True):
2334- result = yield self.workflow.fire_transition(transition)
2335+ with (yield self.workflow.lock()):
2336+ result = yield self.workflow.fire_transition(transition)
2337 self.assertEquals(result, success)
2338
2339 @inlineCallbacks
2340 def assert_transition_alias(self, transition, success=True):
2341- result = yield self.workflow.fire_transition_alias(transition)
2342+ with (yield self.workflow.lock()):
2343+ result = yield self.workflow.fire_transition_alias(transition)
2344 self.assertEquals(result, success)
2345
2346 @inlineCallbacks
2347@@ -86,13 +113,6 @@
2348 lines = tuple(l.strip() for l in f)
2349 self.assertEquals(lines, hooks)
2350
2351- @inlineCallbacks
2352- def assert_history(self, expected):
2353- f_state, history, zk_state = yield self.read_persistent_state()
2354- self.assertEquals(f_state, zk_state)
2355- self.assertEquals(f_state, history[-1])
2356- self.assertEquals(history, expected)
2357-
2358
2359 class UnitWorkflowTest(UnitWorkflowTestBase):
2360
2361@@ -101,9 +121,8 @@
2362 yield self.assert_transition("install")
2363 yield self.assert_state("started")
2364 self.assert_hooks("install", "config-changed", "start")
2365- yield self.assert_history([
2366- {"state": "installed", "state_variables": {}},
2367- {"state": "started", "state_variables": {}}])
2368+ yield self.assert_history_concise(
2369+ ("install", "installed"), ("start", "started"))
2370
2371 @inlineCallbacks
2372 def test_install_with_error_and_retry(self):
2373@@ -118,10 +137,10 @@
2374 yield self.assert_transition("retry_install")
2375 yield self.assert_state("started")
2376 self.assert_hooks("install", "config-changed", "start")
2377- yield self.assert_history([
2378- {"state": "install_error", "state_variables": {}},
2379- {"state": "installed", "state_variables": {}},
2380- {"state": "started", "state_variables": {}}])
2381+ yield self.assert_history_concise(
2382+ ("install", "error_install", "install_error"),
2383+ ("retry_install", "installed"),
2384+ ("start", "started"))
2385
2386 @inlineCallbacks
2387 def test_install_error_with_retry_hook(self):
2388@@ -139,21 +158,22 @@
2389 yield self.assert_state("started")
2390 self.assert_hooks(
2391 "install", "install", "install", "config-changed", "start")
2392- yield self.assert_history([
2393- {"state": "install_error", "state_variables": {}},
2394- {"state": "installed", "state_variables": {}},
2395- {"state": "started", "state_variables": {}}])
2396+ yield self.assert_history_concise(
2397+ ("install", "error_install", "install_error"),
2398+ ("retry_install_hook", "install_error"),
2399+ ("retry_install_hook", "installed"),
2400+ ("start", "started"))
2401
2402 @inlineCallbacks
2403 def test_start(self):
2404- yield self.workflow.set_state("installed")
2405+ with (yield self.workflow.lock()):
2406+ yield self.workflow.set_state("installed")
2407
2408 yield self.assert_transition("start")
2409 yield self.assert_state("started")
2410 self.assert_hooks("config-changed", "start")
2411- yield self.assert_history([
2412- {"state": "installed", "state_variables": {}},
2413- {"state": "started", "state_variables": {}}])
2414+ yield self.assert_history_concise(
2415+ ("installed",), ("start", "started"))
2416
2417 @inlineCallbacks
2418 def test_start_with_error(self):
2419@@ -169,10 +189,10 @@
2420 yield self.assert_transition("retry_start")
2421 yield self.assert_state("started")
2422 self.assert_hooks("install", "config-changed", "start")
2423- yield self.assert_history([
2424- {"state": "installed", "state_variables": {}},
2425- {"state": "start_error", "state_variables": {}},
2426- {"state": "started", "state_variables": {}}])
2427+ yield self.assert_history_concise(
2428+ ("install", "installed"),
2429+ ("start", "error_start", "start_error"),
2430+ ("retry_start", "started"))
2431
2432 @inlineCallbacks
2433 def test_start_error_with_retry_hook(self):
2434@@ -191,10 +211,11 @@
2435 self.assert_hooks(
2436 "install", "config-changed", "start", "config-changed", "start",
2437 "config-changed", "start")
2438- yield self.assert_history([
2439- {"state": "installed", "state_variables": {}},
2440- {"state": "start_error", "state_variables": {}},
2441- {"state": "started", "state_variables": {}}])
2442+ yield self.assert_history_concise(
2443+ ("install", "installed"),
2444+ ("start", "error_start", "start_error"),
2445+ ("retry_start_hook", "start_error"),
2446+ ("retry_start_hook", "started"))
2447
2448 @inlineCallbacks
2449 def test_stop(self):
2450@@ -206,10 +227,10 @@
2451 yield self.assert_transition("stop")
2452 yield self.assert_state("stopped")
2453 self.assert_hooks("install", "config-changed", "start", "stop")
2454- yield self.assert_history([
2455- {"state": "installed", "state_variables": {}},
2456- {"state": "started", "state_variables": {}},
2457- {"state": "stopped", "state_variables": {}}])
2458+ yield self.assert_history_concise(
2459+ ("install", "installed"),
2460+ ("start", "started"),
2461+ ("stop", "stopped"))
2462
2463 @inlineCallbacks
2464 def test_stop_with_error(self):
2465@@ -221,11 +242,11 @@
2466 yield self.assert_transition("retry_stop")
2467 yield self.assert_state("stopped")
2468 self.assert_hooks("install", "config-changed", "start", "stop")
2469- yield self.assert_history([
2470- {"state": "installed", "state_variables": {}},
2471- {"state": "started", "state_variables": {}},
2472- {"state": "stop_error", "state_variables": {}},
2473- {"state": "stopped", "state_variables": {}}])
2474+ yield self.assert_history_concise(
2475+ ("install", "installed"),
2476+ ("start", "started"),
2477+ ("stop", "error_stop", "stop_error"),
2478+ ("retry_stop", "stopped"))
2479
2480 @inlineCallbacks
2481 def test_stop_error_with_retry_hook(self):
2482@@ -241,11 +262,12 @@
2483 yield self.assert_state("stopped")
2484 self.assert_hooks(
2485 "install", "config-changed", "start", "stop", "stop", "stop")
2486- yield self.assert_history([
2487- {"state": "installed", "state_variables": {}},
2488- {"state": "started", "state_variables": {}},
2489- {"state": "stop_error", "state_variables": {}},
2490- {"state": "stopped", "state_variables": {}}])
2491+ yield self.assert_history_concise(
2492+ ("install", "installed"),
2493+ ("start", "started"),
2494+ ("stop", "error_stop", "stop_error"),
2495+ ("retry_stop_hook", "stop_error"),
2496+ ("retry_stop_hook", "stopped"))
2497
2498 @inlineCallbacks
2499 def test_configure(self):
2500@@ -254,14 +276,14 @@
2501 """
2502 yield self.assert_transition("install")
2503
2504- yield self.assert_transition("reconfigure")
2505+ yield self.assert_transition("configure")
2506 yield self.assert_state("started")
2507 self.assert_hooks(
2508 "install", "config-changed", "start", "config-changed")
2509- yield self.assert_history([
2510- {"state": "installed", "state_variables": {}},
2511- {"state": "started", "state_variables": {}},
2512- {"state": "started", "state_variables": {}}])
2513+ yield self.assert_history_concise(
2514+ ("install", "installed"),
2515+ ("start", "started"),
2516+ ("configure", "started"))
2517
2518 @inlineCallbacks
2519 def test_configure_error_and_retry(self):
2520@@ -270,17 +292,17 @@
2521 yield self.assert_transition("install")
2522 self.write_exit_hook("config-changed", 1)
2523
2524- yield self.assert_transition("reconfigure", False)
2525+ yield self.assert_transition("configure", False)
2526 yield self.assert_state("configure_error")
2527 yield self.assert_transition("retry_configure")
2528 yield self.assert_state("started")
2529 self.assert_hooks(
2530 "install", "config-changed", "start", "config-changed")
2531- yield self.assert_history([
2532- {"state": "installed", "state_variables": {}},
2533- {"state": "started", "state_variables": {}},
2534- {"state": "configure_error", "state_variables": {}},
2535- {"state": "started", "state_variables": {}}])
2536+ yield self.assert_history_concise(
2537+ ("install", "installed"),
2538+ ("start", "started"),
2539+ ("configure", "error_configure", "configure_error"),
2540+ ("retry_configure", "started"))
2541
2542 @inlineCallbacks
2543 def test_configure_error_and_retry_hook(self):
2544@@ -289,7 +311,7 @@
2545 yield self.assert_transition("install")
2546 self.write_exit_hook("config-changed", 1)
2547
2548- yield self.assert_transition("reconfigure", False)
2549+ yield self.assert_transition("configure", False)
2550 yield self.assert_state("configure_error")
2551 yield self.assert_transition("retry_configure_hook", False)
2552 yield self.assert_state("configure_error")
2553@@ -299,12 +321,13 @@
2554 self.assert_hooks(
2555 "install", "config-changed", "start",
2556 "config-changed", "config-changed", "config-changed")
2557- yield self.assert_history([
2558- {"state": "installed", "state_variables": {}},
2559- {"state": "started", "state_variables": {}},
2560- {"state": "configure_error", "state_variables": {}},
2561- {"state": "configure_error", "state_variables": {}},
2562- {"state": "started", "state_variables": {}}])
2563+ yield self.assert_history_concise(
2564+ ("install", "installed"),
2565+ ("start", "started"),
2566+ ("configure", "error_configure", "configure_error"),
2567+ ("retry_configure_hook", "error_retry_configure",
2568+ "configure_error"),
2569+ ("retry_configure_hook", "started"))
2570
2571 @inlineCallbacks
2572 def test_is_unit_running(self):
2573@@ -312,12 +335,14 @@
2574 self.client, self.states["unit"])
2575 self.assertIdentical(running, False)
2576 self.assertIdentical(state, None)
2577- yield self.workflow.fire_transition("install")
2578+ with (yield self.workflow.lock()):
2579+ yield self.workflow.fire_transition("install")
2580 running, state = yield is_unit_running(
2581 self.client, self.states["unit"])
2582 self.assertIdentical(running, True)
2583 self.assertEqual(state, "started")
2584- yield self.workflow.fire_transition("stop")
2585+ with (yield self.workflow.lock()):
2586+ yield self.workflow.fire_transition("stop")
2587 running, state = yield is_unit_running(
2588 self.client, self.states["unit"])
2589 self.assertIdentical(running, False)
2590@@ -331,57 +356,98 @@
2591
2592 @inlineCallbacks
2593 def test_client_with_state(self):
2594- yield self.workflow.fire_transition("install")
2595+ with (yield self.workflow.lock()):
2596+ yield self.workflow.fire_transition("install")
2597 workflow_client = WorkflowStateClient(self.client, self.states["unit"])
2598 self.assertEqual(
2599 (yield workflow_client.get_state()), "started")
2600
2601 @inlineCallbacks
2602 def test_client_readonly(self):
2603- yield self.workflow.fire_transition("install")
2604+ with (yield self.workflow.lock()):
2605+ yield self.workflow.fire_transition("install")
2606 workflow_client = WorkflowStateClient(
2607 self.client, self.states["unit"])
2608
2609 self.assertEqual(
2610 (yield workflow_client.get_state()), "started")
2611- yield self.assertFailure(
2612- workflow_client.set_state("stopped"), NotImplementedError)
2613+ with (yield workflow_client.lock()):
2614+ yield self.assertFailure(
2615+ workflow_client.set_state("stopped"), NotImplementedError)
2616 self.assertEqual(
2617 (yield workflow_client.get_state()), "started")
2618
2619 @inlineCallbacks
2620- def assert_synchronize(
2621- self, start_state, start_vars,
2622- expect_state, expect_lifecycle, expect_executor):
2623+ def assert_synchronize(self, start_state, state, lifecycle, executor,
2624+ sync_lifecycle=None, sync_executor=None,
2625+ start_inflight=None):
2626+ # Handle cases where we expect to be in a different state pre-sync
2627+ # to the final state post-sync.
2628+ if sync_lifecycle is None:
2629+ sync_lifecycle = lifecycle
2630+ if sync_executor is None:
2631+ sync_executor = executor
2632+ super_sync = WorkflowState.synchronize
2633+
2634+ @inlineCallbacks
2635+ def check_sync(obj):
2636+ # We don't care about RelationWorkflowState syncing here
2637+ if type(obj) == UnitWorkflowState:
2638+ self.assertEquals(
2639+ self.lifecycle.running, sync_lifecycle)
2640+ self.assertEquals(
2641+ self.executor.running, sync_executor)
2642+ yield super_sync(obj)
2643+
2644 all_start_states = itertools.product((True, False), (True, False))
2645- for lifecycle, executor in all_start_states:
2646- if executor and not self.executor.running:
2647+ for initial_lifecycle, initial_executor in all_start_states:
2648+ if initial_executor and not self.executor.running:
2649 self.executor.start()
2650- if lifecycle and not self.lifecycle.running:
2651+ elif not initial_executor and self.executor.running:
2652+ yield self.executor.stop()
2653+ if initial_lifecycle and not self.lifecycle.running:
2654 yield self.lifecycle.start(fire_hooks=False)
2655- yield self.workflow.set_state(start_state, **start_vars)
2656- yield self.workflow.synchronize(self.executor)
2657-
2658- state = yield self.workflow.get_state()
2659- self.assertEquals(state, expect_state)
2660+ elif not initial_lifecycle and self.lifecycle.running:
2661+ yield self.lifecycle.stop(fire_hooks=False)
2662+ with (yield self.workflow.lock()):
2663+ yield self.workflow.set_state(start_state)
2664+ yield self.workflow.set_inflight(start_inflight)
2665+
2666+ # self.patch is not suitable because we can't unpatch until
2667+ # the end of the test, and we don't really want [many] distinct
2668+ # one-line test_synchronize_foo methods.
2669+ WorkflowState.synchronize = check_sync
2670+ try:
2671+ yield self.workflow.synchronize(self.executor)
2672+ finally:
2673+ WorkflowState.synchronize = super_sync
2674+
2675+ new_inflight = yield self.workflow.get_inflight()
2676+ self.assertEquals(new_inflight, None)
2677+ new_state = yield self.workflow.get_state()
2678+ self.assertEquals(new_state, state)
2679 vars = yield self.workflow.get_state_variables()
2680 self.assertEquals(vars, {})
2681- self.assertEquals(self.lifecycle.running, expect_lifecycle)
2682- self.assertEquals(self.executor.running, expect_executor)
2683+ self.assertEquals(self.lifecycle.running, lifecycle)
2684+ self.assertEquals(self.executor.running, executor)
2685
2686 def assert_default_synchronize(self, state):
2687- return self.assert_synchronize(state, {}, state, False, True)
2688-
2689- @inlineCallbacks
2690- def test_synchronize(self):
2691- yield self.assert_synchronize(
2692- None, {}, "started", True, True)
2693- yield self.assert_synchronize(
2694- "installed", {}, "started", True, True)
2695- yield self.assert_synchronize(
2696- "started", {}, "started", True, True)
2697- yield self.assert_synchronize(
2698- "charm_upgrade_error", {}, "charm_upgrade_error", True, False)
2699+ return self.assert_synchronize(state, state, False, True)
2700+
2701+ @inlineCallbacks
2702+ def test_synchronize_automatic(self):
2703+ # No transition in flight
2704+ yield self.assert_synchronize(
2705+ None, "started", True, True, False, True)
2706+ yield self.assert_synchronize(
2707+ "installed", "started", True, True, False, True)
2708+ yield self.assert_synchronize(
2709+ "started", "started", True, True)
2710+ yield self.assert_synchronize(
2711+ "charm_upgrade_error", "charm_upgrade_error", True, False)
2712+
2713+ @inlineCallbacks
2714+ def test_synchronize_trivial(self):
2715 yield self.assert_default_synchronize("install_error")
2716 yield self.assert_default_synchronize("start_error")
2717 yield self.assert_default_synchronize("configure_error")
2718@@ -389,88 +455,182 @@
2719 yield self.assert_default_synchronize("stopped")
2720
2721 @inlineCallbacks
2722+ def test_synchronize_inflight(self):
2723+ # With transition inflight (we check the important one (upgrade_charm)
2724+ # and a couple of others at random, but testing every single one is
2725+ # entirely redundant).
2726+ yield self.assert_synchronize(
2727+ "started", "started", True, True, True, False, "upgrade_charm")
2728+ yield self.assert_synchronize(
2729+ None, "started", True, True, False, True, "install")
2730+ yield self.assert_synchronize(
2731+ "configure_error", "started", True, True, False, True,
2732+ "retry_configure_hook")
2733+
2734+
2735+class UnitWorkflowUpgradeTest(
2736+ UnitWorkflowTestBase, CharmPublisherTestBase, CharmUpgradeTestBase):
2737+
2738+ expected_upgrade = None
2739+
2740+ @inlineCallbacks
2741+ def ready_upgrade(self, bad_hook):
2742+ repository = self.increment_charm(self.charm)
2743+ hooks_dir = os.path.join(repository.path, "series", "mysql", "hooks")
2744+ self.write_exit_hook(
2745+ "upgrade-charm", int(bad_hook), hooks_dir=hooks_dir)
2746+
2747+ charm = yield repository.find(CharmURL.parse("local:series/mysql"))
2748+ charm, charm_state = yield self.publish_charm(charm.path)
2749+ yield self.states["service"].set_charm_id(charm_state.id)
2750+ self.expected_upgrade = charm_state.id
2751+
2752+ @inlineCallbacks
2753+ def assert_charm_upgraded(self, expect_upgraded):
2754+ charm_id = yield self.states["unit"].get_charm_id()
2755+ self.assertEquals(charm_id == self.expected_upgrade, expect_upgraded)
2756+ if expect_upgraded:
2757+ expect_revision = CharmURL.parse(self.expected_upgrade).revision
2758+ charm = CharmDirectory(os.path.join(self.unit_directory, "charm"))
2759+ self.assertEquals(charm.get_revision(), expect_revision)
2760+
2761+ @inlineCallbacks
2762+ def test_upgrade_not_available(self):
2763+ """Upgrading when there's no new version runs the hook anyway"""
2764+ yield self.assert_transition("install")
2765+ yield self.assert_state("started")
2766+
2767+ yield self.assert_transition("upgrade_charm")
2768+ yield self.assert_state("started")
2769+ self.assert_hooks(
2770+ "install", "config-changed", "start", "upgrade-charm")
2771+ yield self.assert_history_concise(
2772+ ("install", "installed"),
2773+ ("start", "started"),
2774+ ("upgrade_charm", "started"))
2775+
2776+ @inlineCallbacks
2777 def test_upgrade(self):
2778 """Upgrading a workflow results in the upgrade hook being
2779 executed.
2780 """
2781- self.makeFile()
2782- yield self.workflow.fire_transition("install")
2783- current_state = yield self.workflow.get_state()
2784- self.assertEqual(current_state, "started")
2785- file_path = self.makeFile()
2786- self.write_hook("upgrade-charm",
2787- ("#!/bin/bash\n"
2788- "echo upgraded >> %s") % file_path)
2789- self.executor.stop()
2790- yield self.workflow.fire_transition("upgrade_charm")
2791- current_state = yield self.workflow.get_state()
2792- self.assertEqual(current_state, "started")
2793-
2794- @inlineCallbacks
2795- def test_upgrade_without_stopping_hooks_errors(self):
2796- """Attempting to execute an upgrade without stopping the
2797- executor is an error.
2798- """
2799- yield self.workflow.fire_transition("install")
2800- current_state = yield self.workflow.get_state()
2801- self.assertEqual(current_state, "started")
2802- yield self.assertFailure(
2803- self.workflow.fire_transition("upgrade_charm"),
2804- AssertionError)
2805+ yield self.assert_transition("install")
2806+ yield self.assert_state("started")
2807+ yield self.ready_upgrade(False)
2808+
2809+ yield self.assert_charm_upgraded(False)
2810+ yield self.assert_transition("upgrade_charm")
2811+ yield self.assert_state("started")
2812+ yield self.assert_charm_upgraded(True)
2813+
2814+ self.assert_hooks(
2815+ "install", "config-changed", "start", "upgrade-charm")
2816+ yield self.assert_history_concise(
2817+ ("install", "installed"),
2818+ ("start", "started"),
2819+ ("upgrade_charm", "started"))
2820
2821 @inlineCallbacks
2822 def test_upgrade_error_retry(self):
2823 """A hook error during an upgrade transitions to
2824 upgrade_error.
2825 """
2826- self.write_hook("upgrade-charm", "#!/bin/bash\nexit 1")
2827- yield self.workflow.fire_transition("install")
2828- current_state = yield self.workflow.get_state()
2829- self.assertEqual(current_state, "started")
2830- self.executor.stop()
2831- yield self.workflow.fire_transition("upgrade_charm")
2832-
2833- current_state = yield self.workflow.get_state()
2834- self.assertEqual(current_state, "charm_upgrade_error")
2835- file_path = self.makeFile()
2836- self.write_hook("upgrade-charm",
2837- ("#!/bin/bash\n"
2838- "echo upgraded >> %s") % file_path)
2839-
2840- # The upgrade error hook should ensure that the executor is stoppped.
2841+ yield self.assert_transition("install")
2842+ yield self.assert_state("started")
2843+ yield self.ready_upgrade(True)
2844+
2845+ yield self.assert_charm_upgraded(False)
2846+ yield self.assert_transition("upgrade_charm", False)
2847+ yield self.assert_state("charm_upgrade_error")
2848 self.assertFalse(self.executor.running)
2849- yield self.workflow.fire_transition("retry_upgrade_charm")
2850- current_state = yield self.workflow.get_state()
2851- self.assertEqual(current_state, "started")
2852+ # The upgrade should complete before the hook blows up.
2853+ yield self.assert_charm_upgraded(True)
2854+
2855+ # The bad hook is still in place, but we don't run it again
2856+ yield self.assert_transition("retry_upgrade_charm")
2857+ yield self.assert_state("started")
2858+ yield self.assert_charm_upgraded(True)
2859+ self.assertTrue(self.executor.running)
2860+
2861+ self.assert_hooks(
2862+ "install", "config-changed", "start", "upgrade-charm")
2863+ yield self.assert_history_concise(
2864+ ("install", "installed"),
2865+ ("start", "started"),
2866+ ("upgrade_charm", "upgrade_charm_error", "charm_upgrade_error"),
2867+ ("retry_upgrade_charm", "started"))
2868
2869 @inlineCallbacks
2870 def test_upgrade_error_retry_hook(self):
2871 """A hook error during an upgrade transitions to
2872 upgrade_error, and can be re-tried with hook execution.
2873 """
2874- yield self.workflow.fire_transition("install")
2875- current_state = yield self.workflow.get_state()
2876- self.assertEqual(current_state, "started")
2877-
2878- # Agent prepares this.
2879- self.executor.stop()
2880-
2881- self.write_hook("upgrade-charm", "#!/bin/bash\nexit 1")
2882- hook_deferred = self.wait_on_hook("upgrade-charm")
2883- yield self.workflow.fire_transition("upgrade_charm")
2884- yield hook_deferred
2885- current_state = yield self.workflow.get_state()
2886- self.assertEqual(current_state, "charm_upgrade_error")
2887-
2888- hook_deferred = self.wait_on_hook("upgrade-charm")
2889- self.write_hook("upgrade-charm", "#!/bin/bash\nexit 0")
2890- # The upgrade error hook should ensure that the executor is stoppped.
2891- self.assertFalse(self.executor.running)
2892- yield self.workflow.fire_transition_alias("retry_hook")
2893- yield hook_deferred
2894- current_state = yield self.workflow.get_state()
2895- self.assertEqual(current_state, "started")
2896- self.assertTrue(self.executor.running)
2897+ yield self.assert_transition("install")
2898+ yield self.assert_state("started")
2899+ yield self.ready_upgrade(True)
2900+
2901+ yield self.assert_charm_upgraded(False)
2902+ yield self.assert_transition("upgrade_charm", False)
2903+ yield self.assert_state("charm_upgrade_error")
2904+ self.assertFalse(self.executor.running)
2905+ # The upgrade should complete before the hook blows up.
2906+ yield self.assert_charm_upgraded(True)
2907+
2908+ yield self.assert_transition("retry_upgrade_charm_hook", False)
2909+ yield self.assert_state("charm_upgrade_error")
2910+ self.assertFalse(self.executor.running)
2911+ yield self.assert_charm_upgraded(True)
2912+
2913+ self.write_exit_hook("upgrade-charm")
2914+ yield self.assert_transition_alias("retry_hook")
2915+ yield self.assert_state("started")
2916+ self.assertTrue(self.executor.running)
2917+ yield self.assert_charm_upgraded(True)
2918+
2919+ self.assert_hooks(
2920+ "install", "config-changed", "start",
2921+ "upgrade-charm", "upgrade-charm", "upgrade-charm")
2922+ yield self.assert_history_concise(
2923+ ("install", "installed"),
2924+ ("start", "started"),
2925+ ("upgrade_charm", "upgrade_charm_error", "charm_upgrade_error"),
2926+ ("retry_upgrade_charm_hook", "retry_upgrade_charm_error",
2927+ "charm_upgrade_error"),
2928+ ("retry_upgrade_charm_hook", "started"))
2929+
2930+ @inlineCallbacks
2931+ def test_upgrade_error_before_hook(self):
2932+ """If we blow up during the critical pre-hook bits, we should still
2933+ end up in the same error state"""
2934+ yield self.assert_transition("install")
2935+ yield self.assert_state("started")
2936+ yield self.ready_upgrade(False)
2937+
2938+ # Induce a surprising error
2939+ with self.frozen_charm():
2940+ yield self.assert_charm_upgraded(False)
2941+ yield self.assert_transition("upgrade_charm", False)
2942+ yield self.assert_state("charm_upgrade_error")
2943+ self.assertFalse(self.executor.running)
2944+ # The upgrade did not complete
2945+ yield self.assert_charm_upgraded(False)
2946+
2947+ yield self.assert_transition("retry_upgrade_charm")
2948+ yield self.assert_state("started")
2949+ self.assertTrue(self.executor.running)
2950+ yield self.assert_charm_upgraded(True)
2951+
2952+ # The hook must run here, even though it's a retry, because the actual
2953+ # charm only just got overwritten: and so we know that we've never even
2954+ # tried to execute a hook for this upgrade, and we must do so to fulfil
2955+ # the guarantee that that hook runs first after upgrade.
2956+ self.assert_hooks(
2957+ "install", "config-changed", "start", "upgrade-charm")
2958+ yield self.assert_history_concise(
2959+ ("install", "installed"),
2960+ ("start", "started"),
2961+ ("upgrade_charm", "upgrade_charm_error", "charm_upgrade_error"),
2962+ ("retry_upgrade_charm", "started"))
2963
2964
2965 class UnitRelationWorkflowTest(WorkflowTestBase):
2966@@ -497,27 +657,29 @@
2967
2968 self.workflow = RelationWorkflowState(
2969 self.client, self.states["unit_relation"],
2970- self.states["unit"].unit_name, self.lifecycle, self.state_directory)
2971+ self.states["unit"].unit_name, self.lifecycle,
2972+ self.state_directory)
2973
2974 @inlineCallbacks
2975 def test_is_relation_running(self):
2976 """The unit relation's workflow state can be categorized as a
2977 boolean.
2978 """
2979- running, state = yield is_relation_running(
2980- self.client, self.states["unit_relation"])
2981- self.assertIdentical(running, False)
2982- self.assertIdentical(state, None)
2983- yield self.workflow.fire_transition("start")
2984- running, state = yield is_relation_running(
2985- self.client, self.states["unit_relation"])
2986- self.assertIdentical(running, True)
2987- self.assertEqual(state, "up")
2988- yield self.workflow.fire_transition("stop")
2989- running, state = yield is_relation_running(
2990- self.client, self.states["unit_relation"])
2991- self.assertIdentical(running, False)
2992- self.assertEqual(state, "down")
2993+ with (yield self.workflow.lock()):
2994+ running, state = yield is_relation_running(
2995+ self.client, self.states["unit_relation"])
2996+ self.assertIdentical(running, False)
2997+ self.assertIdentical(state, None)
2998+ yield self.workflow.fire_transition("start")
2999+ running, state = yield is_relation_running(
3000+ self.client, self.states["unit_relation"])
3001+ self.assertIdentical(running, True)
3002+ self.assertEqual(state, "up")
3003+ yield self.workflow.fire_transition("stop")
3004+ running, state = yield is_relation_running(
3005+ self.client, self.states["unit_relation"])
3006+ self.assertIdentical(running, False)
3007+ self.assertEqual(state, "down")
3008
3009 @inlineCallbacks
3010 def test_up_down_cycle(self):
3011@@ -526,40 +688,29 @@
3012 self.write_hook("%s-relation-changed" % self.relation_name,
3013 "#!/bin/bash\nexit 0\n")
3014
3015- yield self.workflow.fire_transition("start")
3016+ with (yield self.workflow.lock()):
3017+ yield self.workflow.fire_transition("start")
3018 yield self.assertState(self.workflow, "up")
3019
3020 hook_executed = self.wait_on_hook("app-relation-changed")
3021
3022- # Add a new unit, and this will be scheduled by the time
3023- # we finish stopping.
3024+ # Add a new unit, while we're stopped.
3025+ with (yield self.workflow.lock()):
3026+ yield self.workflow.fire_transition("stop")
3027 yield self.add_opposite_service_unit(self.states)
3028- yield self.workflow.fire_transition("stop")
3029 yield self.assertState(self.workflow, "down")
3030 self.assertFalse(hook_executed.called)
3031
3032- # Currently if we restart, we will only see the previously
3033- # queued event, as the last watch active when a lifecycle is
3034- # stopped, may already be in flight and will be scheduled, and
3035- # will be executed when the lifecycle is started. However any
3036- # events that may have occured after the lifecycle is stopped
3037- # are currently ignored and un-notified.
3038- yield self.workflow.fire_transition("restart")
3039+ # Come back up; check unit add detected.
3040+ with (yield self.workflow.lock()):
3041+ yield self.workflow.fire_transition("restart")
3042 yield self.assertState(self.workflow, "up")
3043 yield hook_executed
3044
3045- f_state, history, zk_state = yield self.read_persistent_state(
3046+ self.assert_history_concise(
3047+ ("start", "up"), ("stop", "down"), ("restart", "up"),
3048 history_id=self.workflow.zk_state_id)
3049
3050- self.assertEqual(f_state, zk_state)
3051- self.assertEqual(f_state,
3052- {"state": "up", "state_variables": {}})
3053-
3054- self.assertEqual(history,
3055- [{"state": "up", "state_variables": {}},
3056- {"state": "down", "state_variables": {}},
3057- {"state": "up", "state_variables": {}}])
3058-
3059 @inlineCallbacks
3060 def test_change_hook_with_error(self):
3061 """An error while processing a change hook, results
3062@@ -571,11 +722,9 @@
3063 self.write_hook("%s-relation-changed" % self.relation_name,
3064 "#!/bin/bash\nexit 1\n")
3065
3066- current_state = yield self.workflow.get_state()
3067- self.assertEqual(current_state, None)
3068- yield self.workflow.fire_transition("start")
3069+ with (yield self.workflow.lock()):
3070+ yield self.workflow.fire_transition("start")
3071 yield self.assertState(self.workflow, "up")
3072- current_state = yield self.workflow.get_state()
3073
3074 # Add a new unit, and wait for the broken hook to result in
3075 # the transition to the down state.
3076@@ -602,7 +751,8 @@
3077 broken hook is executed, and the unit stops responding to relation
3078 changes.
3079 """
3080- yield self.workflow.fire_transition("start")
3081+ with (yield self.workflow.lock()):
3082+ yield self.workflow.fire_transition("start")
3083 yield self.assertState(self.workflow, "up")
3084
3085 wait_on_hook = self.wait_on_hook("app-relation-changed")
3086@@ -611,7 +761,8 @@
3087
3088 wait_on_hook = self.wait_on_hook("app-relation-broken")
3089 wait_on_state = self.wait_on_state(self.workflow, "departed")
3090- yield self.workflow.fire_transition("depart")
3091+ with (yield self.workflow.lock()):
3092+ yield self.workflow.fire_transition("depart")
3093 yield wait_on_hook
3094 yield wait_on_state
3095
3096@@ -643,7 +794,8 @@
3097 def test_client_read_state(self):
3098 """The relation workflow client can read the state of a unit
3099 relation."""
3100- yield self.workflow.fire_transition("start")
3101+ with (yield self.workflow.lock()):
3102+ yield self.workflow.fire_transition("start")
3103 yield self.assertState(self.workflow, "up")
3104
3105 self.write_hook("%s-relation-changed" % self.relation_name,
3106@@ -660,12 +812,31 @@
3107 def test_client_read_only(self):
3108 workflow_client = WorkflowStateClient(
3109 self.client, self.states["unit_relation"])
3110- yield self.assertFailure(
3111- workflow_client.set_state("up"),
3112- NotImplementedError)
3113+ with (yield workflow_client.lock()):
3114+ yield self.assertFailure(
3115+ workflow_client.set_state("up"),
3116+ NotImplementedError)
3117
3118 @inlineCallbacks
3119- def assert_synchronize(self, state, expect_state, watches, scheduler):
3120+ def assert_synchronize(self, start_state, state, watches, scheduler,
3121+ sync_watches=None, sync_scheduler=None,
3122+ start_inflight=None):
3123+ # Handle cases where we expect to be in a different state pre-sync
3124+ # to the final state post-sync.
3125+ if sync_watches is None:
3126+ sync_watches = watches
3127+ if sync_scheduler is None:
3128+ sync_scheduler = scheduler
3129+ super_sync = WorkflowState.synchronize
3130+
3131+ @inlineCallbacks
3132+ def check_sync(obj):
3133+ self.assertEquals(
3134+ self.workflow.lifecycle.watching, sync_watches)
3135+ self.assertEquals(
3136+ self.workflow.lifecycle.executing, sync_scheduler)
3137+ yield super_sync(obj)
3138+
3139 start_states = itertools.product((True, False), (True, False))
3140 for (initial_watches, initial_scheduler) in start_states:
3141 yield self.workflow.lifecycle.stop()
3142@@ -676,22 +847,53 @@
3143 self.workflow.lifecycle.watching, initial_watches)
3144 self.assertEquals(
3145 self.workflow.lifecycle.executing, initial_scheduler)
3146- yield self.workflow.set_state(state)
3147-
3148- yield self.workflow.synchronize()
3149+ with (yield self.workflow.lock()):
3150+ yield self.workflow.set_state(start_state)
3151+ yield self.workflow.set_inflight(start_inflight)
3152+
3153+ # self.patch is not suitable because we can't unpatch until
3154+ # the end of the test, and we don't really want 13 distinct
3155+ # one-line test_synchronize_foo methods.
3156+ WorkflowState.synchronize = check_sync
3157+ try:
3158+ yield self.workflow.synchronize()
3159+ finally:
3160+ WorkflowState.synchronize = super_sync
3161+
3162+ new_inflight = yield self.workflow.get_inflight()
3163+ self.assertEquals(new_inflight, None)
3164 new_state = yield self.workflow.get_state()
3165- self.assertEquals(new_state, expect_state)
3166+ self.assertEquals(new_state, state)
3167 self.assertEquals(self.workflow.lifecycle.watching, watches)
3168 self.assertEquals(self.workflow.lifecycle.executing, scheduler)
3169
3170 @inlineCallbacks
3171 def test_synchronize(self):
3172- yield self.assert_synchronize(None, "up", True, True)
3173+ # No transition in flight
3174+ yield self.assert_synchronize(None, "up", True, True, False, False)
3175 yield self.assert_synchronize("down", "down", False, False)
3176 yield self.assert_synchronize("departed", "departed", False, False)
3177 yield self.assert_synchronize("error", "error", True, False)
3178 yield self.assert_synchronize("up", "up", True, True)
3179
3180+ # With transition inflight
3181+ yield self.assert_synchronize(
3182+ None, "up", True, True, False, False, "start")
3183+ yield self.assert_synchronize(
3184+ "up", "down", False, False, True, True, "stop")
3185+ yield self.assert_synchronize(
3186+ "down", "up", True, True, False, False, "restart")
3187+ yield self.assert_synchronize(
3188+ "up", "error", True, False, True, True, "error")
3189+ yield self.assert_synchronize(
3190+ "error", "up", True, True, True, False, "reset")
3191+ yield self.assert_synchronize(
3192+ "up", "departed", False, False, True, True, "depart")
3193+ yield self.assert_synchronize(
3194+ "down", "departed", False, False, False, False, "down_depart")
3195+ yield self.assert_synchronize(
3196+ "error", "departed", False, False, True, False, "error_depart")
3197+
3198 @inlineCallbacks
3199 def test_depart_hook_error(self):
3200 """A depart hook error, still results in a transition to the
3201@@ -701,7 +903,8 @@
3202 "#!/bin/bash\nexit 1\n")
3203 error_output = self.capture_logging("unit.relation.workflow")
3204
3205- yield self.workflow.fire_transition("start")
3206+ with (yield self.workflow.lock()):
3207+ yield self.workflow.fire_transition("start")
3208 yield self.assertState(self.workflow, "up")
3209
3210 wait_on_hook = self.wait_on_hook("app-relation-changed")
3211@@ -710,7 +913,8 @@
3212
3213 wait_on_hook = self.wait_on_hook("app-relation-broken")
3214 wait_on_state = self.wait_on_state(self.workflow, "departed")
3215- yield self.workflow.fire_transition("depart")
3216+ with (yield self.workflow.lock()):
3217+ yield self.workflow.fire_transition("depart")
3218 yield wait_on_hook
3219 yield wait_on_state
3220
3221@@ -756,15 +960,17 @@
3222 broken hook is executed, and the unit stops responding to relation
3223 changes.
3224 """
3225- yield self.workflow.fire_transition("start")
3226- yield self.assertState(self.workflow, "up")
3227- yield self.workflow.fire_transition("stop")
3228- yield self.assertState(self.workflow, "down")
3229+ with (yield self.workflow.lock()):
3230+ yield self.workflow.fire_transition("start")
3231+ yield self.assertState(self.workflow, "up")
3232+ yield self.workflow.fire_transition("stop")
3233+ yield self.assertState(self.workflow, "down")
3234
3235 states = yield self.add_opposite_service_unit(self.states)
3236 wait_on_hook = self.wait_on_hook("app-relation-broken")
3237 wait_on_state = self.wait_on_state(self.workflow, "departed")
3238- yield self.workflow.fire_transition("depart")
3239+ with (yield self.workflow.lock()):
3240+ yield self.workflow.fire_transition("depart")
3241 yield wait_on_hook
3242 yield wait_on_state
3243
3244@@ -783,15 +989,17 @@
3245 self.assertFalse(results)
3246
3247 def test_depart_error(self):
3248- yield self.workflow.fire_transition("start")
3249- yield self.assertState(self.workflow, "up")
3250- yield self.workflow.fire_transition("error")
3251- yield self.assertState(self.workflow, "error")
3252+ with (yield self.workflow.lock()):
3253+ yield self.workflow.fire_transition("start")
3254+ yield self.assertState(self.workflow, "up")
3255+ yield self.workflow.fire_transition("error")
3256+ yield self.assertState(self.workflow, "error")
3257
3258 states = yield self.add_opposite_service_unit(self.states)
3259 wait_on_hook = self.wait_on_hook("app-relation-broken")
3260 wait_on_state = self.wait_on_state(self.workflow, "departed")
3261- yield self.workflow.fire_transition("depart")
3262+ with (yield self.workflow.lock()):
3263+ yield self.workflow.fire_transition("depart")
3264 yield wait_on_hook
3265 yield wait_on_state
3266
3267
3268=== modified file 'juju/unit/workflow.py'
3269--- juju/unit/workflow.py 2012-01-11 09:37:48 +0000
3270+++ juju/unit/workflow.py 2012-02-21 10:23:29 +0000
3271@@ -8,7 +8,7 @@
3272
3273 from txzookeeper.utils import retry_change
3274
3275-from juju.errors import CharmInvocationError, CharmError, FileNotFound
3276+from juju.errors import CharmError, FileNotFound
3277 from juju.lib.statemachine import (
3278 WorkflowState, Workflow, Transition, TransitionError)
3279
3280@@ -16,18 +16,16 @@
3281 UnitWorkflow = Workflow(
3282 # Install transitions
3283 Transition("install", "Install", None, "installed",
3284- error_transition_id="error_install",
3285- success_transition_id="start"),
3286+ error_transition_id="error_install", automatic=True),
3287 Transition("error_install", "Install error", None, "install_error"),
3288 Transition("retry_install", "Retry install", "install_error", "installed",
3289- alias="retry", success_transition_id="start"),
3290+ alias="retry"),
3291 Transition("retry_install_hook", "Retry install with hook",
3292- "install_error", "installed", alias="retry_hook",
3293- success_transition_id="start"),
3294+ "install_error", "installed", alias="retry_hook"),
3295
3296 # Start transitions
3297 Transition("start", "Start", "installed", "started",
3298- error_transition_id="error_start"),
3299+ error_transition_id="error_start", automatic=True),
3300 Transition("error_start", "Start error", "installed", "start_error"),
3301 Transition("retry_start", "Retry start", "start_error", "started",
3302 alias="retry"),
3303@@ -48,33 +46,38 @@
3304 "upgrade_charm", "Upgrade", "started", "started",
3305 error_transition_id="upgrade_charm_error"),
3306 Transition(
3307- "upgrade_charm_error", "Upgrade from stop error",
3308+ "upgrade_charm_error", "Upgrade error",
3309 "started", "charm_upgrade_error"),
3310 Transition(
3311- "retry_upgrade_charm", "Upgrade from stop error",
3312- "charm_upgrade_error", "started", alias="retry"),
3313- Transition(
3314- "retry_upgrade_charm_hook", "Upgrade from stop error with hook",
3315- "charm_upgrade_error", "started", alias="retry_hook"),
3316+ "retry_upgrade_charm_error", "Upgrade error",
3317+ "charm_upgrade_error", "charm_upgrade_error"),
3318+ Transition(
3319+ "retry_upgrade_charm", "Retry upgrade",
3320+ "charm_upgrade_error", "started", alias="retry",
3321+ error_transition_id="retry_upgrade_charm_error"),
3322+ Transition(
3323+ "retry_upgrade_charm_hook", "Retry upgrade with hook",
3324+ "charm_upgrade_error", "started", alias="retry_hook",
3325+ error_transition_id="retry_upgrade_charm_error"),
3326
3327 # Configuration Transitions
3328 Transition(
3329- "reconfigure", "Reconfigure", "started", "started",
3330+ "configure", "Configure", "started", "started",
3331 error_transition_id="error_configure"),
3332 Transition(
3333 "error_configure", "On configure error",
3334 "started", "configure_error"),
3335 Transition(
3336- "retry_error", "On retry configure error",
3337+ "error_retry_configure", "On retry configure error",
3338 "configure_error", "configure_error"),
3339 Transition(
3340 "retry_configure", "Retry configure",
3341 "configure_error", "started", alias="retry",
3342- error_transition_id="retry_error"),
3343+ error_transition_id="error_retry_configure"),
3344 Transition(
3345 "retry_configure_hook", "Retry configure with hooks",
3346 "configure_error", "started", alias="retry_hook",
3347- error_transition_id="retry_error")
3348+ error_transition_id="error_retry_configure")
3349 )
3350
3351
3352@@ -110,7 +113,7 @@
3353 # relation would continues to schedule pending hooks
3354
3355 RelationWorkflow = Workflow(
3356- Transition("start", "Start", None, "up"),
3357+ Transition("start", "Start", None, "up", automatic=True),
3358 Transition("stop", "Stop", "up", "down"),
3359 Transition("restart", "Restart", "down", "up", alias="retry"),
3360 Transition("error", "Relation hook error", "up", "error"),
3361@@ -291,6 +294,7 @@
3362 with open(self.state_file_path, "r") as handle:
3363 content = handle.read()
3364
3365+ # TODO load ZK state and overwrite with disk state if different?
3366 return yaml.load(content)
3367
3368
3369@@ -333,28 +337,47 @@
3370 def _invoke_lifecycle(self, method, *args, **kw):
3371 try:
3372 result = yield method(*args, **kw)
3373- except (FileNotFound, CharmError, CharmInvocationError), e:
3374+ except (FileNotFound, CharmError) as e:
3375 raise TransitionError(e)
3376 returnValue(result)
3377
3378 @inlineCallbacks
3379+ def _get_preconditions(self):
3380+ """Given StateMachine state, return expected executor/lifecycle state.
3381+
3382+ :return: (run_executor, run_lifecycle)
3383+
3384+ Once the executor and lifecycle are in the expected state, it should
3385+ be safe to call StateMachine.synchronize(), and to run other
3386+ transitions as appropriate.
3387+ """
3388+ mid_upgrade = (False, True)
3389+ started = (True, True)
3390+ other = (True, False)
3391+ state = yield self.get_state()
3392+ if state == "charm_upgrade_error":
3393+ returnValue(mid_upgrade)
3394+ if state == "started":
3395+ if (yield self.get_inflight()) == "upgrade_charm":
3396+ # We don't want any risk of queued hooks firing while we're in
3397+ # a potentially-broken mid-upgrade state.
3398+ returnValue(mid_upgrade)
3399+ returnValue(started)
3400+ returnValue(other)
3401+
3402+ @inlineCallbacks
3403 def synchronize(self, executor):
3404 """Ensure the workflow's lifecycle is in the correct state, given
3405 current zookeeper state.
3406
3407 :param executor: the unit agent's shared HookExecutor, which should not
3408- run if we come up in (or detect and switch to) the
3409- "charm_upgrade_error" state.
3410+ run if we come up during an incomplete charm upgrade.
3411
3412 In addition, if the lifecycle has never been started before, the
3413 necessary state transitions are run.
3414 """
3415- state = yield self.get_state()
3416- run_executor, run_lifecycle = True, False
3417- if state == "started":
3418- run_lifecycle = True
3419- elif state == "charm_upgrade_error":
3420- run_executor, run_lifecycle = False, True
3421+ self._assert_locked()
3422+ run_executor, run_lifecycle = yield self._get_preconditions()
3423
3424 if run_executor:
3425 if not executor.running:
3426@@ -369,13 +392,7 @@
3427 elif self._lifecycle.running:
3428 yield self._lifecycle.stop(fire_hooks=False)
3429
3430- # At this point, prior state (if any) has been fully restored, and
3431- # we can run state transitions as usual; fire the standard startup ones
3432- # if they haven't completed yet.
3433- if state is None:
3434- yield self.fire_transition("install")
3435- if state == "installed":
3436- yield self.fire_transition("start")
3437+ yield super(UnitWorkflowState, self).synchronize()
3438
3439 # Install transitions
3440 def do_install(self):
3441@@ -411,6 +428,7 @@
3442 return self._invoke_lifecycle(self._lifecycle.stop)
3443
3444 # Upgrade transititions
3445+
3446 def do_upgrade_charm(self):
3447 return self._invoke_lifecycle(self._lifecycle.upgrade_charm)
3448
3449@@ -425,10 +443,10 @@
3450 def do_error_configure(self):
3451 return self._invoke_lifecycle(self._lifecycle.stop, fire_hooks=False)
3452
3453- def do_reconfigure(self):
3454+ def do_configure(self):
3455 return self._invoke_lifecycle(self._lifecycle.configure)
3456
3457- def do_retry_error(self):
3458+ def do_error_retry_configure(self):
3459 return self._invoke_lifecycle(self._lifecycle.stop, fire_hooks=False)
3460
3461 @inlineCallbacks
3462@@ -465,6 +483,7 @@
3463 In addition, if the lifecycle has never been started before, the
3464 necessary state transitions are run.
3465 """
3466+ self._assert_locked()
3467 state = yield self.get_state()
3468 if state == "up":
3469 watches, scheduler = True, True
3470@@ -478,8 +497,7 @@
3471 yield self._lifecycle.start(
3472 start_watches=watches, start_scheduler=scheduler)
3473
3474- if state is None:
3475- yield self.fire_transition("start")
3476+ yield super(RelationWorkflowState, self).synchronize()
3477
3478 @property
3479 def lifecycle(self):
3480@@ -500,9 +518,10 @@
3481
3482 @param: error: The error from hook invocation.
3483 """
3484- yield self.fire_transition("error",
3485- change_type=relation_change.change_type,
3486- error_message=str(error))
3487+ with (yield self.lock()):
3488+ yield self.fire_transition("error",
3489+ change_type=relation_change.change_type,
3490+ error_message=str(error))
3491
3492 @inlineCallbacks
3493 def do_stop(self):
3494@@ -548,33 +567,23 @@
3495
3496 @inlineCallbacks
3497 def do_depart(self):
3498- """Transition a relation to the departed state, from the up state.
3499- """
3500- # Stop related unit watches and change hook execution.
3501- yield self._lifecycle.stop()
3502- result = yield self._do_depart()
3503- returnValue(result)
3504-
3505- def do_down_depart(self):
3506- """Transition a relation to the departed state, from the down state.
3507- """
3508- return self._do_depart()
3509-
3510- @inlineCallbacks
3511- def _do_depart(self):
3512- """Execute the depart hook.
3513+ """Transition a relation to the departed state, from any state.
3514
3515 We ignore hook errors, as we won't logically process any additional
3516 events for the relation once it doesn't exist. However we do
3517 note the error in the log.
3518 """
3519- # To avoid the relation-changed hook error handler being used,
3520- # set the handler to None, so the exception is raised.
3521+ # Ensure that no further relation hook executions can occur.
3522+ yield self._lifecycle.stop()
3523+
3524+ # Handle errors ourselves, don't try to transition again
3525 self._lifecycle.set_hook_error_handler(None)
3526-
3527 try:
3528 yield self._lifecycle.depart()
3529 except Exception, e:
3530 self._log.error("Depart hook error, ignoring: %s", str(e))
3531 returnValue({"change_type": "depart",
3532 "error_message": str(e)})
3533+
3534+ do_down_depart = do_depart
3535+ do_error_depart = do_depart

Subscribers

People subscribed via source and target branches

to status/vote changes: