Merge lp:~axwalk/juju-core/lp1176740-destroy-unit-short-circuit into lp:~go-bot/juju-core/trunk

Proposed by Andrew Wilkins
Status: Work in progress
Proposed branch: lp:~axwalk/juju-core/lp1176740-destroy-unit-short-circuit
Merge into: lp:~go-bot/juju-core/trunk
Diff against target: 65 lines (+21/-2)
2 files modified
worker/uniter/modes.go (+5/-0)
worker/uniter/uniter_test.go (+16/-2)
To merge this branch: bzr merge lp:~axwalk/juju-core/lp1176740-destroy-unit-short-circuit
Reviewer Review Type Date Requested Status
Juju Engineering Pending
Review via email: mp+178694@code.launchpad.net

Description of the change

destroy-unit: respond to Dying in hook-failure

If the user invokes destroy-unit while a unit
is in a hook-error state, the uniter should
respond to the dying state as if it were in
steady state.

If the unit hasn't started yet (e.g. the install
hook failed), then the uniter should short-circuit
the termination handling.

https://codereview.appspot.com/12517043/

To post a comment you must log in.
Revision history for this message
Andrew Wilkins (axwalk) wrote :

Reviewers: mp+178694_code.launchpad.net,

Message:
Please take a look.

Description:
destroy-unit: respond to Dying in hook-failure

If the user invokes destroy-unit while a unit
is in a hook-error state, the uniter should
respond to the dying state as if it were in
steady state.

If the unit hasn't started yet (e.g. the install
hook failed), then the uniter should short-circuit
the termination handling.

https://code.launchpad.net/~axwalk/juju-core/lp1176740-destroy-unit-short-circuit/+merge/178694

(do not edit description out of merge proposal)

Please review this at https://codereview.appspot.com/12517043/

Affected files:
   A [revision details]
   M worker/uniter/modes.go
   M worker/uniter/uniter_test.go

Index: [revision details]
=== added file '[revision details]'
--- [revision details] 2012-01-01 00:00:00 +0000
+++ [revision details] 2012-01-01 00:00:00 +0000
@@ -0,0 +1,2 @@
+Old revision: tarmac-20130806053414-fbcxuyvncbfz3m8n
+New revision: <email address hidden>

Index: worker/uniter/modes.go
=== modified file 'worker/uniter/modes.go'
--- worker/uniter/modes.go 2013-08-02 12:37:43 +0000
+++ worker/uniter/modes.go 2013-08-06 09:06:11 +0000
@@ -287,6 +287,9 @@
   if err := u.unit.Refresh(); err != nil {
    return nil, err
   }
+ if !u.s.Started {
+ return ModeTerminating, nil
+ }
   for _, name := range u.unit.SubordinateNames() {
    if sub, err := u.st.Unit(name); errors.IsNotFoundError(err) {
     continue
@@ -341,6 +344,8 @@
    select {
    case <-u.tomb.Dying():
     return nil, tomb.ErrDying
+ case <-u.f.UnitDying():
+ return modeAbideDyingLoop(u)
    case rm := <-u.f.ResolvedEvents():
     switch rm {
     case state.ResolvedRetryHooks:

Index: worker/uniter/uniter_test.go
=== modified file 'worker/uniter/uniter_test.go'
--- worker/uniter/uniter_test.go 2013-08-02 12:37:43 +0000
+++ worker/uniter/uniter_test.go 2013-08-06 09:06:11 +0000
@@ -490,22 +490,36 @@
   ), ut(
    "hook error service dying",
    startupError{"start"},
- serviceDying,
    verifyWaiting{},
    fixHook{"start"},
    resolveError{state.ResolvedRetryHooks},
+ waitUnit{status: params.StatusStarted},
+ serviceDying,
    waitHooks{"start", "config-changed", "stop"},
    waitUniterDead{},
   ), ut(
+ "hook error service dying (not started)",
+ startupError{"start"},
+ serviceDying,
+ waitHooks{},
+ waitUniterDead{},
+ ), ut(
    "hook error unit dying",
    startupError{"start"},
- unitDying,
    verifyWaiting{},
    fixHook{"start"},
    resolveError{state.ResolvedRetryHooks},
+ waitUnit{status: params.StatusStarted},
+ unitDying,
    waitHooks{"start", "config-changed", "stop"},
    waitUniterDead{},
   ), ut(
+ "hook error unit dying (not started)",
+ startupError{"start"},
+ unitDying,
+ waitHooks{},
+ waitUniterDead{},
+ ), ut(
    "hook error unit dead",
    startupError{"start"},
    unitDead,

Revision history for this message
Dimiter Naydenov (dimitern) wrote :
Revision history for this message
William Reade (fwereade) wrote :

Sorry, NOT LGTM as it stands. I think it's a matter of:

select {
case <- u.f.UnitDying():
     ...
default:
}

...at appropriate points in the modes which *can* run before we're
started, and short-circuiting to ModeTerminating only out of those. Once
relations and hook errors could be involved, the proposed approach
becomes, er, risky at best.

https://codereview.appspot.com/12517043/diff/1/worker/uniter/modes.go
File worker/uniter/modes.go (right):

https://codereview.appspot.com/12517043/diff/1/worker/uniter/modes.go#newcode291
worker/uniter/modes.go:291: return ModeTerminating, nil
Hitherto, were we in ModeAbide, we must have been started.

https://codereview.appspot.com/12517043/diff/1/worker/uniter/modes.go#newcode348
worker/uniter/modes.go:348: return modeAbideDyingLoop(u)
I don't think this is right. If we've had a hook error we're in an
unknown state, and subsequent hooks are also likely to fail; and if we
fail a -broken hook we won't depart the relation; and if a unit goes
away without departing all its relations, those relations (and their
services) will become unkillable. Which would be a Bad Thing.

https://codereview.appspot.com/12517043/

Revision history for this message
William Reade (fwereade) wrote :

On 2013/08/09 13:52:20, fwereade wrote:

https://codereview.appspot.com/12517043/diff/1/worker/uniter/modes.go#newcode348
> worker/uniter/modes.go:348: return modeAbideDyingLoop(u)
> I don't think this is right. If we've had a hook error we're in an
unknown
> state, and subsequent hooks are also likely to fail; and if we fail a
-broken
> hook we won't depart the relation; and if a unit goes away without
departing all
> its relations, those relations (and their services) will become
unkillable.
> Which would be a Bad Thing.

(basically: whatever approach we take, we need to be sure that a unit
killed while in relation scope correctly leaves all relation scopes
before killing itself

https://codereview.appspot.com/12517043/

Revision history for this message
Andrew Wilkins (axwalk) wrote :

> ...at appropriate points in the modes which *can* run
> before we're started, and short-circuiting to
> ModeTerminating only out of those. Once relations and hook
> errors could be involved, the proposed approach becomes,
> er, risky at best.

(Based on my immature knowledge.)

A hook error can occur before the unit is "started": if either of the
install or config-changed hooks fail. Moreover, I don't see any blocking
states where the unit is not started, and not in ModeHookError. Please
see my proposed solution in the inline response.

https://codereview.appspot.com/12517043/diff/1/worker/uniter/modes.go
File worker/uniter/modes.go (right):

https://codereview.appspot.com/12517043/diff/1/worker/uniter/modes.go#newcode291
worker/uniter/modes.go:291: return ModeTerminating, nil
On 2013/08/09 13:52:20, fwereade wrote:
> Hitherto, were we in ModeAbide, we must have been started.

Understood.

https://codereview.appspot.com/12517043/diff/1/worker/uniter/modes.go#newcode348
worker/uniter/modes.go:348: return modeAbideDyingLoop(u)
On 2013/08/09 13:52:20, fwereade wrote:
> I don't think this is right. If we've had a hook error we're in an
unknown
> state, and subsequent hooks are also likely to fail; and if we fail a
-broken
> hook we won't depart the relation; and if a unit goes away without
departing all
> its relations, those relations (and their services) will become
unkillable.
> Which would be a Bad Thing.

Sorry, a little more subtle than I thought. Here's what I was thinking
here:

In modeAbideDyingLoop, (unless the unit hasn't started) each of the
unit's relationers is set to dying, and then the unit then waits for
relation hooks. If the hook fails, then we go back to ModeHookError,
whereupon we'll return to modeAbideDyingLoop on account of the unit
being in Dying state, and continue to execute hooks until they're all
depleted.

So, from your comments it sounds like the unit must remain in Dying
state until the user resolves whatever problem is preventing the -broken
hook from successfully completing.

Anyway, I overshot; I should only have modified behaviour for when the
unit hasn't started. How about I modify ModeHookError's new select
clause to be a function call that returns nil if the agent is started,
else u.f.UnitDying()? And then go straight to ModeTerminating. AFAICT,
if a unit is blocked and not started, then it is in ModeHookError
(install or config-changed).

https://codereview.appspot.com/12517043/

Unmerged revisions

1605. By Andrew Wilkins

destroy-unit: respond to dying state in hook-failure

If the user invokes destroy-unit while a unit
is in a hook-error state, the uniter should
respond to the dying state as if it were in
steady state.

If the unit hasn't started yet (e.g. the install
hook failed), then the uniter should short-circuit
the termination handling.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'worker/uniter/modes.go'
2--- worker/uniter/modes.go 2013-08-02 12:37:43 +0000
3+++ worker/uniter/modes.go 2013-08-06 09:18:32 +0000
4@@ -287,6 +287,9 @@
5 if err := u.unit.Refresh(); err != nil {
6 return nil, err
7 }
8+ if !u.s.Started {
9+ return ModeTerminating, nil
10+ }
11 for _, name := range u.unit.SubordinateNames() {
12 if sub, err := u.st.Unit(name); errors.IsNotFoundError(err) {
13 continue
14@@ -341,6 +344,8 @@
15 select {
16 case <-u.tomb.Dying():
17 return nil, tomb.ErrDying
18+ case <-u.f.UnitDying():
19+ return modeAbideDyingLoop(u)
20 case rm := <-u.f.ResolvedEvents():
21 switch rm {
22 case state.ResolvedRetryHooks:
23
24=== modified file 'worker/uniter/uniter_test.go'
25--- worker/uniter/uniter_test.go 2013-08-02 12:37:43 +0000
26+++ worker/uniter/uniter_test.go 2013-08-06 09:18:32 +0000
27@@ -490,22 +490,36 @@
28 ), ut(
29 "hook error service dying",
30 startupError{"start"},
31- serviceDying,
32 verifyWaiting{},
33 fixHook{"start"},
34 resolveError{state.ResolvedRetryHooks},
35+ waitUnit{status: params.StatusStarted},
36+ serviceDying,
37 waitHooks{"start", "config-changed", "stop"},
38 waitUniterDead{},
39 ), ut(
40+ "hook error service dying (not started)",
41+ startupError{"start"},
42+ serviceDying,
43+ waitHooks{},
44+ waitUniterDead{},
45+ ), ut(
46 "hook error unit dying",
47 startupError{"start"},
48- unitDying,
49 verifyWaiting{},
50 fixHook{"start"},
51 resolveError{state.ResolvedRetryHooks},
52+ waitUnit{status: params.StatusStarted},
53+ unitDying,
54 waitHooks{"start", "config-changed", "stop"},
55 waitUniterDead{},
56 ), ut(
57+ "hook error unit dying (not started)",
58+ startupError{"start"},
59+ unitDying,
60+ waitHooks{},
61+ waitUniterDead{},
62+ ), ut(
63 "hook error unit dead",
64 startupError{"start"},
65 unitDead,

Subscribers

People subscribed via source and target branches

to status/vote changes: