Merge lp:~hduran-8/juju-core/add_state_network_check_on_specified_machine into lp:~go-bot/juju-core/trunk

Proposed by Horacio Durán
Status: Work in progress
Proposed branch: lp:~hduran-8/juju-core/add_state_network_check_on_specified_machine
Merge into: lp:~go-bot/juju-core/trunk
Diff against target: 415 lines (+278/-68)
3 files modified
state/machine.go (+42/-0)
state/network_assign_test.go (+88/-0)
state/unit.go (+148/-68)
To merge this branch: bzr merge lp:~hduran-8/juju-core/add_state_network_check_on_specified_machine
Reviewer Review Type Date Requested Status
Juju Engineering Pending
Review via email: mp+215143@code.launchpad.net

Description of the change

Add check for requested networks compatibility

This is still work in progress, the idea is to
check if the requested networks for a unit and
for a machine are compatible.
So far check looks that there is no requesting
to inlude nets excluded in the other and vice
versa.

https://codereview.appspot.com/86430043/

To post a comment you must log in.
2580. By Horacio Durán

Changed the way networks are obtained.
Fixed the corresponding tests
Added a draft of check for changes

2581. By Horacio Durán

Go Formatted, added extra comments for context

Revision history for this message
William Reade (fwereade) wrote :
Download full text (6.0 KiB)

firehose of suggestions :)

https://codereview.appspot.com/86430043/diff/20001/state/unit.go
File state/unit.go (right):

https://codereview.appspot.com/86430043/diff/20001/state/unit.go#newcode539
state/unit.go:539: return readRequestedNetworks(u.st, u.globalKey())
don't think we need this method now

https://codereview.appspot.com/86430043/diff/20001/state/unit.go#newcode833
state/unit.go:833: machineIncludedNetworks, machineExcludedNetworks, _
:= m.RequestedNetworks()
Basically never ignore errors. Certainly not here ;p.

https://codereview.appspot.com/86430043/diff/20001/state/unit.go#newcode839
state/unit.go:839: unitIncludedNetworks, unitExcludedNetworks, _ :=
networked_service.Networks()
readRequestedNetworks(u.st, serviceGlobalKey(u.doc.Service))

and don't ignore errors :).

https://codereview.appspot.com/86430043/diff/20001/state/unit.go#newcode852
state/unit.go:852: return fmt.Errorf("The requested network %q is
explicitly excluded in the requested machine %q ", includedNetwork,
m.Id())
We tend not to capitalise error messages

https://codereview.appspot.com/86430043/diff/20001/state/unit.go#newcode859
state/unit.go:859: }
This all might be easier with the methods on utils/set.Strings

https://codereview.appspot.com/86430043/diff/20001/state/unit.go#newcode864
state/unit.go:864: Assert: bson.D{{"txn-revno",
networked_service.doc.TxnRevno}},
We definitely don't want to assert on the service doc's txn-revno -- we
care about changes to the networks, which are on a different doc.

Ofc, you don't have access to them at the moment. I'd add a TxnRevno
field to the requestedNetworks doc, and have readRequestedNetworks
return that too; the RequestedNetworks methods can ignore it, and this
method can use it in txns.

Make sure you return -1 if it's missing, though, and assert DocMissing
:).

https://codereview.appspot.com/86430043/diff/20001/state/unit.go#newcode866
state/unit.go:866: err = u.st.runTransaction(ops)
When building this sort of construction:

if err := u.st.runTransaction(ops); err != txn.ErrAborted {
     return err
}

...replaces this line and the subsequent two blocks.

*but* asserting at this point does us no good -- we have to build the
correct asserts and include them in the transaction, below, that
actually does the assignment.

https://codereview.appspot.com/86430043/diff/20001/state/unit.go#newcode887
state/unit.go:887: assert := append(isAliveDoc, bson.D{
(rename this unitAssert for clarity's sake)

https://codereview.appspot.com/86430043/diff/20001/state/unit.go#newcode893
state/unit.go:893: massert := isAliveDoc
(and this to machineAssert)

https://codereview.appspot.com/86430043/diff/20001/state/unit.go#newcode932
state/unit.go:932: }
I would recommend starting off by refactoring the method to have the
shape we want, so it can handle retries, and making sure all original
tests still pass; and then adding in the extra bits. So, something like:

func (u *Unit) assignToMachine(m *Machine, unused bool) (err error) {
     // On the expectation that we'll have more than one possible nil
return,
     // defer the updates to local state.
     defer func() {
         if err != nil {
             u.doc.MachineId = m.doc.Id...

Read more...

2582. By Horacio Durán

Refactored assignToMachine Added proper tests in separate file, added clone to unit and machine

2583. By Horacio Durán

Go formatted

2584. By Horacio Durán

Go formatte

2585. By Horacio Durán

Ensured there is a retry when txnrevno of unit does not match

Revision history for this message
William Reade (fwereade) wrote :

couple of comments, let's chat live

https://codereview.appspot.com/86430043/diff/40001/state/unit.go
File state/unit.go (right):

https://codereview.appspot.com/86430043/diff/40001/state/unit.go#newcode869
state/unit.go:869: if unit.doc.MachineId != machine.Id() {
This in particular needs to go inside the loop -- unit and machine
assignments can change.

(also, should we be checking that the machine's principals are empty if
unused is true? I think we should...)

https://codereview.appspot.com/86430043/diff/40001/state/unit.go#newcode881
state/unit.go:881: if err := unit.st.supportsUnitPlacement(); err != nil
{
I don't think this is mutable state, is it? we can check it outside the
loop.

https://codereview.appspot.com/86430043/diff/40001/state/unit.go#newcode923
state/unit.go:923: }
Everything from here up to the top of the loop feels like it should all
be inside the assignToMachineOps method -- calculating the ops and
verifying that they're sane should go hand in hand. (this applies to
stuff like the life checks too -- check them while building the txn,
don't just build it blind, wait for it to fail, and then look up why.)

https://codereview.appspot.com/86430043/diff/40001/state/unit.go#newcode940
state/unit.go:940: case unit.doc.Life != Alive:
You shouldn't need this code here. It should all be checked as we
construct the transaction -- we have access to latest-known-state up at
the top of the loop, and we can and should check unit life etc *before*
we try to run a transaction asserting that it's the case.

Apart from anything else, you're inspecting old state here -- it doesn't
necessarily have any bearing on why the txn mechanism rejected the ops.

https://codereview.appspot.com/86430043/

Revision history for this message
Horacio Durán (hduran-8) wrote :

Wow i was not aware it was so bad, right now i am on holiday and on a
touristic place very far from a pc. ill Ping you on monday as soon as i
begin my day

El jueves, 17 de abril de 2014, <email address hidden> escribió:

> couple of comments, let's chat live
>
>
> https://codereview.appspot.com/86430043/diff/40001/state/unit.go
> File state/unit.go (right):
>
> https://codereview.appspot.com/86430043/diff/40001/state/
> unit.go#newcode869
> state/unit.go:869: if unit.doc.MachineId != machine.Id() {
> This in particular needs to go inside the loop -- unit and machine
> assignments can change.
>
> (also, should we be checking that the machine's principals are empty if
> unused is true? I think we should...)
>
> https://codereview.appspot.com/86430043/diff/40001/state/
> unit.go#newcode881
> state/unit.go:881: if err := unit.st.supportsUnitPlacement(); err != nil
> {
> I don't think this is mutable state, is it? we can check it outside the
> loop.
>
> https://codereview.appspot.com/86430043/diff/40001/state/
> unit.go#newcode923
> state/unit.go:923: }
> Everything from here up to the top of the loop feels like it should all
> be inside the assignToMachineOps method -- calculating the ops and
> verifying that they're sane should go hand in hand. (this applies to
> stuff like the life checks too -- check them while building the txn,
> don't just build it blind, wait for it to fail, and then look up why.)
>
> https://codereview.appspot.com/86430043/diff/40001/state/
> unit.go#newcode940
> state/unit.go:940: case unit.doc.Life != Alive:
> You shouldn't need this code here. It should all be checked as we
> construct the transaction -- we have access to latest-known-state up at
> the top of the loop, and we can and should check unit life etc *before*
> we try to run a transaction asserting that it's the case.
>
> Apart from anything else, you're inspecting old state here -- it doesn't
> necessarily have any bearing on why the txn mechanism rejected the ops.
>
> https://codereview.appspot.com/86430043/
>

Unmerged revisions

2585. By Horacio Durán

Ensured there is a retry when txnrevno of unit does not match

2584. By Horacio Durán

Go formatte

2583. By Horacio Durán

Go formatted

2582. By Horacio Durán

Refactored assignToMachine Added proper tests in separate file, added clone to unit and machine

2581. By Horacio Durán

Go Formatted, added extra comments for context

2580. By Horacio Durán

Changed the way networks are obtained.
Fixed the corresponding tests
Added a draft of check for changes

2579. By Horacio Durán

Ran go fmt

2578. By Horacio Durán

Added rough check for network compatibility when adding to machine

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'state/machine.go'
2--- state/machine.go 2014-04-13 11:06:42 +0000
3+++ state/machine.go 2014-04-14 23:52:31 +0000
4@@ -134,6 +134,48 @@
5 return machine
6 }
7
8+func (m *Machine) Clone() *Machine {
9+ machineClone := &Machine{
10+ st: m.st,
11+ doc: m.doc,
12+ }
13+
14+ machinePrincipals := make([]string, len(m.doc.Principals))
15+ for key, value := range m.doc.Principals {
16+ machinePrincipals[key] = value
17+ }
18+ machineClone.doc.Principals = machinePrincipals
19+ machineJobs := make([]MachineJob, len(m.doc.Jobs))
20+ for key, value := range m.doc.Jobs {
21+ machineJobs[key] = value
22+ }
23+ machineClone.doc.Jobs = machineJobs
24+ machineAddresses := make([]address, len(m.doc.Addresses))
25+ for key, value := range m.doc.Addresses {
26+ machineAddresses[key] = value
27+ }
28+ machineClone.doc.Addresses = machineAddresses
29+ machineMachineAddresses := make([]address, len(m.doc.MachineAddresses))
30+ for key, value := range m.doc.MachineAddresses {
31+ machineMachineAddresses[key] = value
32+ }
33+ machineClone.doc.MachineAddresses = machineMachineAddresses
34+
35+ machineSupportedContainers := make([]instance.ContainerType, len(m.doc.SupportedContainers))
36+ for key, value := range m.doc.SupportedContainers {
37+ machineSupportedContainers[key] = value
38+ }
39+ machineClone.doc.SupportedContainers = machineSupportedContainers
40+
41+ machineClone.annotator = annotator{
42+ globalKey: machineClone.globalKey(),
43+ tag: machineClone.Tag(),
44+ st: machineClone.st,
45+ }
46+ return machineClone
47+
48+}
49+
50 // Id returns the machine id.
51 func (m *Machine) Id() string {
52 return m.doc.Id
53
54=== added file 'state/network_assign_test.go'
55--- state/network_assign_test.go 1970-01-01 00:00:00 +0000
56+++ state/network_assign_test.go 2014-04-14 23:52:31 +0000
57@@ -0,0 +1,88 @@
58+// Copyright 2012, 2013 Canonical Ltd.
59+// Licensed under the AGPLv3, see LICENCE file for details.
60+
61+package state_test
62+
63+import (
64+ "fmt"
65+
66+ gc "launchpad.net/gocheck"
67+
68+ "launchpad.net/juju-core/state"
69+)
70+
71+type NetworkedUnitSuite struct {
72+ UnitSuite
73+}
74+
75+var _ = gc.Suite(&NetworkedUnitSuite{})
76+
77+func (s *NetworkedUnitSuite) SetUpTest(c *gc.C) {
78+ includeNetworks := []string{"net1", "net3", "net5"}
79+ excludeNetworks := []string{"net2", "net4", "net6"}
80+ s.ConnSuite.SetUpTest(c)
81+ s.charm = s.AddTestingCharm(c, "wordpress")
82+ var err error
83+ s.service = s.AddTestingServiceWithNetworks(c, "wordpress", s.charm, includeNetworks, excludeNetworks)
84+ c.Assert(err, gc.IsNil)
85+ s.unit, err = s.service.AddUnit()
86+ c.Assert(err, gc.IsNil)
87+ c.Assert(s.unit.Series(), gc.Equals, "quantal")
88+}
89+
90+func (s *NetworkedUnitSuite) TestIncludeExcludeNetworksCompatibilityCheckPasses(c *gc.C) {
91+ includeNetworks := []string{"net1", "net3", "net5"}
92+ excludeNetworks := []string{"net2", "net4", "net6"}
93+
94+ template := state.MachineTemplate{
95+ Series: "quantal",
96+ Jobs: []state.MachineJob{state.JobHostUnits},
97+ IncludeNetworks: includeNetworks,
98+ ExcludeNetworks: excludeNetworks,
99+ }
100+
101+ machine, err := s.State.AddOneMachine(template)
102+ c.Assert(err, gc.IsNil)
103+
104+ // TODO: (hduran-8) I really need to find a way to trigger reload
105+ // So I can test that there is a retry but could not think
106+ // a proper transactionhook
107+
108+ err = s.unit.AssignToMachine(machine)
109+ c.Assert(err, gc.IsNil)
110+}
111+
112+func (s *NetworkedUnitSuite) TestExcludeNetworksCompatibilityCheckFails(c *gc.C) {
113+ includeNetworks := []string{"net2", "net4", "net6"}
114+ excludeNetworks := []string{"net1", "net3", "net5"}
115+
116+ template := state.MachineTemplate{
117+ Series: "quantal",
118+ Jobs: []state.MachineJob{state.JobHostUnits},
119+ IncludeNetworks: includeNetworks,
120+ ExcludeNetworks: excludeNetworks,
121+ }
122+ machine, err := s.State.AddOneMachine(template)
123+ c.Assert(err, gc.IsNil)
124+ err = s.unit.AssignToMachine(machine)
125+ expected_error := fmt.Errorf("cannot assign unit \"wordpress/0\" to machine 0: the requested networks [\"net1\" \"net3\" \"net5\"] are explicitly excluded in the requested machine \"0\" ")
126+ c.Assert(err, gc.DeepEquals, expected_error)
127+
128+}
129+
130+func (s *NetworkedUnitSuite) TestInvludeNetworksCompatibilityCheckFails(c *gc.C) {
131+ includeNetworks := []string{"net2", "net4", "net6"}
132+ excludeNetworks := []string{"net7", "net9", "net11"}
133+
134+ template := state.MachineTemplate{
135+ Series: "quantal",
136+ Jobs: []state.MachineJob{state.JobHostUnits},
137+ IncludeNetworks: includeNetworks,
138+ ExcludeNetworks: excludeNetworks,
139+ }
140+ machine, err := s.State.AddOneMachine(template)
141+ c.Assert(err, gc.IsNil)
142+ err = s.unit.AssignToMachine(machine)
143+ expected_error := fmt.Errorf("cannot assign unit \"wordpress/0\" to machine 0: the requested networks [\"net2\" \"net4\" \"net6\"] are explicitly included in the requested machine \"0\" ")
144+ c.Assert(err, gc.DeepEquals, expected_error)
145+}
146
147=== modified file 'state/unit.go'
148--- state/unit.go 2014-04-12 05:53:58 +0000
149+++ state/unit.go 2014-04-14 23:52:31 +0000
150@@ -22,6 +22,7 @@
151 "launchpad.net/juju-core/state/presence"
152 "launchpad.net/juju-core/tools"
153 "launchpad.net/juju-core/utils"
154+ "launchpad.net/juju-core/utils/set"
155 "launchpad.net/juju-core/version"
156 )
157
158@@ -102,6 +103,32 @@
159 return unit
160 }
161
162+// Provides a reazonably deep copy of the given unit
163+func (u *Unit) Clone() *Unit {
164+ // The following fields have been omitted from explicit copy
165+ // because I did not find reason to deep copy them
166+ // CharmURL *charm.URL
167+ // Tools *tools.Tools `bson:",omitempty"`
168+ unitClone := &Unit{st: u.st, doc: u.doc}
169+ unitSubordinates := make([]string, len(u.doc.Subordinates))
170+ for key, value := range u.doc.Subordinates {
171+ unitSubordinates[key] = value
172+ }
173+ unitClone.doc.Subordinates = unitSubordinates
174+ unitPorts := make([]instance.Port, len(u.doc.Ports))
175+ for key, value := range u.doc.Ports {
176+ unitPorts[key] = value
177+ }
178+ unitClone.doc.Ports = unitPorts
179+
180+ unitClone.annotator = annotator{
181+ globalKey: unitClone.globalKey(),
182+ tag: unitClone.Tag(),
183+ st: unitClone.st,
184+ }
185+ return unitClone
186+}
187+
188 // Service returns the service.
189 func (u *Unit) Service() (*Service, error) {
190 return u.st.Service(u.doc.Service)
191@@ -269,7 +296,7 @@
192 u.doc.Life = Dying
193 }
194 }()
195- unit := &Unit{st: u.st, doc: u.doc}
196+ unit := u.Clone()
197 for i := 0; i < 5; i++ {
198 switch ops, err := unit.destroyOps(); err {
199 case errRefresh:
200@@ -441,7 +468,7 @@
201
202 // Now we're sure we haven't left any scopes occupied by this unit, we
203 // can safely remove the document.
204- unit := &Unit{st: u.st, doc: u.doc}
205+ unit := u.Clone()
206 for i := 0; i < 5; i++ {
207 switch ops, err := unit.removeOps(isDeadDoc); err {
208 case errRefresh:
209@@ -816,6 +843,33 @@
210 inUseErr = stderrors.New("machine is not unused")
211 )
212
213+func (u *Unit) assignToMachineOps(m *Machine, unused bool) []txn.Op {
214+ assert := append(isAliveDoc, bson.D{
215+ {"$or", []bson.D{
216+ {{"machineid", ""}},
217+ {{"machineid", m.Id()}},
218+ }},
219+ }...)
220+ massert := isAliveDoc
221+ if unused {
222+ massert = append(massert, bson.D{{"clean", bson.D{{"$ne", false}}}}...)
223+ }
224+ assert = append(assert, bson.D{{"txn-revno", u.doc.TxnRevno}}...)
225+ ops := []txn.Op{
226+ {
227+ C: u.st.units.Name,
228+ Id: u.doc.Name,
229+ Assert: assert,
230+ Update: bson.D{{"$set", bson.D{{"machineid", m.doc.Id}}}},
231+ }, {
232+ C: u.st.machines.Name,
233+ Id: m.doc.Id,
234+ Assert: massert,
235+ Update: bson.D{{"$addToSet", bson.D{{"principals", u.doc.Name}}}, {"$set", bson.D{{"clean", false}}}},
236+ }}
237+ return ops
238+}
239+
240 // assignToMachine is the internal version of AssignToMachine,
241 // also used by AssignToUnusedMachine. It returns specific errors
242 // in some cases:
243@@ -824,80 +878,106 @@
244 // - alreadyAssignedErr when the unit has already been assigned
245 // - inUseErr when the machine already has a unit assigned (if unused is true)
246 func (u *Unit) assignToMachine(m *Machine, unused bool) (err error) {
247- if u.doc.Series != m.doc.Series {
248+
249+ unit := u.Clone()
250+ machine := m.Clone()
251+
252+ // Static checking
253+ if unit.doc.Series != machine.doc.Series {
254 return fmt.Errorf("series does not match")
255 }
256- if u.doc.MachineId != "" {
257- if u.doc.MachineId != m.Id() {
258+ if unit.doc.MachineId != "" {
259+ if unit.doc.MachineId != machine.Id() {
260 return alreadyAssignedErr
261 }
262 return nil
263 }
264- if u.doc.Principal != "" {
265+ if unit.doc.Principal != "" {
266 return fmt.Errorf("unit is a subordinate")
267 }
268- canHost := false
269- for _, j := range m.doc.Jobs {
270- if j == JobHostUnits {
271- canHost = true
272- break
273- }
274- }
275- if !canHost {
276- return fmt.Errorf("machine %q cannot host units", m)
277- }
278- // assignToMachine implies assignment to an existing machine,
279- // which is only permitted if unit placement is supported.
280- if err := u.st.supportsUnitPlacement(); err != nil {
281- return err
282- }
283- assert := append(isAliveDoc, bson.D{
284- {"$or", []bson.D{
285- {{"machineid", ""}},
286- {{"machineid", m.Id()}},
287- }},
288- }...)
289- massert := isAliveDoc
290- if unused {
291- massert = append(massert, bson.D{{"clean", bson.D{{"$ne", false}}}}...)
292- }
293- ops := []txn.Op{{
294- C: u.st.units.Name,
295- Id: u.doc.Name,
296- Assert: assert,
297- Update: bson.D{{"$set", bson.D{{"machineid", m.doc.Id}}}},
298- }, {
299- C: u.st.machines.Name,
300- Id: m.doc.Id,
301- Assert: massert,
302- Update: bson.D{{"$addToSet", bson.D{{"principals", u.doc.Name}}}, {"$set", bson.D{{"clean", false}}}},
303- }}
304- err = u.st.runTransaction(ops)
305- if err == nil {
306- u.doc.MachineId = m.doc.Id
307- m.doc.Clean = false
308- return nil
309- }
310- if err != txn.ErrAborted {
311- return err
312- }
313- u0, err := u.st.Unit(u.Name())
314- if err != nil {
315- return err
316- }
317- m0, err := u.st.Machine(m.Id())
318- if err != nil {
319- return err
320- }
321- switch {
322- case u0.Life() != Alive:
323- return unitNotAliveErr
324- case m0.Life() != Alive:
325- return machineNotAliveErr
326- case u0.doc.MachineId != "" || !unused:
327- return alreadyAssignedErr
328- }
329- return inUseErr
330+ for i := 0; i < 3; i++ {
331+
332+ // assignToMachine implies assignment to an existing machine,
333+ // which is only permitted if unit placement is supported.
334+ if err := unit.st.supportsUnitPlacement(); err != nil {
335+ return err
336+ }
337+
338+ // Jobs
339+ canHost := false
340+ for _, j := range m.doc.Jobs {
341+ if j == JobHostUnits {
342+ canHost = true
343+ break
344+ }
345+ }
346+ if !canHost {
347+ return fmt.Errorf("machine %q cannot host units", m)
348+ }
349+
350+ // Check non colliding network settings
351+ machineIncludedNetworks, machineExcludedNetworks, err := machine.RequestedNetworks()
352+ if err != nil {
353+ return err
354+ }
355+
356+ unitIncludedNetworks, unitExcludedNetworks, err := readRequestedNetworks(unit.st, serviceGlobalKey(unit.doc.Service))
357+ if err != nil {
358+ return err
359+ }
360+
361+ machineExcludedSet := set.NewStrings(machineExcludedNetworks...)
362+ machineIncludedSet := set.NewStrings(machineIncludedNetworks...)
363+ unitExcludedSet := set.NewStrings(unitExcludedNetworks...)
364+ unitIncludedSet := set.NewStrings(unitIncludedNetworks...)
365+
366+ mExcludeUIncludeIntersection := unitIncludedSet.Intersection(machineExcludedSet)
367+
368+ if mExcludeUIncludeIntersection.Size() > 0 {
369+ return fmt.Errorf("the requested networks %q are explicitly excluded in the requested machine %q ", mExcludeUIncludeIntersection.Values(), machine.Id())
370+ }
371+
372+ mIncludeUExcludeIntersection := unitExcludedSet.Intersection(machineIncludedSet)
373+
374+ if mIncludeUExcludeIntersection.Size() > 0 {
375+ return fmt.Errorf("the requested networks %q are explicitly included in the requested machine %q ", mIncludeUExcludeIntersection.Values(), machine.Id())
376+ }
377+
378+ // ops
379+ ops := unit.assignToMachineOps(machine, unused)
380+ err = u.st.runTransaction(ops)
381+
382+ if err == nil {
383+ u.doc.MachineId = m.doc.Id
384+ m.doc.Clean = false
385+ return nil
386+ }
387+
388+ if i > 1 {
389+ if err != txn.ErrAborted {
390+ return err
391+ }
392+
393+ switch {
394+ case unit.doc.Life != Alive:
395+ return unitNotAliveErr
396+ case machine.doc.Life != Alive:
397+ return machineNotAliveErr
398+ case u.doc.MachineId != "" || !unused:
399+ return alreadyAssignedErr
400+ }
401+ }
402+
403+ if err := unit.Refresh(); err != nil {
404+ return err
405+ }
406+ if err := machine.Refresh(); err != nil {
407+ return err
408+ }
409+
410+ }
411+
412+ return ErrExcessiveContention
413 }
414
415 func assignContextf(err *error, unit *Unit, target string) {

Subscribers

People subscribed via source and target branches

to status/vote changes: