Merge lp:~axwalk/juju-core/lp1263586-force-symlink-jujurun into lp:~go-bot/juju-core/trunk

Proposed by Andrew Wilkins
Status: Merged
Approved by: Andrew Wilkins
Approved revision: no longer in the source branch.
Merged at revision: 2198
Proposed branch: lp:~axwalk/juju-core/lp1263586-force-symlink-jujurun
Merge into: lp:~go-bot/juju-core/trunk
Diff against target: 193 lines (+78/-20)
5 files modified
cmd/jujud/machine.go (+17/-0)
cmd/jujud/machine_test.go (+42/-0)
cmd/jujud/main_test.go (+19/-10)
environs/cloudinit/cloudinit.go (+0/-8)
environs/cloudinit/cloudinit_test.go (+0/-2)
To merge this branch: bzr merge lp:~axwalk/juju-core/lp1263586-force-symlink-jujurun
Reviewer Review Type Date Requested Status
Juju Engineering Pending
Review via email: mp+200368@code.launchpad.net

Commit message

environs/cloudinit: force symlink of juju-run

Manual provisioning fails if the juju-run symlink
exists. We should always recreate the symlink if
it exists, by passing -f to ln.

Also: moved symlinking of juju-run to the machine
agent, so that upgrades do the right thing.

Fixes #1263586

https://codereview.appspot.com/47380044/

Description of the change

environs/cloudinit: force symlink of juju-run

Manual provisioning fails if the juju-run symlink
exists. We should always recreate the symlink if
it exists, by passing -f to ln.

Also: moved symlinking of juju-run to the machine
agent, so that upgrades do the right thing.

Fixes #1263586

https://codereview.appspot.com/47380044/

To post a comment you must log in.
Revision history for this message
Andrew Wilkins (axwalk) wrote :
Download full text (3.2 KiB)

Reviewers: mp+200368_code.launchpad.net,

Message:
Please take a look.

Description:
environs/cloudinit: force symlink of juju-run

Manual provisioning fails if the juju-run symlink
exists. We should always recreate the symlink if
it exists, by passing -f to ln.

Fixes #1263586

https://code.launchpad.net/~axwalk/juju-core/lp1263586-force-symlink-jujurun/+merge/200368

(do not edit description out of merge proposal)

Please review this at https://codereview.appspot.com/47380044/

Affected files (+5, -3 lines):
   A [revision details]
   M environs/cloudinit/cloudinit.go
   M environs/cloudinit/cloudinit_test.go

Index: [revision details]
=== added file '[revision details]'
--- [revision details] 2012-01-01 00:00:00 +0000
+++ [revision details] 2012-01-01 00:00:00 +0000
@@ -0,0 +1,2 @@
+Old revision: tarmac-20140102083548-er3lr2qx0y8nwd2x
+New revision: <email address hidden>

Index: environs/cloudinit/cloudinit.go
=== modified file 'environs/cloudinit/cloudinit.go'
--- environs/cloudinit/cloudinit.go 2013-12-18 15:15:23 +0000
+++ environs/cloudinit/cloudinit.go 2014-01-03 06:09:31 +0000
@@ -261,7 +261,7 @@
    // cfg.jujuTools()), as we want the jujud that is linked to in
    // /usr/local/bin to also upgrade when the machine agent upgrades its
    // tools and changes the tools directory that it is using.
- fmt.Sprintf("ln -s %s/tools/%s/jujud /usr/local/bin/juju-run",
cfg.DataDir, machineTag),
+ fmt.Sprintf("ln -f -s %s/tools/%s/jujud /usr/local/bin/juju-run",
cfg.DataDir, machineTag),
   )

   // Add the cloud archive cloud-tools pocket to apt sources

Index: environs/cloudinit/cloudinit_test.go
=== modified file 'environs/cloudinit/cloudinit_test.go'
--- environs/cloudinit/cloudinit_test.go 2013-12-17 12:13:50 +0000
+++ environs/cloudinit/cloudinit_test.go 2014-01-03 06:09:31 +0000
@@ -111,7 +111,7 @@
  printf '%s\\n' '.*' > '/var/lib/juju/agents/machine-0/format'
  install -m 600 /dev/null '/var/lib/juju/agents/machine-0/agent\.conf'
  printf '%s\\n' '.*' > '/var/lib/juju/agents/machine-0/agent\.conf'
-ln -s /var/lib/juju/tools/machine-0/jujud /usr/local/bin/juju-run
+ln -f -s /var/lib/juju/tools/machine-0/jujud /usr/local/bin/juju-run
  install -D -m 600 /dev/null '/var/lib/juju/server\.pem'
  printf '%s\\n' 'SERVER CERT\\n[^']*SERVER KEY\\n[^']*'
> '/var/lib/juju/server\.pem'
  mkdir -p /var/lib/juju/db/journal
@@ -224,7 +224,7 @@
  printf '%s\\n' '.*' > '/var/lib/juju/agents/machine-99/format'
  install -m 600 /dev/null '/var/lib/juju/agents/machine-99/agent\.conf'
  printf '%s\\n' '.*' > '/var/lib/juju/agents/machine-99/agent\.conf'
-ln -s /var/lib/juju/tools/machine-99/jujud /usr/local/bin/juju-run
+ln -f -s /var/lib/juju/tools/machine-99/jujud /usr/local/bin/juju-run
  ln -s 1\.2\.3-linux-amd64 '/var/lib/juju/tools/machine-99'
  echo 'Starting Juju machine agent \(jujud-machine-99\)'.*
  cat >> /etc/init/jujud-machine-99\.conf << 'EOF'\\ndescription "juju
machine-99 agent"\\nauthor "Juju Team <juju@lists\.ubuntu\.com>"\\nstart on
runlevel \[2345\]\\nstop on runlevel \[!2345\]\\nrespawn\\nnormal exit
0\\n\\nlimit nofile 20000 20000\\n\\nexec
/var/lib/juju/tools/machine-99/juju...

Read more...

Revision history for this message
William Reade (fwereade) wrote :

not your fault, but NOT LGTM -- I'm pretty sure we need to make this
symlink the machine agent's responsibility, not the cloudinit script's
(otherwise, where will it come from in an upgraded environment?).

So the correct fix is, I think, just a move of the functionality. Sane?

https://codereview.appspot.com/47380044/

Revision history for this message
Andrew Wilkins (axwalk) wrote :

On 2014/01/03 13:16:11, fwereade wrote:
> not your fault, but NOT LGTM -- I'm pretty sure we need to make this
symlink the
> machine agent's responsibility, not the cloudinit script's (otherwise,
where
> will it come from in an upgraded environment?).

> So the correct fix is, I think, just a move of the functionality.
Sane?

Yes, very much. Fix incoming soon.

https://codereview.appspot.com/47380044/

Revision history for this message
Andrew Wilkins (axwalk) wrote :
Revision history for this message
John A Meinel (jameinel) wrote :

I won't block this if we really need to have it, but it is part of why I
feel having a 'juju-run' command that is executed via SSH is the wrong
way about it. (Rather, the running agent should fire of the command to
execute, rather than have it driven via SSH from the API server.)

The actual code seems ok to me, though I wonder if we need to patch a
couple more places to avoid accidentally trying to write to
/usr/local/bin in the test suite.

https://codereview.appspot.com/47380044/diff/20001/cmd/jujud/machine.go
File cmd/jujud/machine.go (right):

https://codereview.appspot.com/47380044/diff/20001/cmd/jujud/machine.go#newcode354
cmd/jujud/machine.go:354: return os.Symlink(jujud, jujuRun)
So this is something that concerned me a lot with our methodology. When
we upgrade a jujud, is it going to fix this symlink? It really seems
wrong to have a global 'juju-run' binary that points into a
version-specific tools directory. (Presumably we can only have one
machine agent, but we might have multiple unit agents, and the unit
agents could be running different versions. If the idea is to run in a
'hook context' should it run in the context the agent is running?)

I'm personally opposed to the idea of having a /usr/local/bin/juju-run
symlink, but if it has been decided that we must, the code looks sane.

As a different think, is this going to get us in trouble when running
the test suite? My local user certainly doesn't have rights to create
/usr/local/bin/juju-run on my machine.

https://codereview.appspot.com/47380044/diff/20001/cmd/jujud/machine_test.go
File cmd/jujud/machine_test.go (right):

https://codereview.appspot.com/47380044/diff/20001/cmd/jujud/machine_test.go#newcode65
cmd/jujud/machine_test.go:65: s.PatchValue(&jujuRun,
filepath.Join(c.MkDir(), "juju-run"))
This seems dangerous, in that anything test suite that creates a Machine
job needs to add this patch.

Is this really the only test suite that creates machine agents? I would
expect to see a lot more agents being run. Doesn't agent_test also need
to create them?

https://codereview.appspot.com/47380044/

Revision history for this message
Andrew Wilkins (axwalk) wrote :

On 2014/01/06 12:53:47, jameinel wrote:
> I won't block this if we really need to have it, but it is part of why
I feel
> having a 'juju-run' command that is executed via SSH is the wrong way
about it.
> (Rather, the running agent should fire of the command to execute,
rather than
> have it driven via SSH from the API server.)

Since we do have it already, it needs to not break things :)

> The actual code seems ok to me, though I wonder if we need to patch a
couple
> more places to avoid accidentally trying to write to /usr/local/bin in
the test
> suite.

"make check" passes, and I don't run my tests as root.

https://codereview.appspot.com/47380044/diff/20001/cmd/jujud/machine.go
> File cmd/jujud/machine.go (right):

https://codereview.appspot.com/47380044/diff/20001/cmd/jujud/machine.go#newcode354
> cmd/jujud/machine.go:354: return os.Symlink(jujud, jujuRun)
> So this is something that concerned me a lot with our methodology.
When we
> upgrade a jujud, is it going to fix this symlink? It really seems
wrong to have
> a global 'juju-run' binary that points into a version-specific tools
directory.
> (Presumably we can only have one machine agent, but we might have
multiple unit
> agents, and the unit agents could be running different versions. If
the idea is
> to run in a 'hook context' should it run in the context the agent is
running?)

> I'm personally opposed to the idea of having a /usr/local/bin/juju-run
symlink,
> but if it has been decided that we must, the code looks sane.

> As a different think, is this going to get us in trouble when running
the test
> suite? My local user certainly doesn't have rights to create
> /usr/local/bin/juju-run on my machine.

Not with the current tests. I don't think this is really any different
from the tests that modify "$HOME/...", after having changed $HOME. That
scares me much more than this change, but they're working.

https://codereview.appspot.com/47380044/diff/20001/cmd/jujud/machine_test.go
> File cmd/jujud/machine_test.go (right):

https://codereview.appspot.com/47380044/diff/20001/cmd/jujud/machine_test.go#newcode65
> cmd/jujud/machine_test.go:65: s.PatchValue(&jujuRun,
filepath.Join(c.MkDir(),
> "juju-run"))
> This seems dangerous, in that anything test suite that creates a
Machine job
> needs to add this patch.

> Is this really the only test suite that creates machine agents? I
would expect
> to see a lot more agents being run. Doesn't agent_test also need to
create them?

Yes. agent_test doesn't actually define any tests, it's just for
embedding, and the unit agent doesn't exercise the new code.

https://codereview.appspot.com/47380044/

Revision history for this message
William Reade (fwereade) wrote :

LGTM, but I share jam's concerns about that jujuRun var. We've had (and
still have) problems with isolation for the jujud tests, and I'd prefer
to over-isolate than to under-isolate, at least until we break the
package up a bit.

https://codereview.appspot.com/47380044/diff/20001/cmd/jujud/machine.go
File cmd/jujud/machine.go (right):

https://codereview.appspot.com/47380044/diff/20001/cmd/jujud/machine.go#newcode354
cmd/jujud/machine.go:354: return os.Symlink(jujud, jujuRun)
On 2014/01/06 12:53:48, jameinel wrote:
> So this is something that concerned me a lot with our methodology.
When we
> upgrade a jujud, is it going to fix this symlink? It really seems
wrong to have
> a global 'juju-run' binary that points into a version-specific tools
directory.
> (Presumably we can only have one machine agent, but we might have
multiple unit
> agents, and the unit agents could be running different versions. If
the idea is
> to run in a 'hook context' should it run in the context the agent is
running?)

> I'm personally opposed to the idea of having a /usr/local/bin/juju-run
symlink,
> but if it has been decided that we must, the code looks sane.

We do sadly at this stage need to do this. Fixing this will I think
involve putting the machine agent in charge of the tools running on that
machine, so that it can download the tools for all agents, update a
single symlink, and then bounce them all; it's a good idea regardless,
but it probably makes most sense to do it once we've got a `juju-reboot`
mechanism for hooks (that'll likely involve setting up coordinated
shutdowns of agents anyway).

Sane?

> As a different think, is this going to get us in trouble when running
the test
> suite? My local user certainly doesn't have rights to create
> /usr/local/bin/juju-run on my machine.

I too would rather see more defensive patching of that value, I think --
maybe even in the package test func?

Although what I'd *really* like is for this module to be decomposed a
bit so we can test things like jobs independent of the workers they run,
and initialization independently from running, and so on.

https://codereview.appspot.com/47380044/

Revision history for this message
Andrew Wilkins (axwalk) wrote :

Please take a look.

https://codereview.appspot.com/47380044/diff/20001/cmd/jujud/machine.go
File cmd/jujud/machine.go (right):

https://codereview.appspot.com/47380044/diff/20001/cmd/jujud/machine.go#newcode354
cmd/jujud/machine.go:354: return os.Symlink(jujud, jujuRun)
On 2014/01/13 08:39:20, william.reade wrote:
> On 2014/01/06 12:53:48, jameinel wrote:
> > So this is something that concerned me a lot with our methodology.
When we
> > upgrade a jujud, is it going to fix this symlink? It really seems
wrong to
> have
> > a global 'juju-run' binary that points into a version-specific tools
> directory.
> > (Presumably we can only have one machine agent, but we might have
multiple
> unit
> > agents, and the unit agents could be running different versions. If
the idea
> is
> > to run in a 'hook context' should it run in the context the agent is
running?)
> >
> > I'm personally opposed to the idea of having a
/usr/local/bin/juju-run
> symlink,
> > but if it has been decided that we must, the code looks sane.

> We do sadly at this stage need to do this. Fixing this will I think
involve
> putting the machine agent in charge of the tools running on that
machine, so
> that it can download the tools for all agents, update a single
symlink, and then
> bounce them all; it's a good idea regardless, but it probably makes
most sense
> to do it once we've got a `juju-reboot` mechanism for hooks (that'll
likely
> involve setting up coordinated shutdowns of agents anyway).

> Sane?

Yes, I think so.

> > As a different think, is this going to get us in trouble when
running the test
> > suite? My local user certainly doesn't have rights to create
> > /usr/local/bin/juju-run on my machine.

> I too would rather see more defensive patching of that value, I think
-- maybe
> even in the package test func?

I was going to do an init(), but either way has the desired effect. I
see TestPackage has other initialisation code already, so I'll tack it
on.

> Although what I'd *really* like is for this module to be decomposed a
bit so we
> can test things like jobs independent of the workers they run, and
> initialization independently from running, and so on.

https://codereview.appspot.com/47380044/

Revision history for this message
Andrew Wilkins (axwalk) wrote :

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'cmd/jujud/machine.go'
2--- cmd/jujud/machine.go 2014-01-12 19:48:01 +0000
3+++ cmd/jujud/machine.go 2014-01-14 05:46:46 +0000
4@@ -53,6 +53,8 @@
5
6 var retryDelay = 3 * time.Second
7
8+var jujuRun = "/usr/local/bin/juju-run"
9+
10 // MachineAgent is a cmd.Command responsible for running a machine agent.
11 type MachineAgent struct {
12 cmd.CommandBase
13@@ -111,6 +113,9 @@
14 return err
15 }
16 charm.CacheDir = filepath.Join(a.Conf.dataDir, "charmcache")
17+ if err := a.initAgent(); err != nil {
18+ return err
19+ }
20
21 // ensureStateWorker ensures that there is a worker that
22 // connects to the state that runs within itself all the workers
23@@ -351,6 +356,14 @@
24 return names.MachineTag(a.MachineId)
25 }
26
27+func (a *MachineAgent) initAgent() error {
28+ if err := os.Remove(jujuRun); err != nil && !os.IsNotExist(err) {
29+ return err
30+ }
31+ jujud := filepath.Join(a.Conf.dataDir, "tools", a.Tag(), "jujud")
32+ return os.Symlink(jujud, jujuRun)
33+}
34+
35 func (a *MachineAgent) uninstallAgent() error {
36 var errors []error
37 agentServiceName := a.Conf.config.Value(agent.AgentServiceName)
38@@ -363,6 +376,10 @@
39 errors = append(errors, fmt.Errorf("cannot remove service %q: %v", agentServiceName, err))
40 }
41 }
42+ // Remove the juju-run symlink.
43+ if err := os.Remove(jujuRun); err != nil && !os.IsNotExist(err) {
44+ errors = append(errors, err)
45+ }
46 // The machine agent may terminate without knowing its jobs,
47 // for example if the machine's entry in state was removed.
48 // Thus, we do not rely on jobs here, and instead just check
49
50=== modified file 'cmd/jujud/machine_test.go'
51--- cmd/jujud/machine_test.go 2013-12-23 05:18:58 +0000
52+++ cmd/jujud/machine_test.go 2014-01-14 05:46:46 +0000
53@@ -4,6 +4,7 @@
54 package main
55
56 import (
57+ "os"
58 "path/filepath"
59 "reflect"
60 "strings"
61@@ -61,6 +62,7 @@
62 func (s *MachineSuite) SetUpTest(c *gc.C) {
63 s.agentSuite.SetUpTest(c)
64 s.TestSuite.SetUpTest(c)
65+ os.Remove(jujuRun) // ignore error; may not exist
66 }
67
68 func (s *MachineSuite) TearDownTest(c *gc.C) {
69@@ -651,3 +653,43 @@
70 s.assertCanOpenState(c, conf.Tag(), conf.DataDir())
71 })
72 }
73+
74+func (s *MachineSuite) TestMachineAgentSymlinkJujuRun(c *gc.C) {
75+ _, err := os.Stat(jujuRun)
76+ c.Assert(err, jc.Satisfies, os.IsNotExist)
77+ s.assertJobWithAPI(c, state.JobManageState, func(conf agent.Config, st *api.State) {
78+ // juju-run should have been created
79+ _, err := os.Stat(jujuRun)
80+ c.Assert(err, gc.IsNil)
81+ })
82+}
83+
84+func (s *MachineSuite) TestMachineAgentSymlinkJujuRunExists(c *gc.C) {
85+ err := os.Symlink("/nowhere/special", jujuRun)
86+ c.Assert(err, gc.IsNil)
87+ _, err = os.Stat(jujuRun)
88+ c.Assert(err, jc.Satisfies, os.IsNotExist)
89+ s.assertJobWithAPI(c, state.JobManageState, func(conf agent.Config, st *api.State) {
90+ // juju-run should have been recreated
91+ _, err := os.Stat(jujuRun)
92+ c.Assert(err, gc.IsNil)
93+ link, err := os.Readlink(jujuRun)
94+ c.Assert(err, gc.IsNil)
95+ c.Assert(link, gc.Not(gc.Equals), "/nowhere/special")
96+ })
97+}
98+
99+func (s *MachineSuite) TestMachineAgentUninstall(c *gc.C) {
100+ m, ac, _ := s.primeAgent(c, state.JobHostUnits, state.JobManageState)
101+ err := m.EnsureDead()
102+ c.Assert(err, gc.IsNil)
103+ a := s.newAgent(c, m)
104+ err = runWithTimeout(a)
105+ c.Assert(err, gc.IsNil)
106+ // juju-run should have been removed on termination
107+ _, err = os.Stat(jujuRun)
108+ c.Assert(err, jc.Satisfies, os.IsNotExist)
109+ // data-dir should have been removed on termination
110+ _, err = os.Stat(ac.DataDir())
111+ c.Assert(err, jc.Satisfies, os.IsNotExist)
112+}
113
114=== modified file 'cmd/jujud/main_test.go'
115--- cmd/jujud/main_test.go 2013-09-13 14:48:13 +0000
116+++ cmd/jujud/main_test.go 2014-01-14 05:46:46 +0000
117@@ -25,18 +25,27 @@
118
119 var caCertFile string
120
121+func mktemp(prefix string, content string) string {
122+ f, err := ioutil.TempFile("", prefix)
123+ if err != nil {
124+ panic(err)
125+ }
126+ _, err = f.WriteString(content)
127+ if err != nil {
128+ panic(err)
129+ }
130+ f.Close()
131+ return f.Name()
132+}
133+
134 func TestPackage(t *stdtesting.T) {
135+ // Change the path to "juju-run", so that the
136+ // tests don't try to write to /usr/local/bin.
137+ jujuRun = mktemp("juju-run", "")
138+ defer os.Remove(jujuRun)
139+
140 // Create a CA certificate available for all tests.
141- f, err := ioutil.TempFile("", "juju-test-cert")
142- if err != nil {
143- panic(err)
144- }
145- _, err = f.WriteString(testing.CACert)
146- if err != nil {
147- panic(err)
148- }
149- f.Close()
150- caCertFile = f.Name()
151+ caCertFile = mktemp("juju-test-cert", testing.CACert)
152 defer os.Remove(caCertFile)
153
154 testing.MgoTestPackage(t)
155
156=== modified file 'environs/cloudinit/cloudinit.go'
157--- environs/cloudinit/cloudinit.go 2014-01-05 22:20:24 +0000
158+++ environs/cloudinit/cloudinit.go 2014-01-14 05:46:46 +0000
159@@ -263,14 +263,6 @@
160 if err != nil {
161 return err
162 }
163- c.AddScripts(
164- // We specifically make the symlink here to the machine's current
165- // tools, not to the specific version tool directory (from
166- // cfg.jujuTools()), as we want the jujud that is linked to in
167- // /usr/local/bin to also upgrade when the machine agent upgrades its
168- // tools and changes the tools directory that it is using.
169- fmt.Sprintf("ln -s %s/tools/%s/jujud /usr/local/bin/juju-run", cfg.DataDir, machineTag),
170- )
171
172 // Add the cloud archive cloud-tools pocket to apt sources
173 // for series that need it. This gives us up-to-date LXC,
174
175=== modified file 'environs/cloudinit/cloudinit_test.go'
176--- environs/cloudinit/cloudinit_test.go 2013-12-20 01:52:22 +0000
177+++ environs/cloudinit/cloudinit_test.go 2014-01-14 05:46:46 +0000
178@@ -112,7 +112,6 @@
179 printf '%s\\n' '.*' > '/var/lib/juju/agents/machine-0/format'
180 install -m 600 /dev/null '/var/lib/juju/agents/machine-0/agent\.conf'
181 printf '%s\\n' '.*' > '/var/lib/juju/agents/machine-0/agent\.conf'
182-ln -s /var/lib/juju/tools/machine-0/jujud /usr/local/bin/juju-run
183 install -D -m 600 /dev/null '/var/lib/juju/system-identity'
184 printf '%s\\n' '.*' > '/var/lib/juju/system-identity'
185 install -D -m 600 /dev/null '/var/lib/juju/server\.pem'
186@@ -228,7 +227,6 @@
187 printf '%s\\n' '.*' > '/var/lib/juju/agents/machine-99/format'
188 install -m 600 /dev/null '/var/lib/juju/agents/machine-99/agent\.conf'
189 printf '%s\\n' '.*' > '/var/lib/juju/agents/machine-99/agent\.conf'
190-ln -s /var/lib/juju/tools/machine-99/jujud /usr/local/bin/juju-run
191 ln -s 1\.2\.3-linux-amd64 '/var/lib/juju/tools/machine-99'
192 echo 'Starting Juju machine agent \(jujud-machine-99\)'.*
193 cat >> /etc/init/jujud-machine-99\.conf << 'EOF'\\ndescription "juju machine-99 agent"\\nauthor "Juju Team <juju@lists\.ubuntu\.com>"\\nstart on runlevel \[2345\]\\nstop on runlevel \[!2345\]\\nrespawn\\nnormal exit 0\\n\\nlimit nofile 20000 20000\\n\\nexec /var/lib/juju/tools/machine-99/jujud machine --data-dir '/var/lib/juju' --machine-id 99 --debug >> /var/log/juju/machine-99\.log 2>&1\\nEOF\\n

Subscribers

People subscribed via source and target branches

to status/vote changes: