Merge lp:~clint-fewbar/ubuntu/natty/nfs-utils/wait-for-local-filesystems into lp:ubuntu/natty/nfs-utils

Proposed by Clint Byrum
Status: Merged
Merged at revision: 34
Proposed branch: lp:~clint-fewbar/ubuntu/natty/nfs-utils/wait-for-local-filesystems
Merge into: lp:ubuntu/natty/nfs-utils
Diff against target: 116 lines (+58/-6)
5 files modified
debian/changelog (+11/-0)
debian/control (+1/-1)
debian/nfs-common.statd-mounting.upstart (+28/-0)
debian/nfs-common.statd.upstart (+17/-5)
debian/rules (+1/-0)
To merge this branch: bzr merge lp:~clint-fewbar/ubuntu/natty/nfs-utils/wait-for-local-filesystems
Reviewer Review Type Date Requested Status
James Hunt (community) Needs Fixing
Steve Langasek Pending
Ubuntu branches Pending
Review via email: mp+44417@code.launchpad.net

Description of the change

See comments on bug #690401 for explanation of whats going on here.

To post a comment you must log in.
Revision history for this message
James Hunt (jamesodhunt) wrote :

Hi Clint,

I don't understand the reason for the runlevel check. Also, I don't think the script logic is correct. Do you mean:

  RLEVEL=`runlevel|awk '{print $2}'

Also, the syntax below doesn't appear to work in bash or dash:

  RLEVEL=${RLEVEL#? [2345]}

Am I missing something?

It probably makes sense for Steve Langasek to review this as I understand he wrote the NFS handling code and could get this progressed quickly.

review: Needs Fixing
35. By Clint Byrum

  local-filesystems. (LP: #581941) Also add stop on runlevel [!2345]

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

Sorry, the runlevel check is used to avoid starting with portmap outside those runlevels. Essentially, portmap starts way too early, but we do need to start after it, and restart whenever portmap is restarted.

I think it would be MUCH more clear if we could say

start on started portmap RUNLEVEL=[2345] ...

But I can't seem to get portmap to export a RUNLEVEL in its started event. For that matter, I can't seem to get any jobs to export a variable, so I'm a little confused as to how to do it.

As far as the shell prefix removal, it works fine for me, though it may not be clear what is going on because we're testing in runlevel 2:

$ RLEVEL=`runlevel`
$ RLEVEL=${RLEVEL#? [016]}
$ echo $RLEVEL
N 2
$
$ RLEVEL=`runlevel`
$ RLEVEL=${RLEVEL#? [2345]}
$ echo $RLEVEL

$

The idea is that it will be empty when [2345] are there. Hence

if [ "x$RLEVEL" != x ] ; then
  # handle the != runlevel [2345] stuff

I don't know if awk will be available, given that it is in /usr, which is allowed to be on NFS. But maybe there's a minimal awk available during boot, or something else that will allow a more clear scriptlet?

Revision history for this message
Steve Langasek (vorlon) wrote :

I wouldn't say I'm likely to get this to progress quickly; yes, I wrote the NFS handling code, but soon came to recognize there were bugs in the resulting logic that I couldn't see a way to address with the current upstart support.

That said, I don't see the reason for this runlevel check. Is this meant to split the portmap start condition (start when portmap is restarted) away from the startup condition (start when mountall mounts an NFS mount)? If so, what happens when there are no NFS shares mounted at boot time, and the user calls mount manually post-boot in which case no 'mounting' event is emitted?

If that's not the intent of your check, then I definitely don't understand what you're trying to achieve here.

Revision history for this message
Steve Langasek (vorlon) wrote :

Oh right, should read the diff - you're also starting on runlevel [2345]. So that solves the "start even without nfs services at boot" problem.

+stop on stopping portmap or runlevel [!2345]

That's definitely wrong; it causes statd to be torn down at shutdown or reboot, out from under the NFS mounts that are still being used.

> (local-filesystems and (mounting TYPE=nfs)

Have you checked whether mountall emits these events synchronously or asynchronously? If it emits them synchronously, that's obviously going to cause a deadlock at boot, because both 'local-filesystems' and 'mounting' are emitted by the same mountall process.

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

On Wed, 2010-12-22 at 20:42 +0000, Steve Langasek wrote:
> Oh right, should read the diff - you're also starting on runlevel [2345]. So that solves the "start even without nfs services at boot" problem.
>
> +stop on stopping portmap or runlevel [!2345]
>
> That's definitely wrong; it causes statd to be torn down at shutdown or reboot, out from under the NFS mounts that are still being used.
>

Right.. we do need to stop it at some point... leaving it running means
the same libc problems and possible open files (it does write
to /var...) preventing unmounting filesystems cleanly.

Would it then make sense to change this to stop on stopping portmap or
unmounted-nfs-filesystems and then emit that
in /etc/init.d/umountnfs.sh?

Where would this leave an NFS root? Do we just leave rpc.statd running
in this case?

In either case, this is really a separate bug and so I'll remove the
extra stop on.

> > (local-filesystems and (mounting TYPE=nfs)
>
> Have you checked whether mountall emits these events synchronously or asynchronously? If it emits them synchronously, that's obviously going to cause a deadlock at boot, because both 'local-filesystems' and 'mounting' are emitted by the same mountall process.
>

RIGHT.. hadn't thought of that. local-filesystems is not waited on,
mounting is. I think mountall skips the remote mounts until SIGUSR1, I
can't confirm that by reading its code as the logic is fairly complex,
so mounting might be triggered first (especially w/ an nfs root which I
didn't test), and so these can't really be used with any of the others
as an "and".

I did test this, and it actually resulted in booting and starting
rpc.statd as a result of this condition (confirmed by booting with
--verbose). I suspect this was entirely by chance, because the nfs
filesystems were just mounted after all local filesystems.

That said, there will always be a problem if /var isn't mounted rw
before running sm-notify, upstart or not. So if we just keep the
runlevel check to avoid starting with portmap too early, and change it
back to mounting, and make sure that sm-notify logs something useful
like "failed to write to /var/xxx, /var must be mounted before all NFS
mounts" when it fails, that seems acceptable.

I'll poke at that for a bit and see how it goes.

Revision history for this message
Steve Langasek (vorlon) wrote :

On Thu, Dec 23, 2010 at 06:01:54AM -0000, Clint Byrum wrote:
> Right.. we do need to stop it at some point... leaving it running means
> the same libc problems and possible open files (it does write
> to /var...) preventing unmounting filesystems cleanly.

Yes, that much is true.

> Would it then make sense to change this to stop on stopping portmap or
> unmounted-nfs-filesystems and then emit that
> in /etc/init.d/umountnfs.sh?

That sounds reasonable to me, but I agree with you that this should be taken
to a separate bug / branch.

> Where would this leave an NFS root? Do we just leave rpc.statd running
> in this case?

Yes, in that case I think it has to stay running until the end.

> I'll poke at that for a bit and see how it goes.

Any progress on this? From the discussions on IRC I think we've got a
pretty good handle now on the scope, so I'm keen to get this fixed up and
merged.

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

On Tue, 2011-01-04 at 17:04 +0000, Steve Langasek wrote:
> On Thu, Dec 23, 2010 at 06:01:54AM -0000, Clint Byrum wrote:
> > Right.. we do need to stop it at some point... leaving it running means
> > the same libc problems and possible open files (it does write
> > to /var...) preventing unmounting filesystems cleanly.
>
> Yes, that much is true.
>
> > Would it then make sense to change this to stop on stopping portmap or
> > unmounted-nfs-filesystems and then emit that
> > in /etc/init.d/umountnfs.sh?
>
> That sounds reasonable to me, but I agree with you that this should be taken
> to a separate bug / branch.
>
> > Where would this leave an NFS root? Do we just leave rpc.statd running
> > in this case?
>
> Yes, in that case I think it has to stay running until the end.
>
> > I'll poke at that for a bit and see how it goes.
>
> Any progress on this? From the discussions on IRC I think we've got a
> pretty good handle now on the scope, so I'm keen to get this fixed up and
> merged.
>

I just picked this back up today. I am running a smoke test right now
and will push to the branch shortly.

Revision history for this message
Darin Tay (dtay) wrote :

Thanks for working on this!

Awaiting the new push, but as something to watch out for, note that any solution where this still starts on started portmap without changing portmap's start conditions (such as the original one posted here) won't completely work, even if another condition will restart it later.

The problem being that, in the failure case, sm-notify still writes out its pid file to /var/run (which is writable at this point, even if / isn't, as it's a tmpfs) before dying. On subsequent job starts, sm-notify will see that the pid file already exists and then exit (with success) without doing any work, since it thinks notifications have already been sent. As a result, notifications will never get sent, leaving stale locks.

Revision history for this message
Steve Langasek (vorlon) wrote :

On Tue, Jan 04, 2011 at 10:37:53PM -0000, Darin Tay wrote:
> The problem being that, in the failure case, sm-notify still writes out
> its pid file to /var/run (which is writable at this point, even if /
> isn't, as it's a tmpfs) before dying. On subsequent job starts, sm-notify
> will see that the pid file already exists and then exit (with success)
> without doing any work, since it thinks notifications have already been
> sent. As a result, notifications will never get sent, leaving stale
> locks.

That implies sm-notify is leaving a stale PID file when it fails to start,
and then failing to clean up PID files when they're stale; if so, that's a
separate bug in sm-notify that should also be fixed.

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>

Revision history for this message
Darin Tay (dtay) wrote :

Sure, I'll file it as a separate bug then. I use the term 'pid file' rather loosely here, as it is intentionally left around after execution, because it's not supposed to run again on statd restarts after the initial run. But it could quite easily clean up the pid file on failure cases.

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

Right, what I'm seeing is that sm-notify drops the file there and then tries to notify remote hosts and fails because, for instance, its trying to mount the NFS filesystems before there is a net device. It most definitely should be cleaning this file up on this failure so we can try to run sm-notify again when the network is up.

My testing is almost complete and I think I've got it working reliably even when fsck's take a while.

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

So I'm getting intermittent NFS mount failures with this start on:

start on (started portmap ON_BOOT=
    or (mounting TYPE=nfs)
    or local-filesystems and started portmap ON_BOOT=y)

statd always ends up started.. but the mounts don't work.

The reason is that local-filesystems and started portmap ON_BOOT=y may come around the same exact time as mounting TYPE=nfs. If they do, then the mounting event doesn't seem to block on statd's already begun startup, and the mount attempt fails because statd hasn't started yet.

A lame hacky way that I found worked was to add

/etc/init/statd-nomounts.conf

start on starting rc RUNLEVEL=[2345] and started portmap ON_BOOT=y and net-device-up IFACE!=lo
stop on started statd

task

script
  # This is *LAME* we need to have a "wait for X to happen" command
  sleep 5
  exec start statd
end script

The event that is missing, that makes the 'sleep 5' necessary, is the one that mountall would need to emit when it is done processing the SIGUSR1 command sent by mountall-net.

One could also turn it around and have a statd-mounting.conf that fires on mounting TYPE=nfs and then waits for any existing instance of statd to finish starting. That may be more desirable.

Anyway, the above is not a solution, so I'll keep looking at it and hopefully have a solid solution in the next 24 hours.

36. By Clint Byrum

* debian/nfs-common.statd.upstart,
  debian/nfs-common.statd-mounting.upstart: refactor startup to wait for
  local-filesystems. (LP: #525154)
* debian/control: depend on portmap version that sets ON_BOOT=y and
  has the portmap-wait job.

37. By Clint Byrum

debian/rules: install new statd-mounting upstart job

Revision history for this message
Steve Langasek (vorlon) wrote :

@@ -32,3 +44,8 @@
                exec rpc.statd -L $STATDOPTS
        fi
 end script
+
+post-start script
+ echo sleeping...
+ sleep 1
+end script

Why this change? Looks like a workaround for a start-up race condition in statd, but isn't documented as such in either the changelog or the upstart job.

+start on mounting TYPE=nfs

This will only block one mount attempt at a time, and if there are multiple NFS shares listed in /etc/fstab, mountall may try to mount them in parallel. I think this can be made more robust by adding an 'instance' to the job, so *each* NFS mount gets its own copy of this job. We're already head and shoulders above where we were before in terms of startup robustness, however, so I don't consider that a blocker for merging.

38. By Clint Byrum

removing unnecessary debugging sleep delay post-start

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

How embarassing with the sleep 1. That was just a testing delay.

For the INSTANCE, I have to agree. I didn't realize that mountall mounted things in parallel. Will test with a whole lot of NFS mounts.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'debian/changelog'
2--- debian/changelog 2010-12-15 21:42:55 +0000
3+++ debian/changelog 2011-01-11 16:33:49 +0000
4@@ -1,3 +1,14 @@
5+nfs-utils (1:1.2.2-4ubuntu2) natty; urgency=low
6+
7+ * debian/nfs-common.statd.upstart,
8+ debian/nfs-common.statd-mounting.upstart: refactor startup to wait for
9+ local-filesystems. (LP: #525154)
10+ * debian/control: depend on portmap version that sets ON_BOOT=y and
11+ has the portmap-wait job.
12+ * debian/rules: install new statd-mounting upstart job
13+
14+ -- Clint Byrum <clint@ubuntu.com> Wed, 05 Jan 2011 12:27:32 -0800
15+
16 nfs-utils (1:1.2.2-4ubuntu1) natty; urgency=low
17
18 * Merge from debian unstable (LP: #685860), remaining changes:
19
20=== modified file 'debian/control'
21--- debian/control 2010-12-15 21:42:55 +0000
22+++ debian/control 2011-01-11 16:33:49 +0000
23@@ -33,7 +33,7 @@
24
25 Package: nfs-common
26 Architecture: any
27-Depends: ${shlibs:Depends}, ${misc:Depends}, portmap (>= 6.0-10ubuntu1), adduser, ucf, lsb-base (>= 1.3-9ubuntu3), netbase (>= 4.24), initscripts (>= 2.86.ds1-38.1)
28+Depends: ${shlibs:Depends}, ${misc:Depends}, portmap (>= 6.0.0-2ubuntu2), adduser, ucf, lsb-base (>= 1.3-9ubuntu3), netbase (>= 4.24), initscripts (>= 2.86.ds1-38.1)
29 Provides: nfs-client
30 Conflicts: nfs-client
31 Replaces: nfs-client, nfs-kernel-server (<< 1:1.0.7-5), mount (<< 2.13~)
32
33=== added file 'debian/nfs-common.statd-mounting.upstart'
34--- debian/nfs-common.statd-mounting.upstart 1970-01-01 00:00:00 +0000
35+++ debian/nfs-common.statd-mounting.upstart 2011-01-11 16:33:49 +0000
36@@ -0,0 +1,28 @@
37+# statd-mounting
38+
39+description "Block the mounting event for NFS filesytems until statd is running"
40+author "Clint Byrum <clint.byrum@canonical.com>"
41+
42+start on mounting TYPE=nfs
43+stop on started statd or stopped statd
44+task
45+
46+# This is required so that the task is still considered
47+# successful when it gets killed
48+normal exit 2
49+
50+script
51+
52+ status statd | grep -q "start/running" && exit 0
53+
54+ # Attempt to start statd
55+ start statd || true
56+
57+ # If its already starting we'll get killed by the impending 'stop on
58+ # started statd'
59+ # If it wasn't already starting, we'll either get killed by the stop
60+ # on started or stopped.
61+ # So, its safe to sleep forever here and rely on upstart to kill us,
62+
63+ while sleep 3600 ;:;done
64+end script
65
66=== modified file 'debian/nfs-common.statd.upstart'
67--- debian/nfs-common.statd.upstart 2010-02-25 14:35:17 +0000
68+++ debian/nfs-common.statd.upstart 2011-01-11 16:33:49 +0000
69@@ -3,7 +3,16 @@
70 description "NSM status monitor"
71 author "Steve Langasek <steve.langasek@canonical.com>"
72
73-start on (started portmap or mounting TYPE=nfs)
74+# ON_BOOT is set to y in portmap's special portmap-boot.conf
75+# It will not be set when users run 'restart portmap' or 'start portmap'
76+# This is so that we don't start until we have local filesystems on
77+# bootup but we also restart whenever portmap is restarted. -Clint Byrum
78+#
79+# The case where we need to make sure statd is started on mounting
80+# TYPE=nfs is handled in the "statd-mounting" job.
81+#
82+start on (started portmap ON_BOOT=
83+ or local-filesystems and started portmap ON_BOOT=y)
84 stop on stopping portmap
85
86 expect fork
87@@ -17,10 +26,13 @@
88 fi
89
90 [ "x$NEED_STATD" != xno ] || { stop; exit 0; }
91-
92- start portmap || true
93- status portmap | grep -q start/running
94- exec sm-notify
95+ logger -t statd-pre-start "$UPSTART_EVENTS" || true
96+ echo UPSTART_EVENTS = "$UPSTART_EVENTS"
97+
98+ # This will block if portmap is mid-starting
99+ start portmap-wait
100+
101+ exec sm-notify
102 end script
103
104 script
105
106=== modified file 'debian/rules'
107--- debian/rules 2010-06-04 09:53:57 +0000
108+++ debian/rules 2011-01-11 16:33:49 +0000
109@@ -47,6 +47,7 @@
110 dh_installdocs -A
111 dh_installdocs -pnfs-common debian/README.Debian.nfsv4
112 dh_installinit -pnfs-common --upstart-only -R --name statd
113+ dh_installinit -pnfs-common --upstart-only --no-start --name statd-mounting
114 dh_installinit -pnfs-common --upstart-only -n --name rpc_pipefs
115 dh_installinit -pnfs-common --upstart-only -R --name gssd
116 dh_installinit -pnfs-common --upstart-only -R --name idmapd

Subscribers

People subscribed via source and target branches

to all changes: