Merge lp:~clint-fewbar/ubuntu/natty/upstart/restore-re-exec-code into lp:ubuntu/natty/upstart

Proposed by Clint Byrum
Status: Merged
Merged at revision: 1295
Proposed branch: lp:~clint-fewbar/ubuntu/natty/upstart/restore-re-exec-code
Merge into: lp:ubuntu/natty/upstart
Diff against target: 121 lines (+74/-2)
3 files modified
debian/changelog (+9/-0)
debian/control (+1/-0)
init/main.c (+64/-2)
To merge this branch: bzr merge lp:~clint-fewbar/ubuntu/natty/upstart/restore-re-exec-code
Reviewer Review Type Date Requested Status
Kees Cook Approve
Ubuntu branches Pending
Review via email: mp+45295@code.launchpad.net

Description of the change

Now that the libc6 version that cancels the telinit u has been accepted into natty, I'm proposing merging this in.

This will cause upstart to re-exec itself on 'telinit u', which sends a SIGTERM sent to pid 1.

This code was dropped for unknown reasons (possibly just by mistake of package importing into bzr) and will solve the issue of / not being able to be remounted read-only during the shutdown process.

To post a comment you must log in.
1289. By Kees Cook

debian/apparmor-profile-load: allow profiles to be missing for saner
packaging integration.

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

FYI, this is insufficient to stop root FS corruption.

I tested this on lucid, and occasionally it still fails the remount of /

This is because the SIGTERM signal is sent by telinit u, but then we don't wait for upstart to re-exec itself before moving on to the umount. This fails about 50% of the time on lucid.

The simple, but incomplete, solution is to add a sleep 1 after telinit u. The more appropriate solution is to have telinit ptrace 1 with PTRACE_O_TRACEEXEC.

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

So, the simpler solution to the race between umountroot exitting and the re-exec completing, brought up by James Hunt, is to watch for changes in /proc/1 .. I think /proc/1/maps is the right place to look.

We are working on a patch to sysvinit's umountroot script to take care of that. In the mean time, this patch should probably be uploaded without it, as this patch will at least give us *some* way of telling upstart to re-exec itself, so the sooner people are booting with this version of upstart, the less corrupted /'s there will be.

Revision history for this message
Colin Watson (cjwatson) wrote :

This basically seems OK; but one possible failure mode comes to mind. If the old libc6.postinst is run with the new upstart, then it will run 'telinit u' and that will send SIGTERM to upstart, which will faithfully re-exec and will now have no job state. Is it worth adding some kind of packaging metadata, perhaps Breaks, to try to avoid that?

Revision history for this message
Sebastien Bacher (seb128) wrote :

the changelog is buggy and needs to be rebased on the current natty version

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

Merged latest changes and pushed.

1290. By Steve Langasek

debian/upstart-job: properly handle jobs that are in state 'start/running'
with no PID, by checking only if the goal is 'start'. LP: #603934.

1291. By Steve Langasek

hah, get our grep line right, otherwise we break *all* service upgrades

1292. By Steve Langasek

add another bug ref

1293. By Steve Langasek

releasing version 0.6.7-5

1294. By Kees Cook

debian/apparmor-profile-load: check for correct AppArmor profile loading
interface file (LP: #710649).

1295. By Clint Byrum

Re-add upstream r977 to allow proper re-exec on shutdown (LP: #672177)

1296. By Clint Byrum

* Re-add upstream r977 to allow proper re-exec on shutdown (LP: #672177)
* debian/control: adding Breaks on eglibc version that disables
  telinit u to avoid accidentally installing a version of libc6 that
  will cause upstart to re-exec and lose its state.

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

Ok I overwrote the old branch as it got way out of sync w/ the main natty branch.

Please note that this fix is already present in lucid-updates and maverick-updates. Not sure how it is that it didn't get uploaded into natty before that.

Revision history for this message
Kees Cook (kees) wrote :

Looks good; thanks! I'll get this uploaded.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'debian/changelog'
2--- debian/changelog 2011-02-03 13:45:32 +0000
3+++ debian/changelog 2011-02-05 01:12:11 +0000
4@@ -1,3 +1,12 @@
5+upstart (0.6.7-7) natty; urgency=low
6+
7+ * Re-add upstream r977 to allow proper re-exec on shutdown (LP: #672177)
8+ * debian/control: adding Breaks on eglibc version that disables
9+ telinit u to avoid accidentally installing a version of libc6 that
10+ will cause upstart to re-exec and lose its state.
11+
12+ -- Clint Byrum <clint@ubuntu.com> Fri, 21 Jan 2011 08:39:13 -0800
13+
14 upstart (0.6.7-6) natty; urgency=low
15
16 * debian/apparmor-profile-load: check for correct AppArmor profile loading
17
18=== modified file 'debian/control'
19--- debian/control 2010-12-14 17:15:57 +0000
20+++ debian/control 2011-02-05 01:12:11 +0000
21@@ -12,6 +12,7 @@
22 Replaces: upstart-job, sysvinit, upstart-compat-sysv, startup-tasks, system-services
23 Conflicts: upstart-job, sysvinit, upstart-compat-sysv, startup-tasks, system-services
24 Provides: upstart-job, upstart-compat-sysv, startup-tasks, system-services
25+Breaks: libc6 ( << 2.12.1-0ubuntu12 )
26 Description: event-based init daemon
27 upstart is a replacement for the /sbin/init daemon which handles
28 starting of tasks and services during boot, stopping them during
29
30=== modified file 'init/main.c'
31--- init/main.c 2010-12-14 17:15:57 +0000
32+++ init/main.c 2011-02-05 01:12:11 +0000
33@@ -63,6 +63,9 @@
34 /* Prototypes for static functions */
35 #ifndef DEBUG
36 static void crash_handler (int signum);
37+#endif /* DEBUG */
38+static void term_handler (void *data, NihSignal *signal);
39+#ifndef DEBUG
40 static void cad_handler (void *data, NihSignal *signal);
41 static void kbd_handler (void *data, NihSignal *signal);
42 static void pwr_handler (void *data, NihSignal *signal);
43@@ -261,9 +264,15 @@
44 /* SIGUSR1 instructs us to reconnect to D-Bus */
45 nih_signal_set_handler (SIGUSR1, nih_signal_handler);
46 NIH_MUST (nih_signal_add_handler (NULL, SIGUSR1, usr1_handler, NULL));
47+
48+ /* SIGTERM instructs us to re-exec ourselves; this should be the
49+ * last in the list to ensure that all other signals are handled
50+ * before a SIGTERM.
51+ */
52+ nih_signal_set_handler (SIGTERM, nih_signal_handler);
53+ NIH_MUST (nih_signal_add_handler (NULL, SIGTERM, term_handler, NULL));
54 #endif /* DEBUG */
55
56-
57 /* Watch children for events */
58 NIH_MUST (nih_child_add_watch (NULL, -1, NIH_CHILD_ALL,
59 job_process_handler, NULL));
60@@ -476,7 +485,60 @@
61 /* Goodbye, cruel world. */
62 exit (signum);
63 }
64-
65+#endif
66+
67+/**
68+ * term_handler:
69+ * @data: unused,
70+ * @signal: signal caught.
71+ *
72+ * This is called when we receive the TERM signal, which instructs us
73+ * to reexec ourselves.
74+ **/
75+static void
76+term_handler (void *data,
77+ NihSignal *signal)
78+{
79+ NihError *err;
80+ const char *loglevel;
81+ sigset_t mask, oldmask;
82+
83+ nih_assert (argv0 != NULL);
84+ nih_assert (signal != NULL);
85+
86+ nih_warn (_("Re-executing %s"), argv0);
87+
88+ /* Block signals while we work. We're the last signal handler
89+ * installed so this should mean that they're all handled now.
90+ *
91+ * The child must make sure that it unblocks these again when
92+ * it's ready.
93+ */
94+ sigfillset (&mask);
95+ sigprocmask (SIG_BLOCK, &mask, &oldmask);
96+
97+ /* Argument list */
98+ if (nih_log_priority <= NIH_LOG_DEBUG) {
99+ loglevel = "--debug";
100+ } else if (nih_log_priority <= NIH_LOG_INFO) {
101+ loglevel = "--verbose";
102+ } else if (nih_log_priority >= NIH_LOG_ERROR) {
103+ loglevel = "--error";
104+ } else {
105+ loglevel = NULL;
106+ }
107+ execl (argv0, argv0, "--restart", loglevel, NULL);
108+ nih_error_raise_system ();
109+
110+ err = nih_error_get ();
111+ nih_error (_("Failed to re-execute %s: %s"), argv0, err->message);
112+ nih_free (err);
113+
114+ sigprocmask (SIG_SETMASK, &oldmask, NULL);
115+}
116+
117+
118+#ifndef DEBUG
119 /**
120 * cad_handler:
121 * @data: unused,

Subscribers

People subscribed via source and target branches

to all changes: