Merge lp:~jamesodhunt/upstart/bug-980917-reworked into lp:upstart

Proposed by James Hunt
Status: Merged
Merged at revision: 1375
Proposed branch: lp:~jamesodhunt/upstart/bug-980917-reworked
Merge into: lp:upstart
Diff against target: 251 lines (+150/-9)
4 files modified
ChangeLog (+8/-0)
init/main.c (+72/-3)
init/system.c (+63/-5)
init/system.h (+7/-1)
To merge this branch: bzr merge lp:~jamesodhunt/upstart/bug-980917-reworked
Reviewer Review Type Date Requested Status
Steve Langasek Disapprove
Review via email: mp+118132@code.launchpad.net

Description of the change

Based upon branch lp:~vorlon/upstart/lp.980917.

* init/main.c:main(): Handle hostile initramfs-less environments by
  calling umask and creating required device nodes as early as possible.
* init/system.c: New functions to simplify code:
  - system_mknod()
  - system_check_file()

To post a comment you must log in.
Revision history for this message
Steve Langasek (vorlon) wrote :

I think there's some very unfortunate scope creep here. My previous revision was addressing what would have been a regression for users of static /dev, due to the incorrect overmounting with devtmpfs. This subsequent change doesn't address any problems related to the /dev/pts case at all, instead adding checks for additional devices that have never been the source of error reports.

+ if (system_check_file ("/dev/null", S_IFCHR, makedev (1, 3)) < 0)
+ needs_devtmpfs = 1;
+
+ if (system_check_file ("/dev/console", S_IFCHR, makedev (5, 1)) < 0)
+ needs_devtmpfs = 1;
+
+ if (system_check_file ("/dev/tty", S_IFCHR, makedev (5, 0)) < 0)
+ needs_devtmpfs = 1;
+

This is a completely unnecessary check. These three device nodes are guaranteed to *always* be present at boot time. Either they're set up by the initramfs, or they're required to be part of /dev on the root filesystem, or the system must be configured for the kernel to automount devtmpfs at boot time. It is therefore *wrong* for upstart to take action when these device nodes are missing, as that means the system is badly broken and requiring admin intervention, and papering over one of the symptoms will only make it harder for the problem to be fixed at its source. This adds complexity where none is needed.

+ if (system_check_file ("/dev/kmsg", S_IFCHR, makedev (1, 11)) < 0)
+ needs_devtmpfs = 1;

This is a device that could possibly be missing (it's not included in the "std" set from MAKEDEV), but it also seems optional... if missing, you just don't get messages logged, right?

I think it's borderline as to whether upstart should do anything to create /dev/kmsg. If it's not present at boot time, chances are that it will be present soon afterwards. On the other hand, if it's going to become "present soon afterwards", it means we have a dynamic /dev, so there's probably no *harm* in mounting /dev here. But I think it's still orthogonal to the problem this branch was supposed to solve and should be treated separately.

So the only part of this that I would include is the fix of the mknod() argument ordering. (Which, btw, you probably never noticed problems with when testing because the node is created by the kernel at the same time as /dev/null, so in practice it's always present by the time devtmpfs is mounted.)

review: Disapprove
Revision history for this message
Steve Langasek (vorlon) wrote :

Correction: /dev/ptmx is *not* created at the same time that /dev/null is. So there may be a small race allowing the unix98 pty driver to have not been initialized by the time of devtmpfs mount, whereas the others should never need to be created by hand because they're guaranteed available on the devtmpfs before the handoff to init.

Revision history for this message
James Hunt (jamesodhunt) wrote :

Hi Steve,

> This is a completely unnecessary check. These three device nodes are
> guaranteed to *always* be present at boot time.

On Ubuntu we run MAKEDEV when the devtmpfs /tmp is mounted is mounted, but although these devices are expected to exist on a normal working system, I don't see that we can guarantee that for non-Ubuntu systems, particularly those using a static /dev. Hence, the defensive programming to try to recover in the case where 'the impossible' happens.

> Either they're set up by the initramfs, or they're required to be part
> of /dev on the root filesystem, or the system must be configured for the
> kernel to automount devtmpfs at boot time.
> It is therefore *wrong* for upstart to take action when these device nodes
> are missing, as that means the system is badly broken and requiring admin
> intervention

Ok, so unlikely though it is, if an admin _does_ inadvertently remove a crucial device node on a system running Upstart with no initramfs and a static /dev, assuming they have access to another working system, they'll need to do something akin to the following to recover:

- boot with "init=/bin/sh root=/dev/sda1 rootfstype=ext4"
  (remove "quiet" and "splash" if present too).

- run the following to mount disk read-write:

  # mount -oremount,rw /

- recreate any missing device nodes using mknod(1), or better, just run:

  # MAKEDEV std

- either call "sync", remount disks read-only, halt and power-cycle ("sync && mount
  -oremount,ro / && halt"), or just start Upstart:

  # exec /sbin/init </dev/console >/dev/console 2>&1

> This (/dev/kmsg) is a device that could possibly be missing (it's not
> included in the "std" set from MAKEDEV), but it also seems optional...
> if missing, you just don't get messages logged, right?

That's right - Upstarts switches the NIH logger to use /dev/kmsg to make
use of the kernel ring buffer for logging all output. This allows early
Upstart output to appear in the system log when a syslog daemon
eventually starts.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'ChangeLog'
2--- ChangeLog 2012-07-31 08:39:57 +0000
3+++ ChangeLog 2012-08-03 15:50:43 +0000
4@@ -1,3 +1,11 @@
5+2012-08-03 James Hunt <james.hunt@ubuntu.com>
6+
7+ * init/main.c:main(): Handle hostile initramfs-less environments by
8+ calling umask and creating required device nodes as early as possible.
9+ * init/system.c: New functions to simplify code:
10+ - system_mknod()
11+ - system_check_file()
12+
13 2012-07-31 James Hunt <james.hunt@ubuntu.com>
14
15 [ Eric S. Raymond <esr@thyrsus.com> ]
16
17=== modified file 'init/main.c'
18--- init/main.c 2011-12-15 16:26:34 +0000
19+++ init/main.c 2012-08-03 15:50:43 +0000
20@@ -28,6 +28,7 @@
21 #include <sys/ioctl.h>
22 #include <sys/reboot.h>
23 #include <sys/resource.h>
24+#include <sys/mount.h>
25
26 #include <errno.h>
27 #include <stdio.h>
28@@ -192,6 +193,8 @@
29 #ifndef DEBUG
30 if (use_session_bus == FALSE) {
31
32+ int needs_devtmpfs = 0;
33+
34 /* Check we're root */
35 if (getuid ()) {
36 nih_fatal (_("Need to be root"));
37@@ -236,6 +239,70 @@
38 */
39 setsid ();
40
41+ /* Allow devices to be created with the actual perms
42+ * specified.
43+ */
44+ (void)umask (0);
45+
46+ /* Check if key devices already exist; if they do,
47+ * we should assume we don't need to mount /dev.
48+ */
49+ if (system_check_file ("/dev/null", S_IFCHR, makedev (1, 3)) < 0)
50+ needs_devtmpfs = 1;
51+
52+ if (system_check_file ("/dev/console", S_IFCHR, makedev (5, 1)) < 0)
53+ needs_devtmpfs = 1;
54+
55+ if (system_check_file ("/dev/tty", S_IFCHR, makedev (5, 0)) < 0)
56+ needs_devtmpfs = 1;
57+
58+ if (system_check_file ("/dev/kmsg", S_IFCHR, makedev (1, 11)) < 0)
59+ needs_devtmpfs = 1;
60+
61+ if (system_check_file ("/dev/ptmx", S_IFCHR, makedev (5, 2)) < 0)
62+ needs_devtmpfs = 1;
63+
64+ if (system_check_file ("/dev/pts", S_IFDIR, 0) < 0)
65+ needs_devtmpfs = 1;
66+
67+ if (needs_devtmpfs) {
68+ if (system_mount ("devtmpfs", "/dev", (MS_NOEXEC | MS_NOSUID)) < 0) {
69+ NihError *err;
70+
71+ err = nih_error_get ();
72+ nih_error ("%s: %s", _("Unable to mount /dev filesystem"),
73+ err->message);
74+ nih_free (err);
75+ }
76+
77+ /* Required to exist before /dev/pts accessed */
78+ system_mknod ("/dev/ptmx", (S_IFCHR | 0666), makedev (5, 2));
79+
80+ if (mkdir ("/dev/pts", 0755) < 0 && errno != EEXIST)
81+ nih_error ("%s: %s", _("Cannot create directory"), "/dev/pts");
82+ }
83+
84+ if (system_mount ("devpts", "/dev/pts", (MS_NOEXEC | MS_NOSUID)) < 0) {
85+ NihError *err;
86+
87+ err = nih_error_get ();
88+ nih_error ("%s: %s", _("Unable to mount /dev/pts filesystem"),
89+ err->message);
90+ nih_free (err);
91+ }
92+
93+ /* These devices must exist, but we have to have handled the /dev
94+ * check (and possible mount) prior to considering
95+ * creating them. And yet, if /dev is not available from
96+ * the outset and an error occurs, we are unable to report it,
97+ * hence these checks are performed as early as is
98+ * feasible.
99+ */
100+ system_mknod ("/dev/null", (S_IFCHR | 0666), makedev (1, 3));
101+ system_mknod ("/dev/tty", (S_IFCHR | 0666), makedev (5, 0));
102+ system_mknod ("/dev/console", (S_IFCHR | 0600), makedev (5, 1));
103+ system_mknod ("/dev/kmsg", (S_IFCHR | 0600), makedev (1, 11));
104+
105 /* Set the standard file descriptors to the ordinary console device,
106 * resetting it to sane defaults unless we're inheriting from another
107 * init process which we know left it in a sane state.
108@@ -271,9 +338,10 @@
109
110 /* Mount the /proc and /sys filesystems, which are pretty much
111 * essential for any Linux system; not to mention used by
112- * ourselves.
113+ * ourselves. Also mount /dev/pts to allow CONSOLE_LOG
114+ * to function if booted in an initramfs-less environment.
115 */
116- if (system_mount ("proc", "/proc") < 0) {
117+ if (system_mount ("proc", "/proc", (MS_NODEV | MS_NOEXEC | MS_NOSUID)) < 0) {
118 NihError *err;
119
120 err = nih_error_get ();
121@@ -282,7 +350,7 @@
122 nih_free (err);
123 }
124
125- if (system_mount ("sysfs", "/sys") < 0) {
126+ if (system_mount ("sysfs", "/sys", (MS_NODEV | MS_NOEXEC | MS_NOSUID)) < 0) {
127 NihError *err;
128
129 err = nih_error_get ();
130@@ -290,6 +358,7 @@
131 err->message);
132 nih_free (err);
133 }
134+
135 } else {
136 nih_log_set_priority (NIH_LOG_DEBUG);
137 nih_debug ("Running with UID %d as PID %d (PPID %d)",
138
139=== modified file 'init/system.c'
140--- init/system.c 2011-12-09 14:07:11 +0000
141+++ init/system.c 2012-08-03 15:50:43 +0000
142@@ -164,9 +164,10 @@
143 /**
144 * system_mount:
145 * @type: filesystem type,
146- * @dir: mountpoint.
147+ * @dir: mountpoint,
148+ * @flags: mount flags.
149 *
150- * Mount the kernel filesystem @type at @dir, if not already mounted. This
151+ * Mount the kernel filesystem @type at @dir with @flags, if not already mounted. This
152 * is used to ensure that the proc and sysfs filesystems are always
153 * available.
154 *
155@@ -177,7 +178,8 @@
156 **/
157 int
158 system_mount (const char *type,
159- const char *dir)
160+ const char *dir,
161+ unsigned long flags)
162 {
163 nih_local char *parent = NULL;
164 char * ptr;
165@@ -206,9 +208,65 @@
166 return 0;
167
168 /* Mount the filesystem */
169- if (mount ("none", dir, type,
170- MS_NODEV | MS_NOEXEC | MS_NOSUID, NULL) < 0)
171+ if (mount ("none", dir, type, flags, NULL) < 0)
172 nih_return_system_error (-1);
173
174 return 0;
175 }
176+
177+/**
178+ * system_mknod:
179+ *
180+ * @path: full path,
181+ * @mode: mode to create device with,
182+ * @dev: device major and minor numbers.
183+ *
184+ * Create specified device.
185+ *
186+ * Note that depending on the device, if an error occurs
187+ * it may not be reportable, hence no return value,
188+ * but an attempt to display an error.
189+ **/
190+void
191+system_mknod (const char *path, mode_t mode, dev_t dev)
192+{
193+ nih_assert (path);
194+
195+ if (mknod (path, mode, dev) < 0 && errno != EEXIST)
196+ nih_error ("%s: %s", _("Unable to create device"), path);
197+}
198+
199+/**
200+ * system_check_file:
201+ *
202+ * @path: full path,
203+ * @type: file type,
204+ * @dev: device major and minor numbers (only checked for character and
205+ * block devices).
206+ *
207+ * Perform checks on specified file.
208+ *
209+ * Returns: 0 if device exists and has the specified @path,
210+ * @type and @dev attributes, else -1.
211+ **/
212+int
213+system_check_file (const char *path, mode_t type, dev_t dev)
214+{
215+ struct stat statbuf;
216+ int ret;
217+
218+ nih_assert (path);
219+
220+ ret = stat (path, &statbuf);
221+
222+ if (ret < 0 || ! ((statbuf.st_mode & S_IFMT) == type))
223+ return -1;
224+
225+ if (type == S_IFCHR || type == S_IFBLK) {
226+ if (major (statbuf.st_rdev) != major (dev)
227+ || minor (statbuf.st_rdev) != minor (dev))
228+ return -1;
229+ }
230+
231+ return 0;
232+}
233
234=== modified file 'init/system.h'
235--- init/system.h 2011-05-12 20:42:28 +0000
236+++ init/system.h 2012-08-03 15:50:43 +0000
237@@ -35,7 +35,13 @@
238 int system_setup_console (ConsoleType type, int reset)
239 __attribute__ ((warn_unused_result));
240
241-int system_mount (const char *type, const char *dir)
242+int system_mount (const char *type, const char *dir,
243+ unsigned long flags)
244+ __attribute__ ((warn_unused_result));
245+
246+void system_mknod (const char *path, mode_t mode, dev_t dev);
247+
248+int system_check_file (const char *path, mode_t type, dev_t dev)
249 __attribute__ ((warn_unused_result));
250
251 NIH_END_EXTERN

Subscribers

People subscribed via source and target branches