Merge lp:~stgraber/upstart/upstart-initgroups into lp:upstart

Proposed by Stéphane Graber
Status: Merged
Merged at revision: 1396
Proposed branch: lp:~stgraber/upstart/upstart-initgroups
Merge into: lp:upstart
Diff against target: 193 lines (+117/-3)
3 files modified
init/job_process.c (+46/-2)
init/job_process.h (+3/-1)
init/tests/test_job_process.c (+68/-0)
To merge this branch: bzr merge lp:~stgraber/upstart/upstart-initgroups
Reviewer Review Type Date Requested Status
James Hunt Approve
Review via email: mp+136794@code.launchpad.net

Description of the change

Call initgroups() before spawning a job to ensure that the user's group list
is properly initialized.

This avoids the following issue:

=== Example of the security problem ===
root@upstart-test:~# cat /etc/init/test.conf
setuid nobody
setgid nogroup

task
script
    cat /tmp/secret-file > /tmp/public-file
    chmod 666 /tmp/public-file
    id > /tmp/debug
end script
root@upstart-test:~# id nobody
uid=65534(nobody) gid=65534(nogroup) groups=65534(nogroup)
root@upstart-test:~# echo secret > /tmp/secret-file
root@upstart-test:~# chmod 660 /tmp/secret-file
root@upstart-test:~# start test
test stop/waiting
root@upstart-test:~# cat /tmp/public-file
secret
root@upstart-test:~# ls -l /tmp/public-file
-rw-rw-rw- 1 nobody nogroup 7 Nov 28 20:59 /tmp/public-file
root@upstart-test:~# cat /tmp/debug
uid=65534(nobody) gid=65534(nogroup) groups=65534(nogroup),0(root)
root@upstart-test:~#

=== Same commands but with the fixed version ===
root@upstart-test:~# cat /etc/init/test.conf
setuid nobody
setgid nogroup

task
script
    cat /tmp/secret-file > /tmp/public-file
    chmod 666 /tmp/public-file
    id > /tmp/debug
end script
root@upstart-test:~# id nobody
uid=65534(nobody) gid=65534(nogroup) groups=65534(nogroup)
root@upstart-test:~# echo secret > /tmp/secret-file
root@upstart-test:~# chmod 660 /tmp/secret-file
root@upstart-test:~# start test
start: Job failed to start
root@upstart-test:~#

The code was tested (as shown above) and the unit tests still all pass.
However, as the tests are meant to be run as non-root, it's not possible
to add new tests testing for initgroups() behaviour.

To post a comment you must log in.
Revision history for this message
Stéphane Graber (stgraber) wrote :

So after more testing, this bug can't be exploited on regular hosts (physical machines, VMs) as the kernel group list is empty on those, so there's no "root" group to inherit.

On those, the bug is only that you don't inherit the groups of the setuid user, which is problematic but not a security issue.

However for users of containers, the initial group list does contain root, so for those, it's a potential security issue. But the number of users of containers being far lower than those of regular systems, this somewhat lowers the priority of this fix.

Revision history for this message
James Hunt (jamesodhunt) wrote :

The first call to initgroups for user jobs looks fine, but the 2nd call for system jobs (that specify setuid/setgid stanzas) has a few problems:

(1) Shouldn't we only be calling this if we're root _and_ 'if (class->setuid || class->setgid)' and then performing the password and group lookup for job_setuid and job_setgid?

(2) The checks on pwd and grp for near the 2nd call to initgroups are not currently legitimate - pwd and grp should be initialised to NULL to avoid undefined behaviour.

(3) You don't need those two errno resets just before the 2nd call since the code never explicitly checks errno there.

review: Needs Fixing
Revision history for this message
Stéphane Graber (stgraber) wrote :

> The first call to initgroups for user jobs looks fine, but the 2nd call for
> system jobs (that specify setuid/setgid stanzas) has a few problems:
>
> (1) Shouldn't we only be calling this if we're root _and_ 'if (class->setuid
> || class->setgid)' and then performing the password and group lookup for
> job_setuid and job_setgid?

No, in the case where the root user is in additional groups, we need the initgroups call to populate the group list.

> (2) The checks on pwd and grp for near the 2nd call to initgroups are not
> currently legitimate - pwd and grp should be initialised to NULL to avoid
> undefined behaviour.

Oops, indeed, fixed.

> (3) You don't need those two errno resets just before the 2nd call since the
> code never explicitly checks errno there.

Fixed.

1395. By Stéphane Graber

Don't reset errno when we're not using it. Initialise passwd and group to NULL.

Revision history for this message
James Hunt (jamesodhunt) wrote :

> > The first call to initgroups for user jobs looks fine, but the 2nd call for
> > system jobs (that specify setuid/setgid stanzas) has a few problems:
> >
> > (1) Shouldn't we only be calling this if we're root _and_ 'if (class->setuid
> > || class->setgid)' and then performing the password and group lookup for
> > job_setuid and job_setgid?
>
> No, in the case where the root user is in additional groups, we need the initgroups call to populate the group list.

Ah - I did wonder about that. We've survived for a long time without calling initgroups for system jobs, but that appears to have been an oversight from way back :)

>
> > (2) The checks on pwd and grp for near the 2nd call to initgroups are not
> > currently legitimate - pwd and grp should be initialised to NULL to avoid
> > undefined behaviour.
>
> Oops, indeed, fixed.
>
> > (3) You don't need those two errno resets just before the 2nd call since the
> > code never explicitly checks errno there.
>
> Fixed.

The final change required is to update job_process_error_read() for the new JobProcessErrorTypes you've added. Once you've done this, please can you force such an error in a manual test, to ensure expected behaviour (currently, you should find that init will assert).

I've raised bug 1085836 to document the requirement to create a new test program which will embody all tests that need to be run as the root user.

review: Needs Fixing
1396. By Stéphane Graber

Add the new error codes to job_process_error_read

1397. By Stéphane Graber

Add a basic test for initgroups

Revision history for this message
Stéphane Graber (stgraber) wrote :

Added the entries to job_process_error_read.

I also tried to add a test for the initgroups() code that'd work both as a user and as root by checking that the error path is triggered in one case and that the job starts properly in the other.

The test is pretty minimal and appears to work when run as a user, when running it as root, the test suite appears to fail earlier on so even if the code appears to be fine, I'm not 100% sure it'll work.
I also tried to use nih_error_get() to match the exact INITGROUPS error code but without much success, I gave up after a few hours of trying to figure out why the simple fact of calling nih_error_get would fail.

Revision history for this message
James Hunt (jamesodhunt) wrote :

Thanks for this. Great that we now have some form of test for this scenario. Just a couple of points:

* init/tests/test_job_process.c:
  - Can you change the getpwuid() call to getpwnam ("nobody") and
    likewise for getgrgid() to getgrnam() as it's much clearer and
    should protect us in the case where the tests are run on
    systems with 32-bit uid_t's. This will also solve the 2 typo issues
    (655534 rather than 65534).
  - Also, can you put some more comments on that test to make it clear
    why it is expected to work when run as "nobody".

1398. By Stéphane Graber

Use nobody/nogroup instead of 65534/65534, use more comprehensive comment.

Revision history for this message
Stéphane Graber (stgraber) wrote :

There you go. Confirmed that the test still passes.

Revision history for this message
James Hunt (jamesodhunt) wrote :

Thanks Stéphane - merged.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'init/job_process.c'
2--- init/job_process.c 2012-11-21 06:48:08 +0000
3+++ init/job_process.c 2012-12-04 19:59:27 +0000
4@@ -413,6 +413,8 @@
5 JobClass *class;
6 uid_t job_setuid = -1;
7 gid_t job_setgid = -1;
8+ struct passwd *pwd = NULL;
9+ struct group *grp = NULL;
10
11
12 nih_assert (job != NULL);
13@@ -705,6 +707,11 @@
14 job_process_error_abort (fds[1], JOB_PROCESS_ERROR_CHOWN, 0);
15 }
16
17+ if (geteuid () == 0 && initgroups (pw->pw_name, pw->pw_gid) < 0) {
18+ nih_error_raise_system ();
19+ job_process_error_abort (fds[1], JOB_PROCESS_ERROR_INITGROUPS, 0);
20+ }
21+
22 if (pw->pw_gid && setgid (pw->pw_gid) < 0) {
23 nih_error_raise_system ();
24 job_process_error_abort (fds[1], JOB_PROCESS_ERROR_SETGID, 0);
25@@ -844,7 +851,6 @@
26 * session jobs and jobs with a chroot stanza.
27 */
28 if (class->setuid) {
29- struct passwd *pwd;
30 /* Without resetting errno, it's impossible to
31 * distinguish between a non-existent user and and
32 * error during lookup */
33@@ -867,7 +873,6 @@
34 }
35
36 if (class->setgid) {
37- struct group *grp;
38 errno = 0;
39 grp = getgrnam (class->setgid);
40 if (! grp) {
41@@ -891,6 +896,35 @@
42 job_process_error_abort (fds[1], JOB_PROCESS_ERROR_CHOWN, 0);
43 }
44
45+ /* Make sure we always have the needed pwd and grp structs.
46+ * Then pass those to initgroups() to setup the user's group list.
47+ * Only do that if we're root as initgroups() won't work when non-root. */
48+ if (geteuid () == 0) {
49+ if (! pwd) {
50+ pwd = getpwuid (geteuid ());
51+ if (! pwd) {
52+ nih_error_raise_system ();
53+ job_process_error_abort (fds[1], JOB_PROCESS_ERROR_GETPWUID, 0);
54+ }
55+ }
56+
57+ if (! grp) {
58+ grp = getgrgid (getegid ());
59+ if (! grp) {
60+ nih_error_raise_system ();
61+ job_process_error_abort (fds[1], JOB_PROCESS_ERROR_GETGRGID, 0);
62+ }
63+ }
64+
65+ if (pwd && grp) {
66+ if (initgroups (pwd->pw_name, grp->gr_gid) < 0) {
67+ nih_error_raise_system ();
68+ job_process_error_abort (fds[1], JOB_PROCESS_ERROR_INITGROUPS, 0);
69+ }
70+ }
71+ }
72+
73+ /* Start dropping privileges */
74 if (job_setgid != (gid_t) -1 && setgid (job_setgid) < 0) {
75 nih_error_raise_system ();
76 job_process_error_abort (fds[1], JOB_PROCESS_ERROR_SETGID, 0);
77@@ -1142,6 +1176,11 @@
78 err, _("unable to getpwuid: %s"),
79 strerror (err->errnum)));
80 break;
81+ case JOB_PROCESS_ERROR_GETGRGID:
82+ err->error.message = NIH_MUST (nih_sprintf (
83+ err, _("unable to getgrgid: %s"),
84+ strerror (err->errnum)));
85+ break;
86 case JOB_PROCESS_ERROR_BAD_SETUID:
87 err->error.message = NIH_MUST (nih_sprintf (
88 err, _("unable to find setuid user")));
89@@ -1200,6 +1239,11 @@
90 err, _("unable to allocate memory: %s"),
91 strerror (err->errnum)));
92 break;
93+ case JOB_PROCESS_ERROR_INITGROUPS:
94+ err->error.message = NIH_MUST (nih_sprintf (
95+ err, _("unable to initgroups: %s"),
96+ strerror (err->errnum)));
97+ break;
98 default:
99 nih_assert_not_reached ();
100 }
101
102=== modified file 'init/job_process.h'
103--- init/job_process.h 2012-11-07 15:17:58 +0000
104+++ init/job_process.h 2012-12-04 19:59:27 +0000
105@@ -94,7 +94,9 @@
106 JOB_PROCESS_ERROR_PTSNAME,
107 JOB_PROCESS_ERROR_OPENPT_SLAVE,
108 JOB_PROCESS_ERROR_SIGNAL,
109- JOB_PROCESS_ERROR_ALLOC
110+ JOB_PROCESS_ERROR_ALLOC,
111+ JOB_PROCESS_ERROR_INITGROUPS,
112+ JOB_PROCESS_ERROR_GETGRGID
113 } JobProcessErrorType;
114
115 /**
116
117=== modified file 'init/tests/test_job_process.c'
118--- init/tests/test_job_process.c 2012-12-04 17:12:46 +0000
119+++ init/tests/test_job_process.c 2012-12-04 19:59:27 +0000
120@@ -4007,6 +4007,74 @@
121 }
122
123 /************************************************************/
124+
125+ /* Check that initgroups gets called.
126+ * The test will run a job as nobody/nogroup (setuid/setgid) target.
127+ *
128+ * When run from an unprivileged user, the check will ensure that upstart
129+ * fails to start the job (returning -1) as initgroups() is a privileged
130+ * call (similar to setuid and setgid).
131+ *
132+ * When run from a privileged user (root), the check will ensure that
133+ * upstart succeeds in dropping privileges, which includes calling
134+ * initgroup, setuid and setgid.
135+ *
136+ * If the test is started by user nobody/nogroup, then it'll succeed as
137+ * there's no privilege changes to be done in such case (same uid/gid).
138+ */
139+ TEST_FEATURE ("with setuid");
140+ TEST_HASH_EMPTY (job_classes);
141+
142+ TEST_NE_P (output, NULL);
143+ TEST_ALLOC_FAIL {
144+ TEST_ALLOC_SAFE {
145+ class = job_class_new (NULL, "test", NULL);
146+ class->process[PROCESS_MAIN] = process_new (class);
147+ class->process[PROCESS_MAIN]->command = nih_sprintf (
148+ class->process[PROCESS_MAIN],
149+ "touch %s", filename);
150+
151+ pwd = getpwnam ("nobody");
152+ TEST_NE (pwd, NULL);
153+ class->setuid = nih_strdup (class, pwd->pw_name);
154+
155+ grp = getgrnam ("nogroup");
156+ TEST_NE (grp, NULL);
157+ class->setgid = nih_strdup (class, grp->gr_name);
158+
159+ job = job_new (class, "");
160+ job->goal = JOB_START;
161+ job->state = JOB_SPAWNED;
162+
163+ output = tmpfile ();
164+ }
165+
166+ TEST_DIVERT_STDERR (output) {
167+ ret = job_process_run (job, PROCESS_MAIN);
168+ if (geteuid() == 0 || getuid() == pwd->pw_uid) {
169+ TEST_EQ (ret, 0);
170+ }
171+ else {
172+ TEST_EQ (ret, -1);
173+ }
174+ }
175+
176+ if (geteuid() == 0 || getuid() == pwd->pw_uid) {
177+ TEST_NE (job->pid[PROCESS_MAIN], 0);
178+
179+ waitpid (job->pid[PROCESS_MAIN], NULL, 0);
180+ TEST_EQ (stat (filename, &statbuf), 0);
181+ }
182+ else {
183+ TEST_EQ (stat (filename, &statbuf), -1);
184+ }
185+
186+ unlink (filename);
187+ nih_free (class);
188+
189+ }
190+
191+ /************************************************************/
192 TEST_FEATURE ("with multiple processes and log");
193 TEST_HASH_EMPTY (job_classes);
194

Subscribers

People subscribed via source and target branches