Merge lp:~afrantzis/unity-system-compositor/no-external-spinner-zombie-processes into lp:unity-system-compositor

Proposed by Alexandros Frantzis
Status: Work in progress
Proposed branch: lp:~afrantzis/unity-system-compositor/no-external-spinner-zombie-processes
Merge into: lp:unity-system-compositor
Diff against target: 271 lines (+180/-15)
5 files modified
src/external_spinner.cpp (+85/-4)
tests/integration-tests/CMakeLists.txt (+1/-0)
tests/integration-tests/spin_wait.cpp (+39/-0)
tests/integration-tests/spin_wait.h (+39/-0)
tests/integration-tests/test_external_spinner.cpp (+16/-11)
To merge this branch: bzr merge lp:~afrantzis/unity-system-compositor/no-external-spinner-zombie-processes
Reviewer Review Type Date Requested Status
Unity System Compositor Development Team Pending
Review via email: mp+264281@code.launchpad.net

Commit message

Don't leave zombie spinner processes

Description of the change

Don't leave zombie spinner processes

This MP uses the double-fork trick to avoid zombie spinner process. It works by forking an intermediate process which then forks again to execute the real spinner process. The intermediate process exits immediately, orphaning the spinner process, which is then adopted by init (pid 1). The init process then takes care of waiting for the process to exit.

To post a comment you must log in.
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Haven't looked at the branch in detail, but I'm surprised it needs much work at all... Simply calling wait() or waitpid() at the right times will avoid zombies. That's what "zombie process" means -- that the parent has not called wait() yet.

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

> Simply calling wait() or waitpid() at the right times will avoid zombies.
> That's what "zombie process" means -- that the parent has not called wait() yet.

That was my initial approach, but the problem with this is that waitpid() blocks USC for a little bit in the best case, and can block forever if for whatever reason the external spinner happens to not respond to our signals (e.g. it has hung).

The proposed way is indeed more complicated, but it avoid all blocking problems by transferring the responsibility to the init process.

Note that a lot of the code in this MP is just test improvements, and signal-safe wrappers around read/write.

Revision history for this message
Chris Halse Rogers (raof) wrote :

On Friday, 10 July 2015 6:21:03 PM AEST, Alexandros Frantzis wrote:
>> Simply calling wait() or waitpid() at the right times will avoid zombies.
>> That's what "zombie process" means -- that the parent has not
>> called wait() yet.
>
> That was my initial approach, but the problem with this is that
> waitpid() blocks USC for a little bit in the best case, and can
> block forever if for whatever reason the external spinner
> happens to not respond to our signals (e.g. it has hung).

The canonical way of handling this is a SIGCHILD handler; that way you only
ever call non-blocking waitpid.

That said, double-forking should work, too.

--
Sent using Dekko from my Ubuntu device

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

> The canonical way of handling this is a SIGCHILD handler; that way you only
> ever call non-blocking waitpid.

I considered this, but was concerned about interaction with libmirserver signal handling. If we think it is safe, though, I would rather use SIGCHILD too. I will propose an alternative MP with sigchild and we can decide.

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

Unmerged revisions

224. By Alexandros Frantzis

Don't leave zombie spinner processes

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/external_spinner.cpp'
2--- src/external_spinner.cpp 2015-04-01 17:06:16 +0000
3+++ src/external_spinner.cpp 2015-07-09 13:12:58 +0000
4@@ -20,6 +20,64 @@
5
6 #include <unistd.h>
7 #include <signal.h>
8+#include <sys/wait.h>
9+
10+namespace
11+{
12+
13+void write_pid(int fd, pid_t pid)
14+{
15+ auto const pid_str = std::to_string(pid);
16+ size_t nwritten = 0;
17+
18+ while (nwritten < pid_str.size())
19+ {
20+ auto const n = write(fd, pid_str.data() + nwritten, pid_str.size() - nwritten);
21+ if (n < 0)
22+ {
23+ if (errno != EINTR)
24+ return;
25+ }
26+ else if (n == 0)
27+ {
28+ return;
29+ }
30+ else
31+ {
32+ nwritten += n;
33+ }
34+ }
35+}
36+
37+pid_t read_pid(int fd)
38+{
39+ std::string pid_str;
40+ bool more_data_to_read = true;
41+
42+ while (more_data_to_read)
43+ {
44+ size_t const bufsize = 256;
45+ char data[bufsize];
46+ auto const nread = read(fd, data, bufsize);
47+ if (nread < 0)
48+ {
49+ if (errno != EINTR)
50+ return 0;
51+ }
52+ else if (nread == 0)
53+ {
54+ more_data_to_read = false;
55+ }
56+ else
57+ {
58+ pid_str.append(data, nread);
59+ }
60+ }
61+
62+ return std::atoi(pid_str.c_str());
63+}
64+
65+}
66
67 usc::ExternalSpinner::ExternalSpinner(
68 std::string const& executable,
69@@ -42,16 +100,39 @@
70 if (executable.empty() || spinner_pid)
71 return;
72
73- auto pid = fork();
74+ /* Use a double fork to avoid zombie processes */
75+ int pipefd[2] {0,0};
76+ if (pipe(pipefd)) {}
77+
78+ auto const pid = fork();
79 if (!pid)
80 {
81- setenv("MIR_SOCKET", mir_socket.c_str(), 1);
82- execlp(executable.c_str(), executable.c_str(), nullptr);
83+ close(pipefd[0]);
84+
85+ auto const spid = fork();
86+ if (!spid)
87+ {
88+ close(pipefd[1]);
89+ setenv("MIR_SOCKET", mir_socket.c_str(), 1);
90+ execlp(executable.c_str(), executable.c_str(), nullptr);
91+ }
92+ else
93+ {
94+ /* Send the spinner pid to grandparent (USC) */
95+ write_pid(pipefd[1], spid);
96+ close(pipefd[1]);
97+ exit(0);
98+ }
99 }
100 else
101 {
102- spinner_pid = pid;
103+ close(pipefd[1]);
104+ waitpid(pid, nullptr, 0);
105+ /* Get spinner pid from grandchild */
106+ spinner_pid = read_pid(pipefd[0]);
107 }
108+
109+ close(pipefd[0]);
110 }
111
112 void usc::ExternalSpinner::kill()
113
114=== modified file 'tests/integration-tests/CMakeLists.txt'
115--- tests/integration-tests/CMakeLists.txt 2015-04-07 13:36:48 +0000
116+++ tests/integration-tests/CMakeLists.txt 2015-07-09 13:12:58 +0000
117@@ -32,6 +32,7 @@
118 run_command.cpp
119 dbus_bus.cpp
120 dbus_client.cpp
121+ spin_wait.cpp
122 test_dbus_event_loop.cpp
123 test_unity_screen_service.cpp
124 test_external_spinner.cpp
125
126=== added file 'tests/integration-tests/spin_wait.cpp'
127--- tests/integration-tests/spin_wait.cpp 1970-01-01 00:00:00 +0000
128+++ tests/integration-tests/spin_wait.cpp 2015-07-09 13:12:58 +0000
129@@ -0,0 +1,39 @@
130+/*
131+ * Copyright © 2015 Canonical Ltd.
132+ *
133+ * This program is free software: you can redistribute it and/or modify it
134+ * under the terms of the GNU General Public License version 3,
135+ * as published by the Free Software Foundation.
136+ *
137+ * This program is distributed in the hope that it will be useful,
138+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
139+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
140+ * GNU General Public License for more details.
141+ *
142+ * You should have received a copy of the GNU General Public License
143+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
144+ *
145+ * Authored by: Alexandros Frantzis <alexandros.frantzis@canonical.com>
146+ */
147+
148+#include "spin_wait.h"
149+
150+#include <thread>
151+
152+bool usc::test::spin_wait_for_condition_or_timeout(
153+ std::function<bool()> const& condition,
154+ std::chrono::milliseconds timeout,
155+ std::chrono::milliseconds spin_period)
156+{
157+ auto const end = std::chrono::steady_clock::now() + timeout;
158+ bool condition_fulfilled = false;
159+
160+ while (std::chrono::steady_clock::now() < end &&
161+ !(condition_fulfilled = condition()))
162+ {
163+ std::this_thread::sleep_for(spin_period);
164+ }
165+
166+ return condition_fulfilled;
167+}
168+
169
170=== added file 'tests/integration-tests/spin_wait.h'
171--- tests/integration-tests/spin_wait.h 1970-01-01 00:00:00 +0000
172+++ tests/integration-tests/spin_wait.h 2015-07-09 13:12:58 +0000
173@@ -0,0 +1,39 @@
174+/*
175+ * Copyright © 2015 Canonical Ltd.
176+ *
177+ * This program is free software: you can redistribute it and/or modify it
178+ * under the terms of the GNU General Public License version 3,
179+ * as published by the Free Software Foundation.
180+ *
181+ * This program is distributed in the hope that it will be useful,
182+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
183+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
184+ * GNU General Public License for more details.
185+ *
186+ * You should have received a copy of the GNU General Public License
187+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
188+ *
189+ * Authored by: Alexandros Frantzis <alexandros.frantzis@canonical.com>
190+ */
191+
192+#ifndef USC_SPIN_WAIT_H_
193+#define USC_SPIN_WAIT_H_
194+
195+#include <functional>
196+#include <chrono>
197+
198+namespace usc
199+{
200+namespace test
201+{
202+
203+
204+bool spin_wait_for_condition_or_timeout(
205+ std::function<bool()> const& condition,
206+ std::chrono::milliseconds timeout,
207+ std::chrono::milliseconds spin_period = std::chrono::milliseconds{10});
208+
209+}
210+}
211+
212+#endif
213
214=== modified file 'tests/integration-tests/test_external_spinner.cpp'
215--- tests/integration-tests/test_external_spinner.cpp 2015-04-01 17:11:57 +0000
216+++ tests/integration-tests/test_external_spinner.cpp 2015-07-09 13:12:58 +0000
217@@ -18,6 +18,7 @@
218
219 #include "src/external_spinner.h"
220 #include "run_command.h"
221+#include "spin_wait.h"
222
223 #include <fstream>
224 #include <chrono>
225@@ -74,12 +75,21 @@
226 {
227 std::vector<pid_t> spinner_pids()
228 {
229- return pidof(spinner_cmd);
230+ std::vector<pid_t> pids;
231+
232+ usc::test::spin_wait_for_condition_or_timeout(
233+ [&pids, this] { pids = pidof(spinner_cmd); return !pids.empty(); },
234+ timeout);
235+
236+ if (pids.empty())
237+ BOOST_THROW_EXCEPTION(std::runtime_error("spinner_pids timed out"));
238+
239+ return pids;
240 }
241
242 std::vector<std::string> environment_of_spinner()
243 {
244- auto const pids = pidof(spinner_cmd);
245+ auto const pids = spinner_pids();
246 if (pids.size() > 1)
247 BOOST_THROW_EXCEPTION(std::runtime_error("Detected multiple spinner processes"));
248 std::vector<std::string> env;
249@@ -96,19 +106,14 @@
250
251 void wait_for_spinner_to_terminate()
252 {
253- auto const timeout = std::chrono::milliseconds{3000};
254- auto const expire = std::chrono::steady_clock::now() + timeout;
255-
256- while (spinner_pids().size() > 0)
257- {
258- if (std::chrono::steady_clock::now() > expire)
259- BOOST_THROW_EXCEPTION(std::runtime_error("wait_for_no_spinner timed out"));
260- std::this_thread::sleep_for(std::chrono::milliseconds{10});
261- }
262+ usc::test::spin_wait_for_condition_or_timeout(
263+ [this] { return pidof(spinner_cmd).empty(); },
264+ timeout);
265 }
266
267 std::string const spinner_cmd{executable_path() + "/usc_test_helper_wait_for_signal"};
268 std::string const mir_socket{"usc_mir_socket"};
269+ std::chrono::milliseconds const timeout{3000};
270 usc::ExternalSpinner spinner{spinner_cmd, mir_socket};
271 };
272

Subscribers

People subscribed via source and target branches