Merge lp:~afrantzis/unity-system-compositor/no-external-spinner-zombie-processes into lp:unity-system-compositor

Proposed by Alexandros Frantzis
Status: Work in progress
Proposed branch: lp:~afrantzis/unity-system-compositor/no-external-spinner-zombie-processes
Merge into: lp:unity-system-compositor
Diff against target: 271 lines (+180/-15)
5 files modified
src/external_spinner.cpp (+85/-4)
tests/integration-tests/CMakeLists.txt (+1/-0)
tests/integration-tests/spin_wait.cpp (+39/-0)
tests/integration-tests/spin_wait.h (+39/-0)
tests/integration-tests/test_external_spinner.cpp (+16/-11)
To merge this branch: bzr merge lp:~afrantzis/unity-system-compositor/no-external-spinner-zombie-processes
Reviewer Review Type Date Requested Status
Unity System Compositor Development Team Pending
Review via email: mp+264281@code.launchpad.net

Commit message

Don't leave zombie spinner processes

Description of the change

Don't leave zombie spinner processes

This MP uses the double-fork trick to avoid zombie spinner process. It works by forking an intermediate process which then forks again to execute the real spinner process. The intermediate process exits immediately, orphaning the spinner process, which is then adopted by init (pid 1). The init process then takes care of waiting for the process to exit.

To post a comment you must log in.
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Haven't looked at the branch in detail, but I'm surprised it needs much work at all... Simply calling wait() or waitpid() at the right times will avoid zombies. That's what "zombie process" means -- that the parent has not called wait() yet.

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

> Simply calling wait() or waitpid() at the right times will avoid zombies.
> That's what "zombie process" means -- that the parent has not called wait() yet.

That was my initial approach, but the problem with this is that waitpid() blocks USC for a little bit in the best case, and can block forever if for whatever reason the external spinner happens to not respond to our signals (e.g. it has hung).

The proposed way is indeed more complicated, but it avoid all blocking problems by transferring the responsibility to the init process.

Note that a lot of the code in this MP is just test improvements, and signal-safe wrappers around read/write.

Revision history for this message
Chris Halse Rogers (raof) wrote :

On Friday, 10 July 2015 6:21:03 PM AEST, Alexandros Frantzis wrote:
>> Simply calling wait() or waitpid() at the right times will avoid zombies.
>> That's what "zombie process" means -- that the parent has not
>> called wait() yet.
>
> That was my initial approach, but the problem with this is that
> waitpid() blocks USC for a little bit in the best case, and can
> block forever if for whatever reason the external spinner
> happens to not respond to our signals (e.g. it has hung).

The canonical way of handling this is a SIGCHILD handler; that way you only
ever call non-blocking waitpid.

That said, double-forking should work, too.

--
Sent using Dekko from my Ubuntu device

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

> The canonical way of handling this is a SIGCHILD handler; that way you only
> ever call non-blocking waitpid.

I considered this, but was concerned about interaction with libmirserver signal handling. If we think it is safe, though, I would rather use SIGCHILD too. I will propose an alternative MP with sigchild and we can decide.

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

Unmerged revisions

224. By Alexandros Frantzis

Don't leave zombie spinner processes

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'src/external_spinner.cpp'
--- src/external_spinner.cpp 2015-04-01 17:06:16 +0000
+++ src/external_spinner.cpp 2015-07-09 13:12:58 +0000
@@ -20,6 +20,64 @@
2020
21#include <unistd.h>21#include <unistd.h>
22#include <signal.h>22#include <signal.h>
23#include <sys/wait.h>
24
25namespace
26{
27
28void write_pid(int fd, pid_t pid)
29{
30 auto const pid_str = std::to_string(pid);
31 size_t nwritten = 0;
32
33 while (nwritten < pid_str.size())
34 {
35 auto const n = write(fd, pid_str.data() + nwritten, pid_str.size() - nwritten);
36 if (n < 0)
37 {
38 if (errno != EINTR)
39 return;
40 }
41 else if (n == 0)
42 {
43 return;
44 }
45 else
46 {
47 nwritten += n;
48 }
49 }
50}
51
52pid_t read_pid(int fd)
53{
54 std::string pid_str;
55 bool more_data_to_read = true;
56
57 while (more_data_to_read)
58 {
59 size_t const bufsize = 256;
60 char data[bufsize];
61 auto const nread = read(fd, data, bufsize);
62 if (nread < 0)
63 {
64 if (errno != EINTR)
65 return 0;
66 }
67 else if (nread == 0)
68 {
69 more_data_to_read = false;
70 }
71 else
72 {
73 pid_str.append(data, nread);
74 }
75 }
76
77 return std::atoi(pid_str.c_str());
78}
79
80}
2381
24usc::ExternalSpinner::ExternalSpinner(82usc::ExternalSpinner::ExternalSpinner(
25 std::string const& executable,83 std::string const& executable,
@@ -42,16 +100,39 @@
42 if (executable.empty() || spinner_pid)100 if (executable.empty() || spinner_pid)
43 return;101 return;
44102
45 auto pid = fork();103 /* Use a double fork to avoid zombie processes */
104 int pipefd[2] {0,0};
105 if (pipe(pipefd)) {}
106
107 auto const pid = fork();
46 if (!pid)108 if (!pid)
47 {109 {
48 setenv("MIR_SOCKET", mir_socket.c_str(), 1);110 close(pipefd[0]);
49 execlp(executable.c_str(), executable.c_str(), nullptr);111
112 auto const spid = fork();
113 if (!spid)
114 {
115 close(pipefd[1]);
116 setenv("MIR_SOCKET", mir_socket.c_str(), 1);
117 execlp(executable.c_str(), executable.c_str(), nullptr);
118 }
119 else
120 {
121 /* Send the spinner pid to grandparent (USC) */
122 write_pid(pipefd[1], spid);
123 close(pipefd[1]);
124 exit(0);
125 }
50 }126 }
51 else127 else
52 {128 {
53 spinner_pid = pid;129 close(pipefd[1]);
130 waitpid(pid, nullptr, 0);
131 /* Get spinner pid from grandchild */
132 spinner_pid = read_pid(pipefd[0]);
54 }133 }
134
135 close(pipefd[0]);
55}136}
56137
57void usc::ExternalSpinner::kill()138void usc::ExternalSpinner::kill()
58139
=== modified file 'tests/integration-tests/CMakeLists.txt'
--- tests/integration-tests/CMakeLists.txt 2015-04-07 13:36:48 +0000
+++ tests/integration-tests/CMakeLists.txt 2015-07-09 13:12:58 +0000
@@ -32,6 +32,7 @@
32 run_command.cpp32 run_command.cpp
33 dbus_bus.cpp33 dbus_bus.cpp
34 dbus_client.cpp34 dbus_client.cpp
35 spin_wait.cpp
35 test_dbus_event_loop.cpp36 test_dbus_event_loop.cpp
36 test_unity_screen_service.cpp37 test_unity_screen_service.cpp
37 test_external_spinner.cpp38 test_external_spinner.cpp
3839
=== added file 'tests/integration-tests/spin_wait.cpp'
--- tests/integration-tests/spin_wait.cpp 1970-01-01 00:00:00 +0000
+++ tests/integration-tests/spin_wait.cpp 2015-07-09 13:12:58 +0000
@@ -0,0 +1,39 @@
1/*
2 * Copyright © 2015 Canonical Ltd.
3 *
4 * This program is free software: you can redistribute it and/or modify it
5 * under the terms of the GNU General Public License version 3,
6 * as published by the Free Software Foundation.
7 *
8 * This program is distributed in the hope that it will be useful,
9 * but WITHOUT ANY WARRANTY; without even the implied warranty of
10 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
11 * GNU General Public License for more details.
12 *
13 * You should have received a copy of the GNU General Public License
14 * along with this program. If not, see <http://www.gnu.org/licenses/>.
15 *
16 * Authored by: Alexandros Frantzis <alexandros.frantzis@canonical.com>
17 */
18
19#include "spin_wait.h"
20
21#include <thread>
22
23bool usc::test::spin_wait_for_condition_or_timeout(
24 std::function<bool()> const& condition,
25 std::chrono::milliseconds timeout,
26 std::chrono::milliseconds spin_period)
27{
28 auto const end = std::chrono::steady_clock::now() + timeout;
29 bool condition_fulfilled = false;
30
31 while (std::chrono::steady_clock::now() < end &&
32 !(condition_fulfilled = condition()))
33 {
34 std::this_thread::sleep_for(spin_period);
35 }
36
37 return condition_fulfilled;
38}
39
040
=== added file 'tests/integration-tests/spin_wait.h'
--- tests/integration-tests/spin_wait.h 1970-01-01 00:00:00 +0000
+++ tests/integration-tests/spin_wait.h 2015-07-09 13:12:58 +0000
@@ -0,0 +1,39 @@
1/*
2 * Copyright © 2015 Canonical Ltd.
3 *
4 * This program is free software: you can redistribute it and/or modify it
5 * under the terms of the GNU General Public License version 3,
6 * as published by the Free Software Foundation.
7 *
8 * This program is distributed in the hope that it will be useful,
9 * but WITHOUT ANY WARRANTY; without even the implied warranty of
10 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
11 * GNU General Public License for more details.
12 *
13 * You should have received a copy of the GNU General Public License
14 * along with this program. If not, see <http://www.gnu.org/licenses/>.
15 *
16 * Authored by: Alexandros Frantzis <alexandros.frantzis@canonical.com>
17 */
18
19#ifndef USC_SPIN_WAIT_H_
20#define USC_SPIN_WAIT_H_
21
22#include <functional>
23#include <chrono>
24
25namespace usc
26{
27namespace test
28{
29
30
31bool spin_wait_for_condition_or_timeout(
32 std::function<bool()> const& condition,
33 std::chrono::milliseconds timeout,
34 std::chrono::milliseconds spin_period = std::chrono::milliseconds{10});
35
36}
37}
38
39#endif
040
=== modified file 'tests/integration-tests/test_external_spinner.cpp'
--- tests/integration-tests/test_external_spinner.cpp 2015-04-01 17:11:57 +0000
+++ tests/integration-tests/test_external_spinner.cpp 2015-07-09 13:12:58 +0000
@@ -18,6 +18,7 @@
1818
19#include "src/external_spinner.h"19#include "src/external_spinner.h"
20#include "run_command.h"20#include "run_command.h"
21#include "spin_wait.h"
2122
22#include <fstream>23#include <fstream>
23#include <chrono>24#include <chrono>
@@ -74,12 +75,21 @@
74{75{
75 std::vector<pid_t> spinner_pids()76 std::vector<pid_t> spinner_pids()
76 {77 {
77 return pidof(spinner_cmd);78 std::vector<pid_t> pids;
79
80 usc::test::spin_wait_for_condition_or_timeout(
81 [&pids, this] { pids = pidof(spinner_cmd); return !pids.empty(); },
82 timeout);
83
84 if (pids.empty())
85 BOOST_THROW_EXCEPTION(std::runtime_error("spinner_pids timed out"));
86
87 return pids;
78 }88 }
7989
80 std::vector<std::string> environment_of_spinner()90 std::vector<std::string> environment_of_spinner()
81 {91 {
82 auto const pids = pidof(spinner_cmd);92 auto const pids = spinner_pids();
83 if (pids.size() > 1)93 if (pids.size() > 1)
84 BOOST_THROW_EXCEPTION(std::runtime_error("Detected multiple spinner processes"));94 BOOST_THROW_EXCEPTION(std::runtime_error("Detected multiple spinner processes"));
85 std::vector<std::string> env;95 std::vector<std::string> env;
@@ -96,19 +106,14 @@
96106
97 void wait_for_spinner_to_terminate()107 void wait_for_spinner_to_terminate()
98 {108 {
99 auto const timeout = std::chrono::milliseconds{3000};109 usc::test::spin_wait_for_condition_or_timeout(
100 auto const expire = std::chrono::steady_clock::now() + timeout;110 [this] { return pidof(spinner_cmd).empty(); },
101111 timeout);
102 while (spinner_pids().size() > 0)
103 {
104 if (std::chrono::steady_clock::now() > expire)
105 BOOST_THROW_EXCEPTION(std::runtime_error("wait_for_no_spinner timed out"));
106 std::this_thread::sleep_for(std::chrono::milliseconds{10});
107 }
108 }112 }
109113
110 std::string const spinner_cmd{executable_path() + "/usc_test_helper_wait_for_signal"};114 std::string const spinner_cmd{executable_path() + "/usc_test_helper_wait_for_signal"};
111 std::string const mir_socket{"usc_mir_socket"};115 std::string const mir_socket{"usc_mir_socket"};
116 std::chrono::milliseconds const timeout{3000};
112 usc::ExternalSpinner spinner{spinner_cmd, mir_socket};117 usc::ExternalSpinner spinner{spinner_cmd, mir_socket};
113};118};
114119

Subscribers

People subscribed via source and target branches