Mir

Merge lp:~albaguirre/mir/alternate-fix-1285215 into lp:mir

Proposed by Alberto Aguirre
Status: Merged
Approved by: Alberto Aguirre
Approved revision: no longer in the source branch.
Merged at revision: 1560
Proposed branch: lp:~albaguirre/mir/alternate-fix-1285215
Merge into: lp:mir
Diff against target: 278 lines (+172/-2)
8 files modified
include/test/mir_test_framework/display_server_test_fixture.h (+1/-0)
include/test/mir_test_framework/testing_process_manager.h (+1/-0)
src/server/frontend/published_socket_connector.cpp (+65/-1)
tests/acceptance-tests/test_server_startup.cpp (+31/-0)
tests/mir_test_framework/display_server_test_fixture.cpp (+7/-0)
tests/mir_test_framework/process.cpp (+5/-0)
tests/mir_test_framework/testing_process_manager.cpp (+19/-0)
tests/mir_test_framework/testing_server_options.cpp (+43/-1)
To merge this branch: bzr merge lp:~albaguirre/mir/alternate-fix-1285215
Reviewer Review Type Date Requested Status
PS Jenkins bot (community) continuous-integration Approve
Alexandros Frantzis (community) Approve
Robert Carr (community) Approve
Alan Griffiths Approve
Review via email: mp+215320@code.launchpad.net

Commit message

Remove stale socket files (LP: #1285215)

Stale socket files can be detected by checking the active socket connection table
at /proc/net/unix.
Stale socket files are files that exist, are of socket type
but don't have an active socket connection entry.

Description of the change

Remove stale socket files (LP: #1285215)

Stale socket files can be detected by checking the active socket connection table
at /proc/net/unix.
Stale socket files are files that exist, are of socket type
but don't have an active socket connection entry.

This is an alternative solution to switching to using the abstract socket namespace.

To post a comment you must log in.
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

60 + socket_path.insert(std::begin(socket_path), ' ');
69 + if (line.find(socket_path) != std::string::npos)

We also need to ensure that what we have found is a full word match, not a partial match. For example searching for '/tmp/mir_socket' shouldn't match an entry for '/tmp/mir_socket_host'.

review: Needs Fixing
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

> 60 + socket_path.insert(std::begin(socket_path), ' ');
> 69 + if (line.find(socket_path) != std::string::npos)
>
> We also need to ensure that what we have found is a full word match, not a
> partial match. For example searching for '/tmp/mir_socket' shouldn't match an
> entry for '/tmp/mir_socket_host'.

+1

(I'm still not convinced we should be doing this at all.)

review: Needs Fixing
Revision history for this message
Robert Carr (robertcarr) wrote :

I would feel better if we knew how the socket could get orphaned now? What sort of signals?

review: Needs Information
Revision history for this message
Robert Carr (robertcarr) wrote :

run_mir.cpp:
>> auto const intercepted = { SIGQUIT, SIGABRT, SIGFPE, SIGSEGV, SIGBUS };

SIGPIPE looks like the most likely culprit?

Revision history for this message
Alberto Aguirre (albaguirre) wrote :

> I would feel better if we knew how the socket could get orphaned now? What
> sort of signals?

SIGKILL for sure :)

Revision history for this message
Alberto Aguirre (albaguirre) wrote :

> > I would feel better if we knew how the socket could get orphaned now? What
> > sort of signals?
>
> SIGKILL for sure :)

For example what if you driver hangs and you need to really SIGKILL the server? Some would argue it is the responsibility of whoever sent SIGKILL to remove the stale socket - a fair point. However I - obviously since I submitted this MP - am of the opinion that if we can detect the stale socket then mir should be able to remove it.

Revision history for this message
Alberto Aguirre (albaguirre) wrote :

> We also need to ensure that what we have found is a full word match, not a
> partial match. For example searching for '/tmp/mir_socket' shouldn't match an
> entry for '/tmp/mir_socket_host'.

Fixed, although the only effect it had was a false positive detection of an active socket connection.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

This adds complexity that I don't think we need for a small convenience when the server is SIGKILLed. But the impact is localized and not worth arguing over.

review: Approve
Revision history for this message
Robert Carr (robertcarr) :
review: Approve
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

OK.

review: Approve
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Alberto Aguirre (albaguirre) wrote :

^--Looks like a CI timeout to me...

Revision history for this message
PS Jenkins bot (ps-jenkins) :
review: Approve (continuous-integration)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'include/test/mir_test_framework/display_server_test_fixture.h'
2--- include/test/mir_test_framework/display_server_test_fixture.h 2014-03-06 06:05:17 +0000
3+++ include/test/mir_test_framework/display_server_test_fixture.h 2014-04-14 23:26:11 +0000
4@@ -65,6 +65,7 @@
5
6 bool shutdown_server_process();
7 Result wait_for_shutdown_server_process();
8+ bool kill_server_process();
9 void kill_client_processes();
10 void terminate_client_processes();
11
12
13=== modified file 'include/test/mir_test_framework/testing_process_manager.h'
14--- include/test/mir_test_framework/testing_process_manager.h 2014-03-06 06:05:17 +0000
15+++ include/test/mir_test_framework/testing_process_manager.h 2014-04-14 23:26:11 +0000
16@@ -58,6 +58,7 @@
17 void tear_down_server();
18 void tear_down_all();
19 Result shutdown_server_process();
20+ Result kill_server_process();
21 Result wait_for_shutdown_server_process();
22 void kill_client_processes();
23 void terminate_client_processes();
24
25=== modified file 'src/server/frontend/published_socket_connector.cpp'
26--- src/server/frontend/published_socket_connector.cpp 2014-03-06 06:05:17 +0000
27+++ src/server/frontend/published_socket_connector.cpp 2014-04-14 23:26:11 +0000
28@@ -26,18 +26,82 @@
29 #include <boost/throw_exception.hpp>
30
31 #include <sys/socket.h>
32+#include <sys/stat.h>
33+
34+#include <cstdio>
35+#include <fstream>
36
37 namespace mf = mir::frontend;
38 namespace mfd = mir::frontend::detail;
39 namespace ba = boost::asio;
40
41+namespace
42+{
43+
44+bool socket_file_exists(std::string const& filename)
45+{
46+ struct stat statbuf;
47+ bool exists = (0 == stat(filename.c_str(), &statbuf));
48+ /* Avoid removing non-socket files */
49+ bool is_socket_type = (statbuf.st_mode & S_IFMT) == S_IFSOCK;
50+ return exists && is_socket_type;
51+}
52+
53+bool socket_exists(std::string const& socket_name)
54+{
55+ try
56+ {
57+ std::string socket_path{socket_name};
58+
59+ /* In case an abstract socket name exists with the same name*/
60+ socket_path.insert(std::begin(socket_path), ' ');
61+
62+ /* If the name is contained in this table, it signifies
63+ * a process is truly using that socket connection
64+ */
65+ std::ifstream socket_names_file("/proc/net/unix");
66+ std::string line;
67+ while (std::getline(socket_names_file, line))
68+ {
69+ auto index = line.find(socket_path);
70+ /* check for complete match */
71+ if (index != std::string::npos &&
72+ (index + socket_path.length()) == line.length())
73+ {
74+ return true;
75+ }
76+ }
77+ }
78+ catch (...)
79+ {
80+ /* Assume the socket exists */
81+ return true;
82+ }
83+ return false;
84+}
85+
86+std::string remove_if_stale(std::string const& socket_name)
87+{
88+ if (socket_file_exists(socket_name) && !socket_exists(socket_name))
89+ {
90+ if (std::remove(socket_name.c_str()) != 0)
91+ {
92+ BOOST_THROW_EXCEPTION(
93+ boost::enable_error_info(
94+ std::runtime_error("Failed removing stale socket file")) << boost::errinfo_errno(errno));
95+ }
96+ }
97+ return socket_name;
98+}
99+}
100+
101 mf::PublishedSocketConnector::PublishedSocketConnector(
102 const std::string& socket_file,
103 std::shared_ptr<SessionCreator> const& session_creator,
104 int threads,
105 std::shared_ptr<ConnectorReport> const& report)
106 : BasicConnector(session_creator, threads, report),
107- socket_file(socket_file),
108+ socket_file(remove_if_stale(socket_file)),
109 acceptor(io_service, socket_file)
110 {
111 start_accept();
112
113=== modified file 'tests/acceptance-tests/test_server_startup.cpp'
114--- tests/acceptance-tests/test_server_startup.cpp 2013-01-11 23:47:40 +0000
115+++ tests/acceptance-tests/test_server_startup.cpp 2014-04-14 23:26:11 +0000
116@@ -49,4 +49,35 @@
117
118 launch_client_process(client_config);
119 }
120+
121+TEST_F(BespokeDisplayServerTestFixture, can_start_new_instance_after_sigkill)
122+{
123+ ASSERT_FALSE(mtf::detect_server(mtf::test_socket_file(), std::chrono::milliseconds(0)));
124+
125+ TestingServerConfiguration config;
126+ launch_server_process(config);
127+
128+ run_in_test_process([&]
129+ {
130+ /* Under valgrind, raise(SIGKILL) results in a memcheck orphan process
131+ * we kill the server process from the test process instead
132+ */
133+ EXPECT_TRUE(kill_server_process());
134+
135+ /* Attempt to start a new server instance */
136+ TestingServerConfiguration server_config;
137+ launch_server_process(server_config);
138+
139+ struct ClientConfig : TestingClientConfiguration
140+ {
141+ void exec()
142+ {
143+ EXPECT_TRUE(mtf::detect_server(mtf::test_socket_file(),
144+ std::chrono::milliseconds(100)));
145+ }
146+ } client_config;
147+
148+ launch_client_process(client_config);
149+ });
150+}
151 }
152
153=== modified file 'tests/mir_test_framework/display_server_test_fixture.cpp'
154--- tests/mir_test_framework/display_server_test_fixture.cpp 2014-03-06 06:05:17 +0000
155+++ tests/mir_test_framework/display_server_test_fixture.cpp 2014-04-14 23:26:11 +0000
156@@ -75,6 +75,13 @@
157 || result.reason == TerminationReason::unknown;
158 }
159
160+bool BespokeDisplayServerTestFixture::kill_server_process()
161+{
162+ auto const& result = process_manager.kill_server_process();
163+ return result.reason == TerminationReason::child_terminated_by_signal &&
164+ result.signal == SIGKILL;
165+}
166+
167 mtf::Result BespokeDisplayServerTestFixture::wait_for_shutdown_server_process()
168 {
169 return process_manager.wait_for_shutdown_server_process();
170
171=== modified file 'tests/mir_test_framework/process.cpp'
172--- tests/mir_test_framework/process.cpp 2014-03-06 06:05:17 +0000
173+++ tests/mir_test_framework/process.cpp 2014-04-14 23:26:11 +0000
174@@ -122,6 +122,11 @@
175 result.signal = WTERMSIG(status);
176 break;
177 }
178+ else {
179+ terminated = true;
180+ result.reason = TerminationReason::unknown;
181+ break;
182+ }
183 }
184 else if (rc == 0)
185 {
186
187=== modified file 'tests/mir_test_framework/testing_process_manager.cpp'
188--- tests/mir_test_framework/testing_process_manager.cpp 2014-03-06 06:05:17 +0000
189+++ tests/mir_test_framework/testing_process_manager.cpp 2014-04-14 23:26:11 +0000
190@@ -165,6 +165,25 @@
191 return result;
192 }
193
194+mtf::Result mtf::TestingProcessManager::kill_server_process()
195+{
196+ Result result;
197+
198+ if (server_process)
199+ {
200+ server_process->kill();
201+ result = server_process->wait_for_termination();
202+ server_process.reset();
203+ return result;
204+ }
205+ else
206+ {
207+ result.reason = TerminationReason::unknown;
208+ }
209+
210+ return result;
211+}
212+
213 mtf::Result mtf::TestingProcessManager::wait_for_shutdown_server_process()
214 {
215 Result result;
216
217=== modified file 'tests/mir_test_framework/testing_server_options.cpp'
218--- tests/mir_test_framework/testing_server_options.cpp 2014-03-06 06:05:17 +0000
219+++ tests/mir_test_framework/testing_server_options.cpp 2014-04-14 23:26:11 +0000
220@@ -23,8 +23,50 @@
221 #include <boost/exception/errinfo_errno.hpp>
222 #include <boost/throw_exception.hpp>
223
224+#include <random>
225+#include <fstream>
226+
227 namespace mtf = mir_test_framework;
228
229+namespace
230+{
231+
232+bool socket_exists(std::string const& socket_name)
233+{
234+ std::string socket_path{socket_name};
235+ socket_path.insert(std::begin(socket_path), ' ');
236+
237+ std::ifstream socket_names_file("/proc/net/unix");
238+ std::string line;
239+ while (std::getline(socket_names_file, line))
240+ {
241+ if (line.find(socket_path) != std::string::npos)
242+ return true;
243+ }
244+ return false;
245+}
246+
247+std::string create_random_socket_name()
248+{
249+ std::random_device random_device;
250+ std::mt19937 generator(random_device());
251+ int max_concurrent_test_instances = 99999;
252+ std::uniform_int_distribution<> dist(1, max_concurrent_test_instances);
253+ std::string const suffix{"-mir_socket_test"};
254+
255+ /* check for name collisions against other test instances
256+ * running concurrently
257+ */
258+ for (int i = 0; i < max_concurrent_test_instances; i++)
259+ {
260+ std::string name{std::to_string(dist(generator))};
261+ name.append(suffix);
262+ if (!socket_exists(name))
263+ return name;
264+ }
265+ throw std::runtime_error("Too many test socket instances exist!");
266+}
267+}
268
269 mtf::TestingServerConfiguration::TestingServerConfiguration() :
270 using_server_started_sync(false)
271@@ -99,6 +141,6 @@
272
273 std::string const& mtf::test_socket_file()
274 {
275- static const std::string socket_file{"./mir_socket_test"};
276+ static const std::string socket_file{create_random_socket_name()};
277 return socket_file;
278 }

Subscribers

People subscribed via source and target branches