Mir

Merge lp:~afrantzis/mir/fix-1439719-simple-dispatch-thread-eintr into lp:mir

Proposed by Alexandros Frantzis
Status: Merged
Approved by: Alan Griffiths
Approved revision: no longer in the source branch.
Merged at revision: 2461
Proposed branch: lp:~afrantzis/mir/fix-1439719-simple-dispatch-thread-eintr
Merge into: lp:mir
Diff against target: 157 lines (+84/-12)
4 files modified
src/common/dispatch/simple_dispatch_thread.cpp (+28/-12)
tests/include/mir_test/signal.h (+2/-0)
tests/mir_test/signal.cpp (+6/-0)
tests/unit-tests/dispatch/test_simple_dispatch_thread.cpp (+48/-0)
To merge this branch: bzr merge lp:~afrantzis/mir/fix-1439719-simple-dispatch-thread-eintr
Reviewer Review Type Date Requested Status
PS Jenkins bot (community) continuous-integration Approve
Alan Griffiths Approve
Chris Halse Rogers Approve
Robert Carr (community) Approve
Review via email: mp+255171@code.launchpad.net

Commit message

common: Fix SimpleDispatchThreadTest signal interruption bug (LP: #1439719)

Description of the change

common: Fix SimpleDispatchThreadTest signal interruption bug (LP: #1439719)

To post a comment you must log in.
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Robert Carr (robertcarr) wrote :

LGTM

review: Approve
Revision history for this message
Chris Halse Rogers (raof) wrote :

Ah, it's Linux violating the POSIX mandated behaviour for a small set of system calls, of which epoll_wait happens to be one. Great!

Reading the documentation it seems that this behaviour won't occur on any of our other system calls. Specifically, our calls to readmsg/sendmsg will not error out with EINTR after receiving a SIGSTOP/SIGCONT pair.

LGTM.

review: Approve
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

OK

review: Approve
Revision history for this message
PS Jenkins bot (ps-jenkins) :
review: Approve (continuous-integration)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/common/dispatch/simple_dispatch_thread.cpp'
2--- src/common/dispatch/simple_dispatch_thread.cpp 2015-03-31 02:35:42 +0000
3+++ src/common/dispatch/simple_dispatch_thread.cpp 2015-04-03 09:11:56 +0000
4@@ -23,6 +23,7 @@
5 #include <sys/epoll.h>
6 #include <unistd.h>
7 #include <system_error>
8+#include <array>
9 #include <signal.h>
10 #include <boost/exception/all.hpp>
11
12@@ -60,28 +61,43 @@
13
14 for (;;)
15 {
16- epoll_event events[2];
17- if (epoll_wait(epoll_fd, events, sizeof(events), -1) > 1)
18+ std::array<epoll_event,2> events;
19+ auto const num_available_events =
20+ epoll_wait(epoll_fd, events.data(), events.size(), -1);
21+
22+ if (num_available_events == 1)
23+ {
24+ if (events[0].data.u32 == fd_names::dispatchee_fd)
25+ {
26+ if (!dispatchee->dispatch(md::epoll_to_fd_event(events[0])))
27+ {
28+ // No need to keep looping, the Dispatchable's not going to produce any more events.
29+ return;
30+ }
31+ }
32+ else if (events[0].data.u32 == fd_names::shutdown)
33+ {
34+ // The only thing we do with the shutdown fd is to close it.
35+ return;
36+ }
37+ }
38+ else if (num_available_events > 1)
39 {
40 // Because we only have two fds in the epoll, if there is more than one
41 // event pending then one of them must be a shutdown event.
42 // So, shutdown.
43 return;
44 }
45-
46- if (events[0].data.u32 == fd_names::dispatchee_fd)
47+ else if (num_available_events < 0)
48 {
49- if (!dispatchee->dispatch(md::epoll_to_fd_event(events[0])))
50+ // Although we have blocked signals in this thread, we can still
51+ // get interrupted by SIGSTOP (which is unblockable and non-fatal).
52+ if (errno != EINTR)
53 {
54- // No need to keep looping, the Dispatchable's not going to produce any more events.
55- return;
56+ BOOST_THROW_EXCEPTION((std::system_error{
57+ errno, std::system_category(), "Failed to wait for epoll events"}));
58 }
59 }
60- else if (events[0].data.u32 == fd_names::shutdown)
61- {
62- // The only thing we do with the shutdown fd is to close it.
63- return;
64- }
65 }
66 }
67
68
69=== modified file 'tests/include/mir_test/signal.h'
70--- tests/include/mir_test/signal.h 2015-03-31 02:35:42 +0000
71+++ tests/include/mir_test/signal.h 2015-04-03 09:11:56 +0000
72@@ -50,6 +50,8 @@
73 return cv.wait_until(lock, time, [this]() { return signalled; });
74 }
75
76+ void reset();
77+
78 private:
79 std::mutex mutex;
80 std::condition_variable cv;
81
82=== modified file 'tests/mir_test/signal.cpp'
83--- tests/mir_test/signal.cpp 2015-03-31 02:35:42 +0000
84+++ tests/mir_test/signal.cpp 2015-04-03 09:11:56 +0000
85@@ -38,3 +38,9 @@
86 std::unique_lock<decltype(mutex)> lock(mutex);
87 cv.wait(lock, [this]() { return signalled; });
88 }
89+
90+void mt::Signal::reset()
91+{
92+ std::lock_guard<decltype(mutex)> lock(mutex);
93+ signalled = false;
94+}
95
96=== modified file 'tests/unit-tests/dispatch/test_simple_dispatch_thread.cpp'
97--- tests/unit-tests/dispatch/test_simple_dispatch_thread.cpp 2015-03-31 02:35:42 +0000
98+++ tests/unit-tests/dispatch/test_simple_dispatch_thread.cpp 2015-04-03 09:11:56 +0000
99@@ -22,6 +22,8 @@
100 #include "mir_test/pipe.h"
101 #include "mir_test/signal.h"
102 #include "mir_test/test_dispatchable.h"
103+#include "mir_test_framework/process.h"
104+#include "mir_test/cross_process_action.h"
105
106 #include <fcntl.h>
107
108@@ -215,3 +217,49 @@
109
110 EXPECT_TRUE(dispatched->wait_for(10s));
111 }
112+
113+// Regression test for: lp #1439719
114+// The bug involves uninitialized memory and is also sensitive to signal
115+// timings, so this test does not always catch the problem. However, repeated
116+// runs (~300, YMMV) consistently fail when run against the problematic code.
117+TEST_F(SimpleDispatchThreadTest, keeps_dispatching_after_signal_interruption)
118+{
119+ using namespace std::chrono_literals;
120+ mt::CrossProcessAction stop_and_restart_process;
121+
122+ auto child = mir_test_framework::fork_and_run_in_a_different_process(
123+ [&]
124+ {
125+ auto dispatched = std::make_shared<mt::Signal>();
126+ auto dispatchable = std::make_shared<mt::TestDispatchable>(
127+ [dispatched]() { dispatched->raise(); });
128+
129+ md::SimpleDispatchThread dispatcher{dispatchable};
130+ // Ensure the dispatcher has started
131+ dispatchable->trigger();
132+ EXPECT_TRUE(dispatched->wait_for(1s));
133+
134+ stop_and_restart_process();
135+
136+ dispatched->reset();
137+ // The dispatcher shouldn't have been affected by the signal
138+ dispatchable->trigger();
139+ EXPECT_TRUE(dispatched->wait_for(1s));
140+ exit(HasFailure() ? EXIT_FAILURE : EXIT_SUCCESS);
141+ },
142+ []{ return 1; });
143+
144+ stop_and_restart_process.exec(
145+ [child]
146+ {
147+ // Increase chances of interrupting the dispatch mechanism
148+ for (int i = 0; i < 100; ++i)
149+ {
150+ child->stop();
151+ child->cont();
152+ }
153+ });
154+
155+ auto const result = child->wait_for_termination(10s);
156+ EXPECT_TRUE(result.succeeded());
157+}

Subscribers

People subscribed via source and target branches