Mir

Merge lp:~robertcarr/mir/session-transactions into lp:~mir-team/mir/trunk

Proposed by Robert Carr
Status: Rejected
Rejected by: Robert Carr
Proposed branch: lp:~robertcarr/mir/session-transactions
Merge into: lp:~mir-team/mir/trunk
Diff against target: 214 lines (+51/-17)
6 files modified
include/server/mir/shell/application_session.h (+3/-1)
include/server/mir/shell/session.h (+7/-0)
include/test/mir_test_doubles/mock_session.h (+1/-1)
src/server/shell/application_session.cpp (+15/-9)
src/server/shell/single_visibility_focus_mechanism.cpp (+11/-6)
tests/unit-tests/shell/test_single_visibility_focus_mechanism.cpp (+14/-0)
To merge this branch: bzr merge lp:~robertcarr/mir/session-transactions
Reviewer Review Type Date Requested Status
Alan Griffiths Needs Fixing
PS Jenkins bot (community) continuous-integration Approve
Robert Ancell Approve
Review via email: mp+175669@code.launchpad.net

Commit message

Perform focus setting transactionally, enabling the stress test suite to run.

Description of the change

This branch is a stab at fixing:

https://bugs.launchpad.net/mir/+bug/1195089

First the back story:

The way the crash from mir_stress presented itself from me was a throw in SessionManager::close_session "Invalid Session" when called from ~SessionMediator. That is to say, somehow we still have the msh::Session shared_ptr around but it has already been removed from the session manager. A bit of digging revealed the problem presents from SessionMediator::disconnect. SessionManager::close_session as called from SessionMediator::disconnect can throw in a few different ways, and when it does so the SessionMediator will be destroyed, but fail to reset the session pointer (which it does immediately following the call to SessionManager::close_session).

I tried logging this exception, and found that all the exceptions which could be thrown originated from setting the focus when closing a surface. I found the following two:

1. SingleVisibilityFocusMechanism requests the default_surface of a session, getting std::shared_ptr<msh::Session>. Now say, we are prempted by a frontend thread which closes the surface we have just grabbed. So now our msh::Session pointer is still valid, but the call to lock the internal ms::Surface will fail and throw an exception "Invalid surface" (when we call ->set_input_focus).
2. Things start the same way, but the surface is destroyed later, say after we enter AndroidInputTargeter::focus_changed(std::shared_ptr<mi::InputChannel>) we are preempted by a frontend thread destroying the surface. Now we try and find the input window handle for InputChannel (back in the focus_changed thread), but it has already been unregistered, and so we throw "Window handle requested for unregistered surface."

In order to solve this, I've introduced an API, msh::Session::transaction(std::function). The semantics of this function are easy to describe:

TEST(Session, create_and_destroy_surface_from_other_threads_blocks_during_transaction)

Unfortunately this test is quite difficult to write deterministic-ally so the function is essentially untested. I came up with one solution using real-time scheduling attributes (SCHED_FIFO), but this requires running the tests as root or using root to set a system capability on the test binary. If we want to do this/can think of a good way I can write the test.

With this branch I was able to run the stress test for several minutes and drag the cursor around. Eventually I got bored.

To post a comment you must log in.
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Robert Ancell (robert-ancell) wrote :

+ // from a second thread before input focus is set.xs
Use emacs much?

Revision history for this message
Robert Ancell (robert-ancell) wrote :

I'd prefer a second opinion on using the transaction function over just using a lock for consistency with the other code, but otherwise it seems to make sense.

review: Approve
858. By Robert Carr

Remove secret emacs signet ring

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

I see the need, but don't like the solution. Maybe my next article should be "recursive_mutex considered evil"?

What we are lacking is a clear synchronization strategy in our data model. (C.f. "The case for a "SceneGraph"" on mir-devel.)

I'm tempted to hold my nose and "Approve" but will try to think of a cleaner solution first.

review: Abstain
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

...having thought about it: doesn't the problem stem from not holding SessionManager::mutex when these actions occur?

void msh::SessionManager::close_session(std::shared_ptr<mf::Session> const& session)
{
    auto shell_session = std::dynamic_pointer_cast<Session>(session);

    session_listener->stopping(shell_session);

    app_container->remove_session(shell_session);

    std::unique_lock<std::mutex> lock(mutex);
    set_focus_to_locked(lock, focus_sequence->default_focus());
}

Note that the SessionManager::mutex isn't seized until after the remove_session() call.

inline void msh::SessionManager::set_focus_to_locked(std::unique_lock<std::mutex> const&, std::shared_ptr<Session> const& shell_session)
{
    auto old_focus = focus_application.lock();

    focus_application = shell_session;

    focus_setter->set_focus_to(shell_session);
    if (shell_session)
    {
        session_listener->focused(shell_session);
    }
    else
    {
        session_listener->unfocused();
    }
}

Note that the SessionManager::mutex isn't seized.

review: Needs Fixing
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

> inline void
> msh::SessionManager::set_focus_to_locked(std::unique_lock<std::mutex> const&,
> std::shared_ptr<Session> const& shell_session)
> {
> auto old_focus = focus_application.lock();
>
> focus_application = shell_session;
>
> focus_setter->set_focus_to(shell_session);
> if (shell_session)
> {
> session_listener->focused(shell_session);
> }
> else
> {
> session_listener->unfocused();
> }
> }
>
> Note that the SessionManager::mutex isn't seized.

Sorry, that last bit is a thinko.

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

Also, is there any need for DefaultSessionContainer::remove_session() to throw? It can (trivially) meet its post condition.

Revision history for this message
Robert Carr (robertcarr) wrote :

Holding the SessionManager mutex isn't enough. Not that close_surface isn't routed through the SessionManager which is where this race is triggered. The existing locking around close_session is asctually enough to prevent it.

DefaultSessionContainer::remove_session perhaps need not throw (though I think this usage is indicative of programmer error). Still though, that would leave the race around WindowHandleRepository.

More than those two, there is a race in every portion of the shell which has code like:

Step 1: Get surface from session
Step 2: Query properties from surface.

Step 2 can always throw as it stands and there is no reasonable way to get around it.

Revision history for this message
Robert Carr (robertcarr) wrote :

Does anyone have opinions?

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

OK, I'm beginning to see that the code is broken worse than I thought.

In any case, the problems arise from a poorly synchronized model - and part of that is that some locks are in the wrong place (i.e. in the controller).

Putting recursive locks into the controller seems like the wrong answer.

Sadly, fixing the model ("scenegraph") is a larger conversation.

review: Needs Fixing

Unmerged revisions

858. By Robert Carr

Remove secret emacs signet ring

857. By Robert Carr

Set focus transactionally

856. By Robert Carr

Hack

855. By Daniel van Vugt

buffer_swapper_spin.h: Remove dead code: initialize_queues.

Approved by PS Jenkins bot, Alan Griffiths.

854. By Stephen M. Webb

disable running the integration test suite on arm architecture during the packaging builds (lp: #1195265). Fixes: https://bugs.launchpad.net/bugs/1195260, https://bugs.launchpad.net/bugs/1195265.

Approved by PS Jenkins bot, Kevin DuBois.

853. By Alan Griffiths

doc: use house font and colors for coding guidelines.

Approved by PS Jenkins bot, Kevin DuBois, Alexandros Frantzis.

852. By Robert Carr

Implement a connection authorization mechanism.

Approved by Gerry Boland, Alan Griffiths, Robert Ancell, PS Jenkins bot.

851. By Robert Ancell

Releasing 0.0.7

850. By Robert Carr

Extract DefaultServerConfiguration::the_cursor_listener from DefaultServerConfiguration::the_input_configuration. Fixes: https://bugs.launchpad.net/bugs/1192916.

Approved by PS Jenkins bot, Alan Griffiths, Alexandros Frantzis, Robert Ancell.

849. By Alan Griffiths

geometry: make geometry compound types easier to construct. Fixes: https://bugs.launchpad.net/bugs/1199756.

Approved by PS Jenkins bot, Alexandros Frantzis.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'include/server/mir/shell/application_session.h'
2--- include/server/mir/shell/application_session.h 2013-06-25 13:19:15 +0000
3+++ include/server/mir/shell/application_session.h 2013-07-19 01:29:27 +0000
4@@ -66,6 +66,8 @@
5 void show();
6
7 int configure_surface(frontend::SurfaceId id, MirSurfaceAttrib attrib, int value);
8+
9+ void transaction(std::function<void()> const& body);
10
11 protected:
12 ApplicationSession(ApplicationSession const&) = delete;
13@@ -83,7 +85,7 @@
14
15 typedef std::map<frontend::SurfaceId, std::shared_ptr<Surface>> Surfaces;
16 Surfaces::const_iterator checked_find(frontend::SurfaceId id) const;
17- std::mutex mutable surfaces_mutex;
18+ std::recursive_mutex mutable surfaces_mutex;
19 Surfaces surfaces;
20 };
21
22
23=== modified file 'include/server/mir/shell/session.h'
24--- include/server/mir/shell/session.h 2013-06-25 13:19:15 +0000
25+++ include/server/mir/shell/session.h 2013-07-19 01:29:27 +0000
26@@ -22,6 +22,8 @@
27 #include "mir/frontend/session.h"
28 #include "mir/shell/snapshot.h"
29
30+#include <functional>
31+
32 namespace mir
33 {
34
35@@ -37,6 +39,11 @@
36
37 virtual void take_snapshot(SnapshotCallback const& snapshot_taken) = 0;
38 virtual std::shared_ptr<Surface> default_surface() const = 0;
39+
40+ /// Executes the body callback as a transaction. That is to say
41+ /// other requests to this session will block during the execution
42+ // of body.
43+ virtual void transaction(std::function<void()> const& body) = 0;
44 };
45
46 }
47
48=== modified file 'include/test/mir_test_doubles/mock_session.h'
49--- include/test/mir_test_doubles/mock_session.h 2013-05-21 17:16:43 +0000
50+++ include/test/mir_test_doubles/mock_session.h 2013-07-19 01:29:27 +0000
51@@ -42,7 +42,7 @@
52 MOCK_METHOD0(hide, void());
53 MOCK_METHOD0(show, void());
54
55- MOCK_METHOD3(configure_surface, int(frontend::SurfaceId, MirSurfaceAttrib, int));
56+ MOCK_METHOD3(configure_surface, int(frontend::SurfaceId, MirSurfaceAttrib, int));
57 };
58
59 }
60
61=== modified file 'src/server/shell/application_session.cpp'
62--- src/server/shell/application_session.cpp 2013-06-25 13:19:15 +0000
63+++ src/server/shell/application_session.cpp 2013-07-19 01:29:27 +0000
64@@ -56,7 +56,7 @@
65
66 msh::ApplicationSession::~ApplicationSession()
67 {
68- std::unique_lock<std::mutex> lock(surfaces_mutex);
69+ std::unique_lock<std::recursive_mutex> lock(surfaces_mutex);
70 for (auto const& pair_id_surface : surfaces)
71 {
72 pair_id_surface.second->destroy();
73@@ -73,7 +73,7 @@
74 auto const id = next_id();
75 auto surf = surface_factory->create_surface(params, id, event_sink);
76
77- std::unique_lock<std::mutex> lock(surfaces_mutex);
78+ std::unique_lock<std::recursive_mutex> lock(surfaces_mutex);
79 surfaces[id] = surf;
80 return id;
81 }
82@@ -90,7 +90,7 @@
83
84 std::shared_ptr<mf::Surface> msh::ApplicationSession::get_surface(mf::SurfaceId id) const
85 {
86- std::unique_lock<std::mutex> lock(surfaces_mutex);
87+ std::unique_lock<std::recursive_mutex> lock(surfaces_mutex);
88
89 return checked_find(id)->second;
90 }
91@@ -102,7 +102,7 @@
92
93 std::shared_ptr<msh::Surface> msh::ApplicationSession::default_surface() const
94 {
95- std::unique_lock<std::mutex> lock(surfaces_mutex);
96+ std::unique_lock<std::recursive_mutex> lock(surfaces_mutex);
97
98 if (surfaces.size())
99 return surfaces.begin()->second;
100@@ -112,7 +112,7 @@
101
102 void msh::ApplicationSession::destroy_surface(mf::SurfaceId id)
103 {
104- std::unique_lock<std::mutex> lock(surfaces_mutex);
105+ std::unique_lock<std::recursive_mutex> lock(surfaces_mutex);
106 auto p = checked_find(id);
107
108 p->second->destroy();
109@@ -126,7 +126,7 @@
110
111 void msh::ApplicationSession::force_requests_to_complete()
112 {
113- std::unique_lock<std::mutex> lock(surfaces_mutex);
114+ std::unique_lock<std::recursive_mutex> lock(surfaces_mutex);
115 for (auto& id_s : surfaces)
116 {
117 id_s.second->force_requests_to_complete();
118@@ -135,7 +135,7 @@
119
120 void msh::ApplicationSession::hide()
121 {
122- std::unique_lock<std::mutex> lock(surfaces_mutex);
123+ std::unique_lock<std::recursive_mutex> lock(surfaces_mutex);
124 for (auto& id_s : surfaces)
125 {
126 id_s.second->hide();
127@@ -144,7 +144,7 @@
128
129 void msh::ApplicationSession::show()
130 {
131- std::unique_lock<std::mutex> lock(surfaces_mutex);
132+ std::unique_lock<std::recursive_mutex> lock(surfaces_mutex);
133 for (auto& id_s : surfaces)
134 {
135 id_s.second->show();
136@@ -155,8 +155,14 @@
137 MirSurfaceAttrib attrib,
138 int value)
139 {
140- std::unique_lock<std::mutex> lock(surfaces_mutex);
141+ std::unique_lock<std::recursive_mutex> lock(surfaces_mutex);
142 std::shared_ptr<mf::Surface> surf(checked_find(id)->second);
143
144 return surf->configure(attrib, value);
145 }
146+
147+void msh::ApplicationSession::transaction(std::function<void()> const& body)
148+{
149+ std::unique_lock<std::recursive_mutex> lg(surfaces_mutex);
150+ body();
151+}
152
153=== modified file 'src/server/shell/single_visibility_focus_mechanism.cpp'
154--- src/server/shell/single_visibility_focus_mechanism.cpp 2013-05-28 15:44:18 +0000
155+++ src/server/shell/single_visibility_focus_mechanism.cpp 2013-07-19 01:29:27 +0000
156@@ -40,13 +40,18 @@
157 [&](std::shared_ptr<mf::Session> const& session) {
158 if (session == focus_session)
159 {
160- auto surface = focus_session->default_surface();
161- if (surface)
162+ // We use a transaction to guarantee the surface will not be destroyed
163+ // from a second thread before input focus is set.
164+ focus_session->transaction([&]()
165 {
166- surface->take_input_focus(input_targeter);
167- set_input_focus = true;
168- }
169- session->show();
170+ auto surface = focus_session->default_surface();
171+ if (surface)
172+ {
173+ surface->take_input_focus(input_targeter);
174+ set_input_focus = true;
175+ }
176+ session->show();
177+ });
178 }
179 else
180 {
181
182=== modified file 'tests/unit-tests/shell/test_single_visibility_focus_mechanism.cpp'
183--- tests/unit-tests/shell/test_single_visibility_focus_mechanism.cpp 2013-06-25 13:19:15 +0000
184+++ tests/unit-tests/shell/test_single_visibility_focus_mechanism.cpp 2013-07-19 01:29:27 +0000
185@@ -60,6 +60,15 @@
186 MOCK_METHOD0(show, void());
187
188 MOCK_METHOD3(configure_surface, int(mf::SurfaceId, MirSurfaceAttrib, int));
189+
190+ MOCK_METHOD0(mock_transaction_started, void());
191+ MOCK_METHOD0(mock_transaction_finished, void());
192+ void transaction(std::function<void()> const& body)
193+ {
194+ mock_transaction_started();
195+ body();
196+ mock_transaction_finished();
197+ }
198 };
199
200 TEST(SingleVisibilityFocusMechanism, mechanism_sets_visibility)
201@@ -114,8 +123,13 @@
202
203 {
204 InSequence seq;
205+ EXPECT_CALL(app1, mock_transaction_started()).Times(1);
206 EXPECT_CALL(mock_surface, take_input_focus(_)).Times(1);
207+ EXPECT_CALL(app1, mock_transaction_finished()).Times(1);
208 // When we have no default surface.
209+ EXPECT_CALL(app1, mock_transaction_started()).Times(1);
210+ EXPECT_CALL(app1, mock_transaction_finished()).Times(1);
211+
212 EXPECT_CALL(targeter, focus_cleared()).Times(1);
213 // When we have no session.
214 EXPECT_CALL(targeter, focus_cleared()).Times(1);

Subscribers

People subscribed via source and target branches