Mir

Merge lp:~robertcarr/mir/fix-focus-races into lp:~mir-team/mir/trunk

Proposed by Robert Carr
Status: Superseded
Proposed branch: lp:~robertcarr/mir/fix-focus-races
Merge into: lp:~mir-team/mir/trunk
Diff against target: 288 lines (+44/-14)
5 files modified
include/server/mir/shell/application_session.h (+1/-1)
include/server/mir/shell/surface.h (+3/-0)
src/server/shell/application_session.cpp (+0/-10)
src/server/shell/surface.cpp (+40/-1)
tests/unit-tests/shell/test_application_session.cpp (+0/-2)
To merge this branch: bzr merge lp:~robertcarr/mir/fix-focus-races
Reviewer Review Type Date Requested Status
Daniel van Vugt Needs Fixing
PS Jenkins bot (community) continuous-integration Needs Fixing
Alan Griffiths Abstain
Alexandros Frantzis (community) Needs Information
Review via email: mp+184404@code.launchpad.net

This proposal has been superseded by a proposal from 2013-09-19.

Commit message

Fix the races around focus. Essentially these stem from that holding std::shared_ptr<msh::Surface> doesn't guarantee that calling methods on that surface will be safe (the underlying surface could be destroyed via Surface::destroy and then your method call will throw).

So this branch changes ApplicationSession to not destroy the surface explicitly and instead rely on ~msh::Surface, which means that you really can hold a safe pointer. I think this as a reasonable choice, as the shell might want to do similar things, i.e. continue to animate a surface after a session has closed it.

Description of the change

Fix the races around focus. Essentially these stem from that holding std::shared_ptr<msh::Surface> doesn't guarantee that calling methods on that surface will be safe (the underlying surface could be destroyed via Surface::destroy and then your method call will throw).

So this branch changes ApplicationSession to not destroy the surface explicitly and instead rely on ~msh::Surface, which means that you really can hold a safe pointer. I think this as a reasonable choice, as the shell might want to do similar things, i.e. continue to animate a surface after a session has closed it.

To post a comment you must log in.
Revision history for this message
Robert Carr (robertcarr) wrote :

I was able to run the stress test with continuous cursor input for the full ten minute run. There is also good reason to believe this fixes: https://bugs.launchpad.net/mir/+bug/1220845

The exception in ~SessionMediator can be triggered if session_manager->close_session throws (due to this focus race), causing SessionMediator::disconnect to fail to reset it's session member-variable pointer. This exception is caught in frontend. However now in ~SessionMediator the pointer has not been reset so it closes the session again throwing an Invalid Session exception. causing the session to be closed again in ~

Revision history for this message
Robert Carr (robertcarr) wrote :

The unsafely held surface pointer in this case is the previously focused app.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

There is an inconsistency with this approach: when a session is closed its surface resources are destroyed immediately, but when a surface is explicitly destroyed the resources are not destroyed until there are no more users.

Would it (make sense|be possible) for msh::Surface to own (hold a strong pointer to) ms::Surface and rely on ~msh::Surface() in both the cases above, thus removing the msh::Surface -> ms::Surface weak_ptr dance?

review: Needs Information
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

> There is an inconsistency with this approach: when a session is closed its
> surface resources are destroyed immediately, but when a surface is explicitly
> destroyed the resources are not destroyed until there are no more users.
>
> Would it (make sense|be possible) for msh::Surface to own (hold a strong
> pointer to) ms::Surface and rely on ~msh::Surface() in both the cases above,
> thus removing the msh::Surface -> ms::Surface weak_ptr dance?

I've not yet looked at the MP, but I can contribute the origin of the "weak_ptr dance".

Once a surface has been "destroyed" it is no longer in the scenegraph (ok, SurfaceStack) and therefore not composited. If the ms::Surface continued to exist it could reach a state where it has client buffers meaning threads that require them will wait forever.

It may well be that subsequent changes (particularly the introduction of force_requests_to_complete()) permits a better solution than the "weak_ptr dance".

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

I don't think this is the right evolution of the code, but I don't think it makes things worse.

The right solution would be more complex and should probably come as part of fixing the data model.

review: Abstain
lp:~robertcarr/mir/fix-focus-races updated
1055. By Robert Carr

Default ApplicationSession constructor

Revision history for this message
Robert Carr (robertcarr) wrote :

I think it makes sense to not destroy the surfaces explicitly from the session (as in the last committ). It may be possible to fix the weak_ptr dance (it almost surely is possible). It's not really clear what direction this code should go in right now though (except probably not further down this one), so I am trying to just fix the crashes non invasively

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Bug 1216727 is still happening :(

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<std::logic_error> >'
  what(): Requesting handle for an unregistered channel

Should we just unlink it from the branch?

review: Needs Fixing
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

P.S. Before declaring bug 1216727 fixed, please find a machine on which you can reproduce it (using trunk). To be sure the changes are actually having an effect.

lp:~robertcarr/mir/fix-focus-races updated
1056. By Robert Carr

Mutex

1057. By Robert Carr

Recursive mutex

Revision history for this message
Robert Carr (robertcarr) wrote :

Can someone review this again?

Revision history for this message
Daniel van Vugt (vanvugt) :
review: Abstain
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
lp:~robertcarr/mir/fix-focus-races updated
1058. By Robert Carr

Merge trunk

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Weirdly, merging this branch now causes an almost immediate hang:

Running tests...
Test project /home/dan/bzr/mir/tmp.focus/build
        Start 1: LGPL-required
  1/145 Test #1: LGPL-required ............................................... Passed 0.01 sec
        Start 2: GPL-required
  2/145 Test #2: GPL-required ................................................ Passed 0.01 sec
        Start 3: acceptance-tests.BespokeDisplayServerTestFixture.*

review: Needs Fixing
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Another bug with this branch is that on Alt+Tab in demo_server_shell, the client rendering freezes. Permanently.

review: Needs Fixing

Unmerged revisions

1058. By Robert Carr

Merge trunk

1057. By Robert Carr

Recursive mutex

1056. By Robert Carr

Mutex

1055. By Robert Carr

Default ApplicationSession constructor

1054. By Robert Carr

Do not destroy underlying surfaces until the msh::Surface d'tor

1053. By Eleni Maria Stea

graphics: Pull in Eleni's changes to get the DRM fd to init GBM from the host Mir instance.

Approved by PS Jenkins bot, Alexandros Frantzis, Kevin DuBois, Robert Ancell.

1052. By PS Jenkins bot

Releasing 0.0.10+13.10.20130904-0ubuntu1 (revision 1051 from lp:mir).

Approved by PS Jenkins bot.

1051. By Kevin DuBois

fix lp:1220441 (a test for android display ID was not updated). Fixes: https://bugs.launchpad.net/bugs/1220441.

Approved by Daniel van Vugt, PS Jenkins bot, Robert Ancell.

1050. By PS Jenkins bot

Releasing 0.0.10+13.10.20130903-0ubuntu1 (revision 1049 from lp:mir).

Approved by PS Jenkins bot.

1049. By Daniel van Vugt

SwitchingBundle: Simplify and clarify guarantees that compositor_acquire
always has a buffer to return without blocking or throwing an exception.

The trade-off is that to enforce the guarantee we need to permanently
reserve one buffer for compositing. This reduces the flexibility of
SwitchingBundle a little, such that minimum nbuffers is now 2.

This was originally requested by Alexandros, as the potential throw concerned
him. Although, it was logically guaranteed to never happen for other reasons.

The second reason for doing this is to eliminate recycling logic, which
while safe and correct, was quite confusing. So this change further proves
that that logic (now removed) is not to blame for frame ordering bugs.
.

Approved by Robert Ancell, Alan Griffiths, PS Jenkins bot.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'include/server/mir/shell/application_session.h'
2--- include/server/mir/shell/application_session.h 2013-08-28 03:41:48 +0000
3+++ include/server/mir/shell/application_session.h 2013-09-19 16:20:57 +0000
4@@ -46,7 +46,7 @@
5 std::shared_ptr<SessionListener> const& session_listener,
6 std::shared_ptr<frontend::EventSink> const& sink);
7
8- ~ApplicationSession();
9+ ~ApplicationSession() = default;
10
11 frontend::SurfaceId create_surface(shell::SurfaceCreationParameters const& params);
12 void destroy_surface(frontend::SurfaceId surface);
13
14=== modified file 'include/server/mir/shell/surface.h'
15--- include/server/mir/shell/surface.h 2013-08-28 03:41:48 +0000
16+++ include/server/mir/shell/surface.h 2013-09-19 16:20:57 +0000
17@@ -28,6 +28,7 @@
18 #include "mir_toolkit/common.h"
19
20 #include <string>
21+#include <mutex>
22
23 namespace mir
24 {
25@@ -104,6 +105,8 @@
26
27 MirSurfaceType type_value;
28 MirSurfaceState state_value;
29+
30+ mutable std::recursive_mutex surface_lock;
31 };
32 }
33 }
34
35=== modified file 'src/server/shell/application_session.cpp'
36--- src/server/shell/application_session.cpp 2013-08-28 03:41:48 +0000
37+++ src/server/shell/application_session.cpp 2013-09-19 16:20:57 +0000
38@@ -50,15 +50,6 @@
39 assert(surface_factory);
40 }
41
42-msh::ApplicationSession::~ApplicationSession()
43-{
44- std::unique_lock<std::mutex> lock(surfaces_mutex);
45- for (auto const& pair_id_surface : surfaces)
46- {
47- pair_id_surface.second->destroy();
48- }
49-}
50-
51 mf::SurfaceId msh::ApplicationSession::next_id()
52 {
53 return mf::SurfaceId(next_surface_id.fetch_add(1));
54@@ -116,7 +107,6 @@
55
56 session_listener->destroying_surface(*this, p->second);
57
58- p->second->destroy();
59 surfaces.erase(p);
60 }
61
62
63=== modified file 'src/server/shell/surface.cpp'
64--- src/server/shell/surface.cpp 2013-08-28 03:41:48 +0000
65+++ src/server/shell/surface.cpp 2013-09-19 16:20:57 +0000
66@@ -58,14 +58,17 @@
67
68 msh::Surface::~Surface() noexcept
69 {
70+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
71+
72 if (surface.lock())
73 {
74- destroy();
75+ builder->destroy_surface(surface);
76 }
77 }
78
79 void msh::Surface::hide()
80 {
81+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
82 if (auto const& s = surface.lock())
83 {
84 s->set_hidden(true);
85@@ -78,6 +81,7 @@
86
87 void msh::Surface::show()
88 {
89+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
90 if (auto const& s = surface.lock())
91 {
92 s->set_hidden(false);
93@@ -90,11 +94,15 @@
94
95 void msh::Surface::destroy()
96 {
97+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
98+
99 builder->destroy_surface(surface);
100 }
101
102 void msh::Surface::force_requests_to_complete()
103 {
104+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
105+
106 if (auto const& s = surface.lock())
107 {
108 s->force_requests_to_complete();
109@@ -103,6 +111,8 @@
110
111 mir::geometry::Size msh::Surface::size() const
112 {
113+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
114+
115 if (auto const& s = surface.lock())
116 {
117 return s->size();
118@@ -115,6 +125,8 @@
119
120 void msh::Surface::move_to(geometry::Point const& p)
121 {
122+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
123+
124 if (auto const& s = surface.lock())
125 {
126 s->move_to(p);
127@@ -127,6 +139,8 @@
128
129 mir::geometry::Point msh::Surface::top_left() const
130 {
131+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
132+
133 if (auto const& s = surface.lock())
134 {
135 return s->top_left();
136@@ -139,6 +153,8 @@
137
138 std::string msh::Surface::name() const
139 {
140+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
141+
142 if (auto const& s = surface.lock())
143 {
144 return s->name();
145@@ -151,6 +167,8 @@
146
147 mir::geometry::PixelFormat msh::Surface::pixel_format() const
148 {
149+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
150+
151 if (auto const& s = surface.lock())
152 {
153 return s->pixel_format();
154@@ -163,6 +181,8 @@
155
156 std::shared_ptr<mg::Buffer> msh::Surface::advance_client_buffer()
157 {
158+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
159+
160 if (auto const& s = surface.lock())
161 {
162 return s->advance_client_buffer();
163@@ -175,6 +195,8 @@
164
165 void msh::Surface::allow_framedropping(bool allow)
166 {
167+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
168+
169 if (auto const& s = surface.lock())
170 {
171 s->allow_framedropping(allow);
172@@ -184,6 +206,8 @@
173 void msh::Surface::with_most_recent_buffer_do(
174 std::function<void(mg::Buffer&)> const& exec)
175 {
176+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
177+
178 if (auto const& s = surface.lock())
179 {
180 auto buf = s->snapshot_buffer();
181@@ -197,6 +221,8 @@
182
183 bool msh::Surface::supports_input() const
184 {
185+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
186+
187 if (auto const& s = surface.lock())
188 {
189 return s->supports_input();
190@@ -209,6 +235,8 @@
191
192 int msh::Surface::client_input_fd() const
193 {
194+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
195+
196 if (auto const& s = surface.lock())
197 {
198 return s->client_input_fd();
199@@ -221,6 +249,8 @@
200
201 int msh::Surface::configure(MirSurfaceAttrib attrib, int value)
202 {
203+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
204+
205 int result = 0;
206 bool allow_dropping = false;
207 /*
208@@ -263,11 +293,14 @@
209
210 MirSurfaceType msh::Surface::type() const
211 {
212+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
213+
214 return type_value;
215 }
216
217 bool msh::Surface::set_type(MirSurfaceType t)
218 {
219+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
220 bool valid = false;
221
222 if (t >= 0 && t < mir_surface_type_enum_max_)
223@@ -281,11 +314,13 @@
224
225 MirSurfaceState msh::Surface::state() const
226 {
227+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
228 return state_value;
229 }
230
231 bool msh::Surface::set_state(MirSurfaceState s)
232 {
233+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
234 bool valid = false;
235
236 if (s > mir_surface_state_unknown &&
237@@ -302,6 +337,7 @@
238
239 void msh::Surface::notify_change(MirSurfaceAttrib attrib, int value)
240 {
241+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
242 MirEvent e;
243
244 // This memset is not really required. However it does avoid some
245@@ -319,6 +355,7 @@
246
247 void msh::Surface::take_input_focus(std::shared_ptr<msh::InputTargeter> const& targeter)
248 {
249+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
250 auto s = surface.lock();
251 if (s)
252 targeter->focus_changed(s->input_channel());
253@@ -328,6 +365,7 @@
254
255 void msh::Surface::set_input_region(std::vector<geom::Rectangle> const& region)
256 {
257+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
258 if (auto const& s = surface.lock())
259 {
260 s->set_input_region(region);
261@@ -340,6 +378,7 @@
262
263 void msh::Surface::raise(std::shared_ptr<msh::SurfaceController> const& controller)
264 {
265+ std::unique_lock<std::recursive_mutex> lg(surface_lock);
266 if (auto const& s = surface.lock())
267 {
268 controller->raise(s);
269
270=== modified file 'tests/unit-tests/shell/test_application_session.cpp'
271--- tests/unit-tests/shell/test_application_session.cpp 2013-08-28 03:41:48 +0000
272+++ tests/unit-tests/shell/test_application_session.cpp 2013-09-19 16:20:57 +0000
273@@ -65,7 +65,6 @@
274 ON_CALL(surface_factory, create_surface(_,_,_,_)).WillByDefault(Return(mock_surface));
275
276 EXPECT_CALL(surface_factory, create_surface(_, _, _, _));
277- EXPECT_CALL(*mock_surface, destroy());
278
279 mtd::MockSessionListener listener;
280 EXPECT_CALL(listener, surface_created(_, _)).Times(1);
281@@ -142,7 +141,6 @@
282 InSequence seq;
283 EXPECT_CALL(*mock_surface, hide()).Times(1);
284 EXPECT_CALL(*mock_surface, show()).Times(1);
285- EXPECT_CALL(*mock_surface, destroy()).Times(1);
286 }
287
288 msh::SurfaceCreationParameters params;

Subscribers

People subscribed via source and target branches