Mir

Merge lp:~raof/mir/fix-lock-inversion into lp:mir

Proposed by Chris Halse Rogers
Status: Merged
Approved by: Alan Griffiths
Approved revision: no longer in the source branch.
Merged at revision: 4215
Proposed branch: lp:~raof/mir/fix-lock-inversion
Merge into: lp:mir
Diff against target: 152 lines (+60/-40)
3 files modified
src/client/buffer_vault.cpp (+59/-12)
src/client/buffer_vault.h (+1/-0)
tests/unit-tests/client/test_buffer_vault.cpp (+0/-28)
To merge this branch: bzr merge lp:~raof/mir/fix-lock-inversion
Reviewer Review Type Date Requested Status
Mir CI Bot continuous-integration Approve
Alan Griffiths Approve
Review via email: mp+328069@code.launchpad.net

Commit message

mcl::BufferVault: Fix lock inversion.

~BufferVault takes BufferVault::mutex and then calls AtomicCallback::set_callback() on each buffer,
which takes AtomicCallback::mutex.

The RPC subsystem calls buffer->callback(), which takes AtomicCallback::mutex and then calls
BufferVault::wire_transfer_inbound... which immediately takes BufferVault::mutex.

This is clearly a disaster waiting to happen.

Instead, move to a two-phase destruction approach:
1) Lock BufferVault::mutex, set a flag telling callbacks to bail, and collect all the existing
    buffers.
2) Drop the lock, and use buffer->set_callback() to wait on any thread already in a callback.
3) Don't bother telling the server to release any buffers; they're associated with the
   BufferStream that we're in the process of destroying, and will get cleaned up there.

Description of the change

This, unfortunately, moves the deadlock into a different place, but I think it's a correct change.

To post a comment you must log in.
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

> The RPC subsystem calls buffer->callback(), which takes
> AtomicCallback::mutex and then calls BufferVault::wire_transfer_inbound...
> which immediately takes BufferVault::mutex.

This (pre-existing code) is hard to reason about, the extra logic doesn't help.

Would it be any more complex to make BufferVault::mutex a std::recursive_mutex? (The above is all on a single thread.)

review: Needs Information
Revision history for this message
Mir CI Bot (mir-ci-bot) wrote :

PASSED: Continuous integration, rev:4214
https://mir-jenkins.ubuntu.com/job/mir-ci/3510/
Executed test runs:
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-mir/4803
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-0-fetch/4986
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=artful/4975
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=xenial/4975
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=zesty/4975
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=clang,platform=mesa,release=artful/4840
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=clang,platform=mesa,release=artful/4840/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=clang,platform=mesa,release=zesty/4840
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=clang,platform=mesa,release=zesty/4840/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=artful/4840
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=artful/4840/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=xenial/4840
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=xenial/4840/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=zesty/4840
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=zesty/4840/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=cross-armhf,compiler=gcc,platform=mesa,release=artful/4840
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=cross-armhf,compiler=gcc,platform=mesa,release=artful/4840/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=cross-armhf,compiler=gcc,platform=mesa,release=zesty/4840
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=cross-armhf,compiler=gcc,platform=mesa,release=zesty/4840/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=i386,compiler=gcc,platform=mesa,release=xenial/4840
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=i386,compiler=gcc,platform=mesa,release=xenial/4840/artifact/output/*zip*/output.zip

Click here to trigger a rebuild:
https://mir-jenkins.ubuntu.com/job/mir-ci/3510/rebuild

review: Approve (continuous-integration)
Revision history for this message
Chris Halse Rogers (raof) wrote :

It is on two different threads - the IPC thread calls buffer->callback (), the application thread calls ~BufferVault. I don't think a recursive mutex helps here?

On 26 July 2017 6:41:46 pm AEST, Alan Griffiths <email address hidden> wrote:
>Review: Needs Information
>
>> The RPC subsystem calls buffer->callback(), which takes
>> AtomicCallback::mutex and then calls
>BufferVault::wire_transfer_inbound...
>> which immediately takes BufferVault::mutex.
>
>This (pre-existing code) is hard to reason about, the extra logic
>doesn't help.
>
>Would it be any more complex to make BufferVault::mutex a
>std::recursive_mutex? (The above is all on a single thread.)
>--
>https://code.launchpad.net/~raof/mir/fix-lock-inversion/+merge/328069
>You are the owner of lp:~raof/mir/fix-lock-inversion.

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

> It is on two different threads - the IPC thread calls buffer->callback (), the
> application thread calls ~BufferVault. I don't think a recursive mutex helps
> here?

Ack. I misread what was happening.

review: Abstain
Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

Nits, but non-blocking:

    bool disposing{false};

Perhaps not the clearest name. I don't have any great idea, but "shutting_down" seems marginally better.

~~~~

    try
    {
        ...
    }
    catch (...)
    {
    }

I guess this was there to ignore IPC errors in the destructor. If there are actually exceptions from the new code are being "eaten" something is badly wrong.

review: Approve
Revision history for this message
Mir CI Bot (mir-ci-bot) wrote :

PASSED: Continuous integration, rev:4216
https://mir-jenkins.ubuntu.com/job/mir-ci/3511/
Executed test runs:
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-mir/4805
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-0-fetch/4992
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=artful/4981
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=xenial/4981
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=zesty/4981
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=clang,platform=mesa,release=artful/4842
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=clang,platform=mesa,release=artful/4842/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=clang,platform=mesa,release=zesty/4842
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=clang,platform=mesa,release=zesty/4842/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=artful/4842
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=artful/4842/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=xenial/4842
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=xenial/4842/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=zesty/4842
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=zesty/4842/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=cross-armhf,compiler=gcc,platform=mesa,release=artful/4842
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=cross-armhf,compiler=gcc,platform=mesa,release=artful/4842/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=cross-armhf,compiler=gcc,platform=mesa,release=zesty/4842
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=cross-armhf,compiler=gcc,platform=mesa,release=zesty/4842/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=i386,compiler=gcc,platform=mesa,release=xenial/4842
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=i386,compiler=gcc,platform=mesa,release=xenial/4842/artifact/output/*zip*/output.zip

Click here to trigger a rebuild:
https://mir-jenkins.ubuntu.com/job/mir-ci/3511/rebuild

review: Approve (continuous-integration)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'src/client/buffer_vault.cpp'
--- src/client/buffer_vault.cpp 2017-07-17 02:44:34 +0000
+++ src/client/buffer_vault.cpp 2017-07-27 02:04:12 +0000
@@ -80,24 +80,63 @@
80mcl::BufferVault::~BufferVault()80mcl::BufferVault::~BufferVault()
81{81{
82 buffer_factory->cancel_requests_with_context(this);82 buffer_factory->cancel_requests_with_context(this);
83
84 std::vector<std::shared_ptr<MirBuffer>> current_buffers;
85
83 std::unique_lock<std::mutex> lk(mutex);86 std::unique_lock<std::mutex> lk(mutex);
87
88 // Prevent callbacks from allocating new buffers
89 being_destroyed = true;
90
84 for (auto& it : buffers)91 for (auto& it : buffers)
85 try92 if (auto map = surface_map.lock())
86 {93 {
87 if (auto map = surface_map.lock())94 if (auto buffer = map->buffer(it.first))
88 {95 {
89 if (auto buffer = map->buffer(it.first))96 /*
90 {97 * Annoying wart:
91 buffer->set_callback(ignore_buffer, nullptr);98 *
92 }99 * mir::AtomicCallback is atomic via locking, and we need it to be because
93 } 100 * we rely on the guarantee that after set_callback() returns no thread
94 if (!disconnected_)101 * is in, or will enter, the previous callback.
95 {102 *
96 free_buffer(it.first);103 * However!
104 *
105 * wire_transfer_inbound() is called from buffer->callback, and acquires
106 * BufferVault::mutex.
107 *
108 * This destuctor must call buffer->set_callback(), and has acquired
109 * BufferVault::mutex.
110 *
111 * So we've got a lock inversion.
112 *
113 * We can't unlock here, as wire_transfer_inbound() might mutate
114 * BufferVault::buffers, which we're iterating over.
115 *
116 * So store them up to process later...
117 */
118 current_buffers.push_back(buffer);
119
120 /*
121 * ...and erase them from the SurfaceMap.
122 *
123 * After this point, the RPC layer will ignore any messages about
124 * this buffer.
125 *
126 * We don't need to explicitly ask the server to free it; it'll be
127 * freed with the BufferStream
128 */
129 map->erase(it.first);
97 }130 }
98 }131 }
99 catch (...)132 lk.unlock();
133
134 /*
135 * Now, ensure that no threads are in the existing buffer callbacks...
136 */
137 for (auto const& buffer : current_buffers)
100 {138 {
139 buffer->set_callback(ignore_buffer, nullptr);
101 }140 }
102}141}
103142
@@ -246,6 +285,14 @@
246void mcl::BufferVault::wire_transfer_inbound(int buffer_id)285void mcl::BufferVault::wire_transfer_inbound(int buffer_id)
247{286{
248 std::unique_lock<std::mutex> lk(mutex);287 std::unique_lock<std::mutex> lk(mutex);
288
289 /*
290 * The BufferVault is in the process of being destroyed; there's no point in
291 * doing any processing and we should *certainly* not allocate any more buffers.
292 */
293 if (being_destroyed)
294 return;
295
249 last_received_id = buffer_id;296 last_received_id = buffer_id;
250 auto buffer = checked_buffer_from_map(buffer_id);297 auto buffer = checked_buffer_from_map(buffer_id);
251 auto inbound_size = buffer->size();298 auto inbound_size = buffer->size();
252299
=== modified file 'src/client/buffer_vault.h'
--- src/client/buffer_vault.h 2017-05-08 03:04:26 +0000
+++ src/client/buffer_vault.h 2017-07-27 02:04:12 +0000
@@ -96,6 +96,7 @@
96 int const usage;96 int const usage;
9797
98 std::mutex mutex;98 std::mutex mutex;
99 bool being_destroyed{false};
99 BufferMap buffers;100 BufferMap buffers;
100 std::deque<NoTLSPromise<std::shared_ptr<MirBuffer>>> promises;101 std::deque<NoTLSPromise<std::shared_ptr<MirBuffer>>> promises;
101 geometry::Size size;102 geometry::Size size;
102103
=== modified file 'tests/unit-tests/client/test_buffer_vault.cpp'
--- tests/unit-tests/client/test_buffer_vault.cpp 2017-05-17 04:48:46 +0000
+++ tests/unit-tests/client/test_buffer_vault.cpp 2017-07-27 02:04:12 +0000
@@ -156,16 +156,6 @@
156 make_vault();156 make_vault();
157}157}
158158
159TEST_F(BufferVault, frees_the_buffers_we_actually_got)
160{
161 EXPECT_CALL(mock_requests, free_buffer(package.buffer_id()));
162 EXPECT_CALL(mock_requests, free_buffer(package2.buffer_id()));
163 auto vault = make_vault();
164
165 vault->wire_transfer_inbound(package.buffer_id());
166 vault->wire_transfer_inbound(package2.buffer_id());
167}
168
169TEST_F(BufferVault, withdrawing_and_never_filling_up_will_timeout)159TEST_F(BufferVault, withdrawing_and_never_filling_up_will_timeout)
170{160{
171 using namespace std::literals::chrono_literals;161 using namespace std::literals::chrono_literals;
@@ -370,24 +360,6 @@
370 }, std::exception);360 }, std::exception);
371}361}
372362
373TEST_F(BufferVault, makes_sure_rpc_calls_exceptions_are_caught_in_destructor)
374{
375 EXPECT_CALL(mock_requests, free_buffer(_))
376 .Times(1)
377 .WillOnce(Throw(std::runtime_error("")));
378 EXPECT_NO_THROW({
379 auto vault = make_vault();
380 vault->wire_transfer_inbound(package.buffer_id());
381 });
382}
383
384TEST_F(StartedBufferVault, skips_free_buffer_rpc_calls_if_disconnected)
385{
386 EXPECT_CALL(mock_requests, free_buffer(_))
387 .Times(0);
388 vault.disconnected();
389}
390
391TEST_F(StartedBufferVault, buffer_count_remains_the_same_after_scaling)363TEST_F(StartedBufferVault, buffer_count_remains_the_same_after_scaling)
392{364{
393 std::array<mp::Buffer, 3> buffers;365 std::array<mp::Buffer, 3> buffers;

Subscribers

People subscribed via source and target branches