Mir

Merge lp:~kdub/mir/fix-shutdown-bug into lp:~mir-team/mir/trunk

Proposed by Kevin DuBois
Status: Merged
Approved by: Daniel van Vugt
Approved revision: no longer in the source branch.
Merged at revision: 592
Proposed branch: lp:~kdub/mir/fix-shutdown-bug
Merge into: lp:~mir-team/mir/trunk
Diff against target: 30 lines (+11/-1)
2 files modified
src/server/surfaces/surface.cpp (+0/-1)
tests/unit-tests/surfaces/test_surface.cpp (+11/-0)
To merge this branch: bzr merge lp:~kdub/mir/fix-shutdown-bug
Reviewer Review Type Date Requested Status
Daniel van Vugt Approve
Alexandros Frantzis (community) Approve
Alan Griffiths Approve
PS Jenkins bot (community) continuous-integration Approve
Review via email: mp+158710@code.launchpad.net

Commit message

fixes: lp:1168072
if the swapper was empty at the time shutdown was called (eg when a client died), it would cause us to pop from an empty deque.

Description of the change

fixes: lp:1168072
if the swapper was empty at the time shutdown was called (eg when a client died), it would cause us to pop from an empty deque.

I could only produce this on my nexus 4 by running a server, and a few clients, and then starting/ctrl-c 'ing a client repeatedly. Returning is the proper thing to do; at the only time this bug could occur, the client is not waiting on the cv.

To post a comment you must log in.
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Sorry to nit-pick on this one, but can we drop the extraneous parentheses and if statements?...

if (client_queue.empty())
{
    if (compositor_queue.empty())
        return;

    //client is waiting. hijack compositor's buffer and unblock client
    auto dequeued_buffer = compositor_queue.front();
    compositor_queue.pop_front();
    client_queue.push_back(dequeued_buffer);

review: Needs Fixing
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

This might well be the right solution, but it's not obvious to me that it is, since it's not clear what's the root cause of this (also see comment #3 in https://bugs.launchpad.net/mir/+bug/1168072). If this is called normally during server shutdown, then the compositor queue should never be empty, since the compositor has stopped and therefore can't be holding a buffer, nor can the client be holding all buffers since we disallow this in client_acquire().

review: Needs Information
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

> This might well be the right solution, but it's not obvious to me that it is,
> since it's not clear what's the root cause of this (also see comment #3 in
> https://bugs.launchpad.net/mir/+bug/1168072). If this is called normally
> during server shutdown, then the compositor queue should never be empty, since
> the compositor has stopped and therefore can't be holding a buffer, nor can
> the client be holding all buffers since we disallow this in client_acquire().

Ignore this... I just realized that this is also called when a ms::Surface is destroyed.

Thinking a bit more about this, I believe the cleanest way to handle the situation would be to always push a NullBuffer to the client_queue.

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

> Ignore this... I just realized that this is also called when a ms::Surface is
> destroyed.
>
> Thinking a bit more about this, I believe the cleanest way to handle the
> situation would be to always push a NullBuffer to the client_queue.

Taking one step back: does ms::Surfaces need to call shutdown() in the destructor? If a ms::Surface is being destroyed, it means that no one is holding a strong reference to it, and thus that no one can be blocked trying to get the next buffer from it, so there is no need to call shutdown() to unblock potentially blocked threads (there aren't any).

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

> > Ignore this... I just realized that this is also called when a ms::Surface
> is
> > destroyed.
> >
> > Thinking a bit more about this, I believe the cleanest way to handle the
> > situation would be to always push a NullBuffer to the client_queue.
>
> Taking one step back: does ms::Surfaces need to call shutdown() in the
> destructor? If a ms::Surface is being destroyed, it means that no one is
> holding a strong reference to it, and thus that no one can be blocked trying
> to get the next buffer from it, so there is no need to call shutdown() to
> unblock potentially blocked threads (there aren't any).

I think this is right. The call was introduced before we introduced what is now shell::Surface to provide that guarantee.

Can someone able to reproduce the problem check whether removing this call is sufficient?

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

Oops. Add "Needs Information"

> > > Ignore this... I just realized that this is also called when a ms::Surface
> > is
> > > destroyed.
> > >
> > > Thinking a bit more about this, I believe the cleanest way to handle the
> > > situation would be to always push a NullBuffer to the client_queue.
> >
> > Taking one step back: does ms::Surfaces need to call shutdown() in the
> > destructor? If a ms::Surface is being destroyed, it means that no one is
> > holding a strong reference to it, and thus that no one can be blocked trying
> > to get the next buffer from it, so there is no need to call shutdown() to
> > unblock potentially blocked threads (there aren't any).
>
> I think this is right. The call was introduced before we introduced what is
> now shell::Surface to provide that guarantee.
>
> Can someone able to reproduce the problem check whether removing this call is
> sufficient?

review: Needs Information
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

> Can someone able to reproduce the problem check whether removing this call is
> sufficient?

I can confirm that removing the call fixes the crash.

Revision history for this message
Kevin DuBois (kdub) wrote :

I think this was figured out in the comments, but just to be clear, this is not during the shutdown of the server, rather, just the shutdown of a client due to the client getting a SIGTERM.

removing that call will be sufficient (my quick check just verified). I think that with the way the code is now, the shutdown function call isn't needed anymore too.

Revision history for this message
Kevin DuBois (kdub) wrote :

Although just removing the line seems to work, I think we should probably add an integration test that teased this scenario out, and have that driver the 1line change

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Kevin DuBois (kdub) wrote :

> Sorry to nit-pick on this one, but can we drop the extraneous parentheses and
> if statements?...
>
> if (client_queue.empty())
> {
> if (compositor_queue.empty())
> return;
>
> //client is waiting. hijack compositor's buffer and unblock client
> auto dequeued_buffer = compositor_queue.front();
> compositor_queue.pop_front();
> client_queue.push_back(dequeued_buffer);

we're changing a different part of the code to address the bug

Revision history for this message
Robert Ancell (robert-ancell) wrote :

I think you need to update the commit message for this MP now.

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

Looks good. (What's wrong with the commit message?)

review: Approve
Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

Looks good.

Having understood this problem more, I think that a variant of the original proposal should also be merged (probably not in this MP).

BufferSwapper::shutdown() is implemented with very specific preconditions in mind, and we either need to explicitly fail if the preconditions aren't met (e.g. throw a std::logic_error), or we should just fail silently without crashing. Since we can control the conditions in which this method is called, I vote for the explicit failure mode.

In other words, if ::shutdown() is not in a position to unblock potentially blocked threads, it should throw a std::logic_error.

This touches what I am currently working on (generalizing ::shutdown()), so I can include it in my upcoming branch.

review: Approve
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Better, thanks.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/server/surfaces/surface.cpp'
2--- src/server/surfaces/surface.cpp 2013-04-12 04:55:11 +0000
3+++ src/server/surfaces/surface.cpp 2013-04-15 19:49:24 +0000
4@@ -52,7 +52,6 @@
5
6 ms::Surface::~Surface()
7 {
8- shutdown();
9 }
10
11 std::string const& ms::Surface::name() const
12
13=== modified file 'tests/unit-tests/surfaces/test_surface.cpp'
14--- tests/unit-tests/surfaces/test_surface.cpp 2013-03-29 16:51:35 +0000
15+++ tests/unit-tests/surfaces/test_surface.cpp 2013-04-15 19:49:24 +0000
16@@ -365,3 +365,14 @@
17 ms::Surface surf{surface_name, mock_buffer_bundle, mock_change_cb};
18 surf.set_alpha(0.5f);
19 }
20+
21+TEST_F(SurfaceCreation, test_surface_shutdown)
22+{
23+ using namespace testing;
24+
25+ EXPECT_CALL(*mock_buffer_bundle, shutdown()).Times(Exactly(1));
26+
27+ ms::Surface surf{surface_name, mock_buffer_bundle, mock_change_cb};
28+ surf.shutdown();
29+
30+}

Subscribers

People subscribed via source and target branches