[regression] Mir clients get caught in an infinite exception loop when the server goes away ("Caught exception at Mir/EGL driver boundary")

Bug #1353867 reported by Daniel van Vugt
This bug affects 1 person
Affects        Status         Importance   Assigned to           Milestone
Mir            Fix Released   High         Alexandros Frantzis
0.6            Won't Fix      High         Unassigned
0.7            Fix Released   High         Alexandros Frantzis
mir (Ubuntu)   Fix Released   High         Unassigned

Bug Description

Nexus4: Mir client gets caught in an infinite exception loop if the server crashes ("Caught exception at Mir/EGL driver boundary")

Test case:
   1. mir_demo_server_shell
   2. mir_demo_client_egltriangle
   3. Press the power button to sleep
   4. Press the power button again to wake up

Expected: Rendering continues on screen after wakeup
Observed:

Caught exception at Mir/EGL driver boundary: /home/dan/bzr/mir/dev/src/client/rpc/stream_socket_transport.cpp(280): Throw in function virtual void mir::client::rpc::StreamSocketTransport::send_data(const std::vector<unsigned char>&)
Dynamic exception type: N5boost16exception_detail10clone_implINS0_19error_info_injectorIN12_GLOBAL__N_112socket_errorEEEEE
std::exception::what: Failed to send message to server: Bad file descriptor
9, "Bad file descriptor"
Caught exception at Mir/EGL driver boundary: /home/dan/bzr/mir/dev/src/client/rpc/stream_socket_transport.cpp(280): Throw in function virtual void mir::client::rpc::StreamSocketTransport::send_data(const std::vector<unsigned char>&)
Dynamic exception type: N5boost16exception_detail10clone_implINS0_19error_info_injectorIN12_GLOBAL__N_112socket_errorEEEEE
std::exception::what: Failed to send message to server: Bad file descriptor
9, "Bad file descriptor"
Caught exception at Mir/EGL driver boundary: /home/dan/bzr/mir/dev/src/client/rpc/stream_socket_transport.cpp(280): Throw in function virtual void mir::client::rpc::StreamSocketTransport::send_data(const std::vector<unsigned char>&)
Dynamic exception type: N5boost16exception_detail10clone_implINS0_19error_info_injectorIN12_GLOBAL__N_112socket_errorEEEEE
std::exception::what: Failed to send message to server: Bad file descriptor
9, "Bad file descriptor"
.....

Daniel van Vugt (vanvugt) wrote :

Unfortunately, because this is an infinite loop, no core is dumped, nothing is restarted, and no bug report is generated.

Daniel van Vugt (vanvugt) wrote :

Oh, it's just bug 1347053. Since the fix for that we've gone from crashing to this infinite loop, but it seems like it could be basically the same problem still.

Changed in mir:
status: New → Triaged
Daniel van Vugt (vanvugt) wrote :

The relevant change was:

------------------------------------------------------------
revno: 1818 [merge]
author: Alexandros Frantzis <email address hidden>
committer: Tarmac
branch nick: development-branch
timestamp: Tue 2014-08-05 14:41:21 +0000
message:
  mesa,android: Don't propagate exceptions to graphics driver code. Fixes: https://bugs.launchpad.net/bugs/1347053.

  Approved by PS Jenkins bot, Kevin DuBois, Robert Carr.
------------------------------------------------------------
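
For readers unfamiliar with the pattern named in the commit message above, the following is a minimal sketch of catching exceptions at a C-style driver boundary. It is not Mir's code: `SwapResult`, `swap_buffers_at_boundary` and `render_until_failure` are hypothetical names invented for illustration. The point is that once the exception is swallowed at the boundary, the caller only learns about the failure through the return value, so a loop that ignores it would keep retrying and keep logging indefinitely.

```cpp
#include <cstdio>
#include <exception>
#include <functional>
#include <stdexcept>

// Hypothetical status reported across the driver boundary instead of an exception.
enum class SwapResult { ok, failed };

// Runs a throwing operation and makes sure nothing propagates to the caller.
// The failure is logged and turned into a return value.
SwapResult swap_buffers_at_boundary(std::function<void()> const& send_to_server) noexcept
{
    try
    {
        send_to_server();
        return SwapResult::ok;
    }
    catch (std::exception const& e)
    {
        std::fprintf(stderr, "Caught exception at driver boundary: %s\n", e.what());
        return SwapResult::failed;
    }
    catch (...)
    {
        std::fprintf(stderr, "Caught unknown exception at driver boundary\n");
        return SwapResult::failed;
    }
}

// Render loop that stops on the first failed swap. A loop that ignored the
// result would call the boundary function again and again once the server
// has gone away. Returns the number of frames that were sent successfully.
int render_until_failure(std::function<void()> const& send_to_server, int max_frames)
{
    int frames_rendered = 0;
    while (frames_rendered < max_frames)
    {
        if (swap_buffers_at_boundary(send_to_server) == SwapResult::failed)
            break;
        ++frames_rendered;
    }
    return frames_rendered;
}
```

In this sketch the loop owner decides what to do on failure. Whether a real client can see the failure at all depends on what the driver does with the error, which this example does not model.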

Daniel van Vugt (vanvugt) wrote :

Note that since the change is not in 0.6, you will instead see a crash with that series:

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<(anonymous namespace)::socket_error> >'
  what(): Failed to send message to server: Bad file descriptor
ERROR: /home/dan/bzr/mir/r1817/src/platform/graphics/android/real_hwc_wrapper.cpp(60): Throw in function virtual void mir::graphics::android::RealHwcWrapper::set(hwc_display_contents_1_t&) const
Dynamic exception type: N5boost16exception_detail10clone_implINS0_19error_info_injectorISt13runtime_errorEEEE
std::exception::what: error during hwc prepare(). rc = ffffffff

I think that counts as bug 1347053 still.

Daniel van Vugt (vanvugt) wrote : Re: [regression] Nexus4: Mir client gets caught in an infinite exception loop on wakeup ("Caught exception at Mir/EGL driver boundary")

Sorry, I'm not awake. It's the client looping. The server simply crashed, somewhere.

summary: - [regression] Nexus4: Mir server gets caught in an infinite exception
+ [regression] Nexus4: Mir client gets caught in an infinite exception
loop on wakeup ("Caught exception at Mir/EGL driver boundary")
description: updated
summary: [regression] Nexus4: Mir client gets caught in an infinite exception
- loop on wakeup ("Caught exception at Mir/EGL driver boundary")
+ loop if the server crashes ("Caught exception at Mir/EGL driver
+ boundary")
description: updated
Changed in mir:
importance: Critical → High
Daniel van Vugt (vanvugt) wrote : Re: [regression] Nexus4: Mir client gets caught in an infinite exception loop if the server crashes ("Caught exception at Mir/EGL driver boundary")

The server crash is now logged separately as bug 1353887.

Changed in mir:
assignee: nobody → Alexandros Frantzis (afrantzis)
tags: added: rtm14
summary: [regression] Nexus4: Mir client gets caught in an infinite exception
- loop if the server crashes ("Caught exception at Mir/EGL driver
+ loop when the server goes away ("Caught exception at Mir/EGL driver
boundary")
Daniel van Vugt (vanvugt) wrote : Re: [regression] Nexus4: Mir client gets caught in an infinite exception loop when the server goes away ("Caught exception at Mir/EGL driver boundary")

The offending change just got backported to 0.6 so this bug is in 0.6 now too.

summary: - [regression] Nexus4: Mir client gets caught in an infinite exception
- loop when the server goes away ("Caught exception at Mir/EGL driver
- boundary")
+ [regression] Mir client gets caught in an infinite exception loop when
+ the server goes away ("Caught exception at Mir/EGL driver boundary")
Changed in mir:
status: Triaged → In Progress
Changed in mir:
status: In Progress → Fix Committed
summary: - [regression] Mir client gets caught in an infinite exception loop when
+ [regression] Mir clients get caught in an infinite exception loop when
the server goes away ("Caught exception at Mir/EGL driver boundary")
tags: added: touch-2014-08-28
Changed in mir (Ubuntu):
importance: Undecided → High
status: New → Triaged
Daniel van Vugt (vanvugt) wrote :

Fix committed to lp:mir/0.7 at revision 1863, scheduled for release in Mir 0.7.0

Changed in mir:
milestone: 0.7.0 → 0.8.0
Changed in mir:
milestone: 0.8.0 → 0.7.0
Changed in mir:
milestone: 0.7.0 → 0.8.0
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mir - 0.7.0+14.10.20140829-0ubuntu1

---------------
mir (0.7.0+14.10.20140829-0ubuntu1) utopic; urgency=medium

  [ Daniel van Vugt ]
  * New upstream release 0.7.0 (https://launchpad.net/mir/+milestone/0.7.0)
    - Enhancements:
      . Test suite: Reworked mechanism to override Mir client functions
      . Demo shell: Detect custom rendering (decorations) to make it
        compatible with overlay optimizations
      . Make sure to preserve fd resources until the end of the sending
        of the message
      . Add test cases and script for tracking changes to the new ABIs:
        libmircommon, libmirplatform
      . Symbols file for libmirplatform
      . Symbols file for libmircommon
      . Symbols file for libmirserver
      . Various improvements to the SessionMediator test
      . Various build related improvements
      . Print testcase output during package build
      . Abort test when InProcessServer startup fails
      . Link the integration and unit tests against the server objects
      . Add a document detailing the useful tests to run and the useful
        logs to collect when troubleshooting a new android chipset
      . Enable motion event resampling and prediction for a more responsive
        touch experience.
    - ABI summary: Servers need rebuilding, but clients do not
      . Mirclient ABI unchanged at 8
      . Mircommon ABI bumped to 1
      . Mirplatform ABI bumped to 2
      . Mirserver ABI bumped to 25
    - API changes
      . Deleted function - frontend::Shell::create_surface_for(). If you have
        the std::shared_ptr<frontend::Session> session, you can just do
        session->create_surface(params) instead to get a SurfaceId
    - Bug fixes:
      . Ensure we process lifecycle events before the nested server is torn
        down (LP: #1353465)
      . Fix race in InputTestingServerConfiguration (LP: #1354446)
      . Fix fd leaks in prompt session frontend code and tests (LP: #1353461)
      . Detect the additional things the demo shell draws on the renderable
        list and avoid calling the optimized post function if they are being
        drawn (LP: #1348330)
      . Client: Fix SIGTERM dispatch in our default lifecycle event handler
        (LP: #1353867)
      . DemoRenderer: Don't try to create a texture of width zero.
        (LP: #1358210)
      . Fix CI failures (LP: #1358698)
      . Fix build failure: "variable ‘rc’ set but not used" which happens in
        release mode when NDEBUG is set (LP: #1358625)
      . Only enumerate exposed input surfaces to avoid delivering events to
        occluded surfaces (LP: #1359264)
      . Android: do not post driver cancelled buffers (LP: #1359406)
      . Client: Ensure our platform library stays loaded for as long as it is
        needed by other objects (LP: #1358191)
      . Examples: Register the DemoCompositor with the Scene to properly
        process visibility events (LP: #1359487)
      . Mir_demo_client_basic: Don't assert on user errors like failing to
        connect to a Mir server (LP: #1331958)
      . Tests: Explicitly depend on GMock target to avoid build races
        (LP: #1362646)

  [ Ubuntu dai...

Changed in mir (Ubuntu):
status: Triaged → Fix Released
Changed in mir:
milestone: 0.8.0 → none
status: Fix Committed → Fix Released
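
The 0.7.0 changelog entry for this bug reads "Client: Fix SIGTERM dispatch in our default lifecycle event handler". As a rough illustration of what such a handler could look like, here is a minimal sketch. It assumes that the client library notifies a default handler when the connection is lost and that the handler reacts by raising SIGTERM. `LifecycleState` and `default_lifecycle_handler` are hypothetical names, not Mir's API.

```cpp
#include <csignal>

// Hypothetical lifecycle notifications a client library might deliver.
enum class LifecycleState { will_suspend, resumed, connection_lost };

// Hypothetical default handler: when the server connection is lost, ask the
// process to terminate instead of letting it keep issuing failing requests.
void default_lifecycle_handler(LifecycleState state)
{
    if (state == LifecycleState::connection_lost)
        std::raise(SIGTERM);
}
```

With the default SIGTERM disposition, the process ends at the `std::raise` call. An application that needs to clean up first can install its own SIGTERM handler or supply its own lifecycle handler.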