Mir

[regression] Nested servers can select wrong platform

Bug #1515558 reported by Alan Griffiths
This bug affects 1 person
Affects        Status         Importance   Assigned to            Milestone
Mir            Fix Released   Medium       Alan Griffiths
0.18           Won't Fix      Medium       Mir development team
mir (Ubuntu)   Fix Released   Undecided    Unassigned

Bug Description

Nested servers do not have the information required to deduce which platform has been selected by the host server. (The host may be a different user with different permissions, with different environment variables and different command-line options.)

Original scenario:

$ sudo bin/mir_demo_server --window-manager system-compositor --display-config sidebyside --vt 1 --arw-file
$ bin/mir_demo_server --host /tmp/mir_socket --display-config clone
...
[1447325929.742637] mirserver: Selected driver: dummy (version 0.18.0)

The cause seems to be the change in -c 3098, as specifying a dummy --vt parameter works around the problem.

Alan Griffiths (alan-griffiths) wrote :

Yes, reverting -c 3098 fixes the problem

Changed in mir:
importance: Undecided → High
Alan Griffiths (alan-griffiths) wrote :

Root cause is that the nested platform *ought* to be selecting based upon the host server, not probing the platform drivers itself.

Cemil Azizoglu (cemil-azizoglu) wrote :

Not sure how dummy platform would be selected. In my case, it selects the mesa-x11 platform (still wrong).

Alan Griffiths (alan-griffiths) wrote :

If you start from a VT or an ssh session then there's no DISPLAY environment variable and mesa-x11 won't be selected.

In any case probing in the nested server can in principle "get it wrong" - the host server ought to determine the stack to use.

Although... I think the main use of the guest platform is to allocate buffers, so maybe following the NBS work we can rework the nested server so that it won't need anything beyond the client API and guest platforms can be dropped.
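
To illustrate why per-process probing can disagree, here is a minimal C++ sketch. It is not Mir's probing code: `ProbeContext` and `probe_platform` are hypothetical names, and the decision rule is only a simplified model of the behaviour reported in this thread (a --vt option leads to mesa-kms, a DISPLAY variable leads to mesa-x11, otherwise dummy).

```cpp
#include <optional>
#include <string>

// Hypothetical summary of what a probing process can see.
// These names are illustrative and do not come from Mir.
struct ProbeContext
{
    std::optional<std::string> display;  // value of DISPLAY, if set
    bool vt_option_given;                // was --vt passed on the command line?
};

// Simplified model: each process decides from its *own* context,
// so a host and a nested server can reach different answers.
std::string probe_platform(ProbeContext const& ctx)
{
    if (ctx.vt_option_given) return "mesa-kms";
    if (ctx.display) return "mesa-x11";
    return "dummy";
}
```

Under this model the host from the original scenario (started with --vt 1) gets mesa-kms, a nested server started from a VT without --vt gets dummy, and one started inside an X session gets mesa-x11. None of the nested results depend on what the host actually chose.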

Changed in mir:
importance: High → Medium
Alan Griffiths (alan-griffiths) wrote :

Team discussion agreed there is currently no way for the nested server to correctly deduce the platform choice of the host in all cases. (This issue can also affect normal clients choosing the "client platform".)

There's a significant chunk of work needed to negotiate this correctly.

Changed in mir:
status: New → Confirmed
tags: added: nested
Daniel van Vugt (vanvugt) wrote :

Fix committed to lp:mir/0.18 at revision 3182, scheduled for release in Mir 0.18.0

Changed in mir:
milestone: none → 0.19.0
status: Confirmed → In Progress
assignee: nobody → Mir development team (mir-team)
Daniel van Vugt (vanvugt) wrote :

Invalid in the "Ubuntu" project as the bug never reached distro.

Changed in mir (Ubuntu):
status: New → Invalid
summary: - Nested servers on mesa-kms select wrong (dummy) guest platform
+ [regression] Nested servers on mesa-kms select wrong (dummy) guest
+ platform
tags: added: regression
PS Jenkins bot (ps-jenkins) wrote : Re: [regression] Nested servers on mesa-kms select wrong (dummy) guest platform

Fix committed into lp:mir at revision 3198, scheduled for release in mir, milestone 0.19.0

Changed in mir:
status: In Progress → Fix Committed
Kevin DuBois (kdub)
Changed in mir:
status: Fix Committed → Fix Released
Changed in mir:
status: Fix Released → Fix Committed
Alan Griffiths (alan-griffiths) wrote :

As discussed above, the nested server doesn't have enough information to select the correct drivers and more work is needed on supplying this from the host before this can be closed.

Changed in mir:
status: Fix Committed → Triaged
Daniel van Vugt (vanvugt) wrote :
Alan Griffiths (alan-griffiths) wrote :

Having a (hopefully) temporary workaround that handles some common scenarios isn't a fix. Here (copied from the email discussion) are some notes about resolving this:

As discussed following today's standup, the problems we face involve a number of issues beyond the probing logic.

There are three contexts that matter: The Host Server; The Nested Server; and, The Client.

At present these all resort to some form of "probing" to decide what platform module to load. This is wrong, but we've only recently started seeing the problems it causes. Only the Host Server should be making a decision on the platform; the Nested Server and the Client should respect that decision.

Currently, the client code sees no difference between KMS and X11 - it just chooses between "android" and "mesa", which is a decision that is usually right. However, if we ever get additional platforms that "look alike" it will encounter the same problems we've seen recently without needing a nested server. Consequently, it would be better for the server to tell it which driver to load.

The Nested Server is where we've seen problems arise. As it need not be running as the same user, with the same rights, environment or configuration options as the Host Server, it cannot reliably deduce which platform the host has chosen. Consequently, it would be better for the Host Server to tell it which driver to load.
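
A minimal C++ sketch of that idea, under stated assumptions: `HostPlatformInfo` and `select_module` are hypothetical names, not part of Mir's API, and falling back to local probing when the host advertises nothing is an assumption made here for illustration (for example, to cope with an older host).

```cpp
#include <functional>
#include <optional>
#include <string>

// Hypothetical: what a host might advertise to a connecting
// nested server or client. Not an existing Mir type.
struct HostPlatformInfo
{
    std::optional<std::string> module_name;  // e.g. "mesa-kms"
};

// Respect the host's decision when there is one. Local probing is
// only consulted when the host said nothing (an assumed fallback).
std::string select_module(
    HostPlatformInfo const& host,
    std::function<std::string()> const& fallback_probe)
{
    if (host.module_name && !host.module_name->empty())
        return *host.module_name;
    return fallback_probe();
}
```

The point of the shape is that the nested server's own user, rights and environment no longer influence the result whenever the host has stated its choice.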

Also worth considering is that the Nested Server only needs a server platform module to instantiate a "GuestPlatform", the principal function of which is buffer allocation. With "New Buffer Semantics" this will be possible through the client API - which may obsolete the need for platform specific "GuestPlatform" implementations. If we can eliminate loading a platform module to create a "GuestPlatform" then we achieve a significant simplification.

Coming out of this there are several pieces of work:

1. Improve the decision logic in the Host Server and platform probing to deal with tie-breaking. (E.g. KMS should be preferred to X11.)

2. Enable the server to inform the client/nested server of the correct platform module to load. (This doesn't need to be surfaced in the client API, it is internal.)

3. Once the NBS client API is finalized, revisit the need for a platform specific GuestPlatform - ideally one can be written in terms of the client API.
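
The tie-breaking in item 1 could look roughly like the following C++ sketch. This is an illustration only: `ProbeResult`, `tie_break_rank` and `select_platform` are hypothetical names, the numeric priorities are made up, and the only ordering taken from the notes above is that KMS should be preferred to X11.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical result of probing one platform module.
struct ProbeResult
{
    std::string module;
    int priority;  // higher means better suited (illustrative values)
};

// Hypothetical preference order used only to break ties;
// earlier entries are preferred, unknown modules rank last.
int tie_break_rank(std::string const& module)
{
    static std::vector<std::string> const preferred{"mesa-kms", "mesa-x11"};
    auto const it = std::find(preferred.begin(), preferred.end(), module);
    return static_cast<int>(it - preferred.begin());
}

// Pick the highest priority; on equal priority use the fixed order.
// Returns an empty string when nothing was probed.
std::string select_platform(std::vector<ProbeResult> const& results)
{
    auto const best = std::max_element(
        results.begin(), results.end(),
        [](ProbeResult const& a, ProbeResult const& b)
        {
            if (a.priority != b.priority) return a.priority < b.priority;
            return tie_break_rank(a.module) > tie_break_rank(b.module);
        });
    return best == results.end() ? std::string{} : best->module;
}
```

With mesa-x11 and mesa-kms both reporting the same priority, this returns mesa-kms regardless of the order in which the modules were probed. A strictly higher priority still wins over the preference order.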

Changed in mir:
status: Triaged → In Progress
assignee: Mir development team (mir-team) → Alan Griffiths (alan-griffiths)
summary: - [regression] Nested servers on mesa-kms select wrong (dummy) guest
- platform
+ [regression] Nested servers can select wrong platform
description: updated
Daniel van Vugt (vanvugt) wrote :

See also relevant bug 1526595

PS Jenkins bot (ps-jenkins) wrote :

Fix committed into lp:mir at revision None, scheduled for release in mir, milestone 0.19.0

Changed in mir:
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mir - 0.19.0+16.04.20160128-0ubuntu1

---------------
mir (0.19.0+16.04.20160128-0ubuntu1) xenial; urgency=medium

  [ Brandon Schaefer ]
  * New upstream release 0.19.0
    - API summary:
      . mirclient abi unchanged at 9
      . mirserver abi bumped to 37
      . mircommon abi unchanged at 5
      . mirplatform abi unchanged at 11
      . mirprotobuf abi unchanged at 3
      . mirplatformgraphics abi bumped to 7
      . mirclientplatform abi bumped to 4
      . mirinputplatform abi bumped to 5
      . mircookie abi bumped to 2
    - Bug fix:
      . Mir servers crash on mouse input (LP: #1528438)
      . Pinch to zoom not working reliably (LP: #1531517)
      . Passing DisplayConfiguration scale property from
        nested server to host appears to not work (LP: #1535780)
      . Various TSan reports when running test suite
        on a mir tsan enabled build (LP: #1523647)
      . Buffer leak during repeated mirscreencasts
        causes server to be killed (LP: #1523900)
      . Cursor now displays correctly (LP: #1526779)
      . ProgramOption::parse_file() reports problems to cerr (LP: #1190165)
      . Nested servers can select wrong platform (LP: #1515558)
      . There seems to be missing RTTI information
        when linking with UBSan enabled (LP: #1521930)
      . Mir threadsanitizer build fails with GCC (LP: #1522581)
      . After "make install" mir_demo_server cannot
        find shared object file in /usr/local/lib (LP: #1522836)
      . Fixed a test in TestClientInput (LP: #1523965)
      . Mir servers choose graphics-dummy (or no driver at all)
        over mesa-kms on a desktop (LP: #1528082)
      . Function mir_event_get_close_surface_event is never used (LP: #1447690)
      . mir::input::Surface::consume(MirEvent const& event)
        should not take a reference to an opaque type (LP: #1450797)
      . lintian: E: mir-doc: privacy-breach-logo (LP: #1483471)

  [ CI Train Bot ]
  * No-change rebuild.

 -- Brandon Schaefer <email address hidden> Thu, 28 Jan 2016 12:19:47 +0000

Changed in mir (Ubuntu):
status: Invalid → Fix Released
Changed in mir:
status: Fix Committed → Fix Released