X11 crashes with seg fault when running QT5 based applications on a Pandaboard with the SGX driver

Bug #1015292 reported by Ricardo Salveti
This bug affects 1 person
Affects                        Status         Importance  Assigned to      Milestone
ubuntu-omap4-extras-graphics   New            Undecided   Unassigned
pvr-omap4 (Ubuntu)             Invalid        High        Unassigned
  Precise                      Won't Fix      Undecided   Unassigned
  Quantal                      Invalid        High        Unassigned
xf86-video-omap (Ubuntu)       Fix Released   High        Ricardo Salveti
  Precise                      Invalid        Undecided   Unassigned
  Quantal                      Fix Released   High        Ricardo Salveti
xorg-server (Ubuntu)           Fix Released   High        Ricardo Salveti
  Precise                      Fix Released   High        Ricardo Salveti
  Quantal                      Fix Released   High        Ricardo Salveti

Bug Description

[Impact]
Segmentation fault in the X server: the randr code could use the randr screen private data without first checking it for NULL. This happens when the X server is running with multiple screens, some of which are randr enabled and some of which are not. Applications making protocol requests to the non-randr screens can cause segfaults when the server touches the unset private structure.

This was first seen while running Precise on a Pandaboard: with driver auto-loading, X starts 2 different screens, one backed by the PVR SGX driver and the other by fbdev. In this case the issue can easily be reproduced by running any Qt 5 based application, since by default it tries to initialize its internal structures for all screens available on the system.

The bug can also happen when the user runs one screen with the nvidia/ati driver and the other with fbdev (an external USB video device, for example).

[Test Case]
How to reproduce the issue on a Pandaboard:
1) Install Precise on a Pandaboard;
2) Enable the PVR SGX driver from the "Additional Drivers" screen;
3) Enable https://launchpad.net/~canonical-qt5-edgers/+archive/qt5-daily
4) Install the 'snowshoe-mobile' package
5) Run snowshoe: $ PATH=/opt/qt5/bin:$PATH; snowshoe

Broken Behavior: X11 will exit with a seg fault
Fixed Behavior: The QT5 based application (snowshoe) will open without crashing X11.

[Regression Potential]
Both patches are already applied upstream, and they only add NULL checks on pointers that would otherwise cause a seg fault when NULL, so they are safe to apply as an SRU.

[Original Report]

While testing Qt 5 support on Ubuntu, and validating the support for OpenGL ES2.0 with Pandaboard, I couldn't start Snowshoe (Qt 5 - webkit based browser) as it gives a segmentation fault and also breaks the X11 server (with the pvr driver).

After a quick check with Snowball (Mali 400), it worked properly and as expected, so this is probably related to the current SGX driver available for Pandaboard.

How to reproduce the issue:
1) Enable https://launchpad.net/~canonical-qt5-edgers/+archive/qt5-daily
2) Install 'snowshoe-mobile' package
3) Run snowshoe: $ PATH=/opt/qt5/bin:$PATH; snowshoe

This is with Ubuntu 12.04 with pvr-omap4 1.7.10.0.1.21-0ubuntu1 (from archive) and also 1.7.15.0.1.57-1 from TI's PPA.

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Trace from the core dump (with qt enabled with dbg symbols):

root@ubuntu-desktop:/home/ubuntu/qt5/snowshoe-1.0~git+20120608# gdb ./snowshoe core
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /home/ubuntu/qt5/snowshoe-1.0~git+20120608/snowshoe...done.
[New LWP 8591]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `./snowshoe'.
Program terminated with signal 11, Segmentation fault.
#0 0xb2453698 in QXcbConnection::initializeXFixes (this=0x210bc70) at qxcbconnection.cpp:1045
1045 xfixes_first_event = reply->first_event;
(gdb) l
1040
1041 void QXcbConnection::initializeXFixes()
1042 {
1043 xcb_generic_error_t *error = 0;
1044 const xcb_query_extension_reply_t *reply = xcb_get_extension_data(m_connection, &xcb_xfixes_id);
1045 xfixes_first_event = reply->first_event;
1046
1047 xcb_xfixes_query_version_cookie_t xfixes_query_cookie = xcb_xfixes_query_version(m_connection,
1048 XCB_XFIXES_MAJOR_VERSION,
1049 XCB_XFIXES_MINOR_VERSION);
(gdb) bt full
#0 0xb2453698 in QXcbConnection::initializeXFixes (this=0x210bc70) at qxcbconnection.cpp:1045
        xfixes_query_cookie = {sequence = 0}
        xfixes_query = 0xffffffff
        __PRETTY_FUNCTION__ = "void QXcbConnection::initializeXFixes()"
        error = 0x0
        reply = 0x0
#1 0xb2451d70 in QXcbConnection::QXcbConnection (this=0x210bc70, nativeInterface=0x210bc08, displayName=0x0) at qxcbconnection.cpp:180
        dpy = 0x210bf38
        it = {data = 0x211286c, rem = 0, index = 1632}
        screenNumber = 2
#2 0xb24558ac in QXcbIntegration::QXcbIntegration (this=0x21047e8, parameters=...) at qxcbintegration.cpp:101
No locals.
#3 0xb24663be in QXcbIntegrationPlugin::create (this=0x21005e8, system=..., parameters=...) at main.cpp:66
No locals.
#4 0xb49db386 in qLoadPlugin1<QPlatformIntegration, QPlatformIntegrationFactoryInterface, QStringList> (loader=0x2100408, key=..., parameter1=...)
    at ../../include/QtCore/5.0.0/QtCore/private/../../../../../src/corelib/plugin/qfactoryloader_p.h:118
        result = 0xbecd8650
        factory = 0x21005f0
        factoryObject = 0x21005e8
        index = 3
#5 0xb49dac0e in QPlatformIntegrationFactory::create (key=..., platformPluginPath=...) at kernel/qplatformintegrationfactory_qpa.cpp:72
        ret = 0xbecd8464
        paramList = {<QList<QString>> = {{p = {static shared_null = {ref = {atomic = {_q_value = -1}}, alloc = 0, begin = 0, end = 0, array = {0x0}},
                d = 0x21003b0}, d = 0x...


Changed in pvr-omap4 (Ubuntu):
status: New → Confirmed
importance: Undecided → High
Revision history for this message
Ricardo Salveti (rsalveti) wrote :

The seg fault cause:

Breakpoint 1, QXcbConnection::initializeXFixes (this=0x1fbc88) at qxcbconnection.cpp:1042
1042 {
(gdb) l
1037 xcb_get_input_focus_cookie_t cookie = Q_XCB_CALL(xcb_get_input_focus(xcb_connection()));
1038 free(xcb_get_input_focus_reply(xcb_connection(), cookie, 0));
1039 }
1040
1041 void QXcbConnection::initializeXFixes()
1042 {
1043 xcb_generic_error_t *error = 0;
1044 const xcb_query_extension_reply_t *reply = xcb_get_extension_data(m_connection, &xcb_xfixes_id);
1045 xfixes_first_event = reply->first_event;
1046
(gdb) n
[Thread 0xb10b0460 (LWP 8635) exited]
1044 const xcb_query_extension_reply_t *reply = xcb_get_extension_data(m_connection, &xcb_xfixes_id);
(gdb) p m_connection
$1 = (xcb_connection_t *) 0x1fc9b0
(gdb) p xcb_xfixes_id
$2 = {name = 0xb2433b28 "XFIXES", global_id = 2}
(gdb) n
1045 xfixes_first_event = reply->first_event;
(gdb) p reply
$3 = (const xcb_query_extension_reply_t *) 0x0

Qt expects the first call to 'xcb_get_extension_data' to return a valid pointer, but in this case it returns NULL, and the dereference on the next line (reply->first_event) crashes.
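A client-side guard for this case could look like the sketch below. It is only a minimal model, not Qt or xcb code: `ext_reply_t`, `get_extension_data()` and `init_xfixes()` are hypothetical stand-ins for `xcb_query_extension_reply_t`, `xcb_get_extension_data()` and `QXcbConnection::initializeXFixes()`, and the `lookup_fails` flag simply simulates the NULL result seen in the gdb session.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for xcb_query_extension_reply_t; only the fields
 * needed for the illustration are modelled. */
typedef struct {
    uint8_t present;
    uint8_t first_event;
} ext_reply_t;

/* Hypothetical stand-in for xcb_get_extension_data(): hands back the
 * cached reply, or NULL when the lookup fails. */
static const ext_reply_t *get_extension_data(bool lookup_fails,
                                             const ext_reply_t *cached)
{
    return lookup_fails ? NULL : cached;
}

/* Guarded initialisation: reports failure instead of dereferencing a
 * NULL (or "extension not present") reply. */
static bool init_xfixes(bool lookup_fails, const ext_reply_t *cached,
                        uint8_t *first_event_out)
{
    const ext_reply_t *reply = get_extension_data(lookup_fails, cached);

    if (reply == NULL || !reply->present)
        return false;

    *first_event_out = reply->first_event;
    return true;
}
```

With a guard like this the application can fail cleanly (or skip the extension) rather than segfault, although it would not by itself prevent the X server crash described further down.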

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

The Qt crash is triggered once xcb flags an error on the connection, which Qt doesn't handle by default:

(gdb) p xcb_xfixes_id
$37 = {name = 0xb2433b28 "XFIXES", global_id = 2}
(gdb) n
Breakpoint 3, xcb_get_extension_data (c=0x1fc9b0, ext=0xb243c02c) at ../../src/xcb_ext.c:85
85 ../../src/xcb_ext.c: No such file or directory.
(gdb) n
87 in ../../src/xcb_ext.c
(gdb) p c->has_error
$39 = 1

....

Qt:

void QXcbConnection::initializeXFixes()
{
    xcb_generic_error_t *error = 0;
    const xcb_query_extension_reply_t *reply = xcb_get_extension_data(m_connection, &xcb_xfixes_id);
    xfixes_first_event = reply->first_event;
...

As this is causing the X11 server to crash as well, I believe this is probably related to an issue on the xrandr side, as it's the first extension Qt loads before moving on with the application.

Revision history for this message
Ricardo Salveti (rsalveti) wrote : Re: QT5 based applications fails with a segmentation fault with Pandaboard and the SGX driver

The issue on the X11 side:
Program received signal SIGSEGV, Segmentation fault.
RRFirstOutput (pScreen=0x2a1880c0) at randr.c:458
458 if (pScrPriv->primaryOutput && pScrPriv->primaryOutput->crtc)
(gdb) bt full
#0 RRFirstOutput (pScreen=0x2a1880c0) at randr.c:458
        pScrPriv = 0x0
        output = <optimized out>
        i = <optimized out>
        j = <optimized out>
#1 0x2a0a5834 in ProcRRGetScreenInfo (client=0x2a233f08) at rrscreen.c:615
        stuff = <optimized out>
        rep = {type = 232 '\350', setOfRotations = 66 'B', sequenceNumber = 10777, length = 705951608, root = 3204445656, timestamp = 706298576, configTimestamp = 704846133, nSizes = 0, sizeID = 0,
          rotation = 63348, rate = 10771, nrateEnts = 16136, pad = 0}
        pWin = 0x2a1b0258
        n = <optimized out>
        rc = 0
        pScreen = 0x2a1880c0
        pScrPriv = 0x0
        extra = <optimized out>
        extraLen = <optimized out>
        output = <optimized out>
#2 0x2a09f456 in ProcRRDispatch (client=<optimized out>) at randr.c:493
        stuff = <optimized out>
#3 0x2a02eb76 in Dispatch () at dispatch.c:442
        clientReady = 0x2a2e81f0
        result = 0
        client = 0x2a233f08
        nready = 0
        icheck = 0x2a1430c0
        start_tick = 100
#4 0x2a0242ce in main (argc=3, argv=0xbefff824, envp=<optimized out>) at main.c:287
        i = <optimized out>
        alwaysCheckForInput = {0, 1}

This happens because the code assumes pScrPriv is always set; here it is NULL, so dereferencing it crashes X11.

The root cause is that when X11 starts without any pvr specific config, both the pvr/omap and fbdev drivers are loaded and probed, and as there's no bus specific variable controlling the probe, X ends up starting both drivers.

As a consequence of having both drivers loaded, X exports 2 screens, one with randr support (pvr) and one without (fbdev). The expected pScrPriv struct comes from randr, so when Qt tries to create an xcb window on all available displays, X11 crashes trying to access pScrPriv.

There are two issues here: first, xorg shouldn't be loading 2 screens; second, xorg is not properly validating the pointer before using its contents.
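The second issue (the missing pointer validation) can be illustrated with a small self-contained model. This is a sketch under simplifying assumptions, not the X server source: `output_t`, `rr_scr_priv_t`, `screen_t` and `first_output()` are hypothetical, cut-down stand-ins for the real RandR structures and for RRFirstOutput(), and the real function also walks the screen's CRTCs.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified models of the server structures. */
typedef struct {
    int has_crtc;               /* non-zero when the output drives a CRTC */
} output_t;

typedef struct {
    output_t *primary_output;
} rr_scr_priv_t;

typedef struct {
    rr_scr_priv_t *rr_priv;     /* NULL on a screen without randr (e.g. fbdev) */
} screen_t;

/* Same shape as the crashing check in the backtrace, but returning NULL
 * when the screen has no randr private instead of dereferencing it. */
static output_t *first_output(const screen_t *screen)
{
    rr_scr_priv_t *priv = screen->rr_priv;

    if (priv == NULL)
        return NULL;

    if (priv->primary_output && priv->primary_output->has_crtc)
        return priv->primary_output;

    return NULL;
}
```

In this model a request against the fbdev-like screen yields NULL, which the caller has to treat as "no output", while the randr-enabled screen still returns its primary output.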

summary: - Snowshoe fails with QT5 with a segmentation fault while starting the
- application
+ QT5 based applications fails with a segmentation fault with Pandaboard
+ and the SGX driver
Revision history for this message
Ricardo Salveti (rsalveti) wrote :

One workaround to get Qt 5 working with SGX is simply forcing just the pvr to be loaded by Xorg, creating a specific config:
root@ubuntu-desktop:~# cat /usr/share/X11/xorg.conf.d/99-pvr.conf
# X.Org X server configuration file

Section "Device"
 Identifier "Video Device"
 Driver "pvr"
 Option "FlipChain" "true"
 Option "NoAccel" "false"
EndSection

Section "Monitor"
 Identifier "Main Screen"
EndSection

Section "Screen"
 Identifier "Screen"
 Monitor "Main Screen"
 Device "Video Device"
EndSection

Section "ServerLayout"
 Identifier "Server Layout"
 Screen "Screen"
EndSection

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

If the pointer is also checked at the xorg side, it works as expected. Tested with the patch:

commit 6e83934da0288e9a182c0f7982871ea5eaff1cec
Author: Ricardo Salveti de Araujo <email address hidden>
Date: Wed Jun 20 20:19:09 2012 -0300

    randr: first check pScrPriv before using the pointer at RRFirstOutput

    Fix a seg fault in case pScrPriv fails to be allocated at
    ProcRRGetScreenInfo, which later calls RRFirstOutput.

    Signed-off-by: Ricardo Salveti de Araujo <email address hidden>

diff --git a/randr/randr.c b/randr/randr.c
index 4d4298a..9432819 100644
--- a/randr/randr.c
+++ b/randr/randr.c
@@ -446,6 +446,9 @@ RRFirstOutput(ScreenPtr pScreen)
     RROutputPtr output;
     int i, j;

+    if (!pScrPriv)
+        return NULL;
+
     if (pScrPriv->primaryOutput && pScrPriv->primaryOutput->crtc)
         return pScrPriv->primaryOutput;

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Patch sent upstream and reviewed by Keith:
- http://lists.x.org/archives/xorg-devel/2012-June/031836.html

Then he also proposed a similar patch for other use cases at http://lists.x.org/archives/xorg-devel/2012-June/031888.html.

Once accepted I'll take care of sending a debdiff for quantal and creating a SRU for precise.

Changed in xorg-server (Ubuntu):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Ricardo Salveti (rsalveti)
Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Attached the debdiff for Quantal, to be able to fix the seg fault issue in case the user is using 2 monitors and just one has randr support.

Bryce Harrington (bryce)
description: updated
Revision history for this message
Bryce Harrington (bryce) wrote :

Thanks for these two fixes.

We're actually planning on pulling in 1.13 fairly soon; it's already staged in our git tree. I've verified these two patches are already included in our tree (git commits 32603f57 and 855003c3), so that should sufficiently address this bug.

Changed in xorg-server (Ubuntu Quantal):
status: In Progress → Fix Committed
Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Great, will move the SRU ahead then.

Thanks!

Changed in xorg-server (Ubuntu Precise):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Ricardo Salveti (rsalveti)
Revision history for this message
Ricardo Salveti (rsalveti) wrote :
description: updated
summary: - QT5 based applications fails with a segmentation fault with Pandaboard
- and the SGX driver
+ X11 crashes with seg fault when running QT5 based applications on a
+ Pandaboard with the SGX driver
Revision history for this message
Bryce Harrington (bryce) wrote :

Hi Ricardo,

I noticed this isn't uploaded to -proposed yet (that I can tell), so have freshened your debdiff up and uploaded it.

I made a few minor changes. I changed the number from 1.11.4-0ubuntu10.7 to 1.11.4-0ubuntu10.8 since there's been another SRU made recently. I renumbered the patches from 5xx to 2xx, because the 5xx series are really only for patches to the input stack (yes, it's really confusing...)

We also manage the X stack packaging in git (see https://wiki.ubuntu.com/X/GitUsage). I've committed your debdiff to our git tree, so it can be tracked.

Revision history for this message
Ricardo Salveti (rsalveti) wrote : Re: [Bug 1015292] Re: X11 crashes with seg fault when running QT5 based applications on a Pandaboard with the SGX driver

On Mon, Aug 6, 2012 at 4:02 PM, Bryce Harrington
<email address hidden> wrote:
> Hi Ricardo,
>
> I noticed this isn't uploaded to -proposed yet (that I can tell), so
> have freshened your debdiff up and uploaded it.
>
> I made a few minor changes. I changed the number from
> 1.11.4-0ubuntu10.7 to 1.11.4-0ubuntu10.8 since there's been another SRU
> made recently. I renumbered the patches from 5xx to 2xx, because the
> 5xx series are really only for patches to the input stack (yes, it's
> really confusing...)

Great, thanks for your review.

> We also manage the X stack packaging in git (see
> https://wiki.ubuntu.com/X/GitUsage). I've committed your debdiff to our
> git tree, so it can be tracked.

Thanks for the pointers.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package xorg-server - 2:1.12.99.904-0ubuntu1

---------------
xorg-server (2:1.12.99.904-0ubuntu1) quantal-proposed; urgency=low

  [ Maarten Lankhorst ]
  * New upstream release snapshot (on the way to 1.13).
  * Remove 172_cwgetbackingpicture_nullptr_check.patch:
    - Code is removed now that XAA is gone.

xorg-server (2:1.12.99.904-1) UNRELEASED; urgency=low

  * New upstream release snapshot (on the way to 1.13).
  * Bump minimum required abi, randr, dri2 and gl protos.
 -- Timo Aaltonen <email address hidden> Wed, 08 Aug 2012 14:35:03 +0300

Changed in xorg-server (Ubuntu Quantal):
status: Fix Committed → Fix Released
Revision history for this message
Ricardo Salveti (rsalveti) wrote :

On Quantal the bug is not part of the pvr-omap4 package.

Changed in pvr-omap4 (Ubuntu Quantal):
status: Confirmed → Invalid
Changed in xf86-video-omap (Ubuntu Precise):
status: New → Invalid
Changed in xf86-video-omap (Ubuntu Quantal):
assignee: nobody → Ricardo Salveti (rsalveti)
importance: Undecided → High
status: New → In Progress
Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Enabling support for platformProbe in the xf86-video-omap driver fixes the behavior of loading both the omap and the fbdev drivers, which also fixes the issue with Qt 5 applications.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package xf86-video-omap - 0.4.0-0ubuntu2

---------------
xf86-video-omap (0.4.0-0ubuntu2) quantal; urgency=low

  * Build depending and using quilt for patch management
  * debian/patches/01-adding-support-for-platformProbe.patch:
    - Adding support for platformProbe, for proper platform device support
      (LP: #1015292)
 -- Ricardo Salveti de Araujo <email address hidden> Thu, 23 Aug 2012 02:02:54 -0300

Changed in xf86-video-omap (Ubuntu Quantal):
status: In Progress → Fix Released
Revision history for this message
Chris Halse Rogers (raof) wrote : Please test proposed package

Hello Ricardo, or anyone else affected,

Accepted xorg-server into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/xorg-server/2:1.11.4-0ubuntu10.8 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in xorg-server (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Martin Pitt (pitti) wrote :

Everything already uploaded, unsubscribing sponsors.

tags: added: verification-done
removed: verification-needed
Revision history for this message
Colin Watson (cjwatson) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package xorg-server - 2:1.11.4-0ubuntu10.8

---------------
xorg-server (2:1.11.4-0ubuntu10.8) precise-proposed; urgency=low

  * Add upstream patches to avoid seg fault in case the user is running with
    multiple screens and xrandr is only enabled at one (LP: #1015292):
    - 229_randr_first_check_pScrPriv_before_using_the_pointer.patch
    - 230_randr_catch_two_more_potential_unset_rrScrPriv_uses.patch
 -- Ricardo Salveti de Araujo <email address hidden> Thu, 19 Jul 2012 22:57:12 -0300

Changed in xorg-server (Ubuntu Precise):
status: Fix Committed → Fix Released
Revision history for this message
Steve Langasek (vorlon) wrote :

The Precise Pangolin has reached end of life, so this bug will not be fixed for that release.

Changed in pvr-omap4 (Ubuntu Precise):
status: New → Won't Fix