Unity8 shows black screen with Qt 5.4.0

Bug #1403758 reported by Timo Jyrinki
This bug affects 1 person
Affects                         Status        Importance  Assigned to  Milestone
binutils                        Fix Released  Medium
binutils (Ubuntu)               Fix Released  Undecided   Unassigned
gcc-defaults (Ubuntu)           Incomplete    Undecided   Unassigned
qtbase-opensource-src (Ubuntu)  Fix Released  Undecided   Unassigned
ubuntu-ui-toolkit (Ubuntu)      Invalid       Undecided   Unassigned
unity8 (Ubuntu)                 Fix Released  High        Unassigned

Bug Description

With bug #1403511 taken care of in Qt, apps no longer crash with Qt 5.4 and the device no longer goes into a reboot loop. However, nothing is visible on the screen.

It seems unity8, unity8-dash etc are all running. unity8.log attached

Relevant upstream links referring to GCC5 as the reason to require -fPIC:
http://code.qt.io/cgit/qt/qtbase.git/commit/?id=36d6eb721e7d5997ade75e289d4088dc48678d0d
http://code.qt.io/cgit/qt/qtbase.git/commit/?id=3eca75de67b3fd2c890715b30c7899cebc096fe9

Tags: qt5.4

In , Giuseppe-dangelo (giuseppe-dangelo) wrote :

Created attachment 7474
testcase

The attached program changes the output from "true" to "false" when the -Bsymbolic / -Bsymbolic-functions options are passed to GCC. This happens on ARM -- on x86-64 output is always "true".

The program involves a comparison, within a shared library, of a PMF defined inside the shared library itself with the same PMF passed by the application.

Compile with:

 > g++ -fPIC -shared -Wall -o libshared.so -Wl,-Bsymbolic shared.cpp
 > g++ -fPIE -Wall -o main main.cpp -L. -lshared

(The long story is that Qt 5 is taking PMFs in its public API, and the comparison failing inside of Qt shared libraries is breaking code on ARM, as -Bsymbolic is set by default there.)
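
The attachment itself is not reproduced in this report. As a rough single-file sketch of the kind of comparison described above (the names `Shared`, `slot`, `isSlot` and `applicationCheck` are hypothetical and not taken from the attachment), the check looks something like this:

```cpp
#include <iostream>

// Hypothetical stand-in for the class that shared.cpp would export.
struct Shared {
    void slot() {}

    // In the two-file testcase this comparison is compiled into libshared.so,
    // so "&Shared::slot" here is resolved from inside the library.
    static bool isSlot(void (Shared::*pmf)()) {
        return pmf == &Shared::slot;
    }
};

// Hypothetical stand-in for main.cpp: the application takes the same PMF
// and hands it to the library for comparison.
bool applicationCheck() {
    return Shared::isSlot(&Shared::slot);
}

// Prints the result the way the report describes it ("true" or "false").
void printResult() {
    std::cout << std::boolalpha << applicationCheck() << '\n';
}
```

Built as a single translation unit this always compares equal. The reported failure only shows up when the two halves are split into libshared.so and main and linked with the commands above on an affected ARM toolchain.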

The bug has been acknowledged, and a tentative patch has been kindly provided by W. Newton here:

> https://sourceware.org/ml/binutils/2014-01/msg00172.html

but there hasn't been any activity from what I can see, so I'm opening this bug report to keep track of the issue.

References:

> http://lists.linaro.org/pipermail/linaro-toolchain/2014-January/003942.html
> https://bugreports.qt-project.org/browse/QTBUG-36129

In , V-thomas-i (v-thomas-i) wrote :

Created attachment 7483
C-only testcase

In , V-thomas-i (v-thomas-i) wrote :

Attached a C-only testcase (forgot to rename the files from .cpp to .c, but I don't think that matters). The testcase simply outputs the function pointer of a function in a shared library, and one can see that the address is not the same.

Compile with:
gcc -fPIC -shared -Wall -o libshared.so -Wl,-Bsymbolic shared.cpp
gcc -fPIE -Wall -o main main.cpp -L. -lshared

The output on my machine is:
# ./main
0x8518
0x2ac595cc

The root of the problem seems to be that main gets the following undefined symbol:
   Num: Value Size Type Bind Vis Ndx Name
    14: 0000853c 0 FUNC GLOBAL DEFAULT UND testFunction()

This is a special hack - an undefined symbol that nevertheless has a value! See http://www.airs.com/blog/archives/42 for details on how that hack is supposed to work.
The idea is to use a different symbol value depending on the relocation type - a relocation for R_ARM_JUMP_SLOT should always resolve to the local PLT, but a relocation of type R_ARM_GLOB_DAT should use the symbol value defined in main. The problem is that libshared.so doesn't contain a R_ARM_GLOB_DAT relocation when taking the address of the function, but a R_ARM_RELATIVE relocation, and will therefore never resolve the function address to the value in main.
I guess this is a consequence of using -Bsymbolic - when taking the address of a function, it should still go through the GOT, with a R_ARM_GLOB_DAT relocation, despite -Bsymbolic being set.

Or at least that is what I think is the cause of the problem. I am by no means a linker expert; I only recently started being interested in this topic.

In , V-thomas-i (v-thomas-i) wrote :

Created attachment 7484
C-only testcase

In , V-thomas-i (v-thomas-i) wrote :

Note that on x86-64, this works, but is *not* solved like the hacky method that Ian Lance Taylor describes in his blog.

Instead of using an undefined symbol with an actual value in main, x86-64 will create a normal undefined symbol with a 0 value:
    12: 0000000000000000 0 FUNC GLOBAL DEFAULT UND testFunction

This works well: the relocation in libshared.so is a relative relocation that resolves to the actual address of testFunction (not the PLT stub), and the relocation in main is a R_X86_64_GLOB_DAT relocation that resolves to the address of testFunction in libshared.so, and all just works.

This is less hacky, and I think also the proposed patch from the mailing list solves the problem this way.
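
To make the reasoning in the last comments easier to follow, here is a toy model of the address each side ends up with. It is only a sketch and not real dynamic-linker code: the type and function names are hypothetical, and the two sample addresses used in the usage note are simply the values printed by ./main earlier in this report.

```cpp
#include <cstdint>

// Toy model only: the relocation kinds taken by the library when it
// references its own function address.
enum class Reloc { GlobDat, Relative };

struct Link {
    std::uintptr_t mainSymbolValue; // st_value of the UND symbol in main (0 = plain undefined)
    std::uintptr_t libDefinition;   // where testFunction really lives in libshared.so
};

// Address the executable sees when it takes the function's address:
// its own symbol value if that is non-zero, else the library definition.
std::uintptr_t seenByMain(const Link& l) {
    return l.mainSymbolValue != 0 ? l.mainSymbolValue : l.libDefinition;
}

// Address the library sees for the same function.
std::uintptr_t seenByLibrary(const Link& l, Reloc r) {
    if (r == Reloc::Relative)
        return l.libDefinition; // load-base relative, no symbol lookup at all
    return seenByMain(l);       // symbol lookup, so main's entry is honoured
}

// True when both sides agree, i.e. pointer comparison works.
bool pointersMatch(const Link& l, Reloc r) {
    return seenByMain(l) == seenByLibrary(l, r);
}
```

With `Link{0x8518, 0x2ac595cc}` and `Reloc::Relative` the two sides disagree, which is the ARM situation described above. With a zero symbol value in main, as on x86-64, both sides resolve to the library's definition and the comparison holds.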

In , Cvs-commit (cvs-commit) wrote :

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "gdb and binutils".

The branch, master has been updated
       via 97323ad11305610185a0265392cabcd37510f50e (commit)
      from e1f8f1b3af798e8af99bffdb695f74c6c916d150 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=97323ad11305610185a0265392cabcd37510f50e

commit 97323ad11305610185a0265392cabcd37510f50e
Author: Will Newton <email address hidden>
Date: Fri Jan 10 14:38:58 2014 +0000

    bfd/elf32-arm.c: Set st_value to zero for undefined symbols

    Unless pointer_equality_needed is set then set st_value to be zero
    for undefined symbols.

    bfd/ChangeLog:

    2014-03-20 Will Newton <email address hidden>

     PR ld/16715
     * elf32-arm.c (elf32_arm_check_relocs): Set
     pointer_equality_needed for absolute references within
     executable links.
     (elf32_arm_finish_dynamic_symbol): Set st_value to zero
     unless pointer_equality_needed is set.

    ld/testsuite/ChangeLog:

    2014-03-20 Will Newton <email address hidden>

     * ld-arm/ifunc-14.rd: Update symbol values.

-----------------------------------------------------------------------

Summary of changes:
 bfd/ChangeLog | 9 +++++++++
 bfd/elf32-arm.c | 7 ++++++-
 ld/testsuite/ChangeLog | 4 ++++
 ld/testsuite/ld-arm/ifunc-14.rd | 4 ++--
 4 files changed, 21 insertions(+), 3 deletions(-)

In , Giuseppe-dangelo (giuseppe-dangelo) wrote :

Thanks to Will for merging the patch. I'm not closing this just yet as we got reports of the same kind of breakage on PPC.

Timo Jyrinki (timo-jyrinki) wrote :
Albert Astals Cid (aacid) wrote :

I've been investigating this today and there are at least two separate issues:

**Only** on the phone you get the log Timo attaches (or something along those lines). That is caused by QQmlPropertyValidator::canCoerce failing. I've talked to Simon Hausmann from The Qt Company and the situation we have (two QQmlPropertyCache with different pointers but the same QMetaObject pointer) is technically impossible, yet we're having it. If I change that function to do

    while (fromMo && toMo) {
        if (fromMo->metaObject() && toMo->metaObject() && fromMo->metaObject()->className() == toMo->metaObject()->className())
            return true;
        fromMo = fromMo->parent();
    }

instead of the original loop comparing toMo and fromMo, I can work around the bug (but this is just a hack) and get to the next bug.
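
To illustrate why comparing by name hides the problem while comparing by pointer exposes it, here is a small self-contained sketch. It is not Qt's implementation: `Meta`, `canCoerceByPointer` and `canCoerceByName` are hypothetical names, and the name check uses std::strcmp rather than the pointer comparison of className() in the hack above.

```cpp
#include <cstring>

// Hypothetical, minimal stand-in for a type descriptor with a parent chain.
struct Meta {
    const char* className;
    const Meta* parent;
};

// Identity-based check: walks the parent chain of `from` and succeeds
// only if it reaches the very same descriptor object as `to`.
bool canCoerceByPointer(const Meta* from, const Meta* to) {
    for (; from; from = from->parent)
        if (from == to)
            return true;
    return false;
}

// Name-based check: succeeds as soon as a descriptor in the chain
// carries the same class name, even if it is a different object.
bool canCoerceByName(const Meta* from, const Meta* to) {
    for (; from; from = from->parent)
        if (std::strcmp(from->className, to->className) == 0)
            return true;
    return false;
}
```

If two distinct `Meta` objects both describe "QSortFilterProxyModelQML", the pointer check rejects the assignment and the name check accepts it. That is why the hack gets past the first failure without explaining where the second copy comes from.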

On the desktop (or on the phone after working around the first issue), you will get errors saying
"Unable to assign QSortFilterProxyModelQML to QSortFilterProxyModelQML"

These errors seem to be caused somehow by Ubuntu.Components since if i un-import it, they go away. You can also workaround them by changing the code from
  property SortFilterProxyModel categories: categoryFilter
to
  property var categories: categoryFilter

Which you could even argue it's better code in some cases, but still we need to investigate why it's happening.

In summary, there are two bugs:
 Bug #1 only happens on the phone (probably because of ARM)
 Bug #2 is triggered by importing Ubuntu.Components

Both need investigation. I'll continue tomorrow if no one beats me to it :D

Zsombor Egri (zsombi) wrote :

Checking how the type is exported - and all related types to that one - I see a potential bug in UITK. The way it is exported to version 1.1 is wrong:

 qmlRegisterType<QSortFilterProxyModelQML>(uri, 1, 1, "SortFilterModel");

And it should be

 qmlRegisterType<QSortFilterProxyModelQML, 1>(uri, 1, 1, "SortFilterModel");

More, as the type is supposed to be used only with version 1.1 and above, all properties/slots/signals would need to be revisioned correctly.

Albert Astals Cid (aacid) wrote :

Oh, the SDK has a QSortFilterProxyModelQML too?

Now I see why we are getting this error now: we have one in Unity8 too and that's why it is getting confused. It is a regression against Qt 5.3 but I'm not sure it's totally QML's fault to be honest :D

Albert Astals Cid (aacid) wrote :

So second bug is https://bugreports.qt-project.org/browse/QTBUG-43463 but tbh I can see them marking this just as a won't fix.

Timo Jyrinki (timo-jyrinki) wrote :

Long discussions on #qt-labs, mostly between tsdgeos and tronical, about various things. Summarizing:

* <+tronical> tsdgeos: so I think something is wrong with the unit8 binary, my _guess_ is that it isn't built with -fPIC
-> it is built with -fPIC (and -fPIE, also asked about, "ok, that looks good")
* <+tronical> oh wait, there used to be a gcc ARM bug with copy relocations that affected this <+tronical> tsdgeos: could it be that qtbase 5.3 was built with -reduce-relocations and now it isn't anymore?
-> no, both 5.3 and 5.4 built _with_ -reduce-relocations
* binutils has had bugs, Ubuntu has the version with fixes
* -Bsymbolic-functions is system wide default in Ubuntu
-> "well, it explains where that comes from. it doesn't explain why the relocation in unit8 resolved to the wrong address"
* <+tronical> peppe, Mirv, tsdgeos: I realize that this is unrelated to -Bsymbolic-functions because this isn't about functions but a global symbol/variable
* <+tronical> peppe, Mirv, tsdgeos: this particular problem is new and different. QObject::staticMetaObject is a global symbol and here it resolves to two different addresses
* <+tronical> tsdgeos: still there? I have an idea how we may be able to find out where that bad pointer is coming from (it may not necessarily by the unity8 binary)

...to be continued.

Albert Astals Cid (aacid) wrote :

Second bug (i.e. not starting in the desktop) should be fixed by https://code.launchpad.net/~aacid/unity8/unitySortFilterProxyQML/+merge/245198

First bug still under investigation as Timo reports ↑↑

Timo Jyrinki (timo-jyrinki) wrote :

Trying to represent the meaningful parts of the rest of the discussion below.

<+tronical> tsdgeos: in QQmlEnginePrivate::registerBaseTypes can you insert a qDebug() << "QObject::staticMetaObject is at" <<
&QObject::staticMetaObject; ?
<+tronical> tsdgeos: if we get the wrong address there, then maybe we have a miscompilation of QtQml somehow
< tsdgeos> tronical: yeah it's wrong there
<+tronical> tsdgeos: ok, that means there's something wrong with libQt5Qml.so
< tsdgeos> removing -O2 didn't help
<+tronical> ok, that dump looks sane though
<+tronical> in particular QObject::staticMetaObject is referenced through a R_ARM_GOT32 relocation
<+tronical> can you paste the output of objdump -TRDC /path/to/unity8 ?
< tsdgeos> yes
< tsdgeos> tronical: http://paste.ubuntu.com/9570566/
<+tronical> aha!!
<+tronical> this one has indeed a copy relocation to QObject::staticMetaObject, which it shouldn't have if it's compiled with -fPIE I think

(... -fPIE is in use ...)

<+tronical> tsdgeos: _some_ source code in the unity8 sources references QObject::staticMetaObject
<+tronical> tsdgeos: in that .o file the reference to QObject::staticMetaObject should be through the procedure linkage table, but in your case it must be an absolute reference/relocation, which will cause the linker in turn to create a so-called copy relocation for unity8 as binary. I think it shouldn't have that
< tsdgeos> builddir/tests/mocks/Unity/Indicators/moc_sharedunitymenumodel.cpp: { &QObject::staticMetaObject, qt_meta_stringdata_SharedUnityMenuModel.data,
<+tronical> oh, that's just a test
<+tronical> still, ok, suppose it were valid, then the other question is why when loading the plugin that has QSortFilterProxyModelQML the reference to QObject::staticMetaObject isn't changed to point to the copy relocation
<+tronical> tsdgeos: could you try running your app with LD_DEBUG=bindings,symbols and collect the output? it's going to be a lot, so you could email it if it's too big for a paste
< tsdgeos> sent
<+tronical> tsdgeos: the ld output looks ok but I don't understand yet why the one file that has the QSortFilterProxyModelQML isn't relocated correctly
<+tronical> tsdgeos: that comes from libunity8-private.so, right?
<+tronical> tsdgeos: I have to leave. let's look at this early next year :)

Michał Sawicz (saviq)
Changed in unity8 (Ubuntu):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Albert Astals Cid (aacid)
Albert Astals Cid (aacid) wrote :

After reverting the patch we have in qtbase that enables Bsymbolic/reduce-relocations for arm, it is back to working.

So this needs someone who understands compiler/linker stuff to have a look, if we want to keep that patch in. Otherwise we should remove the use of reduce-relocations on arm.

Changed in qtbase-opensource-src (Ubuntu):
status: New → Fix Committed
Changed in ubuntu-ui-toolkit (Ubuntu):
status: New → Invalid
Michał Sawicz (saviq)
Changed in unity8 (Ubuntu):
status: In Progress → Fix Released
JACQUELINE (ijdisabest)
Changed in qtbase-opensource-src (Ubuntu):
status: Fix Committed → Confirmed
Changed in qtbase-opensource-src (Ubuntu):
status: Confirmed → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qtbase-opensource-src - 5.4.0+dfsg-4ubuntu2

---------------
qtbase-opensource-src (5.4.0+dfsg-4ubuntu2) vivid; urgency=medium

  [ Timo Jyrinki ]
  * New upstream release.
  * debian/patches/qimage_conversions.cpp-Fix-little-endian-build.patch:
    - Fix PowerPC build (LP: #1400244)
  * Remove patches:
    - debian/patches/Always-lock-the-DBus-dispatcher-before-dbus_connecti.patch
    - debian/patches/Don-t-always-chmod-the-XDG_RUNTIME_DIR.patch
    - debian/patches/Fix-crash-in-QNetworkAccessCacheBackend-closeDownstr.patch
    - debian/patches/Partially-revert-Fix-a-deadlock-introduced-by-the-ra.patch
    - debian/patches/QDBusConnection-Merge-the-dispatch-and-the-watch-and.patch
    - debian/patches/Report-the-system-error-on-why-chmod-2-failed-in-XDG.patch
    - debian/patches/Reset-QNAM-s-NetworkConfiguration-when-state-changes.patch
    - debian/patches/Support-dual-sim-in-QtBearer-s-networkmanager-backen.patch
    - debian/patches/Use-a-property-cache-to-cut-down-on-blocking-calls.patch
    - debian/patches/dbus_correct_signal_name_disconnect.patch
    - debian/patches/fix_bug_in_internal_comparison_operator.patch
    - debian/patches/fix_sparc_atomics.patch
    - debian/patches/prefer_qpa_for_systemtrayicon.patch
    - debian/patches/update-QtBearer-NetworkManager-backend-API.patch
  * Include the networkmanager backend changes from 5.4.1
  * Bump ABI version to 5.4.0
  * debian/patches/Resolve-GLES3-functions-from-the-shared-lib.patch
    - Fix usage on OpenGL ES2 platforms (LP: #1403511)
  * Sync with Debian 5.4.0+dfsg-4
  * debian/patches/enable-tests.patch:
    - Refresh and enable for 5.4.0 (LP: #1403582)
    - Disable the tests for new QStorageInfo class which partially fail
    - Disable some widgets tests that fail (desktop only)
    - Disable one qlogging test
  * Drop reduce-relocations option from configure, since it causes black screen
    for Unity8 on armhf because of linking problems. Comment out the related
    revert of earlier upstream commit. (LP: #1403758)
  * debian/patches/Add-C++11-if-available-for-QVariant-autotest.patch
    - Fix tst_qvariant (LP: #1408273)

  [ Łukasz 'sil2100' Zemczak ]
  * debian/patches/enable_pie.patch:
    - Add fix for QObject::connect failing on ARM

qtbase-opensource-src (5.4.0+dfsg-4) experimental; urgency=medium

  * debian/patches/bsd_statfs.diff: Third attempt to fix the build
    failure on kfreebsd.
  * Update symbols files for mips.

qtbase-opensource-src (5.4.0+dfsg-3) experimental; urgency=medium

  * More debian/copyright updates.
  * Do not ship htmlinfo example which contains non-free HTML pages.
  * Drop remove_icon_from_example.patch and remove_google_adsense.patch,
    no longer needed with the above change.
  * Update symbols files with buildds’ logs.
  * debian/patches/bsd_statfs.diff: Second attempt to fix the build
    failure.

qtbase-opensource-src (5.4.0+dfsg-2) experimental; urgency=medium

  * Add a patch to fix qstorageinfo_unix.cpp build on kFreeBSD.
  * Add a patch to fix qimage_conversions.cpp build on big endian
    systems.
  * Update symbols files with buildds’ logs.

qtbase-opensource-src (5.4.0+dfsg-1) experimental; ur...


Changed in qtbase-opensource-src (Ubuntu):
status: Fix Committed → Fix Released
Timo Jyrinki (timo-jyrinki) wrote :

This still happens on current wily. Using the -reduce-relocations option plus opting to use it on arm (upstream still carries a patch that disables symbolic function binding on non-x86) causes Unity 8 to just show a black screen.

description: updated
Timo Jyrinki (timo-jyrinki) wrote :

With some more building, it seems compiling qtbase with GCC5 does not help in enabling the upstream (now default in Qt 5.5 & 5.4.2) patches, bringing back the -Bsymbolic for armhf and using the -reduce-relocations option. The symptoms are similar with this qtbase compiled with GCC5. I've saved the compilation at https://launchpad.net/~timo-jyrinki/+archive/ubuntu/qt-reduce-relocations and attaching unity8.log (with "Cannot assign object to list").

Timo Jyrinki (timo-jyrinki) wrote :

Just omitting "-reduce-relocations" doesn't change anything. For reference, http://paste.ubuntu.com/11431050/ lists the changes that I tried first - enabling the four commented out patches, modifying the enable_pie.patch accordingly which is actually also from upstream even though listed as being Ubuntu specific, and adding the gcc5.patch.

Timo Jyrinki (timo-jyrinki) wrote :

This seems resolved for now. We're using -fPIC, but not -reduce-relocations (-Bsymbolic) on armhf, and with the latest upstream Qt 5.4.2 defaults backported to 5.4.1 everything seems in order, at least for now. This may need to be revisited with Qt 5.5.0 and/or GCC5.

description: updated
Changed in gcc-defaults (Ubuntu):
status: New → Incomplete
Changed in binutils (Ubuntu):
status: New → Incomplete
Timo Jyrinki (timo-jyrinki) wrote :

Just to collect information in this bug, the fPIC requirement and fPIE banning in Qt now is because of this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65886

This bug, however, was originally about a problem that shows up when the related -Bsymbolic option is also enabled on armhf. Enabling it there needed forcing, which we did at one point because the toolchain bug was believed to be fixed. For now it remains disabled on non-x86 (the Qt upstream default) because the bug persists, but it may be re-enabled later if the bug is resolved.

The toolchain bug (fPIC + Bsymbolic misbehaving on arm) is last described at https://bugreports.qt.io/browse/QTBUG-47350

affects: unity8 → binutils
Matthias Klose (doko) wrote :

The binutils issue is fixed upstream. Closing the binutils task.

Changed in binutils (Ubuntu):
status: Incomplete → Fix Released
In , Alan Modra (amodra-gmail) wrote :

Fixed

Changed in binutils:
importance: Unknown → Medium
status: Unknown → Fix Released
Michał Sawicz (saviq)
Changed in unity8 (Ubuntu):
assignee: Albert Astals Cid (aacid) → nobody