[regression, intrepid] Xorg servers broken "No core keyboard" and "failed to initialize core devices"

Bug #281610 reported by Herbert V. Riedel
This bug affects 4 people
Affects                Status         Importance   Assigned to
Ubuntu PS3 Port        Fix Released   Critical     Dan Munckton
X.Org X server         Fix Released   High
xorg-server (Ubuntu)   Fix Released   High         Dan Munckton
Intrepid               Fix Released   High         Dan Munckton

Bug Description

Regular Xorg and Xvfb fail to start with:
No core keyboard
Fatal server error:
failed to initialize core devices

This has likely been broken since 2:1.5.1-1ubuntu3 and remains unfixed in 2:1.5.2-1ubuntu4.

Herbert V. Riedel (hvr) wrote :
Herbert V. Riedel (hvr) wrote :
Jonathan Hudson (jh+lpd) wrote :

Me too. PPC Mac G4.
Build Operating System: Linux 2.6.15-51-powerpc64-smp ppc Ubuntu
(--) PCI:*(0@0:16:0) ATI Technologies Inc Radeon RV250 [Mobility FireGL 9000] rev 1, Mem @ 0x0f9f0374/0, 0x0f9f0374/0, I/O @ 0x0f9f0374/0, BIOS @ 0x????????/262079348

Exactly the same results.

Somewhat important, IMHO.

Thomas Champagne (lafeuil) wrote :

I have the same problem. If you want, I can post the list of packages that were upgraded.

Herbert V. Riedel (hvr) wrote :

that'd give me at least an idea which packages to downgrade in order to try to restore functionality and thus determine the culprit...

Matteo Settenvini (tchernobog) wrote :

My PPC64 G5 machine is affected. Downgrading xserver-xorg-core from 2:1.5.1-1ubuntu3 to 2:1.5.1-1ubuntu2 fixes this for me. If your cache is still populated, try with:

sudo dpkg -i /var/cache/apt/archives/xserver-xorg-core_2%3a1.5.1-1ubuntu2_powerpc.deb

and restart gdm (sudo /etc/init.d/gdm restart).

Herbert V. Riedel (hvr) wrote :

alas I don't have it anymore in my cache; is there some package archive of old .deb's?

Herbert V. Riedel (hvr) wrote :
Marcus Asshauer (mcas) wrote :

Thank you for reporting this bug. I can confirm this problem on my iBook G4.

Changed in xorg:
status: New → Confirmed
Herbert V. Riedel (hvr) wrote :

I skimmed quickly through the changes from 1.5.1-1ubuntu2 to 1.5.1-1ubuntu3. They contain occurrences of swap{l,s}() and also some truncating casts of integer pointers, which could account for the endian issue we seem to be experiencing here...
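
For readers unfamiliar with the byte swapping mentioned above, here is a minimal C sketch of what a 32-bit and a 16-bit swap do. The helper names `swap_u32` and `swap_u16` are hypothetical and written only for this illustration; they are not the X server's swapl()/swaps() macros, whose definitions are not shown in this report.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only (not the X server macros). */

/* Reverse the four bytes of a 32-bit value: 0x11223344 -> 0x44332211. */
static uint32_t swap_u32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Reverse the two bytes of a 16-bit value: 0x1122 -> 0x2211. */
static uint16_t swap_u16(uint16_t v)
{
    return (uint16_t)(((v & 0x00FFu) << 8) | ((v & 0xFF00u) >> 8));
}
```

A swap applied to a field of the wrong width, or skipped for one field, can leave a value that looks fine on one byte order and wrong on the other, which is why such code is a natural place to look.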

Matteo Settenvini (tchernobog) wrote :

Can someone knowing X internals have a look at this? This bug should be raised to "critical" priority, since it makes the whole graphical system unusable (for normal users, that means: the whole system unusable).

Loïc Minier (lool) wrote :

Hi all,

I've also noticed this bug; it affects at least ppc, sparc, and hppa.

It's also affecting xvfb obviously.

Milestoning for intrepid.

Cheers

Changed in xorg-server:
importance: Undecided → High
milestone: none → ubuntu-8.10
description: updated
Loïc Minier (lool) wrote :

Can someone please confirm that 2:1.5.1-1ubuntu2 doesn't suffer from this bug and that 2:1.5.1-1ubuntu3 does?

If someone experiencing the bug knows how to "git bisect", then it would be helpful to "git bisect" the failure to the upstream commit causing the regression; the Ubuntu package is kept in git in the ubuntu branch of <git://git.debian.org/git/pkg-xorg/xserver/xorg-server>.

You can also try from upstream git directly if you can reproduce with upstream tip.

Philippe Leroux (phler2) wrote :

I confirm that 2:1.5.1-1ubuntu2 works on my iBook G4 and ubuntu3 does not.

However, I have no keyboard,
which is not quite practical.
I should figure that out.

In , stevewin (stevewin) wrote :

Created an attachment (id=19726)
Xorg log for Fatal server error EnableDevice on Xorg git startup

Using full git xorg version built this week.

PPC 4xx 32-bit platform.

When starting X get the following crash:

---------
Backtrace:
0: /usr/X11R7.4/bin/X(xorg_backtrace+0x4c) [0x101002d0]
1: /usr/X11R7.4/bin/X(xf86SigHandler+0x68) [0x10085bdc]
2: [0x100374]
3: /lib/ld.so.1 [0x4800b6f4]
4: [0x4d]
5: /usr/X11R7.4/bin/X(xf86PostKeyboardEvent+0x58) [0x1009870c]
6: /usr/X11R7.4/lib/xorg/modules/input//kbd_drv.so [0xf6811e0]
7: /usr/X11R7.4/lib/xorg/modules/input//kbd_drv.so [0xf68167c]
8: /usr/X11R7.4/lib/xorg/modules/input//kbd_drv.so [0xf68189c]
9: /usr/X11R7.4/bin/X(EnableDevice+0x16c) [0x1003a380]
10: /usr/X11R7.4/bin/X(InitAndStartDevices+0x158) [0x1003a63c]
11: /usr/X11R7.4/bin/X(main+0x380) [0x100229e4]
12: /lib/tls/libc.so.6 [0xfa36994]
13: /lib/tls/libc.so.6(__libc_start_main+0xb0) [0xfa36ad0]

Fatal server error:
Caught signal 11. Server aborting
----------

Noteworthy are messages surfaced from EnableDevice() in xserver/dix/devices.c :
[dix] cannot find pointer to pair with. This is a bug

Using USB keyboard/mouse

9:~/xorg-git/xserver# cat /proc/bus/input/devices
I: Bus=0003 Vendor=05ac Product=0201 Version=0100
N: Name="Mitsumi Electric Apple USB Keyboard"
P: Phys=usb-PPC-OF USB-1.3.1/input0
S: Sysfs=/class/input/input0
U: Uniq=
H: Handlers=kbd event0
B: EV=120013
B: KEY=10000 7 ff9f207a c14057ff febeffdf ffefffff ffffffff fffffffe
B: MSC=10
B: LED=1f

I: Bus=0003 Vendor=05ac Product=0307 Version=0110
N: Name="Logitech Apple Optical USB Mouse"
P: Phys=usb-PPC-OF USB-1.3.2/input0
S: Sysfs=/class/input/input1
U: Uniq=
H: Handlers=mouse0 event1
B: EV=17
B: KEY=10000 0 0 0 0 0 0 0 0
B: REL=3
B: MSC=10

Relevant xorg.conf sections:

Section "ServerLayout"
        Identifier "X.org Configured"
        Screen 0 "Screen0" 0 0
        InputDevice "Mouse0" "CorePointer"
        InputDevice "Keyboard0" "CoreKeyboard"
EndSection
...
Section "InputDevice"
        Identifier "Keyboard0"
        Driver "kbd"
EndSection

Section "InputDevice"
        Identifier "Mouse0"
        Driver "mouse"
        Option "Protocol" "ExplorerPS/2"
        Option "Device" "/dev/input/mice"
        Option "ZAxisMapping" "4 5 6 7"
EndSection
...

Using the same HW/kernel/X conf, this problem does not occur on an Xorg 7.3 system w/ xorg-server 1.4 (or with previous Xorg versions, i.e. Debian Etch w/ Xorg 7.1). I have also tried another kbd/mouse device (a ThinkPad USB kbd w/ integrated trackpoint/touchpad, which also works with previous Xorg versions); it fails in a similar way.

Full Xorg.0.log attached.

Loïc Minier (lool) wrote :

Philippe, you're saying you have no keyboard with ubuntu2? When did you have keyboard last?

Could you attach a Xorg.log from ubuntu2 and ubuntu3? (sorry should have asked earlier, didn't think of it)

Philippe Leroux (phler2) wrote :

In fact, I installed
dpkg -i xserver-xorg-core_1.5.1-1ubuntu2_powerpc.deb

and haven't switched the other packages (xserver-xorg-input) to previous versions
(the installed packages depend on xorg-core 1.5.1 ubuntu3).

Might that conflict?

I'm trying to install the previous versions now.

Though I don't think I had keyboard or mouse before with the ubuntu3 version of xorg-core,
as the xserver would start up with a certain xorg.conf and I would get a black screen with no possibility to ctrl-alt-backspace or anything else.

Philippe Leroux (phler2) wrote :

i'll send my xorg log files as soon as i'm home
thx

Philippe Leroux (phler2) wrote :

xorg ubuntu2 log

Philippe Leroux (phler2) wrote :

xorg ubuntu4 log (sorry couldn't reinstall ubuntu3)

Sergey V. Udaltsov (sergey-udaltsov) wrote :

Same thing here. Power G5. Stopped working after one of the recent upgrades

soapdog (andre-andregarzia) wrote :

Bug confirmed here with iMac G4 flat panel. Argh!

Dan Munckton (munckfish)
Changed in ubuntu-ps3-port:
assignee: nobody → munckfish
importance: Undecided → Critical
milestone: none → ubuntu-8.10
status: New → Confirmed
Timo Aaltonen (tjaalton) wrote :

That's odd, none of your logfiles have much about evdev. That should be used for keyboards as well. 'lshal' output, please.

Dan Munckton (munckfish) wrote :

Right I'm doing a bit of leg work on this. Reverting back these packages to these revisions brings back graphics and keyboard input:

xserver-xorg-core_1.5.1-1ubuntu2
xserver-xorg-input-evdev_2.0.99+git20080912-0ubuntu2

I'll be back with more info if I can find it.

Jussi Saarinen (jms) wrote :

Same problem here in ps3. 'lshal' output attached

Timo Aaltonen (tjaalton) wrote :

OK, so the lshal output looks normal. It's probably the three xkb-related commits in 1.5.2 that broke it, but that needs to be verified before letting upstream know about it.

Dan Munckton (munckfish)
Changed in xorg-server:
assignee: nobody → munckfish
status: Confirmed → In Progress
Changed in ubuntu-ps3-port:
status: Confirmed → In Progress
Dan Munckton (munckfish) wrote :

Right, reverted these three commits

e88df87851232d6b6c8da5fff802b33f5275b050 xkb: squash canonical types into explicit ones on core reconstruction.
be3b3cb970d040f0db4bead018c338012547334f xkb: fix core keyboard map generation. #14373
3bf826f59013ec14fbcf19b85a03e2967a821661 xkb: fix use of uninitialized variable.

None of them fixes the problem; even with all three reverted there is no improvement.

The message in the log ... "No core keyboard" seems to originate from dix/devices.c on line 532 (post patching). The following commits between 1.5.1-1ubuntu2 and 1.5.1-1ubuntu3 affect this file:

b1b567e30678b6515ff3067ed7f1aae5f541c3ae Fix 137 backport, one line missing from XIGetDeviceProperty
a834ce903f4a2f54e1145336b331493c3b696501 Merge patches 138 and 139 into 137

These commits alter patch 137, which pulls in what seems to be called the XI API. I've tried reverting these patches, but the really odd thing is that once they have been removed the build fails due to the undefined macro constant XI_PROP_ENABLED. It seems the include/xserver-properties.h file is missing from the earlier version of these patches?

I need to quit this for today. Do these last 2 commits look like likely suspects? Is it worth continuing to try and revert them?

Giovanni Condello (nanomad) wrote :

I can confirm that this bug exists for x86 too (fresh installed hardy then upgraded via update-manager -c -d to ibex).
Can anyone provide a link to an old xserver-xorg-core version until a fix is released?

Herbert V. Riedel (hvr) wrote :

you can find old builds of xserver-xorg-core by following the links for your architecture at the bottom of

https://launchpad.net/ubuntu/+source/xorg-server/2:1.5.1-1ubuntu2

(for instance, this could lead to
https://launchpad.net/ubuntu/intrepid/i386/xserver-xorg-core/2:1.5.1-1ubuntu2
and finally
http://launchpadlibrarian.net/18263189/xserver-xorg-core_1.5.1-1ubuntu2_i386.deb
)

Dan Munckton (munckfish) wrote :

@Giovanni: I also have an upgraded x86 machine - it's not affected. could you attach your Xorg.0.log?

I'm slowly narrowing this down now. I believe the problem lies somewhere between config/hal.c and NewInputDeviceRequest() in xf86XInput.c.

Upon initialization in config/hal.c a list of input devices is obtained by querying HAL. The device properties are parsed and then finally NewInputDeviceRequest() is called. The driver to load for the device is specified by the "input.x11_driver" property in HAL and on my PS3 system and my x86 desktop this is set to "evdev" . As far as I can see (from the code and the healthy Xorg.0.log on my x86 machine) the first call to NewInputDeviceRequest() is where evdev should be loaded, but in our failing logs it's not getting that far.

I'll continue investigation when I get home (where my PS3 is).
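
The flow described above (read a device's HAL properties, pick the driver named by "input.x11_driver", then request the device) can be sketched in C as follows. This is a simplified illustration, not the code in config/hal.c: the `Property` struct and `lookup_driver()` helper are hypothetical, and the real server obtains these values from HAL rather than from an in-memory array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value pair standing in for one HAL device property. */
typedef struct {
    const char *key;
    const char *value;
} Property;

/*
 * Return the value of "input.x11_driver" from a device's property list,
 * or NULL when the property is absent (no driver can be chosen then).
 */
static const char *lookup_driver(const Property *props, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(props[i].key, "input.x11_driver") == 0)
            return props[i].value;
    }
    return NULL;
}
```

In this sketch a device whose list carries "input.x11_driver" set to "evdev" yields "evdev", which is the name the server would then try to load in its first device request.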

Giovanni Condello (nanomad) wrote :

Here is the x86 Xorg.0.log. It seems similar to the other ones posted (it complains about HAL not giving X a valid core pointer and core keyboard).

Dan Munckton (munckfish) wrote :

@Giovanni. Thanks for the log. This bit ...

(II) Cannot locate a core pointer device.
(II) Cannot locate a core keyboard device.
(II) The server relies on HAL to provide the list of input devices.
 If no devices become available, reconfigure HAL or disable AllowEmptyInput.

... is nothing to worry about. Although it sounds like a failure, it's actually just informational. It's saying that, because no mouse or keyboard was found in the config file (if there was one), the server will rely on HAL to find input devices later.

The signature of the bug we're dealing with here is a fatal error when no devices are actually found at this HAL configuration stage. Your log doesn't seem to exhibit that problem, so my instinct says you may have a slightly different issue. I recommend opening a new bug for yours now; if we later find it is the same, we can merge it in (by marking it as a duplicate).

Thanks.

Timo Aaltonen (tjaalton) wrote :

Giovanni, you probably don't have xserver-xorg-input-evdev installed.

Thanks Dan for working on this! I doubt it's the properties code that broke it, since the new stuff only removed a couple of functions.. but you seem to have found that out already :)

In , Michel-tungstengraphics (michel-tungstengraphics) wrote :

I'm using the evdev driver, so I get a different backtrace:

0: X [0x100ae894]
1: X(xf86SigHandler+0xc8) [0x100ae824]
2: [0x100344]
3: [0x28]
4: X(NewInputDeviceRequest+0x504) [0x100c7d24]
5: X [0x10089080]
6: X [0x10089560]
7: X [0x10088130]
8: X [0x10146f94]
9: X(WaitForSomething+0x7e8) [0x101478ec]
10: X(Dispatch+0xac) [0x1004c5e8]
11: X(main+0x5b4) [0x10028b8c]
12: /lib/libc.so.6 [0xf906704]
13: /lib/libc.so.6 [0xf9068c0]

(Unfortunately, I can't seem to get more information about the crash with gdb, it just hangs instead of giving me a prompt... Steve, are you seeing this as well? Anyway, I've narrowed it down using debugging output to CheckMotion() crashing because pSprite is NULL.)

But I think the key is really

[dix] cannot find pointer to pair with. This is a bug.

I've bisected this to xserver commit 1e24e7b9df3d02350c7ea18e9379e87fe4d00026 ('Xi: remove configure/query device property calls.'), but I can't see what in there could cause the symptoms we're seeing; also obviously it doesn't happen on x86... So it could be related to endianness or char being unsigned by default, or maybe just some kind of latent memory corruption issue that happens not to affect x86. (I think I've ruled out a compiler optimization bug by rebuilding everything affected by this change with -O0).

Peter, any suggestions for narrowing down why it's unable to find a pointer for pairing?
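
The "char being unsigned by default" hypothesis refers to a real portability difference: plain `char` is signed on x86 Linux but unsigned on PowerPC Linux. A minimal C sketch of how the same byte reads differently under each interpretation follows. The function names are made up for this illustration, and the code spells out `signed char` and `unsigned char` explicitly so that it behaves the same on any host.

```c
#include <string.h>

/* Interpret one raw byte as a signed char (what plain char means on x86). */
static int byte_as_signed(unsigned char raw)
{
    signed char s;
    memcpy(&s, &raw, 1);   /* copy the bit pattern unchanged */
    return s;
}

/* Interpret the same byte as unsigned (what plain char means on PowerPC). */
static int byte_as_unsigned(unsigned char raw)
{
    return raw;
}
```

Code that tests whether a `char` is negative, or compares it against 0xFF, can therefore take different branches on the two architectures.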

In , Daniel Stone (daniels) wrote :

On Tue, Oct 21, 2008 at 12:14:57AM -0700, <email address hidden> wrote:
> (Unfortunately, I can't seem to get more information about the crash with gdb,
> it just hangs instead of giving me a prompt... Steve, are you seeing this as
> well? Anyway, I've narrowed it down using debugging output to CheckMotion()
> crashing because pSprite is NULL.)
>
> But I think the key is really
>
> [dix] cannot find pointer to pair with. This is a bug.
>
> I've bisected this to xserver commit 1e24e7b9df3d02350c7ea18e9379e87fe4d00026
> ('Xi: remove configure/query device property calls.'), but I can't see what in
> there could cause the symptoms we're seeing; also obviously it doesn't happen
> on x86... So it could be related to endianness or char being unsigned by
> default, or maybe just some kind of latent memory corruption issue that happens
> not to affect x86. (I think I've ruled out a compiler optimization bug by
> rebuilding everything affected by this change with -O0).
>
> Peter, any suggestions for narrowing down why it's unable to find a pointer for
> pairing?

Stupid question, but you have rebuilt -evdev against the exact same
headers, right?

In , Michel-tungstengraphics (michel-tungstengraphics) wrote :

(In reply to comment #2)
>
> Stupid question, but you have rebuilt -evdev against the exact same
> headers, right?

Yes, the behaviour is the same for me with current evdev Git built against current xserver Git.

Timo Aaltonen (tjaalton) wrote :

Dan, the problem might be in DeviceSetProperty() in dix/devices.c (patch 137). This is a new function introduced in the current properties code, so the breakage could well be there.

Dan Munckton (munckfish) wrote : Re: [Bug 281610] Re: [regression, intrepid] Xorg servers broken "No core keyboard" and "failed to initialize core devices"

Timo

Ok noted. Sorry couldn't get at my PS3 last night so no progress since
yesterday :(

In , Peter Hutterer (peter-hutterer) wrote :

The error message is definitely a hint. What should happen is that the VCP
initialises, then the VCK, and the VCK should get paired with the VCP. If that
doesn't happen, any operation on the VCK may just segfault due to a
nonexistent sprite.

Not sure how you got there though; some memory corruption somewhere, presumably.
Anything to narrow it down would be appreciated. Is valgrind complaining about anything?
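
The ordering requirement described above can be illustrated with a simplified C sketch. The `Sprite` and `Device` types and both functions below are hypothetical and far simpler than the server's real structures; the point is only that a keyboard which was never paired has no sprite to use, so code that assumes one will dereference NULL.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified records for illustration only. */
typedef struct Sprite { int x, y; } Sprite;

typedef struct Device {
    Sprite *sprite;   /* set up by the pointer, shared once paired */
} Device;

/* Pair a keyboard with a pointer so the keyboard can reach its sprite. */
static void pair_devices(Device *keyboard, const Device *pointer)
{
    keyboard->sprite = pointer->sprite;
}

/* Returns 0 on success, -1 if the keyboard was never paired. */
static int keyboard_sprite_x(const Device *keyboard, int *x_out)
{
    if (keyboard->sprite == NULL)
        return -1;    /* unguarded code would dereference NULL here */
    *x_out = keyboard->sprite->x;
    return 0;
}
```

In this sketch, calling `keyboard_sprite_x()` before `pair_devices()` reports failure instead of crashing, whereas unguarded code in the same position would segfault.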

In , Michel-tungstengraphics (michel-tungstengraphics) wrote :

Created an attachment (id=19814)
Possible fix: look at all bytes of dev->enabled

Okay, I've been able to trace this with gdb (I guess I was inadvertently using the kernel DRM and thus hitting the xkbcomp fork hilarity...), and it indeed looks like a classic endianness bug:

dev->enabled is a Bool, which is typedefed to int. However, the XI_PROP_ENABLED related code in dix/devices.c only looks at the first byte of it (which happens to work with little endian). This patch fixes it for me, but I'm not sure how it fits into the bigger picture; another possibility would be to use a CARD8 local variable instead of dev->enabled directly in the XIChangeDeviceProperty() callers.
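
To make the diagnosis above concrete, here is a small C sketch of why reading only the first byte of a 4-byte integer works on little endian and fails on big endian. It lays the bytes out explicitly in both orders so that it gives the same result on any host. `store_le32`, `store_be32`, and `first_byte` are hypothetical helpers for this illustration and do not appear in the X server.

```c
#include <stdint.h>

/* Write v into buf the way a little-endian CPU stores a 32-bit int. */
static void store_le32(unsigned char buf[4], uint32_t v)
{
    buf[0] = (unsigned char)(v & 0xFF);          /* least significant first */
    buf[1] = (unsigned char)((v >> 8) & 0xFF);
    buf[2] = (unsigned char)((v >> 16) & 0xFF);
    buf[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Write v into buf the way a big-endian CPU stores a 32-bit int. */
static void store_be32(unsigned char buf[4], uint32_t v)
{
    buf[0] = (unsigned char)((v >> 24) & 0xFF);  /* most significant first */
    buf[1] = (unsigned char)((v >> 16) & 0xFF);
    buf[2] = (unsigned char)((v >> 8) & 0xFF);
    buf[3] = (unsigned char)(v & 0xFF);
}

/* What a one-byte read through the int's address yields: the lowest byte. */
static unsigned char first_byte(const unsigned char buf[4])
{
    return buf[0];
}
```

With the value TRUE (1), the little-endian layout yields 1 from the first byte while the big-endian layout yields 0, so an enabled device looks disabled to code that reads a single byte.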

Timo Aaltonen (tjaalton) wrote :

10:15 < whot> tjaalton: i'd recompile the evdev driver without properties and then try that
10:16 < whot> tjaalton: ther's a good chance that a length field doesn't get swapped and corrupts memory
10:17 < whot> tjaalton: that should be easy to find by just running xinput --set-int-prop with a few varieties of format

So, try installing the evdev driver without properties support. To do that you have to disable patch 100 which removes the API checks.

This might be related to upstream bug http://bugs.freedesktop.org/show_bug.cgi?id=18111

Dan Munckton (munckfish) wrote :
Dan Munckton (munckfish) wrote :
In , stevewin (stevewin) wrote :

Applying the patch fixed the error for me - with devices configured using kbd or evdev.

Thanks so much for the prompt attention.

Dan Munckton (munckfish) wrote :

Progress ...

See the above 2 logs. I added plenty of tracing output and recompiled.

It seems things probably go wrong somewhere in dix/devices.c, in or below EnableDevice(). In the healthy log the second call to EnableDevice() seems to trigger the connect_hook() callback in config/hal.c. In the broken PS3 trace this never happens.

I investigated DeviceSetProperty() in dix/devices.c but it seemed to be behaving itself - returned Success on each call.

I will try recompiling the evdev driver without properties support, but looking at the logs I have it doesn't look like evdev is even loaded as that happens after the config/hal.c has done its work.

Continuing ...

Timo Aaltonen (tjaalton) wrote :

What did you do to get the healthy trace, looks like the server version was the same?

Loïc Minier (lool) wrote : Re: [Bug 281610] Re: [regression, intrepid] Xorg servers broken "No core keyboard" and "failed to initialize core devices"

On Wed, Oct 22, 2008, Timo Aaltonen wrote:
> What did you do to get the healthy trace, looks like the server version
> was the same?

 AIUI healthy one is on x86.

--
Loïc Minier

Timo Aaltonen (tjaalton) wrote :

heh, silly me..

Dan Munckton (munckfish) wrote :

Healthy is from my x86.

Timo Aaltonen (tjaalton) wrote :

Dan, try to #if 0 the three property calls in dix/devices.c:AddInputDevice().
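
Disabling calls this way means wrapping them in `#if 0` ... `#endif` so that the preprocessor drops them before compilation. A minimal C sketch of the technique follows; `set_property()` and `add_device()` are hypothetical placeholders, not the actual calls in AddInputDevice().

```c
/* Hypothetical stand-in for a property-setting call; counts invocations. */
static int calls_made = 0;

static void set_property(void)
{
    calls_made++;
}

/* Hypothetical stand-in for device setup with the property calls disabled. */
static int add_device(void)
{
#if 0
    /* Everything between #if 0 and #endif is removed before compilation. */
    set_property();
    set_property();
    set_property();
#endif
    return calls_made;   /* number of property calls executed so far */
}
```

Changing the `0` to `1` re-enables the block, which makes this a quick way to test whether a group of calls is responsible for a failure.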

Timo Aaltonen (tjaalton) wrote :

The problem has likely been found:

17:30 < jcristau> tjaalton: dev->enabled is Bool, Bool is typedefed to int, so
                  *(CARD8*)(&dev->enabled) doesn't look at the right byte on big endian, aiui

And the upstream bug has a proposed fix for it, please test!
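
One way to avoid the byte mismatch described above, along the lines of the CARD8 local variable suggested earlier in this thread, is to copy the flag into a one-byte variable before handing its address to code that expects 8-bit data. The following C sketch is an illustration under stated assumptions, not the patch that was actually applied: `Bool` and `CARD8` are redefined locally, and `publish_byte()` is a hypothetical consumer standing in for a caller that reads one byte.

```c
typedef int Bool;              /* assumption: mirrors "Bool is typedefed to int" */
typedef unsigned char CARD8;   /* assumption: an 8-bit unsigned type */

/* Hypothetical consumer that reads exactly one byte of property data. */
static int publish_byte(const void *data)
{
    return *(const CARD8 *)data;
}

/* Narrow the value first, then pass the address of the narrow copy. */
static int publish_enabled(Bool enabled)
{
    CARD8 value = enabled ? 1 : 0;
    return publish_byte(&value);
}
```

Because the conversion happens on the value rather than on its address, the result no longer depends on where the significant byte sits in memory.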

Dan Munckton (munckfish) wrote :

Fantastic. I'll test the patch as soon as I can.

In , Peter Hutterer (peter-hutterer) wrote :

Fix pushed as 98f01c2abe4771d76febf8fe70111b2bddfab776.

Dan Munckton (munckfish) wrote :

Woooooohooooooo!

Fixed!

Updated the patch so it applies cleanly, and slotted into series at position 138 so it's near to the other Xi properties patches it relates to. Hope that's ok.

Let me know if there's anything else I can do.

Cheers

Dan

Dan Munckton (munckfish)
Changed in ubuntu-ps3-port:
status: In Progress → Fix Committed
Changed in xorg-server:
status: In Progress → Fix Committed
Bryce Harrington (bryce) wrote :

Uploading to ubuntu (via ftp to upload.ubuntu.com):
  xorg-server_1.5.2-2ubuntu3.dsc: done.
  xorg-server_1.5.2-2ubuntu3.diff.gz: done.
  xorg-server_1.5.2-2ubuntu3_source.changes: done.
Successfully uploaded packages.

Changed in xorg-server:
status: Unknown → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package xorg-server - 2:1.5.2-2ubuntu3

---------------
xorg-server (2:1.5.2-2ubuntu3) intrepid; urgency=low

  * 138_look_at_all_bytes_of_dev_enabled.diff: dev->enabled has type
    Bool, which is typedef'd to int, but is used in comparisons with
    CARD8 data, which gives incorrect logic on big endian systems,
    causing failure to initialize keyboard and mouse.
    (LP: #281610)

 -- Bryce Harrington <email address hidden> Thu, 23 Oct 2008 07:31:47 -0700

Changed in xorg-server:
status: Fix Committed → Fix Released
Herbert V. Riedel (hvr) wrote :

just confirming, that with the latest/current packages installed, the bug is gone here on apple-ppc hardware :-)

Henry Wertz (hwertz) wrote :

"Me too." 1.25 GHz G5; the 1.5.2-2ubuntu3 xserver-xorg-core and related xserver-xorg-....ubuntu3 packages fixed things up. Nice work!

Dan Munckton (munckfish)
Changed in ubuntu-ps3-port:
status: Fix Committed → Fix Released
Sergey V. Udaltsov (sergey-udaltsov) wrote :

"Me Too". Power G5. Thanks lads!

Changed in xorg-server:
importance: Unknown → High
Changed in xorg-server:
importance: High → Unknown
Changed in xorg-server:
importance: Unknown → High