Xorg does not detect displays in rootless mode on nvidia proprietary drivers (GNOME)

Bug #1672033 reported by Igor
This bug affects 2 people
Affects                  Status      Importance   Assigned to   Milestone
NVIDIA Drivers Ubuntu    New         Undecided    Unassigned
mutter (Ubuntu)          Confirmed   Undecided    Unassigned
xorg (Ubuntu)            Confirmed   Undecided    Unassigned

Bug Description

There are several bug reports (LP: #1559576, LP: #1632322 and LP: #1666664) where GDM does not start on the proprietary nvidia drivers. As it turned out, the reason was that Xorg started in rootless mode and apparently did not initialize everything properly, which caused gnome-shell/libmutter to crash.

Installing xserver-xorg-legacy partially fixed those issues.

Enabling modesetting for the nvidia driver, however, still causes the problem.

Here are some excerpts from the log:
Xorg startup:
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (--) Log file renamed from "/var/lib/gdm3/.local/share/xorg/Xorg.pid-2027.log" to "/var/lib/gdm3/.local/share/xorg/Xorg.0.log"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: X.Org X Server 1.18.4
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: Release Date: 2016-07-19
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: X Protocol Version 11, Revision 0
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: Build Operating System: Linux 4.4.0-53-generic x86_64 Ubuntu

glx loaded:
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) LoadModule: "glx"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Loading /usr/lib/x86_64-linux-gnu/xorg/extra-modules/libglx.so
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Module glx: vendor="NVIDIA Corporation"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: compiled for 4.0.2, module version = 1.0.0
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: Module class: X.Org Server Extension
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) NVIDIA GLX Module 375.39 Tue Jan 31 19:37:12 PST 2017

nvidia loaded:
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) LoadModule: "nvidia"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Loading /usr/lib/x86_64-linux-gnu/xorg/extra-modules/nvidia_drv.so
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Module nvidia: vendor="NVIDIA Corporation"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: compiled for 4.0.2, module version = 1.0.0
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: Module class: X.Org Video Driver

modesetting loaded:
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) LoadModule: "modesetting"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Loading /usr/lib/xorg/modules/drivers/modesetting_drv.so
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Module modesetting: vendor="X.Org Foundation"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: compiled for 1.18.4, module version = 1.18.4
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: Module class: X.Org Video Driver
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: ABI class: X.Org Video Driver, version 20.0

gnome-shell fails to run:
Mär 11 00:43:20 arvlin kernel: gnome-shell[2067]: segfault at 28 ip 00007fedba8da7c4 sp 00007ffd2fb5f5a0 error 4 in libmutter-0.so.0.0.0[7fedba893000+12f000]

xorg stops:
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) UnloadModule: "libinput"
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) systemd-logind: releasing fd for 13:66
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) UnloadModule: "libinput"
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) systemd-logind: releasing fd for 13:67
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) UnloadModule: "libinput"
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) systemd-logind: releasing fd for 13:64
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) UnloadModule: "libinput"
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) systemd-logind: releasing fd for 13:65
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) NVIDIA(GPU-0): Deleting GPU-0
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Server terminated successfully (0). Closing log file.
Mär 11 00:43:21 arvlin gdm-launch-environment][2009]: pam_unix(gdm-launch-environment:session): session closed for user gdm

Jeremy Bícha (jbicha)
summary: XOrg does not work in rootless mode on nvidia proprietary drivers
+ (GNOME)
Igor (invy)
summary: - XOrg does not work in rootless mode on nvidia proprietary drivers
+ Xorg does not work in rootless mode on nvidia proprietary drivers
(GNOME)
Revision history for this message
Tim Lunn (darkxst) wrote : Re: Xorg does not work in rootless mode on nvidia proprietary drivers (GNOME)

I don't currently have access to my nvidia hardware and, to be honest, I'm not even sure whether rootless Xorg is supported yet on the nvidia drivers. However, I would have assumed that if wayland is mostly working, then KMS support would be in good enough shape for this to work.

Without xserver-xorg-legacy gdm should start in wayland mode, so I'm not sure how that is related to rootless Xorg.

Igor, can you get a backtrace of the mutter crash? As far as I recall, most of the rootless Xorg stuff sits at a lower level than mutter and is handled by logind and Xorg. systemd-logind passes an fd for your drm device over to Xorg; are there any log messages related to that?

Igor (invy) wrote :

Tim,

Indeed, if KMS is enabled, gdm starts perfectly in wayland mode and you can use the gnome-shell wayland session (glx is broken, however). But if you want to use the gnome-shell Xorg session, gnome-shell will crash, because gdm apparently starts Xorg in rootless mode (I presume with the uid of the user being logged in, which makes sense).

This is exactly the problem: enabling KMS by default would break the gnome-shell Xorg session.

Regarding the logs: could you give me some hints about what I should look for? Nothing looks suspicious to me.

It's pretty clear that Xorg is lacking permissions for some resources; the question is which ones exactly.

Igor (invy) wrote :

Here is another observation (my old workaround):

- KMS is enabled and gdm starts in wayland mode. Trying to start gnome-shell in Xorg mode fails (gnome-shell/mutter crash).
- Switch to a tty (ctrl+alt+f2), log in and run:
  $ sudo lightdm --test-mode
  lightdm starts and the nvidia logo appears for a moment.
- Switch back to the tty once again and kill lightdm (ctrl+c).
- Switch back to GDM (ctrl+alt+f1).
- Log in to the gnome-shell Xorg session: everything works fine at this point.

The question is: what does lightdm do that gdm doesn't?

After running lightdm, these new messages appear in the logs during Xorg startup:
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:3:0:0
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-0 (boot)
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-1
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-2
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-3
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-4
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-5
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-6
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-7

/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DELL U2412M (DFP-0): connected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DELL U2412M (DFP-0): Internal TMDS
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DELL U2412M (DFP-0): 330.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-1: disconnected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-1: Internal TMDS
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-1: 165.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-2: disconnected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-2: Internal DisplayPort
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-2: 1440.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-3: disconnected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-3: Internal TMDS
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-3: 165.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-4: disconnected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-4: Internal DisplayPort
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-4: 1440.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-5: disconnected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-5: Internal TMDS
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-5: 165.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-6: disconnected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-6: Internal DisplayPort
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-6: 1440.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-7: disconnected
/usr/lib/gd...
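
One way to spot such new messages is to strip the journal prefix (timestamp, host name, PID) from both logs and list the messages that appear only in the working one. The following is a minimal Python sketch, assuming journal lines shaped like the excerpts in this report. The names strip_prefix and new_messages are made up for illustration, and the second timestamp and PID in the sample data are invented.

```python
import re

# Matches the journal prefix up to and including the gdm-x-session tag, e.g.
# "Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: "
PREFIX = re.compile(r"^.*?gdm-x-session(\[\d+\])?:\s*")


def strip_prefix(line):
    """Drop timestamp, host name and PID so lines from two runs compare equal."""
    return PREFIX.sub("", line.rstrip("\n"))


def new_messages(failing_log, working_log):
    """Return messages present in working_log but absent from failing_log."""
    seen = {strip_prefix(line) for line in failing_log}
    result = []
    for line in working_log:
        msg = strip_prefix(line)
        if msg and msg not in seen:
            result.append(msg)
            seen.add(msg)  # report each new message only once
    return result


# Sample data: the first line is from this report, the timestamp and PID of
# the second run are invented for the example.
failing = [
    'Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) LoadModule: "nvidia"',
]
working = [
    'Mär 11 00:50:02 arvlin /usr/lib/gdm3/gdm-x-session[3100]: (II) LoadModule: "nvidia"',
    "/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-0 (boot)",
]
print(new_messages(failing, working))
```

Comparing stripped messages rather than raw lines keeps differing timestamps and PIDs from showing up as noise.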

Igor (invy) wrote :

A little debugging and investigation confirms what the log above shows:

libmutter cannot get a monitor and crashes here:
https://github.com/GNOME/mutter/blob/master/src/backends/meta-backend.c#L128-L133

  primary =
    meta_monitor_manager_get_primary_logical_monitor (monitor_manager);

  meta_backend_warp_pointer (backend,
                             primary->rect.x + primary->rect.width / 2,
                             primary->rect.y + primary->rect.height / 2);

because 'primary' is not a valid pointer.

So:
1. This should be reported to the GNOME/mutter developers, so that they check their pointers and terminate cleanly with meta_fatal("failed to get primary monitor"); or something like that.
2. We have to understand why libmutter fails to get the primary logical monitor. Does Xorg need some permissions?
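
The guard suggested in point 1 could look roughly like the following. This is a Python model of the logic only, not mutter code: the real fix would be a NULL check in C, and Rect, LogicalMonitor and primary_center are hypothetical stand-ins for the mutter types and the warp-pointer computation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Rect:
    x: int
    y: int
    width: int
    height: int


@dataclass
class LogicalMonitor:
    rect: Rect


def primary_center(primary: Optional[LogicalMonitor]) -> Tuple[int, int]:
    """Return the center of the primary monitor, failing loudly if there is none."""
    if primary is None:
        # Counterpart of checking the pointer before dereferencing it.
        raise RuntimeError("failed to get primary monitor")
    rect = primary.rect
    return (rect.x + rect.width // 2, rect.y + rect.height // 2)


# A 1920x1200 monitor at the origin, as in the xrandr output of this report.
print(primary_center(LogicalMonitor(Rect(0, 0, 1920, 1200))))
```

With the guard in place, a missing primary monitor produces a readable error instead of a segfault, which would at least make reports like this one easier to diagnose.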

Igor (invy) wrote :
Igor (invy)
summary: - Xorg does not work in rootless mode on nvidia proprietary drivers
- (GNOME)
+ Xorg does not detect displays in rootless mode on nvidia proprietary
+ drivers (GNOME)
Tim Lunn (darkxst) wrote :

What hardware are you running? I recall some hybrid laptops fail to advertise a "primary" display. What is the output of "xrandr -q" in a working rootless X session?

Tim Lunn (darkxst) wrote :

Also, does lightdm work with KMS enabled for logging into a gnome-shell session?

Igor (invy) wrote :

I don't have hybrid graphics. It's a normal desktop.

Yes, lightdm works with KMS enabled and it is possible to log in to the gnome-shell Xorg session.

But lightdm is started as root:
root 10867 0.0 0.0 361724 6620 ? SLsl 15:41 0:00 /usr/sbin/lightdm

Igor (invy) wrote :

xrandr output in rootless session:

igor:~% ps aux | grep Xorg
igor 3779 3.2 0.0 199348 51332 tty3 S+ 15:49 0:00 /usr/lib/xorg/Xorg vt3 -displayfd 3 -auth /run/user/1000/gdm/Xauthority -background none -noreset -keeptty -verbose 3

igor:~% xrandr -q
Screen 0: minimum 8 x 8, current 1920 x 1200, maximum 32767 x 32767
DVI-D-0 connected primary 1920x1200+0+0 (normal left inverted right x axis y axis) 518mm x 324mm
   1920x1200 59.95*+
   1920x1080 60.00
   1680x1050 59.95
   1600x1200 60.00
   1280x1024 60.02
   1280x960 60.00
   1024x768 60.00
   800x600 60.32
   640x480 59.94
HDMI-0 disconnected (normal left inverted right x axis y axis)
DP-0 disconnected (normal left inverted right x axis y axis)
DP-1 disconnected (normal left inverted right x axis y axis)
DP-2 disconnected (normal left inverted right x axis y axis)
DP-3 disconnected (normal left inverted right x axis y axis)
DP-4 disconnected (normal left inverted right x axis y axis)
DP-5 disconnected (normal left inverted right x axis y axis)

lightdm works because it starts Xorg as root for itself, which initializes the monitors; as a consequence, rootless sessions started afterwards work fine.
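
To tell quickly whether a given X server is rootless, the owner of the Xorg process can be read from `ps aux` output. A minimal Python sketch, assuming the standard `ps aux` column layout; the function name xorg_owners is made up for illustration, and the sample lines are the ones posted in this report.

```python
def xorg_owners(ps_output):
    """Map PID -> user for every Xorg server process in `ps aux` output."""
    owners = {}
    for line in ps_output.splitlines():
        # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        user, pid, command = fields[0], fields[1], fields[10]
        executable = command.split()[0]
        # Match the server binary itself, not e.g. "grep Xorg".
        if executable == "Xorg" or executable.endswith("/Xorg"):
            owners[int(pid)] = user
    return owners


sample = """\
root 10867 0.0 0.0 361724 6620 ? SLsl 15:41 0:00 /usr/sbin/lightdm
igor 3779 3.2 0.0 199348 51332 tty3 S+ 15:49 0:00 /usr/lib/xorg/Xorg vt3 -displayfd 3 -auth /run/user/1000/gdm/Xauthority -background none -noreset -keeptty -verbose 3
"""
for pid, user in xorg_owners(sample).items():
    mode = "root" if user == "root" else "rootless"
    print(pid, user, mode)
```

On a live system the same function could be fed the text returned by `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`.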

Tim Lunn (darkxst) wrote :

Igor, can you post the full logs from Xorg, and also any systemd-logind logs from journalctl?

Tim Lunn (darkxst) wrote :

(before your workaround)

Igor (invy) wrote :

Here you go:

Igor (invy) wrote :
Igor (invy) wrote :
Tim Lunn (darkxst) wrote :

The following line is a bit suspicious:
>Mär 21 20:10:20 arvlin /usr/lib/gdm3/gdm-x-session[3618]: (II) systemd-logind: releasing fd for 226:0

AFAIR that would mean the Xorg/nvidia driver no longer has access to the GPU once NVIDIA tries to load/detect monitors. Strangely, however, it also shows up in your other "working" log.

It might be worth filing an upstream bug against gdm.
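
For reference, the numbers in these "releasing fd" messages are device major:minor pairs. In the Linux device list, character major 13 is the input core and major 226 is DRM, so 226:0 is the first DRM card while the 13:6x entries in the shutdown log are input event devices. A minimal Python sketch that pulls these pairs out of a log; the name released_devices and the MAJORS table (limited to the two majors seen in this report) are illustrative assumptions.

```python
import re

RELEASE = re.compile(r"systemd-logind: releasing fd for (\d+):(\d+)")

# Character-device majors from the Linux device list; only the two that
# occur in this report are included.
MAJORS = {13: "input", 226: "drm"}


def released_devices(log_lines):
    """Return (major, minor, kind) for every 'releasing fd' message."""
    found = []
    for line in log_lines:
        match = RELEASE.search(line)
        if match:
            major, minor = int(match.group(1)), int(match.group(2))
            found.append((major, minor, MAJORS.get(major, "unknown")))
    return found


log = [
    "Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) systemd-logind: releasing fd for 13:66",
    "Mär 21 20:10:20 arvlin /usr/lib/gdm3/gdm-x-session[3618]: (II) systemd-logind: releasing fd for 226:0",
]
for major, minor, kind in released_devices(log):
    print(f"{major}:{minor} -> {kind}")
```

Checking where the drm release appears relative to the NVIDIA display detection lines in both the failing and the working log would show whether the ordering actually differs.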

Igor (invy) wrote :

Hm, this makes sense.

Maybe after the Xorg/nvidia driver detects monitors as root (calling "sudo startx" from the console has the same effect as running lightdm) they are cached somewhere, or maybe the Xorg/nvidia driver changes permissions, so that rootless Xorg can then detect monitors in subsequent sessions.

Jeremy Bícha (jbicha) wrote :
Igor (invy) wrote :

Jeremy, it's not related.

Also, this is how it is supposed to work. If gdm thinks it cannot start a Wayland session, it won't display the Wayland option.

Why does gdm think the Wayland session is not available, you ask? I presume you have modesetting disabled, because installing or updating the nvidia driver overwrites:

/etc/modprobe.d/nvidia-graphics-drivers.conf -> /etc/alternatives/x86_64-linux-gnu_nvidia_modconf

and puts by default:

options nvidia_381_drm modeset=0

which will prevent wayland from working.
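
To check which modeset value a modprobe.d file currently sets, its `options` lines can be parsed directly. A minimal Python sketch, assuming the usual `options <module> <param>=<value>` syntax; the function name modeset_options is made up for illustration, and reading the actual file is left to the caller.

```python
def modeset_options(conf_text):
    """Map module name -> modeset value for every `options ... modeset=N` line."""
    result = {}
    for raw in conf_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated words.
        words = raw.split("#", 1)[0].split()
        if len(words) < 3 or words[0] != "options":
            continue
        module = words[1]
        for param in words[2:]:
            key, _, value = param.partition("=")
            if key == "modeset":
                result[module] = value
    return result


# The line quoted above, as installed by the driver package.
sample = "options nvidia_381_drm modeset=0\n"
for module, value in modeset_options(sample).items():
    state = "enabled" if value == "1" else "disabled"
    print(f"{module}: modesetting {state}")
```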

Tim Lunn (darkxst) wrote :

[ 163.975845] Call Trace:
[ 163.975854] dump_stack+0x63/0x81
[ 163.975858] __warn+0xcb/0xf0
[ 163.975860] warn_slowpath_null+0x1d/0x20
[ 163.975866] drm_atomic_helper_commit_hw_done+0xab/0xb0 [drm_kms_helper]
[ 163.975869] nvidia_drm_atomic_helper_commit_tail+0x128/0x1d0 [nvidia_drm]
[ 163.975875] commit_tail+0x3f/0x80 [drm_kms_helper]
[ 163.975879] commit_work+0x12/0x20 [drm_kms_helper]
[ 163.975881] process_one_work+0x1fc/0x4b0
[ 163.975883] worker_thread+0x4b/0x500
[ 163.975886] kthread+0x101/0x140
[ 163.975888] ? process_one_work+0x4b0/0x4b0
[ 163.975890] ? kthread_create_on_node+0x60/0x60
[ 163.975893] ret_from_fork+0x2c/0x40
[ 163.975895] ---[ end trace 341cfa538776d33d ]---

Jeremy Bícha (jbicha)
tags: added: gnome-1710 wayland
Jeremy Bícha (jbicha)
tags: added: gnome-17.10
removed: gnome-1710
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mutter (Ubuntu):
status: New → Confirmed
Changed in xorg (Ubuntu):
status: New → Confirmed