Gnome session will not start when no audio device present

Bug #8678 reported by Rob Wills
This bug affects 1 person
Affects: control-center (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Sebastien Bacher

Bug Description

Audio controller disabled in BIOS.

Installed Ubuntu Warty without errors

Any attempt to log into GDM results in empty screen with Ubuntu wallpaper

Logs show errors relating to audio

Enabled audio in BIOS

Login worked.

Matt Zimmerman (mdz) wrote :

I can reproduce this easily by unloading the sound drivers and logging in. It
seems to be gnome-settings-daemon which gets hung up; if I kill it (repeatedly),
the session eventually starts up

Matt Zimmerman (mdz) wrote :

Increasing severity; this makes the system unusable for anyone without a sound
device

Matt Zimmerman (mdz) wrote :

*** Bug 8594 has been marked as a duplicate of this bug. ***

Jason Bradley Nance (aitrus) wrote :

I, too, have this same problem (with no sound). Killing off the
gnome-settings-daemon gets me to a desktop (with no settings) as well.

Matt Zimmerman (mdz) wrote :

*** Bug 8697 has been marked as a duplicate of this bug. ***

Matt Zimmerman (mdz) wrote :

Created an attachment (id=287)
strace -f gnome-session

Sivan Greenberg (sivan) wrote :

(In reply to comment #1)
> I can reproduce this easily by unloading the sound drivers and logging in. It
> seems to be gnome-settings-daemon which gets hung up; if I kill it (repeatedly),
> the session eventually starts up

I have had problems reproducing this. Is modprobe -r <sound driver module> enough? I
also manually removed snd_pcm_oss, which was still loaded after I removed the
emu10k1 module.

btw what sound card have you tested with?

Jason Bradley Nance (aitrus) wrote :

(In reply to comment #7)
> I have had problems reproducing this. Is modprobe -r <sound driver module> enough? I
> also manually removed snd_pcm_oss, which was still loaded after I removed the
> emu10k1 module.
>
> btw what sound card have you tested with?

I think you are misreading the reports. This bug occurs when there is no sound
card present (hence no kernel sound modules loaded).

Matt Zimmerman (mdz) wrote :

I tested with snd-intel8x0; I can easily reproduce by unloading the module and
then logging in from gdm.

Dean Jansen (list) wrote :

(In reply to comment #8)
> I think you are misreading the reports. This bug occurs when there is no sound
> card present (hence no kernel sound modules loaded).

(In reply to comment #9)
> I tested with snd-intel8x0; I can easily reproduce by unloading the module and
> then logging in from gdm.

I get this when I have a sound card installed (audigy gamer) AND all the sound modules loaded (I'm pretty
sure they are all loaded, there are a LOT of modules when I 'lsmod | grep snd' including the emu10k1
module for the audigy).

Matt Zimmerman (mdz) wrote :

It is possible that there is more than one bug here... Dean, can you try
unloading _all_ snd* modules and trying to reproduce the problem?

Matt Zimmerman (mdz) wrote :

This should serve as a temporary workaround until we track down and fix this bug:

gconftool-2 -s -t bool /schemas/desktop/gnome/sound/event_sounds false

Matt Zimmerman (mdz) wrote :

mdz@potpal:~ $ ps ux | grep settings
mdz 7885 0.0 3.2 17692 8384 ? S 02:51 0:00 /usr/lib/control-center/gnome-settings-daemon --oaf-activate-iid=OAFIID:GNOME_SettingsDaemon --oaf-ior-fd=22
mdz 7494 0.0 3.2 17692 8384 ? S 02:51 0:00 /usr/lib/control-center/gnome-settings-daemon --oaf-activate-iid=OAFIID:GNOME_SettingsDaemon --oaf-ior-fd=22
mdz 8315 0.0 0.1 1532 452 pts/0 S+ 03:01 0:00 grep settings
mdz@potpal:~ $ strace -p 7494
Process 7494 attached - interrupt to quit
waitpid(7885, <unfinished ...>
Process 7494 detached
mdz@potpal:~ $ strace -p 7885
Process 7885 attached - interrupt to quit
write(33, "\21\0\0\0\316\36\0\0\0\2\0\0", 12 <unfinished ...>
Process 7885 detached
mdz@potpal:~ $ ls -l /proc/7885/fd/33
l-wx------ 1 mdz mdz 64 Oct 2 02:58 /proc/7885/fd/33 -> pipe:[22088]
mdz@potpal:~ $ ls -l /proc/*/fd/* | grep 22088
ls: /proc/8319/fd/255: No such file or directory
ls: /proc/8319/fd/3: No such file or directory
ls: /proc/self/fd/255: No such file or directory
ls: /proc/self/fd/3: No such file or directory
lr-x------ 1 mdz mdz 64 Oct 2 02:58 /proc/7494/fd/31 -> pipe:[22088]
l-wx------ 1 mdz mdz 64 Oct 2 02:58 /proc/7494/fd/33 -> pipe:[22088]
lr-x------ 1 mdz mdz 64 Oct 2 02:58 /proc/7885/fd/31 -> pipe:[22088]
l-wx------ 1 mdz mdz 64 Oct 2 02:58 /proc/7885/fd/33 -> pipe:[22088]
mdz@potpal:~ $

In other words:

- There are two gnome-settings-daemon processes, one of which has been forked by
the other.
- There is a pipe open between the two processes
- It appears that both processes have both ends of the pipe open, which is
wrong, but perhaps unrelated
- The child is blocked writing to the pipe
- The parent is blocked in waitpid()
- Therefore the two processes are deadlocked

I have no idea what any of this has to do with sound, but I can reproduce the
problem with:

1. Log out of GNOME
2. lsmod | grep '^snd' | awk '{ print $1 }' | xargs -n1 sudo rmmod
3. Log into GNOME

I get the splash screen with 0 icons on it, and this odd pair of wedged
gnome-settings-daemon processes.
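
The `ls -l /proc/*/fd/* | grep 22088` lookup used above to see which processes hold a given pipe can also be scripted. The following is only a sketch, assuming a Linux-style /proc where each fd entry is a symlink of the form `pipe:[inode]`; the function name `find_pipe_holders` and the `proc_root` parameter are made up for illustration and are not part of any tool mentioned in this report.

```python
import os
import re
from collections import defaultdict

# Linux shows pipe descriptors in /proc/<pid>/fd as links like "pipe:[22088]".
PIPE_LINK = re.compile(r"^pipe:\[(\d+)\]$")


def find_pipe_holders(inode, proc_root="/proc"):
    """Return {pid: [fd, ...]} for every process holding the pipe with this inode.

    Hypothetical helper; proc_root is a parameter only so the scan can be
    pointed at a different directory tree.
    """
    holders = defaultdict(list)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission to inspect it
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed while we were scanning
            match = PIPE_LINK.match(target)
            if match and int(match.group(1)) == inode:
                holders[int(entry)].append(int(fd))
    return {pid: sorted(fds) for pid, fds in holders.items()}
```

Calling `find_pipe_holders(22088)` on the machine in the transcript would be expected to report PIDs 7494 and 7885, each with descriptors 31 and 33, but it cannot tell which descriptor is the read end; the `lr-x`/`l-wx` permission bits in the `ls -l` output carry that information.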

Matt Zimmerman (mdz) wrote :

(In reply to comment #13)
> - The parent is blocked in waitpid()
(that is, waiting for the child; otherwise this would not necessarily be a deadlock)

Sebastien Bacher (seb128) wrote :

(In reply to comment #13)
> 1. Log out of GNOME
> 2. lsmod | grep '^snd' | awk '{ print $1 }' | xargs -n1 sudo rmmod
> 3. Log into GNOME

I have no problem when doing that here. Nothing in ~/.xsession-errors; the volume
applet is just set to 0.

Jason Bradley Nance (aitrus) wrote :

(In reply to comment #12)
> This should serve as a temporary workaround until we track down and fix this bug:
>
> gconftool-2 -s -t bool /schemas/desktop/gnome/sound/event_sounds false

Unfortunately, that doesn't seem to work around the issue in my case. But thanks
for the idea.

Dean Jansen (list) wrote :

Turns out that I had the snd_emu10k1 module loaded and not emu10k1.
However, when I 'rmmod snd_emu10k1' and then 'modprobe emu10k1' I still
have the problem when I return into gdm and press ctrl+alt+backspace. I
will attach both lsmods (one w/ emu10k1 and one w/ snd_emu10k1).

(In reply to comment #11)
> Dean, can you try unloading _all_ snd* modules and trying to reproduce
> the problem?

I'll boot back into ubuntu a little later--I've got some homework to do
now. Will I use the command 'rmmod snd*'?

Dean Jansen (list) wrote :

Created an attachment (id=298)
lsmod with original snd_emu10k1

Dean Jansen (list) wrote :

Created an attachment (id=299)
lsmod with emu10k1 loaded (snd_emu10k1 was rmmoded)

Sivan Greenberg (sivan) wrote :

(In reply to comment #9)
> I tested with snd-intel8x0; I can easily reproduce by unloading the module and
> then logging in from gdm.

Did exactly that. However, the machine in question (Dell Inspiron 8200) has not yet been
upgraded from the online archives. Might this be the difference that makes me unable to reproduce?

Jason Bradley Nance (aitrus) wrote :

(In reply to comment #20)
> Did exactly that. However, the machine in question (Dell Inspiron 8200) has
> not yet been upgraded from the online archives. Might this be the difference
> that makes me unable to reproduce?

I recall that being the case for me as well. I was fine pre-update.

Matt Zimmerman (mdz) wrote :

(In reply to comment #15)
> (In reply to comment #13)
> > 1. Log out of GNOME
> > 2. lsmod | grep '^snd' | awk '{ print $1 }' | xargs -n1 sudo rmmod
> > 3. Log into GNOME
>
> No problem by doing that here. Nothing in ~/.xsession-errors, the volume applet
> is just set to 0 ..

It might be a race condition of some sort, given that a deadlock is occurring.
It might also be affected by the configuration. Maybe try with a new uid?

Matt Zimmerman (mdz) wrote :

(In reply to comment #17)
> Turns out that I had the snd_emu10k1 module loaded and not emu10k1.

That's the correct module; you want snd-emu10k1 and NOT emu10k1. It does the
right thing automatically.

> I'll boot back into ubuntu a little later--I've got some homework to do
> now. Will I use the command 'rmmod snd*'?

use the lsmod pipeline I posted here earlier

Matt Zimmerman (mdz) wrote :

(In reply to comment #20)
> (In reply to comment #9)
> > I tested with snd-intel8x0; I can easily reproduce by unloading the module and
> > then logging in from gdm.
>
> Did exactly that. However, the machine in question (Dell Inspiron 8200) has
> not yet been upgraded from the online archives. Might this be the difference
> that makes me unable to reproduce?

It was only recently that sound events were enabled by default, which I can
only assume is what triggered this bug.

Jason Bradley Nance (aitrus) wrote :

(In reply to comment #22)
> It might also be affected by the configuration. Maybe try with a new uid?

I tried with a new uid. The first login did work, with some hiccups (stalls at
certain points of GNOME startup). However, successive logins resulted in the
same hang we are discussing in this thread.

Matt Zimmerman (mdz) wrote :

Here's where it gets stuck:

#0 0xffffe410 in __kernel_vsyscall ()
#1 0x40906373 in __write_nocancel () from /lib/tls/i686/cmov/libpthread.so.0
#2 0x0805de8f in vte_reaper_signal_handler (signum=-512) at reaper.c:64
#3 <signal handler called>
#4 0xffffe410 in __kernel_vsyscall ()
#5 0x40c6a49e in fork () from /lib/tls/i686/cmov/libc.so.6
#6 0x40908131 in fork () from /lib/tls/i686/cmov/libpthread.so.0
#7 0x402156dc in esd_open_sound () from /usr/lib/libesd.so.0
#8 0x403911c5 in gnome_config_set_sync_handler ()
   from /usr/lib/libgnome-2.so.0
#9 0x403917f6 in gnome_sound_connection_get () from /usr/lib/libgnome-2.so.0
#10 0x08059551 in start_esd () at gnome-settings-sound.c:72
#11 0x08059825 in apply_settings () at gnome-settings-sound.c:156
#12 0x08054f05 in gnome_settings_daemon_new () at gnome-settings-daemon.c:336
#13 0x08053a2c in main (argc=3, argv=0x80a7000) at factory.c:40

It's blocked writing to a pipe in vte_reaper_signal_handler. I don't understand
how all of this reaper.c stuff fits together quite yet, though... I'd appreciate
some insight from anyone who is familiar with the code.

Vincent Untz (vuntz) wrote :

http://bugzilla.gnome.org/show_bug.cgi?id=112998 looks similar.
Does killing esd unblock gnome-settings-daemon?

Matt Zimmerman (mdz) wrote :

(In reply to comment #27)
> http://bugzilla.gnome.org/show_bug.cgi?id=112998 looks similar.
> Does killing esd unblock gnome-settings-daemon?

esd is not running at the time (it fails to start up because there is no /dev/dsp)

Matt Zimmerman (mdz) wrote :

The case I was pointing out where there were two gnome-settings-daemon processes
seems to be a strange one; normally in my tests, there is only one process,
blocked in write() on a pipe. As far as I can see, that pipe isn't shared with
any other process, so it can never unblock.

potpal:[...2.8.0/gnome-settings-daemon] ps ux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
mdz 3831 0.0 0.5 2728 1328 ? Ss 01:01 0:00 /usr/sbin/famd -T
mdz 4477 0.0 0.7 6088 1884 ? S 01:57 0:00 sshd: mdz@pts/0
mdz 4478 0.0 0.9 5928 2552 pts/0 Ss 01:57 0:00 -zsh
mdz 4737 0.0 0.2 1936 704 ? Ss 01:57 0:00 dbus-daemon-1 --f
mdz 5678 0.0 3.5 17708 9064 ? Ss 02:03 0:00 gnome-session
mdz 5724 0.0 0.3 2920 868 ? Ss 02:03 0:00 /usr/bin/ssh-agen
mdz 5727 0.0 0.2 1936 704 ? Ss 02:03 0:00 dbus-daemon-1 --f
mdz 5730 0.0 1.1 5896 2976 ? S 02:03 0:00 /usr/lib/gconf2/g
mdz 5733 0.0 0.3 2176 976 ? S 02:03 0:00 /usr/bin/gnome-ke
mdz 6334 0.0 1.0 6396 2784 ? Ss 02:03 0:00 /usr/lib/bonobo-a
mdz 6336 0.0 0.9 7796 2436 ? Ss 02:03 0:00 gnome-smproxy --s
mdz 6338 0.0 3.3 19348 8488 ? S 02:03 0:00 /usr/lib/control-
mdz 6869 0.0 0.3 2476 828 pts/0 R+ 02:21 0:00 ps ux
potpal:[...2.8.0/gnome-settings-daemon] strace -p 6338
Process 6338 attached - interrupt to quit
write(33, "\21\0\0\0\262\32\0\0\0\0\0\0", 12 <unfinished ...>
Process 6338 detached
potpal:[...2.8.0/gnome-settings-daemon] ls -l /proc/6338/fd/33
l-wx------ 1 mdz mdz 64 2004-10-04 02:18 /proc/6338/fd/33 -> pipe:[18598]
Matt Zimmerman (mdz) wrote :

Created an attachment (id=326)
patch

Vincent Untz (vuntz) wrote :

I may be wrong, but this patch will break the typing break stuff...

Looking at reaper.c, the pipe is used only by one process
(gnome-settings-daemon) to send data from a signal handler to another function.
I can't understand why a pipe would break if both ends of the pipe are used by
only one program.

Matt Zimmerman (mdz) wrote :

How will it break the typing break stuff? I looked around for anything which
would do something with the "child-exited" signal, and didn't find any. That
code isn't used outside of gnome-settings-daemon itself. It looked like this
code was cut-and-pasted from elsewhere and this bit wasn't really applicable.

The pipe isn't breaking, as far as I saw. It's simply blocking on write, and
nothing ever reads from it.

Sebastien Bacher (seb128) wrote :

I've just tested a patched version; no problem with the break feature, apparently.

Vincent Untz (vuntz) wrote :

(In reply to comment #32)
> How will it break the typing break stuff? I looked around for anything which
> would do something with the "child-exited" signal, and didn't find any. That
> code isn't used outside of gnome-settings-daemon itself. It looked like this
> code was cut-and-pasted from elsewhere and this bit wasn't really applicable.

In gnome-settings-typing-break.c, gnome_settings_typing_break_init():
g_signal_connect (reaper, "child_exited", G_CALLBACK (child_exited_callback), NULL);

IIRC, '-' can be replaced by '_' in signal names (I should check whether this is really true).

Vincent Untz (vuntz) wrote :

This is what could happen:

The user sets up the typing monitor. It crashes or the user kills it. Then the
user disables the typing monitor in the keyboard preferences.

This code is executed:
  if (typing_monitor_pid > 0)
    kill (typing_monitor_pid, SIGKILL);

With the child-exited signal removed, typing_monitor_pid is not reset to 0. So
it might kill another process that got the pid that the typing monitor had.

Of course, this won't happen often, so it's far less critical than the current bug.
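
The stale-PID hazard described above can be illustrated with a small model. This is a sketch, not the gnome-settings-daemon code: the class `TypingMonitor`, its method names, and the injectable `kill` parameter are all hypothetical, and only the `pid > 0` check before the kill mirrors the snippet quoted in this comment.

```python
import os
import signal


class TypingMonitor:
    """Hypothetical model of the typing_monitor_pid bookkeeping."""

    def __init__(self, kill=os.kill):
        self.pid = 0
        self._kill = kill  # injectable so the sketch never kills a real process

    def started(self, pid):
        """Remember the PID of the freshly spawned monitor."""
        self.pid = pid

    def child_exited(self, pid):
        """Role of the child-exited callback: forget a PID once it is dead."""
        if pid == self.pid:
            self.pid = 0

    def disable(self):
        """Same check as the quoted C snippet: kill only if a PID is recorded."""
        if self.pid > 0:
            self._kill(self.pid, signal.SIGKILL)


# With the callback wired up, a crashed monitor leaves nothing to kill.
killed = []
safe = TypingMonitor(kill=lambda pid, sig: killed.append((pid, sig)))
safe.started(4242)
safe.child_exited(4242)
safe.disable()

# Without the callback, the recorded PID is stale and still gets signalled.
stale = TypingMonitor(kill=lambda pid, sig: killed.append((pid, sig)))
stale.started(4242)
stale.disable()
```

After this runs, `killed` holds a single entry, from the `stale` instance. In a real system PID 4242 could by then belong to an unrelated process, which is the risk being pointed out.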

Vincent Untz (vuntz) wrote :

OK, I think I understand the problem.

There are lots of forks when launching esound. When this doesn't work (no
/dev/dsp) the reaper will add stuff in the pipe because it receives SIGCHLD.
The problem is that it is not read by vte_reaper_emit_signal() because the g-s-d
process is still in another part of the code. So we fill the pipe. And we block.

I think we should do something like this in vte_reaper_init():

fcntl (reaper->iopipe[1], F_SETFL, O_NONBLOCK);

According to some doc:

A write request for {PIPE_BUF} or fewer bytes will have the following effect: If
there is sufficient space available in the pipe, write() will transfer all the
data and return the number of bytes requested. Otherwise, write() will transfer
no data and return -1 with errno set to [EAGAIN].

So the write won't block and no data will be written. This won't break anything.

Hope I'm not wrong.

(btw, having a reaper just for the typing monitor is really a bad idea. The
long-term solution would be to remove the reaper if possible.)
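
The behaviour described in this comment (a pipe that nobody drains fills up, and a non-blocking write end then fails with EAGAIN instead of hanging) can be tried out in isolation. The sketch below is not the reaper code; `make_nonblocking` and `notify` are hypothetical names. One deliberate difference from the one-line fcntl() call suggested above: the flags are read with F_GETFL first and O_NONBLOCK is OR-ed in, so any other status flags on the descriptor are preserved.

```python
import fcntl
import os


def make_nonblocking(fd):
    """Add O_NONBLOCK to fd while keeping its other status flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def notify(write_fd, record):
    """Try to queue one record; return False instead of blocking when full."""
    try:
        os.write(write_fd, record)
        return True
    except BlockingIOError:  # raised for EAGAIN / EWOULDBLOCK
        return False


read_fd, write_fd = os.pipe()
make_nonblocking(write_fd)

record = b"\0" * 12  # same size as the records seen in the strace output
queued = 0
while notify(write_fd, record):  # nobody reads, so the pipe eventually fills
    queued += 1
# At this point a blocking write() would hang forever; notify() just said no.
```

The trade-off is the one already noted in the comment: records that do not fit are dropped silently, so whatever consumes the pipe has to tolerate missed notifications.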

Sebastien Bacher (seb128) wrote :

Matt's patch has been applied in the control-center 2.8.0-0ubuntu4 upload; it
should fix the problem.
BTW, I think Vincent's idea is probably the right approach to the problem, so
I'm leaving the bug open to give us time to figure out what we should do.

Matt Zimmerman (mdz) wrote :

*** Bug 8821 has been marked as a duplicate of this bug. ***

Matt Zimmerman (mdz) wrote :

Seems to have been fixed in control-center 2.8.0-0ubuntu4; users aren't
reporting this problem anymore.

Callum McKenzie (callum) wrote :

The gnome-games package has seen a similar problem in a GNOME Bugzilla bug
report. Gnobots is crashing on an Ubuntu machine without sound. It can't
possibly be the gnobots code, since that just calls gnome_triggers_do. Is it
possible this problem is being triggered by another route? The bug in
question is http://bugzilla.gnome.org/show_bug.cgi?id=162086

Sebastien Bacher (seb128) wrote :

How can gnome-session crash gnobots?

Callum McKenzie (callum) wrote :

Obviously it can't. I added the comment here because the symptoms are very
similar and I am worried that the real problem might lie deeper than it appears
to. Having a second look at the comments, I think I have misread some of them
and the problems may not be as similar as I first thought. The gnobots problem
is within the gnome triggers subsystem so it starts reasonably low in the stack.
