Hang while loading snd-nm256 module

Bug #14277 reported by Norbert Kiesel
Affects: linux-source-2.6.15 (Ubuntu)
Status: Fix Released
Importance: Critical
Assigned to: Fabio Massimo Di Nitto

Bug Description

I installed the hoary-preview on my old Gateway Solo 3300. Installation went
fine, but hotplug locks up the machine solidly (only poweroff is still working).
I managed to skip over hotplug by hitting Ctrl-C during boot and got the system
running. Then I set VERBOSE=yes in /etc/default/rcS and ran
"/etc/init.d/hotplug start" in a root shell. Last lines before the lockup are:
(typed off the screen, so bear with me if there are typos)
* Running net.rc [ok]
* Running pci.rc
  intel-agp: loaded successfully
  shpchp: loaded successfully
  pciehp: can't be loaded
missing kernel or user mode driver pciehp
  piix: already loaded
  uhci-hcd: loaded successfully
  i2c-piix4: loaded successfully
  yenta_socket: loaded successfully
  3c59x: loaded successfully
  ignoring pci display device on 01:00.0
(solid lockup at this point)

Anything else you need to know? Any workarounds? All I care about at the moment
are the network and the Cardbus (for wireless network). Can I comment out stuff
in hotplug to make it work for me?

Norbert Kiesel (nk-iname) wrote :

Created an attachment (id=1759)
dmidecode

Norbert Kiesel (nk-iname) wrote :

Created an attachment (id=1760)
lspci

Norbert Kiesel (nk-iname) wrote :

Got it working by hitting Ctrl-C at the right point in time during hotplug
startup and then running "modprobe 3c59x; /etc/init.d/networking start"

Booting with acpi=off does not make a difference

Matt Zimmerman (mdz) wrote :

Did you try "noapic"?

Judging by the order of the devices on your PCI bus, the hang is occurring when
loading the driver for this device:

0000:01:00.1 Multimedia audio controller: Neomagic Corporation NM2200
[MagicMedia 256AV Audio] (rev 20)
 Subsystem: Gateway 2000: Unknown device 3300

This device is handled by the snd-nm256 (ALSA) and nm256_audio (OSS) modules,
and nm256_audio is blacklisted by alsa-base, so snd-nm256 seems likely to be the
module where the hang is occurring.

You should be able to test this hypothesis by adding snd-nm256 to
/etc/hotplug/blacklist (which will prevent it from being loaded by default).
The system should boot normally, and should hang when you "sudo modprobe snd-nm256".

The next step is to get a trace from the hang, using the instructions at
http://www.ubuntulinux.org/wiki/DebuggingSystemCrash
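
The blacklist test suggested above could be scripted roughly as follows. This is a minimal sketch, assuming /etc/hotplug/blacklist holds one module name per line; blacklist_module is a hypothetical helper name, not part of hotplug.

```shell
#!/bin/sh
# Append a module name to a blacklist file unless it is already listed.
# Assumption: the file holds one module name per line.
# usage: blacklist_module MODULE [FILE]
blacklist_module() {
    module=$1
    file=${2:-/etc/hotplug/blacklist}   # default path as named in the comment above
    # -q quiet, -x match the whole line, -F treat the name as a fixed string
    if grep -qxF -- "$module" "$file" 2>/dev/null; then
        echo "$module is already listed in $file"
    else
        echo "$module" >> "$file"
        echo "added $module to $file"
    fi
}
```

Run as root, "blacklist_module snd-nm256" would add the entry once; running it again leaves the file unchanged.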

Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #3)
> Got it working by hitting Ctrl-C at the right point in time during hotplug
> startup and then running "modprobe 3c59x; /etc/init.d/networking start"
>
> Booting with acpi=off does not make a difference

Hi,
  can you please blacklist the sound module and let us know if that solves the hang?
I would like to be 100% sure that this is the driver that is hanging the system
before starting a long debugging session.

Thanks
Fabio

Daniel T Chen (crimsun) wrote :

(In reply to comment #3)
> Got it working by hitting Ctrl-C at the right point in time during hotplug
> startup and then running "modprobe 3c59x; /etc/init.d/networking start"
>
> Booting with acpi=off does not make a difference

Blacklisting snd-nm256 should allow you to boot. As far as I know, this ALSA
driver has been problematic across several versions. I recommend we consider
using the OSS/Free version (nm256_audio) instead for Hoary.

Norbert Kiesel (nk-iname) wrote :

Blacklisting snd-nm256 makes the machine boot. Did not try the OSS module yet,
but will do later today and report back.

I tried to generate a stacktrace using ctrl-scrolllock but after modprobing
snd-nm256 the machine locks hard, including killing the keyboard (e.g. capslock
does not work anymore). I can produce stacktraces before modprobing snd-nm256.

Not sure what the procedure for changing severity is (e.g. is it ok for me to
change it (assuming I can)), but I'd downgrade it from critical now that there is
a workaround.

Norbert Kiesel (nk-iname) wrote :

Did a "modprobe nm256_audio", and gstreamer-properties does produce a test sound
when OSS is selected as output.

Fabio Massimo Di Nitto (fabbione) wrote :

Which kernel are you using exactly? 2.6.10-5-686 or something else?

fabio

Norbert Kiesel (nk-iname) wrote :

I'm using kernel 2.6.10-5-386

Most likely you know this already, but there is also an ALSA bug open for this:
https://bugtrack.alsa-project.org/alsa-bug/view.php?id=914 which also contains a
proposed patch. I'd be willing to test this patch but I'm not sure yet how to
reproduce the kernel source setup to match the Ubuntu config. As an
alternative, I'd also be willing to test a patched module on my system if
someone could create it.

Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #10)
> I'm using kernel 2.6.10-5-386
>
> Most likely you know this already, but there is also an ALSA bug open for this:
> https://bugtrack.alsa-project.org/alsa-bug/view.php?id=914 which also contains a
> proposed patch. I'd be willing to test this patch but I'm not sure yet how to
> reproduce the kernel source setup to match the Ubuntu config. As an
> alternative, I'd also be willing to test a patched module on my system if
> someone could create it.

Ok! Can you grab the patch and attach it here? The ALSA bug tracker requires a valid
account even for browsing. I can then build the kernel for you.

Thanks!
Fabio

Norbert Kiesel (nk-iname) wrote :

Created an attachment (id=1769)
proposed patch from ALSA bug report 914

attached patch. Btw, I just clicked on the "guest login (allows browsing
only)" or so on the ALSA bug system login page to access the bug report (i.e. I
don't have a valid ALSA account either).

Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #12)
> Created an attachment (id=1769)
> proposed patch from ALSA bug report 914
>
> attached patch. Btw, I just clicked on the "guest login (allows browsing
> only)" or so on the ALSA bug system login page to access the bug report (i.e. I
> don't have a valid ALSA account either).

Ok, can you try to replace
/lib/modules/2.6.10-5-386/kernel/sound/pci/nm256/snd-nm256.ko
with the one here:

http://people.ubuntu.com/~fabbione/snd-nm256.ko

run depmod -a
and see if it still crashes the machine?

Just remember to rmmod the oss module first.

Fabio
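
The module swap described above can be sketched as a small helper. This is only an illustration under stated assumptions: replace_module and the default backup directory are hypothetical, and the backup is deliberately kept outside /lib/modules so that depmod cannot index the old copy.

```shell
#!/bin/sh
# Back up the packaged module outside the module tree, then install the test build.
# usage: replace_module NEW_KO TARGET_KO [BACKUP_DIR]
replace_module() {
    new=$1
    target=$2
    backup_dir=${3:-/root/module-backup}   # hypothetical location, not from the thread
    mkdir -p "$backup_dir" || return 1
    cp -p "$target" "$backup_dir/" || return 1   # keep the original for rollback
    cp "$new" "$target"
}
```

After calling it with the downloaded snd-nm256.ko and the /lib/modules/2.6.10-5-386/kernel/sound/pci/nm256/snd-nm256.ko path, the remaining steps from the comment still apply: unload the OSS module with rmmod, then run depmod -a before trying modprobe.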

Norbert Kiesel (nk-iname) wrote :

yup, sorry to say but it still crashes (i.e. locks) the machine.
I also rebooted using init=/bin/bash, then manually executed some minimal init.d
(mountvirtfs, udev, module-init-tools procfs.sh), then modprobe snd-nm256 and it
still locked the box the same way.
Another thing I tried was "modprobe snd-nm256 force_ac97=1 vaio_hack=1", but that also
locked the machine.

Too bad I can't get any info out of the box (keyboard is completely dead).

Anything else I could try?

Daniel T Chen (crimsun) wrote :

(In reply to comment #14)
> Anything else I could try?

Please modprobe with the reset_workaround=1 parameter from a fresh boot into
Rescue mode (single user); if successful, telinit 2.

Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #14)
> yup, sorry to say but it still crashes (i.e. locks) the machine.

That module includes the patch from the ALSA bug report, so apparently that patch is not enough.

Norbert Kiesel (nk-iname) wrote :

(In reply to comment #15)
>
> Please modprobe with the reset_workaround=1 parameter from a fresh boot into
> Rescue mode (single user); if successful, telinit 2.

Did this (with both the original and the patched module), but it did not make a
difference: the system locks up. Have some more info though (written down and typed
in, so typos possible. ??? == could not read my own handwriting :-).
# modprobe snd-nm256 reset_workaround=1
ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
ACPI: PCI interrupt 0000:01:00.1[B] -> GSI 10 (level, low) -> IRQ 10
nm256: found ??? in video RAM: 0x27ec00
nm256: Mapping port 1 from 0x2709a0 - 0x27ec00

Daniel T Chen (crimsun) wrote :

(In reply to comment #17)
> Did this (with both the original and the patched module), but it did not make a
> difference: the system locks up. Have some more info though (written down and typed
> in, so typos possible. ??? == could not read my own handwriting :-).
> # modprobe snd-nm256 reset_workaround=1
> ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
> ACPI: PCI interrupt 0000:01:00.1[B] -> GSI 10 (level, low) -> IRQ 10
> nm256: found ??? in video RAM: 0x27ec00
> nm256: Mapping port 1 from 0x2709a0 - 0x27ec00

Is gdm already running when you modprobe with that parameter? If so, please
reboot into single user and try it without X loaded.

Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #17)
> (In reply to comment #15)
> >
> > Please modprobe with the reset_workaround=1 parameter from a fresh boot into
> > Rescue mode (single user); if successful, telinit 2.
>
> Did this (with both the original and the patched module), but it did not make a
> difference: the system locks up. Have some more info though (written down and typed
> in, so typos possible. ??? == could not read my own handwriting :-).
> # modprobe snd-nm256 reset_workaround=1
> ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
> ACPI: PCI interrupt 0000:01:00.1[B] -> GSI 10 (level, low) -> IRQ 10
> nm256: found ??? in video RAM: 0x27ec00
> nm256: Mapping port 1 from 0x2709a0 - 0x27ec00
>
>

hmmmmm can you try to boot the kernel with irqpoll option and try the 2 modules?

but that "nm256: found ??? in video RAM: 0x27ec00" is really weird..

Fabio

Norbert Kiesel (nk-iname) wrote :

(In reply to comment #18)
>
> Is gdm already running when you modprobe with that parameter? If so, please
> reboot into single user and try it without X loaded.

Sorry, I did not mention it, but this was after booting into single user mode (i.e. X
was not loaded).

Norbert Kiesel (nk-iname) wrote :

(In reply to comment #19)
> (In reply to comment #17)
> > (In reply to comment #15)
> > >
> > > Please modprobe with the reset_workaround=1 parameter from a fresh boot into
> > > Rescue mode (single user); if successful, telinit 2.
> >
> > Did this (with both the original and the patched module), but it did not make a
> > difference: the system locks up. Have some more info though (written down and typed
> > in, so typos possible. ??? == could not read my own handwriting :-).
> > # modprobe snd-nm256 reset_workaround=1
> > ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
> > ACPI: PCI interrupt 0000:01:00.1[B] -> GSI 10 (level, low) -> IRQ 10
> > nm256: found ??? in video RAM: 0x27ec00
> > nm256: Mapping port 1 from 0x2709a0 - 0x27ec00
> >
> >
>
> hmmmmm can you try to boot the kernel with irqpoll option and try the 2 modules?
will do and report back.
>
> but that "nm256: found ??? in video RAM: 0x27ec00" is really weird..
Just to make this clear: there were no ??? but something reasonable written.
It's just that I can't read what I scribbled down on a piece of paper after the
system locked up. Will try again and report the exact line.

Norbert Kiesel (nk-iname) wrote :

(In reply to comment #19)
> (In reply to comment #17)
>
> hmmmmm can you try to boot the kernel with irqpoll option and try the 2 modules?
booted using "single irqpoll" and then entered "modprobe snd-nm256
reset_workaround=1". Result: solid lock-up.
This was with the original snd-nm256 module. What is the expected result of
irqpoll? Some output? Did not seem to do anything. Will try with your patched
module next.

> but that "nm256: found ??? in video RAM: 0x27ec00" is really weird..
OK, wrote it down better now. The line reads:
nm256: found card signature in video RAM: 0x27ec00

Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #22)
> (In reply to comment #19)
> > (In reply to comment #17)
> >
> > hmmmmm can you try to boot the kernel with irqpoll option and try the 2 modules?
> booted using "single irqpoll" and then entered "modprobe snd-nm256
> reset_workaround=1". Result: solid lock-up.
> This was with the original snd-nm256 module. What is the expected result of
> irqpoll? Some output? Did not seem to do anything. Will try with your patched
> module next.

There is no change in output other than a couple of lines in dmesg. It is a
different way of handling IRQs.

>
> > but that "nm256: found ??? in video RAM: 0x27ec00" is really weird..
> OK, wrote it down better now. The line reads:
> nm256: found card signature in video RAM: 0x27ec00
>
>

This is clearly a misdetection... I will have to test another patch
I saw upstream. I am preparing another test module for you.

Fabio

Fabio Massimo Di Nitto (fabbione) wrote :

There is an updated driver at the same location as before.

It includes the patch from alsa and an update about detection from 2.6.12rc1.

Would you mind testing it, please?

Fabio

PS if this driver doesn't work we will have to ship hoary with the oss workaround :(

Norbert Kiesel (nk-iname) wrote :

Hi Fabio,

tested the new module and it - kind of - works. I did a "echo options snd-nm256
reset_workaround=1 > /etc/modprobe.conf". The problem is that there is a clicking
noise at about 10 Hz. I somehow think the cause is that this machine has too much
crammed onto the same interrupt.

root@voyager:~ # cat /proc/interrupts
           CPU0
  0: 991949 XT-PIC timer
  1: 2337 XT-PIC i8042
  2: 0 XT-PIC cascade
  3: 2669 XT-PIC orinoco_cs
  9: 1416 XT-PIC acpi
 10: 11142 XT-PIC uhci_hcd, yenta, eth0, NM256AV
 12: 57476 XT-PIC i8042
 14: 26214 XT-PIC ide0
NMI: 0
LOC: 0
ERR: 0
MIS: 0

I first just renamed the original module but left it in /lib/modules. This I
think resulted in loading the original module instead of the new one. So on
next reboot in single-user-mode I hit ctrl-c while hotplug was loading the
modules and ended up with a system which did not have anything mapped on INT 10.
 Then I moved the original module out of /lib/modules, ran "depmod -a; modprobe
snd-nm256; telinit 3" and got a clear sound (but of course no wireless as e.g.
yenta was not loaded).
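
A renamed copy left under /lib/modules, as described above, can be spotted by listing every file in the module tree whose name contains the driver name. This is a rough sketch; find_module_copies is a hypothetical helper, and the thread does not establish exactly why the old copy was picked up, so the listing only shows candidates to move out of the tree before running depmod -a.

```shell
#!/bin/sh
# List every file under a module tree whose name contains PATTERN.
# usage: find_module_copies PATTERN [TREE]
find_module_copies() {
    pattern=$1
    tree=${2:-/lib/modules/$(uname -r)}   # default: tree of the running kernel
    find "$tree" -type f -name "*${pattern}*" | sort
}
```

Calling "find_module_copies nm256" should ideally print a single snd-nm256.ko path (plus nm256_audio.ko for the OSS driver, if installed); anything else is a leftover copy.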

Is there a way to force nm256 to use another interrupt or am I on the wrong
track here anyway?
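
The sharing on IRQ 10 in the /proc/interrupts listing above can be picked out mechanically. A minimal sketch, assuming the usual layout in which each numbered IRQ line ends in a comma-separated list of handlers; shared_irqs is a hypothetical helper name. It only reports sharing and says nothing about whether sharing causes the noise.

```shell
#!/bin/sh
# Report IRQ lines that list more than one handler.
# usage: shared_irqs [FILE]
shared_irqs() {
    awk '/^ *[0-9]+:/ && /,/ {
        irq = $1
        sub(/:$/, "", irq)      # "10:" -> "10"
        n = gsub(/,/, ",")      # count the commas on the line
        printf "IRQ %s is shared by %d devices\n", irq, n + 1
    }' "${1:-/proc/interrupts}"
}
```

Fed the listing above, the only line it reports is IRQ 10 with four devices (uhci_hcd, yenta, eth0, NM256AV).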

Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #25)
> Hi Fabio,
>
> tested the new module and it - kind of - works. I did a "echo options snd-nm256
> reset_workaround=1 > /etc/modprobe.conf". The problem is that there is a clicking
> noise at about 10 Hz. I somehow think the cause is that this machine has too much
> crammed onto the same interrupt.

finally a bit of good news... at least it does not hang the system, right?
did you try without the reset_workaround ? there is also the force_ac97
option that looks "interesting".

>
> I first just renamed the original module but left it in /lib/modules. This I
> think resulted in loading the original module instead of the new one. So on
> next reboot in single-user-mode I hit ctrl-c while hotplug was loading the
> modules and ended up with a system which did not have anything mapped on INT 10.
> Then I moved the original module out of /lib/modules, ran "depmod -a; modprobe
> snd-nm256; telinit 3" and got a clear sound (but of course no wireless as e.g.
> yenta was not loaded).
>
> Is there a way to force nm256 to use another interrupt or am I on the wrong
> track here anyway?
>

Not in the module. I think you need to check that in the BIOS.

Fabio

Norbert Kiesel (nk-iname) wrote :

(In reply to comment #26)
>
> finally a bit of good news... at least it does not hang the system, right?
> did you try without the reset_workaround ? there is also the force_ac97
> option that looks "interesting".

Yes, it did not hang the system. Will try to remove snd-nm256 from the hotplug
blacklist and boot regularly (i.e. not first in single user mode).

> >
> > Is there a way to force nm256 to use another interrupt or am I on the wrong
> > track here anyway?
> >
>
> Not in the module. I think you need to check that in the BIOS.

This BIOS does not allow playing around with IRQ (phoenix bios crippled by
Gateway). I think that I was wrong anyway: installed alsaplayer and this plays
my testfiles without problem, so seems to be more a gstreamer problem
(gstreamer-properties also creates a "hackish" sound). Will look into that
further.
>
> Fabio

Norbert Kiesel (nk-iname) wrote :

It seems we have a winner. I moved /etc/modprobe.conf to
/etc/modprobe.d/snd-nm256.modprobe and switched gstreamer to use esd instead of
alsa. Then I removed snd-nm256 from /etc/hotplug/blacklist again and rebooted.
 System comes up, loads snd-nm256 and has clear sound.

I currently have reset_workaround=1 force_ac97=1 in snd-nm256.modprobe, will try
to get rid of them and see if this works, too.

Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #28)
> It seems we have a winner.

ROCKING! the fix will be in -29 that will be in the archive either later today
or tomorrow.

> I moved /etc/modprobe.conf to
> /etc/modprobe.d/snd-nm256.modprobe and switched gstreamer to use esd instead of
> alsa. Then I removed snd-nm256 from /etc/hotplug/blacklist again and rebooted.
> System comes up, loads snd-nm256 and has clear sound.
>
> I currently have reset_workaround=1 force_ac97=1 in snd-nm256.modprobe, will try
> to get rid of them and see if this works, too.
>
>

Please let me know.

Thanks
Fabio

Norbert Kiesel (nk-iname) wrote :

(In reply to comment #29)
> (In reply to comment #28)
> > It seems we have a winner.
>
> ROCKING! the fix will be in -29 that will be in the archive either later today
> or tomorrow.
Cool.

> Please let me know.
It still locks without reset_workaround=1. With reset_workaround=1, I could
boot 5 times in a row without problems.
I finally added the line to the bottom of /etc/modprobe.d/alsa-base and removed
my /etc/modprobe.d/snd-nm256.modprobe again, assuming that is what you will end
up doing, too.
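
Adding such an options line by hand is easy to get wrong when an older entry for the same module is still present. The sketch below keeps exactly one "options" line per module in a given file; set_module_options is a hypothetical helper, and the exact line used in this report is assumed to be "options snd-nm256 reset_workaround=1".

```shell
#!/bin/sh
# Ensure FILE carries exactly one "options MODULE ..." line.
# usage: set_module_options FILE MODULE OPTION...
set_module_options() {
    file=$1
    module=$2
    shift 2
    line="options $module $*"
    touch "$file" || return 1
    # Copy every line except an earlier "options <module> ..." entry.
    grep -v "^options[[:space:]]\{1,\}$module[[:space:]]" "$file" > "$file.tmp"
    echo "$line" >> "$file.tmp"
    mv "$file.tmp" "$file"    # note: the rewritten file gets default permissions
}
```

For example, "set_module_options /etc/modprobe.d/alsa-base snd-nm256 reset_workaround=1" would replace any previous snd-nm256 options line and leave entries for other modules untouched.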

Now I just have to find out why it's always hanging 20 seconds in
/etc/init.d/networking and then failing to enable my wireless card and I'm a
happy camper. (Actually I think I know why this happens: it's a PCMCIA card and
pcmcia is started as rc2.d/S20pcmcia which is way later than rcS.d/S40networking)

> Thanks
> Fabio

Thomas Hood (jdthood) wrote :

(In reply to comment #28)
> It seems we have a winner. I moved /etc/modprobe.conf to
> /etc/modprobe.d/snd-nm256.modprobe and switched gstreamer to use esd instead of
> alsa. Then I removed snd-nm256 from /etc/hotplug/blacklist again and rebooted.
> System comes up, loads snd-nm256 and has clear sound.
>
> I currently have reset_workaround=1 force_ac97=1 in snd-nm256.modprobe, will try
> to get rid of them and see if this works, too.

You should know that if /etc/modprobe.conf is present then /etc/modprobe.d/* are
not used.

I am a bit confused about exactly what needed to be done to fix this problem.
Can someone please summarize the solution for me so that I can implement the
fix upstream in Debian too?
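
The modprobe.conf caveat above is easy to check before relying on files in /etc/modprobe.d. A minimal sketch that takes the behaviour exactly as stated in this comment; check_modprobe_conf is a hypothetical helper, and the optional ROOT argument exists only so the check can be tried against a scratch directory.

```shell
#!/bin/sh
# Warn when modprobe.conf exists, since modprobe.d is then reported to be ignored.
# usage: check_modprobe_conf [ROOT]
check_modprobe_conf() {
    root=${1:-}
    if [ -e "$root/etc/modprobe.conf" ]; then
        echo "warning: $root/etc/modprobe.conf exists; $root/etc/modprobe.d is not used"
        return 1
    fi
    echo "ok: no modprobe.conf found, $root/etc/modprobe.d is in effect"
}
```

On the reporter's machine this would have flagged the file created earlier by redirecting echo into /etc/modprobe.conf.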

Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #31)
> (In reply to comment #28)
> > It seems we have a winner. I moved /etc/modprobe.conf to
> > /etc/modprobe.d/snd-nm256.modprobe and switched gstreamer to use esd instead of
> > alsa. Then I removed snd-nm256 from /etc/hotplug/blacklist again and rebooted.
> > System comes up, loads snd-nm256 and has clear sound.
> >
> > I currently have reset_workaround=1 force_ac97=1 in snd-nm256.modprobe, will try
> > to get rid of them and see if this works, too.
>
>
> You should know that if /etc/modprobe.conf is present then /etc/modprobe.d/* are
> not used.
>
>
> I am a bit confused about exactly what needed to be done to fix this problem.
> Can someone please summarize the solution for me so that I can implement the
> fix upstream in Debian too?

It is a kernel patch. Everything else was only testing done to verify that the
module was ok.

Fabio

Thomas Hood (jdthood) wrote :

> It is a kernel patch. All the other stuff were only tests done to verify that the
> module was ok.

Does this patch apply to the source from the alsa-driver tarball too? If so, please
send it to me. :)

--
Thomas
<email address hidden>

Chuck Short (zulcss) wrote :

Should be fixed in the next upload

Norbert Kiesel (nk-iname) wrote :

Hi,

the module works now with linux-image-2.6.10-5-386, but not with
linux-image-2.6.10-5-686. My tests are of course in no way scientific, but I
got 3 out of 3 lockups with -686 and 2 out of 2 successful boots with sound
using -386.

Fabrizio Magni (fabrizio-magni) wrote :

Hi,
I'm testing the 2.6.11-1-386 on my DELL Latitude CPi.
The snd-nm256 freezes my system even with reset_workaround=1 force_ac97=1 vaio_hack=1.

Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #35)
> Hi,
>
> the module works now with linux-image-2.6.10-5-386, but not with
> linux-image-2.6.10-5-686. My tests are of course in no way scientific, but I
> got 3 out of 3 lockups with -686 and 2 out of 2 successful boots with sound
> using -386.
>

The patch is applied unconditionally. Are you sure all the kernels are updated?

Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #36)
> Hi,
> I'm testing the 2.6.11-1-386 on my DELL Latitude CPi.
> The snd-nm256 freezes my system even with reset_workaround=1 force_ac97=1 vaio_hack=1.
>
>

2.6.11 is not supported.

Fabrizio Magni (fabrizio-magni) wrote :

Sorry I didn't know.

However snd-nm256 of linux-image-2.6.10-5-686 works here.

Thanks.

aftertaf (dwooffindin) wrote :

(In reply to comment #39)
> Sorry I didn't know.
>
> However snd-nm256 of linux-image-2.6.10-5-686 works here.
>
> Thanks.

I have the same problem with a Latitude LS on Debian.
I have disabled OSS in the new kernel I'm compiling.

Is the problem with the ALSA sources in the kernel?

Norbert Kiesel (nk-iname) wrote :

(In reply to comment #40)
> I have the same problem with a Latitude LS on Debian.
> I have disabled OSS in the new kernel i'm compiling.
>
> Is the problem with the alsa sources in the kernel?
Yup, the patch was applied to the nm256 file of ALSA in the kernel.
I actually did not look at the patch myself, but it's been working
beautifully since it was incorporated into the Ubuntu kernel.
I upgraded the kernel 3 times since then and sound always worked
flawlessly.

Jonathon Sim (sim) wrote :

In breezy's 2.6.12 kernel unfortunately this patch doesn't seem to be applied -
I have applied that patch to the breezy kernel source and recompiled and it
fixed this issue on my Dell Latitude LS.

Michael Haile (chain009) wrote :

(In reply to comment #42)
> In breezy's 2.6.12 kernel unfortunately this patch doesn't seem to be applied -
> I have applied that patch to the breezy kernel source and recompiled and it
> fixed this issue on my Dell Latitude LS.

Which patch did you use? The one listed above (proposed patch from ALSA bug
report 914) or another patch? Can you provide a link to the patch you used?

Igor Bukanov (igor-bukanov) wrote :

The bug still happens with Dapper Flight 6 and Breezy final on my Sony Vaio PCG-Z505R notebook. As in the reports above, the system freezes after

modprobe snd-nm256
modprobe snd-nm256 reset_workaround=1 force_ac97=1 vaio_hack=1

and different combinations.

But during the first boot right after installation the sound is available and works fine.

Note that the bug is also present in Fedora Core 4 with all the updated kernels there, including 2.6.16.
