Oops in inotify_dev_queue_event (and USB umount goes to hell)

Bug #11844 reported by Jens Askengren
Affects: linux-source-2.6.15 (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Chuck Short

Bug Description

I saved a file to a subdirectory of my home directory and browsed to it using
nautilus.
Any program trying to access this directory will hang forever.
The directory only contains 3 files.

Version: Linux jag 2.6.10-1-386 #1 Tue Jan 11 03:59:15 UTC 2005 i686 GNU/Linux

The following stack trace was written to /var/log/syslog:

Jan 11 16:54:16 localhost -- MARK --
Jan 11 17:00:33 localhost kernel: Disabled Privacy Extensions on device c02ed2a0(lo)
Jan 11 17:00:33 localhost kernel: IPv6 over IPv4 tunneling driver
Jan 11 17:00:33 localhost kernel: ACPI: Power Button (FF) [PWRF]
Jan 11 17:00:33 localhost kernel: ACPI: Sleep Button (CM) [SLPB]
Jan 11 17:00:33 localhost kernel: apm: BIOS version 1.2 Flags 0x07 (Driver
version 1.16ac)
Jan 11 17:00:33 localhost kernel: apm: overridden by ACPI.
Jan 11 17:00:33 localhost kernel: eth0: no IPv6 routers present
Jan 11 17:00:33 localhost kernel: Unable to handle kernel NULL pointer
dereference at virtual address 00000008
Jan 11 17:00:33 localhost kernel: printing eip:
Jan 11 17:00:33 localhost kernel: c01d4f03
Jan 11 17:00:33 localhost kernel: *pde = 00000000
Jan 11 17:00:33 localhost kernel: Oops: 0000 [#1]
Jan 11 17:00:33 localhost kernel: PREEMPT
Jan 11 17:00:33 localhost kernel: Modules linked in: binfmt_misc proc_intf
freq_table cpufreq_userspace cpufreq_ondemand cpufreq_powersave button ac
battery ipv6 8139cp 8139too mii snd_intel8x0 snd_ac97_codec snd_pcm_oss
snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc i2c_i801 i2c_core
ehci_hcd usbhid uhci_hcd usbcore shpchp pci_hotplug intel_mch_agp intel_agp
agpgart floppy rtc md dm_mod capability commoncap parport_pc lp evdev parport
tsdev ide_cd cdrom mousedev psmouse ext3 jbd ide_generic piix ide_disk ide_core
unix thermal processor fan fbcon crc32 font bitblit vesafb cfbcopyarea cfbimgblt
cfbfillrect
Jan 11 17:00:33 localhost kernel: CPU: 0
Jan 11 17:00:33 localhost kernel: EIP: 0060:[inotify_dev_queue_event+159/268]
   Not tainted VLI
Jan 11 17:00:33 localhost kernel: EFLAGS: 00210287 (2.6.10-1-386)
Jan 11 17:00:33 localhost kernel: EIP is at inotify_dev_queue_event+0x9f/0x10c
Jan 11 17:00:33 localhost kernel: eax: 00000000 ebx: 00000800 ecx: dfd23580
  edx: d99fd168
Jan 11 17:00:33 localhost kernel: esi: 00000800 edi: c3537680 ebp: d98fcec0
  esp: c54cdeb0
Jan 11 17:00:33 localhost kernel: ds: 007b es: 007b ss: 0068
Jan 11 17:00:33 localhost kernel: Process evolution (pid: 19343,
threadinfo=c54cc000 task=ca4370a0)
Jan 11 17:00:33 localhost kernel: Stack: 00000000 00000000 d99fd168 c54cc000
d99fd168 c3537680 dfd23580 c01d5408
Jan 11 17:00:33 localhost kernel: d98fcec0 d99fd168 00000800 00000000
dfd23580 00000000 000081a4 c3537680
Jan 11 17:00:33 localhost kernel: dfd2351c c0151ead c3537680 00000800
00000000 dfd23580 c54cdf60 dfd2351c
Jan 11 17:00:33 localhost kernel: Call Trace:
Jan 11 17:00:33 localhost kernel: [inotify_inode_queue_event+66/112]
inotify_inode_queue_event+0x42/0x70
Jan 11 17:00:33 localhost kernel: [vfs_create+231/258] vfs_create+0xe7/0x102
Jan 11 17:00:33 localhost kernel: [open_namei+322/1286] open_namei+0x142/0x506
Jan 11 17:00:33 localhost kernel: [filp_open+44/73] filp_open+0x2c/0x49
Jan 11 17:00:33 localhost kernel: [get_unused_fd+40/189] get_unused_fd+0x28/0xbd
Jan 11 17:00:33 localhost kernel: [sys_open+49/141] sys_open+0x31/0x8d
Jan 11 17:00:33 localhost kernel: [sysenter_past_esp+82/117]
sysenter_past_esp+0x52/0x75

Matt Zimmerman (mdz) wrote :

What type of filesystem are you using? (the default is ext3)

Did this happen only once, or does it happen every time you access that file?

Jens Askengren (jens-askengren) wrote :

The filesystem is ext3 on a lvm volume.

It seems like the problem was triggered by evolution or possibly gnome-vfs
creating a file and starting to monitor the directory for changes.
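
The trigger described here (a file created in a directory that something else is monitoring) can be sketched in userspace terms. This is a minimal sketch, assuming the `inotify_init()` / `inotify_add_watch()` system-call interface from `<sys/inotify.h>` as found in later mainline kernels, which may well differ from the interface of the inotify patch carried by this 2.6.10 kernel. `watch_and_create` is a hypothetical helper name, not something from the report.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Watch `dir` for file creation, create `name` inside it, and return the
 * mask of the first event read back (0 on any failure).
 * Assumption: the <sys/inotify.h> system-call API, not the patch under test.
 */
unsigned int watch_and_create(const char *dir, const char *name)
{
    char path[4096];
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    unsigned int mask = 0;

    int ifd = inotify_init();
    if (ifd < 0)
        return 0;

    /* Start monitoring the directory, much as a file manager would. */
    if (inotify_add_watch(ifd, dir, IN_CREATE) < 0) {
        close(ifd);
        return 0;
    }

    int n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof(path)) {
        close(ifd);
        return 0;
    }

    /* Creating a file in the watched directory queues an event. */
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd >= 0) {
        close(fd);
        ssize_t len = read(ifd, buf, sizeof(buf));
        if (len >= (ssize_t)sizeof(struct inotify_event))
            mask = ((const struct inotify_event *)buf)->mask;
        unlink(path);
    }

    close(ifd);
    return mask;
}
```

Calling `watch_and_create("/tmp", "probe.txt")` on a kernel with inotify support should return a mask with `IN_CREATE` set. On the affected kernels it is this create-while-watched path (`vfs_create` calling into inotify) that appears in the call trace.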

Any access to the directory where the file was created would
hang the accessing process.

A reboot solved the problem.

I have upgraded the kernel and have not been able to reproduce the problem.

Fabio Massimo Di Nitto (fabbione) wrote :

Closing the bug.

Matt Zimmerman (mdz) wrote :

Confirmed in bug #12165 to still exist with 2.6.10-10

Matt Zimmerman (mdz) wrote :

*** Bug 12165 has been marked as a duplicate of this bug. ***

Fabio Massimo Di Nitto (fabbione) wrote :

Jeff,
    can you please get in touch with upstream?

Fabio

Sebastien Bacher (seb128) wrote :

*** Bug 12271 has been marked as a duplicate of this bug. ***

Javier Barroso (javibarroso) wrote :

(In reply to comment #7)
> *** Bug 12271 has been marked as a duplicate of this bug. ***

Today I updated my Ubuntu system again, and my whole system hung IMMEDIATELY
(no ssh access, no mouse, no keyboard ... nothing) after this sequence:
mount vfat partition | open with nautilus | umount vfat partition without
closing the nautilus browser.
I think that if this bug is fixed, then either bug 12271 is not a duplicate of
5431 (I don't see an oops in my logs), or the fix for 5431 has not been uploaded
to the Ubuntu hoary repository yet.

Thank you for Ubuntu!

My nautilus version:
Gnome nautilus 2.9.90
My dmesg output:
Linux version 2.6.10-2-k7 (buildd@mcmurdo) (gcc version 3.3.5 (Debian
1:3.3.5-6ubuntu1)) #1 Mon Jan 24 13:15:38 UTC 2005
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000001fff0000 (usable)
 BIOS-e820: 000000001fff0000 - 000000001fff3000 (ACPI NVS)
 BIOS-e820: 000000001fff3000 - 0000000020000000 (ACPI data)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
0MB HIGHMEM available.
511MB LOWMEM available.
found SMP MP-table at 000f4b20
On node 0 totalpages: 131056
  DMA zone: 4096 pages, LIFO batch:1
  Normal zone: 126960 pages, LIFO batch:16
  HighMem zone: 0 pages, LIFO batch:1
DMI 2.3 present.
ACPI: RSDP (v000 GBT ) @ 0x000f63f0
ACPI: RSDT (v001 GBT AWRDACPI 0x42302e31 AWRD 0x01010101) @ 0x1fff3000
ACPI: FADT (v001 GBT AWRDACPI 0x42302e31 AWRD 0x01010101) @ 0x1fff3040
ACPI: MADT (v001 GBT AWRDACPI 0x42302e31 AWRD 0x01010101) @ 0x1fff6840
ACPI: DSDT (v001 GBT AWRDACPI 0x00001000 MSFT 0x0100000c) @ 0x00000000
ACPI: PM-Timer IO Port: 0x4008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:10 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode: Flat. Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Built 1 zonelists
Kernel command line: root=/dev/hdc1 ro quiet splash
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Initializing CPU#0
PID hash table entries: 2048 (order: 11, 32768 bytes)
Detected 2010.002 MHz processor.
Using pmtmr for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 511424k/524224k available (1593k kernel code, 12184k reserved, 719k
data, 164k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 3981.31 BogoMIPS (lpj=1990656)
Security Framework v1.0.0 initialized
SEL...

Fabio Massimo Di Nitto (fabbione) wrote :

This problem should be fixed in 2.6.10-12 that will hit the mirrors in a few hours.

dmatrix7 (dmatrix7) wrote :

Bug is still present on linux-image-2.6.10-2-686-smp 2.6.10-13.

Unable to handle kernel NULL pointer dereference at virtual address 00000008
 printing eip:
c01f5450
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in: proc_intf freq_table cpufreq_userspace cpufreq_ondemand
cpufreq_powersave nfsd exportfs video sony_acpi pcc_acpi button battery
container ac nfs lockd sunrpc af_packet e1000 i2c_i801 i2c_core piix hw_random
uhci_hcd usbcore floppy pcspkr rtc md dm_mod capability commoncap parport_pc lp
parport evdev tsdev ide_generic ide_disk ide_cd ide_core cdrom mousedev psmouse
ext3 jbd mbcache sd_mod gdth scsi_mod unix thermal processor fan fbcon crc32
font bitblit vesafb cfbcopyarea cfbimgblt cfbfillrect
CPU: 0
EIP: 0060:[<c01f5450>] Not tainted VLI
EFLAGS: 00010287 (2.6.10-2-686-smp)
EIP is at inotify_dev_queue_event+0x77/0x180
eax: 00000000 ebx: f6868300 ecx: 00000800 edx: dd79fb68
esi: f6868310 edi: 00000800 ebp: 00000000 esp: d8cd7e88
ds: 007b es: 007b ss: 0068
Process evolution (pid: 23894, threadinfo=d8cd6000 task=eec1ea20)
Stack: 000f4430 eec1ea20 eec1eb88 00000002 00000000 d6145f7c 00000000 dd79fb68
       dd79fb68 d9ea70ec d6145f7c 00000000 c01f5aed f6868300 dd79fb68 00000800
       00000000 d6145f7c 00000000 000081a4 d9ea70ec d6145f14 c0167496 d9ea70ec
Call Trace:
 [<c01f5aed>] inotify_inode_queue_event+0x57/0x81
 [<c0167496>] vfs_create+0xeb/0x16c
 [<c0167d37>] open_namei+0x5cc/0x61f
 [<c0157ee3>] filp_open+0x3e/0x64
 [<c0158307>] sys_open+0x44/0xc6
 [<c0102f1d>] sysenter_past_esp+0x52/0x75
Code: 0f 87 b9 00 00 00 0f 84 c7 00 00 00 81 f9 00 20 00 00 74 33 81 f9 00 80 00
00 74 2b 8b 54 24 1c 89 cf 8b 42 08 8b 80 24 01 00 00 <23> 78 08 85 ff 0f 84 89
00 00 00 81 f9 00 80 00 00 74 09 89 c8
 <6>note: evolution[23894] exited with preempt_count 1
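
For anyone decoding the oops: a fault at virtual address 00000008 with eax at 00000000 is the usual signature of reading a structure member at offset 8 through a NULL pointer. Decoded by hand, the marked bytes `<23> 78 08` in the Code: line appear to be `and 0x8(%eax),%edi`, which fits that pattern. The sketch below illustrates only the address arithmetic. `struct sketch_data` and its fields are hypothetical and are not the real inotify structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: only the offset of `mask` (8) mirrors the oops. */
struct sketch_data {
    uint32_t pad0;
    uint32_t pad1;
    uint32_t mask;   /* third 32-bit field, so it sits at offset 8 */
};

/*
 * Address the CPU would try to read for `base->mask`, computed without
 * actually dereferencing anything.
 */
uintptr_t field_address(uintptr_t base)
{
    return base + offsetof(struct sketch_data, mask);
}
```

With a NULL base, `field_address(0)` is 8, which is the faulting address reported above.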

Fabio Massimo Di Nitto (fabbione) wrote :

I had a conversation with Robert Love (upstream author of inotify) and i am
going to paste the relevant bits here for the record:

> I am going out with inotify -12 or -13 in a couple of days and we can see if that
> still happens.

Let me know if the bug still happens.

If it does, unless the user can get in there with kgdb, I think we need
to put debugging checks in there and see what trips.

You can even do that now if you want. Something like

 BUG_ON(!dev);
 BUG_ON(!watch);
 BUG_ON(!watch->inode);
 BUG_ON(!watch->inode->inotify_data);

at the top of inotify_dev_queue_event().
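
The point of ordering the checks this way is that the first one to trip names the exact link in the pointer chain that is NULL. A minimal userspace sketch of that idea follows. The `sketch_*` structures are hypothetical stand-ins whose field names are taken from the BUG_ON() lines above. Everything else about them is assumed, and this is not the kernel code.

```c
#include <stddef.h>

/* Hypothetical stand-ins; only the field names come from the BUG_ON() lines. */
struct sketch_inotify_data { unsigned int mask; };
struct sketch_inode        { struct sketch_inotify_data *inotify_data; };
struct sketch_watch        { struct sketch_inode *inode; };
struct sketch_device       { int unused; };

/*
 * Walk the chain in the same order as the suggested checks.
 * Returns 0 if every pointer is valid, otherwise the 1-based number
 * of the first check that would trip.
 */
int first_failed_check(const struct sketch_device *dev,
                       const struct sketch_watch *watch)
{
    if (!dev)
        return 1;   /* BUG_ON(!dev) */
    if (!watch)
        return 2;   /* BUG_ON(!watch) */
    if (!watch->inode)
        return 3;   /* BUG_ON(!watch->inode) */
    if (!watch->inode->inotify_data)
        return 4;   /* BUG_ON(!watch->inode->inotify_data) */
    return 0;
}
```

Each check only runs once the previous pointer is known to be valid, so the function never dereferences NULL itself.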

Fabio Massimo Di Nitto (fabbione) wrote :

ok guys.. i need you to test a new inotify patch for Robert and me.

Kernels are here:

people.ubuntu.com/~fabbione/inotify/

note that you must reboot to use this kernel and it is supposed to
print all the debugging info in dmesg/logs.

The inotify patch itself is also brand new, so it might not even have the problem.

Please test and let me know asap. If it fails attach the logs to this bug.

Thanks
Fabio

Fabio Massimo Di Nitto (fabbione) wrote :

Sorry.. disregard the previous message and do NOT install that kernel.
It is definitely more buggy than the previous one.

I can now reproduce the problem locally, so debugging will happen slightly faster.

Fabio

Javier Barroso (javibarroso) wrote :

OK, if you want me to post more information about the bug here, tell me how to
get it and I'll do it.

Thank you again!
(In reply to comment #13)
> Sorry.. disregard the previous message and do NOT install that kernel.
> It is definitely more buggy than the previous one.
>
> I can now reproduce the problem locally, so debugging will happen slightly faster.
>
> Fabio

Fabio Massimo Di Nitto (fabbione) wrote :

*** Bug 12496 has been marked as a duplicate of this bug. ***

Fabio Massimo Di Nitto (fabbione) wrote :

*** Bug 11858 has been marked as a duplicate of this bug. ***

Fabio Massimo Di Nitto (fabbione) wrote :

OK, this is now confirmed. All the USB problems we are experiencing with "unplug
while mounted" are related to inotify. I am working hard with upstream to get
this right.

As a temporary workaround please be sure to a) close all your nautilus windows
and b) umount before unplugging.

Fabio

Fabio Massimo Di Nitto (fabbione) wrote :

I have uploaded 2.6.10-15, which implements a 'noinotify' boot option.
People affected by this bug can use it as a temporary workaround to avoid
the crashes. Upstream is working hard to fix the problem (apparently it
is already fixed in 2.6.11-rc3-mm2).

Christian Meyer (chrisime-gnome-de) wrote :

Thanks for your hard work, guys. For me kernel 2.6.10-14 works without any crashes.

Fabio Massimo Di Nitto (fabbione) wrote :

*** Bug 12678 has been marked as a duplicate of this bug. ***

Javier Barroso (javibarroso) wrote :

(In reply to comment #19)
> Thanks for your hard work, guys. For me kernel 2.6.10-14 works without any crashes.

Unmounting a VFAT partition while nautilus is still browsing it still hangs with
2.6.10-15. I'll try the noinotify boot option.

Javier Barroso (javibarroso) wrote :

(In reply to comment #21)
> (In reply to comment #19)
> > Thanks for your hard work, guys. For me kernel 2.6.10-14 works without any
> > crashes.
>
> Unmounting a VFAT partition while nautilus is still browsing it still hangs with
> 2.6.10-15. I'll try the noinotify boot option.

That worked. I booted with the noinotify option, and it works fine!

What advantages does the inotify feature give the kernel?
Where can I read about it?

Thank you!

Matt Zimmerman (mdz) wrote :

*** Bug 12714 has been marked as a duplicate of this bug. ***

Christian Meyer (chrisime-gnome-de) wrote :

Now, python crashes the kernel. I got the latest 2.6.10.
If you want the log, please tell me.

Christian Meyer (chrisime-gnome-de) wrote :

With the latest kernel ...-17 my system totally freezes when booting into GNOME.
-16 works, except for the occasional crashes, which aren't as severe as they
were some time ago.

John McCutchan (ttb) wrote :

What version of inotify is being included in this kernel? Inotify 0.19 doesn't
have any crashers in it AFAIK

Fabio Massimo Di Nitto (fabbione) wrote :

(In reply to comment #27)
> What version of inotify is being included in this kernel? Inotify 0.19 doesn't
> have any crashers in it AFAIK

0.18 in 2.6.10 and 0.19 in 2.6.11, but I can still reproduce the crash.

dmatrix7 (dmatrix7) wrote :

I can easily reproduce this crash on my 686-smp system and the last kernel I
tried on was 2.6.10-16. I am going to test on 2.6.10-19 today. For me I can
quickly reproduce this crash using Evolution and saving a bunch of attachments.
It happens randomly but usually within a few minutes of saving attachments in
Evolution. I am not using any USB devices on this system, but do have a bunch of
NFS mounts. I get a kernel panic and this log on 2.6.10-16:

Feb 13 11:29:08 qcsdesktop kernel: Unable to handle kernel NULL pointer
dereference at virtual address 00000008
Feb 13 11:29:08 qcsdesktop kernel: printing eip:
Feb 13 11:29:08 qcsdesktop kernel: c01f54d0
Feb 13 11:29:08 qcsdesktop kernel: *pde = 00000000
Feb 13 11:29:08 qcsdesktop kernel: Oops: 0000 [#1]
Feb 13 11:29:08 qcsdesktop kernel: PREEMPT SMP
Feb 13 11:29:08 qcsdesktop kernel: Modules linked in: proc_intf freq_table
cpufreq_userspace cpufreq_ondemand cpufreq_powersave nfsd exportfs video
sony_acpi pcc_acpi button battery container ac nfs lockd sunrpc af_packet e1000
i2c_i801 i2c_core piix hw_random uhci_hcd usbcore floppy pcspkr rtc md dm_mod
capability commoncap parport_pc lp parport evdev tsdev ide_generic ide_disk
ide_cd ide_core cdrom mousedev psmouse ext3 jbd mbcache sd_mod gdth scsi_mod
unix thermal processor fan fbcon crc32 font bitblit vesafb cfbcopyarea cfbimgblt
cfbfillrect
Feb 13 11:29:08 qcsdesktop kernel: CPU: 1
Feb 13 11:29:08 qcsdesktop kernel: EIP:
0060:[inotify_dev_queue_event+119/384] Not tainted VLI
Feb 13 11:29:08 qcsdesktop kernel: EFLAGS: 00010283 (2.6.10-3-686-smp)
Feb 13 11:29:08 qcsdesktop kernel: EIP is at inotify_dev_queue_event+0x77/0x180
Feb 13 11:29:08 qcsdesktop kernel: eax: 00000000 ebx: f7157c00 ecx: 00000010
  edx: f6f0c3a8
Feb 13 11:29:08 qcsdesktop kernel: esi: f7157c10 edi: 00000010 ebp: 00000000
  esp: e6871ee0
Feb 13 11:29:08 qcsdesktop kernel: ds: 007b es: 007b ss: 0068
Feb 13 11:29:08 qcsdesktop kernel: Process evolution-2.2 (pid: 12349,
threadinfo=e6870000 task=e6b41020)
Feb 13 11:29:08 qcsdesktop kernel: Stack: 000f4e47 e6b41020 e6b41188 00000001
00000000 f44d1518 00000000 f6f0c3a8
Feb 13 11:29:08 qcsdesktop kernel: f6f0c3a8 f4adc1ec f44d1518 00000000
c01f5b6d f7157c00 f6f0c3a8 00000010
Feb 13 11:29:08 qcsdesktop kernel: 00000000 f44d1518 f44d199c f44d14b8
f4adc1ec f44d1518 c01f5c0c f4adc1ec
Feb 13 11:29:08 qcsdesktop kernel: Call Trace:
Feb 13 11:29:08 qcsdesktop kernel: [inotify_inode_queue_event+87/129]
inotify_inode_queue_event+0x57/0x81
Feb 13 11:29:08 qcsdesktop kernel: [inotify_dentry_parent_queue_event+117/181]
inotify_dentry_parent_queue_event+0x75/0xb5

Sebastien Bacher (seb128) wrote :

*** Bug 13049 has been marked as a duplicate of this bug. ***

dmatrix7 (dmatrix7) wrote :

I think this bug can be closed for me. I upgraded to this kernel yesterday and
am unable to reproduce the issue anymore. I have now been running this kernel
for 24 hours with no issues.

linux-image-2.6.10-3-686-smp 2.6.10-19

dmatrix7 (dmatrix7) wrote :

Guess I spoke too soon. This bug is back for me, but I cannot trigger it with
Evolution anymore. This time I triggered it by adding a large music folder to
Rhythmbox. Same type of error: inotify_dev_queue_event. The system is useless
after this and requires a reboot back into 2.6.8.

John McCutchan (ttb) wrote :

I only have a UP system, so I've never tried inotify under SMP. But I know there
are lots of locking problems in inotify, and you are bound to run into them
using an SMP system. Robert Love and I have worked out the locking-bug-free
algorithms and Robert is busy implementing them in code as we speak. Can you
reproduce this on a UP system?

John McCutchan (ttb) wrote :

Could you try and get the latest patch from rml to crash? It was just sent to
the lists.

Chuck Short (zulcss) wrote :

Yes it is on my todo list for this weekend.

Matt Zimmerman (mdz) wrote :

Worked around by disabling inotify by default, so downgrading

Chuck Short (zulcss) wrote :

I have built a kernel with inotify 0.20 and I haven't had any problems yet. No
messages about gamin and no USB media crashing. I have tseng testing as well, and
he reports the same: no problems yet and no USB media crashing.

chuck

Matt Zimmerman (mdz) wrote :

Sounds tempting. Let's discuss at Friday's development meeting whether to
re-enable inotify by default after preview

John McCutchan (ttb) wrote :

Hey everyone, please try out inotify 0.22 -- any problems you've had in the past
should be gone. We are trying to get inotify into 2.6.13, so if you love
inotify, send an email to lkml letting the kernel developers know you want it in
NOW. Thanks and sorry for the spam.

Dave Ahlswede (mightyquinn) wrote :

If someone could make Ubuntu-compatible kernel packages to test the new Inotify
with, that would be great. (Self-compiling initrd kernels is kind of a pain)

Chuck Short (zulcss) wrote :

0.22 will not make it into the hoary release, since hoary will be released *real* soon.

chuck

Matt Zimmerman (mdz) wrote :

<dato> mdz: just in case the follow-up interests you: inotify .22 fixes the
kernel oops/crash I had when umounting cdroms
<dato> which was similar to the ubuntu bug you mentioned, I believe

John McCutchan (ttb) wrote :

Footnote for previous comment: inotify 0.22-2 fixes all known OOPses

Chuck Short (zulcss) wrote :

New kernel will hit the archive soon
