dmraid causes udev event feedback loop in Lucid

Bug #534743 reported by Phillip Susi
This bug affects 4 people
Affects            Status         Importance   Assigned to   Milestone
dmraid (Ubuntu)    Fix Released   High         Unassigned
Lucid              Fix Released   High         Unassigned

Bug Description

Binary package hint: dmraid

Trying to boot Karmic Alpha 3, my system hangs for 3 minutes in the initramfs, then finally times out waiting for udev to settle and drops to a busybox shell. After much investigation, the problem seems to be an infinite udev event feedback loop. Every time udev gets an event, it runs dmraid-activate, which in turn runs dmraid with the -Z switch. This appears to tell it to remove the partitions from the underlying real disk device, which causes a change event to occur. I am not sure why, but in 9.10 only one change event is generated according to udevadm monitor. If dmraid-activate is run on sda, then sda gets a change event, which it appears udev is smart enough not to process recursively. Likewise, when run on sdb, only an sdb change event occurs. In Karmic Alpha 3, a change event is generated for BOTH sda and sdb when dmraid-activate is run on either one, so when sda is processed, it generates a change on sdb, which is then processed and causes a change on sda, which... and so on...

I think the way to fix this is to change the dmraid udev rule to only activate on add events, not change events, but I am not sure.

tags: added: kernel-series-unknown
tags: removed: kernel-series-unknown
Danny Wood (danwood76) wrote :

Do you mean Lucid (10.04) and not Karmic (9.10)?

Phillip Susi (psusi) wrote :

Doh, yes... meant Lucid ;)

summary: - dmraid causes udev event feedback loop in Karmic
+ dmraid causes udev event feedback loop in Lucid
Phillip Susi (psusi) wrote :

This bug really needs to be fixed before Lucid is released or there are going to be a lot of unhappy people. Changing the udev rule for dmraid to run only on add, not add|change, has worked for me, but I think a more correct fix would be to remove the -Z flag from dmraid, which tells it to delete the partitions from the underlying disks, and have udev handle that instead, possibly with partx, when blkid says the disk is a raid member.

Changed in dmraid (Ubuntu):
status: New → Confirmed
Scott James Remnant (Canonical) (canonical-scott) wrote :

Removing the "change" event would mean that dmraid would not get picked up on various devices that need to be constructed a piece at a time, etc.

Why does the change event get triggered in the first place? This could be the inotify watch udev holds, in which case changing dmraid to only open the block device for reading (not writing) would be sufficient. It's very unlikely dmraid actually needs write support.

gadLinux (gad-aguilardelgado) wrote :

Hi there,

Not sure if this is related, but my system with 3 disks, 2 of them in a dmraid RAID-1, takes 5-10 minutes to boot since >8.10.

I saw different udevadm settle errors, and I am posting two screenshots.

Also, my dmraid -r config looks like this:

/dev/sdc: "sil" and "nvidia" formats discovered (using nvidia)!
/dev/sdc: nvidia, "nvidia_fdcaafbf", mirror, ok, 398297086 sectors, data@ 0
/dev/sdb: pdc, "pdc_bbbgggjccd", mirror, ok, 625011328 sectors, data@ 0
/dev/sda: pdc, "pdc_bbbgggjccd", mirror, ok, 625011328 sectors, data@ 0

gadLinux (gad-aguilardelgado) wrote :

Another screenshot. Before the first one... About 2 minutes before and four after initial load.

Phillip Susi (psusi) wrote :

Yep, that's it... tag the "this bug affects me" button.

Phillip Susi (psusi)
tags: added: regression-potential
Phillip Susi (psusi)
Changed in dmraid (Ubuntu):
importance: Undecided → High
milestone: none → ubuntu-10.04
status: Confirmed → Triaged
Phillip Susi (psusi) wrote :

We discussed this afterwards on IRC, but just to clarify in response to SJR's comments:

As the description says, the change events are created by dmraid removing the partitions from the underlying physical disk. So you have:

1) add triggers dmraid
2) dmraid removes partitions, triggering change
3) change activates dmraid, goto step 2.
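
The three steps above can be sketched as a toy simulation. This is only an illustrative model, not dmraid or udev code: the function name simulate, the event queue, and the max_events cap are all assumptions made for the example. It models each dmraid-activate run as emitting one change event per member disk, as the description reports for Lucid.

```python
from collections import deque


def simulate(rule_actions, disks=("sda", "sdb"), max_events=20):
    """Toy model of the loop; returns (dmraid_runs, settled).

    rule_actions: set of udev actions the (hypothetical) dmraid rule matches.
    Every dmraid-activate run is modelled as removing partitions and thereby
    queueing a "change" event for each member disk.
    settled is False when the event cap is hit before the queue drains.
    """
    queue = deque(("add", disk) for disk in disks)  # initial hotplug events
    runs = 0
    handled = 0
    while queue:
        if handled >= max_events:
            return runs, False  # never settled, like the udev settle timeout
        action, _disk = queue.popleft()
        handled += 1
        if action in rule_actions:
            runs += 1  # step 1 or 3: the rule fires dmraid-activate
            # step 2: partition removal triggers change events
            queue.extend(("change", disk) for disk in disks)
    return runs, True


print(simulate({"add", "change"}))  # (20, False): loops until the cap
print(simulate({"add"}))            # (2, True): one run per disk, then quiet
```

With the rule matching only add, the change events are still generated but nothing reacts to them, so the queue drains.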

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package dmraid - 1.0.0.rc16-3ubuntu2

---------------
dmraid (1.0.0.rc16-3ubuntu2) lucid; urgency=low

  * Removed |change udev rule for dmraid-activate to prevent infinite udev
    event loop (LP: #534743).
 -- Phillip Susi <email address hidden> Wed, 07 Apr 2010 11:01:17 -0400

Changed in dmraid (Ubuntu Lucid):
status: Triaged → Fix Released