Boot fails with degraded mdadm raid

Bug #1635049 reported by Grant Slater
This bug affects 25 people
Affects          Status         Importance  Assigned to          Milestone
mdadm (Debian)   Fix Released   Unknown
mdadm (Ubuntu)   Fix Released   High        Dimitri John Ledkov
  Xenial         Fix Released   High        Dimitri John Ledkov

Bug Description

[Impact]

 * Systems fail to boot when mdadm arrays are in certain states (for example, degraded), requiring manual recovery / array assembly

 * Backport of boot logic from yakkety

[Test Case]

 * Install a system with RAID1 and two hard-drives and boot the system with array in-sync
 * Shutdown
 * Disconnect one of the drives and boot; the array is now unexpectedly degraded
 * The boot should complete.
 * Shutdown, and boot again, expecting degraded state.
 * The boot should complete.
 * Shutdown, reconnect disconnected drive, and boot again.
 * The boot should complete, the device should be added back to the array, the array should resync, and the system should end up with the array in-sync, just like at the beginning of the test case.
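
One rough way to check the array state after each boot in the test case above is to look at /proc/mdstat. The sketch below is only an illustration and is not part of the fix: the function name and the sample text are hypothetical, and it assumes the usual /proc/mdstat layout, with an [expected/active] member counter and a "recovery =" or "resync =" progress line while the array rebuilds.

```python
import re

# Hypothetical sample in the usual /proc/mdstat layout (not taken from this bug).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      1048512 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdb2[2] sda2[0]
      20954112 blocks super 1.2 [2/1] [U_]
      [==>..................]  recovery = 12.6% (2640000/20954112) finish=1.5min speed=200000K/sec

md2 : active raid1 sda3[0]
      5238784 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""


def array_states(mdstat_text):
    """Map each md array name to 'in-sync', 'degraded' or 'resyncing'."""
    states = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\S+)\s*:", line)
        if header:
            # Start of a new array stanza, e.g. "md0 : active raid1 ..."
            current = header.group(1)
            states[current] = "unknown"
            continue
        if current is None:
            continue
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if counts:
            # "[2/1]" means 2 members expected, 1 currently active.
            expected, active = int(counts.group(1)), int(counts.group(2))
            states[current] = "in-sync" if active >= expected else "degraded"
        elif re.search(r"\b(recovery|resync)\s*=", line):
            # A progress line means the array is being rebuilt or resynced.
            states[current] = "resyncing"
    return states


def read_states(path="/proc/mdstat"):
    """Read the live file; only meaningful on a Linux system using md."""
    with open(path) as handle:
        return array_states(handle.read())
```

On a test machine, calling read_states() after each boot should report the array as degraded with one drive disconnected, resyncing right after the drive is reconnected, and in-sync once the rebuild has finished.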

[Regression Potential]

 * Systems may continue to fail to boot degraded.

[Other Info]

 * Original report

mdadm does not attempt to start partial md devices (incremental assembly) in the initramfs, which can cause the system to drop to the initramfs prompt if the rootfs is on md.

http://askubuntu.com/questions/789953/how-to-enable-degraded-raid1-boot-in-16-04lts

Fixed in Debian mdadm 3.4-2: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=784070

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mdadm (Ubuntu):
status: New → Confirmed
tags: added: regression-release xenial
Brian Murray (brian-murray) wrote :

This is fixed in 16.10 and Zesty, but I'm leaving the task open so the bug doesn't disappear from some LP searches.

Changed in mdadm (Ubuntu Xenial):
assignee: nobody → Dimitri John Ledkov (xnox)
importance: Undecided → High
status: New → Triaged
Changed in mdadm (Ubuntu):
status: Confirmed → Triaged
Changed in mdadm (Debian):
status: Unknown → Fix Released
Changed in mdadm (Ubuntu):
importance: Undecided → High
assignee: nobody → Dimitri John Ledkov (xnox)
Brian Murray (brian-murray) wrote :

Dimitri - Do you have any plans to get this fixed in Xenial?

Changed in mdadm (Ubuntu):
milestone: none → ubuntu-17.02
chrone (chrone81) wrote :

Will there be any backport patch for Ubuntu 16.04.2?

Just tested this out yesterday, and Ubuntu 16.04.2 with the latest updates still could not boot from the secondary drive.

My test was done on Linux mdadm RAID1 with LVM. (/dev/md0 for /boot xfs, and /dev/md1 for LVM with /root xfs and swap).

description: updated
description: updated
Dimitri John Ledkov (xnox) wrote :

Proposing the following fix:

mdadm (3.3-2ubuntu7.2) xenial; urgency=medium

  * Backport initramfs changes from 3.4-4, to improve reliability of
    booting with degraded arrays. LP: #1635049

  * debian/initramfs/hook:
    - Fix UUID= grep for configured RAIDs to be case insensitive.
    - Drop CREATE stanzas from mkconf and don't include them in the
    initramfs. The generated defaults, are the compiled-in defaults. And
    the current one generates warnings when running mdadm in the
    initramfs, as there is no passwd|group files to resolve root/disk
    uid/gid.
  * debian/initramfs/script.local-block|script.local-bottom:
    - Use local-block integration scripts, in favor of root-fail hooks to
    activate incomplete arrays.
  * debian/initramfs/init-premount|mdadm-functions:
    - Drop, no longer in use.

 -- Dimitri John Ledkov <email address hidden> Mon, 20 Feb 2017 10:57:43 +0000
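
To illustrate the "case insensitive UUID= grep" item in the changelog above: the actual debian/initramfs/hook is a shell script and is not reproduced here. The sketch below is a hypothetical Python equivalent of the idea only, assuming ARRAY lines of the usual mdadm.conf form (ARRAY <device> ... UUID=<hex:hex:hex:hex>); the function names and the sample configuration are made up for the example.

```python
import re

# Match "UUID=" in any letter case, followed by a colon-separated hex UUID.
UUID_RE = re.compile(r"\bUUID=([0-9a-fA-F:]+)", re.IGNORECASE)

# Hypothetical mdadm.conf content; the UUIDs are invented for this example.
SAMPLE_CONF = """\
# mdadm.conf
MAILADDR root
ARRAY /dev/md0 metadata=1.2 UUID=3AAA0122:29827CFA:5331AD66:CA767371 name=host:0
#ARRAY /dev/md9 metadata=1.2 UUID=00000000:11111111:22222222:33333333 name=host:9
"""


def configured_uuids(conf_text):
    """Return the lowercased UUIDs of all active ARRAY lines."""
    uuids = set()
    for line in conf_text.splitlines():
        stripped = line.strip()
        if not stripped.upper().startswith("ARRAY"):
            continue  # skip comments and non-ARRAY lines such as MAILADDR
        match = UUID_RE.search(stripped)
        if match:
            uuids.add(match.group(1).lower())
    return uuids


def is_configured(uuid, conf_text):
    """Compare UUIDs case-insensitively by normalising both sides to lowercase."""
    return uuid.lower() in configured_uuids(conf_text)
```

With a case-sensitive comparison, an array whose UUID is written in uppercase in the configuration would not match a UUID reported in lowercase, and would wrongly be treated as not configured.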

It is available from Bileto:
https://bileto.ubuntu.com/#/ticket/2500

Published in this ephemeral PPA:
https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/2500

Changed in mdadm (Ubuntu):
status: Triaged → Fix Released
Changed in mdadm (Ubuntu Xenial):
status: Triaged → In Progress
milestone: none → xenial-updates
Dragan S. (dragan-s) wrote :

PPA located at:
https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/2500

Fixes my degraded raid boot issues. System boots up fine after this PPA is applied.

Łukasz Zemczak (sil2100) wrote :

I accepted the package from the queue to -proposed, but since it was a sync I couldn't do it through the proper tooling (I think I need to learn how to do it properly for syncs). Anyway, this means that there was no auto-release message. Please install the package from xenial-proposed, test it, and mark the bug as verification-done as with any other SRU bug.

Thank you!

tags: added: verification-needed
Changed in mdadm (Ubuntu Xenial):
status: In Progress → Fix Committed
Adam Blomberg (paradox606) wrote :

Hi Lukasz and Dragan, I just received confirmation from a customer that the proposed package also fixed the issue on their IBM POWER environment.

This was using the ci-train-ppa-service PPA, however, so I'll check again using the proper proposed repository package.

Dragan S. (dragan-s)
tags: added: verification-done
removed: verification-needed
Brian Murray (brian-murray) wrote :

Was the "check again using the proper proposed repository package" made?

tags: added: verification-needed
removed: verification-done
Dimitri John Ledkov (xnox) wrote : Re: [Bug 1635049] Re: Boot fails with degraded mdadm raid

I can do that. However, I am not sure if that is necessary, since this is a
sync from the Bileto PPA, and thus the binaries are identical.

On 1 Mar 2017 22:31, "Brian Murray" <email address hidden> wrote:

Was the "check again using the proper proposed repository package" made?


Dimitri John Ledkov (xnox) wrote :

Retested without the PPA, using only proposed with 3.3-2ubuntu7.2.

tags: added: verification-done
removed: verification-needed
Adam Blomberg (paradox606) wrote :

I have also validated the proposed package on my reproducer sandbox, and it worked correctly.
Still awaiting feedback from the customer with the ppc64le system; as soon as I hear back I will let you know.

-Adam

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mdadm - 3.3-2ubuntu7.2

---------------
mdadm (3.3-2ubuntu7.2) xenial; urgency=medium

  * Backport initramfs changes from 3.4-4, to improve reliability of
    booting with degraded arrays. LP: #1635049

  * debian/initramfs/hook:
    - Fix UUID= grep for configured RAIDs to be case insensitive.
    - Drop CREATE stanzas from mkconf and don't include them in the
    initramfs. The generated defaults, are the compiled-in defaults. And
    the current one generates warnings when running mdadm in the
    initramfs, as there is no passwd|group files to resolve root/disk
    uid/gid.
  * debian/initramfs/script.local-block|script.local-bottom:
    - Use local-block integration scripts, in favor of root-fail hooks to
    activate incomplete arrays.
  * debian/initramfs/init-premount|mdadm-functions:
    - Drop, no longer in use.

 -- Dimitri John Ledkov <email address hidden> Mon, 20 Feb 2017 10:57:43 +0000

Changed in mdadm (Ubuntu Xenial):
status: Fix Committed → Fix Released
Chris Halse Rogers (raof) wrote : Update Released

The verification of the Stable Release Update for mdadm has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Eero (eero+launchpad) wrote :

I just made a fresh install from ubuntu-16.04.2-server-amd64.iso, updated everything, and tested to boot without one disk. It failed. See the attachment for my RAID configuration.

https://imgur.com/a/RApJS

Dimitri John Ledkov (xnox) wrote : Re: [Bug 1635049] Re: Boot fails with degraded mdadm raid

On 5 April 2017 at 07:28, Eero <email address hidden> wrote:
> I just made a fresh install from ubuntu-16.04.2-server-amd64.iso,
> updated everything, and tested to boot without one disk. It failed. See
> the attachment for my RAID configuration.
>
> https://imgur.com/a/RApJS
>

Please open a new bug report, instead of piling onto an unrelated report.

And your boot is waiting for you to unlock the encrypted volume...
only after which the volume groups will be detected.

I do not see anything degraded in your case at all.

Note your test-case is completely different to this bug report as it
also involves encrypted volume.

Regards,

Dimitri.

Eero (eero+launchpad) wrote :

> And your boot is waiting for you to unlock the encrypted volume...
> only after which the volume groups will be detected.

Why do you lie? You didn't even look what I reported?

The first screenshot clearly shows that the boot fails before password is even asked.
The second screenshot shows that the password is asked when I attached the drive again.

I've submitted a bug report regarding this to Ubuntu and Debian, but nobody seems to care. It's 2017 and RAID 1 doesn't work...

https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1680448
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=859691

Dimitri John Ledkov (xnox) wrote :

On 10 April 2017 at 16:57, Eero <email address hidden> wrote:
>> And your boot is waiting for you to unlock the encrypted volume...
>> only after which the volume groups will be detected.
>
> Why do you lie? You didn't even look what I reported?
>

File a new bug report with text logs.... not photographs / screenshots.

> The first screenshot clearly shows that the boot fails before password is even asked.
> The second screenshot shows that the password is asked when I attached the drive again.
>
> I've submitted a bug report regarding this to Ubuntu and Debian, but
> nobody seems to care. It's 2017 and RAID 1 doesn't work...
>
> https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1680448
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=859691

--
Regards,

Dimitri.

Eero (eero+launchpad) wrote :

> File a new bug report with text logs.... not photographs / screenshots.

And this is a perfect example why Ubuntu is a piece of garbage. Someone reports a serious issue with Ubuntu, but the whole bug gets dismissed, because the reports are in a "wrong" format.

How do you even get early boot logs out of Ubuntu when the boot fails? You didn't even mention that.

Is there some way to contact professionals at Canonical instead of these amateur wise asses on this platform?

Eero (eero+launchpad) wrote :

From Ubuntu's own documentation https://wiki.ubuntu.com/DebuggingKernelBoot:

> If you are unable to capture a log file, a digital photo will work just as well.

Hopefully someone competent will see these messages at some point.

Ryan C. Underwood (nemesis-icequake) wrote :

This is still broken in Xenial with mdadm 3.3-2ubuntu7.2

Rüdiger Schernthaner (res80-deactivatedaccount) wrote :

It seems that this is still somehow broken with Xubuntu 17.04

I just tried mdadm within Xubuntu 17.04 in Virtualbox using the following setup:
1 virtual disk containing the OS.
2 virtual disks running as a RAID 1.
Whenever I disconnect one of the RAID disks, Xubuntu cannot boot.

I wrote the script shown here:
https://askubuntu.com/questions/789953/how-to-enable-degraded-raid1-boot-in-16-04lts
to the file /usr/share/initramfs-tools/scripts/local-top/mdadm
After that, Xubuntu can boot properly with the degraded RAID.

Is this the intended behaviour?
Thanks

Eero (eero+launchpad) wrote :

Is this bug report abandoned?

And why is the bug assigned to Dimitri John Ledkov? He is obviously incompetent. In comment number 18 he even claimed that photographs and screenshots aren't acceptable in bug reports even though Ubuntu's website clearly states otherwise.

Kevin Lyda (lyda) wrote :

Hey there kids, this bug still appears to be relevant for 16.04. My /dev/sda died today and I'm prepping to replace the disk. I note that the answer here https://askubuntu.com/a/798213/185653 notes the missing file and it's still missing.

I haven't tried a reboot as I'm waiting for the monthly check to complete on the other array but I've added in that script and rebuilt my initramfs and installed grub on sdb. Will let folks know how I get on.

The tone and comments in this bug from some reporters and others is... poor. I would have hoped for better.

Kevin Lyda (lyda) wrote :

This did not work, but I'm not clear why. It might have been me messing up the grub install.
