zfs initrd script fails when rootdelay boot option is set

Bug #1577057 reported by Sam Van den Eynde
This bug affects 2 people
Affects: zfs-linux (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Richard Laager

Bug Description

It looks like, when booting off ZFS (ZFS holds /boot) with the rootdelay boot option set, the boot process fails in the initrd phase, asking to manually import the pool using zpool import -f -R / -N. I only had one system with that parameter set, which I seldom reboot.

I did not find an upstream reference of this bug or behavior.

The error is caused by the fact that the pool is already imported: "zpool status" executed at the initramfs prompt will correctly list the pool and all devices online. To continue, one has to export the pool, re-import it, and exit the initramfs prompt, after which regular booting continues. Not exporting and re-importing it leaves the pool read-only, leading to boot errors further down the road (systemd units failing).
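
For reference, the manual recovery from the initramfs prompt looks like this (the pool on this system is named "systempool"; adjust for your own pool name):

 # export the stale read-only import, re-import, then continue booting
 zpool export systempool
 zpool import -f -R / -N systempool
 exit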

I noticed zfs_autoimport_disable is set to 1 in the initramfs environment, so looking at /usr/share/initramfs-tools/scripts/zfs, this section might be the issue: zpool import succeeds, but $ZFS_HEALTH never comes back with the correct status (I'm not a programmer, but perhaps ZFS_HEALTH is a local variable in the zfs_test_import function):

 delay=${ROOTDELAY:-0}

 if [ "$delay" -gt 0 ]
 then
  # Try to import the pool read-only. If it does not import with
  # the ONLINE status, wait and try again. The pool could be
  # DEGRADED because a drive is really missing, or it might just
  # be slow to be detected.
  zfs_test_import
  retry_nr=0
  while [ "$retry_nr" -lt "$delay" ] && [ "$ZFS_HEALTH" != "ONLINE" ]
  do
   [ "$quiet" != "y" ] && log_begin_msg "Retrying ZFS read-only import"
   /bin/sleep 1
   zfs_test_import
   retry_nr=$(( $retry_nr + 1 ))
   [ "$quiet" != "y" ] && log_end_msg
  done
  unset retry_nr
  unset ZFS_HEALTH
 fi
 unset delay

Edit: to be clear: I removed the rootdelay parameter, regenerated the initrd, and was able to boot successfully afterwards.

description: updated
description: updated
Revision history for this message
Richard Laager (rlaager) wrote :

What did you have rootdelay set to? I just tested again now with "rootdelay=1" as well as the useless "rootdelay=0" and invalid "rootdelay=".

Try adding "set -x" to the top of /usr/share/initramfs-tools/scripts/zfs, rebuilding the init script, and rebooting with rootdelay set. Grab a picture of the output so I can see what it's running just before it fails.

Changed in zfs-linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Richard Laager (rlaager) wrote :

In the last comment, I meant "rebuilding the initrd", of course, not "rebuilding the init script".

Revision history for this message
Sam Van den Eynde (samvde) wrote :

I had it set to 200; this was a configuration I forgot to remove after I had to run on external USB drives for a while with this machine when the internal disk failed (long before ZFS and this install).

Will try later today.

Revision history for this message
Richard Laager (rlaager) wrote :

It works for me even with rootdelay=200. If you want to hit me up on #zfsonlinux (on FreeNode), I'll try to help debug this.

Revision history for this message
Sam Van den Eynde (samvde) wrote :

I was not at the machine yesterday to test but it is strange indeed. The system would never boot without re-importing the pool manually. It was 100% consistent.

When I noticed the rootdelay option still set, I removed it manually from the grub command line the first time, which worked. Then I adapted /etc/default/grub and updated grub to make it permanent. Since then, no more boot issues. That's when I went to check the initramfs zfs script.

I just tried it out quickly in a VM and I can't reproduce it there either :-/

I'll check this out tonight, as I'm now really puzzled by this. Other than the rootdelay option, nothing changed.

Revision history for this message
Richard Laager (rlaager) wrote :

Did you have a zpool.cache or modified ZFS script from testing on that other bug?

The production script unconditionally exports the pool after a read-only import. There shouldn't be any way to end up with a pool still imported unless that export fails for some reason, which seems unlikely.

Changed in zfs-linux (Ubuntu):
status: Incomplete → Invalid
Revision history for this message
Sam Van den Eynde (samvde) wrote :

I was able to reproduce on the system itself. Add rootdelay=3 to the boot parameters and the system drops into the initramfs shell with the zpool imported, complaining it is not imported yet. Re-importing is needed to get it to mount read-write. Leave rootdelay out and the system boots fine. Tried several times, consistently seeing the error.

So I decided not to touch the initrd yet for troubleshooting reasons. It does not contain a zpool.cache file, and I'll attach the zfs script for your reference (I did not alter it). The grub bootline is as follows:

linux /System@/boot/vmlinuz-4.4.0-21-generic root=ZFS=systempool/System ro zswap.enabled=1 zswap.max_pool_percent=25 elevator=noop
initrd /System@/boot/initrd.img-4.4.0-21-generic

Revision history for this message
Richard Laager (rlaager) wrote :

If you get into the broken state, what happens if you run:
zpool list -H -o health "$ZFS_RPOOL"
zpool export "$ZFS_RPOOL"

Try that literally first. If $ZFS_RPOOL is unset, then replace it by hand with the name of your pool.

Changed in zfs-linux (Ubuntu):
status: Invalid → Incomplete
Revision history for this message
Sam Van den Eynde (samvde) wrote :

After setting ZFS_RPOOL manually:

zpool list ..... returns "ONLINE"
zpool export exports the pool without any console message.

I'm going to add some printfs to the zfs script to see where the boot process goes wrong and get back to you. I'll keep the current initrd for reference.

Revision history for this message
Sam Van den Eynde (samvde) wrote :

Ok, pinned it down.

The problem appears to be very simple: when rootdelay is used, the zfs_test_import() function is called, which does a read-only "zpool import" immediately followed by a "zpool export". This import/export sequence goes too fast for my system (which indeed has low specs).

So: the export fails with a generic "dataset is busy" message. The zpool health status is "ONLINE" prior to the export. A delay of merely 5 seconds before exporting in zfs_test_import() fixes the issue in my case. The "delay", "ZFS_RPOOL", "ZFS_BOOTFS" and "retry_nr" variables were all set correctly on each iteration.
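
Roughly, the sequence inside zfs_test_import() looks like this (simplified; the exact flags in the script may differ, and the sleep is my workaround, not part of the original):

 zpool import -f -o readonly=on -N "$ZFS_RPOOL"      # read-only test import
 ZFS_HEALTH=$(zpool list -H -o health "$ZFS_RPOOL")  # reports "ONLINE"
 /bin/sleep 5                                        # workaround: give the pool time to settle
 zpool export "$ZFS_RPOOL"                           # without the sleep, fails with "dataset is busy"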

I see two possible angles to this:
- The pool returns status "ONLINE" but does not accept certain commands. Is the pool status wrong, or the interpretation of the "ONLINE" status? Quite important when relying on this in scripts.
- Should we not check whether the zpool export goes bad in the init script, or at least give it a bit more time, especially since the rootdelay parameter is already used on systems that are expected to require longer initialization times?

The second one is the short-term option, I guess.

I now boot without the rootdelay parameter, as it was a leftover configuration I had forgotten about, but I'd be happy to test any new versions of the script, since I can reproduce this quite easily.

Revision history for this message
Richard Laager (rlaager) wrote :

When it breaks, are any datasets mounted? Run `cat /proc/mounts`.

Is a delay of 1 second sufficient (in your limited testing)?

Revision history for this message
Richard Laager (rlaager) wrote :

Also, can you test with this updated /usr/share/initramfs-tools/scripts/zfs? Make sure to update the initrd, of course.

Changed in zfs-linux (Ubuntu):
status: Incomplete → Confirmed
assignee: nobody → Richard Laager (rlaager)
Revision history for this message
Sam Van den Eynde (samvde) wrote :

Here are the test results:

- No datasets mounted when it breaks.
- A delay of 1 second is not enough: it breaks on the first try.
- Retested with a 5-second delay: OK.
- Retested with the new zfs script: not OK, again the "pool is busy" error.

I had a quick look at the script and I found the remaining problem: the command ZFS_STDERR=$(zpool export "$ZFS_RPOOL" >/dev/null) will not capture stderr messages. I adapted this to ZFS_STDERR=$(zpool export "$ZFS_RPOOL" 2>&1 >/dev/null) and was able to boot correctly using rootdelay=10. So I think that solves it.
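
The order of the redirections is what matters here:

 # stderr still goes to the console; the substitution captures nothing
 ZFS_STDERR=$(zpool export "$ZFS_RPOOL" >/dev/null)
 # stderr is duplicated onto the captured stream first, then stdout is discarded
 ZFS_STDERR=$(zpool export "$ZFS_RPOOL" 2>&1 >/dev/null)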

Richard Laager (rlaager)
Changed in zfs-linux (Ubuntu):
status: Confirmed → In Progress
Revision history for this message
Richard Laager (rlaager) wrote :

@svde-tech: Can you please test from this PPA:
https://launchpad.net/~rlaager/+archive/ubuntu/zfs

The init script is slightly different from my last iteration. I removed one unrelated change. I also applied your fix, but stopped suppressing stdout (since that's probably unnecessary). Besides making sure it works with rootdelay, please watch the terminal and confirm you're not seeing any spurious output from zpool commands.
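
In other words, the export capture presumably ends up as something like the following (roughly; the PPA code may differ in detail):

 ZFS_STDERR=$(zpool export "$ZFS_RPOOL" 2>&1)   # capture stderr; stdout no longer suppressed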

If you do test from my PPA, please remove it from your APT sources when you're done testing. I don't want some future experiment I upload to break your system.

Revision history for this message
Sam Van den Eynde (samvde) wrote :

No errors encountered using the PPA.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package zfs-linux - 0.6.5.7-0ubuntu2

---------------
zfs-linux (0.6.5.7-0ubuntu2) yakkety; urgency=medium

  * Minor changes to satisfy MIR request (LP: #1579082)
    [ Richard Laager ]
    - zfs-zed: Version "Replaces: zfsutils-linux" and add versioned
     "Breaks: zfsutils-linux"
    - explicitly specify the man pages so we don't include zed.8
    [ Colin Ian King ]
    - remove root ALL=(ALL) ALL and R_ROOT from
      debian/tree/zfsutils-linux/etc/sudoers.d/zfs. The R_ROOT serves
      no useful purpose, complicates people actually using the sudoers
      file, and pollutes the global namespace
    - add init.d lintian overrides for zfsutils-linux and zfs-zed
      as initscripts will never be used and we will rebase with the
      Debian package initscripts once the package gets accepted by Debian
  [ Richard Laager ]
  * zfs-initramfs: When rootdelay is set, retry the `zpool export` command
    until it stops returning "is busy" (or the rootdelay expires).
    (LP: #1577057)

 -- Colin Ian King <email address hidden> Thu, 26 May 2016 15:28:13 +0100
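
In sketch form, the retry described in that changelog entry looks roughly like this (not the literal packaged code):

 # keep retrying the export while it reports "is busy", up to the rootdelay
 retry_nr=0
 while [ "$retry_nr" -lt "$delay" ]
 do
  ZFS_STDERR=$(zpool export "$ZFS_RPOOL" 2>&1)
  ZFS_ERROR=$?
  if [ "$ZFS_ERROR" -eq 0 ] || ! echo "$ZFS_STDERR" | grep -q 'is busy'
  then
   break
  fi
  /bin/sleep 1
  retry_nr=$(( $retry_nr + 1 ))
 done
 unset retry_nr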

Changed in zfs-linux (Ubuntu):
status: In Progress → Fix Released
Changed in zfs-linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
dreamcat4 (dreamcat4) wrote :

Hi. Just to say thanks for fixing this. However, the bug has occurred for me very recently on 16.04 LTS, both 16.04.1 (4.4 kernel) and 16.04.2 (4.8 kernel).

So at this time this was not fixed for 16.04, which aligns with there being no fix target for 16.04 in the bug description. Regardless, this report was still a great help: simply removing the 'rootdelay=' parameter from the boot flags solved my issue. Many thanks.

Revision history for this message
Richard Laager (rlaager) wrote :

This has presumably regressed in 17.04 as they replaced the initrd code (somewhat by accident). I'm going to review all this stuff and get some fixes in soon, so it should be re-fixed by the next LTS.
