When in a multipath, boot from SAN environment, update-grub will make the system unbootable

Bug #688261 reported by Peter Petrakis
This bug affects 1 person
Affects: grub2 (Ubuntu)
Status: Incomplete
Importance: High
Assigned to: Colin Watson

Bug Description

Binary package hint: grub2

This was discovered on a bare-metal install of Lucid 10.04.1 (amd64) on an Intel IMS blade,
which uses an integrated Promise SAN with ALUA support (active/standby paths).

To reproduce:
1) Install the OS to a normal, non-multipath device.
2) Once the OS is installed, add an appropriate multipath.conf, and add scsi_dh_alua to
/etc/initramfs-tools/modules.
2a) Install multipath-tools and multipath-tools-boot.
3) Reboot.
4) Run update-grub, whether as a result of installing a new kernel, simply adjusting the
argument list, or for no reason at all.
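
The steps above can be sketched as a script. This is a dry-run sketch only, not the reporter's actual procedure: the `run` helper and both function names are hypothetical, `update-initramfs -u` is an assumption that the report does not list, and writing a site-specific /etc/multipath.conf is left as a comment. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Steps 2, 2a and 3: enable multipath, then reboot.
prepare_multipath() {
    # Step 2: /etc/multipath.conf is site-specific and must be written by hand.
    run sh -c 'echo scsi_dh_alua >> /etc/initramfs-tools/modules'
    # Step 2a
    run apt-get install -y multipath-tools multipath-tools-boot
    # Assumption: rebuild the initramfs so it picks up the module list.
    run update-initramfs -u
    # Step 3
    run reboot
}

# Step 4: run after the reboot; this is what rewrites grub.cfg.
trigger_bug() {
    run update-grub
}
```

Keeping the default as a dry run is deliberate, since step 4 is exactly what leaves the system unbootable. Back up /boot/grub/grub.cfg before running it for real.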

At this point, the scripts seem unable to determine the UUID of the backing store
and instead insert the multipath (MP) device name into /boot/grub/grub.cfg. GRUB isn't configured
to handle MP, so on the next boot it simply fails and the system is unbootable. grub-probe
appears to be the guilty party.
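
A quick way to spot the symptom before rebooting is to look for kernel lines whose root= argument points at /dev/mapper. The following is a minimal sketch; the function name is hypothetical, and the check is only a heuristic: on LVM systems root=/dev/mapper/... is legitimate, so a match is a reason to inspect the file, not proof of this bug.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if any "linux" or "linux16" line in the given
# grub.cfg passes root=/dev/mapper/... to the kernel, non-zero otherwise.
grub_cfg_has_mapper_root() {
    grep -Eq '^[[:space:]]*linux(16)?[[:space:]].*root=/dev/mapper/' "$1"
}
```

For example, `grub_cfg_has_mapper_root /boot/grub/grub.cfg && echo "check grub.cfg before rebooting"` could be run right after update-grub.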

A trace of update-grub with "set -x" in the supporting scripts is attached in file grub-mp-failure.log

Here are the diffs between the working grub.cfg and the unbootable one. Each grub config
and initrd was checkpointed along the way and is included in grub-initrd-snapshots.tgz.

$ diff -u grub.cfg-post-mp-install grub.cfg.mp-as-root-ug
--- grub.cfg-post-mp-install 2010-12-09 15:37:21.332495496 -0500
+++ grub.cfg.mp-as-root-ug 2010-12-09 15:46:28.632432748 -0500
@@ -55,19 +55,15 @@
 ### BEGIN /etc/grub.d/10_linux ###
 menuentry 'Ubuntu, with Linux 2.6.32-26-generic' --class ubuntu --class gnu-linux --class gnu --class os {
        recordfail
- insmod ext2
- set root='(hd0,1)'
- search --no-floppy --fs-uuid --set 0cfc57d4-da18-47e5-97c6-e6e5915d2698
- linux /boot/vmlinuz-2.6.32-26-generic root=UUID=0cfc57d4-da18-47e5-97c6-e6e5915d2698 ro console=tty0 console=ttyS0,115200n8
+ set root='(2225700015522652d-part1)'
+ linux /boot/vmlinuz-2.6.32-26-generic root=/dev/mapper/2225700015522652d-part1 ro console=tty0 console=ttyS0,115200n8
        initrd /boot/initrd.img-2.6.32-26-generic
 }
 menuentry 'Ubuntu, with Linux 2.6.32-26-generic (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os {
        recordfail
- insmod ext2
- set root='(hd0,1)'
- search --no-floppy --fs-uuid --set 0cfc57d4-da18-47e5-97c6-e6e5915d2698
+ set root='(2225700015522652d-part1)'
        echo 'Loading Linux 2.6.32-26-generic ...'
- linux /boot/vmlinuz-2.6.32-26-generic root=UUID=0cfc57d4-da18-47e5-97c6-e6e5915d2698 ro single
+ linux /boot/vmlinuz-2.6.32-26-generic root=/dev/mapper/2225700015522652d-part1 ro single
        echo 'Loading initial ramdisk ...'
        initrd /boot/initrd.img-2.6.32-26-generic
 }
@@ -75,15 +71,11 @@

 ### BEGIN /etc/grub.d/20_memtest86+ ###
 menuentry "Memory test (memtest86+)" {
- insmod ext2
- set root='(hd0,1)'
- search --no-floppy --fs-uuid --set 0cfc57d4-da18-47e5-97c6-e6e5915d2698
+ set root='(2225700015522652d-part1)'
        linux16 /boot/memtest86+.bin
 }
 menuentry "Memory test (memtest86+, serial console 115200)" {
- insmod ext2
- set root='(hd0,1)'
- search --no-floppy --fs-uuid --set 0cfc57d4-da18-47e5-97c6-e6e5915d2698
+ set root='(2225700015522652d-part1)'
        linux16 /boot/memtest86+.bin console=ttyS0,115200n8
 }
 ### END /etc/grub.d/20_memtest86+ ###

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: grub-common 1.98-1ubuntu9 [modified: usr/sbin/grub-mkconfig]
ProcVersionSignature: Ubuntu 2.6.32-26.48-generic 2.6.32.24+drm33.11
Uname: Linux 2.6.32-26-generic x86_64
Architecture: amd64
Date: Thu Dec 9 23:51:05 2010
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: grub2

Changed in grub2 (Ubuntu):
status: New → Confirmed
Changed in grub2 (Ubuntu):
assignee: nobody → Serge Hallyn (serge-hallyn)
importance: Undecided → High
Peter Petrakis (peter-petrakis) wrote :

Tried the new grub from Natty, 1.99~20101126-1ubuntu3 with similar results.

# dpkg -i grub*deb
(Reading database ... 41361 files and directories currently installed.)
Preparing to replace grub-common 1.98-1ubuntu9 (using grub-common_1.99~20101126-1ubuntu3_amd64.deb) ...
Unpacking replacement grub-common ...
Replacing files in old package grub-pc ...
Preparing to replace grub-pc 1.98-1ubuntu9 (using grub-pc_1.99~20101126-1ubuntu3_amd64.deb) ...
Unpacking replacement grub-pc ...
Setting up grub-common (1.99~20101126-1ubuntu3) ...
Installing new version of config file /etc/grub.d/00_header ...
Installing new version of config file /etc/grub.d/10_linux ...
Installing new version of config file /etc/grub.d/30_os-prober ...

Processing triggers for man-db ...
Processing triggers for install-info ...
Processing triggers for ureadahead ...
Setting up grub-pc (1.99~20101126-1ubuntu3) ...
Installing new version of config file /etc/grub.d/05_debian_theme ...
Removing update-grub hooks from /etc/kernel-img.conf in favour of
/etc/kernel/ hooks.
Replacing config file /etc/default/grub with new version
/usr/sbin/grub-probe: error: no such disk.
Auto-detection of a filesystem of /dev/mapper/2225700015522652d-part1 failed.
Please report this together with the output of "/usr/sbin/grub-probe --device-map="/boot/grub/device.map" --target=fs -v /boot/grub" to <email address hidden>
Generating grub.cfg ...
/usr/sbin/grub-probe: error: no such disk.
dpkg: error processing grub-pc (--install):
 subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
 grub-pc

Serge Hallyn (serge-hallyn) wrote :

As recommended in #grub, I grabbed the latest 'experimental' bzr
branch, compiled it locally and ran it, with no success.

Serge Hallyn (serge-hallyn) wrote :

After hours and hours trying to get grub-probe to do the right thing,
I've decided to temporarily work around it instead. The package is still
compiling, but hand-applied changes to the installed scripts allowed
me to run update-grub on a multipath-enabled system. Patch:

diff -Nur -x '*.orig' -x '*~' grub2-mp1/util/grub-mkconfig.in grub2-mp1.new/util/grub-mkconfig.in
--- grub2-mp1/util/grub-mkconfig.in 2010-12-10 10:53:15.000000000 -0500
+++ grub2-mp1.new/util/grub-mkconfig.in 2010-12-10 22:36:19.658999131 -0500
@@ -120,15 +120,18 @@

 # Device containing our userland. Typically used for root= parameter.
 GRUB_DEVICE="`${grub_probe} --target=device /`"
-GRUB_DEVICE_UUID="`${grub_probe} --device ${GRUB_DEVICE} --target=fs_uuid 2> /dev/null`" || true
+#GRUB_DEVICE_UUID="`${grub_probe} --device ${GRUB_DEVICE} --target=fs_uuid 2> /dev/null`" || true
+GRUB_DEVICE_UUID="`blkid ${GRUB_DEVICE} | sed -e 's/ /\n/g' | grep UUID | cut -d \\" -f 2`"

 # Device containing our /boot partition. Usually the same as GRUB_DEVICE.
 GRUB_DEVICE_BOOT="`${grub_probe} --target=device /boot`"
-GRUB_DEVICE_BOOT_UUID="`${grub_probe} --device ${GRUB_DEVICE_BOOT} --target=fs_uuid 2> /dev/null`" || true
+#GRUB_DEVICE_BOOT_UUID="`${grub_probe} --device ${GRUB_DEVICE_BOOT} --target=fs_uuid 2> /dev/null`" || true
+GRUB_DEVICE_BOOT_UUID="`blkid ${GRUB_DEVICE_BOOT} | sed -e 's/ /\n/g' | grep UUID | cut -d \\" -f 2`"

 # Filesystem for the device containing our userland. Used for stuff like
 # choosing Hurd filesystem module.
-GRUB_FS="`${grub_probe} --target=fs / 2> /dev/null || echo unknown`"
+#GRUB_FS="`${grub_probe} --target=fs / 2> /dev/null || echo unknown`"
+GRUB_FS="`blkid ${GRUB_DEVICE} | sed -e 's/ /\n/g' | grep TYPE | cut -d \\" -f 2`"

 if test -f ${sysconfdir}/default/grub ; then
   . ${sysconfdir}/default/grub
diff -Nur -x '*.orig' -x '*~' grub2-mp1/util/grub-mkconfig_lib.in grub2-mp1.new/util/grub-mkconfig_lib.in
--- grub2-mp1/util/grub-mkconfig_lib.in 2010-12-10 10:53:15.000000000 -0500
+++ grub2-mp1.new/util/grub-mkconfig_lib.in 2010-12-10 22:38:15.778704576 -0500
@@ -65,9 +65,9 @@
   fi

   # abort if file is in a filesystem we can't read
- if ${grub_probe} -t fs $path > /dev/null 2>&1 ; then : ; else
- return 1
- fi
+ #if ${grub_probe} -t fs $path > /dev/null 2>&1 ; then : ; else
+ # return 1
+ #fi

   return 0
 }
@@ -108,20 +108,23 @@
   device=$1

   # Abstraction modules aren't auto-loaded.
- abstraction="`${grub_probe} --device ${device} --target=abstraction`"
+ #abstraction="`${grub_probe} --device ${device} --target=abstraction`"
+ abstraction=""
   for module in ${abstraction} ; do
     echo "insmod ${module}"
   done

- fs="`${grub_probe} --device ${device} --target=fs`"
+ #fs="`${grub_probe} --device ${device} --target=fs`"
+ fs="`blkid ${device} | sed -e 's/ /\n/g' | grep TYPE | cut -d \\" -f 2`"
   for module in ${fs} ; do
     echo "insmod ${module}"
   done

   # If there's a filesystem UUID that GRUB is capable of identifying, use it;
   # otherwise set root as per value in device.map.
- echo "set root='`${grub_probe} --device ${device} --target=drive`'"
- if fs_uuid="`${grub_probe} --device ${device} --target=fs_uuid ...

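
The workaround in the patch above parses blkid output with sed, grep and cut. A minimal sketch of the same idea follows, with one difference: it anchors on the exact field name. An unanchored `grep TYPE` would also match a `SEC_TYPE=` field on filesystems where blkid reports one. The function name and the sample line are hypothetical (the device name and UUID are taken from this report, the `SEC_TYPE`/`TYPE` values are assumptions), and values containing spaces, such as some labels, are not handled.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one NAME="value" field from a
# single line of blkid output.
#   $1 = field name (e.g. UUID or TYPE)
#   $2 = the blkid output line
blkid_field() {
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1=\"\(.*\)\"\$/\1/p"
}

# Assumed sample line; real blkid output will differ per system.
sample='/dev/mapper/2225700015522652d-part1: UUID="0cfc57d4-da18-47e5-97c6-e6e5915d2698" SEC_TYPE="ext2" TYPE="ext3"'

blkid_field UUID "$sample"
blkid_field TYPE "$sample"
```

Where the installed blkid supports it, `blkid -s UUID -o value DEVICE` prints just the value and avoids the text parsing entirely.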

Serge Hallyn (serge-hallyn) wrote :

The test package is at

ppa:serge-hallyn/grub-multipath

with the code at

lp:~serge-hallyn/ubuntu/lucid/grub2/multipath2

I don't think these changes will be acceptable, but they work for me both on a
system with and without multipath.

Hopefully I can figure out how to teach grub-probe to do the right thing.

Changed in grub2 (Ubuntu):
status: Confirmed → In Progress
Serge Hallyn (serge-hallyn) wrote :

The package in ppa:serge-hallyn/grub-multipath seems to be working both with and
without multipath. It's needed for lucid with multipath, but it's scary for lucid.

The upstream is very different so a different fix for natty *may* be worth pursuing (i.e.
a fix to grub-probe itself).

Assigning, with apologies, to cjwatson, to get his input.

Changed in grub2 (Ubuntu):
assignee: Serge Hallyn (serge-hallyn) → Colin Watson (cjwatson)
Colin Watson (cjwatson) wrote :

I would definitely rather fix grub-probe. Could you please attach the output of 'sudo grub-probe --target=fs -v /boot/grub'?

Changed in grub2 (Ubuntu):
status: In Progress → Incomplete
Peter Petrakis (peter-petrakis) wrote :

@Colin

Before MP install:

root@kickseed:~# grub-probe --target=fs -v /boot/grub
grub-probe: info: cannot open `/boot/grub/device.map'.
grub-probe: info: changing current directory to /dev.
grub-probe: info: changing current directory to shm.
grub-probe: info: changing current directory to disk.
grub-probe: info: changing current directory to by-uuid.
grub-probe: info: changing current directory to by-id.
grub-probe: info: /dev/sda1 starts from 2048.
grub-probe: info: opening the device /dev/sda.
grub-probe: info: the size of /dev/sda is 20971520.
grub-probe: info: DOS partition 0 starts from 2048.
grub-probe: info: opening /dev/sda,1.
grub-probe: info: the size of /dev/sda is 20971520.
ext2

After MP install and reboot:

root@kickseed:~# multipath -ll
2222b0001554cc9b6dm-0 Intel ,Multi-Flex
[size=10G][features=1 queue_if_no_path][hwhandler=1 alua]
\_ round-robin 0 [prio=130][active]
 \_ 0:0:1:0 sdb 8:16 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 0:0:0:0 sda 8:0 [active][ready]

root@kickseed:~# grub-probe --target=fs -v /boot/grub
grub-probe: info: cannot open `/boot/grub/device.map'.
grub-probe: info: changing current directory to /dev.
grub-probe: info: changing current directory to shm.
grub-probe: info: changing current directory to disk.
grub-probe: info: changing current directory to by-uuid.
grub-probe: info: changing current directory to by-id.
grub-probe: info: changing current directory to bsg.
grub-probe: info: changing current directory to char.
grub-probe: info: changing current directory to block.
grub-probe: info: changing current directory to pts.
grub-probe: info: changing current directory to mapper.
grub-probe: info: the size of hd0 is 20971520.
grub-probe: info: the size of hd0 is 20971520.
grub-probe: info: the size of hd0 is 20971520.
grub-probe: info: the size of hd0 is 20971520.
grub-probe: info: the size of hd0 is 20971520.
grub-probe: info: the size of hd0 is 20971520.
grub-probe: info: the size of hd0 is 20971520.
grub-probe: info: the size of hd0 is 20971520.
grub-probe: info: opening 2222b0001554cc9b6-part1.
grub-probe: error: no mapping exists for `2222b0001554cc9b6-part1'.
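
The failure above is grub-probe being unable to map the device-mapper node back to a disk it knows. For diagnosis, the underlying disks of a device-mapper node can be found by following the `slaves` directories in sysfs. The sketch below is only an illustration of that lookup and is not what grub-probe does internally; the function name is hypothetical, and the second argument (an alternate sysfs root) exists only so the function can be tried against a test directory.

```shell
#!/bin/sh
# Hypothetical helper: print the leaf block devices underneath a kernel
# block device name such as dm-1, by walking <sysfs>/block/<name>/slaves.
#   $1 = kernel device name (e.g. dm-1)
#   $2 = sysfs root, defaults to /sys
# The body runs in a subshell so the variables stay local across recursion.
dm_leaves() (
    name=$1
    sysfs=${2:-/sys}
    found=0
    for slave in "$sysfs/block/$name/slaves"/*; do
        # An unmatched glob stays literal; skip it.
        [ -e "$slave" ] || continue
        found=1
        dm_leaves "$(basename "$slave")" "$sysfs"
    done
    # No slaves: this device is a leaf, so print it.
    [ "$found" -eq 1 ] || printf '%s\n' "$name"
)
```

On a setup like the one shown by `multipath -ll` above, calling this on the partition mapping would be expected to walk through the multipath map down to its path devices. The dm-N number for a given /dev/mapper name can be read from `ls -l /dev/mapper` or `dmsetup ls`.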

On 01/04/2011 07:17 AM, Colin Watson wrote:
> I would definitely rather fix grub-probe. Could you please attach the
> output of 'sudo grub-probe --target=fs -v /boot/grub'?
>
> ** Changed in: grub2 (Ubuntu)
> Status: In Progress => Incomplete
>

Colin Watson (cjwatson) wrote :

I'm very sorry - could I have the same but with -vv rather than -v?
