iscsi root with or without auth fails to boot

Bug #728088 reported by Patrick Domack
This bug affects 10 people
Affects              Status        Importance  Assigned to   Milestone
open-iscsi (Ubuntu)  Fix Released  High        Colin Watson
Natty                Fix Released  High        Colin Watson
Oneiric              Fix Released  High        Colin Watson

Bug Description

Stable release update justification:

IMPACT: Some systems with their root file system on iSCSI fail to boot.

DEVELOPMENT BRANCH: open-iscsi 2.0.871-0ubuntu6 fixes this bug as described in comment 15.

PATCH: http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/oneiric/open-iscsi/oneiric/revision/32

TEST CASE: Only some iSCSI-based systems suffer from this; I expect there are race conditions involved. Those affected would fail to boot in some manner similar to that described later in this bug description, and this change should make them boot successfully.

REGRESSION POTENTIAL: Strictly confined to systems using iSCSI. I tried hard not to make any additional assumptions in this patch; if it works, it should be safe.

Original report follows:

Natty alpha3 test ISOs for ubuntu-server amd64 (haven't tested i386 yet).

Running the iscsi auth and unauth tests.

Installation is fine. Before reboot, I copy vmlinuz and initrd.img to tftp server.

The system boots OK until it gets past AppArmor, then it has issues. I assume something in bringing up networking is wiping out the iSCSI config?

Screen shows:

fsck version ......
    /dev/sda1 clean, 52412,491526 files 262116/1965824 blocks
* Starting AppArmor profiles [ OK ]

(long pause)

[ 130.xxxxxx] end_request: I/O error, dev sda, sector 8653696
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 8653696
/proc/self/fd/10: 30: telinit: Input/output error
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 838656
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 838656
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 352768
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 352768
[ 130.xxxxxx] EXT4-fs error (device sda1) in ext4_reserve_inode_write:5620: Journal has aborted
[ 130.xxxxxx] EXT4-fs (sda1): previous I/O error to superblock detected
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 2048
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 4724736
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 291584
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 8669024
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 8669024
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 8669024
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 1305064
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 1305064
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 1305064
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 1300856
[ 130.xxxxxx] end_request: I/O error, dev sda, sector 1300856

tags: added: iso-testing
Patrick Domack (patrickdk) wrote :

I installed again, but with sshd loaded, just to test.

sshd is running, and attempts to log me in, but fails checking my password (hard without a disk)

and /etc/network/interfaces contains
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet manual

Changed in debian-installer (Ubuntu Natty):
importance: Undecided → High
Changed in debian-installer (Ubuntu Natty):
assignee: nobody → Ubuntu Server Team (ubuntu-server)
Patrick Domack (patrickdk) wrote :

After an hour or so, more info:

could not start boot splash: Input/output error
fsck from util-linux-ng 2.17.2
/dev/sda1: clean, 52312/491520 files, 262116/1965824 blocks
* Starting Apparmor profiles [ OK ]
/etc/init.d/rc: 341: /etc/rcS.d/S55urandom: Input/output error
/proc/self/fd/10: 30: telinit: Input/output error

Patrick Domack (patrickdk) wrote :

This also affects i386.

It doesn't matter if I use a static or dhcp ip address.

I did find, if I delete/disable/remove /etc/init.d/open-iscsi, everything works perfectly :)

For amd64/i386 and auth/unauth

Dave Walker (davewalker)
tags: added: server-nro
Changed in debian-installer (Ubuntu Natty):
assignee: Ubuntu Server Team (ubuntu-server) → Canonical Foundations Team (canonical-foundations)
Patrick Domack (patrickdk) wrote :

Confirmed issue exists in beta iso images also.

Changed in debian-installer (Ubuntu Natty):
status: New → Confirmed
Colin Watson (cjwatson)
Changed in debian-installer (Ubuntu Natty):
assignee: Canonical Foundations Team (canonical-foundations) → Colin Watson (cjwatson)
Colin Watson (cjwatson) wrote :

I failed to reproduce this with a current daily build of natty-server-i386.iso, unauthenticated, in kvm with ietd running on the host. My /etc/network/interfaces is identical to that in comment 1. I did find a problem with netcfg that was tearing down the DHCP lease near the end of the installer before unmounting /target, so some finalisation steps didn't happen and it might not have written everything to disk; I suppose in theory that could account for this. I've uploaded a fix for that.

Perhaps somebody could try again with tomorrow's (not today's) daily build, and see if it's cleared up?

Colin Watson (cjwatson)
Changed in debian-installer (Ubuntu Natty):
status: Confirmed → Incomplete
Patrick Domack (patrickdk) wrote :

I'm still having the issue with the Apr 8th daily build. I'm using amd64 right now; I will try i386 when it downloads.

It's still the same issue. I'm using a standalone hardy server with ietd/tftp, and a lucid dhcp server

The test is running in vmware esx. I'll also give it a try this weekend on two pure hardware computers.

Patrick Domack (patrickdk) wrote :

Nope still fails for me, on amd64 and i386

Tried on real hardware with Broadcom 5751 and e1000 nics.

Changed in debian-installer (Ubuntu Natty):
status: Incomplete → Confirmed
Martin Pitt (pitti)
summary: - iscsi root (amd64) with or without auth fails to boot
+ iscsi root with or without auth fails to boot
Patrick Domack (patrickdk) wrote :

For reference, my test setup configuration.

Normal configs for dhcp (in two different installs using static vs dynamic ips)

ietd config (20gig partition):
  Target iqn.1999-05.com.patrickdk:testing.iscsiboot
          Lun 0 Path=/dev/sda5,Type=blockio

tftp config for pxe boot:
label iscsi-root
  menu label iSCSI Root
  kernel vmlinuz-iscsi
  append initrd=initrd.img-iscsi root=UUID=2fc570df-9408-4718-b7fc-0e100be751dd ro

Amos Hayes (ahayes-polkaroo) wrote :

I am having the same type of problem. Installing 11.04 to an iSCSI volume seems to work great, but upon rebooting, it breaks at the same spot, right after AppArmor.

What I have narrowed it down to is that I have a virtual CD drive attached via IPMI to provide the ISO image for the install, but I disconnect this drive before the system comes back up. It then boots via the Intel iSCSI ROM.

I notice that during the install, the iSCSI disk was referred to as sdb and that after a reboot, it can't find sdb (but seems to be booting from sda). I'm guess

I thought the boot process on Ubuntu was using disk labels these days, not device names, but I think this may be at least some part of my problem. That being said, when I tried re-connecting the virtual CD device with nothing in it and booted the same install, I'm still getting lots of I/O errors and it doesn't boot. I have some screen caps and the install log file I can put somewhere for you to see if you like.
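
As a rough way to check the labels-versus-device-names question above, the sketch below pulls the `root=` value out of a kernel command line and classifies it. The function names `root_param` and `root_kind` are made up for illustration, and the sample command line is the PXE `append` line posted earlier in this thread; on a live system one would pass the contents of /proc/cmdline instead.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print the value of the root= parameter found in a kernel command line.
root_param() {
    for word in $1; do
        case $word in
            root=*) printf '%s\n' "${word#root=}" ;;
        esac
    done
}

# Classify a root= value: identifiers that survive disk reordering are
# "stable"; a bare /dev/sdX name depends on probe order.
root_kind() {
    case $1 in
        UUID=*|LABEL=*|/dev/disk/by-*/*) echo stable ;;
        /dev/sd*) echo device-name ;;
        *) echo other ;;
    esac
}

# Sample taken from the PXE config earlier in this thread.
cmdline='initrd=initrd.img-iscsi root=UUID=2fc570df-9408-4718-b7fc-0e100be751dd ro'
root=$(root_param "$cmdline")
kind=$(root_kind "$root")
echo "$root -> $kind"
```

If this reports a stable identifier, an sda/sdb swap between install and first boot is unlikely to explain the failure on its own.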

Amos Hayes (ahayes-polkaroo) wrote :

Meant to say "I'm guessing the CD was sda during install."

Dave Gilbert (ubuntu-treblig) wrote :

I'm seeing the same thing with release Natty server.
I've got a pair of KVM guests: one is the server, one is the client. The server runs ietd exporting an LVM partition; its /etc/iet/ietd.conf has the following stanza:

Target iqn.2010-05.org.treblig:server1.export.a.b
             Lun 0 Path=/dev/server1/client1,Type=blockio

The client was installed without specifying any username and installed OK, and I'm actually doing an iSCSI boot which worked OK
(as described in http://www.advogato.org/person/penguin42/diary/10.html ) so it's loading grub and the initram from iSCSI using gPXE sanboot.

What I see with ietd in foreground/debug and with a wireshark on virbr0 is that it's sending an iSCSI login specifying CHAP, and I then see an authentication failure.

wireshark dump attached.

Dave

Amos Hayes (ahayes-polkaroo) wrote :

Colin, would remote console (KVM) access to a system to test with be useful? I have a system with IPMI that you can access and attempt an install/boot on.

Colin Watson (cjwatson) wrote : Re: [Bug 728088] Re: iscsi root with or without auth fails to boot

Yes, that would be very useful, thanks! Although I won't be able to
make use of it until next week, when I'm back home from UDS.

My GPG key is on my Launchpad user page, if you'd like to send details
by encrypted e-mail.

Rafał Krypa (r.krypa) wrote :

I upgraded my Ubuntu Desktop with iSCSI root from Maverick to Natty yesterday (it was originally installed as Karmic and upgraded to each new release with no problems regarding iSCSI). It started failing exactly as in this bug description, and after a short trial and error trying to solve it, my root file system (XFS) was damaged beyond repair. If I did indeed hit the same problem, then it's caused (also) by something other than debian-installer, as my upgrade had nothing to do with it.
By the way, it seems like a good idea for such upgrades to take a snapshot of iSCSI volume before upgrading...

Colin Watson (cjwatson)
Changed in debian-installer (Ubuntu):
status: Confirmed → In Progress
Colin Watson (cjwatson) wrote :

I've been debugging this on Amos' system (thanks!).

The problem isn't AppArmor as such. In fact, the real problem is that the network interface needed for iSCSI isn't being protected from later modification as it should be, so, whenever the network-interface job happens to fire for that interface in the real system, it effectively takes down the root filesystem. This tends to happen somewhere around the time that AppArmor starts, but that's just a coincidence.

This is happening because we run configure_networking in a subshell to avoid '. /tmp/net-*.conf' from killing the entirety of /scripts/local-top/iscsi if no network devices are available yet, and then expect the value of DEVICE from that subshell to be available later so that we can write it to /dev/.initramfs/open-iscsi.interface (which /etc/init/iscsi-network-interface.conf uses). Moving the code that uses DEVICE into that same subshell fixes the problem. It also makes sense to wait for udev to settle before doing any of this; I don't think that's vital, but it should reduce noisy error messages and it's what I've tested.
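
The subshell behaviour described above can be reproduced in a few lines of POSIX shell. This is only a minimal sketch, not the actual /scripts/local-top/iscsi code: `configure_networking` here is a stand-in that simply sets DEVICE, `eth0` is an assumed value, and a temporary file takes the place of /dev/.initramfs/open-iscsi.interface.

```shell
#!/bin/sh
# Stand-in for the initramfs helper (assumption): sets DEVICE as a side effect.
configure_networking() {
    DEVICE=eth0
}

# Broken pattern: DEVICE is assigned inside the subshell and is lost when the
# subshell exits, so the parent sees an empty value.
broken() {
    DEVICE=
    ( configure_networking )
    printf '%s' "$DEVICE"
}

# Fixed pattern: use DEVICE inside the same subshell that set it.
# (The real fix also waits for udev to settle first; that is not shown here.)
fixed() {
    out=$1
    (
        configure_networking
        printf '%s\n' "$DEVICE" > "$out"
    )
}

tmp=$(mktemp)   # temporary stand-in for the marker file
fixed "$tmp"
broken_value=$(broken)
fixed_value=$(cat "$tmp")
rm -f "$tmp"
echo "broken: '$broken_value'  fixed: '$fixed_value'"
```

The subshell is still useful for containing a failure while sourcing the network config; the point is only that anything depending on DEVICE has to run inside it.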

I'm going to do a bit more testing of this, and then upload to oneiric and natty-proposed.
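
To illustrate why an empty or missing marker file leaves the interface unprotected, here is a hedged sketch of the kind of check a later job could make. It is not the contents of /etc/init/iscsi-network-interface.conf; `is_iscsi_root_interface` is a hypothetical helper, and a temporary file stands in for /dev/.initramfs/open-iscsi.interface.

```shell
#!/bin/sh
# Hypothetical helper, not the packaged job: succeeds when the given interface
# is the one the initramfs recorded as carrying the iSCSI root.
is_iscsi_root_interface() {
    marker=$1
    iface=$2
    [ -r "$marker" ] || return 1
    [ "$(cat "$marker")" = "$iface" ]
}

# Decide what a network job would do with an interface.
action_for() {
    if is_iscsi_root_interface "$1" "$2"; then
        echo skip
    else
        echo configure
    fi
}

marker=$(mktemp)            # temporary stand-in for the marker file
echo eth0 > "$marker"
eth0_action=$(action_for "$marker" eth0)     # recorded interface is left alone
eth1_action=$(action_for "$marker" eth1)     # other interfaces are handled normally

: > "$marker"               # marker written with an empty DEVICE, as in this bug
empty_action=$(action_for "$marker" eth0)
rm -f "$marker"

echo "eth0: $eth0_action, eth1: $eth1_action, eth0 with empty marker: $empty_action"
```

With the marker empty, the iSCSI interface gets the same treatment as any other interface, which matches the symptom of the root filesystem disappearing part-way through boot.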

affects: debian-installer (Ubuntu) → open-iscsi (Ubuntu)
Changed in open-iscsi (Ubuntu Natty):
status: Confirmed → Triaged
Changed in open-iscsi (Ubuntu Oneiric):
milestone: none → oneiric-alpha-1
Changed in open-iscsi (Ubuntu Natty):
milestone: none → natty-updates
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package open-iscsi - 2.0.871-0ubuntu6

---------------
open-iscsi (2.0.871-0ubuntu6) oneiric; urgency=low

  * Fix initramfs iSCSI login (many thanks to Amos Hayes for lending me a
    test system; LP: #728088):
    - Write out /dev/.initramfs/open-iscsi.interface in the same subshell as
      configure_networking, so that we can get at the value of DEVICE.
    - Wait for udev to settle before attempting to configure networking.
 -- Colin Watson <email address hidden> Fri, 27 May 2011 23:27:31 +0100

Changed in open-iscsi (Ubuntu Oneiric):
status: In Progress → Fix Released
Colin Watson (cjwatson)
description: updated
Changed in open-iscsi (Ubuntu Natty):
status: Triaged → In Progress
Martin Pitt (pitti) wrote : Please test proposed package

Accepted open-iscsi into natty-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in open-iscsi (Ubuntu Natty):
status: In Progress → Fix Committed
tags: added: verification-needed
Dave Gilbert (ubuntu-treblig) wrote :

I can confirm that Oneiric Alpha1 server installs OK on iscsi root - thanks for fixing this!

(That's natty host, natty kvm guest as a server and oneiric kvm guest as the Oneiric Alpha1 server test)

Dave

Colin Watson (cjwatson) wrote :

Thanks, Dave. Are you able to test with the version in natty-proposed, so that we can release this to natty-updates?

Dave Gilbert (ubuntu-treblig) wrote : Re: [Bug 728088] Re: iscsi root with or without auth fails to boot

* Colin Watson (<email address hidden>) wrote:
> Thanks, Dave. Are you able to test with the version in natty-proposed,
> so that we can release this to natty-updates?

Can you suggest a way of testing it? This is happening on 1st
boot after installation - how would I get the -proposed package in?

Dave

Colin Watson (cjwatson) wrote :

If you pass the apt-setup/proposed=true boot parameter to the installer,
it should install the open-iscsi package from natty-proposed (along with
any other packages that have been updated in -proposed, but hopefully
that's OK ...).

Dave Gilbert (ubuntu-treblig) wrote :

Hi Colin,
  I couldn't persuade apt-setup/proposed=true to cause the installer to do it - however I ctrl-alt-f2'd and chroot'd into the mounted filesystem before rebooting at the end of install and that does seem to have worked - thanks.

My test environment isn't that difficult to replicate; I'm sure it would make an interesting automated test setup for u-server;
see :

http://www.advogato.org/person/penguin42/diary/10.html

Dave
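
For anyone repeating the chroot workaround described in this comment, a rough sketch follows. The exact commands used were not posted, so the steps in the comments are assumptions: they presume the installer still has the new system mounted at /target, that the standard archive mirror is reachable, and that the chroot has what apt needs. Only the helper that builds the apt source line is actually executed here; `proposed_source_line` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: build the apt source line for a release's -proposed pocket.
proposed_source_line() {
    printf 'deb http://archive.ubuntu.com/ubuntu/ %s-proposed main restricted universe multiverse\n' "$1"
}

line=$(proposed_source_line natty)
echo "$line"

# Assumed steps on the installer console (Ctrl-Alt-F2), before the final
# reboot. Not run by this sketch:
#   echo "$line" >> /target/etc/apt/sources.list
#   chroot /target apt-get update
#   chroot /target apt-get install open-iscsi
```

Since the fix is in the initramfs script, the initrd needs to be regenerated after the upgrade, and anyone booting the kernel and initrd from a TFTP server would need to copy the new initrd there again.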

Martin Flasskamp (78luphr0-launchpad) wrote :

Hi Colin,
I had the same problem with iSCSI boot (i386). To fix it I booted an old kernel and ramdisk (2.6.32-23), updated open-iscsi to the natty-proposed version and now everything runs fine!

Colin Watson (cjwatson)
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package open-iscsi - 2.0.871-0ubuntu5.1

---------------
open-iscsi (2.0.871-0ubuntu5.1) natty-proposed; urgency=low

  * Fix initramfs iSCSI login (many thanks to Amos Hayes for lending me a
    test system; LP: #728088):
    - Write out /dev/.initramfs/open-iscsi.interface in the same subshell as
      configure_networking, so that we can get at the value of DEVICE.
    - Wait for udev to settle before attempting to configure networking.
 -- Colin Watson <email address hidden> Fri, 27 May 2011 23:30:31 +0100

Changed in open-iscsi (Ubuntu Natty):
status: Fix Committed → Fix Released
tags: added: testcase