Comment 10 for bug 1732028

Christian Ehrhardt  (paelzer) wrote : Re: transient boot fail with overlayroot [open-iscsi autopkg tests]

I can confirm that the guest runs that way and that it uses the intended root=/dev/disk/by-path/ip-10.0.12.2:3260-iscsi-tgt-boot-test-b7X8g2-lun-1-part1.

The only, and unfortunate, difference from the runs on LP infra remains that here it works every time :-/
Note: I ran most of them in the background after verifying once how they behave, but I wanted to make sure they complete reproducibly.
a - manual login, no user data: 3/3
b - user data: collect data to disk and shut down: 3/3
c - user data: just shut down: 3/3
d - user data: collect data to disk and shut down, no tty (nohup CMD > log 2>&1): 3/3
e - non-interactive, like autopkgtest would run it: 3/3
f - non-interactive, like autopkgtest would run it, forcing KVM mode: 3/3
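
A minimal sketch of how such repeated runs could be scripted and tallied as N/N. The function name repeat_run and the stand-in command are assumptions for illustration only; the actual guest invocation used for cases a-f is not reproduced here.

```python
import subprocess
import sys


def repeat_run(cmd, attempts=3, timeout=600):
    """Run cmd `attempts` times; return (passes, attempts).

    A run counts as a pass only if it exits with status 0 within
    `timeout` seconds. A hang (timeout) counts as a failure.
    """
    passes = 0
    for _ in range(attempts):
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            continue  # treat a hung boot as a failed attempt
        if result.returncode == 0:
            passes += 1
    return passes, attempts


# Hypothetical stand-in that always succeeds; replace it with the
# real command that boots the guest and waits for it to shut down.
STAND_IN = [sys.executable, "-c", "raise SystemExit(0)"]

if __name__ == "__main__":
    passes, attempts = repeat_run(STAND_IN)
    print(f"{passes}/{attempts}")
```

Counting a timeout as a failure matters here, since the bad case ends in an emergency shell rather than a clean shutdown.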

Next I parsed all the logs that the recent failures on LP accumulated.
I found this error, which looked interesting:
  iscsistart: initiator reported error (15 - session exists)
It is present in ALL the logs that we gathered.
But just as I thought I had a lead, I realized that the good cases have this message as well.
I also found the mentioned "ordering cycle on media-root\x2dro.mount/start",
again in good as well as bad logs.
So neither of these is the cause.

Essentially the boot around the iscsi root has the following steps, with some noise in between. I looked for differences between good and bad cases. They start out the same, even sharing a few errors that seem to be red herrings:
[...] (early boot)
all logs (target id varies) - Logging into tgt-boot-test-o3PlsL 10.1.1.2:3260,1
all logs - mounted filesystem with ordered data
7 of 17 logs (good ones included) - Found ordering cycle ...
all bad cases, and only those - Dependency failed for Local File Systems
all bad cases, and only those - Timed out waiting for device (device name varies)
all bad cases, and only those - Started Emergency Shell

That matches the failure reported in this bug.
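
The good/bad comparison above can be sketched roughly as follows. This is only an illustration: the marker substrings are taken from the messages quoted in this comment, while the function names and the two sample logs are made up for the example and are not real boot logs.

```python
# Marker substrings taken from the messages quoted in this comment.
MARKERS = [
    "initiator reported error (15 - session exists)",
    "Found ordering cycle",
    "Dependency failed for Local File Systems",
    "Timed out waiting for device",
    "Started Emergency Shell",
]


def markers_in(log_text):
    """Return the set of known markers that occur in one log."""
    return {m for m in MARKERS if m in log_text}


def split_markers(good_logs, bad_logs):
    """Return (red_herrings, bad_only).

    red_herrings: markers seen in at least one good AND one bad log.
    bad_only: markers seen in every bad log and in no good log.
    """
    good_sets = [markers_in(t) for t in good_logs]
    bad_sets = [markers_in(t) for t in bad_logs]
    good_seen = set().union(*good_sets)
    bad_seen = set().union(*bad_sets)
    in_all_bad = set(MARKERS).intersection(*bad_sets) if bad_sets else set()
    return good_seen & bad_seen, in_all_bad - good_seen


# Made-up sample logs assembled from the quoted fragments only.
SAMPLE_GOOD = (
    "iscsistart: initiator reported error (15 - session exists)\n"
    "Found ordering cycle on media-root\\x2dro.mount/start\n"
)
SAMPLE_BAD = SAMPLE_GOOD + (
    "Timed out waiting for device\n"
    "Dependency failed for Local File Systems\n"
    "Started Emergency Shell\n"
)

if __name__ == "__main__":
    red_herrings, bad_only = split_markers([SAMPLE_GOOD], [SAMPLE_BAD])
    print("red herrings:", sorted(red_herrings))
    print("bad only:", sorted(bad_only))
```

Matching on fixed substrings rather than whole lines sidesteps the parts that vary between runs, such as the target id and the device name.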