test-mic can fail / writes to qemu-nbd device do not persist

Bug #1741096 reported by Scott Moser
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
cloud-utils
Fix Released
Undecided
Unassigned
cloud-utils (Ubuntu)
Fix Released
Medium
Unassigned

Bug Description

### Whats going on here?
## test-mic is a test of mount-image-callback in cloud-image-utils
##
## It populates some disk images and then uses mic to write some
## files (info.txt and data.txt)
## then calls mic again to verify the files are there.
##
## the failure i'm seeing is that one of the files written during
## the first mic is not present in the second.
##
## mic does:
## mount via qemu-nbd, run command, umount, qemu-nbd disconnect
##

this recreates at least 1 in 3 times in my local tests here.

$ uname -r
4.13.0-17-generic

$ dpkg-query --show qemu-utils
qemu-utils 1:2.10+dfsg-0ubuntu5

$ bzr branch lp:cloud-utils cloud-utils
$ cd cloud-utils

## the output below has some debug info added
$ sudo PATH=$PWD/bin:$PATH ./test/test-mic 2>&1 | tee out.log
testing partition 1 image
testing partition 2 image
testing --system-mounts on pt1
testing unexpected mount get unmounted.
writing hunks to disk image /tmp/test-mic.zGTR4a/disk-mbr.img
partitioning MBR disk image /tmp/test-mic.zGTR4a/disk-mbr.img
partitioning GPT disk image /tmp/test-mic.zGTR4a/disk-gpt.img
MBR: testing partition 1 in /tmp/test-mic.zGTR4a/disk-mbr.img
pid for /dev/nbd0 is 30004
connected /tmp/test-mic.zGTR4a/disk-mbr.img (raw) 1 to /dev/nbd0. waiting for device.
mounted /dev/nbd0p1 via qemu-nbd /dev/nbd0 at /tmp/mount-image-callback.0ikR3l/mp
invoking: MOUNTPOINT=/tmp/mount-image-callback.0ikR3l/mp sh -c echo 'foo1' > data.txt
cmd returned 0. unmounting /tmp/mount-image-callback.0ikR3l/mp
disconnecting /dev/nbd0
pid for /dev/nbd0 is 30077
connected /tmp/test-mic.zGTR4a/disk-mbr.img (raw) 1 to /dev/nbd0. waiting for device.
mounted /dev/nbd0p1 via qemu-nbd /dev/nbd0 at /tmp/mount-image-callback.2wgbbV/mp
invoking: MOUNTPOINT=/tmp/mount-image-callback.2wgbbV/mp sh -c
      r=0; for f in "$@"; do
      cat $f || { echo "$f: cat failed rc=$?"; r=99; }; done;
      exit $r sh-extract info.txt data.txt
cmd returned 0. unmounting /tmp/mount-image-callback.2wgbbV/mp
disconnecting /dev/nbd0
MBR: testing partition 2 in /tmp/test-mic.zGTR4a/disk-mbr.img
pid for /dev/nbd0 is 30153
connected /tmp/test-mic.zGTR4a/disk-mbr.img (raw) 2 to /dev/nbd0. waiting for device.
mounted /dev/nbd0p2 via qemu-nbd /dev/nbd0 at /tmp/mount-image-callback.9PowSH/mp
invoking: MOUNTPOINT=/tmp/mount-image-callback.9PowSH/mp sh -c echo 'foo2' > data.txt
cmd returned 0. unmounting /tmp/mount-image-callback.9PowSH/mp
disconnecting /dev/nbd0
pid for /dev/nbd0 is 30222
connected /tmp/test-mic.zGTR4a/disk-mbr.img (raw) 2 to /dev/nbd0. waiting for device.
mounted /dev/nbd0p2 via qemu-nbd /dev/nbd0 at /tmp/mount-image-callback.Az1hhv/mp
invoking: MOUNTPOINT=/tmp/mount-image-callback.Az1hhv/mp sh -c
      r=0; for f in "$@"; do
      cat $f || { echo "$f: cat failed rc=$?"; r=99; }; done;
      exit $r sh-extract info.txt data.txt
cat: data.txt: No such file or directory
cmd returned 99. unmounting /tmp/mount-image-callback.Az1hhv/mp
disconnecting /dev/nbd0
MBR:
== expected on 2 ==
partition 2
foo2
== found on 2 ==
partition 2
data.txt: cat failed rc=1
MBR: failed testing partition 2 on /tmp/test-mic.zGTR4a/disk-mbr.img
smoser@milhouse:~/src/cloud-utils/trunk$ sudo PATH=$PWD/bin:$PATH ./test/test-mic 2>&1 | tee out.log
testing partition 1 image
testing partition 2 image
testing --system-mounts on pt1
testing unexpected mount get unmounted.
writing hunks to disk image /tmp/test-mic.SCaQUe/disk-mbr.img
partitioning MBR disk image /tmp/test-mic.SCaQUe/disk-mbr.img
partitioning GPT disk image /tmp/test-mic.SCaQUe/disk-gpt.img
MBR: testing partition 1 in /tmp/test-mic.SCaQUe/disk-mbr.img
pid for /dev/nbd0 is 30674
connected /tmp/test-mic.SCaQUe/disk-mbr.img (raw) 1 to /dev/nbd0. waiting for device.
mounted /dev/nbd0p1 via qemu-nbd /dev/nbd0 at /tmp/mount-image-callback.0bIcqZ/mp
invoking: MOUNTPOINT=/tmp/mount-image-callback.0bIcqZ/mp sh -c echo 'foo1' > data.txt
cmd returned 0. unmounting /tmp/mount-image-callback.0bIcqZ/mp
disconnecting /dev/nbd0
waiting on pidfile for /dev/nbd0 in /sys/block/nbd0/pid
.
pid for /dev/nbd0 is 30757
connected /tmp/test-mic.SCaQUe/disk-mbr.img (raw) 1 to /dev/nbd0. waiting for device.
mounted /dev/nbd0p1 via qemu-nbd /dev/nbd0 at /tmp/mount-image-callback.ZNIbzM/mp
invoking: MOUNTPOINT=/tmp/mount-image-callback.ZNIbzM/mp sh -c
      r=0; for f in "$@"; do
      cat $f || { echo "$f: cat failed rc=$?"; r=99; }; done;
      exit $r sh-extract info.txt data.txt
cat: data.txt: No such file or directory
cmd returned 99. unmounting /tmp/mount-image-callback.ZNIbzM/mp
disconnecting /dev/nbd0
MBR:
== expected on 1 ==
partition 1
foo1
== found on 1 ==
partition 1
data.txt: cat failed rc=1
MBR: failed testing partition 1 on /tmp/test-mic.SCaQUe/disk-mbr.img

### new data in dmesg during run
[1300609.359437] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300609.506371] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300609.619183] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300609.752534] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300609.889312] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300610.022198] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300610.225399] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300610.374885] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300610.634821] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300610.830573] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300612.701430] nbd0: p1 p2
[1300612.713894] EXT4-fs (nbd0p1): mounted filesystem with ordered data mode. Opts: (null)
[1300612.798977] block nbd0: NBD_DISCONNECT
[1300612.799003] block nbd0: shutting down sockets
[1300612.800582] print_req_error: 22 callbacks suppressed
[1300612.800584] print_req_error: I/O error, dev nbd0, sector 18954
[1300612.800590] EXT4-fs warning (device nbd0p1): ext4_end_bio:322: I/O error 10 writing to inode 15 (offset 0 size 0 starting block 9478)
[1300612.800592] Buffer I/O error on device nbd0p1, logical block 8453
[1300612.800614] print_req_error: I/O error, dev nbd0, sector 100354
[1300612.800616] buffer_io_error: 17 callbacks suppressed
[1300612.800617] Buffer I/O error on dev nbd0p1, logical block 49153, lost sync page write
[1300612.800622] JBD2: Error -5 detected when updating journal superblock for nbd0p1-8.
[1300612.800623] Aborting journal on device nbd0p1-8.
[1300612.800630] print_req_error: I/O error, dev nbd0, sector 100354
[1300612.800632] Buffer I/O error on dev nbd0p1, logical block 49153, lost sync page write
[1300612.800636] JBD2: Error -5 detected when updating journal superblock for nbd0p1-8.
[1300612.800643] JBD2: Detected IO errors while flushing file data on nbd0p1-8
[1300612.800664] print_req_error: I/O error, dev nbd0, sector 0
[1300612.800693] print_req_error: I/O error, dev nbd0, sector 0
[1300612.800708] print_req_error: I/O error, dev nbd0, sector 0
[1300612.800723] print_req_error: I/O error, dev nbd0, sector 0
[1300612.800815] print_req_error: I/O error, dev nbd0, sector 2050
[1300612.800817] Buffer I/O error on dev nbd0p1, logical block 1, lost sync page write
[1300612.837164] nbd0: p1 p2
[1300612.953403] EXT4-fs (nbd0p1): INFO: recovery required on readonly filesystem
[1300612.953405] EXT4-fs (nbd0p1): write access will be enabled during recovery
[1300612.953745] EXT4-fs (nbd0p1): recovery complete
[1300612.954785] EXT4-fs (nbd0p1): mounted filesystem with ordered data mode. Opts: (null)
[1300613.011345] block nbd0: NBD_DISCONNECT
[1300613.011378] block nbd0: shutting down sockets
[1300613.015916] nbd0: detected capacity change from 0 to 316669952
[1300613.016004] print_req_error: I/O error, dev nbd0, sector 0
[1300613.016010] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016027] print_req_error: I/O error, dev nbd0, sector 0
[1300613.016030] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016039] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016044] ldm_validate_partition_table(): Disk read failed.
[1300613.016050] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016058] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016066] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016074] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016078] Dev nbd0: unable to read RDB block 0
[1300613.016136] nbd0: unable to read partition table

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: cloud-utils 0.30-0ubuntu2
ProcVersionSignature: Ubuntu 4.13.0-17.20-generic 4.13.8
Uname: Linux 4.13.0-17-generic x86_64
NonfreeKernelModules: zfs zunicode zavl zcommon znvpair
ApportVersion: 2.20.8-0ubuntu5
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Wed Jan 3 11:47:00 2018
EcryptfsInUse: Yes
InstallationDate: Installed on 2015-07-23 (895 days ago)
InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150722.1)
PackageArchitecture: all
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: cloud-utils
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Scott Moser (smoser) wrote :
Changed in cloud-utils (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
Scott Moser (smoser) wrote :

I've played a bit. Attached patch here has logging done with logger in an effort to serialize things.

Then, I have 'go.sh' (see below) that just loops over running test-mic until it fails.
Then you get bookend messages in /var/log/syslog. I've collected a fail here.

Revision history for this message
Scott Moser (smoser) wrote :

go.sh looks like

#!/bin/sh
[ "$(id -u)" = "0" ] || { echo "not root" 1>&2; exit 1; }
i=0
while true; do
   logger --tag=xtest-mic-$i "start xtest-mic-$i"
   echo -n .
   {
   read up idle </proc/uptime
   echo $up
   ./test/test-mic
   r=$?
   read up idle </proc/uptime
   echo $up
   echo $r
   } >out.log 2>&1
   logger --tag=xtest-mic-$i "end xtest-mic-$i $r"
   [ $r -eq 0 ] || { echo; echo "failed xtest-mic-$i"; exit 1; }
   i=$(($i+1))
done

Revision history for this message
Scott Moser (smoser) wrote :

run like
$ sudo PATH=$PWD/bin:$PATH ./go.sh
....
failed xtest-mic-3
smoser@milhouse:~/src/cloud-utils/trunk$ cat out.log
19536.67
cat: data.txt: No such file or directory
GPT:
== expected on 1 ==
partition 1
foo1
== found on 1 ==
partition 1
data.txt: cat failed rc=1
19543.86
1

Revision history for this message
Scott Moser (smoser) wrote :

heres another log of /var/log/syslog. http://paste.ubuntu.com/26315330/
I'm not certain if the order of entries in /varlog/syslog can be 100% trusted or not.

If it can be, then we definitely see oddities.
 * mount-image-callback definitly unmounts before running qemu-nbd --disconnect.
 * the 'detected capacity change' seems odd.
 * during qemu-nbd --disconnect we see Buffer I/O errors.

Revision history for this message
Scott Moser (smoser) wrote :
tags: added: patch
Revision history for this message
Scott Moser (smoser) wrote :

I considered the fact that maybe the partitioned image that test-mic was
creating was somehow invalid. To test that, I modifed mount-image-callback
to set the mount device in MOUNT_DEV, and then did:

cp "$pt1" /tmp/part1.img
cp "$pt2" /tmp/part2.img
mount-image-callback --read-only --part=1 "$img_mbr" -- sh -c 'dd if=$MOUNT_DEV of=/tmp/part1.mbr.img'
mount-image-callback --read-only --part=2 "$img_mbr" -- sh -c 'dd if=$MOUNT_DEV of=/tmp/part2.mbr.img'
mount-image-callback --read-only --part=1 "$img_gpt" -- sh -c 'dd if=$MOUNT_DEV of=/tmp/part1.gpt.img'
mount-image-callback --read-only --part=2 "$img_gpt" -- sh -c 'dd if=$MOUNT_DEV of=/tmp/part2.gpt.img'

then an 'md5sum' shows that the 'dd'd image is identical to the partition image.

It seems to me that that validates the partition table and the data. I was
somewhat concerned that maybe my partition/sector math in test-mic was
truncating the partition or filesystem. that does not seem to be the case.

So, it seems that this is either an issue in
 a.) qemu-nbd
 b.) kernel
 c.) mount-image-callback assumptions on qemu-nbd usage.

for 'c', the assumption is that this is valid:
DEV=/dev/nbd0 # unused nbd device
BNAME=$(basename $DEV)
PARTNUM=2
PART_PATH=${DEV}p${PARTNUM}
FILE=/tmp/my-2partition-mbr.img
 1.) qemu-nbd --format=raw --connect $DEV $FILE
 2.) wait until a 'pid' file appears in /sys/block/$BNAME/pid)
 3.) wait until $PART_PATH exists
 4.) mount $PART_PATH to MOUNTPOINT
 5.) execute user code that operates on on MOUNTPOINT
 6.) umount MOUNTPOINT (and anything under it)
 7.) qemu-disconnect --disconnect $DEV

I have assumed that 'umount' would block until all file descriptors
are flushed to the backing /dev/nbd device, and that 'qemu-disconnect'
would block until any writes already sent were written through to the
backing file.

Revision history for this message
Scott Moser (smoser) wrote :

Attaching a recreate script with loop included.
It should run correctly just about anywhere, with mount-image-callback installed.

must be root.
to run with a different/modified mount-image-callback, just add it to your PATH.

Revision history for this message
Scott Moser (smoser) wrote :
Revision history for this message
Scott Moser (smoser) wrote :

the recreate script demonstrates failure on xenial, artful and bionic in a VM.
xenial needs a change to mount-image-callback to run 'blockdev' on the nbd device. i'm not actually sure how i've never seen that bug before.

with the change in place, all 3 can be made to show the issue. artful and bionic vms are actually somewhat reliable, but if you run two copies of the script concurrently it will fail.

=== modified file 'bin/mount-image-callback'
--- bin/mount-image-callback 2018-01-03 15:44:47 +0000
+++ bin/mount-image-callback 2018-01-04 17:19:37 +0000
@@ -316,6 +316,7 @@
        fi

        i=0
+ [ -b "$mdev" ] || blockdev --rereadpt $nbd
        while :; do
                [ -b "$mdev" ] && break
                i=$(($i+1))

Revision history for this message
Scott Moser (smoser) wrote :

this actually seems to fix the issue and I cannot recreate with it in place.
 http://paste.ubuntu.com/26320756/

I suspect what this is showing is that we were operating on nbd devices that
existed from a previous run.

Revision history for this message
Scott Moser (smoser) wrote :

that had some noise:
 === modified file 'bin/mount-image-callback'
--- bin/mount-image-callback 2018-01-03 15:44:47 +0000
+++ bin/mount-image-callback 2018-01-04 18:08:43 +0000
@@ -284,6 +284,10 @@

  debug 1 "connected $img_in ($fmt) ${rwmode} to $nbd. waiting for device."
  i=0
+ local out=""
+ out=$(blockdev --rereadpt "$nbd" 2>&1) ||
+ debug 1 "blockdev rereadpt $nbd failed"
+ udevadm settle
  while i=$(($i+1)):; do
   get_partition "$nbd" && nptnum="$_RET" && break
   [ $i -eq 40 ] && {

Revision history for this message
Scott Moser (smoser) wrote :

The fix that I've applied is to add the '2.5' to the list below.

 1.) qemu-nbd --format=raw --connect $DEV $FILE
 2.) wait until a 'pid' file appears in /sys/block/$BNAME/pid)
 2.5) run 'blockdev --rereadpt' and 'udevadm settle'
 3.) wait until $PART_PATH exists
 4.) mount $PART_PATH to MOUNTPOINT
 5.) execute user code that operates on on MOUNTPOINT
 6.) umount MOUNTPOINT (and anything under it)
 7.) qemu-disconnect --disconnect $DEV

note that 'udevadm settle' is not entirely unlike 'sleep 1', but if you call blockdev --rereadpt, then you do need to settle afterward.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Fetched cloud utils on a new (bionic) system.
Running go.sh just to see if I can trigger the case at all on my system.

I could and got to the
cat: data.txt: No such file or directory

I can confirm the IO errors, overall I see:
xtest-mic-0: start xtest-mic-0
[4998071.265687] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[4998071.372544] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[4998071.485091] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[4998071.593007] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[4998071.716770] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[4998072.500496] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[4998074.532148] EXT4-fs (nbd1p1): mounted filesystem with ordered data mode. Opts: (null)
[4998074.549530] print_req_error: I/O error, dev nbd0, sector 0
[4998074.607975] print_req_error: I/O error, dev nbd0, sector 0
[4998074.633572] Buffer I/O error on dev nbd0, logical block 0, async page read
[4998074.841685] print_req_error: I/O error, dev nbd0, sector 0
[4998074.867993] Buffer I/O error on dev nbd0, logical block 0, async page read
[4998074.907662] block nbd10: shutting down sockets
[4998074.909787] Buffer I/O error on dev nbd10p2, logical block 90113, lost sync page write
[4998074.909796] JBD2: Error -5 detected when updating journal superblock for nbd10p2-8.
[4998074.941009] EXT4-fs (nbd1p2): write access will be enabled during recovery
[4998075.276217] block nbd1: NBD_DISCONNECT
[4998075.276643] nbd1: detected capacity change from 0 to 316669952
[4998075.276796] Dev nbd1: unable to read RDB block 0

I think I need to break down the test to the most simple form that triggers it and check then.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Only now saw your updates as I opened yesterday and didn't refresh since then.
With the workaround in cloud utils known consider it low for qmeu-nbd to find more background details.

Changed in cloud-utils (Ubuntu):
importance: Medium → Wishlist
importance: Wishlist → Medium
Changed in qemu (Ubuntu):
importance: Undecided → Low
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-utils - 0.30-0ubuntu3

---------------
cloud-utils (0.30-0ubuntu3) bionic; urgency=medium

  * sync to trunk at 322
    - mount-image-callback: run blockdev and udevadm settle on nbd devices.
      (LP: #1741096, #1741300)
    - mount-image-callback: Drop support for mounting lxd containers.
      (LP: #1715994)

 -- Scott Moser <email address hidden> Thu, 04 Jan 2018 14:17:26 -0500

Changed in cloud-utils (Ubuntu):
status: Confirmed → Fix Released
Revision history for this message
Scott Moser (smoser) wrote :

I'm dropping the qemu-nbd. i'm not certain that there is actually anything that was wrong there.

no longer affects: qemu (Ubuntu)
Revision history for this message
Scott Moser (smoser) wrote : Fixed in cloud-utils version 0.31.

This bug is believed to be fixed in cloud-utils in version 0.31. If this is still a problem for you, please make a comment and set the state back to New

Thank you.

Changed in cloud-utils:
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.