Cloud-init fails if iso9660 filesystem on non-cdrom path in 20171211 image.

Bug #1737704 reported by Christian Ehrhardt 
This bug affects 2 people
Affects              Status        Importance  Assigned to  Milestone
cloud-init           Fix Released  High        Scott Moser
cloud-init (Ubuntu)  Fix Released  High        Scott Moser

Bug Description

During OVF datasource checks, ds-identify attempted to ignore non-cdrom iso9660 filesystems. Its debug 'skip' message referenced an undeclared variable, resulting in the following failure:

_shwrap: d: parameter not set.
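
The failure mode can be reproduced in isolation. The following is a minimal sketch, assuming the script runs with unset-variable checking (`set -u`) enabled, which the error text suggests. The loop, the variable names `dev` and `d`, and the helper `run_nounset` are illustrative only and are not taken from the ds-identify source:

```shell
#!/bin/sh
# Hypothetical sketch, NOT the actual ds-identify code.
# run_nounset SNIPPET: run SNIPPET in a child shell with 'set -u' and
# report whether the child shell survived.
run_nounset() {
    # 'unset d' guards against a variable named d inherited from the environment.
    if sh -c "unset d; set -u; $1" >/dev/null 2>&1; then
        echo "ok"
    else
        echo "failed"
    fi
}

# Buggy: the loop variable is 'dev', but the debug message references 'd'.
buggy='for dev in vdb; do echo "skipping $d"; done'
# Fixed: the message references the variable that is actually set.
fixed='for dev in vdb; do echo "skipping $dev"; done'

buggy_result=$(run_nounset "$buggy")
fixed_result=$(run_nounset "$fixed")
echo "buggy: $buggy_result"
echo "fixed: $fixed_result"
```

Under `set -u`, expanding an unset variable aborts a non-interactive shell, so a mistake in a debug message alone is enough to stop the whole script.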

==== Original description ===

Hi,
I had the last daily image working fine:
$ uvt-simplestreams-libvirt query
release=bionic arch=amd64 label=daily (20171129.1)

But today after a sync I got this image:
$ uvt-simplestreams-libvirt query
release=bionic arch=amd64 label=daily (20171211)

With the latter, the guest fails to boot correctly: networking does not come up, and cloud-init in general does not seem to work.

In the guest console I see it hanging on the usual "A start job is running for Wait for ..."
It breaks after some time giving up on networking.
"See 'systemctl status systemd-networkd-wait-online.service' for details."
The host confirmed that: the guest did not get an IP from dnsmasq.

Note: I was able to trigger this on a Xenial host as well as a Bionic host. The latest Artful image also works well on all of these, so I think it is safe to assume that this depends only on the guest image.

I have taken full bootup console logs of both cases.

20171129.1 (good): http://paste.ubuntu.com/26169044/
20171211 (bad): http://paste.ubuntu.com/26169046/

There was one more thing that perplexed me: I usually provide --password=ubunut to uvt-kvm.
That adds a snippet to the cloud-init data to set the password of the ubuntu user.
Connecting via "virsh console" I can't log in on the bad guest which made me assume that cloud-init didn't run at all in the bad case.

And in fact the full logs confirm that: in the bad case there is no cloud-init output at all.

Also my bionic containers today saw a cloud-init update - maybe it really is broken in the current daily image?
OTOH the changelog of cloud-init didn't suggest a change that could explain this.


summary: - Cloud-init seems not run on today's bionic images
+ Cloud-init seems not run on today's bionic images (20171211)
Christian Ehrhardt  (paelzer) wrote : Re: Cloud-init seems not run on today's bionic images (20171211)

Since I can't log into the guest due to the lack of proper init in the bad case I copied and mounted both daily images to check how they start fresh.

I see different cloud-init versions:
good: 17.1-41-g76243487-0ubuntu1
bad: 17.1-53-ga5dc0f42-0ubuntu1

I see that the status and clean commands were added, but neither of these should be the kill switch.
I checked the more likely candidates, the config and systemd files, but they all have identical md5 sums.

Going slightly wider on what actually changed: http://paste.ubuntu.com/26169161/

The only change left that IMHO could cause this is the one in ds-identify.

Now while I can't really use the new image, I can check what ds-identify left there on its first init (if anything).

But that looks identical; the only thing that differs in:
$ md5sum $(find /etc/cloud/ -type f | sort | xargs)
is "/etc/cloud/build.info" which refers to the different image date.
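
A comparison like this can be scripted across two mounted image roots. This is a minimal sketch, assuming both images are mounted somewhere on the host; the function names `list_sums` and `changed_files` and the mount points in the usage comment are hypothetical:

```shell
#!/bin/sh
# list_sums ROOT SUBDIR: print "md5  relative/path" for every file under
# ROOT/SUBDIR, sorted by path so two trees can be compared line by line.
list_sums() {
    ( cd "$1" && find "$2" -type f -exec md5sum {} + | sort -k2 )
}

# changed_files ROOT_A ROOT_B SUBDIR: print the relative paths whose
# checksum differs between the two roots (or that exist in only one).
# Note: paths containing whitespace are not handled by the awk step.
changed_files() {
    sums_a=$(mktemp)
    sums_b=$(mktemp)
    list_sums "$1" "$3" > "$sums_a"
    list_sums "$2" "$3" > "$sums_b"
    # diff marks differing lines with '<' or '>'; field 3 is the path.
    diff "$sums_a" "$sums_b" | awk '/^[<>]/ { print $3 }' | sort -u
    rm -f "$sums_a" "$sums_b"
}

# Usage (hypothetical mount points):
#   changed_files /mnt/good /mnt/bad etc/cloud
```

Because the paths are relative to each root, the same file lines up in both listings regardless of where the images are mounted.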

I fail to see why it didn't run so far :-/

Dan Watkins (oddbloke) wrote :

The testing we perform before a daily makes it out into the world includes some basic cloud-init validation (basically "touch /some/file" in user-data, and checking that this happened after boot), so it isn't failing in all cases.

Diffing the two manifests gives a pretty substantial set of changes:

new: {'python3-debconf': '1.5.65', 'libnss-systemd:amd64': '235-3ubuntu2', 'linux-headers-4.13.0-17-generic': '4.13.0-17.20', 'libharfbuzz0b:amd64': '1.7.2-1', 'libicu-le-hb0:amd64': '1.0.3+git161113-4', 'libntfs-3g88': '1:2017.3.23-2', 'linux-headers-4.13.0-17': '4.13.0-17.20', 'libgraphite2-3:amd64': '1.3.10-8'}
removed: {'linux-headers-4.13.0-16-generic': '4.13.0-16.19', 'libntfs-3g872': '1:2016.2.22AR.2-2', 'linux-headers-4.13.0-16': '4.13.0-16.19'}
changed: ['apport', 'bsdutils', 'busybox-initramfs', 'busybox-static', 'byobu', 'cloud-init', 'cpio', 'debconf', 'debconf-i18n', 'fdisk', 'gcc-7-base:amd64', 'gdisk', 'grub-common', 'grub-legacy-ec2', 'grub-pc', 'grub-pc-bin', 'grub2-common', 'iproute2', 'less', 'libassuan0:amd64', 'libblkid1:amd64', 'libcap2-bin', 'libcap2:amd64', 'libexpat1:amd64', 'libfdisk1:amd64', 'libgcc1:amd64', 'libicu60:amd64', 'libmount1:amd64', 'libpam-cap:amd64', 'libpam-systemd:amd64', 'libpcre3:amd64', 'libperl5.26:amd64', 'libpsl5:amd64', 'libpython3.6-minimal:amd64', 'libpython3.6-stdlib:amd64', 'libpython3.6:amd64', 'libsmartcols1:amd64', 'libssl1.0.0:amd64', 'libstdc++6:amd64', 'libsystemd0:amd64', 'libudev1:amd64', 'libuuid1:amd64', 'linux-headers-generic', 'linux-headers-virtual', 'linux-image-4.13.0-17-generic', 'linux-image-virtual', 'linux-virtual', 'man-db', 'mount', 'nano', 'netcat-openbsd', 'ntfs-3g', 'openssl', 'perl', 'perl-base', 'perl-modules-5.26', 'python3-apport', 'python3-gi', 'python3-problem-report', 'python3.6', 'python3.6-minimal', 'snapd', 'sosreport', 'sudo', 'systemd', 'systemd-sysv', 'udev', 'util-linux', 'uuid-runtime', 'xauth']

Could you try bisecting with the interim serials to see if we can get a slightly smaller list of packages to consider?
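
For bisecting, a manifest comparison of this kind can be sketched as follows. It assumes each manifest is a text file with one "package version" pair per line separated by whitespace; the function name `manifest_diff` and the file names in the usage comment are hypothetical:

```shell
#!/bin/sh
# manifest_diff OLD NEW: classify every package as new, removed, or changed.
# Assumes OLD is non-empty and each line is "<package> <version>".
manifest_diff() {
    awk '
        # First file: remember the old version of every package.
        NR == FNR { old[$1] = $2; next }
        # Second file: compare against what the old manifest had.
        {
            seen[$1] = 1
            if (!($1 in old))
                print "new", $1, $2
            else if (old[$1] != $2)
                print "changed", $1, old[$1], "->", $2
        }
        # Anything never seen in the second file was removed.
        END {
            for (p in old)
                if (!(p in seen))
                    print "removed", p, old[p]
        }
    ' "$1" "$2" | sort
}

# Usage (hypothetical file names):
#   manifest_diff 20171129.1.manifest 20171211.manifest
```

Running it between each pair of adjacent serials narrows the list of packages to consider.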

Christian Ehrhardt  (paelzer) wrote :

smoser was able to reproduce this (without uvt, btw) and found images up to 20171208 working.
The manifest diff then is much smaller.

Essentially:
+cloud-init 17.1-53-ga5dc0f42-0ubuntu1
+grub-legacy-ec2 17.1-53-ga5dc0f42-0ubuntu1
+libassuan0:amd64 2.5.1-1

Of those only the first seems related.

See: http://paste.ubuntu.com/26170495/

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in cloud-init (Ubuntu):
status: New → Confirmed
Scott Moser (smoser) wrote :

The fix for bug 1731868 left an interim change in ds-identify that references an undeclared variable.

That code path would be hit if you attached an ISO filesystem to a device other than a cdrom.
For example:

qemu-system-x86_64 -enable-kvm \
  -device virtio-net-pci,netdev=net00 \
  -netdev type=user,id=net00 \
  -drive file=disk.img,id=disk00,if=none,format=qcow2,index=0 \
  -device virtio-blk,drive=disk00,serial=disk.img \
  -drive file=my-seed.img,id=disk01,if=none,format=raw,index=1 \
  -device virtio-blk,drive=disk01,serial=my-seed.img \
  -m 768 -nographic

It would work fine if you attached the iso filesystem as a cdrom.
(-cdrom my-seed.img)
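
To check from inside a guest whether it is in the affected configuration, one could list iso9660 filesystems that sit on something other than a cdrom device. This is a rough sketch: treating kernel names that start with "sr" as cdroms is an assumption made here for illustration and is not necessarily how ds-identify decides; the function name `non_cdrom_iso` is hypothetical:

```shell
#!/bin/sh
# non_cdrom_iso: read "NAME FSTYPE" lines on stdin and print the names of
# devices that carry an iso9660 filesystem but are not named like a cdrom.
non_cdrom_iso() {
    while read -r name fstype; do
        [ "$fstype" = "iso9660" ] || continue
        case "$name" in
            sr[0-9]*) ;;            # assumed to be a cdrom: not affected
            *) echo "$name" ;;      # e.g. a virtio-blk disk such as vdb
        esac
    done
}

# Usage inside a guest:
#   lsblk -rno NAME,FSTYPE | non_cdrom_iso
```

With the qemu command line above, the seed image would show up as a second virtio block device rather than as a cdrom.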

no longer affects: cloud-images
Changed in cloud-init:
status: New → In Progress
Changed in cloud-init (Ubuntu):
status: Confirmed → In Progress
Changed in cloud-init:
importance: Undecided → High
Changed in cloud-init (Ubuntu):
importance: Undecided → High
Changed in cloud-init:
assignee: nobody → Scott Moser (smoser)
Changed in cloud-init (Ubuntu):
assignee: nobody → Scott Moser (smoser)
Scott Moser (smoser)
summary: - Cloud-init seems not run on today's bionic images (20171211)
+ Cloud-init fails if iso9660 filesystem on non-cdrom path in 20171211
+ image.
Scott Moser (smoser) wrote :

I have a fix:
 http://paste.ubuntu.com/26170810/
When Launchpad git is back, I will push it and get it uploaded.

Chad Smith (chad.smith)
description: updated
description: updated
Chad Smith (chad.smith)
Changed in cloud-init:
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 17.1-60-ga30a3bb5-0ubuntu1

---------------
cloud-init (17.1-60-ga30a3bb5-0ubuntu1) bionic; urgency=medium

  * New upstream snapshot.
    - ds-identify: failure in NoCloud due to unset variable usage.
      (LP: #1737704)
    - tests: fix collect_console when not implemented [Joshua Powers]

 -- Chad Smith <email address hidden> Tue, 12 Dec 2017 12:03:08 -0700

Changed in cloud-init (Ubuntu):
status: In Progress → Fix Released
Scott Moser (smoser) wrote : Fixed in Cloud-init 1705804

This bug is believed to be fixed in cloud-init in 1705804. If this is still a problem for you, please make a comment and set the state back to New

Thank you.

Changed in cloud-init:
status: Fix Committed → Fix Released