Current kinetic ISO images not installable on s390x

Bug #1986551 reported by Frank Heimes
This bug affects 1 person
Affects                  Status        Importance  Assigned to                 Milestone
Ubuntu on IBM z Systems  Fix Released  High        Canonical Foundations Team
cloud-init               Fix Released  High        James Falcon
subiquity                Invalid       Undecided   Unassigned

Bug Description

While trying to install a kinetic/22.10 system on s390x for testing new and updated packages,
I found that the current daily ISO image for s390x is not installable: neither on LPAR nor on z/VM, neither interactively using subiquity nor non-interactively using autoinstall.

I had the image from August 2nd and the installation ended at the console with these messages (please ignore the weird special characters):
...
Ý Ý0;32m OK Ý0m¨ Started Ý0;1;39mTime & Date Service Ý0m.
connecting... - \ |
waiting for cloud-init... -

It is possible to connect to the installer over the network, which
might allow the use of a more capable terminal and can offer more languages
than can be rendered in the Linux console.

Unfortunately this system seems to have no global IP addresses at this
time.

         Starting Ý0;1;39mTime & Date Service Ý0m...
Ý Ý0;32m OK Ý0m¨ Started Ý0;1;39mTime & Date Service Ý0m.
Ý Ý0;32m OK Ý0m¨ Finished Ý0;1;39mWait until snapd is fully seeded Ý0m.
         Starting Ý0;1;39mApply the settings specified in cloud-config Ý0m...
Ý Ý0;32m OK Ý0m¨ Started Ý0;1;39mSubiquity, the installer for Ubuntu Server
hvc0 Ý0m.
Ý Ý0;32m OK Ý0m¨ Started Ý0;1;39mSubiquity, the ins er for Ubuntu Server t
tysclp0 Ý0m.
Ý Ý0;32m OK Ý0m¨ Reached target Ý0;1;39mLogin Prompts Ý0m.
         Stopping Ý0;1;39mOpenBSD Secure Shell server Ý0m...
Ý Ý0;32m OK Ý0m¨ Stopped Ý0;1;39mOpenBSD Secure Shell server Ý0m.
         Starting Ý0;1;39mOpenBSD Secure Shell server Ý0m...
Ý Ý0;32m OK Ý0m¨ Started Ý0;1;39mOpenBSD Secure Shell server Ý0m.
Ý Ý0;32m OK Ý0m¨ Finished Ý0;1;39mApply the settings specified in cloud-con
ig Ý0m.
Ý Ý0;32m OK Ý0m¨ Reached target Ý0;1;39mMulti-User System Ý0m.
Ý Ý0;32m OK Ý0m¨ Reached target Ý0;1;39mGraphical Interface Ý0m.
         Starting Ý0;1;39mExecute cloud user/final scripts Ý0m...
         Starting Ý0;1;39mRecord Runlevel Change in UTMP Ý0m...
Ý Ý0;32m OK Ý0m¨ Finished Ý0;1;39mRecord Runlevel Change in UTMP Ý0m.
Ý Ý0;32m OK Ý0m¨ Finished Ý0;1;39mExecute cloud user/final scripts Ý0m.
Ý Ý0;32m OK Ý0m¨ Reached target Ý0;1;39mCloud-init target Ý0m.
...

I then updated to the latest ISO from today (Aug 15th) and got the same:
...
Ý Ý0;32m OK Ý0m¨ Finished Ý0;1;39mHolds Snappy daemon refresh Ý0m.
Ý Ý0;32m OK Ý0m¨ Finished Ý0;1;39mService for snap application lxd.activate
Ý0m.
Ý Ý0;32m OK Ý0m¨ Started Ý0;1;39msnap.lxd.hook.conf -4b29-8a88-87b80c6b731
8.scope Ý0m.
Ý Ý0;32m OK Ý0m¨ Started Ý0;1;39msnap.subiquity.hoo -4a63-9355-e4654a5890c
1.scope Ý0m.
Ý Ý0;32m OK Ý0m¨ Started Ý0;1;39mService for snap a on subiquity.subiquity
-server Ý0m.
Ý Ý0;32m OK Ý0m¨ Started Ý0;1;39mService for snap a n subiquity.subiquity-
service Ý0m.
         Starting Ý0;1;39mTime & Date Service Ý0m...
Ý Ý0;32m OK Ý0m¨ Started Ý0;1;39mTime & Date Service Ý0m.
connecting... - \ |
waiting for cloud-init... - \

It is possible to connect to the installer over the network, which
might allow the use of a more capable terminal and can offer more languages
than can be rendered in the Linux console.

Unfortunately this system seems to have no global IP addresses at this
time.
...

Unfortunately I am not able to get any logs at that (very early) stage of the installation.

In addition, I did a 22.04.1 installation on the same systems, using the same data (IP etc.), which worked fine.

(I kept one of the systems in that stage for now ...)

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
importance: Undecided → High
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Frank Heimes (fheimes) wrote :

After brainstorming a bit with Michael on this, I have some updates:

If the installer is at the above stage (using the interactive way, <casper>, to specify the basic network config data), no login credentials are specified (as shown above),
but an ssh server is up and listening:

z/VM:
$ ssh installer@hwe0003
The authenticity of host 'hwe0003 (10.123.234.23)' can't be established.
ECDSA key fingerprint is SHA256:t/LLOGMeNBD623Bac371TmcHBHD+lthnY.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'hwe0003,10.123.234.23' (ECDSA) to the list of known hosts.
installer@hwe0003's password:

LPAR:
$ ssh ubuntu@10.245.236.14
The authenticity of host '10.123.234.14 (10.123.234.14)' can't be established.
ECDSA key fingerprint is SHA256:41LjqXjab4D9rvexGhFo7+OvBx5o.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '10.123.234.14' (ECDSA) to the list of known hosts.
ubuntu@10.123.234.14's password:
Permission denied, please try again.
ubuntu@10.123.234.14's password:

There are just no credentials to log in with.

But on LPAR, there is the additional option to do an installation based on the "Integrated ASCII console" (instead of using a remote ssh session).
The subiquity installer shows up at the "Integrated ASCII console" and allows entering the installer shell.
'ip a' shows a proper IP address.
So the "Integrated ASCII console" can be used for further debugging ...

Frank Heimes (fheimes) wrote :

...and here are the logs from that LPAR (copied to a different system from within the "Integrated ASCII console").

Frank Heimes (fheimes)
tags: added: installer rls-kk-incoming
removed: subiquity
Dan Bungert (dbungert) wrote :

Cloud-init folks, may I get your opinion on this one?

2022-08-16 10:16:29,345 - util.py[WARNING]: failed stage init-local
failed run of stage init-local
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 745, in status_wrapper
    ret = functor(name, args)
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 411, in main_init
    init.apply_network_config(bring_up=bring_up_interfaces)
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 937, in apply_network_config
    return self.distro.apply_network_config(
  File "/usr/lib/python3/dist-packages/cloudinit/distros/__init__.py", line 244, in apply_network_config
    network_state = parse_net_config_data(netconfig)
  File "/usr/lib/python3/dist-packages/cloudinit/net/network_state.py", line 1062, in parse_net_config_data
    nsi.parse_config(skip_broken=skip_broken)
  File "/usr/lib/python3/dist-packages/cloudinit/net/network_state.py", line 280, in parse_config
    self.parse_config_v2(skip_broken=skip_broken)
  File "/usr/lib/python3/dist-packages/cloudinit/net/network_state.py", line 330, in parse_config_v2
    self._v2_common(command)
  File "/usr/lib/python3/dist-packages/cloudinit/net/network_state.py", line 785, in _v2_common
    real_if_name = find_interface_name_from_mac(mac_address)
  File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 933, in find_interface_name_from_mac
    if mac.lower() == interface_mac.lower():
AttributeError: 'NoneType' object has no attribute 'lower'
------------------------------------------------------------
Cloud-init v. 22.2-115-g6e498773-0ubuntu1~22.10.1 running 'init' at Tue,...


Chad Smith (chad.smith) wrote :

OK, in /var/log/cloud-init.log we can see that the network config passed to cloud-init is version: 2 (which cloud-init should ultimately write directly to /etc/netplan/50-cloud-init.yaml).

From cloud-init.log:
2022-08-16 10:16:29,328 - stages.py[INFO]: Applying network configuration from system_cfg bringup=False: {'version': 2, 'ethernets': {'encc000': {}, 'zz-all-en': {'match': {'name': 'en*'}, 'dhcp4': True}, 'zz-all-eth': {'match': {'name': 'eth*'}, 'dhcp4': True}}, 'vlans': {'encc000.2653': {'id': 2653, 'link': 'encc000', 'addresses': ['10.245.236.14/24'], 'gateway4': '10.245.236.1', 'nameservers': {'addresses': ['10.245.236.1']}}}}
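
For reference, the config dict can also be pulled out of such a log line programmatically. This is only a minimal sketch: it assumes the line format shown above (the dict is the first `{` on the line, printed as a Python literal), and `extract_net_config` is a hypothetical helper, not part of cloud-init.

```python
import ast

# Log line copied from the cloud-init.log excerpt above.
LOG_LINE = (
    "2022-08-16 10:16:29,328 - stages.py[INFO]: Applying network configuration "
    "from system_cfg bringup=False: {'version': 2, 'ethernets': {'encc000': {}, "
    "'zz-all-en': {'match': {'name': 'en*'}, 'dhcp4': True}, "
    "'zz-all-eth': {'match': {'name': 'eth*'}, 'dhcp4': True}}, "
    "'vlans': {'encc000.2653': {'id': 2653, 'link': 'encc000', "
    "'addresses': ['10.245.236.14/24'], 'gateway4': '10.245.236.1', "
    "'nameservers': {'addresses': ['10.245.236.1']}}}}"
)


def extract_net_config(line):
    """Return the dict literal that starts at the first '{' of the log line."""
    start = line.index("{")
    # literal_eval only accepts Python literals, so nothing in the line is executed.
    return ast.literal_eval(line[start:])


config = extract_net_config(LOG_LINE)
print(sorted(config))           # ['ethernets', 'version', 'vlans']
print(sorted(config["vlans"]))  # ['encc000.2653']
```
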

We can quickly reproduce any specific network rendering tracebacks in cloud-init with the following devel tool (that is included in the installed cloud-init deb):

1. write out the network config provided to cloud-init to a yaml file
$ cat > input-net.yaml <<EOF
{'version': 2, 'ethernets': {'encc000': {}, 'zz-all-en': {'match': {'name': 'en*'}, 'dhcp4': True}, 'zz-all-eth': {'match': {'name': 'eth*'}, 'dhcp4': True}}, 'vlans': {'encc000.2653': {'id': 2653, 'link': 'encc000', 'addresses': ['10.245.236.14/24'], 'gateway4': '10.245.236.1', 'nameservers': {'addresses': ['10.245.236.1']}}}}
EOF

2. try to render that network config (works for version: 1 or version: 2 network input config)
$ cloud-init devel net-convert -p input-net.yaml -k yaml -d out -D ubuntu -O netplan
Traceback (most recent call last):
  File "/usr/bin/cloud-init", line 33, in <module>
    sys.exit(load_entry_point('cloud-init==22.3', 'console_scripts', 'cloud-init')())
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 1088, in main
    retval = util.log_time(
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2621, in log_time
    ret = func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/devel/net_convert.py", line 136, in handle_args
    ns = network_state.parse_net_config_data(pre_ns)
  File "/usr/lib/python3/dist-packages/cloudinit/net/network_state.py", line 1062, in parse_net_config_data
    nsi.parse_config(skip_broken=skip_broken)
  File "/usr/lib/python3/dist-packages/cloudinit/net/network_state.py", line 280, in parse_config
    self.parse_config_v2(skip_broken=skip_broken)
  File "/usr/lib/python3/dist-packages/cloudinit/net/network_state.py", line 330, in parse_config_v2
    self._v2_common(command)
  File "/usr/lib/python3/dist-packages/cloudinit/net/network_state.py", line 785, in _v2_common
    real_if_name = find_interface_name_from_mac(mac_address)
  File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 933, in find_interface_name_from_mac
    if mac.lower() == interface_mac.lower():
AttributeError: 'NoneType' object has no attribute 'lower'

## The failure here seems to be that we have a nameserver clause that doesn't contain a corresponding match: {"macaddress": "f6:4c:f0:f1:ea:27"} on the "encc000.2653" device configuration.

Instead of providing this config:
{'version': 2, 'ethernets': {'encc000': {}, 'zz-all-en': {'match': {'name': 'en*'}, 'dhcp4': True}, 'zz-all-eth': {'match': {'name': 'eth*'}, 'dhcp4': True}}, 'vlans': {'encc000.2653': {'id': 265...


Michael Hudson-Doyle (mwhudson) wrote :

Er, I think you're right that the cloud-init crash is caused by the lack of macaddress in some of the ethernets: entries, but this is a cloud-init bug, not a problem with the input, which is afaict perfectly valid netplan. An ethernets that doesn't have a match matches by name, and even if it does have a match, matching by things other than mac is fine. And vlan: entries never have a match: https://netplan.io/reference/#device-configuration-ids:~:text=Thus%20match%3A%20and%20set%2Dname%3A%20are%20not%20applicable%20for%0Athese%2C%20and%20the%20ID%20field%20is%20the%20name%20of%20the%20created%20virtual%20device

I think cloud-init master now handles this kind of "v2 to v2" "conversion" by just dumping it out to disk, which sounds fine. But if you need to convert it to some other kind of format you'll need to fix this.
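
To illustrate the failure mode in the traceback: the comparison calls `.lower()` on a MAC address that is `None` because the entry has no `match:` with a `macaddress`. The following is a sketch under stated assumptions, not cloud-init's actual code: both functions and the MAC table are hypothetical and only mimic the comparison seen in the traceback.

```python
from typing import Dict, Optional

# Illustrative MAC-to-name table; the values are made up for this sketch.
INTERFACES_BY_MAC: Dict[str, str] = {"02:00:00:00:00:01": "encc000"}


def find_name_unguarded(mac, table=INTERFACES_BY_MAC):
    """Mimics the failing comparison: raises AttributeError when mac is None."""
    for interface_mac, name in table.items():
        if mac.lower() == interface_mac.lower():
            return name
    return None


def find_name_guarded(mac: Optional[str], table=INTERFACES_BY_MAC) -> Optional[str]:
    """Treats a missing MAC as 'nothing to look up' instead of crashing."""
    if mac is None:
        return None
    for interface_mac, name in table.items():
        if mac.lower() == interface_mac.lower():
            return name
    return None


entry = {}  # like 'encc000': {} in the config: no match:, hence no macaddress
mac = entry.get("match", {}).get("macaddress")  # evaluates to None

try:
    find_name_unguarded(mac)
except AttributeError as err:
    print("unguarded:", err)

print("guarded:", find_name_guarded(mac))
```

With a guard like this, an entry without a MAC simply falls back to being identified by its name, which is what the netplan semantics described above expect.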

Frank Heimes (fheimes) wrote :

For comparison, and just from a user's point of view, this is what I get with a jammy installation (for the same system, having specified the same network data in the early network config steps):

$ cat /etc/netplan/00-installer-config.yaml
# This is the network config written by 'subiquity'
network:
  ethernets:
    encc000: {}
  version: 2
  vlans:
    encc000.2653:
      addresses:
      - 10.245.236.14/24
      gateway4: 10.245.236.1
      id: 2653
      link: encc000
      nameservers:
        addresses:
        - 10.245.236.1

Which worked perfectly fine so far (no mac addresses at all).
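
The "no MAC addresses at all" observation can be checked mechanically. A minimal sketch, assuming the YAML above is transcribed by hand into a Python dict (to avoid a YAML parser dependency); `devices_with_mac_match` is a hypothetical helper introduced only for this illustration.

```python
# The netplan YAML above, transcribed as a Python dict.
CONFIG = {
    "network": {
        "ethernets": {"encc000": {}},
        "version": 2,
        "vlans": {
            "encc000.2653": {
                "addresses": ["10.245.236.14/24"],
                "gateway4": "10.245.236.1",
                "id": 2653,
                "link": "encc000",
                "nameservers": {"addresses": ["10.245.236.1"]},
            }
        },
    }
}


def devices_with_mac_match(config):
    """Return the IDs of all devices that carry a match: macaddress entry."""
    found = []
    for section, devices in config["network"].items():
        if not isinstance(devices, dict):
            continue  # skip scalars such as version: 2
        for dev_id, dev_cfg in devices.items():
            if "macaddress" in dev_cfg.get("match", {}):
                found.append(dev_id)
    return found


print(devices_with_mac_match(CONFIG))  # []
```
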

Chad Smith (chad.smith) wrote :

mwhudson and fheimes, thanks for this; you are both correct. This bug is a shortcoming in cloud-init's internal network_state rendering, which cannot "handle" full netplan version: 2 while rendering internal state.

I was reminded in our standup that a separate bug that was being worked on will resolve this issue. cloud-init shouldn't really be doing anything with network version: 2 on a system that already has netplan as the network renderer. This was just fixed on Friday due to a related bug
LP: #1986551 and the upstream patch https://github.com/canonical/cloud-init/commit/f1d901c9b21fcf1073d663f4190badce662ff3da.

This fix will be in the next upload and release of cloud-init.

Chad Smith (chad.smith) wrote :

fheimes, just for future reference, I captured the netplan-rendered systemd files from your working netplan yaml:

root@pkg-dev:~# for file in /run/systemd/network/10-netplan-*; do echo $file; cat $file; done
/run/systemd/network/10-netplan-encc000.2653.netdev
[NetDev]
Name=encc000.2653
Kind=vlan

[VLAN]
Id=2653
/run/systemd/network/10-netplan-encc000.2653.network
[Match]
Name=encc000.2653

[Network]
LinkLocalAddressing=ipv6
Address=10.245.236.14/24
Gateway=10.245.236.1
DNS=10.245.236.1
ConfigureWithoutCarrier=yes
/run/systemd/network/10-netplan-encc000.network
[Match]
Name=encc000

[Network]
LinkLocalAddressing=ipv6
VLAN=encc000.2653

This network_state bug above is avoided on Ubuntu by the recent commit to upstream in cloud-init passing network version: 2 directly as passthrough config to the netplan renderer instead of converting it to the internal cloud-init network_state. This bug will still exist on other distros which do not have netplan installed.
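
The passthrough idea described above can be sketched as follows. This is not the upstream implementation: `choose_path` and `render_for_netplan` are hypothetical names, and JSON output is used only to keep the sketch free of a YAML dependency (netplan itself reads YAML).

```python
import json


def choose_path(config, renderer):
    """Return 'passthrough' for v2 config on a netplan renderer, else 'convert'."""
    if config.get("version") == 2 and renderer == "netplan":
        return "passthrough"
    return "convert"


def render_for_netplan(config):
    """Passthrough: emit the v2 config unchanged under a top-level 'network' key.

    No per-device parsing happens here, so entries without a MAC address
    never reach a MAC lookup.
    """
    return json.dumps({"network": config}, indent=2, sort_keys=True)


config = {
    "version": 2,
    "ethernets": {"encc000": {}},
    "vlans": {"encc000.2653": {"id": 2653, "link": "encc000"}},
}

print(choose_path(config, "netplan"))    # passthrough
print(choose_path(config, "sysconfig"))  # convert
print(render_for_netplan(config))
```

The "convert" branch is where a distro without netplan would still go through the internal network_state, which is why the underlying bug remains relevant there.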

Chad Smith (chad.smith) wrote :
Changed in cloud-init:
status: New → Fix Committed
assignee: nobody → James Falcon (falcojr)
importance: Undecided → High
Chad Smith (chad.smith) wrote :

Fix released into Ubuntu Kinetic 22.10 as cloud-init version 22.3-13-g70ce6442-0ubuntu1~22.10.1

Changed in cloud-init:
status: Fix Committed → Fix Released
Michael Hudson-Doyle (mwhudson) wrote :

Thanks for fixing this so quickly!

Chad Smith (chad.smith) wrote :

Thanks for the bug triage :) It also happened to align with an SRU we wanted to process and upload so we figured "fix all the things"

Frank Heimes (fheimes) wrote :

Many thanks all.
I guess I need to wait another day or two until it has landed in the kinetic ISO before I can give it a try ...?!

Changed in ubuntu-z-systems:
status: New → Fix Committed
Changed in subiquity:
status: New → Invalid
Michael Hudson-Doyle (mwhudson) wrote :

It looks like the ISO currently in pending has the new version now.

Frank Heimes (fheimes) wrote :

I can confirm that this issue is fixed with the latest 'pending' ISO from today (Sept 1st):
https://cdimage.ubuntu.com/ubuntu-server/daily-live/pending/kinetic-live-server-s390x.iso
Many thx!

I tried that on two systems and was able to reach subiquity, i.e. to complete the initial basic network configuration.

I was able to complete the entire installation on one of the systems, but faced another (independent) problem on the 2nd one - will open a separate bug for that (LP#1988407).

But this bug can be closed now as Fix Released.

Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released