P8 node entei unable to boot with 4.15.0-141.145~16.04.1

Bug #1922997 reported by Po-Hsu Lin
This bug affects 1 person
Affects                           Status        Importance  Assigned to  Milestone
The Ubuntu-power-systems project  Fix Released  Undecided   Unassigned
ubuntu-kernel-tests               Fix Released  Undecided   Unassigned
linux-hwe (Ubuntu)                Invalid       Undecided   Unassigned
  Xenial                          Fix Released  Undecided   Unassigned

Bug Description

[Impact]
Enabling CONFIG_MODVERSIONS on xenial/linux-hwe (via the rebase on bionic/linux, see bug 1898716) causes the kernel to fail to boot on ppc64el: modules cannot be loaded because their recorded version of the module_layout symbol does not match the kernel's.

[ 7.635173] raid0: disagrees about version of symbol module_layout
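
To see which modules are affected in a given boot log, the mismatch lines can be collected with a short script. The following is a minimal sketch, not part of the original report: the function name `modules_with_mismatch` and the regular expression are illustrative assumptions, based only on the message format shown above.

```python
import re

# Matches kernel log fragments such as:
#   [ 7.635173] raid0: disagrees about version of symbol module_layout
# The pattern is an assumption derived from the lines quoted in this report.
MISMATCH_RE = re.compile(
    r"\]\s*(?P<module>[\w-]+): disagrees about version of symbol (?P<symbol>\w+)"
)


def modules_with_mismatch(log_text):
    """Return {module: set of symbols} for every version-mismatch message."""
    found = {}
    for match in MISMATCH_RE.finditer(log_text):
        found.setdefault(match.group("module"), set()).add(match.group("symbol"))
    return found


sample = """\
[ 9.563800] hid: disagrees about version of symbol module_layout
[ 9.563949] hid: disagrees about version of symbol module_layout
[ 9.692066] libcrc32c: disagrees about version of symbol module_layout
[ 12.593593] raid10: disagrees about version of symbol module_layout
"""

for module, symbols in sorted(modules_with_mismatch(sample).items()):
    print(module, sorted(symbols))
```

Because the pattern is searched anywhere in the text, it also picks up messages that share a line with initramfs output, such as the raid6_pq entry in the original description.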

[Fix]
The proposed fix is to disable CONFIG_MODVERSIONS and unset CONFIG_SYSTEM_TRUSTED_KEYS via the 'local-mangle' script, which is called by the 'open' script after the rebase. Changes made directly to the config or annotations files do not persist, because these files are synced from master.

This is a temporary fix until the root cause can be found.
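
For illustration only, the effect of such a change on a .config-style file might look like the sketch below. This is not the 'local-mangle' script itself, whose contents are not shown in this report. The function name `mangle_config`, the sample certificate file name, and the reading of "unset" as "set to an empty string" are all assumptions.

```python
def mangle_config(config_text):
    """Return config_text with CONFIG_MODVERSIONS disabled and
    CONFIG_SYSTEM_TRUSTED_KEYS set to an empty string.

    Assumption: "unset" for a string option means an empty value.
    """
    out = []
    for line in config_text.splitlines():
        if line.startswith("CONFIG_MODVERSIONS="):
            # Kconfig convention for a disabled boolean option.
            out.append("# CONFIG_MODVERSIONS is not set")
        elif line.startswith("CONFIG_SYSTEM_TRUSTED_KEYS="):
            out.append('CONFIG_SYSTEM_TRUSTED_KEYS=""')
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical input; "example-certs.pem" is a placeholder, not a real path.
sample = (
    "CONFIG_MODULES=y\n"
    "CONFIG_MODVERSIONS=y\n"
    'CONFIG_SYSTEM_TRUSTED_KEYS="example-certs.pem"\n'
)

print(mangle_config(sample), end="")
```

Applying the change in a script that runs after every rebase, rather than editing the synced files by hand, is what keeps it from being overwritten.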

[Test case]
Boot the kernel on a xenial ppc64el system.
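
As a complementary check before rebooting, the installed kernel's config can be inspected to confirm the option is off. This is a rough sketch under assumptions: the function name `modversions_enabled` is hypothetical, and the config is assumed to be available as text (on Ubuntu it is normally installed as /boot/config-<kernel release>).

```python
import re


def modversions_enabled(config_text):
    """Return True if config_text has an active CONFIG_MODVERSIONS=y line."""
    return re.search(r"^CONFIG_MODVERSIONS=y$", config_text, re.MULTILINE) is not None


# Inline samples standing in for real config files.
enabled_sample = "CONFIG_MODULES=y\nCONFIG_MODVERSIONS=y\n"
disabled_sample = "CONFIG_MODULES=y\n# CONFIG_MODVERSIONS is not set\n"

print(modversions_enabled(enabled_sample))
print(modversions_enabled(disabled_sample))
```

A config check does not replace the boot test; it only confirms that the packaging change reached the built kernel.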

[Regression potential]
This config option has been enabled on Bionic to support rebuilding the lrm modules without the need to rebuild the kernel. There are no lrm modules in Xenial, so it should be safe to keep it disabled.

[Original Description]

Tested manually: this node boots with 4.15.0-140-generic.

However, with 4.15.0-141.145~16.04.1 from -proposed, it drops into the initramfs shell:

[ 9.547985] usb 1-3.4: Manufacturer: American Megatrends Inc.
[ 9.563800] hid: disagrees about version of symbol module_layout
[ 9.563949] hid: disagrees about version of symbol module_layout
[ 9.692066] libcrc32c: disagrees about version of symbol module_layout
[ 12.593593] raid10: disagrees about version of symbol module_layout
done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... [ 12.963251] raid6_pq: disagrees about version of symbol module_layout
done.
Begin: Waiting for root file system ... Begin: Running /scripts/local-block ... mdadm: CREATE group disk not found
mdadm: No devices listed in conf file were found.
done.
Begin: Running /scripts/local-block ... mdadm: CREATE group disk not found
mdadm: No devices listed in conf file were found.
done.
(this mdadm message repeats)
Gave up waiting for root device. Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
   - Check root= (did the system wait for the right device?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT! UUID=348b5e78-915d-47b0-93db-3eca0d8f048e does not exist. Dropping to a shell!
[ 204.831089] hid: disagrees about version of symbol module_layout

BusyBox v1.22.1 (Ubuntu 1:1.22.0-15ubuntu1.4) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs)

Please find the boot log attached.

Po-Hsu Lin (cypressyew) wrote :
description: updated
Po-Hsu Lin (cypressyew) wrote :

boot dmesg for 4.15.0-140-generic #144~16.04.1-Ubuntu

Changed in linux-hwe (Ubuntu Xenial):
status: New → Confirmed
Changed in linux-hwe (Ubuntu):
status: New → Invalid
description: updated
description: updated
Kleber Sacilotto de Souza (kleber-souza) wrote :
Changed in linux-hwe (Ubuntu Xenial):
status: Confirmed → In Progress
description: updated
Kleber Sacilotto de Souza (kleber-souza) wrote :
Changed in linux-hwe (Ubuntu Xenial):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-hwe - 4.15.0-142.146~16.04.1

---------------
linux-hwe (4.15.0-142.146~16.04.1) xenial; urgency=medium

  * P8 node entei unable to boot with 4.15.0-141.145~16.04.1 (LP: #1922997)
    - [Packaging] HWE: disable CONFIG_MODVERSIONS

  [ Ubuntu: 4.15.0-142.146 ]

  * overlayfs calls vfs_setxattr without cap_convert_nscap
    - vfs: move cap_convert_nscap() call into vfs_setxattr()
  * CVE-2021-29154
    - SAUCE: bpf, x86: Validate computation of branch displacements for x86-64

linux-hwe (4.15.0-141.145~16.04.1) xenial; urgency=medium

  * xenial/linux-hwe: 4.15.0-141.145~16.04.1 -proposed tracker (LP: #1919535)

  * Bionic update: upstream stable patchset 2021-02-26 (LP: #1917093)
    - [Config] hwe: Updateconfigs for BDM_PCI

  [ Ubuntu: 4.15.0-141.145 ]

  * bionic/linux: 4.15.0-141.145 -proposed tracker (LP: #1919536)
  * binary assembly failures with CONFIG_MODVERSIONS present (LP: #1919315)
    - [Packaging] quiet (nomially) benign errors in BUILD script
  * selftests: bpf verifier fails after sanitize_ptr_alu fixes (LP: #1920995)
    - bpf: Simplify alu_limit masking for pointer arithmetic
    - bpf: Add sanity check for upper ptr_limit
    - bpf, selftests: Fix up some test_verifier cases for unprivileged
  * Packaging resync (LP: #1786013)
    - update dkms package versions
  * CVE-2018-13095
    - xfs: More robust inode extent count validation
  * i40e PF reset due to incorrect MDD event (LP: #1772675)
    - i40e: change behavior on PF in response to MDD event
  * Bionic update: upstream stable patchset 2021-03-09 (LP: #1918330)
    - ACPI: sysfs: Prefer "compatible" modalias
    - ARM: dts: imx6qdl-gw52xx: fix duplicate regulator naming
    - wext: fix NULL-ptr-dereference with cfg80211's lack of commit()
    - net: usb: qmi_wwan: added support for Thales Cinterion PLSx3 modem family
    - drivers: soc: atmel: Avoid calling at91_soc_init on non AT91 SoCs
    - drivers: soc: atmel: add null entry at the end of at91_soc_allowed_list[]
    - KVM: x86/pmu: Fix HW_REF_CPU_CYCLES event pseudo-encoding in
      intel_arch_events[]
    - KVM: x86: get smi pending status correctly
    - xen: Fix XenStore initialisation for XS_LOCAL
    - leds: trigger: fix potential deadlock with libata
    - mt7601u: fix kernel crash unplugging the device
    - mt7601u: fix rx buffer refcounting
    - xen-blkfront: allow discard-* nodes to be optional
    - ARM: imx: build suspend-imx6.S with arm instruction set
    - netfilter: nft_dynset: add timeout extension to template
    - xfrm: Fix oops in xfrm_replay_advance_bmp
    - RDMA/cxgb4: Fix the reported max_recv_sge value
    - iwlwifi: pcie: use jiffies for memory read spin time limit
    - iwlwifi: pcie: reschedule in long-running memory reads
    - mac80211: pause TX while changing interface type
    - can: dev: prevent potential information leak in can_fill_info()
    - x86/entry/64/compat: Preserve r8-r11 in int $0x80
    - x86/entry/64/compat: Fix "x86/entry/64/compat: Preserve r8-r11 in int $0x80"
    - iommu/vt-d: Gracefully handle DMAR units with no supported address widths
    - iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not built
    - NFC:...

Changed in linux-hwe (Ubuntu Xenial):
status: Fix Committed → Fix Released
Po-Hsu Lin (cypressyew) wrote :

Manually deployed the node and upgraded it to -142; it works as expected.

Changed in ubuntu-kernel-tests:
status: New → Fix Released
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: New → Fix Released
tags: added: sts