[Artful/Bionic] [Config] enable EDAC_GHES for ARM64

Bug #1747746 reported by Manoj Iyer
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
High
Manoj Iyer
Artful
Fix Released
Undecided
Unassigned
Bionic
Fix Released
High
Manoj Iyer

Bug Description

[Impact]
Enable CONFIG_EDAC_GHES option for ARM64, so that the memory errors are processed and reported to the user space.

[Test]
Test kernel is available in the PPA: https://launchpad.net/~centriq-team/+archive/ubuntu/lp1747746
Tested on QDF2400 platform with Artful user-space and Bionic 4.14 kernel. Please see comment #2 for results.

[Regression Potential]
None.

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1747746

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: artful
Manoj Iyer (manjo)
summary: - [Artful/Bionic] [Config] enable EDAC_GHES and PCIE_ECRC for ARM64
+ [Artful/Bionic] [Config] enable EDAC_GHES for ARM64
Manoj Iyer (manjo)
description: updated
Revision history for this message
Manoj Iyer (manjo) wrote :

- artful -
-- before patch --
ubuntu@awrep7:~$ dmesg | grep -E "GHES|EDAC"
[ 3.160507] EDAC MC: Ver: 3.0.0
[ 4.781212] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
ubuntu@awrep7:~$

-- after patch --
ubuntu@awrep7:~$ uname -a
Linux awrep7 4.14.0-17-generic #20~configs+test.2 SMP Tue Feb 6 20:14:17 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux
ubuntu@awrep7:~$
ubuntu@awrep7:~$ dmesg | grep -E "GHES|EDAC"
[ 3.134436] EDAC MC: Ver: 3.0.0
[ 4.778011] ghes_edac: This EDAC driver relies on BIOS to enumerate memory and get error reports.
[ 4.820844] EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 4.830239] EDAC MC1: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 4.839695] EDAC MC2: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 4.849156] EDAC MC3: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 4.858615] EDAC MC4: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 4.868124] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
ubuntu@awrep7:~$
ubuntu@awrep7:~$ sudo ras-mc-ctl --layout
          +-----------------------------------------------------------+
         | mc0 | mc1 | mc2 | mc3 | mc4 |
----------+-----------------------------------------------------------+
memory11: | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB |
memory10: | 16384 MB | 0 MB | 0 MB | 0 MB | 0 MB |
---------+-----------------------------------------------------------+
memory9: | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB |
memory8: | 16384 MB | 0 MB | 0 MB | 0 MB | 0 MB |
---------+-----------------------------------------------------------+
memory7: | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB |
memory6: | 16384 MB | 0 MB | 0 MB | 0 MB | 0 MB |
---------+-----------------------------------------------------------+
memory5: | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB |
memory4: | 16384 MB | 0 MB | 0 MB | 0 MB | 0 MB |
---------+-----------------------------------------------------------+
memory3: | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB |
memory2: | 16384 MB | 0 MB | 0 MB | 0 MB | 0 MB |
---------+-----------------------------------------------------------+
memory1: | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB |
memory0: | 16384 MB | 0 MB | 0 MB | 0 MB | 0 MB |
---------+------------------------------------------------------------+
ubuntu@awrep7:~$

description: updated
Revision history for this message
Manoj Iyer (manjo) wrote :

- Power8-
Where edac_ghes config is not enabled

-- before patch --
ubuntu@entei:~$ dmesg | grep -i -E "EDAC|GHES"
[ 0.770577] EDAC MC: Ver: 3.0.0
ubuntu@entei:~$

-- after patch --
ubuntu@entei:~$ uname -a
Linux entei 4.14.0-17-generic #20~configs+test.2-Ubuntu SMP Tue Feb 6 20:41:16 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
ubuntu@entei:~$ dmesg | grep -i -E "EDAC|GHES"
[ 1.073492] EDAC MC: Ver: 3.0.0
ubuntu@entei:~$

Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (3.3 KiB)

This bug was fixed in the package linux - 4.15.0-10.11

---------------
linux (4.15.0-10.11) bionic; urgency=medium

  * linux: 4.15.0-10.11 -proposed tracker (LP: #1749250)

  * "swiotlb: coherent allocation failed" dmesg spam with linux 4.15.0-9.10
    (LP: #1749202)
    - swiotlb: suppress warning when __GFP_NOWARN is set
    - drm/ttm: specify DMA_ATTR_NO_WARN for huge page pools

  * linux-tools: perf incorrectly linking libbfd (LP: #1748922)
    - SAUCE: tools -- add ability to disable libbfd
    - [Packaging] correct disablement of libbfd

  * [Artful] Realtek ALC225: 2 secs noise when a headset plugged in
    (LP: #1744058)
    - ALSA: hda/realtek - update ALC225 depop optimize

  * [Artful] Support headset mode for DELL WYSE (LP: #1723913)
    - SAUCE: ALSA: hda/realtek - Add support headset mode for DELL WYSE

  * headset mic can't be detected on two Dell machines (LP: #1748807)
    - ALSA: hda/realtek - Support headset mode for ALC215/ALC285/ALC289
    - ALSA: hda - Fix headset mic detection problem for two Dell machines

  * Bionic update to v4.15.3 stable release (LP: #1749191)
    - ip6mr: fix stale iterator
    - net: igmp: add a missing rcu locking section
    - qlcnic: fix deadlock bug
    - qmi_wwan: Add support for Quectel EP06
    - r8169: fix RTL8168EP take too long to complete driver initialization.
    - tcp: release sk_frag.page in tcp_disconnect
    - vhost_net: stop device during reset owner
    - ipv6: addrconf: break critical section in addrconf_verify_rtnl()
    - ipv6: change route cache aging logic
    - Revert "defer call to mem_cgroup_sk_alloc()"
    - net: ipv6: send unsolicited NA after DAD
    - rocker: fix possible null pointer dereference in
      rocker_router_fib_event_work
    - tcp_bbr: fix pacing_gain to always be unity when using lt_bw
    - cls_u32: add missing RCU annotation.
    - ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only
    - soreuseport: fix mem leak in reuseport_add_sock()
    - net_sched: get rid of rcu_barrier() in tcf_block_put_ext()
    - net: sched: fix use-after-free in tcf_block_put_ext
    - media: mtk-vcodec: add missing MODULE_LICENSE/DESCRIPTION
    - media: soc_camera: soc_scale_crop: add missing
      MODULE_DESCRIPTION/AUTHOR/LICENSE
    - media: tegra-cec: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
    - gpio: uniphier: fix mismatch between license text and MODULE_LICENSE
    - crypto: tcrypt - fix S/G table for test_aead_speed()
    - Linux 4.15.3

  * bnx2x_attn_int_deasserted3:4323 MC assert! (LP: #1715519) //
    CVE-2018-1000026
    - net: create skb_gso_validate_mac_len()
    - bnx2x: disable GSO where gso_size is too big for hardware

  * ethtool -p fails to light NIC LED on HiSilicon D05 systems (LP: #1748567)
    - net: hns: add ACPI mode support for ethtool -p

  * CVE-2017-5715 (Spectre v2 Intel)
    - [Packaging] retpoline files must be sorted
    - [Packaging] pull in retpoline files

  * [Feature] PXE boot with Intel Omni-Path (LP: #1712031)
    - d-i: Add hfi1 to nic-modules

  * CVE-2017-5715 (Spectre v2 retpoline)
    - [Packaging] retpoline -- add call site validation
    - [Config] disable retpoline checks for first upload

  * Do not dup...

Read more...

Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Changed in linux (Ubuntu Artful):
status: New → In Progress
Changed in linux (Ubuntu Artful):
status: In Progress → Fix Committed
Revision history for this message
Stefan Bader (smb) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-artful' to 'verification-done-artful'. If the problem still exists, change the tag 'verification-needed-artful' to 'verification-failed-artful'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-artful
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (18.9 KiB)

This bug was fixed in the package linux - 4.13.0-38.43

---------------
linux (4.13.0-38.43) artful; urgency=medium

  * linux: 4.13.0-38.43 -proposed tracker (LP: #1755762)

  * Servers going OOM after updating kernel from 4.10 to 4.13 (LP: #1748408)
    - i40e: Fix memory leak related filter programming status
    - i40e: Add programming descriptors to cleaned_count

  * [SRU] Lenovo E41 Mic mute hotkey is not responding (LP: #1753347)
    - platform/x86: ideapad-laptop: Increase timeout to wait for EC answer

  * fails to dump with latest kpti fixes (LP: #1750021)
    - kdump: write correct address of mem_section into vmcoreinfo

  * headset mic can't be detected on two Dell machines (LP: #1748807)
    - ALSA: hda/realtek - Support headset mode for ALC215/ALC285/ALC289
    - ALSA: hda - Fix headset mic detection problem for two Dell machines
    - ALSA: hda - Fix a wrong FIXUP for alc289 on Dell machines

  * CIFS SMB2/SMB3 does not work for domain based DFS (LP: #1747572)
    - CIFS: make IPC a regular tcon
    - CIFS: use tcon_ipc instead of use_ipc parameter of SMB2_ioctl
    - CIFS: dump IPC tcon in debug proc file

  * i2c-thunderx: erroneous error message "unhandled state: 0" (LP: #1754076)
    - i2c: octeon: Prevent error message on bus error

  * hisi_sas: Add disk LED support (LP: #1752695)
    - scsi: hisi_sas: directly attached disk LED feature for v2 hw

  * EDAC, sb_edac: Backport 1 patch to Ubuntu 17.10 (Fix missing DIMM sysfs
    entries with KNL SNC2/SNC4 mode) (LP: #1743856)
    - EDAC, sb_edac: Fix missing DIMM sysfs entries with KNL SNC2/SNC4 mode

  * [regression] Colour banding and artefacts appear system-wide on an Asus
    Zenbook UX303LA with Intel HD 4400 graphics (LP: #1749420)
    - drm/edid: Add 6 bpc quirk for CPT panel in Asus UX303LA

  * DVB Card with SAA7146 chipset not working (LP: #1742316)
    - vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems

  * [Asus UX360UA] battery status in unity-panel is not changing when battery is
    being charged (LP: #1661876) // AC adapter status not detected on Asus
    ZenBook UX410UAK (LP: #1745032)
    - ACPI / battery: Add quirk for Asus UX360UA and UX410UAK

  * ASUS UX305LA - Battery state not detected correctly (LP: #1482390)
    - ACPI / battery: Add quirk for Asus GL502VSK and UX305LA

  * support thunderx2 vendor pmu events (LP: #1747523)
    - perf pmu: Extract function to get JSON alias map
    - perf pmu: Pass pmu as a parameter to get_cpuid_str()
    - perf tools arm64: Add support for get_cpuid_str function.
    - perf pmu: Add helper function is_pmu_core to detect PMU CORE devices
    - perf vendor events arm64: Add ThunderX2 implementation defined pmu core
      events
    - perf pmu: Add check for valid cpuid in perf_pmu__find_map()

  * lpfc.ko module doesn't work (LP: #1746970)
    - scsi: lpfc: Fix loop mode target discovery

  * Ubuntu 17.10 crashes on vmalloc.c (LP: #1739498)
    - powerpc/mm/book3s64: Make KERN_IO_START a variable
    - powerpc/mm/slb: Move comment next to the code it's referring to
    - powerpc/mm/hash64: Make vmalloc 56T on hash

  * ethtool -p fails to light NIC LED on HiSilicon D05 systems (LP: #1748567)
    - net...

Changed in linux (Ubuntu Artful):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.