arm64 arch_timer fixes

Bug #1713821 reported by dann frazier
Affects          Status         Importance   Assigned to    Milestone
linux (Ubuntu)   Fix Released   Undecided    dann frazier
Zesty            Fix Released   Undecided    dann frazier

Bug Description

[Impact]
This bug captures a few issues with the ARM arch_timer driver:

1) Some arm64 systems have hardware defects in their architected timer implementations that require errata workarounds in the kernel. However, the workaround may not be applied if the timer was reset with the user access bit set.

2) The Juno board fails to initialize a timer at boot:

      arch_timer: Unable to map frame @ 0x0000000000000000
      arch_timer: Frame missing phys irq.
      Failed to initialize '/timer@2a810000': -22

3) Possible boot warning from arch_timer_mem_of_init():
   'Trying to vfree() nonexistent vm area'

4) There's a theoretical problem where the first frame of a timer could be used even though a better-suited timer frame is available.

5) An infinite recursion loop will occur when enabling the function tracer in builds with CONFIG_PREEMPT_TRACER=y. Ubuntu does not enable CONFIG_PREEMPT_TRACER, so this will only be a problem if that changes.
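
The failure messages quoted in items 2 and 3 can be checked for mechanically when triaging a boot log. The following is only a minimal sketch, not part of the actual fix: the function name find_arch_timer_failures and the pattern labels are hypothetical, and the patterns are taken solely from the message text quoted above.

```python
import re

# Failure signatures copied from items 2 and 3 of this report.
# The dictionary keys are hypothetical labels chosen for illustration.
FAILURE_PATTERNS = {
    "unmapped_frame": re.compile(r"arch_timer: Unable to map frame @ 0x[0-9a-fA-F]+"),
    "missing_phys_irq": re.compile(r"arch_timer: Frame missing phys irq\."),
    "init_failed": re.compile(r"Failed to initialize '/timer@[0-9a-fA-F]+': -\d+"),
    "vfree_warning": re.compile(r"Trying to vfree\(\) nonexistent vm area"),
}


def find_arch_timer_failures(log_text):
    """Return the sorted labels of all failure signatures found in log_text."""
    return sorted(
        label
        for label, pattern in FAILURE_PATTERNS.items()
        if pattern.search(log_text)
    )
```

Passing the output of dmesg to this function on an affected Juno board would be expected to report the three signatures from item 2; an empty list means none of the quoted messages were seen.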

[Test Case]
I've regression-tested this on both a system that needs an erratum workaround (HiSilicon D05) and one that does not (Cavium ThunderX CRB1S). In both cases the timer was initialized correctly, which I verified by looking at the boot messages:

dannf@d05-3:~$ dmesg | grep arch_timer
[ 0.000000] arch_timer: Enabling global workaround for HiSilicon erratum 161010101
[ 0.000000] arch_timer: CPU0: Trapping CNTVCT access
[ 0.000000] arch_timer: cp15 timer(s) running at 50.00MHz (phys).
[ 0.194241] arch_timer: CPU1: Trapping CNTVCT access
[ 0.197305] arch_timer: CPU2: Trapping CNTVCT access
<.....>
[ 0.396228] arch_timer: CPU62: Trapping CNTVCT access
[ 0.399752] arch_timer: CPU63: Trapping CNTVCT access

ubuntu@grotrian:~$ dmesg | grep arch_timer
[ 0.000000] arch_timer: cp15 timer(s) running at 100.00MHz (phys).
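
The manual check above can be scripted for machines with many CPUs. This is a hedged sketch that assumes the message formats are exactly as shown in the two dmesg excerpts; the function name summarize_arch_timer and the returned dictionary keys are made up for illustration.

```python
import re

# Patterns derived from the dmesg excerpts in this test case.
WORKAROUND_RE = re.compile(r"arch_timer: Enabling \w+ workaround for (.+)$", re.M)
TRAP_RE = re.compile(r"arch_timer: CPU(\d+): Trapping CNTVCT access")
FREQ_RE = re.compile(r"arch_timer: cp15 timer\(s\) running at ([0-9.]+)MHz")


def summarize_arch_timer(dmesg_text):
    """Summarize arch_timer boot messages found in dmesg_text.

    Returns a dict with:
      erratum      - description of the enabled workaround, or None
      trapped_cpus - set of CPU numbers that reported CNTVCT trapping
      freq_mhz     - reported cp15 timer frequency in MHz, or None
    """
    workaround = WORKAROUND_RE.search(dmesg_text)
    freq = FREQ_RE.search(dmesg_text)
    return {
        "erratum": workaround.group(1).strip() if workaround else None,
        "trapped_cpus": {int(n) for n in TRAP_RE.findall(dmesg_text)},
        "freq_mhz": float(freq.group(1)) if freq else None,
    }
```

On a system that needs the workaround, comparing trapped_cpus against the set of online CPUs shows whether any CPU was missed; on a system that does not, erratum should be None and trapped_cpus empty.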

[Regression Risk]
The regression risk is restricted to ARM systems, as this driver only applies there. Regressions could lead to a timer failing to initialize, or to a system that requires an erratum workaround not having it applied (which are also the conditions that the suggested backports attempt to fix).

dann frazier (dannf)
Changed in linux (Ubuntu Zesty):
status: New → Confirmed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1713821

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
dann frazier (dannf)
description: updated
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
dann frazier (dannf)
description: updated
description: updated
Changed in linux (Ubuntu):
assignee: nobody → dann frazier (dannf)
Changed in linux (Ubuntu Zesty):
assignee: nobody → dann frazier (dannf)
Changed in linux (Ubuntu):
status: Confirmed → In Progress
Changed in linux (Ubuntu Zesty):
status: Confirmed → In Progress
Seth Forshee (sforshee)
Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.12.0-13.14

---------------
linux (4.12.0-13.14) artful; urgency=low

  * linux: 4.12.0-13.14 -proposed tracker (LP: #1714687)

  * vhost guest network randomly drops under stress (kvm) (LP: #1711251)
    - Revert "vhost: cache used event for better performance"

  * EDAC sbridge: Failed to register device with error -22. (LP: #1714112)
    - [Config] CONFIG_EDAC_GHES=n

  * Artful update to v4.12.10 stable release (LP: #1714525)
    - sparc64: remove unnecessary log message
    - bonding: require speed/duplex only for 802.3ad, alb and tlb
    - bonding: ratelimit failed speed/duplex update warning
    - af_key: do not use GFP_KERNEL in atomic contexts
    - dccp: purge write queue in dccp_destroy_sock()
    - dccp: defer ccid_hc_tx_delete() at dismantle time
    - ipv4: fix NULL dereference in free_fib_info_rcu()
    - net_sched/sfq: update hierarchical backlog when drop packet
    - net_sched: remove warning from qdisc_hash_add
    - bpf: fix bpf_trace_printk on 32 bit archs
    - net: igmp: Use ingress interface rather than vrf device
    - openvswitch: fix skb_panic due to the incorrect actions attrlen
    - ptr_ring: use kmalloc_array()
    - ipv4: better IP_MAX_MTU enforcement
    - nfp: fix infinite loop on umapping cleanup
    - tun: handle register_netdevice() failures properly
    - sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
    - tipc: fix use-after-free
    - ipv6: reset fn->rr_ptr when replacing route
    - ipv6: repair fib6 tree in failure case
    - tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
    - net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled
    - irda: do not leak initialized list.dev to userspace
    - net: sched: fix NULL pointer dereference when action calls some targets
    - net_sched: fix order of queue length updates in qdisc_replace()
    - bpf, verifier: add additional patterns to evaluate_reg_imm_alu
    - bpf: fix mixed signed/unsigned derived min/max value bounds
    - bpf/verifier: fix min/max handling in BPF_SUB
    - Input: trackpoint - add new trackpoint firmware ID
    - Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
    - Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad
    - KVM: s390: sthyi: fix sthyi inline assembly
    - KVM: s390: sthyi: fix specification exception detection
    - KVM: x86: simplify handling of PKRU
    - KVM, pkeys: do not use PKRU value in vcpu->arch.guest_fpu.state
    - KVM: x86: block guest protection keys unless the host has them enabled
    - ALSA: usb-audio: Add delay quirk for H650e/Jabra 550a USB headsets
    - ALSA: core: Fix unexpected error at replacing user TLV
    - ALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978)
    - ALSA: firewire: fix NULL pointer dereference when releasing uninitialized
      data of iso-resource
    - ALSA: firewire-motu: destroy stream data surely at failure of card
      initialization
    - ARCv2: SLC: Make sure busy bit is set properly for region ops
    - ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
    - ARCv2: PAE40: set MSB even if !CONFIG_ARC_HAS_...

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Stefan Bader (smb)
Changed in linux (Ubuntu Zesty):
status: In Progress → Fix Committed
Kleber Sacilotto de Souza (kleber-souza) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-zesty' to 'verification-done-zesty'. If the problem still exists, change the tag 'verification-needed-zesty' to 'verification-failed-zesty'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-zesty
dann frazier (dannf) wrote :

ubuntu@equal-beetle:~$ cat /proc/version
Linux version 4.10.0-36-generic (buildd@bos01-arm64-013) (gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) ) #40~16.04.1-Ubuntu SMP Tue Sep 19 15:20:18 UTC 2017
ubuntu@equal-beetle:~$ sudo dmesg | grep -i arch_timer
[ 0.000000] arch_timer: Enabling global workaround for HiSilicon erratum 161010101
[ 0.000000] arch_timer: CPU0: Trapping CNTVCT access
[ 0.000000] arch_timer: cp15 timer(s) running at 50.00MHz (phys).
[ 0.194148] arch_timer: CPU1: Trapping CNTVCT access
[ 0.197183] arch_timer: CPU2: Trapping CNTVCT access
[ 0.200193] arch_timer: CPU3: Trapping CNTVCT access
[ 0.203192] arch_timer: CPU4: Trapping CNTVCT access
[ 0.206199] arch_timer: CPU5: Trapping CNTVCT access
[ 0.209199] arch_timer: CPU6: Trapping CNTVCT access
[ 0.212197] arch_timer: CPU7: Trapping CNTVCT access
[ 0.215212] arch_timer: CPU8: Trapping CNTVCT access
[ 0.218234] arch_timer: CPU9: Trapping CNTVCT access
[ 0.221238] arch_timer: CPU10: Trapping CNTVCT access
[ 0.224244] arch_timer: CPU11: Trapping CNTVCT access
[ 0.227250] arch_timer: CPU12: Trapping CNTVCT access
[ 0.230271] arch_timer: CPU13: Trapping CNTVCT access
[ 0.233286] arch_timer: CPU14: Trapping CNTVCT access
[ 0.236293] arch_timer: CPU15: Trapping CNTVCT access
[ 0.239365] arch_timer: CPU16: Trapping CNTVCT access
[ 0.242482] arch_timer: CPU17: Trapping CNTVCT access
[ 0.245543] arch_timer: CPU18: Trapping CNTVCT access
[ 0.248601] arch_timer: CPU19: Trapping CNTVCT access
[ 0.251675] arch_timer: CPU20: Trapping CNTVCT access
[ 0.254747] arch_timer: CPU21: Trapping CNTVCT access
[ 0.257809] arch_timer: CPU22: Trapping CNTVCT access
[ 0.260874] arch_timer: CPU23: Trapping CNTVCT access
[ 0.263942] arch_timer: CPU24: Trapping CNTVCT access
[ 0.267019] arch_timer: CPU25: Trapping CNTVCT access
[ 0.270077] arch_timer: CPU26: Trapping CNTVCT access
[ 0.273181] arch_timer: CPU27: Trapping CNTVCT access
[ 0.276300] arch_timer: CPU28: Trapping CNTVCT access
[ 0.279418] arch_timer: CPU29: Trapping CNTVCT access
[ 0.282534] arch_timer: CPU30: Trapping CNTVCT access
[ 0.285632] arch_timer: CPU31: Trapping CNTVCT access
[ 0.289091] arch_timer: CPU32: Trapping CNTVCT access
[ 0.292688] arch_timer: CPU33: Trapping CNTVCT access
[ 0.296098] arch_timer: CPU34: Trapping CNTVCT access
[ 0.299505] arch_timer: CPU35: Trapping CNTVCT access
[ 0.302920] arch_timer: CPU36: Trapping CNTVCT access
[ 0.306333] arch_timer: CPU37: Trapping CNTVCT access
[ 0.309748] arch_timer: CPU38: Trapping CNTVCT access
[ 0.313156] arch_timer: CPU39: Trapping CNTVCT access
[ 0.316575] arch_timer: CPU40: Trapping CNTVCT access
[ 0.320010] arch_timer: CPU41: Trapping CNTVCT access
[ 0.323419] arch_timer: CPU42: Trapping CNTVCT access
[ 0.326831] arch_timer: CPU43: Trapping CNTVCT access
[ 0.330254] arch_timer: CPU44: Trapping CNTVCT access
[ 0.333679] arch_timer: CPU45: Trapping CNTVCT access
[ 0.337106] arch_timer: CPU46: Trapping CNTVCT access
[ 0.340505] arch_timer: CPU47: Trapping CNTVCT access
[ 0.343945] arch_timer: CPU48: Trapping CN...

tags: added: verification-done-zesty
removed: verification-needed-zesty
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.10.0-37.41

---------------
linux (4.10.0-37.41) zesty; urgency=low

  * CVE-2017-1000255
    - SAUCE: powerpc/64s: Use emergency stack for kernel TM Bad Thing program
      checks
    - SAUCE: powerpc/tm: Fix illegal TM state in signal handler

linux (4.10.0-36.40) zesty; urgency=low

  * linux: 4.10.0-36.40 -proposed tracker (LP: #1718143)

  * Neighbour confirmation broken, breaks ARP cache aging (LP: #1715812)
    - sock: add sk_dst_pending_confirm flag
    - net: add dst_pending_confirm flag to skbuff
    - sctp: add dst_pending_confirm flag
    - tcp: replace dst_confirm with sk_dst_confirm
    - net: add confirm_neigh method to dst_ops
    - net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP
    - net: pending_confirm is not used anymore

  * SRIOV: warning if unload VFs (LP: #1715073)
    - PCI: Lock each enable/disable num_vfs operation in sysfs
    - PCI: Disable VF decoding before pcibios_sriov_disable() updates resources

  * Kernel has troule recognizing Corsair Strafe RGB keyboard (LP: #1678477)
    - usb: quirks: add delay init quirk for Corsair Strafe RGB keyboard

  * CVE-2017-14106
    - tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0

  * [CIFS] Fix maximum SMB2 header size (LP: #1713884)
    - CIFS: Fix maximum SMB2 header size

  * Middle button of trackpoint doesn't work (LP: #1715271)
    - Input: trackpoint - assume 3 buttons when buttons detection fails

  * Drop GPL from of_node_to_nid() export to match other arches (LP: #1709179)
    - powerpc: Drop GPL from of_node_to_nid() export to match other arches

  * vhost guest network randomly drops under stress (kvm) (LP: #1711251)
    - Revert "vhost: cache used event for better performance"

  * arm64 arch_timer fixes (LP: #1713821)
    - Revert "UBUNTU: SAUCE: arm64: arch_timer: Enable CNTVCT_EL0 trap if
      workaround is enabled"
    - arm64: arch_timer: Enable CNTVCT_EL0 trap if workaround is enabled
    - clocksource/arm_arch_timer: Fix arch_timer_mem_find_best_frame()
    - clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect
      variable
    - clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization
    - clocksource/drivers/arm_arch_timer: Avoid infinite recursion when ftrace is
      enabled

  * Touchpad not detected (LP: #1708852)
    - Input: elan_i2c - add ELAN0608 to the ACPI table

 -- Thadeu Lima de Souza Cascardo <email address hidden> Fri, 06 Oct 2017 16:45:48 -0300

Changed in linux (Ubuntu Zesty):
status: Fix Committed → Fix Released