Ubuntu
linux package

ISST-LTE:pNV: system ben is hung during ST (nvme)

Bug #1620317 reported by bugproxy on 2016-09-05

This bug affects 1 person

	Status	Importance	Assigned to
linux (Ubuntu)	Fix Released	High	Unassigned
Xenial	Fix Released	Undecided	Tim Gardner
Yakkety	Fix Released	High	Unassigned

Bug Description

When we run I/O-intensive tasks together with CPU addition/removal, the block layer may hang, stalling the entire machine.

The backtrace below is one of the symptoms:

[12747.111149] ---[ end trace b4d8d720952460b5 ]---
[12747.126885] Trying to free IRQ 357 from IRQ context!
[12747.146930] ------------[ cut here ]------------
[12747.166674] WARNING: at /build/linux-iLHNl3/linux-4.4.0/kernel/irq/manage.c:1438
[12747.184069] Modules linked in: minix nls_iso8859_1 rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) configfs ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_ib(OE) mlx4_ib(OE) ib_sa(OE) ib_mad(OE) ib_core(OE) ib_addr(OE) mlx4_en(OE) mlx4_core(OE) binfmt_misc xfs joydev input_leds mac_hid ofpart cmdlinepart powernv_flash ipmi_powernv mtd ipmi_msghandler at24 opal_prd powernv_rng ibmpowernv uio_pdrv_genirq uio sunrpc knem(OE) autofs4 btrfs xor raid6_pq hid_generic usbhid hid uas usb_storage nouveau ast bnx2x i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops mlx5_core(OE) ahci drm mdio libcrc32c mlx_compat(OE) libahci vxlan nvme ip6_udp_tunnel udp_tunnel
[12747.349013] CPU: 80 PID: 0 Comm: swapper/80 Tainted: G W OEL 4.4.0-21-generic #37-Ubuntu
[12747.369046] task: c000000f1fab89b0 ti: c000000f1fb6c000 task.ti: c000000f1fb6c000
[12747.404848] NIP: c000000000131888 LR: c000000000131884 CTR: 00000000300303f0
[12747.808333] REGS: c000000f1fb6e550 TRAP: 0700 Tainted: G W OEL (4.4.0-21-generic)
[12747.867658] MSR: 9000000100029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28022222 XER: 20000000
[12747.884783] CFAR: c000000000aea8f4 SOFTE: 1
GPR00: c000000000131884 c000000f1fb6e7d0 c0000000015b4200 0000000000000028
GPR04: c000000f2a409c50 c000000f2a41b4e0 0000000f29480000 00000000000033da
GPR08: 0000000000000007 c000000000f8b27c 0000000f29480000 9000000100001003
GPR12: 0000000000002200 c000000007b6f800 c000000f2a40a938 0000000000000100
GPR16: c000000f11480000 0000000000003a98 0000000000000000 0000000000000000
GPR20: 0000000000000000 d000000009521008 d0000000095146a0 fffffffffffff000
GPR24: c000000004a19ef0 0000000000000000 0000000000000003 000000000000007d
GPR28: 0000000000000165 c000000eefeb1800 c000000eef830600 0000000000000165
[12748.243270] NIP [c000000000131888] __free_irq+0x238/0x370
[12748.254089] LR [c000000000131884] __free_irq+0x234/0x370
[12748.269738] Call Trace:
[12748.286740] [c000000f1fb6e7d0] [c000000000131884] __free_irq+0x234/0x370 (unreliable)
[12748.289687] [c000000f1fb6e860] [c000000000131af8] free_irq+0x88/0xb0
[12748.304594] [c000000f1fb6e890] [d000000009514528] nvme_suspend_queue+0xc8/0x150 [nvme]
[12748.333825] [c000000f1fb6e8c0] [d00000000951681c] nvme_dev_disable+0x3fc/0x400 [nvme]
[12748.340913] [c000000f1fb6e9a0] [d000000009516ae4] nvme_timeout+0xe4/0x260 [nvme]
[12748.357136] [c000000f1fb6ea60] [c000000000548a34] blk_mq_rq_timed_out+0x64/0x110
[12748.383939] [c000000f1fb6ead0] [c00000000054c540] bt_for_each+0x160/0x170
[12748.399292] [c000000f1fb6eb40] [c00000000054d4e8] blk_mq_queue_tag_busy_iter+0x78/0x110
[12748.402665] [c000000f1fb6eb90] [c000000000547358] blk_mq_rq_timer+0x48/0x140
[12748.438649] [c000000f1fb6ebd0] [c00000000014a13c] call_timer_fn+0x5c/0x1c0
[12748.468126] [c000000f1fb6ec60] [c00000000014a5fc] run_timer_softirq+0x31c/0x3f0
[12748.483367] [c000000f1fb6ed30] [c0000000000beb78] __do_softirq+0x188/0x3e0
[12748.498378] [c000000f1fb6ee20] [c0000000000bf048] irq_exit+0xc8/0x100
[12748.501048] [c000000f1fb6ee40] [c00000000001f954] timer_interrupt+0xa4/0xe0
[12748.516377] [c000000f1fb6ee70] [c000000000002714] decrementer_common+0x114/0x180
[12748.547282] --- interrupt: 901 at arch_local_irq_restore+0x74/0x90
[12748.547282] LR = arch_local_irq_restore+0x74/0x90
[12748.574141] [c000000f1fb6f160] [0000000000000001] 0x1 (unreliable)
[12748.592405] [c000000f1fb6f180] [c000000000aedc3c] dump_stack+0xd0/0xf0
[12748.596461] [c000000f1fb6f1c0] [c0000000001006fc] dequeue_task_idle+0x5c/0x90
[12748.611532] [c000000f1fb6f230] [c0000000000f6080] deactivate_task+0xc0/0x130
[12748.627685] [c000000f1fb6f270] [c000000000adcb10] __schedule+0x440/0x990
[12748.654416] [c000000f1fb6f300] [c000000000add0a8] schedule+0x48/0xc0
[12748.670558] [c000000f1fb6f330] [c000000000ae1474] schedule_timeout+0x274/0x350
[12748.673485] [c000000f1fb6f420] [c000000000ade23c] wait_for_common+0xec/0x240
[12748.699192] [c000000f1fb6f4a0] [c0000000000e6908] kthread_stop+0x88/0x210
[12748.718385] [c000000f1fb6f4e0] [d000000009514240] nvme_dev_list_remove+0x90/0x110 [nvme]
[12748.748925] [c000000f1fb6f510] [d000000009516498] nvme_dev_disable+0x78/0x400 [nvme]
[12748.752112] [c000000f1fb6f5f0] [d000000009516ae4] nvme_timeout+0xe4/0x260 [nvme]
[12748.775395] [c000000f1fb6f6b0] [c000000000548a34] blk_mq_rq_timed_out+0x64/0x110
[12748.821069] [c000000f1fb6f720] [c00000000054c540] bt_for_each+0x160/0x170
[12748.851733] [c000000f1fb6f790] [c00000000054d4e8] blk_mq_queue_tag_busy_iter+0x78/0x110
[12748.883093] [c000000f1fb6f7e0] [c000000000547358] blk_mq_rq_timer+0x48/0x140
[12748.918348] [c000000f1fb6f820] [c00000000014a13c] call_timer_fn+0x5c/0x1c0
[12748.934743] [c000000f1fb6f8b0] [c00000000014a5fc] run_timer_softirq+0x31c/0x3f0
[12748.938084] [c000000f1fb6f980] [c0000000000beb78] __do_softirq+0x188/0x3e0
[12748.960815] [c000000f1fb6fa70] [c0000000000bf048] irq_exit+0xc8/0x100
[12748.992175] [c000000f1fb6fa90] [c00000000001f954] timer_interrupt+0xa4/0xe0
[12749.019299] [c000000f1fb6fac0] [c000000000002714] decrementer_common+0x114/0x180
[12749.037168] --- interrupt: 901 at arch_local_irq_restore+0x74/0x90
[12749.037168] LR = arch_local_irq_restore+0x74/0x90
[12749.079044] [c000000f1fb6fdb0] [c000000f2a41d680] 0xc000000f2a41d680 (unreliable)
[12749.081736] [c000000f1fb6fdd0] [c000000000909a28] cpuidle_enter_state+0x1a8/0x410
[12749.127094] [c000000f1fb6fe30] [c000000000119a88] call_cpuidle+0x78/0xd0
[12749.144435] [c000000f1fb6fe70] [c000000000119e5c] cpu_startup_entry+0x37c/0x480
[12749.166156] [c000000f1fb6ff30] [c00000000004563c] start_secondary+0x33c/0x360
[12749.186929] [c000000f1fb6ff90] [c000000000008b6c] start_secondary_prolog+0x10/0x14
[12749.223828] Instruction dump:
[12749.223856] 4e800020 4bf83a5d 60000000 4bffff64 4bf83a51 60000000 4bffffa8 3c62ff7b
[12749.233245] 7f84e378 38630fe0 489b900d 60000000 <0fe00000> 4bfffe20 7d2903a6 387d0118
[12749.298371] ---[ end trace b4d8d720952460b6 ]---
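Not part of the original report: a minimal Python sketch for reading a trace like the one above. It pulls the function names out of the call-trace frames and lists those that appear more than once, which is how the nested nvme_timeout / nvme_dev_disable calls show up here. The regular expression and the helper names (frame_functions, reentered) are illustrative assumptions. They match the ppc64le frame layout seen in this log and are not a general kernel log parser.

```python
import re
from collections import Counter

# Matches frames such as:
# [12748.340913] [c000000f1fb6e9a0] [d000000009516ae4] nvme_timeout+0xe4/0x260 [nvme]
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+\[[0-9a-f]{16}\]\s+\[[0-9a-f]{16}\]\s+"
    r"(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# A few frames copied from the trace in this report.
SAMPLE = """
[12748.304594] [c000000f1fb6e890] [d000000009514528] nvme_suspend_queue+0xc8/0x150 [nvme]
[12748.333825] [c000000f1fb6e8c0] [d00000000951681c] nvme_dev_disable+0x3fc/0x400 [nvme]
[12748.340913] [c000000f1fb6e9a0] [d000000009516ae4] nvme_timeout+0xe4/0x260 [nvme]
[12748.718385] [c000000f1fb6f4e0] [d000000009514240] nvme_dev_list_remove+0x90/0x110 [nvme]
[12748.748925] [c000000f1fb6f510] [d000000009516498] nvme_dev_disable+0x78/0x400 [nvme]
[12748.752112] [c000000f1fb6f5f0] [d000000009516ae4] nvme_timeout+0xe4/0x260 [nvme]
"""


def frame_functions(log_text):
    """Return the function name of every call-trace frame, in log order."""
    funcs = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            funcs.append(match.group("func"))
    return funcs


def reentered(funcs):
    """Return the sorted names of functions that occur in more than one frame."""
    counts = Counter(funcs)
    return sorted(name for name, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(reentered(frame_functions(SAMPLE)))
```

Frames whose symbol is a raw address (for example the "0x1 (unreliable)" entry) are skipped on purpose, since they carry no function name.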

== Comment: #184 - Gabriel Krisman Bertazi <email address hidden> - 2016-07-29 12:55:48 ==
I got it figured out. The nvme driver is not playing nice with the block timeout infrastructure: the timeout code goes into a livelock, waiting for the queue to be released. CPU hotplug, on the other hand, which is holding the queue freeze lock at the time, is waiting for an outstanding request to time out (or complete). This request, in turn, is stuck in the device, requiring a reset triggered by a timeout, which never happens due to the livelock.
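Not part of the original comment: the circular wait described above can be sketched as a tiny wait-for graph in Python. The node names are illustrative labels, not kernel identifiers, and the "after fix" graph is an assumption based only on the title of the fix mentioned in this report ("blk-mq: Allow timeouts to run while queue is freezing"), namely that the timeout path no longer has to wait for the queue to be unfrozen.

```python
# Each key waits for its value to make progress.
BEFORE_FIX = {
    "timeout_handler": "queue_unfrozen",  # timeout code waits for the queue to be released
    "queue_unfrozen": "cpu_hotplug",      # CPU hotplug holds the queue freeze
    "cpu_hotplug": "stuck_request",       # the freeze waits for outstanding requests
    "stuck_request": "timeout_handler",   # only a timeout-triggered reset completes it
}

# Assumed effect of the fix: the timeout handler no longer waits on the freeze.
AFTER_FIX = {k: v for k, v in BEFORE_FIX.items() if k != "timeout_handler"}


def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    seen = []
    node = start
    while node in waits_for:
        if node in seen:
            return seen[seen.index(node):]
        seen.append(node)
        node = waits_for[node]
    return None


if __name__ == "__main__":
    print(find_cycle(BEFORE_FIX, "timeout_handler"))
    print(find_cycle(AFTER_FIX, "cpu_hotplug"))
```

With the first graph every participant waits on another one in the loop, so nothing makes progress. Dropping the single edge from the timeout handler breaks the loop, which is the shape of recovery the comment describes.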

I don't have the reason why the request is stuck inside the device requiring a timeout, but this could even be caused by the Leaf firmware itself. I also see some successful timeouts triggered under normal conditions. In the failure event, we should be able to abort the request normally, but this happens via the timeout infrastructure, which is blocked during cpu hotplug events.

I have a quirk to fully recover after the failure by forcing a reset of the stuck I/O, which allows the CPU hotplug to complete and the block layer to recover. I have a machine hitting the failure every few minutes in a loop, and recovering from it with my patch.

Patch submitted to linux-block

https://marc.info/?l=linux-block&m=146976739016592&w=2

== Comment: #207 - Gabriel Krisman Bertazi <email address hidden> - 2016-09-05 09:13:51 ==
Canonical,

This is fixed by:

e57690fe009b ("blk-mq: don't overwrite rq->mq_ctx")
0e87e58bf60e ("blk-mq: improve warning for running a queue on the wrong CPU")
71f79fb3179e ("blk-mq: Allow timeouts to run while queue is freezing")

Which will apply cleanly on top of your kernel.

bugproxy (bugproxy) wrote on 2016-09-05: sample allocation failure

sample allocation failure (4.8 KiB, text/plain)

Default Comment by Bridge

tags:

added: architecture-ppc64le bugnameltc-140718 severity-critical targetmilestone-inin16041

bugproxy (bugproxy) wrote on 2016-09-05: queue usage counter during deadlock

queue usage counter during deadlock (5.6 KiB, text/plain)

Default Comment by Bridge

bugproxy (bugproxy) wrote on 2016-09-05: Log from iod76, tracking percpu variables

Log from iod76, tracking percpu variables (5.6 KiB, text/plain)

Default Comment by Bridge

Changed in ubuntu:
assignee:	nobody → Taco Screen team (taco-screen-team)
affects:	ubuntu → linux (Ubuntu)

Breno Leitão (breno-leitao) on 2016-09-05

Changed in linux (Ubuntu):
status:	New → Incomplete
status:	Incomplete → Confirmed

Leann Ogasawara (leannogasawara) on 2016-09-06

Changed in linux (Ubuntu):
assignee:	Taco Screen team (taco-screen-team) → nobody
assignee:	nobody → Canonical Kernel Team (canonical-kernel-team)
importance:	Undecided → High
status:	Confirmed → Triaged

Tim Gardner (timg-tpi) wrote on 2016-09-06:

Merged in v4.8

Changed in linux (Ubuntu Yakkety):
assignee:	Canonical Kernel Team (canonical-kernel-team) → nobody
status:	Triaged → Fix Released
Changed in linux (Ubuntu Xenial):
assignee:	nobody → Tim Gardner (timg-tpi)
status:	New → In Progress

Tim Gardner (timg-tpi) wrote on 2016-09-06:

https://lists.ubuntu.com/archives/kernel-team/2016-September/079898.html

bugproxy (bugproxy) wrote on 2016-09-12: Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-09-12 03:10 EDT-------
*** Bug 141963 has been marked as a duplicate of this bug. ***

Kamal Mostafa (kamalmostafa) on 2016-09-19

Changed in linux (Ubuntu Xenial):
status:	In Progress → Fix Committed

bugproxy (bugproxy) wrote on 2016-09-20: calltraces_cap_ubutnu160401

calltraces_cap_ubutnu160401 (186.2 KiB, text/rtf)

------- Comment (attachment only) From <email address hidden> 2016-09-20 02:40 EDT-------

bugproxy (bugproxy) wrote on 2016-09-23: SMTfix_calltraces_leafio_ubutnu160401

SMTfix_calltraces_leafio_ubutnu160401 (22.8 KiB, text/rtf)

------- Comment (attachment only) From <email address hidden> 2016-09-23 12:09 EDT-------

bugproxy (bugproxy) wrote on 2016-09-23: xmon debug session

xmon debug session (656.7 KiB, text/plain)

------- Comment on attachment From <email address hidden> 2016-09-23 14:47 EDT-------

Looking at the blocked tasks, this looks like it could be an issue fixed recently upstream:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/kernel/sched/core.c?id=135e8c9250dd5c8c9aae5984fde6f230d0cbfeaf

Gabriel is building a kernel that has this fix added and we'll kick off a weekend run.

bugproxy (bugproxy) on 2016-09-24

tags:

removed: bugnameltc-140718 severity-critical

bugproxy (bugproxy) on 2016-09-24

tags:

added: bugnameltc-140718 severity-critical

Brad Figg (brad-figg) wrote on 2016-09-26:

#10

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags:

added: verification-needed-xenial

bugproxy (bugproxy) wrote on 2016-09-28: Comment bridged from LTC Bugzilla

#11

------- Comment From <email address hidden> 2016-09-27 23:02 EDT-------
Hi Canonical,

Our test teams had problems with booting the Ubuntu-4.4.0-40.60 kernel. They hit an Oops in the NVMe probe path.

My current understanding is that this is already fixed in your kernel by the fixup commit e9820e415895 ("UBUNTU: (fix) NVMe: Don't unmap controller registers on reset"), which will be published in kernel Ubuntu-4.4.0-41.61.

We will try again with that version.

While we are here, I noticed another issue with the backport of 30d6592fce71 ("NVMe: Don't unmap controller registers on reset"). It looks like you are missing the fixup commit that went into the 4.4.y tree later:

81e9a969c441 ("nvme: Call pci_disable_device on the error path") #4.4.y tree.

I opened Bug 146899 to track this. It will be mirrored soon.

tags:

added: verification-failed-xenial
removed: verification-needed-xenial

Breno Leitão (breno-leitao) wrote on 2016-09-28:

#12

Bug 146899 is https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1628520

Kamal Mostafa (kamalmostafa) wrote on 2016-09-28:

#13

Thanks for this information, Gabriel.

Your understanding is correct: the original probe path oops is fixed in 4.4.0-41.61, which will land in the -proposed archive by tomorrow (alternatively, it is available now from our staging PPA: https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+packages).

And we'll pick up the missing error path fixup commit as well (for the next Xenial cycle).

Brad Figg (brad-figg) on 2016-09-30

tags:

added: verification-needed-xenial
removed: verification-failed-xenial

bugproxy (bugproxy) on 2016-09-30

tags:

added: verification-done-xenial
removed: verification-needed-xenial

Launchpad Janitor (janitor) wrote on 2016-10-10:

#15

This bug was fixed in the package linux - 4.4.0-42.62

---------------
linux (4.4.0-42.62) xenial; urgency=low

* Fix GRO recursion overflow for tunneling protocols (LP: #1631287)
    - tunnels: Don't apply GRO to multiple layers of encapsulation.
    - gro: Allow tunnel stacking in the case of FOU/GUE

* CVE-2016-7039
    - SAUCE: net: add recursion limit to GRO

linux (4.4.0-41.61) xenial; urgency=low

[ Kamal Mostafa ]

* Release Tracking Bug
    - LP: #1628204

* nvme drive probe failure (LP: #1626894)
    - (fix) NVMe: Don't unmap controller registers on reset

linux (4.4.0-40.60) xenial; urgency=low

[ Kamal Mostafa ]

* Release Tracking Bug
    - LP: #1627074

* Permission denied in CIFS with kernel 4.4.0-38 (LP: #1626112)
    - Fix memory leaks in cifs_do_mount()
    - Compare prepaths when comparing superblocks
    - SAUCE: Fix regression which breaks DFS mounting

* Backlight does not change when adjust it higher than 50% after S3
    (LP: #1625932)
    - SAUCE: i915_bpo: drm/i915/backlight: setup and cache pwm alternate
      increment value
    - SAUCE: i915_bpo: drm/i915/backlight: setup backlight pwm alternate
      increment on backlight enable

linux (4.4.0-39.59) xenial; urgency=low

[ Joseph Salisbury ]

* Release Tracking Bug
    - LP: #1625303

* thunder: chip errata w/ multiple CQEs for a TSO packet (LP: #1624569)
    - net: thunderx: Fix for issues with multiple CQEs posted for a TSO packet

* thunder: faulty TSO padding (LP: #1623627)
    - net: thunderx: Fix for HW issue while padding TSO packet

* CVE-2016-6828
    - tcp: fix use after free in tcp_xmit_retransmit_queue()

* Sennheiser Officerunner - cannot get freq at ep 0x83 (LP: #1622763)
    - SAUCE: (no-up) ALSA: usb-audio: Add quirk for sennheiser officerunner

* Backport E3 Skylake Support in ie31200_edac to Xenial (LP: #1619766)
    - EDAC, ie31200_edac: Add Skylake support

* Ubuntu 16.04 - Full EEH Recovery Support for NVMe devices (LP: #1602724)
    - SAUCE: nvme: Don't suspend admin queue that wasn't created

* ISST-LTE:pNV: system ben is hung during ST (nvme) (LP: #1620317)
    - blk-mq: Allow timeouts to run while queue is freezing
    - blk-mq: improve warning for running a queue on the wrong CPU
    - blk-mq: don't overwrite rq->mq_ctx

* lsattr 32bit does not work on 64bit kernel (Inappropriate ioctl error)
    (LP: #1619918)
    - btrfs: bugfix: handle FS_IOC32_{GETFLAGS, SETFLAGS, GETVERSION} in
      btrfs_ioctl

* radeon: monitor connected to onboard VGA doesn't work with Xenial
    (LP: #1600092)
    - drm/radeon/dp: add back special handling for NUTMEG

* initramfs includes qle driver, but not firmware (LP: #1623187)
    - qed: add MODULE_FIRMWARE()

* [Hyper-V] Rebase Hyper-V to 4.7.2 (stable) (LP: #1616677)
    - hv_netvsc: Implement support for VF drivers on Hyper-V
    - hv_netvsc: Fix the list processing for network change event
    - Drivers: hv: vmbus: Introduce functions for estimating room in the ring
      buffer
    - Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile
    - Drivers: hv: vmbus: Export the vmbus_set_event() API
    - lcoking/barriers, arch: Use smp barriers in smp_store_release()
    - asm-generic: guard smp_store_release/load_acquire
    - x86: reuse asm-generic/barrier.h
    - asm-generic: add __smp_xxx wrappers
    - x86: define __smp_xxx
    - asm-generic: implement virt_xxx memory barriers
    - Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h
    - Drivers: hv: vmbus: Implement APIs to support "in place" consumption of
      vmbus packets
    - drivers:hv: Lock access to hyperv_mmio resource tree
    - drivers:hv: Make a function to free mmio regions through vmbus
    - drivers:hv: Track allocations of children of hv_vmbus in private resource
      tree
    - drivers:hv: Separate out frame buffer logic when picking MMIO range
    - Drivers: hv: vmbus: handle various crash scenarios
    - Drivers: hv: balloon: don't crash when memory is added in non-sorted order
    - Drivers: hv: balloon: reset host_specified_ha_region
    - tools: hv: lsvmbus: add pci pass-through UUID
    - hv_netvsc: move start_remove flag to net_device_context
    - hv_netvsc: use start_remove flag to protect netvsc_link_change()
    - hv_netvsc: untangle the pointer mess
    - hv_netvsc: get rid of struct net_device pointer in struct netvsc_device
    - hv_netvsc: synchronize netvsc_change_mtu()/netvsc_set_channels() with
      netvsc_remove()
    - hv_netvsc: set nvdev link after populating chn_table
    - hv_netvsc: Fix VF register on vlan devices
    - hv_netvsc: remove redundant assignment in netvsc_recv_callback()
    - hv_netvsc: introduce {net, hv}_device_to_netvsc_device() helpers
    - hv_netvsc: pass struct netvsc_device to rndis_filter_{open, close}()
    - hv_netvsc: pass struct net_device to rndis_filter_set_device_mac()
    - hv_netvsc: pass struct net_device to rndis_filter_set_offload_params()
    - netvsc: get rid of completion timeouts
    - PCI: hv: Don't leak buffer in hv_pci_onchannelcallback()
    - PCI: hv: Handle all pending messages in hv_pci_onchannelcallback()
    - netvsc: Use the new in-place consumption APIs in the rx path
    - x86/kernel: Audit and remove any unnecessary uses of module.h
    - PCI: hv: Fix interrupt cleanup path
    - hv_netvsc: Fix VF register on bonding devices
    - hv_netvsc: don't lose VF information
    - hv_netvsc: avoid deadlocks between rtnl lock and vf_use_cnt wait
    - hv_netvsc: reset vf_inject on VF removal
    - hv_netvsc: protect module refcount by checking net_device_ctx->vf_netdev
    - hv_netvsc: fix bonding devices check in netvsc_netdev_event()
    - Drivers: hv: vmbus: Use the new virt_xx barrier code
    - ixgbevf: call ndo_stop() instead of dev_close() when running offline
      selftest
    - ixgbevf: fix error code path when setting MAC address
    - ixgbevf: use bit operations for setting and checking resets
    - ixgbevf: Add support for generic Tx checksums
    - ixgbe/ixgbevf: Add support for bulk free in Tx cleanup & cleanup boolean
      logic
    - ixgbevf: refactor ethtool stats handling
    - ixgbevf: add support for per-queue ethtool stats
    - ixgbevf: make use of BIT() macro to avoid shift of signed values
    - ixgbevf: Move API negotiation function into mac_ops
    - ixgbevf: Add the device ID's presented while running on Hyper-V
    - ixgbevf: Support Windows hosts (Hyper-V)
    - ixgbevf: Change the relaxed order settings in VF driver for sparc
    - ixgbevf: Use mac_ops instead of trying to identify NIC type

* New device ID for Kabypoint (LP: #1622469)
    - mfd: lpss: Add Intel Kaby Lake PCH-H PCI IDs
    - SAUCE: i2c: i801: Add support for Kaby Lake PCH-H

* Xenial update to v4.4.21 stable release (LP: #1624037)
    - Revert "i40e: fix: do not sleep in netdev_ops"
    - fs: Check for invalid i_uid in may_follow_link()
    - netfilter: x_tables: check for size overflow
    - ext4: validate that metadata blocks do not overlap superblock
    - ext4: fix xattr shifting when expanding inodes
    - ext4: fix xattr shifting when expanding inodes part 2
    - ext4: properly align shifted xattrs when expanding inodes
    - ext4: avoid deadlock when expanding inode size
    - ext4: avoid modifying checksum fields directly during checksum verification
    - block: Fix race triggered by blk_set_queue_dying()
    - block: make sure a big bio is split into at most 256 bvecs
    - cgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork
    - s390/sclp_ctl: fix potential information leak with /dev/sclp
    - drm/radeon: fix radeon_move_blit on 32bit systems
    - drm: Reject page_flip for !DRIVER_MODESET
    - drm/msm: fix use of copy_from_user() while holding spinlock
    - ASoC: atmel_ssc_dai: Don't unconditionally reset SSC on stream startup
    - xfs: fix superblock inprogress check
    - timekeeping: Cap array access in timekeeping_debug
    - timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING
    - lustre: remove unused declaration
    - wrappers for ->i_mutex access
    - ovl: don't copy up opaqueness
    - ovl: remove posix_acl_default from workdir
    - ovl: listxattr: use strnlen()
    - ovl: fix workdir creation
    - ubifs: Fix assertion in layout_in_gaps()
    - bcache: RESERVE_PRIO is too small by one when prio_buckets() is a power of
      two.
    - vhost/scsi: fix reuse of &vq->iov[out] in response
    - x86/apic: Do not init irq remapping if ioapic is disabled
    - uprobes: Fix the memcg accounting
    - crypto: caam - fix IV loading for authenc (giv)decryption
    - ALSA: usb-audio: Add sample rate inquiry quirk for B850V3 CP2114
    - ALSA: firewire-tascam: accessing to user space outside spinlock
    - ALSA: fireworks: accessing to user space outside spinlock
    - ALSA: rawmidi: Fix possible deadlock with virmidi registration
    - ALSA: hda - Add headset mic quirk for Dell Inspiron 5468
    - ALSA: hda - Enable subwoofer on Dell Inspiron 7559
    - ALSA: timer: fix NULL pointer dereference in read()/ioctl() race
    - ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE
    - ALSA: timer: fix NULL pointer dereference on memory allocation failure
    - scsi: fix upper bounds check of sense key in scsi_sense_key_string()
    - metag: Fix atomic_*_return inline asm constraints
    - cpufreq: Fix GOV_LIMITS handling for the userspace governor
    - hwrng: exynos - Disable runtime PM on probe failure
    - regulator: anatop: allow regulator to be in bypass mode
    - lib/mpi: mpi_write_sgl(): fix skipping of leading zero limbs
    - Linux 4.4.21

* Headset mic detection on some variants of Dell Inspiron 5468 (LP: #1617900)
    - ALSA: hda - Add headset mic quirk for Dell Inspiron 5468

* Xenial update to v4.4.20 stable release (LP: #1621113)
    - hugetlb: fix nr_pmds accounting with shared page tables
    - x86/mm: Disable preemption during CR3 read+write
    - uprobes/x86: Fix RIP-relative handling of EVEX-encoded instructions
    - tools/testing/nvdimm: fix SIGTERM vs hotplug crash
    - SUNRPC: Handle EADDRNOTAVAIL on connection failures
    - SUNRPC: allow for upcalls for same uid but different gss service
    - powerpc/eeh: eeh_pci_enable(): fix checking of post-request state
    - ALSA: usb-audio: Add a sample rate quirk for Creative Live! Cam Socialize HD
      (VF0610)
    - ALSA: usb-audio: Add quirk for ELP HD USB Camera
    - arm64: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO
    - parisc: Fix order of EREFUSED define in errno.h
    - virtio: fix memory leak in virtqueue_add()
    - vfio/pci: Fix NULL pointer oops in error interrupt setup handling
    - perf intel-pt: Fix occasional decoding errors when tracing system-wide
    - libnvdimm, nd_blk: mask off reserved status bits
    - ALSA: hda - Manage power well properly for resume
    - NVMe: Don't unmap controller registers on reset
    - PCI: Support PCIe devices with short cfg_size
    - PCI: Add Netronome vendor and device IDs
    - PCI: Limit config space size for Netronome NFP6000 family
    - PCI: Add Netronome NFP4000 PF device ID
    - PCI: Limit config space size for Netronome NFP4000
    - mmc: sdhci-acpi: Reduce Baytrail eMMC/SD/SDIO hangs
    - ACPI: CPPC: Return error if _CPC is invalid on a CPU
    - ACPI / CPPC: Prevent cpc_desc_ptr points to the invalid data
    - um: Don't discard .text.exit section
    - genirq/msi: Remove unused MSI_FLAG_IDENTITY_MAP
    - genirq/msi: Make sure PCI MSIs are activated early
    - crypto: caam - fix non-hmac hashes
    - crypto: caam - fix echainiv(authenc) encrypt shared descriptor
    - crypto: caam - defer aead_set_sh_desc in case of zero authsize
    - usb: ehci: change order of register cleanup during shutdown
    - usb: misc: usbtest: add fix for driver hang
    - usb: dwc3: pci: add Intel Kabylake PCI ID
    - usb: dwc3: gadget: increment request->actual once
    - usb: hub: Fix unbalanced reference count/memory leak/deadlocks
    - USB: hub: fix up early-exit pathway in hub_activate
    - USB: hub: change the locking in hub_activate
    - usb: renesas_usbhs: clear the BRDYSTS in usbhsg_ep_enable()
    - usb: renesas_usbhs: Use dmac only if the pipe type is bulk
    - USB: validate wMaxPacketValue entries in endpoint descriptors
    - usb: gadget: fsl_qe_udc: off by one in setup_received_handle()
    - usb/gadget: fix gadgetfs aio support.
    - xhci: always handle "Command Ring Stopped" events
    - usb: xhci: Fix panic if disconnect
    - xhci: don't dereference a xhci member after removing xhci
    - USB: serial: fix memleak in driver-registration error path
    - USB: serial: option: add D-Link DWM-156/A3
    - USB: serial: option: add support for Telit LE920A4
    - USB: serial: ftdi_sio: add device ID for WICED USB UART dev board
    - USB: serial: ftdi_sio: add PIDs for Ivium Technologies devices
    - iommu/dma: Don't put uninitialised IOVA domains
    - iommu/arm-smmu: Fix CMDQ error handling
    - iommu/arm-smmu: Don't BUG() if we find aborting STEs with disable_bypass
    - pinctrl/amd: Remove the default de-bounce time
    - EDAC: Increment correct counter in edac_inc_ue_error()
    - s390/dasd: fix hanging device after clear subchannel
    - mac80211: fix purging multicast PS buffer queue
    - arm64: dts: rockchip: add reset saradc node for rk3368 SoCs
    - of: fix reference counting in of_graph_get_endpoint_by_regs
    - sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression
    - sched/nohz: Fix affine unpinned timers mess
    - iio: fix sched WARNING "do not call blocking ops when !TASK_RUNNING"
    - drm/amdgpu: Change GART offset to 64-bit
    - drm/amdgpu: fix amdgpu_move_blit on 32bit systems
    - drm/amdgpu: avoid a possible array overflow
    - drm/amdgpu: skip TV/CV in display parsing
    - drm/amd/amdgpu: sdma resume fail during S4 on CI
    - drm/amdgpu: record error code when ring test failed
    - drm/i915: fix aliasing_ppgtt leak
    - ARC: build: Better way to detect ISA compatible toolchain
    - ARC: use correct offset in pt_regs for saving/restoring user mode r25
    - ARC: Call trace_hardirqs_on() before enabling irqs
    - ARC: Elide redundant setup of DMA callbacks
    - aacraid: Check size values after double-fetch from user
    - mfd: cros_ec: Add cros_ec_cmd_xfer_status() helper
    - i2c: cros-ec-tunnel: Fix usage of cros_ec_cmd_xfer()
    - cdc-acm: fix wrong pipe type on rx interrupt xfers
    - mpt3sas: Fix resume on WarpDrive flash cards
    - megaraid_sas: Fix probing cards without io port
    - usb: renesas_usbhs: gadget: fix return value check in
      usbhs_mod_gadget_probe()
    - gpio: Fix OF build problem on UM
    - fs/seq_file: fix out-of-bounds read
    - btrfs: waiting on qgroup rescan should not always be interruptible
    - btrfs: properly track when rescan worker is running
    - Input: tegra-kbc - fix inverted reset logic
    - Input: i8042 - break load dependency between atkbd/psmouse and i8042
    - Input: i8042 - set up shared ps2_cmd_mutex for AUX ports
    - crypto: nx - off by one bug in nx_of_update_msc()
    - crypto: qat - fix aes-xts key sizes
    - dmaengine: usb-dmac: check CHCR.DE bit in usb_dmac_isr_channel()
    - USB: avoid left shift by -1
    - usb: chipidea: udc: don't touch DP when controller is in host mode
    - USB: fix typo in wMaxPacketSize validation
    - USB: serial: mos7720: fix non-atomic allocation in write path
    - USB: serial: mos7840: fix non-atomic allocation in write path
    - USB: serial: option: add WeTelecom WM-D200
    - USB: serial: option: add WeTelecom 0x6802 and 0x6803 products
    - staging: comedi: daqboard2000: bug fix board type matching code
    - staging: comedi: comedi_test: fix timer race conditions
    - staging: comedi: ni_mio_common: fix AO inttrig backwards compatibility
    - staging: comedi: ni_mio_common: fix wrong insn_write handler
    - ACPI / drivers: fix typo in ACPI_DECLARE_PROBE_ENTRY macro
    - ACPI / drivers: replace acpi_probe_lock spinlock with mutex
    - ACPI / sysfs: fix error code in get_status()
    - ACPI / SRAT: fix SRAT parsing order with both LAPIC and X2APIC present
    - ALSA: line6: Remove double line6_pcm_release() after failed acquire.
    - ALSA: line6: Give up on the lock while URBs are released.
    - ALSA: line6: Fix POD sysfs attributes segfault
    - hwmon: (iio_hwmon) fix memory leak in name attribute
    - sysfs: correctly handle read offset on PREALLOC attrs
    - Linux 4.4.20

* Failed to acknowledge elog: /sys/firmware/opal/elog/0x5018d709/acknowledge
    (2:No such file or directory) (LP: #1619552)
    - powerpc/powernv : Drop reference added by kset_find_obj()

* backport support for userspace access of DP aux devices (LP: #1619756)
    - drm/dp: Add a drm_aux-dev module for reading/writing dpcd registers.
    - drm/dp: Allow signals to interrupt drm_aux-dev reads/writes
    - [Config] CONFIG_DRM_DP_AUX_CHARDEV=y

* Enable virtual scsi server driver for Power (LP: #1615665)
    - SAUCE: Ibmvscsis: Properly deregister target sessions
    - SAUCE: Return TCMU-generated sense data to fabric module
    - SAUCE: Ibmvscsis: Code cleanup of print statements
    - SAUCE: Ibmvscsis: Fixed a bug reported by Dan Carpenter

* ISST-LTE: system dropped into xmon at pcibios_release_device+0x5c/0x80
    during running dlpar test on monklp3 (LP: #1618151)
    - powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)

* Kernel Build Fails for Fuse Module (LP: #1617550)
    - SAUCE: (namespace) userns: Export current_in_userns to modules

* boot-time kernel panic introduced in 4.4.0-18, not present in 4.4.0-15
    (LP: #1572630)
    - blk-mq: Reuse hardware context cpumask for tags
    - blk-mq: Use proper cpumask iterator

-- Seth Forshee <seth.forshee@canonical.com>  Fri, 07 Oct 2016 12:03:55 -0500

Changed in linux (Ubuntu Xenial):
status:	Fix Committed → Fix Released
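Not part of the original thread: a rough Python sketch for checking whether a Xenial 4.4 kernel release string is at or past the upload whose changelog entry lists this bug (4.4.0-39.59, per the changelog above). The helper names and the simple ABI-number comparison are assumptions for illustration. Note also that the thread reports a boot-time NVMe probe oops with 4.4.0-40.60 that was fixed in 4.4.0-41.61, so this check says nothing about whether a given kernel is otherwise safe to run.

```python
import re

# The changelog lists LP: #1620317 under linux 4.4.0-39.59.
FIXED_ABI = 39


def xenial_abi(release):
    """Return the ABI number of a 4.4.0-N release string, or None if it is not one."""
    match = re.match(r"^4\.4\.0-(\d+)", release)
    return int(match.group(1)) if match else None


def has_blk_mq_fix(release):
    """True/False for 4.4.0-N kernels; None when the string is not a 4.4.0 release."""
    abi = xenial_abi(release)
    if abi is None:
        return None
    return abi >= FIXED_ABI


if __name__ == "__main__":
    # 4.4.0-21-generic is the kernel shown in the backtrace of this report.
    for release in ("4.4.0-21-generic", "4.4.0-42-generic"):
        print(release, has_blk_mq_fix(release))
```

The release string would typically come from `uname -r` (or `platform.release()` in Python) on the machine being checked.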
