[Hyper-V/Azure] Please include Mellanox OFED drivers in Azure kernel and image

Bug #1650058 reported by Joshua R. Poulson
This bug affects 1 person
Affects          Status         Importance   Assigned to        Milestone
linux (Ubuntu)   Fix Released   Medium       Joseph Salisbury
Xenial           Fix Released   Medium       Joseph Salisbury
Yakkety          Fix Released   Medium       Joseph Salisbury

Bug Description

In order to have the correct VF driver to support SR-IOV in Azure, the Mellanox OFED distribution needs to be included in the kernel and the image. Mellanox's drivers are not upstream, but they are available from here:

https://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers&ssn=3nnhohirh5htv6dnp1uk7tf487

While this configuration is not explicitly listed as supported in the release notes, Microsoft and Mellanox engineers are working on the corresponding Windows Server 2016 PF driver to support this VF driver in operation in Ubuntu guests.

I will file a corresponding rebase request to pick up the PCI passthrough and other SR-IOV work done for the Hyper-V capabilities in the upstream 4.9 kernel.

Only 64-bit support for Ubuntu 16.04's HWE kernel is needed.

Joshua R. Poulson (jrp)
summary: - Please include Mellanox OFED drivers in Azure kernel and image
+ [Hyper-V/Azure] Please include Mellanox OFED drivers in Azure kernel and
+ image
Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1650058

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Joshua R. Poulson (jrp)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu Xenial):
status: New → Triaged
Changed in linux (Ubuntu):
status: Confirmed → Triaged
Changed in linux (Ubuntu Xenial):
importance: Undecided → Medium
tags: added: kernel-da-key kernel-hyper-v xenial
Joshua R. Poulson (jrp)
description: updated
Tim Gardner (timg-tpi) wrote :
Joshua R. Poulson (jrp) wrote :

I'll pester Mellanox on that. Thanks!

Joshua R. Poulson (jrp) wrote :

Mellanox has pointed me to a repository that has the minimal set of drivers and utilities needed for SR-IOV (and, eventually, DPDK). Consider the "mlnx_rdma_minimal" repository, here: http://linux.mellanox.com/public/repo/mlnx_rdma_minimal/

I am requesting the latest 64-bit drivers and userspace for Azure images, here: http://linux.mellanox.com/public/repo/mlnx_rdma_minimal/3.4-2.0.0.0/ubuntu16.04/x86_64/

and here if we want to apply to the yakkety kernel:
http://linux.mellanox.com/public/repo/mlnx_rdma_minimal/3.4-2.0.0.0/ubuntu16.10/x86_64/

Tim Gardner (timg-tpi) wrote :

Joshua - I looked at mlnx-ofed-kernel-dkms_3.4-OFED.3.4.2.0.0.1.g30039f7_all.deb in some detail. It appears that nearly all of the drivers contained therein are replacements for existing drivers in the linux v4.4 kernel. Even without looking at the code for these replacement drivers, I can say that a patch series integrating these changes will certainly _not_ qualify under the Ubuntu SRU policy. There are a number of consumers of the Mellanox Infiniband stack for which these changes might cause regressions.

We could certainly merge these changes in an Azure specific kernel or flavor, but not as part of the generic distribution kernel binary.

rtg

Joshua R. Poulson (jrp) wrote :

Mellanox assures us that all the ConnectX3 drivers are upstream now, so I will switch focus on the kernel side to integrating those once I have the commits. DPDK, however, has userspace components which come from this repo: http://linux.mellanox.com/public/repo/mlnx_rdma_minimal/3.4-2.0.5.0/

Joshua R. Poulson (jrp) wrote :

Mellanox has told me that the following three commits are needed for SR-IOV in Azure:
1. d585df1c5ccf net/mlx4_core: Avoid command timeouts during VF driver device shutdown
2. 7c3945bc2073 net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions
3. 291c566a2891 net/mlx4_core: Fix racy CQ (Completion Queue) free

Changed in linux (Ubuntu):
status: Triaged → In Progress
Changed in linux (Ubuntu Xenial):
status: Triaged → In Progress
Changed in linux (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Joseph Salisbury (jsalisbury)
Joshua R. Poulson (jrp) wrote :

Mellanox just sent me another one:
6496bbf0ec48 net/mlx4_en: Fix bad WQE issue
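
One way to confirm that a kernel package carries the four commits named so far is to search its changelog for their subject lines. The sketch below assumes the changelog text arrives on standard input. The function name `missing_mlx4_fixes` is made up for illustration, and matching by subject rather than by hash is an assumption, made because backported commits usually receive new hashes.

```shell
# Sketch: print the subject of every expected mlx4 fix that is absent from
# a changelog read on stdin. The function name is hypothetical.
missing_mlx4_fixes() {
    changelog=$(cat)
    while IFS= read -r subject; do
        # -F: fixed-string match, so parentheses in a subject stay literal
        printf '%s\n' "$changelog" | grep -qF -- "$subject" \
            || printf '%s\n' "$subject"
    done <<'EOF'
net/mlx4_core: Avoid command timeouts during VF driver device shutdown
net/mlx4_core: Fix when to save some qp context flags for dynamic VST
net/mlx4_core: Fix racy CQ (Completion Queue) free
net/mlx4_en: Fix bad WQE issue
EOF
}
```

The second subject is deliberately cut short because changelogs may wrap long lines. With a changelog saved to a file, `missing_mlx4_fixes < changelog.txt` prints nothing when all four fixes are listed.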

Joseph Salisbury (jsalisbury) wrote :

I built a Xenial test kernel with the four patches. The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1650058/xenial/

Can you test this kernel and see if it resolves this bug?

Joseph Salisbury (jsalisbury) wrote :

I built a Xenial test kernel with all seven patches from both bug reports. The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1650058Andlp1665097/

Joshua R. Poulson (jrp) wrote :

Hrm, 4.4.0-62? We need the prerequisite merges in 4.4.0-63.

Joseph Salisbury (jsalisbury) wrote :

The patches were applied to the current kernel in -updates for testing. 4.4.0-63 is what is in -proposed (proposed is actually now up to -64). The patches will be applied to whichever version is in proposed.

I can build you a -63 or -64 kernel with the patches if you want?

Joshua R. Poulson (jrp) wrote :

The -63 in proposed has other prerequisites for SR-IOV in Azure: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1650059 and https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1657540

Since -64 is newer, and going GA on Monday I hope, it's best to go with that one.

Joseph Salisbury (jsalisbury) wrote :

Thanks for the update, Josh. I built a test kernel based on -64 with the patches from both bugs. It is available in the same location:

http://kernel.ubuntu.com/~jsalisbury/lp1650058Andlp1665097/

Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Joshua R. Poulson (jrp) wrote :

The following patch to mlx4_core fixes a problem that occurs with 16 or more vCPUs on Hyper-V when guest RDMA is disabled (as is the case with SR-IOV in Azure). Microsoft has smoke tested 16 and 20 CPU cases today.

Joshua R. Poulson (jrp) wrote :

We are testing the kernel in comment #14 and will report back as soon as possible.

tags: added: patch
Joseph Salisbury (jsalisbury) wrote :

Hi Josh,

Are you also requesting the fifth patch posted in comment #15?

Joshua R. Poulson (jrp) wrote :

Yes, I am requesting it. It will be submitted upstream soon (although there's a 32 vCPU problem still being investigated). Should I open a separate bug since the rest of the work has been committed?

Joseph Salisbury (jsalisbury) wrote :

Yes, it might be best to have a separate bug. That way we can track it separately, since these patches were already committed.

Simon Xiao (sixiao) wrote :

A quick test on the kernel in comment #14:
Mellanox CX3 VF device is not showing up when booting up with this test kernel: 4.4.0-64-generic.
Looks like this is a regression from 4.4.0-63-generic. In 4.4.0-63-generic, the VF presented correctly.

1. Boot with this test kernel: 4.4.0-64-generic; then VF is not listed in lspci:

root@ubuntu1604-client:~# uname -r
4.4.0-64-generic
root@ubuntu1604-client:~# lspci
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (AGP disabled) (rev 03)
00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 01)
00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual VGA
00:0a.0 Ethernet controller: Digital Equipment Corporation DECchip 21140 [FasterNet] (rev 20)

2. Then boot the same VM with same configuration on: 4.4.0-63-generic, VF shows up.

root@ubuntu1604-client:~# uname -r
4.4.0-63-generic
root@ubuntu1604-client:~# lspci
0000:00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (AGP disabled) (rev 03)
0000:00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 01)
0000:00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
0000:00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
0000:00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual VGA
0000:00:0a.0 Ethernet controller: Digital Equipment Corporation DECchip 21140 [FasterNet] (rev 20)
99b2:00:02.0 Ethernet controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function] <=== this is the VF device
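
The comparison above can be scripted so that each test kernel is checked the same way. This is a minimal sketch, assuming the VF's `lspci` description contains both "Mellanox" and "Virtual Function", as it does in the 4.4.0-63 listing. The function names `has_mlx_vf` and `vf_report` are hypothetical.

```shell
# Sketch: succeed if lspci-style output on stdin lists a Mellanox
# virtual function. Function names here are illustrative only.
has_mlx_vf() {
    grep -qi 'Mellanox.*Virtual Function'
}

# Print a one-line verdict for the kernel release given as $1,
# reading the lspci output from stdin.
vf_report() {
    if has_mlx_vf; then
        printf '%s: VF present\n' "$1"
    else
        printf '%s: VF missing\n' "$1"
    fi
}
```

On a guest this would be run as `lspci | vf_report "$(uname -r)"` after booting each kernel.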

Joseph Salisbury (jsalisbury) wrote :

Is the VF device not showing up only with my test kernel posted in comment #14, or does it also fail to show up with the stock -64 kernel in the proposed repository?

If the new issue only happens with my test kernel, can you test the kernel posted in comment #9? That will help narrow down which commit might be causing this. That kernel only has the first 4 patches.

I can also build a -63 based kernel with the patches instead of -64.

Joseph Salisbury (jsalisbury) wrote :

I built a test kernel based on -63 with the patches from both bugs. It is available from:

http://kernel.ubuntu.com/~jsalisbury/lp1650058Andlp1665097/

Joseph Salisbury (jsalisbury) wrote :

Also, just wanted to confirm you installed both the linux-image and the linux-image-extra .deb packages.

Joshua R. Poulson (jrp) wrote :

Ah, we don't normally install "extra" on virtual machines, but if that's where the mlx4 driver lives, we will change.
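
A quick check along these lines can show whether the "extra" package and the mlx4 modules are present for a given kernel. It is a sketch that assumes, as the comment above suggests, that the mlx4 driver ships in the linux-image-extra package. The function names are hypothetical.

```shell
# Sketch: derive the name of the "extra" modules package for a kernel
# release such as 4.4.0-64-generic. Function names are illustrative.
extra_pkg_for() {
    printf 'linux-image-extra-%s\n' "$1"
}

# Succeed if the mlx4_core module can be found for kernel release $1.
mlx4_available() {
    modinfo -k "$1" mlx4_core >/dev/null 2>&1
}
```

For the running kernel, `dpkg -s "$(extra_pkg_for "$(uname -r)")"` reports whether the package is installed, and `mlx4_available "$(uname -r)"` reports whether the module is on disk.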

Joshua R. Poulson (jrp) wrote :
Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Dexuan Cui (decui) wrote :

@Joseph
Can you please confirm the patch (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1665097/comments/4) is included in the test kernel in #25?

Dexuan Cui (decui) wrote :

@Joseph, "the test kernel in #25" means the #4 in the link of #25,
i.e. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1667007/comments/4
It looks to me like the patch is not included. Just want to confirm my guess.

Joseph Salisbury (jsalisbury) wrote :

Hi Dexuan,

I confirmed the patch for bug 1667007 is applied in the posted test kernel (comment #4 for that bug):

Author: Jack Morgenstein <email address hidden>
Date: Thu Feb 16 12:59:58 2017 +0200

    [PATCH] net/mlx4_core: Use cq quota in SRIOV when creating completion EQs

However, the test kernel for that bug did not have the four patches applied that came in from this bug. I will rebuild the test kernel for that bug and post it in that bug's comments.

Dexuan Cui (decui) wrote :

Thanks, Joseph!

Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!
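
For reference, enabling -proposed amounts to adding one apt source line for the release. The sketch below only prints that line. The mirror URL and the component list are assumptions, the function name is hypothetical, and the wiki page above remains the authoritative procedure.

```shell
# Sketch: print an apt source line for the -proposed pocket of a release.
# The mirror URL is an assumption; substitute the mirror you normally use.
proposed_source_line() {  # $1 = release codename, e.g. xenial
    printf 'deb http://archive.ubuntu.com/ubuntu/ %s-proposed restricted main multiverse universe\n' "$1"
}
```

One possible use is `proposed_source_line xenial | sudo tee /etc/apt/sources.list.d/xenial-proposed.list` followed by `sudo apt-get update`. The file name is arbitrary.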

tags: added: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-65.86

---------------
linux (4.4.0-65.86) xenial; urgency=low

  * linux: 4.4.0-65.86 -proposed tracker (LP: #1667052)

  [ Stefan Bader ]
  * Upgrade Redpine RS9113 driver to support AP mode (LP: #1665211)
    - SAUCE: Redpine driver to support Host AP mode

  * NFS client : permission denied when trying to access subshare, since kernel
    4.4.0-31 (LP: #1649292)
    - fs: Better permission checking for submounts

  * [Hyper-V] SAUCE: pci-hyperv fixes for SR-IOV on Azure (LP: #1665097)
    - SAUCE: PCI: hv: Fix wslot_to_devfn() to fix warnings on device removal
    - SAUCE: pci-hyperv: properly handle pci bus remove
    - SAUCE: pci-hyperv: lock pci bus on device eject

  * [Hyper-V/Azure] Please include Mellanox OFED drivers in Azure kernel and
    image (LP: #1650058)
    - net/mlx4_en: Fix bad WQE issue
    - net/mlx4_core: Fix racy CQ (Completion Queue) free
    - net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT
      transitions
    - net/mlx4_core: Avoid command timeouts during VF driver device shutdown

  * Xenial update to v4.4.49 stable release (LP: #1664960)
    - ARC: [arcompact] brown paper bag bug in unaligned access delay slot fixup
    - selinux: fix off-by-one in setprocattr
    - Revert "x86/ioapic: Restore IO-APIC irq_chip retrigger callback"
    - cpumask: use nr_cpumask_bits for parsing functions
    - hns: avoid stack overflow with CONFIG_KASAN
    - ARM: 8643/3: arm/ptrace: Preserve previous registers for short regset write
    - target: Don't BUG_ON during NodeACL dynamic -> explicit conversion
    - target: Use correct SCSI status during EXTENDED_COPY exception
    - target: Fix early transport_generic_handle_tmr abort scenario
    - target: Fix COMPARE_AND_WRITE ref leak for non GOOD status
    - ARM: 8642/1: LPAE: catch pending imprecise abort on unmask
    - mac80211: Fix adding of mesh vendor IEs
    - netvsc: Set maximum GSO size in the right place
    - scsi: zfcp: fix use-after-free by not tracing WKA port open/close on failed
      send
    - scsi: aacraid: Fix INTx/MSI-x issue with older controllers
    - scsi: mpt3sas: disable ASPM for MPI2 controllers
    - xen-netfront: Delete rx_refill_timer in xennet_disconnect_backend()
    - ALSA: seq: Fix race at creating a queue
    - ALSA: seq: Don't handle loop timeout at snd_seq_pool_done()
    - drm/i915: fix use-after-free in page_flip_completed()
    - Linux 4.4.49

  * NFS client : kernel 4.4.0-57 crash with nfsv4 enries in /etc/fstab
    (LP: #1650336)
    - SUNRPC: fix refcounting problems with auth_gss messages.

  * [0bda:0328] Card reader failed after S3 (LP: #1664809)
    - usb: hub: Wait for connection to be reestablished after port reset

  * linux-lts-xenial 4.4.0-63.84~14.04.2 ADT test failure with linux-lts-xenial
    4.4.0-63.84~14.04.2 (LP: #1664912)
    - SAUCE: apparmor: fix link auditing failure due to, uninitialized var

  * ibmvscsis: Add SGL LIMIT (LP: #1662551)
    - ibmvscsis: Add SGL limit

  * [Hyper-V] Bug fixes for storvsc (tagged queuing, error conditions)
    (LP: #1663687)
    - scsi: storvsc: Enable tracking of queue depth
    - scsi: storvsc: Remove the ...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Joshua R. Poulson (jrp)
tags: added: verification-done-xenial
removed: verification-needed-xenial
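
Since the Xenial fix landed in 4.4.0-65.86, a guest can be checked against that ABI number. This is a rough sketch that assumes a release string of the form `4.4.0-<abi>-<flavour>`, as printed by `uname -r` earlier in this report, and that applies only within the 4.4.0 series. The function name is hypothetical.

```shell
# Sketch: succeed if a kernel release string such as "4.4.0-65-generic"
# has an ABI number of at least the given minimum.
abi_at_least() {  # $1 = release string, $2 = minimum ABI number
    abi=${1#*-}      # drop everything up to the first '-': "65-generic"
    abi=${abi%%-*}   # drop the flavour suffix: "65"
    [ "$abi" -ge "$2" ]
}
```

For example, `abi_at_least "$(uname -r)" 65 && echo "mlx4 fixes expected"` on a Xenial 4.4.0 guest.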
Joseph Salisbury (jsalisbury) wrote :

I built a Yakkety test kernel with the four patches. The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1650058/yakkety

Can you test this kernel and see if it resolves this bug?

Changed in linux (Ubuntu Yakkety):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Joseph Salisbury (jsalisbury)
Adrian Suhov (asuhov) wrote :

Hi,
The test kernel looks good, you can submit it.

Stefan Bader (smb)
Changed in linux (Ubuntu Yakkety):
status: In Progress → Fix Committed
Kleber Sacilotto de Souza (kleber-souza) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-yakkety' to 'verification-done-yakkety'. If the problem still exists, change the tag 'verification-needed-yakkety' to 'verification-failed-yakkety'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-yakkety
Adrian Suhov (asuhov) wrote :

Hi,
I tested the latest proposed kernel and it's looking good, no issues were found. You can submit the changes.

Chris Valean (cvalean)
tags: added: verification-done-yakkety
removed: verification-needed-yakkety
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.8.0-52.55

---------------
linux (4.8.0-52.55) yakkety; urgency=low

  * linux: 4.8.0-52.55 -proposed tracker (LP: #1686976)

  * CVE-2017-7477: macsec: avoid heap overflow in skb_to_sgvec (LP: #1685892)
    - macsec: avoid heap overflow in skb_to_sgvec
    - macsec: dynamically allocate space for sglist

  * net/ipv4: original ingress device index set as the loopback interface.
    (LP: #1683982)
    - net: fix incorrect original ingress device index in PKTINFO

  * Touchpad not working correctly after kernel upgrade (LP: #1662589)
    - Input: ALPS - fix V8+ protocol handling (73 03 28)

  * ifup service of network device stay active after driver stop (LP: #1672144)
    - net: use net->count to check whether a netns is alive or not

  * [Hyper-V] mkfs regression in kernel 4.4+ (LP: #1682215)
    - block: relax check on sg gap

  * Potential memory corruption with capi adapters (LP: #1681469)
    - powerpc/mm: Add missing global TLB invalidate if cxl is active

  * [Hyper-V/Azure] Please include Mellanox OFED drivers in Azure kernel and
    image (LP: #1650058)
    - net/mlx4_en: Fix bad WQE issue
    - net/mlx4_core: Fix racy CQ (Completion Queue) free
    - net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT
      transitions
    - net/mlx4_core: Avoid command timeouts during VF driver device shutdown

 -- Stefan Bader <email address hidden> Fri, 28 Apr 2017 12:17:12 +0200

Changed in linux (Ubuntu Yakkety):
status: Fix Committed → Fix Released
Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released