~timg-tpi/ubuntu/+source/linux/+git/kinetic:kinetic-azure-Features-Support-and-InfiniBand-for-MANA-sf00358953

Last commit made on 2023-04-26
Get this branch:
git clone -b kinetic-azure-Features-Support-and-InfiniBand-for-MANA-sf00358953 https://git.launchpad.net/~timg-tpi/ubuntu/+source/linux/+git/kinetic

Branch information

Name:
kinetic-azure-Features-Support-and-InfiniBand-for-MANA-sf00358953
Repository:
lp:~timg-tpi/ubuntu/+source/linux/+git/kinetic

Recent commits

3e1a0b5... by Long Li

RDMA/mana_ib: Fix a bug when the PF indicates more entries for registering memory on first packet

When registering memory in a large chunk that doesn't fit into a single PF
message, the PF may return GDMA_STATUS_MORE_ENTRIES on the first message if
there are more messages needed for registering more chunks.

Fix the VF to make it process the correct return code.

Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Link: https://<email address hidden>
Signed-off-by: Long Li <email address hidden>
Signed-off-by: Jason Gunthorpe <email address hidden>
(cherry picked from commit 89d42b8c85b4c67d310c5ccaf491acbf71a260c3)
Signed-off-by: Tim Gardner <email address hidden>
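
The behaviour described in the RDMA/mana_ib fix above can be illustrated with a small user-space sketch. It is not the driver code: the status enum, fake_pf_send() and register_pages() are hypothetical names invented for this example, and the real status values and message layout live in the kernel's GDMA headers. The only point it shows is that every non-final message, including the very first one, is expected to come back with a "more entries" status.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch only. */
enum sketch_status {
    SKETCH_OK = 0,
    SKETCH_MORE_ENTRIES = 1,
    SKETCH_ERROR = -1
};

/* Hypothetical stand-in for the PF: it reports MORE_ENTRIES while pages
 * remain to be registered and OK once the last batch has arrived. */
static enum sketch_status fake_pf_send(size_t sent_so_far, size_t batch,
                                       size_t total)
{
    if (batch == 0)
        return SKETCH_ERROR;
    return (sent_so_far + batch < total) ? SKETCH_MORE_ENTRIES : SKETCH_OK;
}

/* Registers total_pages in batches of at most pages_per_msg.
 * Returns the number of messages sent, or -1 on an unexpected status. */
static int register_pages(size_t total_pages, size_t pages_per_msg)
{
    size_t sent = 0;
    int msgs = 0;

    while (sent < total_pages) {
        size_t batch = total_pages - sent;
        if (batch > pages_per_msg)
            batch = pages_per_msg;

        enum sketch_status st = fake_pf_send(sent, batch, total_pages);
        sent += batch;
        msgs++;

        /* MORE_ENTRIES is the expected reply for every message except
         * the last one, so it must not be treated as a failure even
         * when it is returned for the first message. */
        bool last = (sent == total_pages);
        enum sketch_status expected = last ? SKETCH_OK : SKETCH_MORE_ENTRIES;
        if (st != expected)
            return -1;
    }
    return msgs;
}
```

With these assumptions, registering 10 pages at 4 pages per message takes three messages, and the first two replies carry the "more entries" status.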

592174a... by Haiyang Zhang

net: mana: Fix accessing freed irq affinity_hint

After calling irq_set_affinity_and_hint(), the cpumask pointer is
saved in desc->affinity_hint, and will be used later when reading
/proc/irq/<num>/affinity_hint. So the cpumask variable needs to be
persistent. Otherwise, we are accessing freed memory when reading
the affinity_hint file.

Also, affinity_hint needs to be cleared before free_irq(), otherwise there
is a one-time warning and stack trace during module unloading:

 [ 243.948687] WARNING: CPU: 10 PID: 1589 at kernel/irq/manage.c:1913 free_irq+0x318/0x360
 ...
 [ 243.948753] Call Trace:
 [ 243.948754] <TASK>
 [ 243.948760] mana_gd_remove_irqs+0x78/0xc0 [mana]
 [ 243.948767] mana_gd_remove+0x3e/0x80 [mana]
 [ 243.948773] pci_device_remove+0x3d/0xb0
 [ 243.948778] device_remove+0x46/0x70
 [ 243.948782] device_release_driver_internal+0x1fe/0x280
 [ 243.948785] driver_detach+0x4e/0xa0
 [ 243.948787] bus_remove_driver+0x70/0xf0
 [ 243.948789] driver_unregister+0x35/0x60
 [ 243.948792] pci_unregister_driver+0x44/0x90
 [ 243.948794] mana_driver_exit+0x14/0x3fe [mana]
 [ 243.948800] __do_sys_delete_module.constprop.0+0x185/0x2f0

To fix the bug, use the persistent mask, cpumask_of(cpu#), and set
affinity_hint to NULL before freeing the IRQ, as required by free_irq().

Cc: <email address hidden>
Fixes: 71fa6887eeca ("net: mana: Assign interrupts to CPUs based on NUMA nodes")
Signed-off-by: Haiyang Zhang <email address hidden>
Reviewed-by: Michael Kelley <email address hidden>
Reviewed-by: Leon Romanovsky <email address hidden>
Link: https://<email address hidden>
Signed-off-by: Jakub Kicinski <email address hidden>
(cherry picked from commit 18a048370b06a3a521219e9e5b10bdc2178ef19c)
Signed-off-by: Tim Gardner <email address hidden>
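
The two rules from the affinity_hint fix above (store a pointer to a mask that outlives the caller, and clear the hint before freeing the IRQ) can be modelled in user space. This is only a sketch: struct sketch_irq_desc, sketch_cpumask_of(), sketch_set_hint() and sketch_free_irq() are hypothetical stand-ins, a single unsigned long stands in for a cpumask, and the "warning" is reduced to a return value.

```c
#include <stddef.h>

#define SKETCH_NR_CPUS 8

/* Persistent one-bit-per-CPU masks with static storage duration,
 * standing in for what cpumask_of(cpu) provides in the kernel. */
static const unsigned long sketch_cpu_masks[SKETCH_NR_CPUS] = {
    1UL << 0, 1UL << 1, 1UL << 2, 1UL << 3,
    1UL << 4, 1UL << 5, 1UL << 6, 1UL << 7
};

/* Hypothetical descriptor: the hint is stored as a pointer, not copied,
 * so the pointed-to mask must stay valid for as long as it is set. */
struct sketch_irq_desc {
    const unsigned long *affinity_hint;
};

static const unsigned long *sketch_cpumask_of(unsigned int cpu)
{
    return cpu < SKETCH_NR_CPUS ? &sketch_cpu_masks[cpu] : NULL;
}

static void sketch_set_hint(struct sketch_irq_desc *d,
                            const unsigned long *mask)
{
    d->affinity_hint = mask;
}

/* Models the teardown check: returns 1 if a hint was still set (the
 * case that triggers the one-time warning), 0 otherwise. */
static int sketch_free_irq(struct sketch_irq_desc *d)
{
    int warned = (d->affinity_hint != NULL);
    d->affinity_hint = NULL;
    return warned;
}
```

In this model, calling sketch_set_hint(&d, NULL) before sketch_free_irq(&d) is what keeps the teardown path quiet.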

e9fc1ea... by error27

RDMA/mana_ib: Prevent array underflow in mana_ib_create_qp_raw()

The "port" comes from the user and if it is zero then the:

 ndev = mc->ports[port - 1];

assignment does an out of bounds read. I have changed the if
statement to fix this and to mirror how it is done in
mana_ib_create_qp_rss().

Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Signed-off-by: Dan Carpenter <email address hidden>
Link: https://lore.kernel.org/r/Y8/3Vn8qx00kE9Kk@kili
Acked-by: Long Li <email address hidden>
Signed-off-by: Leon Romanovsky <email address hidden>
(cherry picked from commit 563ca0e9eab8acc8a1309e8b440108ff8d23e951)
Signed-off-by: Tim Gardner <email address hidden>
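
The underflow described in the commit above comes from indexing with port - 1 when port is zero. A minimal sketch of the kind of range check that avoids it follows; struct sketch_ctx and sketch_lookup_port() are hypothetical names for illustration, and the sketch assumes port numbers are 1-based and that num_ports never exceeds the array size.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_PORTS 4

/* Hypothetical context; assumes num_ports <= SKETCH_MAX_PORTS. */
struct sketch_ctx {
    unsigned int num_ports;
    void *ports[SKETCH_MAX_PORTS];
};

/* Looks up a 1-based, user-supplied port number.
 * Returns 0 and stores the entry in *ndev, or -EINVAL if out of range. */
static int sketch_lookup_port(const struct sketch_ctx *mc,
                              unsigned int port, void **ndev)
{
    /* Reject 0 before computing port - 1: with an unsigned port the
     * subtraction would wrap around and index far outside the array. */
    if (port < 1 || port > mc->num_ports)
        return -EINVAL;

    *ndev = mc->ports[port - 1];
    return 0;
}
```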

272d753... by Leon Romanovsky <email address hidden>

RDMA/mana: Remove redefinition of basic u64 type

gdma_obj_handle_t is nothing more than a redefinition of the basic
u64 type. Remove this obfuscation.

Link: https://lore.kernel.org/<email address hidden>
Acked-by: Long Li <email address hidden>
Signed-off-by: Leon Romanovsky <email address hidden>
(cherry picked from commit 3574cfdca28543e2e8db649297cd6659ea8e4bb8)
Signed-off-by: Tim Gardner <email address hidden>

29a5662... by Long Li

RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter

Add an RDMA VF driver for Microsoft Azure Network Adapter (MANA).

Co-developed-by: Ajay Sharma <email address hidden>
Signed-off-by: Ajay Sharma <email address hidden>
Reviewed-by: Dexuan Cui <email address hidden>
Signed-off-by: Long Li <email address hidden>
Link: https://<email address hidden>
Signed-off-by: Leon Romanovsky <email address hidden>
(backported from commit 0266a177631d4c6b963b5b12dd986a8c5abdbf06)
[rtg - minor context adjustments]
Signed-off-by: Tim Gardner <email address hidden>

3428bb4... by Nathan Huckleberry <email address hidden>

net: mana: Fix return type of mana_start_xmit()

The ndo_start_xmit field in net_device_ops is expected to be of type
netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, struct net_device *dev).

The mismatched return type breaks forward edge kCFI since the underlying
function definition does not match the function hook definition. A new
warning in clang will catch this at compile time:

  drivers/net/ethernet/microsoft/mana/mana_en.c:382:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
          .ndo_start_xmit = mana_start_xmit,
                                    ^~~~~~~~~~~~~~~
  1 error generated.

The return type of mana_start_xmit should be changed from int to
netdev_tx_t.

Reported-by: Dan Carpenter <email address hidden>
Link: https://github.com/ClangBuiltLinux/linux/issues/1703
Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Signed-off-by: Nathan Huckleberry <email address hidden>
Reviewed-by: Dexuan Cui <email address hidden>
[nathan: Rebase on net-next and resolve conflicts
         Add note about new clang warning]
Signed-off-by: Nathan Chancellor <email address hidden>
Link: https://<email address hidden>
Signed-off-by: Paolo Abeni <email address hidden>
(cherry picked from commit 0c9ef08a4d0fd6c5e6000597b506235d71a85a61)
Signed-off-by: Tim Gardner <email address hidden>
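
The type mismatch described in the commit above can be reproduced in miniature. The sketch below uses hypothetical sketch_-prefixed types in place of the kernel's netdev_tx_t, struct sk_buff, struct net_device and net_device_ops; it shows a handler whose return type matches the hook's declared type, which is the shape the fix establishes.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types named in the commit. */
enum sketch_netdev_tx { SKETCH_TX_OK = 0x00, SKETCH_TX_BUSY = 0x10 };
typedef enum sketch_netdev_tx sketch_netdev_tx_t;

struct sketch_skb;
struct sketch_dev;

/* The hook is declared with the enum typedef as its return type. */
struct sketch_ops {
    sketch_netdev_tx_t (*start_xmit)(struct sketch_skb *skb,
                                     struct sketch_dev *dev);
};

/* The handler returns sketch_netdev_tx_t rather than int, so the
 * function type matches the hook exactly. Declaring it as int would be
 * the mismatch that strict function-pointer checks reject. */
static sketch_netdev_tx_t sketch_start_xmit(struct sketch_skb *skb,
                                            struct sketch_dev *dev)
{
    (void)skb;
    (void)dev;
    return SKETCH_TX_OK;
}

static const struct sketch_ops sketch_ops_table = {
    .start_xmit = sketch_start_xmit,
};
```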

75d591d... by Ajay Sharma <email address hidden>

net: mana: Define data structures for protection domain and memory registration

The MANA hardware supports protection domains and memory registration for use
in an RDMA environment. Add those definitions and expose them for use by the
RDMA driver.

Signed-off-by: Ajay Sharma <email address hidden>
Signed-off-by: Long Li <email address hidden>
Link: https://<email address hidden>
Reviewed-by: Dexuan Cui <email address hidden>
Acked-by: Haiyang Zhang <email address hidden>
Signed-off-by: Leon Romanovsky <email address hidden>
(cherry picked from commit 28c66cfa45388af1126985d1114e0ed762eb2abd)
Signed-off-by: Tim Gardner <email address hidden>

6292ce5... by Long Li

net: mana: Define data structures for allocating doorbell page from GDMA

The RDMA device needs to allocate doorbell pages for each user context.
Define the GDMA data structures for use by the RDMA driver.

Reviewed-by: Dexuan Cui <email address hidden>
Signed-off-by: Long Li <email address hidden>
Link: https://<email address hidden>
Acked-by: Haiyang Zhang <email address hidden>
Signed-off-by: Leon Romanovsky <email address hidden>
(cherry picked from commit f72ececfc197e9b0bbb5595294908a950cf444fa)
Signed-off-by: Tim Gardner <email address hidden>

fb95c43... by Ajay Sharma <email address hidden>

net: mana: Define and process GDMA response code GDMA_STATUS_MORE_ENTRIES

When doing memory registration, the PF may respond with
GDMA_STATUS_MORE_ENTRIES to indicate a follow-up request is needed. This is
not an error and should be processed as expected.

Signed-off-by: Ajay Sharma <email address hidden>
Reviewed-by: Dexuan Cui <email address hidden>
Signed-off-by: Long Li <email address hidden>
Link: https://<email address hidden>
Acked-by: Haiyang Zhang <email address hidden>
Signed-off-by: Leon Romanovsky <email address hidden>
(cherry picked from commit de372f2a9ca7ada2698ecac7df8f02407cd98fa0)
Signed-off-by: Tim Gardner <email address hidden>

3b3d4bd... by Long Li

net: mana: Define max values for SGL entries

The maximum number of SGL entries should be computed from the maximum
WQE size for the intended queue type and the corresponding OOB data
size. This guarantees the hardware queue can successfully queue requests
up to the queue depth exposed to the upper layer.

Reviewed-by: Dexuan Cui <email address hidden>
Signed-off-by: Long Li <email address hidden>
Link: https://<email address hidden>
Acked-by: Haiyang Zhang <email address hidden>
Signed-off-by: Leon Romanovsky <email address hidden>
(cherry picked from commit aa56549792fb348892fbbae67f6f0c71bb750b65)
Signed-off-by: Tim Gardner <email address hidden>
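
The computation described in the commit above (entries derived from the maximum WQE size minus the OOB data size) can be sketched as plain arithmetic. The helper name sketch_max_sgl_entries() is hypothetical, and the sizes used in the usage note are illustrative assumptions, not values taken from the commit; the actual driver may also reserve additional space in the WQE.

```c
#include <stddef.h>

/* Returns how many scatter-gather entries of sge_size bytes fit in a
 * work queue element of max_wqe_size bytes after oob_size bytes are
 * reserved for out-of-band data. Returns 0 if nothing fits. */
static size_t sketch_max_sgl_entries(size_t max_wqe_size, size_t oob_size,
                                     size_t sge_size)
{
    if (sge_size == 0 || oob_size >= max_wqe_size)
        return 0;
    return (max_wqe_size - oob_size) / sge_size;
}
```

For example, assuming a 512-byte WQE, 8 bytes of OOB data and 16-byte entries, sketch_max_sgl_entries(512, 8, 16) gives 31, because 504 / 16 rounds down to 31.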