~ztahenakos/ubuntu/+source/linux/+git/jammy:ovs-internal-port-lp1983498

Last commit made on 2022-08-03
Get this branch:
git clone -b ovs-internal-port-lp1983498 https://git.launchpad.net/~ztahenakos/ubuntu/+source/linux/+git/jammy

Branch information

Name:
ovs-internal-port-lp1983498
Repository:
lp:~ztahenakos/ubuntu/+source/linux/+git/jammy

Recent commits

5195a9a... by Ariel Levkovich <email address hidden>

net/mlx5e: TC, fix decap fallback to uplink when int port not supported

BugLink: https://bugs.launchpad.net/bugs/1983498

When resolving the decap route device for a tunnel decap rule,
the result may be an OVS internal port device.

Prior to adding the support for internal port offload, such a case
would result in using the uplink as the default decap route device,
which allowed devices that can't support internal port offload
to offload this decap rule.

This behavior got broken by adding the internal port offload which
will fail in case the device can't support internal port offload.

To restore the old behavior, use the uplink device as the decap
route as before when internal port offload is not supported.

Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device")
Signed-off-by: Ariel Levkovich <email address hidden>
Reviewed-by: Maor Dickman <email address hidden>
Signed-off-by: Saeed Mahameed <email address hidden>
(cherry picked from commit e3fdc71bcb6ffe1d4870a89252ba296a9558e294)
Signed-off-by: Zachary Tahenakos <email address hidden>

53d5644... by Ariel Levkovich <email address hidden>

net/mlx5e: Fix wrong source vport matching on tunnel rule

BugLink: https://bugs.launchpad.net/bugs/1983498

When OVS internal port is the vtep device, the first decap
rule is matching on the internal port's vport metadata value
and then changes the metadata to be the uplink's value.

Therefore, following rules on the tunnel, in chain > 0, should
avoid matching on internal port metadata and use the uplink
vport metadata instead.

Select the uplink's metadata value for the source vport match
in case the rule is in chain greater than zero, even if the tunnel
route device is internal port.

Fixes: 166f431ec6be ("net/mlx5e: Add indirect tc offload of ovs internal port")
Signed-off-by: Ariel Levkovich <email address hidden>
Reviewed-by: Maor Dickman <email address hidden>
Signed-off-by: Saeed Mahameed <email address hidden>
(cherry picked from commit cb0d54cbf94866b48a73e10a73a55655f808cc7c)
Signed-off-by: Zachary Tahenakos <email address hidden>

add2a15... by Roi Dayan

net/mlx5e: Avoid implicit modify hdr for decap drop rule

BugLink: https://bugs.launchpad.net/bugs/1983498

Currently the driver adds an implicit modify hdr action for
decap rules on tunnel devices if the port is an ovs port.
This is also done if the action is drop, where the modify hdr
is redundant. The FW also doesn't support it and will generate
a syndrome.

kernel: mlx5_core 0000:08:00.0: mlx5_cmd_check:777:(pid 102063): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x8708c3)

Fix it by adding the implicit modify hdr only for fwd actions.

Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device")
Fixes: 077cdda764c7 ("net/mlx5e: TC, Fix memory leak with rules with internal port")
Signed-off-by: Roi Dayan <email address hidden>
Reviewed-by: Ariel Levkovich <email address hidden>
Signed-off-by: Saeed Mahameed <email address hidden>
(cherry picked from commit 5b209d1a22afabfb7d644abb10510c5713a3e569)
Signed-off-by: Zachary Tahenakos <email address hidden>
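
The condition introduced by the commit above can be sketched as a predicate over a rule's action flags. The flag names and the function are hypothetical; the driver uses its own flow action flags.

```c
#include <stdbool.h>

/* Hypothetical action flags for illustration only. */
enum {
    ACT_FWD   = 1u << 0,
    ACT_DROP  = 1u << 1,
    ACT_DECAP = 1u << 2,
};

/*
 * Decide whether the implicit modify hdr action should be added.
 * It is only useful when a decap rule on an internal port route device
 * forwards the packet; for a drop rule it is redundant and, per the
 * commit message, rejected by firmware.
 */
static bool needs_implicit_mod_hdr(unsigned int actions,
                                   bool route_dev_is_int_port)
{
    if (!route_dev_is_int_port || !(actions & ACT_DECAP))
        return false;

    return (actions & ACT_FWD) != 0;
}
```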

b70d8a4... by Dima Chumak <email address hidden>

net/mlx5e: Fix nullptr on deleting mirroring rule

BugLink: https://bugs.launchpad.net/bugs/1983498

Deleting a TC rule with multiple outputs, one of which is an internal port,
like this one:

  tc filter del dev enp8s0f0_0 ingress protocol ip pref 5 flower \
      dst_mac 0c:42:a1:d1:d0:88 \
      src_mac e4:ea:09:08:00:02 \
      action tunnel_key set \
          src_ip 0.0.0.0 \
          dst_ip 7.7.7.8 \
          id 8 \
          dst_port 4789 \
      action mirred egress mirror dev vxlan_sys_4789 pipe \
      action mirred egress redirect dev enp8s0f0_1

Triggers a call trace:

  BUG: kernel NULL pointer dereference, address: 0000000000000230
  RIP: 0010:del_sw_hw_rule+0x2b/0x1f0 [mlx5_core]
  Call Trace:
   tree_remove_node+0x16/0x30 [mlx5_core]
   mlx5_del_flow_rules+0x51/0x160 [mlx5_core]
   __mlx5_eswitch_del_rule+0x4b/0x170 [mlx5_core]
   mlx5e_tc_del_fdb_flow+0x295/0x550 [mlx5_core]
   mlx5e_flow_put+0x1f/0x70 [mlx5_core]
   mlx5e_delete_flower+0x286/0x390 [mlx5_core]
   tc_setup_cb_destroy+0xac/0x170
   fl_hw_destroy_filter+0x94/0xc0 [cls_flower]
   __fl_delete+0x15e/0x170 [cls_flower]
   fl_delete+0x36/0x80 [cls_flower]
   tc_del_tfilter+0x3a6/0x6e0
   rtnetlink_rcv_msg+0xe5/0x360
   ? rtnl_calcit.isra.0+0x110/0x110
   netlink_rcv_skb+0x46/0x110
   netlink_unicast+0x16b/0x200
   netlink_sendmsg+0x202/0x3d0
   sock_sendmsg+0x33/0x40
   ____sys_sendmsg+0x1c3/0x200
   ? copy_msghdr_from_user+0xd6/0x150
   ___sys_sendmsg+0x88/0xd0
   ? ___sys_recvmsg+0x88/0xc0
   ? do_futex+0x10c/0x460
   __sys_sendmsg+0x59/0xa0
   do_syscall_64+0x48/0x140
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix by disabling offloading for flows matching
esw_is_chain_src_port_rewrite() which have more than one output.

Fixes: 10742efc20a4 ("net/mlx5e: VF tunnel TX traffic offloading")
Signed-off-by: Dima Chumak <email address hidden>
Reviewed-by: Roi Dayan <email address hidden>
Signed-off-by: Saeed Mahameed <email address hidden>
(cherry picked from commit de31854ece175e12ff3c35d07f340988823aed34)
Signed-off-by: Zachary Tahenakos <email address hidden>

1d0cdb6... by Gal Pressman <email address hidden>

net/mlx5e: Fix skb memory leak when TC classifier action offloads are disabled

BugLink: https://bugs.launchpad.net/bugs/1983498

When TC classifier action offloads are disabled (CONFIG_MLX5_CLS_ACT in
Kconfig), the mlx5e_rep_tc_receive() function which is responsible for
passing the skb to the stack (or freeing it) is defined as a nop, and
results in leaking the skb memory. Replace the nop with a call to
napi_gro_receive() to resolve the leak.

Fixes: 28e7606fa8f1 ("net/mlx5e: Refactor rx handler of represetor device")
Signed-off-by: Gal Pressman <email address hidden>
Reviewed-by: Ariel Levkovich <email address hidden>
Signed-off-by: Saeed Mahameed <email address hidden>
(cherry picked from commit a0cb909644c36230a3c48904d14b91732de79fc0)
Signed-off-by: Zachary Tahenakos <email address hidden>
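
The ownership problem behind the commit above can be illustrated in user space. In this sketch, `struct pkt` stands in for an skb and `deliver_to_stack()` stands in for `napi_gro_receive()`; the `SKETCH_CLS_ACT` macro is a hypothetical substitute for the Kconfig option. The point is that the stub variant must still consume the buffer it is handed.

```c
#include <stdlib.h>

struct pkt { int len; };      /* hypothetical stand-in for struct sk_buff */

static int delivered_count;   /* how many packets reached the "stack" */

/* Stand-in for napi_gro_receive(): takes ownership of the packet. */
static void deliver_to_stack(struct pkt *p)
{
    delivered_count++;
    free(p);
}

#ifdef SKETCH_CLS_ACT
/* Full variant: TC processing would happen here before delivery. */
static void rep_tc_receive(struct pkt *p)
{
    deliver_to_stack(p);
}
#else
/*
 * Stub variant used when the feature is compiled out. It must still
 * consume the packet; an empty body here would leak every packet
 * handed to it.
 */
static void rep_tc_receive(struct pkt *p)
{
    deliver_to_stack(p);
}
#endif
```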

0e1e8f2... by Roi Dayan

net/mlx5e: TC, Fix memory leak with rules with internal port

BugLink: https://bugs.launchpad.net/bugs/1983498

Fix a memory leak with decap rule with internal port as destination
device. The driver allocates a modify hdr action but doesn't set
the flow attr modify hdr action which results in skipping releasing
the modify hdr action when releasing the flow.

backtrace:
    [<000000005f8c651c>] krealloc+0x83/0xd0
    [<000000009f59b143>] alloc_mod_hdr_actions+0x156/0x310 [mlx5_core]
    [<000000002257f342>] mlx5e_tc_match_to_reg_set_and_get_id+0x12a/0x360 [mlx5_core]
    [<00000000b44ea75a>] mlx5e_tc_add_fdb_flow+0x962/0x1470 [mlx5_core]
    [<0000000003e384a0>] __mlx5e_add_fdb_flow+0x54c/0xb90 [mlx5_core]
    [<00000000ed8b22b6>] mlx5e_configure_flower+0xe45/0x4af0 [mlx5_core]
    [<00000000024f4ab5>] mlx5e_rep_indr_offload.isra.0+0xfe/0x1b0 [mlx5_core]
    [<000000006c3bb494>] mlx5e_rep_indr_setup_tc_cb+0x90/0x130 [mlx5_core]
    [<00000000d3dac2ea>] tc_setup_cb_add+0x1d2/0x420

Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device")
Signed-off-by: Roi Dayan <email address hidden>
Signed-off-by: Saeed Mahameed <email address hidden>
(cherry picked from commit 077cdda764c7f147e03e6065ba0cd1dbc1bf00d1)
Signed-off-by: Zachary Tahenakos <email address hidden>

4666390... by tititou

net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'

BugLink: https://bugs.launchpad.net/bugs/1983498

All the error handling paths of 'mlx5e_tc_add_fdb_flow()' end at 'err_out',
where 'flow_flag_set(flow, FAILED);' is called.

The exceptions are the new error handling paths added by the commits given
in the Fixes tags below.

Fix these error handling paths so that they branch to 'err_out'.

Fixes: 166f431ec6be ("net/mlx5e: Add indirect tc offload of ovs internal port")
Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device")
Signed-off-by: Christophe JAILLET <email address hidden>
Reviewed-by: Roi Dayan <email address hidden>
Signed-off-by: Saeed Mahameed <email address hidden>
(cherry picked from commit 31108d142f3632970f6f3e0224bd1c6781c9f87d)
(cherry picked from commit 4390c6edc0fb390e699d0f886f45575dfeafeb4b)
Signed-off-by: Zachary Tahenakos <email address hidden>

806533f... by Ariel Levkovich <email address hidden>

net/mlx5: Support internal port as decap route device

BugLink: https://bugs.launchpad.net/bugs/1983498

When performing route device lookup for decap action, support
the case of ovs internal port as the lookup result.

In such case, an internal port struct is mapped and attached
to the flow attributes so that the source port matching of the
rule will match on the internal port's metadata value.

Signed-off-by: Ariel Levkovich <email address hidden>
Reviewed-by: Vlad Buslov <email address hidden>
Reviewed-by: Roi Dayan <email address hidden>
Signed-off-by: Saeed Mahameed <email address hidden>
(cherry picked from commit b16eb3c81fe27978afdb2c111908d4d627a88d99)
Signed-off-by: Zachary Tahenakos <email address hidden>

ad06751... by Ariel Levkovich <email address hidden>

net/mlx5e: Term table handling of internal port rules

BugLink: https://bugs.launchpad.net/bugs/1983498

Adjust termination table logic to handle rules which
involve internal port as filter or forwarding device.

For cases where the rule forwards from internal port
to uplink, always choose to go via termination table.
This is because it is not known from where the packet
originally arrived to the internal port and it is possible
that it came from the uplink itself, in which case
a term table is required to perform hairpin.
If the packet arrived from a vport, going via term
table has no effect.

For cases where the rule forwards to an internal port
from uplink, the rep pointer will point to the uplink rep;
avoid going via termination table as it is not required.

Signed-off-by: Ariel Levkovich <email address hidden>
Reviewed-by: Roi Dayan <email address hidden>
Signed-off-by: Saeed Mahameed <email address hidden>
(cherry picked from commit 5e9942721749fc96b9df4b0545474153316c0571)
Signed-off-by: Zachary Tahenakos <email address hidden>
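
The two cases described in the commit above can be sketched as a decision function. The enums and the function name are hypothetical; any source and destination combination not covered by the commit message is reported as "use the existing logic" rather than guessed.

```c
/* Hypothetical port classification for illustration only. */
enum port_kind { PORT_VPORT, PORT_UPLINK, PORT_INT_PORT };

enum term_tbl_decision {
    TERM_TBL_REQUIRED,       /* always go via the termination table */
    TERM_TBL_NOT_REQUIRED,   /* skip the termination table */
    TERM_TBL_EXISTING_LOGIC  /* not covered here; keep prior behavior */
};

/*
 * Internal port -> uplink: the packet may have originally come from the
 * uplink, so a termination table is needed to allow hairpin.
 * Uplink -> internal port: no termination table is needed.
 */
static enum term_tbl_decision term_tbl_for_rule(enum port_kind src,
                                                enum port_kind dst)
{
    if (src == PORT_INT_PORT && dst == PORT_UPLINK)
        return TERM_TBL_REQUIRED;

    if (src == PORT_UPLINK && dst == PORT_INT_PORT)
        return TERM_TBL_NOT_REQUIRED;

    return TERM_TBL_EXISTING_LOGIC;
}
```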

1f6f143... by Ariel Levkovich <email address hidden>

net/mlx5e: Add indirect tc offload of ovs internal port

BugLink: https://bugs.launchpad.net/bugs/1983498

Register callbacks for tc blocks of ovs internal port devices.

This allows indirect offloading of rules that apply on
such devices as the filter device.

In case a rule is added to a tc block of an internal port,
the mlx5 driver will implicitly add a matching on the internal
port's unique vport metadata value to the rule's matching list.
Therefore, only packets that previously hit a rule that redirects
to an internal port and got the vport metadata overwritten to the
internal port's unique metadata, can match on such indirect rule.

Offloading of both ingress and egress tc blocks of internal ports
is supported as opposed to other devices where only ingress block
offloading is supported.

Signed-off-by: Ariel Levkovich <email address hidden>
Reviewed-by: Paul Blakey <email address hidden>
Reviewed-by: Vlad Buslov <email address hidden>
Reviewed-by: Roi Dayan <email address hidden>
Signed-off-by: Saeed Mahameed <email address hidden>
(cherry picked from commit 166f431ec6beaf472bc2e116a202a127b64779e4)
Signed-off-by: Zachary Tahenakos <email address hidden>