git.baikalelectronics.ru Git - kernel.git/commit

author	Maxim Mikityanskiy <maximmi@nvidia.com>
	Fri, 15 Apr 2022 13:19:15 +0000 (16:19 +0300)
committer	Saeed Mahameed <saeedm@nvidia.com>
	Tue, 31 May 2022 20:40:54 +0000 (13:40 -0700)
commit	83f7c22d741dc4412a785c381534f606645b62be
tree	dde8c0e66212cd5b1f8bf4909b432632249ae701
parent	ba5e544b3a21b7bb2aa323010cafdccb4f52876b

net/mlx5e: Disable softirq in mlx5e_activate_rq to avoid race condition

When the driver activates the channels, it assumes NAPI isn't running
yet. mlx5e_activate_rq posts a NOP WQE to ICOSQ to trigger a hardware
interrupt and start NAPI, which will run mlx5e_alloc_rx_mpwqe and post
UMR WQEs to ICOSQ to be able to receive packets with striding RQ.

Unfortunately, a race condition is possible if NAPI is triggered by
something else (for example, TX) before mlx5e_activate_rq finishes.
In this case, mlx5e_alloc_rx_mpwqe may post UMR WQEs to ICOSQ
concurrently, and the wqe_info of the first UMR may be overwritten by
the wqe_info of the NOP posted by mlx5e_activate_rq.

The consequence is that icosq->db.wqe_info[0].num_wqebbs will be changed
from MLX5E_UMR_WQEBBS to 1, disrupting the integrity of the array-based
linked list in wqe_info[]. mlx5e_poll_ico_cq will hang in an infinite
loop after processing wqe_info[0], because after the corruption, the
next item to be processed will be wqe_info[1], which is filled with
zeros, and `sqcc += wi->num_wqebbs` will never move further.
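
The hang described above can be reproduced in a small userspace model.
This is only a sketch, not driver code: the toy_* names, TOY_SQ_SIZE,
and the value 3 used for TOY_UMR_WQEBBS are assumptions made for
illustration (the real MLX5E_UMR_WQEBBS differs), and the step limit
exists only so the model returns instead of spinning forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_SQ_SIZE    8 /* hypothetical ring size in WQEBBs, power of two */
#define TOY_UMR_WQEBBS 3 /* illustrative stand-in for MLX5E_UMR_WQEBBS */
#define TOY_NOP_WQEBBS 1 /* a NOP occupies a single WQEBB */

struct toy_wqe_info {
	uint8_t num_wqebbs; /* distance to the next entry: the "link" */
};

struct toy_icosq {
	struct toy_wqe_info wqe_info[TOY_SQ_SIZE];
	uint16_t pc; /* producer counter */
	uint16_t cc; /* consumer counter */
};

/* Record the WQE size at the producer index, then advance the producer. */
void toy_post(struct toy_icosq *sq, uint8_t num_wqebbs)
{
	sq->wqe_info[sq->pc & (TOY_SQ_SIZE - 1)].num_wqebbs = num_wqebbs;
	sq->pc += num_wqebbs;
}

/*
 * Walk the array-based list the way a completion poller would.
 * Returns true if the consumer caught up with the producer, false if
 * it was still stuck after max_steps iterations.
 */
bool toy_poll(struct toy_icosq *sq, int max_steps)
{
	uint16_t sqcc = sq->cc;

	while (sqcc != sq->pc && max_steps-- > 0) {
		struct toy_wqe_info *wi = &sq->wqe_info[sqcc & (TOY_SQ_SIZE - 1)];

		sqcc += wi->num_wqebbs; /* adds 0 forever on a zeroed entry */
	}
	sq->cc = sqcc;
	return sqcc == sq->pc;
}

/* Ordered posting: UMR at index 0, NOP at index 3. */
bool toy_healthy_run(void)
{
	struct toy_icosq sq;

	memset(&sq, 0, sizeof(sq));
	toy_post(&sq, TOY_UMR_WQEBBS);
	toy_post(&sq, TOY_NOP_WQEBBS);
	return toy_poll(&sq, 100);
}

/* Raced posting: the NOP writer used a stale index and clobbered slot 0. */
bool toy_raced_run(void)
{
	struct toy_icosq sq;

	memset(&sq, 0, sizeof(sq));
	toy_post(&sq, TOY_UMR_WQEBBS);
	sq.wqe_info[0].num_wqebbs = TOY_NOP_WQEBBS; /* models the overwrite */
	return toy_poll(&sq, 100);
}
```

In this model toy_healthy_run() walks 0 -> 3 -> 4 and catches up with
the producer, while toy_raced_run() steps from slot 0 to slot 1, finds
num_wqebbs == 0 there, and never advances again.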

This commit fixes this race condition by using async_icosq to post the
NOP and trigger the interrupt. async_icosq is always protected with a
spinlock, eliminating the race condition.
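
Why a lock around every post removes this class of race can be
sketched in userspace as follows. This is an illustration under stated
assumptions, not the driver's implementation: the toy_* names are
hypothetical, a C11 atomic_flag stands in for the kernel spinlock, and
the second poster is simulated in the same thread so that the outcome
is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_SQ_SIZE 8 /* hypothetical ring size in WQEBBs, power of two */

struct toy_async_icosq {
	atomic_flag lock; /* stand-in for the spinlock guarding the queue */
	uint8_t num_wqebbs[TOY_SQ_SIZE];
	uint16_t pc; /* producer counter */
};

bool toy_sq_trylock(struct toy_async_icosq *sq)
{
	return !atomic_flag_test_and_set_explicit(&sq->lock,
						  memory_order_acquire);
}

void toy_sq_lock(struct toy_async_icosq *sq)
{
	while (!toy_sq_trylock(sq))
		; /* spin until the current holder releases the lock */
}

void toy_sq_unlock(struct toy_async_icosq *sq)
{
	atomic_flag_clear_explicit(&sq->lock, memory_order_release);
}

/* Reading pc, writing the entry and advancing pc form one critical section. */
void toy_post_locked(struct toy_async_icosq *sq, uint8_t n)
{
	toy_sq_lock(sq);
	sq->num_wqebbs[sq->pc & (TOY_SQ_SIZE - 1)] = n;
	sq->pc += n;
	toy_sq_unlock(sq);
}

/* Check that every entry links to the next one up to the producer. */
bool toy_ring_walkable(const struct toy_async_icosq *sq)
{
	uint16_t cc = 0;
	int steps = TOY_SQ_SIZE;

	while (cc != sq->pc && steps-- > 0) {
		uint8_t n = sq->num_wqebbs[cc & (TOY_SQ_SIZE - 1)];

		if (n == 0)
			return false; /* broken link: a poller would hang here */
		cc += n;
	}
	return cc == sq->pc;
}

bool toy_serialized_run(void)
{
	struct toy_async_icosq sq;
	bool second_entered;

	memset(sq.num_wqebbs, 0, sizeof(sq.num_wqebbs));
	sq.pc = 0;
	atomic_flag_clear(&sq.lock);

	/* First poster (activation path) enters the critical section. */
	toy_sq_lock(&sq);
	/* A second poster arriving now cannot enter; real code would spin. */
	second_entered = toy_sq_trylock(&sq);
	/* First poster finishes its one-WQEBB NOP and leaves. */
	sq.num_wqebbs[sq.pc & (TOY_SQ_SIZE - 1)] = 1;
	sq.pc += 1;
	toy_sq_unlock(&sq);

	/* The second poster retries and posts a three-WQEBB entry. */
	toy_post_locked(&sq, 3);

	return !second_entered && toy_ring_walkable(&sq);
}
```

Because the second poster can only read the producer index after the
first one has advanced it, the two entries land in distinct slots
(0 and 1 here) and the list stays walkable.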

Fixes: 2cda7f9675b3 ("net/mlx5e: Add fragmented memory support for RX multi packet WQE")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reported-by: Karsten Nielsen <karsten@foo-bar.dk>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

drivers/net/ethernet/mellanox/mlx5/core/en.h
drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
drivers/net/ethernet/mellanox/mlx5/core/en_main.c