git.baikalelectronics.ru Git - kernel.git/commit
svcrdma: Revert "svcrdma: Reduce Receive doorbell rate"
author Chuck Lever <chuck.lever@oracle.com>
Thu, 11 Mar 2021 18:25:01 +0000 (13:25 -0500)
committer Chuck Lever <chuck.lever@oracle.com>
Thu, 11 Mar 2021 20:26:07 +0000 (15:26 -0500)
commit b089d7d9e4cf80377d420cd5d5954cbc27244221
tree 2f612411b0860a12b31419b89076f36f635350a8
parent 30a5ca6dd6ae68917cda1948f4f487528b59b4bf

I tested commit 6578826ad510 ("svcrdma: Reduce Receive doorbell
rate") with mlx4 (IB) and software iWARP and didn't find any
issues. However, I recently got my hardware iWARP setup back on
line (FastLinQ) and it's crashing hard on this commit (confirmed
via bisect).

The failure mode is complex.
 - After a connection is established, the first Receive completes
   normally.
 - But the second and third Receives have garbage in their Receive
   buffers. The server responds with ERR_VERS as a result.
 - When the client tears down the connection to retry, a couple
   of posted Receives flush twice, and that corrupts the recv_ctxt
   free list.
 - __svc_rdma_free then faults or loops infinitely while destroying
   the xprt's recv_ctxts.
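
The last two steps can be illustrated with a minimal sketch. It assumes
a plain singly linked free list; struct ctxt and the helper functions
below are hypothetical stand-ins, not the kernel's recv_ctxt or its list
primitives. It shows only why releasing the same context twice can make
a teardown walk loop forever, not the faulting case.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a recv_ctxt; not the kernel structure. */
struct ctxt {
	struct ctxt *next;
};

/* Push a node onto the head of a singly linked free list. */
static void free_list_push(struct ctxt **head, struct ctxt *node)
{
	node->next = *head;
	*head = node;
}

/*
 * Walk the list the way a destructor would, but give up after
 * max_steps. Returns true if the walk reached the end of the list.
 */
static bool free_list_walk_terminates(const struct ctxt *head,
				      size_t max_steps)
{
	size_t steps = 0;

	while (head) {
		if (++steps > max_steps)
			return false;
		head = head->next;
	}
	return true;
}

/* Returns true if a double release leaves a list that never ends. */
static bool double_release_creates_cycle(void)
{
	struct ctxt a = { NULL }, b = { NULL };
	struct ctxt *head = NULL;

	free_list_push(&head, &a);
	free_list_push(&head, &b);
	if (!free_list_walk_terminates(head, 16))
		return false;	/* a healthy list must terminate */

	/* Second release of b: b->next now points at b itself. */
	free_list_push(&head, &b);
	return !free_list_walk_terminates(head, 16);
}
```

In this sketch the second push sets b's next pointer to b, so an
unbounded walk over the list would never reach NULL.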

Since 6578826ad510 ("svcrdma: Reduce Receive doorbell rate") does
not fix a bug but is a scalability enhancement, it's safe and
appropriate to revert it while working on a replacement.

Fixes: 6578826ad510 ("svcrdma: Reduce Receive doorbell rate")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
include/linux/sunrpc/svc_rdma.h
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c