git.baikalelectronics.ru Git - kernel.git/commit
staging/rdma/hfi1: Prevent silent data corruption with user SDMA
author Mitko Haralanov <mitko.haralanov@intel.com>
Mon, 26 Oct 2015 14:28:37 +0000 (10:28 -0400)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tue, 27 Oct 2015 08:19:22 +0000 (17:19 +0900)
commit f2db6ff44324666bd5764ea3d1e74158c55dba52
tree 5dd62e3e96a5ac55ccdb424f9c641b48e57ab42d
parent 75417308219f5a403ae4ffb347478cc22e4cc545

User SDMA tracks its progress through the submitted IO vectors with an offset
into the vectors. This offset is updated after each successful submission of
a txreq to the SDMA engine.

The same offset was used when determining whether an IO vector should be
'freed' (pages unpinned) in the SDMA callback functions.

This caused silent data corruption on the receive side in big jobs (> 2 nodes,
120 ranks each): because the offset advances at submission time, the send side
could unpin the vector pages before the HW had processed all descriptors
referencing the vector.
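
The failure mode can be illustrated with a small standalone model. This is a
sketch only, not the hfi1 code: struct iovec_model and every function name
below are hypothetical, and the "counted" variant is just one way to separate
submission progress from completion progress, not necessarily what this patch
does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one pinned IO vector; not the hfi1 structures. */
struct iovec_model {
	size_t length;    /* total bytes in the vector */
	size_t submitted; /* bytes handed to the engine so far */
	size_t completed; /* bytes the hardware has finished with */
	bool pinned;      /* true while the pages must stay pinned */
};

static void iovec_model_init(struct iovec_model *iov, size_t length)
{
	iov->length = length;
	iov->submitted = 0;
	iov->completed = 0;
	iov->pinned = true;
}

/* Submission path: advance the offset after a successful submit. */
static void submit_packet(struct iovec_model *iov, size_t bytes)
{
	assert(bytes <= iov->length - iov->submitted);
	iov->submitted += bytes;
}

/*
 * Flawed completion callback: decides whether to unpin from the
 * submission offset, which may already cover the whole vector while
 * earlier packets are still in flight.
 */
static void complete_packet_flawed(struct iovec_model *iov, size_t bytes)
{
	(void)bytes;
	if (iov->submitted == iov->length)
		iov->pinned = false;
}

/*
 * Assumed alternative: count completed bytes separately and unpin only
 * once every submitted byte has also completed.
 */
static void complete_packet_counted(struct iovec_model *iov, size_t bytes)
{
	iov->completed += bytes;
	if (iov->completed == iov->length)
		iov->pinned = false;
}
```

In this model, submitting two 4096-byte packets against an 8192-byte vector
and then delivering a single completion leaves the vector unpinned under
complete_packet_flawed() while one packet is still in flight, whereas
complete_packet_counted() keeps it pinned until the second completion.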

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/staging/rdma/hfi1/user_sdma.c