]> git.baikalelectronics.ru Git - kernel.git/commit
IB/mlx5: Fix implicit ODP race
authorArtemy Kovalyov <artemyko@mellanox.com>
Thu, 27 Feb 2020 11:39:18 +0000 (13:39 +0200)
committerJason Gunthorpe <jgg@mellanox.com>
Wed, 4 Mar 2020 17:25:00 +0000 (13:25 -0400)
commit41ceace614013e959a8379e57c263ac5ac166439
tree919b5bb1d3cdf638fec20aa2da669949e19c1601
parent9b37825676c79013db879af2554ba16bab28ae2e
IB/mlx5: Fix implicit ODP race

Following race may occur because of the call_srcu and the placement of
the synchronize_srcu vs the xa_erase.

CPU0    CPU1

mlx5_ib_free_implicit_mr:    destroy_unused_implicit_child_mr:
 xa_erase(odp_mkeys)
 synchronize_srcu()
    xa_lock(implicit_children)
    if (still in xarray)
       atomic_inc()
       call_srcu()
    xa_unlock(implicit_children)
 xa_erase(implicit_children):
   xa_lock(implicit_children)
   __xa_erase()
   xa_unlock(implicit_children)

 flush_workqueue()
   [..]
    free_implicit_child_mr_rcu:
     (via call_srcu)
      queue_work()

 WARN_ON(atomic_read())
   [..]
    free_implicit_child_mr_work:
     (via wq)
      free_implicit_child_mr()
 mlx5_mr_cache_invalidate()
     mlx5_ib_update_xlt() <-- UMR QP fail
     atomic_dec()

The wait_event() solves the race because it blocks until
free_implicit_child_mr_work() completes.

Fixes: 3c273f8fea70 ("RDMA/mlx5: Rework implicit ODP destroy")
Link: https://lore.kernel.org/r/20200227113918.94432-1-leon@kernel.org
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
drivers/infiniband/hw/mlx5/mlx5_ib.h
drivers/infiniband/hw/mlx5/odp.c