In the Hypervisor, there are several FW commands which are invoked
before the comm channel is initialized (in mlx4_multi_func_init).
These include MOD_STAT_CONFIG, QUERY_DEV_CAP, INIT_HCA, and others.
If any of these commands fails, say with a timeout, the Hypervisor
driver enters the internal error reset flow. In this flow, the driver
attempts to notify all slaves via the comm channel that an internal error
has occurred.
Since the comm channel has not yet been initialized (i.e., mapped via
ioremap), this will cause dereferencing a NULL pointer.
To fix this, do not access the comm channel in the internal error flow
if it has not yet been initialized.
Fixes: ff15addba7bf ("net/mlx4_core: Enable device recovery flow with SRIOV")
Fixes: bfa3f2377fe3 ("mlx4_core: Modify driver initialization flow to accommodate SRIOV for Ethernet")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
kfree(priv->mfunc.master.slave_state);
err_comm:
iounmap(priv->mfunc.comm);
+ priv->mfunc.comm = NULL;
err_vhcr:
dma_free_coherent(&dev->persist->pdev->dev, PAGE_SIZE,
priv->mfunc.vhcr,
int slave;
u32 slave_read;
+ /* If the comm channel has not yet been initialized,
+ * skip reporting the internal error event to all
+ * the communication channels.
+ */
+ if (!priv->mfunc.comm)
+ return;
+
/* Report an internal error event to all
* communication channels.
*/
}
iounmap(priv->mfunc.comm);
+ priv->mfunc.comm = NULL;
}
void mlx4_cmd_cleanup(struct mlx4_dev *dev, int cleanup_mask)