git.baikalelectronics.ru Git - kernel.git/commit

author	James Smart <jsmart2021@gmail.com>
	Sat, 9 Dec 2017 01:18:03 +0000 (17:18 -0800)
committer	Martin K. Petersen <martin.petersen@oracle.com>
	Thu, 21 Dec 2017 02:11:44 +0000 (21:11 -0500)
commit	8ad95bc7b35c27e1edcd7805663c4a98969cead0
tree	29cd2919b5dd43d116ccc6290f94bcb4ea2b6c3c
parent	b4639274adf8a3a726f580e75818bc95a596f7ee

scsi: lpfc: Fix random heartbeat timeouts during heavy IO

NVME targets appear to randomly disconnect from the initiator when
running heavy IO.

The error occurs because the host's aggregate IO load (across all
controllers) exceeded the maximum exchange count for NVME on the
adapter. The driver was properly returning a resource busy status, but
the IO load was so great that heartbeat commands would be bounced and
not have a successful retry within the fuzz amount for the NVME
heartbeat (yes, a very high IO load!). Thus the target was terminating
the controller due to a keep alive failure.

Resolve by reserving a few exchanges (by counters) which can be used
when the adapter is out of normal exchanges and the command is an NVME
heartbeat command. Because the reservation is tracked with counters,
the reserved count is replenished as soon as any other exchange
completes, even while the reserved command is still outstanding. The
heartbeat completes execution in a normal fashion.
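
The counter-based reservation described above can be illustrated with a
minimal, single-threaded sketch. It is not the lpfc code: the struct,
field, and function names below are hypothetical, and a real driver
would presumably need atomic counters or locking. Normal IO is capped
at max minus the reserve, while a heartbeat (keep alive) command may
use the full count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping; these are not the lpfc driver's fields. */
struct xchg_pool {
    unsigned int max;      /* total exchanges the adapter supports */
    unsigned int reserved; /* exchanges held back for heartbeats */
    unsigned int in_use;   /* exchanges currently outstanding */
};

/*
 * Try to claim an exchange. Normal IO may use at most max - reserved
 * exchanges; a heartbeat command may dip into the reserve. A false
 * return corresponds to reporting a resource busy status.
 */
static bool xchg_get(struct xchg_pool *p, bool is_heartbeat)
{
    unsigned int limit = is_heartbeat ? p->max : p->max - p->reserved;

    if (p->in_use >= limit)
        return false;
    p->in_use++;
    return true;
}

/*
 * Release an exchange. Nothing records which exchange was "the
 * reserved one", so any completion frees headroom for the reserve.
 */
static void xchg_put(struct xchg_pool *p)
{
    assert(p->in_use > 0);
    p->in_use--;
}

/* Number of reserved exchanges currently available to heartbeats. */
static unsigned int xchg_reserve_free(const struct xchg_pool *p)
{
    unsigned int free_total = p->max - p->in_use;

    return free_total < p->reserved ? free_total : p->reserved;
}
```

Because only counts are kept, the completion of any exchange, normal or
heartbeat, restores the reserve, which matches the replenishment
behaviour described in the commit message.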

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

drivers/scsi/lpfc/lpfc.h
drivers/scsi/lpfc/lpfc_init.c
drivers/scsi/lpfc/lpfc_nvme.c
drivers/scsi/lpfc/lpfc_nvme.h