Normally, the MPI firmware is reset when an MPI dump is collected. If an
unsaved MPI dump exists in the driver, though, an alternate mechanism is
used. That mechanism was not fully correct and is not recommended; instead,
an MPI dump template walk is suggested to perform the MPI reset.
To allow for the MPI dump template walk, extra space is reserved in the MPI
dump buffer which gets used only when there is already an MPI dump in
place.
The current code uses the wrong mailbox option to extract bbc from firmware.
This field is nested inside the PLOGI payload. Extract bbc from the PLOGI
template payload.
Link: https://lore.kernel.org/r/20200929102152.32278-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
BIT_13 of extended FW attribute informs about NVMe-2 support. Set BIT_15
of special feature control block for enabling SLER in FW. Set bit 8 (SLER
supported) to 1 for the service parameter information when sending NVMe
PRLI request. Set BIT_14 of special feature control block for enabling PI
Control in FW. Driver should set bit 9 (PI Control supported) to 1 for the
service parameter information when sending NVMe PRLI request. Set BIT_13
for NVMe Async events.
Link: https://lore.kernel.org/r/20200904045128.23631-13-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch tracks the number of IOCB resources used in the I/O fast path. If
the number of used IOCBs reaches a high water limit, the driver returns the
I/O as busy and lets the upper layer retry. This prevents oversubscription
of IOCB resources, where any future error recovery command would be unable
to cut through. Enable IOCB throttling by default.
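A minimal sketch of the high water check described above. The struct and
function names are hypothetical and are not taken from the driver; the
sketch only shows the accounting idea of refusing fast-path I/O with a busy
status once the limit would be exceeded.

```c
#include <errno.h>

/* Hypothetical per-queue bookkeeping; not the driver's real structure. */
struct iocb_budget {
	unsigned int in_use;     /* IOCBs held by in-flight fast-path I/O */
	unsigned int high_water; /* fast-path limit; the rest stays in reserve */
};

/* Reserve `need` IOCBs. Returns 0 on success or -EBUSY so that the
 * caller can report the I/O as busy and let the upper layer retry. */
static int iocb_reserve(struct iocb_budget *b, unsigned int need)
{
	if (b->in_use + need > b->high_water)
		return -EBUSY;
	b->in_use += need;
	return 0;
}

/* Give IOCBs back when the I/O completes. */
static void iocb_release(struct iocb_budget *b, unsigned int n)
{
	b->in_use = (n > b->in_use) ? 0 : b->in_use - n;
}
```

Keeping the limit below the total number of IOCBs is what leaves room for
error recovery commands in this model.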
scsi: qla2xxx: Fix I/O errors during LIP reset tests
In .fcp_io(), returning ENODEV as soon as remote port delete has started
can cause I/O errors. Fix this by returning EBUSY until the remote port
delete finishes.
Link: https://lore.kernel.org/r/20200904045128.23631-9-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
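The return-code choice in the LIP reset fix can be illustrated roughly as
follows. The state enum and helper name are assumptions made for this
sketch; the driver tracks remote port deletion differently.

```c
#include <errno.h>

/* Hypothetical remote port states, for illustration only. */
enum rport_state {
	RPORT_ONLINE,   /* normal operation */
	RPORT_DELETING, /* delete has started but not finished */
	RPORT_DELETED   /* delete has finished */
};

/* Status an I/O submission path could return for each state. */
static int io_precheck(enum rport_state state)
{
	switch (state) {
	case RPORT_ONLINE:
		return 0;
	case RPORT_DELETING:
		return -EBUSY;  /* transient: the upper layer may retry */
	default:
		return -ENODEV; /* port is really gone */
	}
}
```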
scsi: qla2xxx: Honor status qualifier in FCP_RSP per spec
FCP-4 (rev-2b was the revision consulted) identifies the field previously
known as "retry delay timer" as "status qualifier", which is described in
SAM-5 and later specs. This fix makes the appropriate driver-side
modifications to honor the new definition. The SAM document consulted was
SAM-6 rev-5.
scsi: qla2xxx: Fix I/O failures during remote port toggle testing
Driver was using a lower value for dev_loss_tmo making it more prone to I/O
failures during remote port toggle testing. Set dev_loss_tmo to zero during
remote port registration to allow nvme-fc default dev_loss_tmo to be used,
which is higher than what driver was using.
Brian King [Wed, 16 Sep 2020 20:09:59 +0000 (15:09 -0500)]
scsi: ibmvfc: Protect vhost->task_set increment by the host lock
In the discovery thread, ibmvfc does a vhost->task_set++ without any lock
held. This could result in two targets getting the same cancel key, which
could have strange effects in error recovery. The actual probability of
this occurring should be extremely small, since this should all be done in
a single threaded loop from the discovery thread, but let's fix it up
anyway to be safe.
Bodo Stroesser [Thu, 10 Sep 2020 15:50:41 +0000 (17:50 +0200)]
scsi: target: tcmu: Optimize scatter_data_area()
scatter_data_area() has two purposes:
1) Create the iovs for the data area buffer of a SCSI cmd.
2) If there is data in DMA_TO_DEVICE direction, copy
the data from sg_list to data area buffer.
Both are done in a common loop.
In case of DMA_FROM_DEVICE data transfer, scatter_data_area() is called
with parameter copy_data = false. But this flag is just used to skip
memcpy() for data, while radix_tree_lookup still is called for every dbi of
the data area buffer, and kmap and kunmap are called for every page from
sg_list and data_area as well as flush_dcache_page() for the data area
pages. Since the only thing to do with copy_data = false would be to set
up the iovs, this is a noticeable overhead. Rework the iov creation in the
main loop of scatter_data_area() providing the new function
new_block_to_iov(). Based on this, create the short new function
tcmu_setup_iovs() that only writes the iovs with no overhead. This new
function is now called instead of scatter_data_area() for bidi buffers and
for data buffers in those cases where memcpy() would have been skipped.
Bodo Stroesser [Thu, 10 Sep 2020 15:50:40 +0000 (17:50 +0200)]
scsi: target: tcmu: Optimize queue_cmd_ring()
queue_cmd_ring() needs to check whether there is enough space in cmd ring
and data area for the cmd to queue.
Currently the sequence is:
1) Calculate size the cmd will occupy on the ring based on estimation of
needed iovs.
2) Check whether there is enough space on the ring based on size from 1)
3) Allocate buffers in data area.
4) Calculate number of iovs the command really needs while copying
incoming data (if any) to data area.
5) Re-calculate real size of cmd on ring based on real number of iovs.
6) Set up possible padding and cmd on the ring.
Step 1) must not underestimate the cmd size so use max possible number of
iovs for the given I/O data size. The resulting overestimation can be
really high so this sequence is not ideal. The earliest the real number of
iovs can be calculated is after data buffer allocation. Therefore rework
the code to implement the following sequence:
A) Allocate buffers on data area and calculate number of necessary iovs
during this.
B) Calculate real size of cmd on ring based on number of iovs.
C) Check whether there is enough space on the ring.
D) Set up possible padding and cmd on the ring.
The new sequence enforces the split of new function tcmu_alloc_data_space()
from is_ring_space_avail(). Using this function, change queue_cmd_ring()
according to the new sequence.
Change routines called by tcmu_alloc_data_space() to allow calculating and
returning the iov count. Remove counting of iovs in scatter_data_area().
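The reordered sequence A) to C) can be modelled with a small sketch. The
sizes, the array-based block map, and the rule "one iov per run of adjacent
blocks" are simplifying assumptions for illustration, not tcmu's actual
data structures.

```c
#include <stdbool.h>
#include <stddef.h>

#define CMD_HDR_SIZE 64 /* assumed fixed part of a ring entry */
#define IOV_SIZE     16 /* assumed size of one iov slot */

/* A) Take `need` free blocks and count one iov per run of adjacent
 * blocks. Returns the iov count, or -1 if the data area is too full. */
static int alloc_blocks(bool *used, size_t nblocks, size_t need)
{
	size_t free_cnt = 0, prev = 0, i;
	bool have_prev = false;
	int iovs = 0;

	for (i = 0; i < nblocks; i++)
		if (!used[i])
			free_cnt++;
	if (free_cnt < need)
		return -1;

	for (i = 0; i < nblocks && need > 0; i++) {
		if (used[i])
			continue;
		used[i] = true;
		need--;
		if (!have_prev || i != prev + 1)
			iovs++; /* a gap starts a new iov */
		prev = i;
		have_prev = true;
	}
	return iovs;
}

/* B) Exact entry size, known only once the iov count is known. */
static size_t cmd_size(int iovs)
{
	return CMD_HDR_SIZE + (size_t)iovs * IOV_SIZE;
}

/* C) Check ring space against the exact size, not an overestimate.
 * A real implementation would release the blocks again on failure. */
static bool can_queue(bool *used, size_t nblocks, size_t need,
		      size_t ring_free)
{
	int iovs = alloc_blocks(used, nblocks, need);

	if (iovs < 0)
		return false;
	return cmd_size(iovs) <= ring_free;
}
```

In this model a fully contiguous allocation needs a single iov, so the ring
check passes in cases where a worst-case estimate would have failed.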
Bodo Stroesser [Thu, 10 Sep 2020 15:50:39 +0000 (17:50 +0200)]
scsi: target: tcmu: Join tcmu_cmd_get_data_length() and tcmu_cmd_get_block_cnt()
Simplify code by joining tcmu_cmd_get_data_length() and
tcmu_cmd_get_block_cnt() into tcmu_cmd_set_block_cnts(). The new function
sets tcmu_cmd->dbi_cnt and also the new field tcmu_cmd->dbi_bidi_cnt which
is needed for further enhancements in following patches. Simplify some
code by using tcmu_cmd->dbi(_bidi)_cnt instead of calculation from length.
Please note: The calculation of the number of dbis needed for bidi was
wrong. It was based on the length of the first bidi sg only. I changed it
to correctly sum up entire length of all bidi sgs.
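The bidi fix amounts to summing all segment lengths before rounding up to
whole data blocks. A sketch, assuming a plain array of segment lengths in
place of the sg list and an arbitrary block size:

```c
#include <stddef.h>

/* Round-up division, in the style of the kernel's DIV_ROUND_UP(). */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Number of data blocks needed for the first `nents` segments.
 * `sg_len` and `block_size` are illustrative stand-ins. */
static size_t blocks_needed(const size_t *sg_len, size_t nents,
			    size_t block_size)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < nents; i++)
		total += sg_len[i];
	return DIV_ROUND_UP(total, block_size);
}
```

Passing nents = 1 reproduces the old behaviour of looking only at the first
segment, which undercounts whenever there is more than one segment.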
Ming Lei [Thu, 10 Sep 2020 07:50:56 +0000 (15:50 +0800)]
scsi: core: Only re-run queue in scsi_end_request() if device queue is busy
The request queue is currently run unconditionally in scsi_end_request() if
both target queue and host queue are ready.
Recently Long Li reported that cost of a queue run can be very heavy in
case of high queue depth. Improve this situation by only running the
request queue when this LUN is busy.
Link: https://lore.kernel.org/r/20200910075056.36509-1-ming.lei@redhat.com
Reported-by: Long Li <longli@microsoft.com>
Tested-by: Long Li <longli@microsoft.com>
Tested-by: Kashyap Desai <kashyap.desai@broadcom.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matej Genci [Fri, 28 Aug 2020 12:21:35 +0000 (12:21 +0000)]
scsi: virtio_scsi: Rescan the entire target on transport reset when LUN is 0
VirtIO 1.0 spec says:
The removed and rescan events ... when sent for LUN 0, they MAY
apply to the entire target so the driver can ask the initiator
to rescan the target to detect this.
This change introduces the behaviour described above by scanning the entire
SCSI target when LUN is set to 0. This is both a functional and a
performance fix. It aligns the driver with the spec and allows control
planes to hotplug targets with large numbers of LUNs without having to
request a RESCAN for each one of them.
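The decision introduced here is small enough to state as code. The enum and
function below are hypothetical names used only to illustrate the rule; the
actual driver calls into the SCSI scanning helpers.

```c
/* Hypothetical scopes for a rescan triggered by a transport reset. */
enum rescan_scope {
	RESCAN_ONE_LUN,     /* event names a specific non-zero LUN */
	RESCAN_WHOLE_TARGET /* event for LUN 0: treat as target-wide */
};

/* Pick the scan scope from the LUN carried in the event. */
static enum rescan_scope rescan_scope_for(unsigned int lun)
{
	return lun == 0 ? RESCAN_WHOLE_TARGET : RESCAN_ONE_LUN;
}
```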
Jason Yan [Tue, 15 Sep 2020 08:40:18 +0000 (16:40 +0800)]
scsi: myrb: Make some symbols static
This addresses the following sparse warning:
drivers/scsi/myrb.c:2229:27: warning: symbol 'myrb_template' was not
declared. Should it be static?
drivers/scsi/myrb.c:2318:31: warning: symbol 'myrb_raid_functions' was
not declared. Should it be static?
drivers/scsi/myrb.c:2492:6: warning: symbol 'myrb_err_status' was not
declared. Should it be static?
Jason Yan [Tue, 15 Sep 2020 08:40:08 +0000 (16:40 +0800)]
scsi: myrs: Make some symbols static
This addresses the following sparse warning:
drivers/scsi/myrs.c:1532:5: warning: symbol 'myrs_host_reset' was not
declared. Should it be static?
drivers/scsi/myrs.c:1922:27: warning: symbol 'myrs_template' was not
declared. Should it be static?
drivers/scsi/myrs.c:2036:31: warning: symbol 'myrs_raid_functions' was
not declared. Should it be static?
drivers/scsi/myrs.c:2046:6: warning: symbol 'myrs_flush_cache' was not
declared. Should it be static?
Jason Yan [Sat, 12 Sep 2020 03:37:58 +0000 (11:37 +0800)]
scsi: bnx2fc: Make a bunch of symbols static in bnx2fc_fcoe.c
This eliminates the following sparse warning:
drivers/scsi/bnx2fc/bnx2fc_fcoe.c:53:1: warning: symbol
'bnx2fc_global_lock' was not declared. Should it be static?
drivers/scsi/bnx2fc/bnx2fc_fcoe.c:111:6: warning: symbol
'bnx2fc_devloss_tmo' was not declared. Should it be static?
drivers/scsi/bnx2fc/bnx2fc_fcoe.c:116:6: warning: symbol
'bnx2fc_max_luns' was not declared. Should it be static?
drivers/scsi/bnx2fc/bnx2fc_fcoe.c:121:6: warning: symbol
'bnx2fc_queue_depth' was not declared. Should it be static?
drivers/scsi/bnx2fc/bnx2fc_fcoe.c:126:6: warning: symbol
'bnx2fc_log_fka' was not declared. Should it be static?
Jason Yan [Sat, 12 Sep 2020 03:37:49 +0000 (11:37 +0800)]
scsi: aacraid: Make some symbols static in aachba.c
This eliminates the following sparse warning:
drivers/scsi/aacraid/aachba.c:245:5: warning: symbol 'aac_convert_sgl'
was not declared. Should it be static?
drivers/scsi/aacraid/aachba.c:293:5: warning: symbol 'acbsize' was not
declared. Should it be static?
drivers/scsi/aacraid/aachba.c:324:5: warning: symbol 'aac_wwn' was not
declared. Should it be static?
Ye Bin [Wed, 2 Sep 2020 06:16:46 +0000 (14:16 +0800)]
scsi: sym53c8xx_2: Delete unnecessary else-if in sym_xerr_cam_status()
If (x_status & XE_PARITY_ERR) is true we set cam_status = DID_PARITY,
otherwise cam_status always ends up being DID_ERROR. Delete superfluous
else-if statements.
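Why the else-if branches are superfluous can be seen in a reduced form. The
constant values and the extra flag below are illustrative assumptions; only
the shape of the logic follows the description above.

```c
/* Illustrative values only; the real constants live in kernel headers. */
#define XE_PARITY_ERR 0x04u
#define XE_OTHER_ERR  0x08u /* hypothetical second error flag */

enum { DID_PARITY = 6, DID_ERROR = 7 };

/* Before: every non-parity branch ends in the same result. */
static int status_with_else_if(unsigned int x_status)
{
	int cam_status;

	if (x_status & XE_PARITY_ERR)
		cam_status = DID_PARITY;
	else if (x_status & XE_OTHER_ERR)
		cam_status = DID_ERROR;
	else
		cam_status = DID_ERROR;
	return cam_status;
}

/* After: same result with a single test. */
static int status_simplified(unsigned int x_status)
{
	return (x_status & XE_PARITY_ERR) ? DID_PARITY : DID_ERROR;
}
```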
Brian King [Fri, 11 Sep 2020 21:28:26 +0000 (16:28 -0500)]
scsi: ibmvfc: Avoid link down on FS9100 canister reboot
When a canister on a FS9100, or similar storage, running in NPIV mode, is
rebooted, its WWPNs will fail over to another canister. When this occurs,
we see a WWPN going away from the fabric at one N-Port ID, and, a short
time later, the same WWPN appears at a different N-Port ID. When the
canister is fully operational again, the WWPNs fail back to the original
canister. If there is any I/O outstanding to the target when this occurs,
the implicit logout that the ibmvfc driver issues before removing the rport
will fail. When the WWPN then shows up at a different
N-Port ID, and we issue a PLOGI to it, the VIOS will see that it still has
a login for this WWPN at the old N-Port ID, which results in the VIOS
simulating a link down / link up sequence to the client, in order to get
the VIOS and client LPAR in sync.
The patch below improves the way we handle this scenario so as to avoid the
link bounce, which affects all targets under the virtual host adapter. The
change is to utilize the Move Login MAD, which will work even when I/O is
outstanding to the target. The change only alters the target state machine
for the case where the implicit logout fails prior to deleting the rport.
If this implicit logout fails, we defer deleting the ibmvfc_target object
after calling fc_remote_port_delete. This enables us to later retry the
implicit logout after terminate_rport_io occurs, or to issue the Move Login
request if a WWPN shows up at a new N-Port ID prior to this occurring.
This has been tested by IBM's storage interoperability team on a FS9100,
forcing the failover to occur. With debug tracing enabled in the ibmvfc
driver, we confirmed the move login was sent in this scenario and confirmed
the link bounce no longer occurred.
Damien Le Moal [Thu, 10 Sep 2020 07:48:42 +0000 (16:48 +0900)]
scsi: core: Update additional sense codes list
Add missing Additional Sense Codes listed in
http://www.t10.org/lists/asc-num.txt.
Link: https://lore.kernel.org/r/20200910074843.217661-3-damien.lemoal@wdc.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Damien Le Moal [Thu, 10 Sep 2020 07:48:41 +0000 (16:48 +0900)]
scsi: core: Clean up scsi_noretry_cmd()
No need for else after return.
Link: https://lore.kernel.org/r/20200910074843.217661-2-damien.lemoal@wdc.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Daejun Park [Wed, 2 Sep 2020 02:58:52 +0000 (11:58 +0900)]
scsi: ufs: Fix NOP OUT timeout value
Boot occasionally fails with some Samsung low-power UFS devices. The reason
is that these devices have a little bit higher latency for NOP OUT
responses. This causes boot to fail because the NOP OUT command is issued
during initialization to check whether the device transport protocol is
ready or not. Increase NOP_OUT_TIMEOUT value from 30 to 50ms.
Tomas Henzl [Thu, 10 Sep 2020 14:21:26 +0000 (16:21 +0200)]
scsi: mpt3sas: Fix sync irqs
_base_process_reply_queue() called from _base_interrupt() may schedule a
new irq poll. Fix this by calling synchronize_irq() first.
Also ensure that enable_irq() is called only when necessary to avoid
"Unbalanced enable for IRQ..." errors.
Link: https://lore.kernel.org/r/20200910142126.8147-1-thenzl@redhat.com
Fixes: 320e77acb327 ("scsi: mpt3sas: Irq poll to avoid CPU hard lockups")
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Sreekanth Reddy [Fri, 14 Aug 2020 13:04:26 +0000 (13:04 +0000)]
scsi: mpt3sas: Detect tampered Aero and Sea adapters
The driver will throw an error message when a tampered type controller
is detected. The intent is to avoid interacting with any firmware
which is not secured/signed by Broadcom. Any tampering on firmware
component will be detected by hardware and it will be communicated to
the driver to avoid any further interaction with that component.
Jason Yan [Tue, 15 Sep 2020 08:39:48 +0000 (16:39 +0800)]
scsi: megaraid: Make smp_affinity_enable static
This addresses the following sparse warning:
drivers/scsi/megaraid/megaraid_sas_base.c:80:5: warning: symbol
'smp_affinity_enable' was not declared. Should it be static?
Link: https://lore.kernel.org/r/20200915083948.2826598-1-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: zfcp: Clarify access to erp_action in zfcp_fsf_req_complete()
While reviewing commit 936e6b85da04 ("scsi: zfcp: Fix panic on ERP timeout
for previously dismissed ERP action"), I stumbled over
zfcp_fsf_req_complete() and wondered whether it has similar issues wrt
concurrent modification of req->erp_action by
zfcp_erp_strategy_check_fsfreq().
But a closer look shows that both its two callers [zfcp_fsf_reqid_check(),
zfcp_fsf_req_dismiss_all()] remove the request from the adapter's req_list
under the req_list's lock. Hence we can trust that if
zfcp_erp_strategy_check_fsfreq() concurrently looks up the corresponding
req_id, it won't find this request and is thus unable to modify it while
it's being processed by zfcp_fsf_req_complete().
Add a code comment that hopefully makes this easier for future readers, and
condense the two accesses to ->erp_action that made me trip over this code
path in the first place.
Jason Yan [Fri, 11 Sep 2020 09:10:21 +0000 (17:10 +0800)]
scsi: qla2xxx: Remove unneeded variable 'rval'
This addresses the following coccinelle warning:
drivers/scsi/qla2xxx/qla_init.c:7112:5-9: Unneeded variable: "rval".
Return "QLA_SUCCESS" on line 7115
Link: https://lore.kernel.org/r/20200911091021.2937708-1-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Adrian Hunter [Thu, 27 Aug 2020 07:20:30 +0000 (10:20 +0300)]
scsi: ufs-pci: Add LTR support for Intel controllers
Intel host controllers support the setting of latency tolerance.
Accordingly, implement the PM QoS ->set_latency_tolerance() callback. The
raw register values are also exposed via debugfs.
Link: https://lore.kernel.org/r/20200827072030.24655-1-adrian.hunter@intel.com
Fixes: 8c09d7527697 ("scsi: ufshdc-pci: Add Intel PCI IDs for EHL")
Fixes: 1ab27c9cf8b6 ("ufs: Add support for clock gating")
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Acked-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Ye Bin [Wed, 9 Sep 2020 08:27:16 +0000 (16:27 +0800)]
scsi: lpfc: Remove set but not used 'qp'
This addresses the following gcc warning with "make W=1":
drivers/scsi/lpfc/lpfc_debugfs.c: In function ‘lpfc_debugfs_hdwqstat_data’:
drivers/scsi/lpfc/lpfc_debugfs.c:1699:30: warning: variable ‘qp’ set but
not used [-Wunused-but-set-variable]
struct lpfc_sli4_hdw_queue *qp;
^
Link: https://lore.kernel.org/r/20200909082716.37787-1-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Ye Bin [Wed, 9 Sep 2020 08:26:26 +0000 (16:26 +0800)]
scsi: gdth: Remove set but not used 'cmd_index'
This addresses the following gcc warning with "make W=1":
drivers/scsi/gdth.c: In function ‘gdth_async_event’:
drivers/scsi/gdth.c:3010:9: warning: variable ‘cmd_index’ set but not
used [-Wunused-but-set-variable]
int cmd_index;
Ye Bin [Wed, 9 Sep 2020 08:26:27 +0000 (16:26 +0800)]
scsi: pmcraid: Remove set but not used 'res'
This addresses the following gcc warning with "make W=1":
drivers/scsi/pmcraid.c: In function ‘pmcraid_abort_cmd’:
drivers/scsi/pmcraid.c:2863:33: warning: variable ‘res’ set but not
used [-Wunused-but-set-variable]
struct pmcraid_resource_entry *res;
^
Jason Yan [Mon, 7 Sep 2020 07:45:18 +0000 (15:45 +0800)]
scsi: qla1280: Remove set but not used variable in qla1280_status_entry()
This addresses the following gcc warning with "make W=1":
drivers/scsi/qla1280.c: In function ‘qla1280_status_entry’:
drivers/scsi/qla1280.c:3607:28: warning: variable ‘lun’ set but not used
[-Wunused-but-set-variable]
3607 | unsigned int bus, target, lun;
| ^~~
drivers/scsi/qla1280.c:3607:20: warning: variable ‘target’ set but not
used [-Wunused-but-set-variable]
3607 | unsigned int bus, target, lun;
| ^~~~~~
drivers/scsi/qla1280.c:3607:15: warning: variable ‘bus’ set but not used
[-Wunused-but-set-variable]
3607 | unsigned int bus, target, lun;
| ^~~
Jason Yan [Mon, 7 Sep 2020 07:45:17 +0000 (15:45 +0800)]
scsi: qla1280: Remove set but not used variable in qla1280_mailbox_command()
This addresses the following gcc warning with "make W=1":
drivers/scsi/qla1280.c: In function ‘qla1280_mailbox_command’:
drivers/scsi/qla1280.c:2430:11: warning: variable ‘data’ set but not
used [-Wunused-but-set-variable]
2430 | uint16_t data;
| ^~~~
Jason Yan [Mon, 7 Sep 2020 07:45:16 +0000 (15:45 +0800)]
scsi: qla1280: Remove set but not used variable in qla1280_nvram_config()
This addresses the following gcc warning with "make W=1":
drivers/scsi/qla1280.c: In function ‘qla1280_nvram_config’:
drivers/scsi/qla1280.c:2188:36: warning: variable ‘ddma_conf’ set but
not used [-Wunused-but-set-variable]
2188 | uint16_t hwrev, cfg1, cdma_conf, ddma_conf;
| ^~~~~~~~~
Jason Yan [Mon, 7 Sep 2020 07:45:15 +0000 (15:45 +0800)]
scsi: qla1280: Remove set but not used variable in qla1280_done()
This addresses the following gcc warning with "make W=1":
drivers/scsi/qla1280.c: In function ‘qla1280_done’:
drivers/scsi/qla1280.c:1244:19: warning: variable ‘lun’ set but not used
[-Wunused-but-set-variable]
1244 | int bus, target, lun;
| ^~~
Alim Akhtar [Tue, 21 Jul 2020 17:20:21 +0000 (22:50 +0530)]
scsi: ufs: Fix 'unmet direct dependencies' config warning
With !CONFIG_OF and SCSI_UFS_EXYNOS selected, the below warning is given:
WARNING: unmet direct dependencies detected for PHY_SAMSUNG_UFS
Depends on [n]: OF [=n] && (ARCH_EXYNOS || COMPILE_TEST [=y])
Selected by [y]:
- SCSI_UFS_EXYNOS [=y] && SCSI_LOWLEVEL [=y] && SCSI [=y] && SCSI_UFSHCD_PLATFORM [=y] && (ARCH_EXYNOS || COMPILE_TEST [=y])
Fix it by removing PHY_SAMSUNG_UFS dependency.
Link: https://lore.kernel.org/r/20200721172021.28922-1-alim.akhtar@samsung.com
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Stanley Chu [Thu, 10 Sep 2020 01:37:56 +0000 (09:37 +0800)]
scsi: ufs: ufs-mediatek: Fix build warnings with make W=1
Fix the following build warnings seen with make W=1:
1.
>> drivers/scsi/ufs/ufs-mediatek.c:116:22: warning: format '%d' expects
>> argument of type 'int', but argument 4 has type 'long int'
2.
CC [M] drivers/scsi/ufs/ufs-mediatek.o
../drivers/scsi/ufs/ufs-mediatek.c:749: error: Cannot parse struct or union!
The /** opener is used specifically by the kernel-doc tool.
As a quick fix, remove the dubious /** from the comment block of
struct ufs_hba_variant_ops ufs_hba_mtk_vops.
It was observed on an ISP8324 16Gb HBA with fw=8.08.203 (d0d5) in a
PowerPC64 machine that pkt->entry_type was MBX_IOCB_TYPE/0x39 with an
sp->type SRB_SCSI_CMD which is invalid and should not be possible.
Reading the entry_type from the crash dump shows the expected value of
STATUS_TYPE/0x03 but the call trace shows that qla24xx_mbx_iocb_entry() is
used.
Add a check to verify for consistency and reset the HBA if an invalid state
is reached. Obviously, this is only a workaround until the real problem is
solved.
Daniel Wagner [Tue, 8 Sep 2020 08:15:15 +0000 (10:15 +0200)]
scsi: qla2xxx: Log calling function name in qla2x00_get_sp_from_handle()
Commit 7c3df1320e5e ("[SCSI] qla2xxx: Code changes to support new dynamic
logging infrastructure.") removed the use of the func argument. Let's add
it back.
Daniel Wagner [Tue, 8 Sep 2020 08:15:14 +0000 (10:15 +0200)]
scsi: qla2xxx: Simplify return value logic in qla2x00_get_sp_from_handle()
Refactor qla2x00_get_sp_from_handle() to avoid the unnecessary goto if
early returns are used. With this we can also avoid preinitializing the sp
pointer.
Link: https://lore.kernel.org/r/20200908081516.8561-3-dwagner@suse.de
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
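The early-return style described in this refactoring looks roughly like the
sketch below. The types and the function name are simplified stand-ins
invented for illustration, not the driver's definitions.

```c
#include <stddef.h>

/* Simplified stand-ins for the driver's request bookkeeping. */
struct srb {
	unsigned int handle;
};

struct req_queue {
	struct srb **outstanding; /* one slot per handle, NULL when unused */
	unsigned int num_slots;
};

/* Look up and claim the srb for `index`. Each failure returns at once,
 * so no goto is needed and `sp` needs no initial value. */
static struct srb *take_sp(struct req_queue *req, unsigned int index)
{
	struct srb *sp;

	if (index >= req->num_slots)
		return NULL; /* handle out of range */

	sp = req->outstanding[index];
	if (!sp)
		return NULL; /* slot already empty */

	if (sp->handle != index)
		return NULL; /* stale or mismatched entry */

	req->outstanding[index] = NULL;
	return sp;
}
```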
Daniel Wagner [Tue, 8 Sep 2020 08:15:13 +0000 (10:15 +0200)]
scsi: qla2xxx: Warn if done() or free() are called on an already freed srb
Emit a warning when ->done or ->free are called on an already freed
srb. There is a hidden use-after-free bug in the driver which corrupts
the srb memory pool which originates from the cleanup callbacks.
An extensive search did not shed any light on the real problem. The
initial fix was to set both pointers to NULL and try to catch invalid
accesses. But instead the memory corruption was gone and the driver
didn't crash. Since not all call sites check for a NULL pointer, add
explicit default handlers. With this we work around the memory
corruption and add a debugging aid.
Link: https://lore.kernel.org/r/20200908081516.8561-2-dwagner@suse.de
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
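The default-handler idea can be sketched as follows. All names here are
hypothetical, and a counter plus a message on stderr stands in for a kernel
warning; the point is only that a released srb keeps valid callbacks
instead of NULL or dangling ones.

```c
#include <stdio.h>

struct srb {
	void (*done)(struct srb *sp, int res);
	void (*free)(struct srb *sp);
};

static int late_calls; /* how often a released srb was used again */

/* Default handlers: harmless to call, but they make the misuse visible. */
static void srb_done_warning(struct srb *sp, int res)
{
	(void)sp;
	(void)res;
	late_calls++;
	fprintf(stderr, "done() called on an already released srb\n");
}

static void srb_free_warning(struct srb *sp)
{
	(void)sp;
	late_calls++;
	fprintf(stderr, "free() called on an already released srb\n");
}

/* On release, point both callbacks at the defaults rather than NULL,
 * because not every caller checks the pointers before using them. */
static void srb_release(struct srb *sp)
{
	sp->done = srb_done_warning;
	sp->free = srb_free_warning;
}
```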
Dan Carpenter [Sat, 5 Sep 2020 12:58:36 +0000 (15:58 +0300)]
scsi: libsas: Fix error path in sas_notify_lldd_dev_found()
In sas_notify_lldd_dev_found(), if we can't allocate the necessary
resources, then it seems like the wrong thing to mark the device as found
and to increment the reference count. None of the callers ever drop the
reference in that situation.
[mkp: tweaked commit desc based on feedback from John]
Link: https://lore.kernel.org/r/20200905125836.GF183976@mwanda
Fixes: 735f7d2fedf5 ("[SCSI] libsas: fix domain_device leak")
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
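The corrected error path follows a common pattern: take the reference and
set the flag only after the callback has succeeded. A sketch with
hypothetical types and two stub callbacks for demonstration:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced device structure. */
struct domain_dev {
	int refcount;
	bool found;
};

typedef int (*dev_found_fn)(struct domain_dev *dev);

/* Mark the device as found and take a reference only on success, so a
 * failed notification leaves nothing behind for callers to undo. */
static int notify_dev_found(struct domain_dev *dev, dev_found_fn cb)
{
	int res = cb ? cb(dev) : 0;

	if (res)
		return res;

	dev->found = true;
	dev->refcount++;
	return 0;
}

/* Stub callbacks used only to exercise both paths. */
static int cb_ok(struct domain_dev *dev)
{
	(void)dev;
	return 0;
}

static int cb_enomem(struct domain_dev *dev)
{
	(void)dev;
	return -ENOMEM;
}
```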
Javed Hasan [Mon, 7 Sep 2020 12:14:38 +0000 (05:14 -0700)]
scsi: qedf: Fix for the session’s E_D_TOV value
Firmware expects the E_D_TOV field in the connection offload parameters as
“msec”. The earlier, incorrect value (100ms) led to an abort from the driver
whenever read data frames took more than 100ms to arrive from the target
side, resulting in the firmware reporting E_D_TOV expiration.
The error recovery is handled by management firmware (MFW) with the help of
qed/qedi drivers. Upon detecting errors, driver informs MFW about this
event which in turn starts a recovery process. MFW sends ERROR_RECOVERY
notification to the driver which performs the required cleanup/recovery
from the driver side.
scsi: qedi: Mark all connections for recovery on link down event
For short time cable pulls, the in-flight I/O to the firmware is never
cleaned up, resulting in the behaviour of stale I/O completion causing
list_del corruption and soft lockup of the system.
On link down event, mark all the connections for recovery, causing cleanup
of all the in-flight I/O immediately.
scsi: qedi: Protect active command list to avoid list corruption
Protect active command list for non-I/O commands like login response,
logout response, text response, and recovery cleanup of active list to
avoid list corruption.
scsi: qedi: Fix list_del corruption while removing active I/O
While aborting the I/O, the firmware cleanup task timed out and driver
deleted the I/O from active command list. Some time later the firmware
sent the cleanup task response and driver again deleted the I/O from
active command list causing firmware to send completion for non-existent
I/O and list_del corruption of active command list.
Add fix to check if I/O is present before deleting it from the active
command list to ensure firmware sends valid I/O completion and protect
against list_del corruption.
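The presence check can be sketched with a small intrusive list. The flag
name and helpers are assumptions for this illustration, and the locking
that the driver needs around the list is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list node with a membership flag. */
struct cmd {
	struct cmd *prev, *next;
	bool on_list; /* hypothetical "is on the active list" marker */
};

static void active_init(struct cmd *head)
{
	head->prev = head->next = head;
	head->on_list = false;
}

static void active_add(struct cmd *head, struct cmd *c)
{
	c->next = head->next;
	c->prev = head;
	head->next->prev = c;
	head->next = c;
	c->on_list = true;
}

/* Unlink `c` only if it is still on the list. A second caller, such as
 * a late cleanup response, sees false and leaves the list untouched. */
static bool active_del(struct cmd *c)
{
	if (!c->on_list)
		return false;
	c->prev->next = c->next;
	c->next->prev = c->prev;
	c->prev = c->next = NULL;
	c->on_list = false;
	return true;
}
```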
scsi: qedi: Skip firmware connection termination for PCI shutdown handler
In a boot from SAN scenario, when the qedi PCI shutdown handler is called
with active iSCSI sessions, the target sometimes takes too long to respond
to the firmware connection termination request. Instead, skip sending the
termination ramrod and proceed with the unload path.
scsi: qedi: Use qed count from set_fp_int in msix allocation
To avoid unnecessary vector allocation when the number of fast-path queues
is less than the available msix vectors, use the return count from module
qed->set_fp_int.
scsi: docs: Remove obsolete scsi typedef text from scsi_mid_low_api
Commit 91ebc1facd77 ("scsi: core: remove Scsi_Cmnd typedef") removed the
Scsi_Cmnd typedef but it was still mentioned in a paragraph in the "SCSI
mid_level - lower_level driver interface" documentation page. Remove this
obsolete paragraph.
Link: https://lore.kernel.org/r/20200905210211.2286172-1-nfraprado@protonmail.com
Suggested-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Jonathan Corbet <corbet@lwn.net>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@protonmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ibmvfc: Interface updates for future FPIN and MQ support
VIOS partitions with SLI-4 enabled Emulex adapters will be capable of
driving I/O in parallel through multiple work queues or channels, and with
new hypervisor firmware that supports multiple interrupt sources an ibmvfc
NPIV single initiator can be modified to exploit end-to-end channelization
in a PowerVM environment.
VIOS hosts will also be able to expose fabric performance impact
notifications (FPIN) via a new asynchronous event to ibmvfc clients that
advertise support via IBMVFC_CAN_HANDLE_FPIN in their capabilities flag
during NPIV_LOGIN.
This patch introduces three new Management Datagrams (MADs) for
channelization support negotiation as well as the FPIN asynchronous
event and FPIN status flags. Follow up work is required to plumb the
ibmvfc client driver to use these new interfaces.
scsi: ibmvfc: Use compiler attribute defines instead of __attribute__()
Update ibmvfc.h structs to use the preferred __packed and __aligned()
attribute macros defined in include/linux/compiler_attributes.h in place of
__attribute__().
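Outside the kernel the same style can be tried with local macro
definitions. The struct below is a made-up example, not one of the
ibmvfc.h structures, and the macros are defined here only because
include/linux/compiler_attributes.h is not available in user space.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's attribute macros. */
#ifndef __packed
#define __packed __attribute__((__packed__))
#endif
#ifndef __aligned
#define __aligned(x) __attribute__((__aligned__(x)))
#endif

/* Hypothetical wire-format header: no padding between members, with
 * the whole struct aligned (and therefore sized) to 8 bytes. */
struct wire_hdr {
	uint8_t  opcode;
	uint32_t length;
} __packed __aligned(8);

_Static_assert(offsetof(struct wire_hdr, length) == 1,
	       "packed: length follows opcode directly");
_Static_assert(sizeof(struct wire_hdr) == 8,
	       "aligned(8): size is padded to the alignment");
```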
Bao D. Nguyen [Sat, 29 Aug 2020 01:05:13 +0000 (18:05 -0700)]
scsi: ufshcd: Allow specifying an Auto-Hibernate Timer value of zero
Setting the Auto-Hibernate Timer to zero is a valid setting which indicates
the Auto-Hibernate feature being disabled. Correctly support this setting.
In addition, when the timer value is queried from sysfs, read from the host
controller's register and return that value instead of using the RAM value.
John Pittman [Wed, 2 Sep 2020 21:14:34 +0000 (17:14 -0400)]
scsi: scsi_debug: Make sdebug_build_parts() respect virtual_gb
If virtual_gb is passed while using num_parts, when creating the
partitions, virtual_gb is not respected. Set num_sectors using
get_sdebug_capacity() to pull virtual_gb if set.
John Pittman [Wed, 2 Sep 2020 21:14:33 +0000 (17:14 -0400)]
scsi: scsi_debug: Adjust num_parts to create equally sized partitions
Currently when using the num_parts parameter, partitions are aligned and
the end sector is one prior to the next start. This creates different
sized partitions. Create instead equally sized partitions by trimming the
end of each partition to the size of the smallest partition. This aligns
better with what one would expect from automatically created partitions and
can be helpful with testing things such as raid which often expect legs of
the same size. Minimal space is lost as the initial partition starting
size is calculated by dividing num_sectors by sdebug_num_parts.
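The trimming scheme can be modelled as below. This is a simplified sketch:
the alignment unit, the offset of the first partition, the guard
conditions, and all names are assumptions and do not reproduce
sdebug_build_parts().

```c
#define MAX_PARTS 4

struct part {
	unsigned long start; /* first sector */
	unsigned long end;   /* last sector, inclusive */
};

/* Lay out `num_parts` aligned partitions and trim each one to the size
 * of the smallest. Returns the common length in sectors, or 0 if the
 * arguments are outside what this sketch handles. */
static unsigned long layout_parts(unsigned long num_sectors,
				  unsigned int num_parts,
				  unsigned long align, struct part *out)
{
	unsigned long starts[MAX_PARTS];
	unsigned long per_part, min_len;
	unsigned int k;

	if (num_parts == 0 || num_parts > MAX_PARTS || align == 0 ||
	    num_sectors <= align)
		return 0;

	per_part = (num_sectors - align) / num_parts;
	if (per_part < 2 * align)
		return 0; /* too small for aligned, increasing starts */

	min_len = per_part;
	starts[0] = align;
	for (k = 1; k < num_parts; k++) {
		/* Round each start down to the alignment unit. */
		starts[k] = (k * per_part / align) * align;
		if (starts[k] - starts[k - 1] < min_len)
			min_len = starts[k] - starts[k - 1];
	}

	for (k = 0; k < num_parts; k++) {
		out[k].start = starts[k];
		out[k].end = starts[k] + min_len - 1;
	}
	return min_len;
}
```

With 1000 sectors, three partitions, and an alignment of 64, this model
yields three partitions of 192 sectors each, starting at sectors 64, 256,
and 576.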