James Smart [Fri, 1 Jul 2022 21:14:16 +0000 (14:14 -0700)]
scsi: lpfc: Set PU field when providing D_ID in XMIT_ELS_RSP64_CX iocb
When providing a D_ID in XMIT_ELS_RSP64_CX iocb the PU field should
be set to 3 to describe the parameter being passed to firmware.
Link: https://lore.kernel.org/r/20220701211425.2708-4-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Fri, 1 Jul 2022 21:14:15 +0000 (14:14 -0700)]
scsi: lpfc: Prevent buffer overflow crashes in debugfs with malformed user input
Malformed user input to debugfs results in buffer overflow crashes. Adapt
input string lengths to fit within internal buffers, leaving space for NULL
terminators.
Link: https://lore.kernel.org/r/20220701211425.2708-3-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Fri, 1 Jul 2022 21:14:14 +0000 (14:14 -0700)]
scsi: lpfc: Fix uninitialized cqe field in lpfc_nvme_cancel_iocb()
In lpfc_nvme_cancel_iocb(), a cqe is created locally from stack storage.
The code didn't initialize the total_data_placed word, inheriting stack
content.
Initialize the total_data_placed word.
Link: https://lore.kernel.org/r/20220701211425.2708-2-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Thu, 30 Jun 2022 19:57:03 +0000 (12:57 -0700)]
scsi: sd: Rework asynchronous resume support
For some technologies, e.g. an ATA bus, resuming can take multiple
seconds. Waiting for resume to finish can cause a very noticeable delay.
Hence this commit that restores the behavior from before "scsi: core: pm:
Rely on the device driver core for async power management" for most SCSI
devices.
This commit introduces a behavior change: if the START command fails, do
not consider this as a SCSI disk resume failure.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215880 Link: https://lore.kernel.org/r/20220630195703.10155-3-bvanassche@acm.org Fixes: c0b1f7544c68 ("scsi: core: pm: Rely on the device driver core for async power management") Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Cc: ericspero@icloud.com Cc: jason600.groome@gmail.com Tested-by: jason600.groome@gmail.com Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Thu, 30 Jun 2022 19:57:02 +0000 (12:57 -0700)]
scsi: core: Move the definition of SCSI_QUEUE_DELAY
Move the definition of SCSI_QUEUE_DELAY to just above the function that
uses it.
Link: https://lore.kernel.org/r/20220630195703.10155-2-bvanassche@acm.org Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ 475.156955] sd 9:0:0:0: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical
Link: https://lore.kernel.org/r/20220630024516.1571209-1-lizhijian@fujitsu.com Suggested-by: Finn Thain <fthain@linux-m68k.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mike Christie [Tue, 28 Jun 2022 20:02:30 +0000 (15:02 -0500)]
scsi: target: Detect UNMAP support post configuration
On our backend we can do something similar to LIO where we can enable and
disable UNMAP support on the fly. In the SCSI/block layer we can detect
this by just doing a rescan. However, LIO cannot detect this change because
we only check during the initial configuration. This patch allows UNMAP
detection to also happen when the user tries to turn it on.
Mike Christie [Tue, 28 Jun 2022 20:02:27 +0000 (15:02 -0500)]
scsi: target: Add callout to configure UNMAP settings
Add a callout to configure a backend's UNMAP settings. This will be used to
allow userspace to configure UNMAP after the initial device setup, similar
to how we can set up the other attributes post device configuration.
Mike Christie [Tue, 28 Jun 2022 20:02:26 +0000 (15:02 -0500)]
scsi: target: Remove incorrect zero blocks WRITE_SAME check
We use WSNZ=1 so if we get a WRITE_SAME with zero logical blocks we are
supposed to fail it. We do this check and failure in target_core_sbc.c
before calling into the backend, so we can remove the incorrect check in
target_core_file.
Alice Chao [Thu, 23 Jun 2022 03:50:52 +0000 (11:50 +0800)]
scsi: ufs: ufs-mediatek: Fix invalid access to vccqx
NULL pointer access issue was found for the regulator released
by ufs_mtk_vreg_fix_vccq(). Simply fix this issue by clearing
the released vreg pointer in ufs_hba struct.
Link: https://lore.kernel.org/r/20220623035052.18802-9-stanley.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Alice Chao <alice.chao@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Peter Wang [Thu, 23 Jun 2022 03:50:51 +0000 (11:50 +0800)]
scsi: ufs: ufs-mediatek: Support performance boosting
Add pm-qos request to support performance boosting in MediaTek UFS
platforms.
At the same time, adjust the order of function calls to be symmetric during
the low-power control flow.
Link: https://lore.kernel.org/r/20220623035052.18802-8-stanley.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Po-Wen Kao [Thu, 23 Jun 2022 03:50:50 +0000 (11:50 +0800)]
scsi: ufs: ufs-mediatek: Support host power control
Add interfaces for controlling the host power to optimize the power
consumption in MediaTek UFS platforms.
Link: https://lore.kernel.org/r/20220623035052.18802-7-stanley.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Po-Wen Kao <powen.kao@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Po-Wen Kao [Thu, 23 Jun 2022 03:50:49 +0000 (11:50 +0800)]
scsi: ufs: ufs-mediatek: Disable reset confirm feature by UniPro
In MediaTek UFS platforms, UniPro will not return reset confirm if it is in
POWERDOWN state thus hang issue may happen while disabling UFSHCI. Simply
disable this feature before UniPro leaves POWERDOWN state.
Link: https://lore.kernel.org/r/20220623035052.18802-6-stanley.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Po-Wen Kao <powen.kao@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Peter Wang [Thu, 23 Jun 2022 03:50:48 +0000 (11:50 +0800)]
scsi: ufs: ufs-mediatek: Add stage information for ref-clk control
Add "PRE_CHANGE" and "POST_CHANGE" information for ref-clk control to
precisely configure the low-power state of the parent of ref-clk.
Link: https://lore.kernel.org/r/20220623035052.18802-5-stanley.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Peter Wang [Thu, 23 Jun 2022 03:50:47 +0000 (11:50 +0800)]
scsi: ufs: ufs-mediatek: Prevent host hang by setting CLK_CG early
Some UFSHCI hosts in MediaTek UFS platform need workaround to prevent host
hang issue by setting CLK_CG bit before host is enabled.
This operation shall have no side effect on those platforms which do not
support this bit.
Link: https://lore.kernel.org/r/20220623035052.18802-4-stanley.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Peter Wang [Thu, 23 Jun 2022 03:50:46 +0000 (11:50 +0800)]
scsi: ufs: ufs-mediatek: Always add delays for VCC operations
MediaTek decides to always add delays before and after VCC is turned-off.
Link: https://lore.kernel.org/r/20220623035052.18802-3-stanley.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Stanley Chu [Thu, 23 Jun 2022 03:50:45 +0000 (11:50 +0800)]
scsi: ufs: ufs-mediatek: Fix build warnings
Fix build warnings:
1.
../drivers/ufs/host/ufs-mediatek.c:1375:5: error: no previous prototype for function 'ufs_mtk_system_suspend' [-Werror,-Wmissing-prototypes]
int ufs_mtk_system_suspend(struct device *dev)
^
../drivers/ufs/host/ufs-mediatek.c:1375:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
int ufs_mtk_system_suspend(struct device *dev)
^
static
2.
../drivers/ufs/host/ufs-mediatek.c:702:50: error: format specifies type 'unsigned long' but the argument has type 'int' [-Werror,-Wformat]
snprintf(vcc_name, MAX_VCC_NAME, "vcc-ufs%lu", ver);
~~~ ^~~
%d
Arnd Bergmann [Fri, 24 Jun 2022 15:52:25 +0000 (17:52 +0200)]
scsi: dpt_i2o: Remove obsolete driver
The dpt_i2o driver was fixed to stop using virt_to_bus() in 2008, but it
still has a stale reference in an error handling code path that could never
work. I submitted a patch to fix this reference earlier, but Hannes
Reinecke suggested that removing the driver may be just as good here.
The i2o driver layer was removed in 2015 with commit 656f9bfefbf3
("staging: remove i2o subsystem"), but the even older dpt_i2o scsi driver
stayed around.
The last non-cleanup patches I could find were from Miquel van Smoorenburg
and Mark Salyzyn back in 2008, they might know if there is any chance of
the hardware still being used anywhere.
Arnd Bergmann [Fri, 24 Jun 2022 15:52:24 +0000 (17:52 +0200)]
scsi: BusLogic: Remove bus_to_virt()
The BusLogic driver is the last remaining driver that relies on the
deprecated bus_to_virt() function, which in turn only works on a few
architectures, and is incompatible with both swiotlb and iommu support.
Before commit 2b6c0513f94e ("[SCSI] BusLogic: Port driver to 64-bit."), the
driver had a dependency on x86-32, presumably because of this
problem. However, the change introduced another bug that made it still
impossible to use the driver on any 64-bit machine.
This was in turn fixed in commit b4a13fbc90f0 ("scsi: BusLogic: Fix 64-bit
system enumeration error for Buslogic"), 8 years later, which shows that
there are not a lot of users.
Maciej is still using the driver on 32-bit hardware, and Khalid mentioned
that the driver works with the device emulation used in VirtualBox and
VMware. Both of those only emulate it for Windows 2000 and older operating
systems that did not ship with the better LSI logic driver.
Do a minimum fix that searches through the list of descriptors to find one
that matches the bus address. This is clearly as inefficient as was
indicated in the code comment about the lack of a bus_to_virt()
replacement. A better fix would likely involve changing out the entire
descriptor allocation for a simpler one, but that would be much more
invasive.
Link: https://lore.kernel.org/r/20220624155226.2889613-2-arnd@kernel.org Cc: Maciej W. Rozycki <macro@orcam.me.uk> Cc: Matt Wang <wwentao@vmware.com> Tested-by: Khalid Aziz <khalid@gonehiking.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Khalid Aziz <khalid@gonehiking.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Colin Ian King [Thu, 23 Jun 2022 16:47:10 +0000 (17:47 +0100)]
scsi: fcoe: Remove redundant assignment to variable 'wlen'
Variable wlen is being assigned a value that is never read, it is being
re-assigned with a different value later on. The assignment is redundant
and can be removed.
Cleans up clang scan build warning:
drivers/scsi/fcoe/fcoe.c:1491:2: warning: Value stored to 'wlen'
is never read [deadcode.DeadStores]
Tetsuo Handa [Thu, 9 Jun 2022 13:26:48 +0000 (22:26 +0900)]
scsi: mpt3sas: Remove flush_scheduled_work() call
It seems to me that mpt3sas driver is using dedicated workqueues and is not
calling schedule{,_delayed}_work{,_on}(). Then, there will be no work to
flush using flush_scheduled_work().
Changyuan Lyu [Tue, 21 Jun 2022 18:11:25 +0000 (18:11 +0000)]
scsi: trace: Print driver_tag and scheduler_tag in SCSI trace
Trace events like scsi_dispatch_cmd_start and scsi_dispatch_cmd_done are
useful for tracking a command throughout its lifetime. But for some ATA
passthrough commands, the information printed in current logs is not enough
to identify and match them. For example, if two threads send SMART cmd to
the same disk at the same time, their trace logs may look the same, which
makes it hard to match scsi_dispatch_cmd_done and scsi_dispatch_cmd_start.
Printing tags can help us solve the problem. Further, if a command failed
for some reason and then is retried, its driver_tag will change. So
scheduler_tag is also included such that we can track the retries of a
command.
Link: https://lore.kernel.org/r/20220621181125.3211399-1-changyuanl@google.com Reviewed-by: Vishakha Channapattan <vishakhavc@google.com> Reviewed-by: Jolly Shah <jollys@google.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Changyuan Lyu <changyuanl@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mike Christie [Thu, 16 Jun 2022 22:45:57 +0000 (17:45 -0500)]
scsi: libiscsi: Improve conn_send_pdu API
The conn_send_pdu API is evil in that it returns a pointer to an
iscsi_task, but that task might have been freed already so you can't touch
it. This patch splits the task allocation and transmission, so functions
like iscsi_send_nopout() can access the task before its sent and do
whatever bookkeeping is needed before it is sent.
Mike Christie [Thu, 16 Jun 2022 22:45:56 +0000 (17:45 -0500)]
scsi: iscsi: Try to avoid taking back_lock in xmit path
We need the back lock when freeing a task, so we hold it when calling
__iscsi_put_task() from the completion path to make it easier and to avoid
having to retake it in that path. For iscsi_put_task() we just grabbed it
while also doing the decrement on the refcount but it's only really needed
if the refcount is zero and we free the task. This modifies
iscsi_put_task() to just take the lock when needed then has the xmit path
use it. Normally we will then not take the back lock from the xmit path. It
will only be rare cases where the network is so fast that we get a response
right after we send the header/data.
We currently require that the back_lock is held when calling the functions
that manipulate the iscsi_task refcount. The only reason for this is to
handle races where we are handling SCSI-ml EH callbacks and the cmd is
completing at the same time the normal completion path is running, and we
can't return from the EH callback until the driver has stopped accessing
the cmd. Holding the back_lock while also accessing the task->state made it
simple to check that a cmd is completing and also get/put a refcount at the
same time, and at the time we were not as concerned about performance.
The problem is that we don't want to take the back_lock from the xmit path
for normal I/O since it causes contention with the completion path if the
user has chosen to try and split those paths on different CPUs (in this
case abusing the CPUs and ignoring caching improves perf for some uses).
Begins to remove the back_lock requirement for iscsi_get/put_task by
removing the requirement for the get path. Instead of always holding the
back_lock we detect if something has done the last put and is about to call
iscsi_free_task(). A subsequent commit will then allow iSCSI code to do the
last put on a task and only grab the back_lock if the refcount is now zero
and it's going to call iscsi_free_task().
Mike Christie [Thu, 16 Jun 2022 22:45:54 +0000 (17:45 -0500)]
scsi: iscsi: Remove unneeded task state check
Commit d0757df95b24 ("scsi: libiscsi: Drop taskqueuelock") added an extra
task->state because for commit 4a81c36e3a80 ("scsi: libiscsi: add lock
around task lists to fix list corruption regression") we didn't know why we
ended up with cmds on the list and thought it might have been a bad target
sending a response while we were still sending the cmd. We were never able
to get a target to send us a response early, because it turns out the bug
was just a race in libiscsi/libiscsi_tcp where:
1. iscsi_tcp_r2t_rsp() queues a r2t to tcp_task->r2tqueue.
2. iscsi_tcp_task_xmit() runs iscsi_tcp_get_curr_r2t() and sees we have a
r2t. It dequeues it and iscsi_tcp_task_xmit() starts to process it.
3. iscsi_tcp_r2t_rsp() runs iscsi_requeue_task() and puts the task on the
requeue list.
4. iscsi_tcp_task_xmit() sends the data for r2t. This is the final chunk
of data, so the cmd is done.
5. target sends the response.
6. On a different CPU from #3, iscsi_complete_task() processes the
response. Since there was no common lock for the list, the lists/tasks
pointers are not fully in sync, so could end up with list corruption.
Since it was just a race on our side, remove the extra check and fix up the
comments.
Mike Christie [Thu, 16 Jun 2022 22:45:53 +0000 (17:45 -0500)]
scsi: iscsi_tcp: Drop target_alloc use
For software iSCSI, we do a session per host so there is no need to set the
target's can_queue since it's the same as the host one. Setting it just
results in extra atomic checks in the main I/O path.
Mike Christie [Thu, 16 Jun 2022 22:45:51 +0000 (17:45 -0500)]
scsi: iscsi: Run recv path from workqueue
We don't always want to run the recv path from the network softirq because
when we have to have multiple sessions sharing the same CPUs, some sessions
can eat up the NAPI softirq budget and affect other sessions or users.
Allow us to queue the recv handling to the iscsi workqueue so we can have
the scheduler/wq code try to balance the work and CPU use across all
sessions' worker threads.
Note: It wasn't the original intent of the change but a nice side effect is
that for some workloads/configs we get a nice performance boost. For a
simple read heavy test:
where the iscsi threads, fio jobs, and rps_cpus share CPUs we see a 32%
throughput boost. We also see increases for small I/O IOPs tests but it's
not as high.
Mike Christie [Thu, 16 Jun 2022 22:27:38 +0000 (17:27 -0500)]
scsi: iscsi: Fix session removal on shutdown
When the system is shutting down, iscsid is not running so we will not get
a response to the ISCSI_ERR_INVALID_HOST error event. The system shutdown
will then hang waiting on userspace to remove the session.
This has libiscsi force the destruction of the session from the kernel when
iscsi_host_remove() is called from a driver's shutdown callout.
This fixes a regression added in qedi boot with commit c55cd560a09b ("scsi:
qedi: Fix host removal with running sessions") which made qedi use the
common session removal function that waits on userspace instead of rolling
its own kernel based removal.
Link: https://lore.kernel.org/r/20220616222738.5722-7-michael.christie@oracle.com Fixes: c55cd560a09b ("scsi: qedi: Fix host removal with running sessions") Tested-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mike Christie [Thu, 16 Jun 2022 22:27:37 +0000 (17:27 -0500)]
scsi: qedi: Use QEDI_MODE_NORMAL for error handling
When handling errors that lead to host removal use QEDI_MODE_NORMAL. There
is currently no difference in behavior between NORMAL and SHUTDOWN, but in
a subsequent commit we will want to know when we are called from the
pci_driver shutdown callout vs remove/err_handler so we know when userspace
is up.
Link: https://lore.kernel.org/r/20220616222738.5722-6-michael.christie@oracle.com Tested-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mike Christie [Thu, 16 Jun 2022 22:27:36 +0000 (17:27 -0500)]
scsi: iscsi: Add helper to remove a session from the kernel
During qedi shutdown we need to stop the iSCSI layer from sending new nops
as pings and from responding to target ones and make sure there is no
running connection cleanups. Commit c55cd560a09b ("scsi: qedi: Fix host
removal with running sessions") converted the driver to use the libicsi
helper to drive session removal, so the above issues could be handled. The
problem is that during system shutdown iscsid will not be running so when
we try to remove the root session we will hang waiting for userspace to
reply.
Add a helper that will drive the destruction of sessions like these during
system shutdown.
Mike Christie [Thu, 16 Jun 2022 22:27:35 +0000 (17:27 -0500)]
scsi: iscsi: Clean up bound endpoints during shutdown
In the next patch we allow drivers to drive session removal during
shutdown. In this case iscsid will not be running, so we need to detect
bound endpoints and disconnect them. This moves the bound ep check so we
now always check.
Mike Christie [Thu, 16 Jun 2022 22:27:34 +0000 (17:27 -0500)]
scsi: iscsi: Allow iscsi_if_stop_conn() to be called from kernel
iscsi_if_stop_conn() is only called from the userspace interface but in a
subsequent commit we will want to call it from the kernel interface to
allow drivers like qedi to remove sessions from inside the kernel during
shutdown. This removes the iscsi_uevent code from iscsi_if_stop_conn() so we
can call it in a new helper.
Mike Christie [Thu, 16 Jun 2022 22:27:33 +0000 (17:27 -0500)]
scsi: iscsi: Fix HW conn removal use after free
If qla4xxx doesn't remove the connection before the session, the iSCSI
class tries to remove the connection for it. We were doing a
iscsi_put_conn() in the iter function which is not needed and will result
in a use after free because iscsi_remove_conn() will free the connection.
Link: https://lore.kernel.org/r/20220616222738.5722-2-michael.christie@oracle.com Tested-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Ren Zhijie [Sun, 19 Jun 2022 11:54:32 +0000 (19:54 +0800)]
scsi: ufs: ufs-mediatek: Fix build error and type mismatch
If CONFIG_PM_SLEEP is not set.
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-, will fail:
drivers/ufs/host/ufs-mediatek.c: In function ‘ufs_mtk_vreg_fix_vcc’:
drivers/ufs/host/ufs-mediatek.c:688:46: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 4 has type ‘long unsigned int’ [-Wformat=]
snprintf(vcc_name, MAX_VCC_NAME, "vcc-opt%u", res.a1);
~^ ~~~~~~
%lu
drivers/ufs/host/ufs-mediatek.c: In function ‘ufs_mtk_system_suspend’:
drivers/ufs/host/ufs-mediatek.c:1371:8: error: implicit declaration of function ‘ufshcd_system_suspend’; did you mean ‘ufs_mtk_system_suspend’? [-Werror=implicit-function-declaration]
ret = ufshcd_system_suspend(dev);
^~~~~~~~~~~~~~~~~~~~~
ufs_mtk_system_suspend
drivers/ufs/host/ufs-mediatek.c: In function ‘ufs_mtk_system_resume’:
drivers/ufs/host/ufs-mediatek.c:1386:9: error: implicit declaration of function ‘ufshcd_system_resume’; did you mean ‘ufs_mtk_system_resume’? [-Werror=implicit-function-declaration]
return ufshcd_system_resume(dev);
^~~~~~~~~~~~~~~~~~~~
ufs_mtk_system_resume
cc1: some warnings being treated as errors
The declaration of func "ufshcd_system_suspend()" depends on
CONFIG_PM_SLEEP, so the function wrapper ufs_mtk_system_suspend() should
wrapped by CONFIG_PM_SLEEP too.
Link: https://lore.kernel.org/r/20220619115432.205504-1-renzhijie2@huawei.com Fixes: 225f5d4d2198 ("scsi: ufs: ufs-mediatek: Fix the timing of configuring device regulators") Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Ren Zhijie <renzhijie2@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[Option 2: By UFS version]
mediatek,ufs-vcc-by-ver;
vcc-ufs3-supply = <&mt6373_vbuck4_ufs>;
Link: https://lore.kernel.org/r/20220616053725.5681-11-stanley.chu@mediatek.com Signed-off-by: Alice Chao <alice.chao@mediatek.com> Signed-off-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Peter Wang [Thu, 16 Jun 2022 05:37:20 +0000 (13:37 +0800)]
scsi: ufs: ufs-mediatek: Support low-power mode for VCCQ
Allow VCCQ to enter low-power mode, and also remove the restriction of VCC
because VCCQ/VCCQ2 can be changed to low-power mode even if VCC stays on
while the device is not in active power mode.
Link: https://lore.kernel.org/r/20220616053725.5681-7-stanley.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Device regulatrs are allowed to enter low-power mode if neither device is
not in active mode, nor VCC does not keep on.
Fix this by adding conditions before LPM decision.
Link: https://lore.kernel.org/r/20220616053725.5681-6-stanley.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Po-Wen Kao <powen.kao@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Po-Wen Kao [Thu, 16 Jun 2022 05:37:18 +0000 (13:37 +0800)]
scsi: ufs: ufs-mediatek: Fix the timing of configuring device regulators
Currently the LPM configurations of device regulators may not work since
VCC is not disabled yet while ufs_mtk_vreg_set_lpm() is executed.
Fix this by changing the timing of invoking ufs_mtk_vreg_set_lpm().
Link: https://lore.kernel.org/r/20220616053725.5681-5-stanley.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Po-Wen Kao <powen.kao@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
CC Chou [Thu, 16 Jun 2022 05:37:17 +0000 (13:37 +0800)]
scsi: ufs: ufs-mediatek: Introduce workaround for power mode change
Some MediaTek SoC chips need special flow for power mode change, especially
for chips supporting HS-G5.
Enable the workaround by setting the host-specific capability.
Link: https://lore.kernel.org/r/20220616053725.5681-4-stanley.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: CC Chou <cc.chou@mediatek.com> Signed-off-by: Eddie Huang <eddie.huang@mediatek.com> Signed-off-by: Dennis Yu <tun-yu.yu@mediatek.com> Signed-off-by: Peter Wang <peter.want@medaitek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Quinn Tran [Thu, 16 Jun 2022 05:35:07 +0000 (22:35 -0700)]
scsi: qla2xxx: Fix erroneous mailbox timeout after PCI error injection
Clear wait for mailbox interrupt flag to prevent stale mailbox:
Feb 22 05:22:56 ltcden4-lp7 kernel: qla2xxx [0135:90:00.1]-500a:4: LOOP UP detected (16 Gbps).
Feb 22 05:22:59 ltcden4-lp7 kernel: qla2xxx [0135:90:00.1]-d04c:4: MBX Command timeout for cmd 69, ...
To fix the issue, driver needs to clear the MBX_INTR_WAIT flag on purging
the mailbox. When the stale mailbox completion does arrive, it will be
dropped.
Arun Easi [Thu, 16 Jun 2022 05:35:06 +0000 (22:35 -0700)]
scsi: qla2xxx: Fix losing FCP-2 targets on long port disable with I/Os
FCP-2 devices were not coming back online once they were lost, login
retries exhausted, and then came back up. Fix this by accepting RSCN when
the device is not online.
Link: https://lore.kernel.org/r/20220616053508.27186-10-njavali@marvell.com Fixes: d5638e660741 ("scsi: qla2xxx: Changes to support FCP2 Target") Cc: stable@vger.kernel.org Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arun Easi [Thu, 16 Jun 2022 05:35:03 +0000 (22:35 -0700)]
scsi: qla2xxx: Fix losing FCP-2 targets during port perturbation tests
When a mix of FCP-2 (tape) and non-FCP-2 targets are present, FCP-2 target
state was incorrectly transitioned when both of the targets were gone. Fix
this by ignoring state transition for FCP-2 targets.
Link: https://lore.kernel.org/r/20220616053508.27186-7-njavali@marvell.com Fixes: d5638e660741 ("scsi: qla2xxx: Changes to support FCP2 Target") Cc: stable@vger.kernel.org Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bikash Hazarika [Thu, 16 Jun 2022 05:34:59 +0000 (22:34 -0700)]
scsi: qla2xxx: Add a new v2 dport diagnostic feature
FW requires minimum 72 bytes buffer size for D_port result. Buffer size
1024 is mentioned in the FW spec so buffer size is increased to 1024.
Rewrite the logic to handle START/RESTART command from SDMAPI.
Dmitry Bogdanov [Tue, 7 Jun 2022 13:19:53 +0000 (16:19 +0300)]
scsi: iscsi: Prefer xmit of DataOut over new commands
iscsi_data_xmit() (TX worker) is iterating over the queue of new SCSI
commands concurrently with the queue being replenished. Only after the
queue is emptied will we start sending pending DataOut PDUs. That leads to
DataOut timeout on the target side and to connection reinstatement.
Give priority to pending DataOut commands over new commands.
Link: https://lore.kernel.org/r/20220607131953.11584-1-d.bogdanov@yadro.com Reviewed-by: Konstantin Shelekhin <k.shelekhin@yadro.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Alim Akhtar [Wed, 15 Jun 2022 12:12:03 +0000 (17:42 +0530)]
scsi: ufs: host: ufs-exynos: Use already existing definition
UFS core already uses RX_MIN_ACTIVATETIME_CAPABILITY macro, let's use the
same in driver as well instead of having a different macro name for the
same offset.
John Garry [Fri, 10 Jun 2022 16:46:42 +0000 (00:46 +0800)]
scsi: pm8001: Expose hardware queues for pm80xx
In commit 7266bb90ee0d ("scsi: pm80xx: Increase number of supported
queues"), support for 80xx chip was improved by enabling multiple HW
queues.
In this, like other SCSI MQ HBA drivers at the time, the HW queues were not
exposed to upper layer, and instead the driver managed the queues
internally.
However, this management duplicates blk-mq code. In addition, the HW queue
management is sub-optimal for a system where the number of CPUs exceeds the
HW queues - this is because queues are selected in a round-robin fashion,
when it would be better to make adjacent CPUs submit on the same queue. And
finally, the affinity of the completion queue interrupts is not set to
mirror the cpu<->HQ queue mapping, which is suboptimal.
As such, for when MSIX is supported, expose HW queues to upper layer. We
always use queue index #0 for "internal" commands, i.e. anything which does
not come from the block layer, so omit this from the affinity spreading.
John Garry [Fri, 10 Jun 2022 16:46:41 +0000 (00:46 +0800)]
scsi: pm8001: Use non-atomic bitmap ops for tag alloc + free
In pm8001_tag_alloc() we don't require atomic set_bit() as we are already
in atomic context. In pm8001_tag_free() we should use the same host
spinlock to protect clearing the tag (and then don't require the atomic
clear_bit()).
John Garry [Fri, 10 Jun 2022 16:46:40 +0000 (00:46 +0800)]
scsi: pm8001: Set up tags before using them
The current code is buggy in that the tags are set up after they are needed
in pm80xx_chip_init() -> pm80xx_set_sas_protocol_timer_config(). The tag
depth is earlier read in pm80xx_chip_init() -> read_main_config_table().
Add a post init callback to do the pm80xx work which needs to be done after
reading the tags. I don't see a better way to do this.
John Garry [Fri, 10 Jun 2022 16:46:39 +0000 (00:46 +0800)]
scsi: pm8001: Rework shost initial values
Some values in pm8001_prep_sas_ha_init() are set the same as they would be
set in scsi_host_alloc(), or could be in the sht (which would be better),
or later just overwritten, so rework the following:
- cmd_per_lun can be set in the sht
- max_lun and max_channel are as scsi_host_alloc() (so no need to set)
- can_queue is later overwritten (so don't set in
pm8001_prep_sas_ha_init())
Tyrel Datwyler [Thu, 16 Jun 2022 19:11:25 +0000 (12:11 -0700)]
scsi: ibmvfc: Store vhost pointer during subcrq allocation
Currently the back pointer from a queue to the vhost adapter isn't set
until after subcrq interrupt registration. The value is available when a
queue is first allocated and can/should be also set for primary and async
queues as well as subcrqs.
This fixes a crash observed during kexec/kdump on Power 9 with legacy XICS
interrupt controller where a pending subcrq interrupt from the previous
kernel can be replayed immediately upon IRQ registration resulting in
dereference of a garbage backpointer in ibmvfc_interrupt_scsi().
Tyrel Datwyler [Thu, 16 Jun 2022 19:11:26 +0000 (12:11 -0700)]
scsi: ibmvfc: Allocate/free queue resource only during probe/remove
Currently, the sub-queues and event pool resources are allocated/freed for
every CRQ connection event such as reset and LPM. This exposes the driver
to a couple issues. First the inefficiency of freeing and reallocating
memory that can simply be resued after being sanitized. Further, a system
under memory pressue runs the risk of allocation failures that could result
in a crippled driver. Finally, there is a race window where command
submission/compeletion can try to pull/return elements from/to an event
pool that is being deleted or already has been deleted due to the lack of
host state around freeing/allocating resources. The following is an example
of list corruption following a live partition migration (LPM):
Saurabh Sengar [Tue, 14 Jun 2022 07:05:55 +0000 (00:05 -0700)]
scsi: storvsc: Correct reporting of Hyper-V I/O size limits
Current code is based on the idea that the max number of SGL entries
also determines the max size of an I/O request. While this idea was
true in older versions of the storvsc driver when SGL entry length
was limited to 4 Kbytes, commit 4afe14ec1b50 ("scsi: storvsc: Enable
scatterlist entry lengths > 4Kbytes") removed that limitation. It's
now theoretically possible for the block layer to send requests that
exceed the maximum size supported by Hyper-V. This problem doesn't
currently happen in practice because the block layer defaults to a
512 Kbyte maximum, while Hyper-V in Azure supports 2 Mbyte I/O sizes.
But some future configuration of Hyper-V could have a smaller max I/O
size, and the block layer could exceed that max.
Fix this by correctly setting max_sectors as well as sg_tablesize to
reflect the maximum I/O size that Hyper-V reports. While allowing
I/O sizes larger than the block layer default of 512 Kbytes doesn’t
provide any noticeable performance benefit in the tests we ran, it's
still appropriate to report the correct underlying Hyper-V capabilities
to the Linux block layer.
Also tweak the virt_boundary_mask to reflect that the required
alignment derives from Hyper-V communication using a 4 Kbyte page size,
and not on the guest page size, which might be bigger (eg. ARM64).
Link: https://lore.kernel.org/r/1655190355-28722-1-git-send-email-ssengar@linux.microsoft.com Fixes: 4afe14ec1b50 ("scsi: storvsc: Enable scatter list entry lengths > 4Kbytes") Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Mon, 13 Jun 2022 21:44:42 +0000 (14:44 -0700)]
scsi: ufs: Fix a race between the interrupt handler and the reset handler
Prevent that both the interrupt handler and the reset handler try to
complete a request at the same time. This patch is the result of an
analysis of the following crash:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000120
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE 5.10.107-android13-4-00051-g1e48e8970cca-ab8664745 #1
pc : ufshcd_release_scsi_cmd+0x30/0x46c
lr : __ufshcd_transfer_req_compl+0x4fc/0x9c0
Call trace:
ufshcd_release_scsi_cmd+0x30/0x46c
__ufshcd_transfer_req_compl+0x4fc/0x9c0
ufshcd_poll+0xf0/0x208
ufshcd_sl_intr+0xb8/0xf0
ufshcd_intr+0x168/0x2f4
__handle_irq_event_percpu+0xa0/0x30c
handle_irq_event+0x84/0x178
handle_fasteoi_irq+0x150/0x2e8
__handle_domain_irq+0x114/0x1e4
gic_handle_irq.31846+0x58/0x300
el1_irq+0xe4/0x1c0
cpuidle_enter_state+0x3ac/0x8c4
do_idle+0x2fc/0x55c
cpu_startup_entry+0x84/0x90
kernel_init+0x0/0x310
start_kernel+0x0/0x608
start_kernel+0x4ec/0x608
Link: https://lore.kernel.org/r/20220613214442.212466-4-bvanassche@acm.org Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Mon, 13 Jun 2022 21:44:41 +0000 (14:44 -0700)]
scsi: ufs: Support clearing multiple commands at once
Modify ufshcd_clear_cmd() such that it supports clearing multiple commands
at once instead of one command at a time. This change will be used in a
later patch to reduce the time spent in the reset handler.
Link: https://lore.kernel.org/r/20220613214442.212466-3-bvanassche@acm.org Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>