scsi: core: Query VPD size before getting full page
We currently default to 255 bytes when fetching VPD pages during discovery.
However, we have had a few devices that are known to wedge if the requested
buffer exceeds a certain size. See commit e23ccf036d5f ("[SCSI] sd: Reduce
buffer size for vpd request") which works around one example of this
problem in the SCSI disk driver.
With commit a2200cbc64bf ("scsi: core: Add sysfs attributes for VPD pages
0h and 89h") we now risk triggering the same issue in the generic midlayer
code.
The problem with the ATA VPD page in particular is that the SCSI portion of
the page is trailed by 512 bytes of verbatim ATA Identify Device
information. However, not all controllers actually provide the additional
512 bytes and will lock up if one asks for more than the 64 bytes
containing the SCSI protocol fields.
Instead of picking a new, somewhat arbitrary, number of bytes for the VPD
buffer size, start fetching the 4-byte header for each page. The header
contains the size of the page as far as the device is concerned. We can use
the reported size to specify the correct allocation length when
subsequently fetching the full page.
The header validation is done by a new helper function scsi_get_vpd_size()
and both scsi_get_vpd_page() and scsi_get_vpd_buf() now rely on this to
query the page size.
In addition, scsi_get_vpd_page() is simplified to mirror the logic in
scsi_get_vpd_page(). This involves removing the Supported VPD Pages lookup
prior to attempting to query a page. There does not appear any evidence,
even in the oldest SCSI specs, that this step is required. We already rely
on scsi_get_vpd_page() throughout the stack and this function never
consulted the Supported VPD Pages. Since this has not caused any problems
it should be safe to remove the precondition from scsi_get_vpd_page().
Instrumented runs also revealed that the Supported VPD Pages lookup had
little effect since the device page index often was larger than the
supplied buffer size. As a result, inquiries frequently bypassed the index
check and went through the "If we ran off the end of the buffer, give us
the benefit of the doubt" code path which assumed the page was present
despite not being listed. The revised code takes both the page size
reported by the device as well as the size of the buffer provided by the
scsi_get_vpd_page() caller into account.
Link: https://lore.kernel.org/r/20220302053559.32147-3-martin.petersen@oracle.com Fixes: a2200cbc64bf ("scsi: core: Add sysfs attributes for VPD pages 0h and 89h") Reported-by: Maciej W. Rozycki <macro@orcam.me.uk> Tested-by: Maciej W. Rozycki <macro@orcam.me.uk> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: mpt3sas: Use cached ATA Information VPD page
We now cache VPD page 0x89 (ATA Information) so there is no need to request
it from the hardware. Make mpt3sas use the cached page.
Link: https://lore.kernel.org/r/20220302053559.32147-2-martin.petersen@oracle.com Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 26 Apr 2022 18:14:19 +0000 (11:14 -0700)]
scsi: lpfc: Fix resource leak in lpfc_sli4_send_seq_to_ulp()
If no handler is found in lpfc_complete_unsol_iocb() to match the rctl of a
received frame, the frame is dropped and resources are leaked.
Fix by returning resources when discarding an unhandled frame type. Update
lpfc_fc_frame_check() handling of NOP basic link service.
Link: https://lore.kernel.org/r/20220426181419.9154-1-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 26 Apr 2022 18:13:15 +0000 (11:13 -0700)]
scsi: lpfc: Remove unnecessary null ndlp check in lpfc_sli_prep_wqe()
Smatch had the following warning:
drivers/scsi/lpfc/lpfc_sli.c:22305 lpfc_sli_prep_wqe() error: we previously assumed 'ndlp' could be null (see line 22298)
Remove the unnecessary null check.
Link: https://lore.kernel.org/r/20220426181315.8990-1-jsmart2021@gmail.com Fixes: f70b5c0741d4 ("scsi: lpfc: Fix field overload in lpfc_iocbq data structure") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Minghao Chi [Wed, 20 Apr 2022 09:03:52 +0000 (09:03 +0000)]
scsi: ufs: Use pm_runtime_resume_and_get() instead of pm_runtime_get_sync()
Using pm_runtime_resume_and_get() to replace pm_runtime_get_sync() and
pm_runtime_put_noidle(). This change is just to simplify the code, no
actual functional changes.
scsi: megaraid: Fix error check return value of register_chrdev()
If major equals 0, register_chrdev() returns an error code when it fails.
This function dynamically allocates a major and returns its number on
success, so we should use "< 0" to check it instead of "!".
scsi: dc395x: Fix a missing check on list iterator
The bug is here:
p->target_id, p->target_lun);
The list iterator 'p' will point to a bogus position containing HEAD if the
list is empty or no element is found. This case must be checked before any
use of the iterator, otherwise it will lead to an invalid memory access.
To fix this bug, add a check. Use a new variable 'iter' as the list
iterator, and use the original variable 'p' as a dedicated pointer to point
to the found element.
scsi: qedf: Remove an unneeded NULL check on list iterator
The list iterator 'fcport' is always non-NULL so it doesn't need to be
checked. Thus just remove the unnecessary NULL check. Also remove the
unnecessary initializer because the list iterator is always initialized.
And adjust the position of blank lines.
Kiwoong Kim [Thu, 31 Mar 2022 01:24:05 +0000 (10:24 +0900)]
scsi: ufs: core: Exclude UECxx from SFR dump list
Some devices may return invalid or zeroed data during an UIC error
condition. In addition, reading these SFRs will clear them. This means the
subsequent error handling will not be able to see them and therefore no
error handling will be scheduled.
John Garry [Wed, 30 Mar 2022 11:38:35 +0000 (19:38 +0800)]
scsi: core: Refine how we set tag_set NUMA node
For SCSI hosts which enable host_tagset the NUMA node returned from
blk_mq_hw_queue_to_node() is NUMA_NO_NODE always. Then, since in
scsi_mq_setup_tags() the default we choose for the tag_set NUMA node is
NUMA_NO_NODE, we always evaluate the NUMA node as NUMA_NO_NODE in functions
like blk_mq_alloc_rq_map().
The reason we get NUMA_NO_NODE from blk_mq_hw_queue_to_node() is that the
hctx_idx passed is BLK_MQ_NO_HCTX_IDX - so we can't match against a (HW)
queue mapping index.
Improve this by defaulting the tag_set NUMA node to the same NUMA node of
the SCSI host DMA dev.
The following warning showed up when compiling with W=1.
drivers/message/fusion/mptctl.c: In function ‘mptctl_hp_hostinfo’:
drivers/message/fusion/mptctl.c:2337:8: warning: variable ‘retval’ set but not used [-Wunused-but-set-variable]
int retval;
scsi: aacraid: Fix undefined behavior due to shift overflowing the constant
Fix:
drivers/scsi/aacraid/commsup.c: In function ‘aac_handle_sa_aif’:
drivers/scsi/aacraid/commsup.c:1983:2: error: case label does not reduce to an integer constant
case SA_AIF_BPCFG_CHANGE:
^~~~
See https://lore.kernel.org/r/YkwQ6%2BtIH8GQpuct@zn.tnic for the gory
details as to why it triggers with older gccs only.
Link: https://lore.kernel.org/r/20220405151517.29753-2-bp@alien8.de Cc: Adaptec OEM Raid Solutions <aacraid@microsemi.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Damien Le Moal [Thu, 21 Apr 2022 18:30:23 +0000 (11:30 -0700)]
scsi: scsi_debug: Add gap zone support
Add the 'zone_cap_mb' kernel module parameter. This parameter defines the
zone capacity. The zone capacity must be less than or equal to the zone
size.
Report that sequential write zones and gap zones are paired in the Zoned
Block Device Characteristics VPD page (page B6h).
This patch has been tested as follows:
modprobe scsi_debug delay=0 sector_size=512 dev_size_mb=128 zbc=host-managed zone_nr_conv=16 zone_size_mb=4 zone_cap_mb=3
modprobe brd rd_nr=1 rd_size=$((1<<20))
mkfs.f2fs -m /dev/ram0 -c /dev/${scsi_debug_dev}
mount /dev/ram0 /mnt
# Run a fio job that uses /mnt
Link: https://lore.kernel.org/r/20220421183023.3462291-10-bvanassche@acm.org Cc: Douglas Gilbert <dgilbert@interlog.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
[ bvanassche: Switched to reporting a constant zone starting LBA granularity ] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Damien Le Moal [Thu, 21 Apr 2022 18:30:22 +0000 (11:30 -0700)]
scsi: scsi_debug: Rename zone type constants
Rename the scsi_debug zone type constants to prevent a conflict with the
ZBC_ZONE_TYPE_GAP constant from include/scsi/scsi_proto.h.
Link: https://lore.kernel.org/r/20220421183023.3462291-9-bvanassche@acm.org Cc: Douglas Gilbert <dgilbert@interlog.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
[ bvanassche: Extracted these changes from a larger patch ] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220421183023.3462291-8-bvanassche@acm.org Cc: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Damien Le Moal [Thu, 21 Apr 2022 18:30:20 +0000 (11:30 -0700)]
scsi: sd: sd_zbc: Hide gap zones
ZBC-2 allows host-managed disks to report gap zones. This allow zoned disks
to report an offset between data zone starts that is a power of two even if
the number of logical blocks with data per zone is not a power of two.
Another new feature in ZBC-2 is support for constant zone starting LBA
offsets. For zoned disks that report a constant zone starting LBA offset,
hide the gap zones from the block layer. Report the offset between data
zone starts as zone size and report the number of logical blocks with data
per zone as the zone capacity.
Link: https://lore.kernel.org/r/20220421183023.3462291-7-bvanassche@acm.org Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
[ bvanassche: Reworked this patch ] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Damien Le Moal [Thu, 21 Apr 2022 18:30:19 +0000 (11:30 -0700)]
scsi: sd: sd_zbc: Return early in sd_zbc_check_zoned_characteristics()
Return early in sd_zbc_check_zoned_characteristics() for host-aware
disks. This patch does not change any functionality but makes a later patch
easier to read.
Link: https://lore.kernel.org/r/20220421183023.3462291-6-bvanassche@acm.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
[ bvanassche: extracted this change from a larger patch ] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Deriving the meaning of the nr_zones, rev_nr_zones, zone_blocks and
rev_zone_blocks member variables requires careful analysis of the source
code. Make the meaning of these member variables easier to understand by
introducing struct zoned_disk_info.
Link: https://lore.kernel.org/r/20220421183023.3462291-5-bvanassche@acm.org Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Damien Le Moal [Thu, 21 Apr 2022 18:30:17 +0000 (11:30 -0700)]
scsi: sd: sd_zbc: Use logical blocks as unit when querying zones
When querying zones, track the position in logical blocks instead of in
sectors. This change slightly simplifies sd_zbc_report_zones().
Link: https://lore.kernel.org/r/20220421183023.3462291-4-bvanassche@acm.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
[ bvanassche: extracted this change from a larger patch ] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hence this patch that verifies that the zone size is a power of two.
Link: https://lore.kernel.org/r/20220421183023.3462291-3-bvanassche@acm.org Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add several kernel-doc headers. Declare input arrays const. Specify the
array size in function declarations.
Link: https://lore.kernel.org/r/20220421183023.3462291-2-bvanassche@acm.org Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Move the definition of this data structure since it is only used in a
single source file.
Link: https://lore.kernel.org/r/20220419225811.4127248-28-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Split the ufshcd.h header file into a header file that defines the
interface used by UFS drivers and another header file with declarations and
data structures only used by the UFS core.
Follow the convention that is used elsewhere in the Linux kernel source
code and only include those headers of which the declarations are used
directly.
Link: https://lore.kernel.org/r/20220419225811.4127248-25-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: Remove unnecessary ufshcd-crypto.h include directives
ufshcd-crypto.h declares functions that must only be called by the UFS
core. Hence remove the #include "ufshcd-crypto.h" directive from UFS
drivers.
Link: https://lore.kernel.org/r/20220419225811.4127248-24-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Clearing hba->is_sys_suspended if ufs_qcom_resume() succeeds is wrong. That
variable must only be cleared if all actions involved in a resume succeed.
Hence remove the statement that clears hba->is_sys_suspended from
ufs_qcom_resume().
Link: https://lore.kernel.org/r/20220419225811.4127248-23-bvanassche@acm.org Fixes: e5586a66ebb0 ("ufs-qcom: add support for Qualcomm Technologies Inc platforms") Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since the code to modify delay_ms while holding the host lock occurs twice,
introduce a function that performs this action.
Link: https://lore.kernel.org/r/20220419225811.4127248-22-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: Remove locking from around single register writes
Single register writes are atomic and hence do not need to be surrounded by
locking. Additionally, MMIO writes are typically posted asynchronously.
Hence, there is no guarantee that these have finished by the time the
spin_unlock*() call has finished. See also the nonposted-mmio property of
the Open Firmware tree. See also pci_iomap().
Link: https://lore.kernel.org/r/20220419225811.4127248-21-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the Linux kernel coding style document
(Documentation/process/coding-style.rst) it is recommended to use the type
'bool' and also the values 'true' and 'false'. Hence this patch that
removes the definitions and uses of TRUE and FALSE from the UFS driver.
Link: https://lore.kernel.org/r/20220419225811.4127248-20-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since specifying the path in a source file is redundant, remove the paths
from source code comments.
Link: https://lore.kernel.org/r/20220419225811.4127248-19-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: Rename sdev_ufs_device into ufs_device_wlun
The new name reflects the role of this member variable better: a WLUN
through which the power mode of the UFS device is controlled.
Link: https://lore.kernel.org/r/20220419225811.4127248-17-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The current version number is 0.2. That driver version was assigned more
than nine years ago. A version number that is not updated while the driver
is updated is not useful. Hence remove the driver version number from the
UFS driver. See also commit 53e303e34feb ("[SCSI] ufs: Separate PCI code
into glue driver").
Link: https://lore.kernel.org/r/20220419225811.4127248-16-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: Make the config_scaling_param calls type safe
Pass the actual type to config_scaling_param callback as the third argment
instead of a void pointer. Remove a superfluous NULL pointer check from
ufs_qcom_config_scaling_param().
Link: https://lore.kernel.org/r/20220419225811.4127248-15-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make it easier to verify for humans that ufshcd_init_pwr_dev_param()
initializes all structure members. This patch does not change any
functionality.
Link: https://lore.kernel.org/r/20220419225811.4127248-14-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 6ff7a99a44c2 ("scsi: ufs: Remove pre-defined initial voltage values
of device power") removed the code that uses the UFS_VREG_VCC* constants
and also the code that sets the min_uV and max_uV member variables. Hence
also remove these constants and that member variable.
Link: https://lore.kernel.org/r/20220419225811.4127248-13-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: Invert the return value of ufshcd_is_hba_active()
It is confusing that ufshcd_is_hba_active() returns 'true' if the HBA is
not active. Clear up this confusion by inverting the return value of
ufshcd_is_hba_active(). This patch does not change any functionality.
Link: https://lore.kernel.org/r/20220419225811.4127248-12-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Declare the quirks array and also its 'model' member const to make it
explicit that these are not modified.
Link: https://lore.kernel.org/r/20220419225811.4127248-11-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: Rename struct ufs_dev_fix into ufs_dev_quirk
Since struct ufs_dev_fix contains quirk information, rename it into struct
ufs_dev_quirk.
Link: https://lore.kernel.org/r/20220419225811.4127248-10-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: Remove the UFS_FIX() and END_FIX() macros
Since these two macros reduce code readability, remove them.
Link: https://lore.kernel.org/r/20220419225811.4127248-9-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
ufshcd_lrb.sense_buffer is NULL if ufshcd_lrb.cmd is NULL and
ufshcd_lrb.sense_buffer points at cmd->sense_buffer if ufshcd_lrb.cmd is
set. In other words, the ufshcd_lrb.sense_buffer member is identical to
cmd->sense_buffer. Hence this patch that removes the
ufshcd_lrb.sense_buffer structure member.
Link: https://lore.kernel.org/r/20220419225811.4127248-7-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
ufshcd_lrb.sense_bufflen is set but never read. Hence remove this struct
member.
Link: https://lore.kernel.org/r/20220419225811.4127248-6-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Keoseong Park <keosung.park@samsung.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: Simplify statements that return a boolean
Convert "if (expr) return true; else return false;" into "return expr;" if
either 'expr' is a boolean expression or the return type of the function is
'bool'.
Link: https://lore.kernel.org/r/20220419225811.4127248-5-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Keoseong Park <keosung.park@samsung.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove "? true : false" if the preceding expression yields a boolean or if
the result of the expression is assigned to a boolean since in these two
cases the "? true : false" part is superfluous.
Link: https://lore.kernel.org/r/20220419225811.4127248-4-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Declare this function static since it is only used inside the ufshcd.c
source file.
Link: https://lore.kernel.org/r/20220419225811.4127248-3-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: Fix a spelling error in a source code comment
Change one occurrence of "adpater" into "adapter".
Link: https://lore.kernel.org/r/20220419225811.4127248-2-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: core: Increase fDeviceInit poll frequency
UFS devices are expected to clear fDeviceInit flag in single digit
milliseconds. Current values of 5 to 10 millisecond sleep add to increased
latency during the initialization and resume path. This CL lowers the sleep
range to 500 to 1000 microseconds.
Link: https://lore.kernel.org/r/20220421002429.3136933-1-bvanassche@acm.org Acked-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Konstantin Vyshetsky <vkon@google.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Dan Carpenter [Thu, 21 Apr 2022 15:03:52 +0000 (18:03 +0300)]
scsi: iscsi: Fix harmless double shift bug
These flags are supposed to be bit numbers. Right now they cause a double
shift bug where we use BIT(BIT(2)) instead of BIT(2). Fortunately, the bit
numbers are small and it's done consistently so it does not cause an issue
at run time.
Link: https://lore.kernel.org/r/YmFyWHf8nrrx+SHa@kili Fixes: bfc38cb7416d ("scsi: iscsi: Merge suspend fields") Reviewed-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: target: core: Silence the message about unknown VPD pages
Target does not support some VPD pages and is very verbose about it.
Sometimes initiators don't bother and just keep sending the same request
from time to time, filling up the logs.
The replyPostRegisterIndex array of struct MPT3SAS_ADAPTER stores iomem
resource addresses. Fix its declaration to annotate it with __iomem to
avoid sparse warnings for writel() calls using the stored addresses.
Damien Le Moal [Mon, 7 Mar 2022 23:48:53 +0000 (08:48 +0900)]
scsi: mpt3sas: Fix event callback log_code value handling
In mpt3sas_scsih_event_callback(), fix a sparse warning when testing the
event log code value by replacing the use of a pointer to the address
storing the event log code with a log code local variable. Doing so,
le32_to_cpu() is used when the log code value is assigned, avoiding a
sparse warning.
Damien Le Moal [Mon, 7 Mar 2022 23:48:52 +0000 (08:48 +0900)]
scsi: mpt3sas: Fix ioc->base_readl() use
The functions _base_readl_aero() and _base_readl() used for an adapter
base_readl() method are implemented using a regular readl() call which
internally performs a conversion to CPU endianness (le32_to_cpu()) of
the values read. The users of the ioc base_readl() method should thus
not convert again the values read using le16_to_cpu().
Fixing this removes sparse warnings.
Damien Le Moal [Mon, 7 Mar 2022 23:48:51 +0000 (08:48 +0900)]
scsi: mpt3sas: Fix writel() use
writel() internally executes cpu_to_le32() to convert the value being
written to little endian. The caller should thus not use this conversion
function for the value passed to writel(). Remove the cpu_to_le32() calls
in _base_put_smid_scsi_io_atomic(), _base_put_smid_fast_path_atomic(),
_base_put_smid_hi_priority_atomic() _base_put_smid_default_atomic() and
_base_handshake_req_reply_wait().
The TaskMID field of struct Mpi2SCSITaskManagementRequest_t is a 16-bit
little endian value. Fix the search loop in _ctl_set_task_mid() to add a
cpu_to_le16() conversion before checking the value of TaskMID to avoid
sparse warnings. While at it, simplify the search loop code to remove an
unnecessarily complicated if condition.
scsi: core: Increase max device queue_depth to 4096
The maximum SCSI device queue depth of 1024 is not sufficient for RAID
volumes configured behind Broadcom RAID controllers. For a 16-drive RAID
volume with a device queue depth limit of 1024, only 64 I/Os (1024/16) can
be issued per drive. That is not sufficient to saturate the device.
Link: https://lore.kernel.org/r/20220414103601.140687-1-sumit.saxena@broadcom.com Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:20:08 +0000 (15:20 -0700)]
scsi: lpfc: Copyright updates for 14.2.0.2 patches
Update copyrights to 2022 for files modified in the 14.2.0.2 patch set.
Link: https://lore.kernel.org/r/20220412222008.126521-27-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:20:07 +0000 (15:20 -0700)]
scsi: lpfc: Update lpfc version to 14.2.0.2
Update lpfc version to 14.2.0.2.
Link: https://lore.kernel.org/r/20220412222008.126521-26-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:20:06 +0000 (15:20 -0700)]
scsi: lpfc: Expand setting ELS_ID field in ELS_REQUEST64_WQE
ELS_ID field for ELS_REQUEST64_WQE is not filled out when FIP is not
supported by the HBA.
Move setting ELS_ID logic into __lpfc_sli_prep_els_req_rsp_s4(), and remove
ELS_ID FIP dependency logic from lpfc_sli_prep_wqe().
Introduce PLOGI ELS_ID and as a result update wqe_els_id_MASK because PLOGI
ELS_ID = 0x4 occupies up to 3 bits.
While in __lpfc_sli_prep_els_req_rsp_s4() routine, remove SLI3-isms.
Link: https://lore.kernel.org/r/20220412222008.126521-25-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:20:05 +0000 (15:20 -0700)]
scsi: lpfc: Update stat accounting for READ_STATUS mbox command
READ_STATUS tx/rx byte count fields are now expanded to 64 bit wide
counters. This patch updates logic for the READ_STATUS mbox command when
displaying tx_word and rx_word statistics in sysfs.
Link: https://lore.kernel.org/r/20220412222008.126521-24-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:20:04 +0000 (15:20 -0700)]
scsi: lpfc: Change FA-PWWN detection methodology
Do not rely on vendor version field of the CSPs to determine if we are in a
FA-PWWN environment. Instead, use the following procedure:
First, during HBA initialization, driver does a READ_CONFIG to determine if
FA-PWWN is configured on the HBA. A LPFC_FAWWPN_CONFIG hba_flag is set
accordingly.
Next, when the link comes up before the driver gets a link up event, the
firmware logs into the fabric with FA-PWWN. If the fabric port does not
support FA-PWWN, the driver will get a Misconfigured FA-WWN async event
before the link up. A LPFC_FAWWPN_FABRIC hba_flag will be set accordingly.
Finally, if the fabric supports FA-PWWN, the firmware will replace its CSPs
WWN with the Fabric Assigned ones. Then after link up, the driver will
retrieve the Fabric Assigned WWN when it does a READ_SPARAM mbox command.
Link: https://lore.kernel.org/r/20220412222008.126521-23-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:20:03 +0000 (15:20 -0700)]
scsi: lpfc: Refactor cleanup of mailbox commands
The intention of this patch is to refactor mailbox memory allocation and
cleanup steps in one routine respectively to prevent memory leaks or memory
errors related to mailbox commands. There are trivial localized fixes as
well.
Provide lpfc_mbox_rsrc_prep() - this routine allocates the dmabuf and the
mbuf associated with it. It also catches allocation errors and returns
status.
Provide lpfc_mbox_rsrc_cleanup() - this routine verifies a dmabuf exists
and if so releases the associated mbuf and the dmabuf memory. It then sets
the ctx_buf to NULL and releases the mailbox memory to the mailbox pool.
Link: https://lore.kernel.org/r/20220412222008.126521-22-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:20:02 +0000 (15:20 -0700)]
scsi: lpfc: Fix field overload in lpfc_iocbq data structure
The lpfc_iocbq data structure has void * pointers that are overloaded to be
as many as 8 different data types and the driver translates the void * by
casting. This patch removes the void * pointers by declaring the specific
types needed by the driver. It also expands the context_un to include more
seldom used pointer types to save structure bytes. It also groups the u8
types together to pack the 8 bytes needed. This work allows the lpfc_iocbq
data structure to be more strongly typed and keeps it from being allocated
from the 512 byte slab.
[mkp: rolled in zeroday fix]
Link: https://lore.kernel.org/r/20220412222008.126521-21-jsmart2021@gmail.com Reported-by: kernel test robot <lkp@intel.com> Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:20:01 +0000 (15:20 -0700)]
scsi: lpfc: Introduce FC_RSCN_MEMENTO flag for tracking post RSCN completion
During an NVMe target reboot, the target may initialize itself as FCP only
during the first RSCN and shortly after trigger a second RSCN claiming NVMe
support. The timing of these RSCNs occur before FCP-PRLI for the first
RSCN completes leading discovery issues over NVMe.
Change RSCN and NVME-PRLI send logic based on a new FC_RSCN_MEMENTO flag
that signals when lpfc_end_rscn() is completed and serves as a memento that
discovery was started from RSCN.
Link: https://lore.kernel.org/r/20220412222008.126521-20-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:20:00 +0000 (15:20 -0700)]
scsi: lpfc: Register for Application Services FC-4 type in Fabric topology
Add new FC-4 type 0x60 Application Services for fabric registration when
VMID is enabled.
Modified rft struture to indicate __be format. Removed redundant ipReg
variable as it was not used.
Link: https://lore.kernel.org/r/20220412222008.126521-19-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:59 +0000 (15:19 -0700)]
scsi: lpfc: Remove false FDMI NVMe FC-4 support for NPIV ports
FDMI FC-4 Active Type for vports mistakenly shows NVMe support.
Add a check to only set the NVMe support bit for the physical port.
Link: https://lore.kernel.org/r/20220412222008.126521-18-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:58 +0000 (15:19 -0700)]
scsi: lpfc: Revise FDMI reporting of supported port speed for trunk groups
Trunk port FDMI supported port speed shows single port supported speed
rather than the trunked port speed.
Modify supported port speed logic calculation during registration.
Link: https://lore.kernel.org/r/20220412222008.126521-17-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
this_cpu_ptr() calls smp_processor_id() in a preemptible context.
Fix by using per_cpu_ptr() with raw_smp_processor_id() instead.
Link: https://lore.kernel.org/r/20220412222008.126521-16-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:56 +0000 (15:19 -0700)]
scsi: lpfc: Correct CRC32 calculation for congestion stats
lpfc_cgn_calc_crc32() is returning 32 bits, and lpfc_cgn_update_stat() was
using u16 to store the crc32 value. Correct by redeclaring the local
variable to u32.
Link: https://lore.kernel.org/r/20220412222008.126521-15-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:55 +0000 (15:19 -0700)]
scsi: lpfc: Move MI module parameter check to handle dynamic disable
lpfc_refresh_params() can be called for an async event handler. This could
potentially override the value initialized by lpfc_cmf_setup().
Move module parameter check to lpfc_refresh_params().
Link: https://lore.kernel.org/r/20220412222008.126521-14-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:54 +0000 (15:19 -0700)]
scsi: lpfc: Remove unnecessary NULL pointer assignment for ELS_RDF path
The command IOCB ndlp pointer is overwritten in lpfc_issue_els_rdf(), and
the original ndlp pointer is stored ahead of time.
This null ptr assignment can be safely removed.
Link: https://lore.kernel.org/r/20220412222008.126521-13-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:53 +0000 (15:19 -0700)]
scsi: lpfc: Transition to NPR state upon LOGO cmpl if link down or aborted
In P2P topology, a target controller reboot sometimes results in not
reestablishing a login because the ndlp is stuck in LOGO state.
Fix by transitioning to NPR state if we get link down before LOGO
completes.
Link: https://lore.kernel.org/r/20220412222008.126521-12-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:52 +0000 (15:19 -0700)]
scsi: lpfc: Update fc_prli_sent outstanding only after guaranteed IOCB submit
If lpfc_sli_issue_iocb() fails, then the fc_prli_sent is never decremented.
Move the fc_prli_sent++ to after a guaranteed IOCB submit.
Link: https://lore.kernel.org/r/20220412222008.126521-11-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:51 +0000 (15:19 -0700)]
scsi: lpfc: Protect memory leak for NPIV ports sending PLOGI_RJT
There is a potential memory leak in lpfc_ignore_els_cmpl() and
lpfc_els_rsp_reject() that was allocated from NPIV PLOGI_RJT
(lpfc_rcv_plogi()'s login_mbox).
Check if cmdiocb->context_un.mbox was allocated in lpfc_ignore_els_cmpl(),
and then free it back to phba->mbox_mem_pool along with mbox->ctx_buf for
service parameters.
For lpfc_els_rsp_reject() failure, free both the ctx_buf for service
parameters and the login_mbox.
Link: https://lore.kernel.org/r/20220412222008.126521-10-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:50 +0000 (15:19 -0700)]
scsi: lpfc: Fix null pointer dereference after failing to issue FLOGI and PLOGI
If lpfc_issue_els_flogi() fails and returns non-zero status, the node
reference count is decremented to trigger the release of the nodelist
structure. However, if there is a prior registration or dev-loss-evt work
pending, the node may be released prematurely. When dev-loss-evt
completes, the released node is referenced causing a use-after-free null
pointer dereference.
Similarly, when processing non-zero ELS PLOGI completion status in
lpfc_cmpl_els_plogi(), the ndlp flags are checked for a transport
registration before triggering node removal. If dev-loss-evt work is
pending, the node may be released prematurely and a subsequent call to
lpfc_dev_loss_tmo_handler() results in a use after free ndlp dereference.
Add test for pending dev-loss before decrementing the node reference count
for FLOGI, PLOGI, PRLI, and ADISC handling.
Link: https://lore.kernel.org/r/20220412222008.126521-9-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:49 +0000 (15:19 -0700)]
scsi: lpfc: Clear fabric topology flag before initiating a new FLOGI
Previous topologies may no longer be in fabric mode, so clear FC_FABRIC in
fc_flag for every new FLOGI.
Link: https://lore.kernel.org/r/20220412222008.126521-8-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix by reordering the taking of the lpfc_cmd->buf_lock and phba->hbalock in
lpfc_abort_handler routine so that it tries to take the lpfc_cmd->buf_lock
first before phba->hbalock.
Link: https://lore.kernel.org/r/20220412222008.126521-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:47 +0000 (15:19 -0700)]
scsi: lpfc: Requeue SCSI I/O to upper layer when fw reports link down
During heavy I/O stress tests with 100+ vports and cable pulls, it may take
a while before the vport logs back into the fabric to resume I/O.
Currently, the driver immediately fails the I/O with DID_ERROR.
Change behavior to return DID_REQUEUE, and rely on SCSI layer's max retry
of 5 before erroring out the I/O.
Link: https://lore.kernel.org/r/20220412222008.126521-6-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:46 +0000 (15:19 -0700)]
scsi: lpfc: Zero SLI4 fcp_cmnd buffer's fcpCntl0 field
It's possible that the fcpCntl0 reserved field is allocated non-zero.
For certain target storage arrays this could cause problems expecting
reserved fields to be all zero.
SLI3 path already allocates fcp_cmnd buffer with dma_pool_zalloc() in
lpfc_new_scsi_buf_s3. The fcpCntl0 field itself is never proactively set
throughout the SCSI I/O path. Thus, we only change the SLI4 fcp_cmnd
buffer allocation to dma_pool_zalloc.
Link: https://lore.kernel.org/r/20220412222008.126521-5-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Tue, 12 Apr 2022 22:19:45 +0000 (15:19 -0700)]
scsi: lpfc: Fix diagnostic fw logging after a function reset
The lpfc_sli4_ras_setup() routine is only called from the
lpfc_pci_probe_one_s4() routine, which means diagnostic fw logging
initialization only occurs during probing.
Thus, any path involving a reset of the HBA that restarts the state of the
SLI port does not reinitialize diagnostic fw logging.
Move lpfc_sli4_ras_setup() into lpfc_sli4_hba_setup() so that the
LOWLEVEL_SET_DIAG_LOG_OPTIONS mailbox command can be sent after a function
reset.
Link: https://lore.kernel.org/r/20220412222008.126521-4-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The same CPU tries to claim the phba->port_list_lock twice.
Move the cfg_log_verbose checks as part of the lpfc_printf_vlog() and
lpfc_printf_log() macros before calling lpfc_dmp_dbg(). There is no need
to take the phba->port_list_lock within lpfc_dmp_dbg().
Link: https://lore.kernel.org/r/20220412222008.126521-3-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>