max_node_id is not equal to the ARRAY_SIZE of the node array; it needs to
be increased by 1, otherwise xlate will fail for the last entry. Also
rename max_node_id to num_nodes to reflect reality.
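The off-by-one can be illustrated with a minimal userspace sketch. The
demo_* names and the node array are hypothetical and are not taken from
the driver; the only point is that a count-based bound keeps the last
entry reachable, while passing the highest index as the bound rejects it.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node {
	int id;
};

/* Four nodes, so the highest valid index is 3 and the count is 4. */
const struct demo_node demo_nodes[] = { { 0 }, { 1 }, { 2 }, { 3 } };

/*
 * Hypothetical xlate-style lookup. num_nodes is the number of entries,
 * so every index below it is valid, including the last one.
 */
const struct demo_node *demo_xlate(const struct demo_node *nodes,
				   size_t num_nodes, size_t idx)
{
	assert(nodes != NULL);
	if (idx >= num_nodes)
		return NULL;
	return &nodes[idx];
}
```

Calling demo_xlate(demo_nodes, 3, 3) models the old behaviour, where the
bound was the highest id rather than the count, and returns NULL for the
last entry.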
The error code is overridden when the PLL doesn't lock, so USB
initialization continues and the platform freezes.
Avoid this by returning the proper error code, which stops the USB probe
from freezing the platform and also shows the proper error in the log.
The simple_write_to_buffer() function will return positive/success if it
is able to write a single byte anywhere within the buffer. However that
potentially leaves a lot of the buffer uninitialized.
In this code it's better to return 0 if the offset is non-zero. This
code is not written to support partial writes. And then return -EFAULT
if the buffer is not completely initialized.
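A userspace sketch of the checks described above, under stated
assumptions: demo_copy() is a hypothetical stand-in that only mimics the
"copy as much as fits at the offset" behaviour, and struct demo_cmd is an
invented fixed-size buffer. It is not the kernel helper itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct demo_cmd {
	char bytes[8];
};

/* Stand-in copy step: copies what fits at *ppos, returns bytes copied. */
long demo_copy(void *to, size_t available, long *ppos,
	       const void *from, size_t count)
{
	size_t pos;

	if (*ppos < 0)
		return -EINVAL;
	pos = (size_t)*ppos;
	if (pos >= available || count == 0)
		return 0;
	if (count > available - pos)
		count = available - pos;
	memcpy((char *)to + pos, from, count);
	*ppos += (long)count;
	return (long)count;
}

/* Write handler that refuses partial writes. */
long demo_write(struct demo_cmd *cmd, const void *from, size_t count,
		long *ppos)
{
	long ret;

	if (*ppos != 0)
		return 0;		/* non-zero offset: nothing written */
	ret = demo_copy(cmd, sizeof(*cmd), ppos, from, count);
	if (ret < 0)
		return ret;
	if ((size_t)ret != sizeof(*cmd))
		return -EFAULT;		/* buffer not completely initialized */
	return ret;
}

/* Small driver for the sketch: writes count bytes at offset pos. */
long demo_try(size_t count, long pos)
{
	static const char src[sizeof(struct demo_cmd)] = "1234567";
	struct demo_cmd cmd;

	assert(count <= sizeof(src));
	return demo_write(&cmd, src, count, &pos);
}
```

With this shape, a short write such as demo_try(3, 0) fails with -EFAULT
instead of reporting success for a mostly uninitialized buffer.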
Not all platforms have all of the four currently supported wakeup
interrupts so use the optional irq helpers when looking up interrupts to
avoid printing error messages when an optional interrupt is not found:
dwc3-qcom a6f8800.usb: error -ENXIO: IRQ hs_phy_irq not found
According to the programming guide, it is recommended to
perform a GCTL_CORESOFTRESET only when switching the mode
from device to host or host to device. However, it is found
that during bootup when __dwc3_set_mode() is called for the
first time, GCTL_CORESOFTRESET is done with the suspendable bit (BIT 17)
of DWC3_GUSB3PIPECTL set. This sometimes leads to issues
like the controller going into a bad state and controller registers
reading zero. Core initialization is not complete until
GCTL_CORESOFTRESET is done and the run/stop bit is set.
Setting the suspendable bit of DWC3_GUSB3PIPECTL and then
performing GCTL_CORESOFTRESET is therefore not recommended.
Avoid this by only performing the reset if current_dr_role is set,
that is, when doing subsequent role switching.
Synopsys IP DWC_usb32 and DWC_usb31 version 1.90a and above deprecated
GCTL.CORESOFTRESET. The DRD mode switching flow is updated to remove the
GCTL soft reset. Add version checks to prevent using deprecated setting
in mode switching flow.
USB_AMD5536UDC should depend on HAS_DMA since it selects USB_SNP_CORE,
which depends on HAS_DMA and since 'select' does not follow any
dependency chains.
Fixes this kconfig warning:
WARNING: unmet direct dependencies detected for USB_SNP_CORE
Depends on [n]: USB_SUPPORT [=y] && USB_GADGET [=y] && (USB_AMD5536UDC [=y] || USB_SNP_UDC_PLAT [=n]) && HAS_DMA [=n]
Selected by [y]:
- USB_AMD5536UDC [=y] && USB_SUPPORT [=y] && USB_GADGET [=y] && USB_PCI [=y]
The 'pdev' and 'netdev' need to be released in error cases of
iss_net_configure().
Change the return type of iss_net_configure() to void, because it's
not used.
Fixes: f3bf2e96d0e8 ("[PATCH] xtensa: Architecture support for Tensilica Xtensa Part 8")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Provide release() callback for the platform device embedded into struct
iss_net_private and registered in the iss_net_configure so that
platform_device_unregister could be called for it.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Correct the SOP READ and WRITE DMA flags for some requests.
This update corrects DMA direction issues with SCSI commands removed from
the controller's internal lookup table.
SCSI READ BLOCK LIMITS (0x5) was recently removed from the controller
lookup table, which exposed a DMA direction flag issue: the controller now
uses the respective IU flag field to set the DMA data direction. Since the
DMA direction is incorrect, the FW never completes the request, causing a
hang.
Some commands that use SCSI READ BLOCK LIMITS:
* sg_map
* mt -f /dev/stX status
After updating controller firmware, users may notice their tape units
failing. This patch resolves the issue.
Also, the AIO path DMA direction is correct.
The DMA direction flag is a day-one bug with no reported BZ.
Fixes: c75a508cb567 ("smartpqi: initial commit of Microsemi smartpqi driver")
Link: https://lore.kernel.org/r/165730605618.177165.9054223644512926624.stgit@brunhilda
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mahesh Rajashekhara <Mahesh.Rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
We currently enable the clocks BEFORE we write to the PARF_PHY_CTRL reg to
enable clocks and resets. As a result the driver never reaches a ready
state and fails with the error 'Phy link never came up'.
This is caused by the PHY clock getting enabled before the required bits
are set in the PARF regs.
A workaround for this was in place, but with this new discovery we can
drop the workaround and use a proper solution: enable the clock only
AFTER the PARF_PHY_CTRL bit is set.
This correctly sets up the PCIe link and makes it usable even when a
bootloader leaves the PCIe link in an undefined state.
AER reporting is currently disabled in the DevCtl registers of all non Root
Port PCIe devices on systems using pcie_ports_native || host->native_aer,
disabling AER completely in such systems. This is because 814fc331a463
("PCI: PCIe: Disable PCIe port services during port initialization"), added
a call to pci_disable_pcie_error_reporting() *after* the AER setup was
completed for the PCIe device tree.
Here is a longer analysis of the current status of AER enabling /
disabling upon bootup, provided by Bjorn:
  pcie_portdrv_probe
    pcie_port_device_register
      get_port_device_capability
        pci_disable_pcie_error_reporting
          clear CERE NFERE FERE URRE               # <-- disable for RP USP DSP
      pcie_device_init
        device_register                            # new AER service device
          aer_probe
            aer_enable_rootport                    # RP only
              set_downstream_devices_error_reporting
                set_device_error_reporting         # self (RP)
                  if (RP || USP || DSP)
                    pci_enable_pcie_error_reporting
                      set CERE NFERE FERE URRE     # <-- enable for RP
                pci_walk_bus
                  set_device_error_reporting
                    if (RP || USP || DSP)
                      pci_enable_pcie_error_reporting
                        set CERE NFERE FERE URRE   # <-- enable for USP DSP
In a typical Root Port -> Endpoint hierarchy, the above:
- Disables Error Reporting for the Root Port,
- Enables Error Reporting for the Root Port,
- Does NOT enable Error Reporting for the Endpoint because it is not a
Root Port or Switch Port.
In a deeper Root Port -> Upstream Switch Port -> Downstream Switch
Port -> Endpoint hierarchy:
- Disables Error Reporting for the Root Port,
- Enables Error Reporting for the Root Port,
- Enables Error Reporting for both Switch Ports,
- Does NOT enable Error Reporting for the Endpoint because it is not a
Root Port or Switch Port,
- Disables Error Reporting for the Switch Ports when pcie_portdrv_probe()
claims them. AER does not re-enable it because these are not Root
Ports.
Remove this call to pci_disable_pcie_error_reporting() from
get_port_device_capability(), leaving the already enabled AER configuration
intact. With this change, AER is enabled in the Root Port and the PCIe
switch upstream and downstream ports. Only the PCIe Endpoints don't have
AER enabled yet. A follow-up patch will take care of this Endpoint
enabling.
Fixes: 814fc331a463 ("PCI: PCIe: Disable PCIe port services during port initialization")
Link: https://lore.kernel.org/r/20220125071820.2247260-3-sr@denx.de
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Pali Rohár <pali@kernel.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yao Hongbo <yaohongbo@linux.alibaba.com>
Cc: Naveen Naidu <naveennaidu479@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Each secure guest must have a unique ASCE (address space control
element); we must avoid that new guests use the same page for their
ASCE, to avoid errors.
Since the ASCE mostly consists of the address of the topmost page table
(plus some flags), we must not return that memory to the pool unless
the ASCE is no longer in use.
Only a successful Destroy Secure Configuration UVC will make the ASCE
reusable again.
If the Destroy Configuration UVC fails, the ASCE cannot be reused for a
secure guest (either for the ASCE or for other memory areas). To avoid
a collision, it must not be used again. This is a permanent error and
the page becomes in practice unusable, so we set it aside and leak it.
On failure we already leak other memory that belongs to the ultravisor
(i.e. the variable and base storage for a guest) and not leaking the
topmost page table was an oversight.
This error (and thus the leakage) should not happen unless the hardware
is broken or KVM has some unknown serious bug.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Fixes: 814dbd225c8436f ("KVM: s390: protvirt: Add initial vm and cpu lifecycle handling")
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20220628135619.32410-2-imbrenda@linux.ibm.com
Message-Id: <20220628135619.32410-2-imbrenda@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
In set_uhs_signaling, the DDR bit is being set by fully writing the MC1R
register.
This can lead to accidental erase of certain bits in this register.
Avoid this by doing a read-modify-write operation.
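A minimal sketch of the read-modify-write pattern, using a simulated
8-bit register in place of MMIO. The demo_* accessors and the bit
position of DEMO_MC1R_DDR are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MC1R_DDR	(1u << 3)	/* hypothetical bit position */

/* Simulated register contents; a real driver would use MMIO accessors. */
uint8_t demo_mc1r;

uint8_t demo_readb(void)
{
	return demo_mc1r;
}

void demo_writeb(uint8_t val)
{
	demo_mc1r = val;
}

/* Set only the DDR bit and leave every other bit as it was. */
void demo_set_ddr(void)
{
	uint8_t val = demo_readb();

	val = (uint8_t)(val | DEMO_MC1R_DDR);
	demo_writeb(val);
}
```

Writing DEMO_MC1R_DDR directly with demo_writeb() would instead clear
whatever other bits were set beforehand.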
Fixes: e54596e47384 ("mmc: sdhci-of-at91: fix MMC_DDR_52 timing selection")
Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Tested-by: Karl Olsen <karl@micro-technic.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20220630090926.15061-1-eugen.hristev@microchip.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
'erased_blocks_bitmap' is never freed. As it is allocated at the same time
as 'used_blocks_bitmap', it is likely that it should be freed also at the
same time.
Add the corresponding bitmap_free() in msb_data_clear().
In case of devm_reset_control_get_optional_exclusive() failure we returned
directly instead of jumping to the error path to roll back initialization.
This patch moves devm_reset_control_get_optional_exclusive() early in the
probe so that we have the reset handle prior to initialization of the
hardware.
Fixes: e829d79509811 ("mmc: renesas_sdhi: do hard reset if possible")
Reported-by: Pavel Machek <pavel@denx.de>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220624181438.4355-2-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
of_find_matching_node() returns a node pointer with its refcount
incremented; we should call of_node_put() on it when it is no longer
needed. Add the missing of_node_put() to avoid a refcount leak.
of_node_put() checks for a NULL pointer.
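The get/put pairing can be sketched in userspace with a toy refcount.
The demo_* functions are hypothetical stand-ins that only model "lookup
takes a reference" and "put tolerates NULL"; they are not the OF API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_node {
	int refcount;
};

/* One node that starts with a single reference held by the "tree". */
struct demo_node demo_tree_node = { 1 };

/* Lookup stand-in: returns the node with its refcount incremented. */
struct demo_node *demo_find_node(bool present)
{
	if (!present)
		return NULL;
	demo_tree_node.refcount++;
	return &demo_tree_node;
}

/* Put stand-in: drops one reference and accepts NULL. */
void demo_node_put(struct demo_node *np)
{
	if (np)
		np->refcount--;
}

/* Only the presence of the node matters, so drop the reference at once. */
bool demo_has_feature(bool present)
{
	struct demo_node *np = demo_find_node(present);
	bool found = np != NULL;

	demo_node_put(np);	/* safe even when the lookup failed */
	return found;
}
```

Because the put helper accepts NULL, the caller does not need a separate
branch for the "not found" case.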
Fixes: 75ac19c92f67 ("mmc: sdhci-of-esdhc: add support for signal voltage switch")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220523144255.10310-1-linmq006@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
There are sleep-in-atomic-context bugs when dm_fsync_timer_callback is
executing. The root cause is that memory allocation functions with the
GFP_KERNEL or GFP_NOIO flags are called in dm_fsync_timer_callback,
which is a timer handler. The call paths that could trigger the bugs are
shown below:
The patchset in [1] exported some definitions to binder_internal.h in
order to make the debugfs entries such as 'stats' and 'transaction_log'
available in a binderfs instance. However, the DEFINE_SHOW_ATTRIBUTE
macro expands into a static function/variable pair, which in turn get
redefined each time a source file includes this internal header.
This problem was made evident after a report from the kernel test robot
<lkp@intel.com> where several W=1 build warnings are seen in downstream
kernels. See the following example:
include/../drivers/android/binder_internal.h:111:23: warning: 'binder_stats_fops' defined but not used [-Wunused-const-variable=]
111 | DEFINE_SHOW_ATTRIBUTE(binder_stats);
| ^~~~~~~~~~~~
include/linux/seq_file.h:174:37: note: in definition of macro 'DEFINE_SHOW_ATTRIBUTE'
174 | static const struct file_operations __name ## _fops = { \
| ^~~~~~
This patch fixes the above issues by moving back the definitions into
binder.c and instead creates an array of the debugfs entries which is
more convenient to share with binderfs and iterate through.
After commit 240442c7424b ("dma-mapping: remove CONFIG_DMA_REMAP") there's
a chance of DMA buffer getting allocated via vmalloc(), which messes up
the mmapping code:
Unbinding an endpoint function from the endpoint controller shouldn't stop
the controller. This is especially a problem for multi-function endpoints
where other endpoints may still be active.
Don't stop the controller when unbinding one of its endpoints. Normally
the controller is stopped via configfs.
Fixes: dbbde090d453 ("PCI: endpoint: functions: Add an EP function to test PCI")
Link: https://lore.kernel.org/r/20220622040924.113279-1-mie@igel.co.jp
Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
This happens because of a data race. Each thread rewrites the channel's
descriptor as soon as device_prep_dma_memcpy() is called. This leads to a
situation where the driver thinks it is using the right descriptor, when
that descriptor has actually been freed or replaced by another one.
With the current fixes a descriptor changes its value only after it has
been used. A new descriptor is acquired from the vc->desc_issued queue,
which is already filled with descriptors that are ready to be sent. Threads
have no direct access to the DMA channel descriptor. Now it is only
possible to queue a descriptor for further processing.
Although harmless, the return statement in kvm_unexpected_el2_exception
is rather confusing as the function itself has a void return type. The
C standard is also pretty clear that "A return statement with an
expression shall not appear in a function whose return type is void".
Given that this return statement does not seem to add any actual value,
let's not pointlessly violate the standard.
Build-tested with GCC 10 and CLANG 13 for good measure, the disassembled
code is identical with or without the return statement.
In the SoundWire probe, we store a pointer from the driver ops into
the 'slave' structure. This can lead to kernel oopses when unbinding
codec drivers, e.g. with the following sequence to remove machine
driver and codec driver.
The full details can be found in the BugLink below, for reference the
two following examples show different cases of driver ops/callbacks
being invoked after the driver .remove().
This was not detected earlier in Intel tests since the tests first
remove the parent PCI device and shut down the bus. The sequence
above is a corner case which keeps the bus operational but without a
driver bound.
While trying to solve these kernel oopses, it became clear that the
existing SoundWire bus does not deal well with the unbind case.
Commit a3e1041ee4bdb ("soundwire: sdw_slave: add probe_complete structure and new fields")
added a 'probed' status variable and a 'probe_complete'
struct completion. This status is however not reset on remove and
likewise the 'probe complete' is not re-initialized, so the
bind/unbind/bind test cases would fail. The timeout used before the
'update_status' callback was also a bad idea in hindsight, there
should really be no timing assumption as to if and when a driver is
bound to a device.
An initial draft based on device_lock() and device_unlock() was
tested. This proved too complicated, with deadlocks created during the
suspend-resume sequences, which also use the same device_lock/unlock()
as the bind/unbind sequences. On a CometLake device, a bad DSDT/BIOS
caused spurious resumes and the use of device_lock() caused hangs
during suspend. After multiple weeks of testing and painful
reverse-engineering of deadlocks on different devices, we looked for
alternatives that did not interfere with the device core.
A bus notifier was used successfully to keep track of DRIVER_BOUND and
DRIVER_UNBIND events. This solved the bind-unbind-bind case in tests,
but it can still be defeated with a theoretical corner case where the
memory is freed by a .remove while the callback is in use. The
notifier only helps make sure the driver callbacks are valid, but not
that the memory allocated in probe remains valid while the callbacks
are invoked.
This patch suggests the introduction of a new 'sdw_dev_lock' mutex
protecting probe/remove and all driver callbacks. Since this mutex is
'local' to SoundWire only, it does not interfere with existing locks
and does not create deadlocks. In addition, this patch removes the
'probe_complete' completion; instead we directly invoke
'update_status' from the probe routine. That removes any sort of
timing dependency and gives much better support for the device/driver
model: the driver could be bound before the bus started, or eons after
the bus started, and the hardware would be properly initialized in all
cases.
BugLink: https://github.com/thesofproject/linux/issues/3531
Fixes: 592fd752c764 ("soundwire: Add MIPI DisCo property helpers")
Fixes: a3e1041ee4bdb ("soundwire: sdw_slave: add probe_complete structure and new fields")
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Rander Wang <rander.wang@intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://lore.kernel.org/r/20220621225641.221170-2-pierre-louis.bossart@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
The bus sdw_drv_remove() and sdw_drv_shutdown() helpers are used
conditionally, if the driver provides these routines.
These helpers already test if the driver provides a .remove or
.shutdown callback, so there's no harm in invoking the
sdw_drv_remove() and sdw_drv_shutdown() unconditionally.
In addition, the current code is imbalanced with
dev_pm_domain_attach() called from sdw_drv_probe(), but
dev_pm_domain_detach() called from sdw_drv_remove() only if the driver
provides a .remove callback.
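The imbalance described above can be sketched as follows. This is a
userspace model with hypothetical demo_* names: a counter stands in for
the PM domain attach/detach pair, and the bus-level remove helper is
always called while the driver's own callback stays optional.

```c
#include <assert.h>
#include <stddef.h>

struct demo_driver {
	int (*remove)(void);	/* optional driver callback */
};

int demo_pm_attached;		/* models the attach/detach balance */
int demo_remove_calls;

int demo_driver_remove(void)
{
	demo_remove_calls++;
	return 0;
}

const struct demo_driver demo_no_remove = { NULL };
const struct demo_driver demo_with_remove = { demo_driver_remove };

/* Bus-level probe helper: always attaches. */
void demo_bus_probe(void)
{
	demo_pm_attached++;
}

/*
 * Bus-level remove helper, invoked unconditionally. The driver callback
 * is tested here, and the detach always balances the attach in probe.
 */
int demo_bus_remove(const struct demo_driver *drv)
{
	int ret = 0;

	assert(drv != NULL);
	if (drv->remove)
		ret = drv->remove();
	demo_pm_attached--;
	return ret;
}
```

In this shape a driver without a .remove callback still gets its attach
undone, which is the balance the message argues for.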
If the "snps,enable-cdm-check" property exists, we should enable the CDM
check. But previously dw_pcie_setup() could exit before doing so if the
"num-lanes" property was absent or invalid.
Move the CDM enable earlier so we do it regardless of whether "num-lanes"
is present.
If dw_pcie_ep_init() fails to perform any action after the EPC memory is
initialized and the MSI memory region is allocated, the latter parts won't
be undone thus causing a memory leak. Add a cleanup-on-error path to fix
these leaks.
We program the 64-bit ATU limit address (in PCIE_ATU_LIMIT/
PCIE_ATU_UPPER_LIMIT or PCIE_ATU_UNR_LOWER_LIMIT/PCIE_ATU_UNR_UPPER_LIMIT),
but in addition, the PCIE_ATU_INCREASE_REGION_SIZE bit must be set if the
upper 32 bits of the limit address differ from the upper 32 bits of the
base address (see [1,2]).
1190a4b54f55 ("PCI: dwc: Add upper limit address for outbound iATU") set
PCIE_ATU_INCREASE_REGION_SIZE, but only when the *size* was greater than
4GB. It did not set it when a smaller region crossed a 4GB boundary, e.g.,
[mem 0x0_f0000000-0x1_0fffffff].
Set PCIE_ATU_INCREASE_REGION_SIZE whenever PCIE_ATU_UPPER_LIMIT is
greater than PCIE_ATU_UPPER_BASE.
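The difference between the two conditions can be checked with a small
sketch. The helper names are hypothetical; the region used in the usage
note is the [mem 0x0_f0000000-0x1_0fffffff] example from above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

uint32_t demo_upper_32_bits(uint64_t val)
{
	return (uint32_t)(val >> 32);
}

/* Condition as described for the earlier commit: region larger than 4GB. */
bool demo_size_only_check(uint64_t size)
{
	return size > 0x100000000ULL;
}

/*
 * Condition described here: the upper 32 bits of the limit address are
 * greater than the upper 32 bits of the base address.
 */
bool demo_needs_increase_region_size(uint64_t base, uint64_t size)
{
	uint64_t limit;

	assert(size > 0);
	limit = base + size - 1;
	return demo_upper_32_bits(limit) > demo_upper_32_bits(base);
}
```

For base 0xf0000000 and size 0x20000000 the region is only 512MB, so the
size-only check is false, yet the limit 0x10fffffff has a non-zero upper
half and the second check is true.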
Some DWC-based controllers (e.g., pcie-al.c and pci-keystone.c, identified
by the fact that they override the default dw_child_pcie_ops) use their own
address translation approach instead of the DWC internal ATU (iATU). For
those controllers, skip disabling the iATU outbound windows.
dw_pcie_disable_atu() was introduced by 1ee611b62e8d ("PCI: dwc:
designware: Add EP mode support") and supported only the viewport version
of the iATU CSRs.
DW PCIe IP cores v4.80a and newer also support unrolled iATU/eDMA space.
Callers of dw_pcie_disable_atu(), including pci_epc_ops.clear_bar(),
pci_epc_ops.unmap_addr(), and dw_pcie_setup_rc(), don't work correctly when
it is enabled.
Add dw_pcie_disable_atu() support for controllers with unrolled iATU CSRs
enabled.
It's logically correct to undo everything that was done when an error is
discovered or in the corresponding cleanup counterpart. Otherwise the host
controller will be left in an undetermined state. Since the link is set up
in the host_init method, deactivate it there in the cleanup-on-error block
and stop the link in the antagonistic routine - dw_pcie_host_deinit(). Link
deactivation is platform-specific and should be implemented in
dw_pcie_ops.stop_link().
Fixes: b337f432d9f8 ("PCI: dwc: Move link handling into common code")
Link: https://lore.kernel.org/r/20220624143428.8334-2-Sergey.Semin@baikalelectronics.ru
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
On SM8250 the two VFE GDSC power domains must not be operated while
titan top is turned off, so set them as subdomains of titan top in the
GDSC registration routine.
Fixes: bd1bc3e01295 ("clk: qcom: Add camera clock controller driver for SM8250")
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Reviewed-by: Robert Foss <robert.foss@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220519214133.1728979-3-vladimir.zapolskiy@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
On SDM845 the two VFE GDSC power domains must not be operated while
titan top is turned off, so set them as subdomains of titan top in the
GDSC registration routine.
Fixes: cb41ba691802 ("clk: qcom: Add camera clock controller driver for SDM845")
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Reviewed-by: Robert Foss <robert.foss@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220519214133.1728979-2-vladimir.zapolskiy@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
NSS port 5 and 6 frequency tables are currently broken and are causing a
wide range of issues, like 1G not working at all on port 6 or port 5 being
clocked at 312 instead of 125 MHz as UNIPHY1 gets selected.
So, update the frequency tables with the ones from the downstream QCA 5.4
based kernel which has already fixed this.
Fixes: 34e5b8de3c8f ("clk: qcom: ipq8074: add NSS ethernet port clocks")
Signed-off-by: Robert Marko <robimarko@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220515210048.483898-3-robimarko@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
The UBI32 Huayra PLL fails to lock within 5 us on some SoC silicon, which
causes wait_for_pll() to time out and return an error indicating that
the PLL failed to lock.
This is a bug in the Huayra PLL HW, for which the SW workaround
is to set bit 26 of the TEST_CTL register.
This is ported from the QCA 5.4 based downstream kernel.
Like in IPQ6018 the NSS related Alpha PLL-s require initial configuration
to work.
So, obtain the regmap that is required for the Alpha PLL configuration
and thus utilize the qcom_cc_really_probe() as we already have the regmap.
Then utilize the Alpha PLL configs from the downstream QCA 5.4 based
kernel to configure them.
This fixes the UBI32 and NSS crypto PLL-s failing to get enabled by the
kernel.
When a local operation (invalidate mr, reg mr, bind mw) is finished there
will be no ack packet coming from a responder to cause the wqe to be
completed. This may happen anyway if a subsequent wqe performs
IO. Currently if the wqe is signalled the completer tasklet is scheduled
immediately but not otherwise.
This leads to a deadlock if the next wqe has the fence bit set in send
flags and the operation is not signalled. This patch removes the condition
that the wqe must be signalled in order to schedule the completer tasklet
which is the simplest fix for this deadlock and is fairly low cost. This
is the analog for local operations of always setting the ackreq bit in all
last or only request packets even if the operation is not signalled.
Link: https://lore.kernel.org/r/20220523223251.15350-1-rpearsonhpe@gmail.com
Reported-by: Jenny Hack <jhack@hpe.com>
Fixes: 3e1cf5f469ef ("RDMA/rxe: Move local ops to subroutine")
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit c8af135cd196 ("xhci: fix unsafe memory usage in xhci tracing")
apparently missed one sprintf() call in xhci_decode_trb() -- replace
it with the snprintf() call as well...
Found by Linux Verification Center (linuxtesting.org) with the SVACE static
analysis tool.
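A small sketch of why the bounded call matters. The format string and
buffer size are invented for illustration and are not the actual
xhci_decode_trb() output; snprintf() itself is the standard C function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Deliberately small buffer so that the example output does not fit. */
char demo_buf[16];

/*
 * Format a hypothetical summary into buf. snprintf() never writes more
 * than size bytes (including the terminating NUL) and returns the length
 * the full string would have had.
 */
int demo_decode(char *buf, size_t size, unsigned int type,
		unsigned int flags)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, "type %u flags 0x%x", type, flags);
}
```

A return value greater than or equal to the buffer size signals
truncation, whereas sprintf() with the same arguments would have written
past the end of demo_buf.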
The msm8939 has an additional higher operating point for the multi-media
peripherals. The higher throughput MM components operate off of the
system-mm noc, not the system noc.
system_mm_noc_bfdcd_clk_src is the source clock for the higher frequency
capable system noc mm.
The __driver_attach() function has an A-A deadlock problem similar to
the one fixed by commit 0eb2b5aea27d ("driver core: fix deadlock in
__device_attach"), with a stack like the one in that commit.
In __driver_attach(), the lock holding logic is as follows:
...
__driver_attach
if (driver_allows_async_probing(drv))
  device_lock(dev)      // get lock dev
    async_schedule_dev(__driver_attach_async_helper, dev); // func
      async_schedule_node
        async_schedule_node_domain(func)
          entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
          /* when fail or work limit, sync to execute func, but
             __driver_attach_async_helper will get lock dev as
             well, which will lead to A-A deadlock. */
          if (!entry || atomic_read(&entry_count) > MAX_WORK) {
            func;
          } else
            queue_work_node(node, system_unbound_wq, &entry->work)
  device_unlock(dev)
As shown above, when async probing is allowed but async work cannot be
queued because of an out-of-memory condition or the work limit, func is
executed synchronously instead. This leads to an A-A deadlock because
__driver_attach_async_helper takes the dev lock again.
Reproduce:
It can be reproduced by forcing the condition
(if (!entry || atomic_read(&entry_count) > MAX_WORK)) so that func is
executed synchronously, like below:
To fix the deadlock, move the async_schedule_dev() call outside
device_lock. As we can see in async_schedule_node_domain, the workqueue
passed to queue_work_node is system_unbound_wq, so it can accept
concurrent operations. This does not change the code logic and does
not lead to a deadlock.
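The A-A pattern and the fix can be modelled in a single-threaded sketch.
Everything here is hypothetical: a boolean flag stands in for the device
lock, a counter records where a real non-recursive lock would block
forever, and demo_schedule_sync() models only the fallback in which the
scheduler runs func synchronously.

```c
#include <assert.h>
#include <stdbool.h>

bool demo_locked;	/* stand-in for the device lock */
int demo_deadlocks;	/* times a real lock would have self-deadlocked */

/* Returns false when the lock is already held by the same context. */
bool demo_lock(void)
{
	if (demo_locked) {
		demo_deadlocks++;
		return false;
	}
	demo_locked = true;
	return true;
}

void demo_unlock(void)
{
	demo_locked = false;
}

/* Helper that takes the lock itself, like the async attach helper. */
void demo_attach_helper(void)
{
	if (!demo_lock())
		return;
	/* probe work would happen here */
	demo_unlock();
}

/* Fallback path only: the scheduler runs func in the caller's context. */
void demo_schedule_sync(void (*func)(void))
{
	func();
}

/* Old shape: schedule while holding the lock. */
void demo_attach_inside_lock(void)
{
	(void)demo_lock();
	demo_schedule_sync(demo_attach_helper);
	demo_unlock();
}

/* New shape: decide under the lock, schedule after dropping it. */
void demo_attach_outside_lock(void)
{
	bool async = false;

	if (demo_lock()) {
		async = true;
		demo_unlock();
	}
	if (async)
		demo_schedule_sync(demo_attach_helper);
}
```

Only demo_attach_inside_lock() increments demo_deadlocks, because the
helper tries to take the lock its caller still holds.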
If an error occurs after a successful idr_alloc() call, the corresponding
resource must be released with idr_remove() as already done in the .remove
function.
Update the error handling path to add the missing idr_remove() call.
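The unwinding can be sketched with a toy ID allocator in place of the
IDR. The demo_* names, the fixed-size table and the simulated failure
flag are assumptions for illustration; the point is only that a failure
after a successful allocation releases the ID on the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define DEMO_MAX_IDS 4

bool demo_id_used[DEMO_MAX_IDS];

/* Toy allocator: returns the lowest free ID or a negative errno. */
int demo_id_alloc(void)
{
	int i;

	for (i = 0; i < DEMO_MAX_IDS; i++) {
		if (!demo_id_used[i]) {
			demo_id_used[i] = true;
			return i;
		}
	}
	return -ENOSPC;
}

void demo_id_remove(int id)
{
	assert(id >= 0 && id < DEMO_MAX_IDS);
	demo_id_used[id] = false;
}

/* later_step_fails simulates an error after the ID was allocated. */
int demo_probe(bool later_step_fails)
{
	int id, ret;

	id = demo_id_alloc();
	if (id < 0)
		return id;

	if (later_step_fails) {
		ret = -EIO;
		goto err_remove_id;
	}
	return id;

err_remove_id:
	demo_id_remove(id);	/* undo the allocation on failure */
	return ret;
}
```

After a failed demo_probe(true) the ID is free again, so the next
successful probe receives the same ID instead of leaking a slot.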
Access to the I/O of the SM8250 camera clock controller IP depends on the
GCC_CAMERA_AHB_CLK clock supplied by the global clock controller being
enabled. The latter is initialized at subsys level, so to satisfy the
dependency it makes sense to lower the init level of the camcc-sm8250
driver.
If both drivers are compiled as built-in, there is a chance that a board
won't boot up due to a race, which happens on the same init level.
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modaliases. Supporting OF modaliases is going to be difficult if not
impractical; an attempt was made but has been reverted. So ensure that
module autoloading works for this driver by adding an id_table listing the
SPI IDs for everything.
In accordance with [1, 2] the DW eDMA controller has been created to be
part of the DW PCIe Root Port and DW PCIe End-point controllers and to
offload the transferring of large blocks of data between application and
remote PCIe domains leaving the system CPU free for other tasks. In the
first case (eDMA being part of DW PCIe Root Port) the eDMA controller is
always accessible via the CPU DBI interface and never over the PCIe wire.
The latter case is more complex. Depending on the DW PCIe End-Point IP-core
synthesize parameters it's possible to have the eDMA registers accessible
not only from the application CPU side, but also via mapping the eDMA CSRs
over a dedicated endpoint BAR. So based on the specifics denoted above the
eDMA driver is supposed to support two types of the DMA controller setups:
1) eDMA embedded into the DW PCIe Root Port/End-point and accessible over
the local CPU from the application side.
2) eDMA embedded into the DW PCIe End-point and accessible via the PCIe
wire with MWr/MRd TLPs generated by the CPU PCIe host controller.
Since the CPU memory resides on different sides in these cases, the
semantics of the MEM_TO_DEV and DEV_TO_MEM operations are flipped with
respect to the Tx and Rx DMA channels. So MEM_TO_DEV/DEV_TO_MEM corresponds
to the Tx/Rx channels in setup 1) and to the Rx/Tx channels in setup 2).
The DW eDMA driver has supported the case 2) since 417abf1be9a5
("dmaengine: Add Synopsys eDMA IP core driver") in the framework of the
drivers/dma/dw-edma/dw-edma-pcie.c driver.
Support for case 1) was added later by e5933d438c32 ("dmaengine: dw-edma:
support local dma device transfer semantics"). Afterwards the driver was
supposed to cover both possible eDMA setups, but the latter commit
turned out to be not fully correct.
The problem was that the commit together with the new functionality support
also changed the channel direction semantics so the eDMA Read-channel
(corresponding to the DMA_DEV_TO_MEM direction for case 1) now uses the
sgl/cyclic base addresses as the Source addresses of the DMA transfers and
dma_slave_config.dst_addr as the Destination address of the DMA transfers.
Similarly the eDMA Write-channel (corresponding to the DMA_MEM_TO_DEV
direction for case 1) now uses dma_slave_config.src_addr as a source
address of the DMA transfers and sgl/cyclic base address as the Destination
address of the DMA transfers. This contradicts the logic of the
DMA-interface, which implies that DEV side is supposed to belong to the
PCIe device memory and MEM - to the CPU/Application memory. Indeed it seems
irrational to have the SG-list defined in the PCIe bus space, while
expecting a contiguous buffer allocated in the CPU memory. Moreover the
passed SG-list and cyclic DMA buffers are supposed to be mapped in a way so
to be seen by the DW eDMA Application (CPU) interface.
So in order to have the correct DW eDMA interface we need to invert the
eDMA Rd/Wr-channels and DMA-slave directions semantics by selecting the
src/dst addresses based on the DMA transfer direction instead of using the
channel direction capability.
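A sketch of direction-based address selection as described above. The
enum, struct and parameter names are hypothetical and do not mirror the
driver's data structures: mem_addr stands for an sgl/cyclic buffer
address, and cfg_src_addr/cfg_dst_addr for the dma_slave_config fields.

```c
#include <assert.h>
#include <stdint.h>

enum demo_dir {
	DEMO_MEM_TO_DEV,
	DEMO_DEV_TO_MEM,
};

struct demo_xfer {
	uint64_t src;
	uint64_t dst;
};

/*
 * Pick source and destination from the transfer direction: the MEM side
 * is always the sgl/cyclic buffer, the DEV side always comes from the
 * slave config.
 */
struct demo_xfer demo_pick_addrs(enum demo_dir dir, uint64_t mem_addr,
				 uint64_t cfg_src_addr,
				 uint64_t cfg_dst_addr)
{
	struct demo_xfer xfer;

	if (dir == DEMO_MEM_TO_DEV) {
		xfer.src = mem_addr;
		xfer.dst = cfg_dst_addr;
	} else {
		xfer.src = cfg_src_addr;
		xfer.dst = mem_addr;
	}
	return xfer;
}
```

The choice depends only on the requested direction, not on whether the
channel happens to be a read or a write channel.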
When the system is shutting down, iscsid is not running so we will not get
a response to the ISCSI_ERR_INVALID_HOST error event. The system shutdown
will then hang waiting on userspace to remove the session.
This has libiscsi force the destruction of the session from the kernel when
iscsi_host_remove() is called from a driver's shutdown callout.
This fixes a regression added in qedi boot with commit c55cd560a09b ("scsi:
qedi: Fix host removal with running sessions") which made qedi use the
common session removal function that waits on userspace instead of rolling
its own kernel based removal.
Link: https://lore.kernel.org/r/20220616222738.5722-7-michael.christie@oracle.com Fixes: c55cd560a09b ("scsi: qedi: Fix host removal with running sessions") Tested-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
During qedi shutdown we need to stop the iSCSI layer from sending new nops
as pings and from responding to target ones, and make sure there are no
running connection cleanups. Commit c55cd560a09b ("scsi: qedi: Fix host
removal with running sessions") converted the driver to use the libiscsi
helper to drive session removal, so the above issues could be handled. The
problem is that during system shutdown iscsid will not be running, so when
we try to remove the root session we will hang waiting for userspace to
reply.
Add a helper that will drive the destruction of sessions like these during
system shutdown.
Link: https://lore.kernel.org/r/20220616222738.5722-5-michael.christie@oracle.com Tested-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
iscsi_if_stop_conn() is only called from the userspace interface but in a
subsequent commit we will want to call it from the kernel interface to
allow drivers like qedi to remove sessions from inside the kernel during
shutdown. This removes the iscsi_uevent code from iscsi_if_stop_conn() so we
can call it in a new helper.
Link: https://lore.kernel.org/r/20220616222738.5722-3-michael.christie@oracle.com Tested-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
There are sleep-in-atomic-context bugs when uploading device dump
data in mwifiex. The root cause is that dev_coredumpv() cannot
be used in atomic context, because it calls dev_set_name(), which
includes operations that may sleep. The call tree shows execution
paths that could lead to bugs:
This patch replaces the timer with delayed work and moves the operations
that may sleep into that delayed work in order to mitigate the bugs. It was
tested on a Marvell 88W8801 chip whose port is USB and whose firmware is
usb8801_uapsta.bin. The following is the result after using delayed
work to replace the timer.
[ 134.936453] usb 1-1: == mwifiex dump information to /sys/class/devcoredump start
[ 135.043344] usb 1-1: == mwifiex dump information to /sys/class/devcoredump end
The firmware of the 88W8897 PCIe+USB card sends those events very
unreliably: sometimes bluetooth is used together with 2.4 GHz wifi and no
COEX event comes in, and sometimes bluetooth is disabled but the
coexistence mode doesn't get disabled.
This means we sometimes end up capping the rx/tx window size while
bluetooth is not enabled anymore, artificially limiting wifi speeds even
though bluetooth is not being used.
Since we can't fix the firmware, let's just ignore those events on the
88W8897 device. From some Wireshark capture sessions it seems that the
Windows driver also doesn't change the rx/tx window sizes when bluetooth
gets enabled or disabled, so this is fairly consistent with the Windows
driver.
Don't set Accessed/Dirty bits for a struct page with PG_reserved set,
i.e. don't set A/D bits for the ZERO_PAGE. The ZERO_PAGE (or pages
depending on the architecture) should obviously never be written, and
similarly there's no point in marking it accessed as the page will never
be swapped out or reclaimed. The comment in page-flags.h is quite clear
that PG_reserved pages should be managed only by their owner, and
strictly following that mandate also simplifies KVM's logic.
Fixes: 148bda399795 ("KVM: fix overflow of zero page refcount with ksm running") Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220429010416.2788472-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
CPU1                                    CPU2
memunmap_pages
  percpu_ref_exit
    __percpu_ref_exit
      free_percpu(percpu_count);
        /* percpu_count is freed here! */
                                        get_dev_pagemap
                                          xa_load(&pgmap_array, PHYS_PFN(phys))
                                            /* pgmap still in the pgmap_array */
                                          percpu_ref_tryget_live(&pgmap->ref)
                                            if __ref_is_percpu
                                              /* __PERCPU_REF_ATOMIC_DEAD not set yet */
                                              this_cpu_inc(*percpu_count)
                                                /* access freed percpu_count here! */
      ref->percpu_count_ptr = __PERCPU_REF_ATOMIC_DEAD;
        /* too late... */
  pageunmap_range
To fix the issue, do percpu_ref_exit() after pgmap_array is emptied, so
we won't do percpu_ref_tryget_live() against a percpu_ref that is being
freed.
Link: https://lkml.kernel.org/r/20220609121305.2508-1-linmiaohe@huawei.com Fixes: ae01e73b9802 ("mm/memremap_pages: support multiple ranges per invocation") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If make_device_exclusive_range() fails or marks fewer pages for
exclusive access than required, the remaining entries of pages will be left
uninitialized, and dmirror_atomic_map() will then access those
uninitialized entries. To fix it, call dmirror_atomic_map() only if all
pages are marked for exclusive access (we will break if mapped is less
than required anyway) so we won't access those uninitialized entries of
pages.
Link: https://lkml.kernel.org/r/20220609130835.35110-1-linmiaohe@huawei.com Fixes: 42ee4beb23cd ("mm: selftests for exclusive device memory") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The original assert/deassert bit is BIT(0), but it is more reasonable to
change it to BIT(id % 32), which is based on the id.
This patch will not influence any previous driver because the reset is
only used for thermal. The id (MT8183_INFRACFG_AO_THERM_SW_RST) is 0.
Fixes: 992fce56fe57 ("clk: reset: Modify reset-controller driver") Signed-off-by: Rex-BC Chen <rex-bc.chen@mediatek.com> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Link: https://lore.kernel.org/r/20220523093346.28493-3-rex-bc.chen@mediatek.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
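A minimal sketch of the BIT(id % 32) computation from the reset-controller change above; `reset_bit()` is a hypothetical helper name, not a function from the driver. It illustrates why the thermal reset is unaffected: for id 0 the mask is the same as the old BIT(0).

```c
#include <assert.h>
#include <stdint.h>

/* Bit mask for a reset line within a 32-bit register: BIT(id % 32). */
static uint32_t reset_bit(unsigned int id)
{
	return UINT32_C(1) << (id % 32);
}
```

Any id that is a multiple of 32 maps to bit 0, and other ids map to distinct bits within the register.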
The last resume result exposing logic in cros_ec_sleep_event()
incorrectly requires S0ix support, which doesn't work on ARM based
systems where S0ix doesn't exist. That's because cros_ec_sleep_event()
only reports the last resume result when the EC indicates the last sleep
event was an S0ix resume. On ARM systems, the last sleep event is always
S3 resume, but the EC can still detect sleep hang events in case some
other part of the AP is blocking sleep.
Always expose the last resume result if the EC supports it so that this
works on all devices regardless of S0ix support. This fixes sleep hang
detection on ARM based chromebooks like Trogdor.
Cc: Rajat Jain <rajatja@chromium.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Hsin-Yi Wang <hsinyi@chromium.org> Cc: Tzung-Bi Shih <tzungbi@kernel.org> Reviewed-by: Guenter Roeck <groeck@chromium.org> Reviewed-by: Evan Green <evgreen@chromium.org> Fixes: e8a936107ac9 ("platform/chrome: Add support for v1 of host sleep event") Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org> Link: https://lore.kernel.org/r/20220614075726.2729987-1-swboyd@chromium.org Signed-off-by: Sasha Levin <sashal@kernel.org>
The scenario is this: the user loaded the driver but has not started the
authentication app. All sessions to the secure device will exhaust all login
attempts, fail, and stay in the deleted state. Some time later the app is
started. The driver will replenish the login retry count and trigger a delete
to prepare for secure login. After deletion, relogin is triggered.
For the session that is already deleted, the delete trigger is a no-op. If
none of the sessions trigger a relogin, no progress is made.
If the session is down and the local port continues to receive AUTH ELS
messages, the driver needs to send back LOGO so that the remote device
knows to tear down its session. Terminate and clean up the AUTH ELS
exchange followed by a passthrough LOGO.
Link: https://lore.kernel.org/r/20220608115849.16693-3-njavali@marvell.com Fixes: ce4d4dd912f3 ("scsi: qla2xxx: edif: Reject AUTH ELS on session down") Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit cdb51be7ebd5 ("License cleanup: add SPDX license identifier to
uapi header files with a license") added the correct SPDX identifier to
include/uapi/linux/netfilter/xt_IDLETIMER.h.
A subsequent commit removed it for no reason and reintroduced the UAPI
license incorrectness as the file is now missing the UAPI exception
again.
Add it back and remove the GPLv2 boilerplate while at it.
In the function tegra_xusb_powerdomain_init(),
dev_pm_domain_attach_by_name() may return NULL in some cases,
so an IS_ERR() check alone is not sufficient. Fix the check to handle
NULL as well.
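A minimal userspace sketch of why a NULL return slips past an IS_ERR()-style check. `is_err()` and `is_err_or_null()` imitate the kernel's IS_ERR() and IS_ERR_OR_NULL() helpers for illustration only; how the actual patch handles the NULL case is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

/* True only for pointers in the top MAX_ERRNO bytes of the address space. */
static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Also treats NULL as a failure, which is_err() does not. */
static bool is_err_or_null(const void *ptr)
{
	return ptr == NULL || is_err(ptr);
}
```

A caller that tests only `is_err()` would treat a NULL return as success and go on to dereference it.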
of_parse_phandle() returns a node pointer with its refcount
incremented, so we should call of_node_put() on it when it is no longer
needed. Add the missing of_node_put() to avoid a refcount leak.
Fixes: 0775db4aa75f ("USB: ohci-nxp: Use isp1301 driver") Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://lore.kernel.org/r/20220603141231.979-1-linmq006@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
of_find_compatible_node() returns a node pointer with its refcount
incremented, so we should call of_node_put() on it when done.
Add the missing of_node_put() to avoid a refcount leak.
Fixes: 2c9458645465 ("USB: powerpc: Workaround for the PPC440EPX USBH_23 errata [take 3]") Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://lore.kernel.org/r/20220602110849.58549-1-linmq006@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
of_find_node_by_path() returns a node pointer with its refcount incremented,
so we should call of_node_put() on it when it is no longer needed.
Add the missing of_node_put() to avoid a refcount leak.
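The get/put pairing behind these of_node_put() fixes can be sketched in userspace as follows. `struct node`, `node_get()`, `node_put()` and `probe()` are hypothetical miniatures, not the kernel's device-tree API; the point is only that every exit path after a successful lookup drops the reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a refcounted device-tree node. */
struct node {
	int refcount;
};

/* Stands in for an of_find_*() lookup: returns the node with a reference held. */
static struct node *node_get(struct node *np)
{
	if (np)
		np->refcount++;
	return np;
}

/* Stands in for of_node_put(): drops one reference. */
static void node_put(struct node *np)
{
	if (np)
		np->refcount--;
}

/* Look the node up, use it, and drop the reference on every exit path. */
static int probe(struct node *tree, int fail_midway)
{
	struct node *np = node_get(tree);
	int ret = 0;

	if (!np)
		return -1;
	if (fail_midway) {
		ret = -1;
		goto out_put; /* the error path must not skip the put */
	}
	/* ... use np ... */
out_put:
	node_put(np);
	return ret;
}
```

If the error path returned directly instead of jumping to `out_put`, the refcount would stay one higher after each failed call.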
of_get_child_by_name() returns a node pointer with its refcount
incremented, so we should call of_node_put() on it when it is no longer
needed. Add the missing of_node_put() to avoid a refcount leak.
Fixes: f0e4c7650b8f ("mtd: partitions: redboot: seek fis-index-block in the right node") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220526110652.64849-1-linmq006@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
There is a deadlock between sm_release() and sm_cache_flush_work(),
which is a work item. The cancel_work_sync() in sm_release() will
not return until sm_cache_flush_work() has finished. If we hold the
mutex while using cancel_work_sync() to wait for the work item to
finish, and the work item also needs the same mutex, then
sm_release() will be blocked forever. The race condition is
shown below:
Smatch warnings:
drivers/hid/hid-cp2112.c:793 cp2112_xfer() error: __memcpy()
'data->block[1]' too small (33 vs 255)
drivers/hid/hid-cp2112.c:793 cp2112_xfer() error: __memcpy() 'buf' too
small (64 vs 255)
The 'read_length' variable is taken from 'data->block[0]', which comes
from the user, so it can take any value between 0 and 255. Add an
upper bound to 'read_length' to prevent a buffer overflow in
memcpy().
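A minimal sketch of the kind of bound check described above, assuming a 34-byte block buffer (one length byte, up to 32 data bytes, one spare byte). `copy_block()` and `BLOCK_MAX` are hypothetical names, and the error value is illustrative; the driver's actual check and return code may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_MAX 32 /* same value as the kernel's I2C_SMBUS_BLOCK_MAX */

/*
 * Copy a block-read reply into block[], which holds a length byte,
 * up to BLOCK_MAX data bytes, and one spare byte (BLOCK_MAX + 2 in total).
 * Returns 0 on success or -1 if the reported length is out of range.
 */
static int copy_block(uint8_t block[BLOCK_MAX + 2], const uint8_t *buf,
		      size_t read_length)
{
	if (read_length > BLOCK_MAX)
		return -1; /* reject instead of overflowing block[] */

	block[0] = (uint8_t)read_length;
	memcpy(&block[1], buf, read_length);
	return 0;
}
```

Without the check, a user-supplied length of 255 would write far past the 33 bytes available after `block[0]`, which is the overflow Smatch reports.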
pm_runtime_enable() decrements the device's power disable depth. If
dw_pcie_ep_init() fails, we should call pm_runtime_disable() to balance
the earlier pm_runtime_enable().
Add the missing pm_runtime_disable() to tegra_pcie_config_ep().
Fixes: ae0487c2a0af ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Link: https://lore.kernel.org/r/20220602031910.55859-1-linmq006@gmail.com Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
of_get_next_child() returns a node pointer with refcount incremented, so we
should use of_node_put() on it when we don't need it anymore.
mc_pcie_init_irq_domains() only calls of_node_put() in the normal path,
missing it in some error paths. Add missing of_node_put() to avoid
refcount leak.
If NRIPS is supported in hardware but disabled in KVM, set next_rip to
the next RIP when advancing RIP as part of emulating INT3 injection.
There is no flag to tell the CPU that KVM isn't using next_rip, and so
leaving next_rip as is will result in the CPU pushing garbage
onto the stack when vectoring the injected event.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Fixes: 397076fd8c2e ("KVM: SVM: Emulate nRIP feature when reinjecting INT3") Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <cd328309a3b88604daa2359ad56f36cb565ce2d4.1651440202.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Unwind the RIP advancement done by svm_queue_exception() when injecting
an INT3 ultimately "fails" due to the CPU encountering a VM-Exit while
vectoring the injected event, even if the exception reported by the CPU
isn't the same event that was injected. If vectoring INT3 encounters an
exception, e.g. #NP, and vectoring the #NP encounters an intercepted
exception, e.g. #PF when KVM is using shadow paging, then the #NP will
be reported as the event that was in-progress.
Note, this is still imperfect, as it will get a false positive if the
INT3 is cleanly injected, no VM-Exit occurs before the IRET from the INT3
handler in the guest, the instruction following the INT3 generates an
exception (directly or indirectly), _and_ vectoring that exception
encounters an exception that is intercepted by KVM. The false positives
could theoretically be solved by further analyzing the vectoring event,
e.g. by comparing the error code against the expected error code were an
exception to occur when vectoring the original injected exception, but
SVM without NRIPS is a complete disaster, trying to make it 100% correct
is a waste of time.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Fixes: 397076fd8c2e ("KVM: SVM: Emulate nRIP feature when reinjecting INT3") Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <450133cf0a026cb9825a2ff55d02cb136a1cb111.1651440202.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
After the initiator has burned up all login retries, the target authentication
application begins to run. This triggers a link bounce on the target side.
The initiator will attempt another login. Due to N2N, the PRLI [nvme | fcp]
can fail because of the mode mismatch with the target. This patch adds a few
more login retries to revive the connection.
Link: https://lore.kernel.org/r/20220607044627.19563-11-njavali@marvell.com Fixes: 03d1f5917d23 ("scsi: qla2xxx: edif: Add N2N support for EDIF") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
User failed to see disk via n2n topology. Driver used up all login retries
before authentication application started. When authentication application
started, driver did not have enough login retries to connect securely. On
app_start, driver will reset the login retry attempt count.
Link: https://lore.kernel.org/r/20220607044627.19563-10-njavali@marvell.com Fixes: 03d1f5917d23 ("scsi: qla2xxx: edif: Add N2N support for EDIF") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Relating to EDIF, when sending IKE message, updating key or deleting key,
driver can encounter IOCB queue full. Add additional retries to reduce
higher level recovery.