include/linux/firmware/intel/stratix10-svc-client.h:55: warning: This comment
starts with '/**', but isn't a kernel-doc comment. Refer
Documentation/doc-guide/kernel-doc.rst
* Flag bit for COMMAND_RECONFIG
Merge tag 'soundwire-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire into char-misc-next
Vinod writes:
"soundwire updates for 5.20-rc1
- Core: solve the driver bind/unbind problem and remove ops pointer
- intel: runtime pm updates
- qcom: audio clock gating updates and device status checks"
* tag 'soundwire-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: qcom: Enable software clock gating requirement flag
soundwire: qcom: Check device status before reading devid
soundwire: qcom: Add flag for software clock gating check
soundwire: qcom: Add support for controlling audio CGCR from HLOS
soundwire: intel: use pm_runtime_resume() on component probe
soundwire: peripheral: remove useless ops pointer
soundwire: revisit driver bind/unbind and callbacks
soundwire: bus_type: fix remove and shutdown support
This driver doesn't need to access I/O ports directly via inb()/outb()
and friends. This patch abstracts such access by calling ioport_map()
to enable the use of more typical ioread8()/iowrite8() I/O memory
accessor calls.
firmware: stratix10-svc: extend svc to support RSU feature
Extend Intel Stratix10 service layer driver to support new RSU
DCMF status reporting.
The status of each DCMF is reported. The currently used DCMF is used as
reference, while the other three are compared against it to determine if
they are corrupted.
firmware: stratix10-rsu: extend RSU driver to get DCMF status
Extend RSU driver to get DCMF status.
The status of each DCMF is reported. The currently used DCMF is used as
reference, while the other three are compared against it to determine if
they are corrupted.
Ang Tien Sung [Mon, 11 Jul 2022 22:31:37 +0000 (17:31 -0500)]
firmware: stratix10-svc: add new FCS commands
Extend the FPGA svc driver to support 6 new FPGA Crypto
Service (FCS) commands.
We are adding FCS SDOS data encryption and decryption,
random number generator, image validation request,
reading the data provision and certificate validation.
Ang Tien Sung [Mon, 11 Jul 2022 22:31:36 +0000 (17:31 -0500)]
firmware: stratix10-svc: add FCS polling command
Introduce a new SMC command INTEL_SIP_SMC_FUNCID_SERVICE_COMPLETED
that polls if a previous asynchronous command was completed. This
SMC command is used by the new FPGA Crypto Service (FCS).
A basic example is that the FCS sends an AES data encryption
call to the secure device manager (SDM) and waits for the completion
of the operation by continuously polling the results with the new
command.
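The polling flow described above can be sketched in userspace as follows. This is only an illustration: poll_until_done(), mock_completed() and the status enum are hypothetical names, and the callback stands in for the INTEL_SIP_SMC_FUNCID_SERVICE_COMPLETED call, whose real arguments and return values are not shown in this log.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes for the polling loop. */
enum poll_status { POLL_DONE, POLL_TIMEOUT };

/* Hypothetical callback type standing in for the "service completed" call. */
typedef bool (*completed_fn)(void *ctx);

/* Poll until completed() reports success or max_polls attempts are used up. */
enum poll_status poll_until_done(completed_fn completed, void *ctx,
                                 unsigned int max_polls,
                                 unsigned int *polls_used)
{
    for (unsigned int i = 1; i <= max_polls; i++) {
        if (completed(ctx)) {
            if (polls_used)
                *polls_used = i;
            return POLL_DONE;
        }
        /* A real driver would sleep or back off here between polls. */
    }
    if (polls_used)
        *polls_used = max_polls;
    return POLL_TIMEOUT;
}

/* Mock firmware: reports completion once the countdown reaches zero. */
bool mock_completed(void *ctx)
{
    unsigned int *remaining = ctx;

    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

Bounding the number of polls is a design choice of this sketch; it keeps a lost completion from turning into an endless loop.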
Ang Tien Sung [Mon, 11 Jul 2022 22:31:35 +0000 (17:31 -0500)]
firmware: stratix10-svc: Add support for FCS
Extend the Intel service layer driver to support FPGA Crypto Service (FCS)
features on Intel SoC platforms. Add an additional channel and the FCS
platform driver ("intel_fcs") as part of the probe method.
The FCS driver uses the service layer driver to send crypto operation
commands to the secure device manager (SDM) on Intel SoC platforms
Stratix10 and Agilex.
Sebastian Ene [Mon, 11 Jul 2022 08:17:20 +0000 (08:17 +0000)]
misc: Add a mechanism to detect stalls on guest vCPUs
This driver creates per-cpu hrtimers which are required to do the
periodic 'pet' operation. On a conventional watchdog-core driver, the
userspace is responsible for delivering the 'pet' events by writing to
the particular /dev/watchdogN node. In this case we require a strong
thread affinity to be able to account for lost time on a per-vCPU basis.
This part of the driver is the 'frontend' which is responsible for
delivering the periodic 'pet' events, configuring the virtual peripheral
and listening for cpu hotplug events. The other part of the driver is
an emulated MMIO device which is part of the KVM virtual machine
monitor and this part accounts for lost time by looking at the
/proc/{}/task/{}/stat entries.
The vCPU stall detection mechanism allows configuring the expiration
duration and the internal counter clock frequency measured in Hz.
Add these properties in the schema.
While this is a memory mapped virtual device, it is expected to be loaded
when the DT contains the compatible: "qemu,vcpu-stall-detector" node.
In a protected VM we trust the generated DT nodes and we don't rely on
the host to present the hardware peripherals.
When building with Clang we encounter the following warning:
| drivers/misc/mei/hw-me.c:564:44: error: format specifies type 'unsigned
| short' but the argument has type 'int' [-Werror,-Wformat]
| dev_dbg(dev->dev, "empty slots = %hu.\n", empty_slots);
The format specifier used is `%hu` which specifies an unsigned short,
however, empty_slots is an int -- hence the warning.
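A minimal userspace sketch of the mismatch, assuming the fix is to use a specifier that matches the int argument (the log entry does not show the final patch). format_empty_slots() is a hypothetical helper that uses snprintf() in place of dev_dbg().

```c
#include <stdio.h>

/* Format an int slot count; "%d" matches the int argument, unlike "%hu". */
int format_empty_slots(char *buf, size_t len, int empty_slots)
{
    return snprintf(buf, len, "empty slots = %d.\n", empty_slots);
}
```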
Dan Carpenter [Fri, 8 Jul 2022 13:46:38 +0000 (16:46 +0300)]
eeprom: idt_89hpesx: uninitialized data in idt_dbgfs_csr_write()
The simple_write_to_buffer() function will return positive/success if it
is able to write a single byte anywhere within the buffer. However that
potentially leaves a lot of the buffer uninitialized.
In this code it's better to return 0 if the offset is non-zero. This
code is not written to support partial writes. And then return -EFAULT
if the buffer is not completely initialized.
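The two checks can be modeled in userspace roughly as below. model_write_to_buffer() is a simplified stand-in for simple_write_to_buffer() (it uses memcpy() instead of a user copy), and write_whole_buffer() is a hypothetical caller, so this is a sketch of the idea rather than the driver code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for simple_write_to_buffer(): copies as much of
 * 'from' as fits at *ppos and returns the number of bytes copied. */
long model_write_to_buffer(void *to, size_t available, long *ppos,
                           const void *from, size_t count)
{
    long pos = *ppos;

    if (pos < 0)
        return -EINVAL;
    if ((size_t)pos >= available || count == 0)
        return 0;
    if (count > available - (size_t)pos)
        count = available - (size_t)pos;
    memcpy((char *)to + pos, from, count);
    *ppos = pos + (long)count;
    return (long)count;
}

/* Caller-side checks in the spirit of the fix: no partial writes. */
long write_whole_buffer(void *buf, size_t buf_len, long *ppos,
                        const void *from, size_t count)
{
    long ret;

    if (*ppos != 0)
        return 0;            /* non-zero offset: nothing is written */

    ret = model_write_to_buffer(buf, buf_len, ppos, from, count);
    if (ret < 0)
        return ret;
    if ((size_t)ret != buf_len)
        return -EFAULT;      /* buffer only partly initialized */
    return ret;
}
```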
Merge tag 'iio-for-5.20a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next
Jonathan writes:
IIO new device support, features and minor fixes for 5.20
Several on-running cleanup efforts dominate this time, plus the DMA
safety alignment issue identified due to improved understanding of
the restrictions as a result of Catalin Marinas' efforts in that area.
One immutable branch in here due to MFD and SPMI elements needed for
the qcom-rradc driver.
Device support
* bmi088
- Add support for bmi085 (accelerometer part of IMU)
- Add support for bmi090l (accelerometer part of IMU)
* mcp4922
- Add support for single channel device MCP4921
* rzg2l-adc
- Add compatible and minor tweaks to support RZ/G2UL ADC
* sca3300
- Add support for scl3300 including refactoring driver to support
multiple device types and cleanup noticed whilst working on driver.
* spmi-rradc
- New driver for Qualcomm SPMI Round Robin ADC including necessary
additional utility functions in SPMI core and related MFD driver.
* ti-dac5571
- Add compatible for DAC121C081 which is very similar to existing parts.
Features
* core
- Warn on iio_trigger_get() on an unregistered IIO trigger.
* bma400
- Triggered buffer support
- Activity and step counting
- Misc driver improvements such as devm and header ordering
* cm32181
- Add PM support.
* cros_ec
- Sensor location support
* sx9324
- Add precharge resistor setting
- Add internal compensation resistor setting
- Add CS idle/sleep mode.
* sx9360
- Add precharge resistor setting
* vl53l0x
- Handle reset GPIO, regulator and relax handling of irq type.
Cleanup and minor fixes:
Treewide changes
- Cleanup of error handling in remove functions in many drivers.
- Update dt-binding maintainers for a number of ADI bindings.
- Several sets of conversion of drivers from device tree specific to
generic device properties. Includes fixing up various related
header and Kconfig issues.
- Drop include of of.h from iio.h and fix up drivers that need to include
it directly.
- More moves of clusters of drivers into appropriate IIO_XXX namespaces.
- Tree wide fix of a long running bug around DMA safety requirements.
IIO was using __cacheline_aligned to pad iio_priv() structures. This
worked for a long time by coincidence, but correct alignment is
ARCH_KMALLOC_MINALIGN. As there is activity around this area, introduce
an IIO local IIO_DMA_MINALIGN to allow for changing it in one place rather
than every driver in future. Note, there have been no reports of this
bug in the wild, and it may not happen on any platforms supported by
upstream, so no rush to backport these fixes.
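The DMA alignment point can be illustrated with a small C11 layout sketch. DEMO_DMA_MINALIGN and struct demo_state are hypothetical; the value 128 is an assumption for illustration only, since the real IIO_DMA_MINALIGN follows ARCH_KMALLOC_MINALIGN and depends on the architecture.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration only; the kernel's IIO_DMA_MINALIGN
 * follows ARCH_KMALLOC_MINALIGN, which is architecture dependent. */
#define DEMO_DMA_MINALIGN 128

/* Hypothetical driver state: the DMA buffer starts on its own aligned
 * boundary, so it never shares that aligned region with the fields
 * placed before it. */
struct demo_state {
    int lock_word;
    uint8_t mode;
    alignas(DEMO_DMA_MINALIGN) uint8_t rx[4];
};
```

Keeping the alignment behind a single macro mirrors the stated motivation: the value can later be changed in one place rather than in every driver.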
Other cleanup
* core
- Switch to ida_alloc()/free()
- Drop unused iio_get_time_res()
- Octal permissions and DEVICE_ATTR_* macros.
- Cleanup bare unsigned usage.
* MAINTAINERS
- Add include/dt-bindings/iio/ to the main IIO entry.
* ad5380
- Comment syntax fix.
* ad74413r
- Call to for_each_set_bit_from(), with from value as 0 replaced.
* ad7768-1
- Drop explicit setting of INDIO_BUFFER_TRIGGERED as now done by the core.
* adxl345
- Fix wrong address in dt-binding example.
* adxl367
- Drop extra update of FIFO watermark.
* at91-sama5d2
- Limit requested watermark to the hwfifo size.
* bmg160, bme680
- Typos
* cio-dac
- Switch to iomap rather than direct use of ioports
* kxsd9
- Replace CONFIG_PM guards with new PM macros that let the compiler
cleanly remove the unused code and structures when !CONFIG_PM
* lsm6dsx
- Use new pm_sleep_ptr() and EXPORT_SIMPLE_DEV_PM_OPS(). Then move
to Namespace.
* meson_saradc - general cleanup.
- Avoid attaching resources to iio_dev->dev
- Use same struct device for all error messages
- Convert to dev_err_probe() and use local struct device *dev to
reduce code complexity.
- Use devm_clk_get_optional() instead of hand rolling.
- Use regmap_read_poll_timeout() instead of hand rolling.
* mma7660
- Drop ACPI_PTR() use that is unhelpful.
* mpu3050
- Stop exporting symbols not used outside of module
- Switch to new DEFINE_RUNTIME_DEV_PM_OPS() macro and move to Namespace.
* ping
- Typo fix
* qcom-spmi-rradc
- Typo fix
* sc27xx
- Convert to generic struct u32_fract
* srf08
- Drop a redundant check on !val
* st_lsm6dsx
- Limit the requested watermark to the hwfifo size.
* stm32-adc
- Use generic_handle_domain_irq() instead of opencoding.
- Fix handling of ADC disable.
* stm32-dac
- Use str_enabled_disable() instead of open coding.
* stx104
- Switch to iomap rather than direct use of ioports
* tsc2046
- Drop explicit setting of INDIO_BUFFER_TRIGGERED as now done by the core.
* tsl2563
- Replace flush_scheduled_work() with cancel_delayed_work_sync()
- Replace cancel_delayed_work() with cancel_delayed_work_sync()
* vl53l0x
- Make the VDD regulator optional by allowing a dummy regulator.
* tag 'iio-for-5.20a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (244 commits)
iio: adc: xilinx-xadc: Drop duplicate NULL check in xadc_parse_dt()
iio: adc: xilinx-xadc: Make use of device properties
iio: light: cm32181: Add PM support
iio: adc: ad778-1: do not explicity set INDIO_BUFFER_TRIGGERED mode
iio: adc: ti-tsc2046: do not explicity set INDIO_BUFFER_TRIGGERED mode
iio: adc: stm32-adc: disable adc before calibration
iio: adc: stm32-adc: make safe adc disable
iio: dac: ad5380: align '*' each line and drop unneeded blank line
iio: adc: qcom-spmi-rradc: Fix spelling mistake "coherrency" -> "coherency"
iio: Don't use bare "unsigned"
dt-bindings: iio: dac: mcp4922: expand for mcp4921 support
iio: dac: mcp4922: add support to mcp4921
iio: chemical: sps30: Move symbol exports into IIO_SPS30 namespace
iio: pressure: bmp280: Move symbol exports to IIO_BMP280 namespace
iio: imu: bmi160: Move exported symbols to IIO_BMI160 namespace
iio: adc: stm32-adc: Use generic_handle_domain_irq()
proximity: vl53l0x: Make VDD regulator actually optional
MAINTAINERS: add include/dt-bindings/iio to IIO SUBSYSTEM AND DRIVERS
dt-bindings: iio/accel: Fix adi,adxl345/6 example I2C address
iio: gyro: bmg160: Fix typo in comment
...
Merge tag 'misc-habanalabs-next-2022-07-12' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into char-misc-next
Oded writes:
This tag contains habanalabs driver changes for v5.20:
- Add Gaudi2 ASIC support. All the features required for Gaudi2 are included
in this tag (except the networking aspect).
- Add more events to the eventfd support in the driver. With the new code, we
expose three events that the user can register to get notification about them.
- Refactor the soft reset code and rename it to compute reset to better
reflect the actual reset done in new ASICs
- Change the way Gaudi2 triggers an MSI-X interrupt due to h/w bug.
- Improve the code of the debugfs node that scrubs the device's memory.
- Add mechanism for better compatibility with older f/w versions
- Cleanup kernel log prints by moving some prints to debug and removing others.
- Many small bug fixes and minor changes.
* tag 'misc-habanalabs-next-2022-07-12' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux: (88 commits)
habanalabs: move h/w dirty message to debug
habanalabs: rename soft reset to compute reset
habanalabs: add status of reset after device release
habanalabs: fix update of is_in_soft_reset
habanalabs: expose only valid debugfs nodes
habanalabs/gaudi2: map virtual MSI-X doorbell memory for user
habanalabs/gaudi2: modify decoder to use virtual MSI-X doorbell
habanalabs/gaudi2: modify CS completion CQ to use virtual MSI-X doorbell
habanalabs/gaudi2: replace defines for reserved sob/mob with enums
habanalabs/gaudi2: configure virtual MSI-X doorbell interface
habanalabs: add a value field to hl_fw_send_pci_access_msg()
habanalabs: fixes to the poll-timeout macros
habanalabs/gaudi2: use DIV_ROUND_UP_SECTOR_T instead of roundup
habanalabs: initialize variable explicitly
habanalabs: Use the bitmap API to allocate bitmaps
habanalabs/gaudi2: remove unused defines
habanalabs: make sure variable is set before used
habanalabs: don't declare tmp twice in same function
habanalabs: do not set max power on a secured device
habanalabs/gaudi2: SM mask can only be 8-bit
...
H/W being dirty during initialization is completely expected in case
f/w tools are used before loading the driver. As it is not an error,
and as it doesn't give any meaningful information to the user,
there is no point in printing it.
reset_info.is_in_soft_reset should be updated both before in_reset
and inside the spin lock of the reset info structure.
The reasons are:
- When we are inside soft reset, it implies we are in reset. Therefore,
if someone checks if we are in soft reset, he can deduce we are
in reset, while the opposite is not correct and might be misleading.
- Both these flags are changed together so they must be changed
inside the reset info spinlock.
Tomer Tayar [Thu, 30 Jun 2022 19:05:51 +0000 (22:05 +0300)]
habanalabs/gaudi2: map virtual MSI-X doorbell memory for user
Upon the initialization of a user context, map the host memory page of
the virtual MSI-X doorbell in the device MMU.
A reserved VA is used for this purpose, so user can use it directly
without any allocation/map operation.
Tomer Tayar [Thu, 30 Jun 2022 08:22:54 +0000 (11:22 +0300)]
habanalabs/gaudi2: modify decoder to use virtual MSI-X doorbell
Modify the decoder wrapper blocks to generate interrupts using the
virtual MSI-X doorbell.
As a decoder wrapper block cannot write directly to HBW upon completion,
it writes instead to SOB which is monitored by a master monitor.
When resolved, this monitor will be the one to actually write to the
virtual MSI-X doorbell.
Tomer Tayar [Wed, 29 Jun 2022 16:20:38 +0000 (19:20 +0300)]
habanalabs/gaudi2: replace defines for reserved sob/mob with enums
Following patches are going to add more reserved sync objects and
monitors.
To make the counting of these reserved resources simpler, replace the
existing RESERVED_* defines with enumerations.
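A short sketch of why enumerations simplify the counting. All names below are hypothetical (the log entry does not list the real reserved objects), and the pool size of 64 is an assumption.

```c
/* Hypothetical names; the real reserved sync objects and monitors are
 * not listed in the log entry. */
enum demo_reserved_sob {
    DEMO_RESERVED_SOB_CS_COMPLETION,
    DEMO_RESERVED_SOB_DECODER,
    DEMO_RESERVED_SOB_NUMBER        /* always last: count of entries */
};

#define DEMO_TOTAL_SOBS 64          /* assumed pool size */

/* First sync object index left for general use after the reserved ones. */
int demo_first_available_sob(void)
{
    return DEMO_RESERVED_SOB_NUMBER;
}

/* Number of sync objects that remain after the reserved ones. */
int demo_available_sobs(void)
{
    return DEMO_TOTAL_SOBS - DEMO_RESERVED_SOB_NUMBER;
}
```

Adding a new reserved entry before the terminator updates both helpers without touching any hand-maintained count.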
Due to a watchdog timer in the LBW path, writes to the MSI-X doorbell
can return sporadic error responses.
To work-around this issue, a virtual MSI-X doorbell on the HBW path is
configured, using the MSI-X AXI slave interface in the PCIe controller.
Upon an access to a configured HBW host address, the controller will
generate MSI-X interrupt instead of treating the access as regular host
memory access.
This patch allocates the dedicated host memory page and communicates the
address to F/W, so it will configure the relevant address match
registers in the controller, and will use this address to generate MSI-X
interrupts for F/W events.
Following patches will handle other initiators in the device, to move
them to use the virtual MSI-X doorbell.
habanalabs: add a value field to hl_fw_send_pci_access_msg()
For gaudi2 we need to send a value to F/W as part of the
PCI_ACCESS packet.
As a preparation, modify hl_fw_send_pci_access_msg() to have a 'value'
field.
- use conventional internal macro variables (double underscore prefix)
- adjust address casting
- on register poll using ELBI use ELBI read rather than BAR read on
error condition
- remove unused macro
timestamp could be unset in both _hl_interrupt_wait_ioctl() and
_hl_interrupt_wait_ioctl_user_addr() so it is better to explicitly
initialize it to 0 when declaring it.
Ofir Bitton [Tue, 28 Jun 2022 15:34:58 +0000 (18:34 +0300)]
habanalabs: add support for common decoder interrupts
User application should be able to get notification for any decoder
completion. Hence, we introduce a new interface in which a user
can wait for all current decoder pending interrupts.
Ohad Sharabi [Tue, 28 Jun 2022 09:09:21 +0000 (12:09 +0300)]
habanalabs: wait for preboot ready after hard reset
Currently we are not waiting for preboot ready after hard reset.
This leads to a race in which COMMs protocol begins but will get no
response from the f/w.
Ofir Bitton [Tue, 28 Jun 2022 05:34:28 +0000 (08:34 +0300)]
habanalabs/gaudi2: reset device upon critical ECC event
Correctable ECC events are not fatal, but as they accumulate, the f/w
can decide that a hard-reset is required. This indication is
propagated to the host using the existing ECC event interface.
Oded Gabbay [Mon, 27 Jun 2022 12:05:28 +0000 (15:05 +0300)]
habanalabs: add gaudi2 wait-for-CS support
In Gaudi2 we moved to a different wait for command submission
completion model. Instead of receiving interrupt only on external
queues, we use the device's sync manager to notify us when the
entire command submission finishes.
This enables us to remove the categorization of queues to external
and internal, and treat each queue equally, without the need to parse
and patch any command buffer.
This change also requires refactoring to the IRQ handling of
CS completions.
Benjamin Dotan [Sun, 26 Jun 2022 18:35:07 +0000 (21:35 +0300)]
habanalabs/gaudi2: add gaudi2 profiler module
Add the Gaudi2 code to initialize the ASIC's profiler. The profiler
receives its initialization values from the user, same as in Gaudi2,
but the code to initialize is in the driver because the configuration
space of the device is not directly exposed to the user.
Ofir Bitton [Sun, 26 Jun 2022 18:24:50 +0000 (21:24 +0300)]
habanalabs: add generic security module
As the ASICs become more complex and have many more registers, we need
a better way to configure the security properties.
As a reminder, we have two dedicated mechanisms for security:
Range Registers and Protection bits. Those mechanisms protect sensitive
memory and configuration areas inside the device.
The generic module handles the low-level part of the configuration,
because the configuration mechanism is identical in all ASICs. The
difference is the address ranges and register names.
Any ASIC that uses this block should first block all the register
blocks in the ASIC. Then, it should open only the registers that
need to be accessed by the user (This is opposed to Goya and Gaudi,
where we blocked only what should not be accessed by the user).
The module contains several functions, to unblock single register,
multiple registers, entire blocks, ranges, ranges with mask.
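The "block everything, then open what the user needs" policy can be sketched with a one-bit-per-register model. This is not the driver's range-register or protection-bit programming; struct demo_block and the demo_* helpers are hypothetical, and 64 registers per block is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NUM_REGS 64  /* assumed number of registers in one block */

/* One bit per register: 1 = user access allowed, 0 = blocked. */
struct demo_block {
    uint64_t allowed;
};

/* Step 1: block every register in the block. */
void demo_block_all(struct demo_block *b)
{
    b->allowed = 0;
}

/* Step 2a: open a single register. */
void demo_unblock_reg(struct demo_block *b, unsigned int reg)
{
    if (reg < DEMO_NUM_REGS)
        b->allowed |= UINT64_C(1) << reg;
}

/* Step 2b: open registers first..last inclusive. */
void demo_unblock_range(struct demo_block *b, unsigned int first,
                        unsigned int last)
{
    for (unsigned int r = first; r <= last && r < DEMO_NUM_REGS; r++)
        demo_unblock_reg(b, r);
}

/* Query used by the checks below. */
bool demo_user_may_access(const struct demo_block *b, unsigned int reg)
{
    return reg < DEMO_NUM_REGS && ((b->allowed >> reg) & 1u);
}
```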
Oded Gabbay [Fri, 24 Jun 2022 13:47:13 +0000 (16:47 +0300)]
habanalabs: add unsupported functions
There are a number of new ASIC-specific functions that were added
for Gaudi2. To make the common code work, we need to define empty
implementations of those functions for Goya and Gaudi.
Some functions will return error if called with Goya/Gaudi.
Oded Gabbay [Sun, 26 Jun 2022 15:20:03 +0000 (18:20 +0300)]
habanalabs: add gaudi2 asic-specific code
Add the ASIC-specific code for Gaudi2. Supply (almost) all of the
function callbacks that the driver's common code needs to initialize,
finalize and submit workloads to the Gaudi2 ASIC.
It also contains the code to initialize the F/W of the Gaudi2 ASIC
and to receive events from the F/W.
It contains a new debugfs entry to dump razwi events. razwi is a case
where the device's engines create a transaction that reaches an
invalid destination.
Oded Gabbay [Fri, 24 Jun 2022 10:38:57 +0000 (13:38 +0300)]
uapi: habanalabs: add gaudi2 defines
Add the new defines for GAUDI2 uapi interface.
It includes the following:
1. Enums of engines and PLLs.
2. New information in the info IOCTL that is retrieved by the driver.
3. Update comments regarding the CB/CS/wait for CS ioctls.
4. New fields in the debug IOCTL for configuring the profiler for
Gaudi2.
There is no new IOCTL.
Some of the changes are also relevant for Greco (which will be
upstreamed later this year). Whenever it says "Greco and onwards",
it means it is also for Gaudi2.
Add the relevant GAUDI2 ASIC registers header files. These files are
generated automatically from a tool maintained by the VLSI engineers.
There are more files which are not upstreamed because only very few
defines from those files are used in the driver. For those files, I
copied the relevant defines into gaudi2_regs.h and gaudi2_masks.h, to
reduce the size of this patch.
Tomer Tayar [Fri, 24 Jun 2022 10:05:23 +0000 (13:05 +0300)]
habanalabs: remove dead code from free_device_memory()
free_device_memory() ends with if and else, each has a return statement,
followed by another return statement that can never be reached.
Restructure the function and remove this dead code.
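The shape of such a cleanup, shown on a hypothetical stand-in (demo_free() is not the driver function, and its body is invented for illustration):

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the tail of a function like
 * free_device_memory().
 *
 * Before: if (found) { ...; return 0; } else { ...; return -EINVAL; }
 *         return 0;    <- can never be reached
 *
 * After: the error path returns early and a single return remains. */
int demo_free(bool handle_found, int *freed_count)
{
    if (!handle_found)
        return -EINVAL;

    (*freed_count)++;
    return 0;
}
```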
Ohad Sharabi [Sun, 12 Jun 2022 12:00:29 +0000 (15:00 +0300)]
habanalabs: refactor dma asic-specific functions
This is a pre-requisite patch for adding tracepoints to the DMA memory
operations (allocation/free) in the driver.
The main purpose is to be able to cross data with the map operations and
determine whether memory violation occurred, for example free DMA
allocation before unmapping it from device memory.
To achieve this the DMA alloc/free code flows were refactored so that a
single DMA tracepoint will catch many flows.
Dafna Hirschfeld [Thu, 12 May 2022 12:20:55 +0000 (15:20 +0300)]
habanalabs: move call to scrub_device_mem after ctx_fini
In future ASICs, it would be possible to have a non-idle
device when context is released. We thus need to postpone the
scrubbing. Postpone it to hpriv release if reset is not executed
or to device late init if reset is executed.
Yuri Nudelman [Wed, 22 Jun 2022 09:52:34 +0000 (12:52 +0300)]
habanalabs/gaudi: fix a race condition causing DMAR error
There is a rare race condition in CB completion mechanism, that can
occur under a very high pressure of command submissions.
The preconditions for this to happen are:
1. There should be enough command submissions for the pre-allocated
patched CB pool to run out of commands. At this stage we start
allocating new patched CBs as they arrive.
2. CB size has to be exactly (128*n + 104)B for some n, i.e. 24B below
a cache line end.
The flow:
1. Two command buffers being completed on different streams, at the
same time. Denote those CB1 and CB2.
2. Each command buffer is injected with two messages, 16B each - one
for a HBW update of the completion queue, another to raise
interrupt.
3. Assume CB1 updated the completion queue and raised the interrupt.
4. Assume CB2 updated the completion queue but did not raise the
interrupt yet.
5. The host receives the interrupt. It goes over the completion queue
and sees two completions - CB1 and CB2. Release them both.
6. CB2 performs the last command. The problem is that the last command
is split between 2 cache lines. So to read the last 8B of the last
command, it has to access the host again. Problem is - CB2 is
already released. This causes a DMAR error.
The solution to this problem is simply to make sure the last two
commands in the CB are always in the same cache line, using NOP padding.
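The arithmetic behind the (128*n + 104) condition can be checked with a small sketch. It assumes a 128B cache line (implied by the condition above) and the two 16B messages appended at the end of the CB; the helper names are hypothetical, and the padding rule shown is one way to satisfy the stated goal, not necessarily the driver's exact logic.

```c
#include <stddef.h>

#define DEMO_CACHE_LINE 128u  /* implied by the (128*n + 104) condition */
#define DEMO_MSG_SIZE    16u  /* each injected message is 16B */
#define DEMO_TAIL_SIZE  (2u * DEMO_MSG_SIZE)

/* Nonzero if the second (last) message would straddle two cache lines
 * when both messages are appended directly after cb_size bytes. */
int demo_last_msg_is_split(size_t cb_size)
{
    size_t start = cb_size + DEMO_MSG_SIZE;
    size_t end = start + DEMO_MSG_SIZE - 1;

    return start / DEMO_CACHE_LINE != end / DEMO_CACHE_LINE;
}

/* Bytes of NOP padding to add after cb_size bytes so that both
 * trailing messages land in a single cache line. */
size_t demo_tail_padding(size_t cb_size)
{
    size_t offset = cb_size % DEMO_CACHE_LINE;

    if (offset + DEMO_TAIL_SIZE <= DEMO_CACHE_LINE)
        return 0;
    return DEMO_CACHE_LINE - offset;
}
```

For a CB of 104B the messages would occupy bytes 104..119 and 120..135, so the last 8B of the second message fall into the next cache line; 24B of padding moves both messages to bytes 128..159.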
ran shalit [Wed, 15 Jun 2022 18:24:38 +0000 (21:24 +0300)]
habanalabs: add critical indication in sram ecc
Multiple SRAM SERR events are treated as critical events,
and host should be notified about it. Thus, adding is_critical
indication as part of SRAM ECC failure packet.
Tal Cohen [Thu, 9 Jun 2022 15:08:31 +0000 (18:08 +0300)]
habanalabs/gaudi: notify user process on device unavailable
When a device error occurs, user process would like to get some
indication on the error by reading some device HW info. If the
device is unavailable, user process can't perform any HW device
reading.
Oded Gabbay [Sat, 18 Jun 2022 18:27:07 +0000 (21:27 +0300)]
habanalabs: remove unused get_dma_desc_list_size
This asic callback function is not called anymore from the common code.
The asic-specific function itself is called but from within the
asic-specific code.
Once FW raised an event following a MME2 QMAN error, the driver should
have gone to the corresponding status registers, trying to gather more
info on the error, yet it was accidentally accessing MME1 QMAN address
space.
Generally, we have 4 MMEs, where 0 & 2 are marked MASTER, and
1 & 3 are marked SLAVE. The former can be addressed, yet addressing
the latter is considered an access violation, and will result in a
hung system, which is what unintentionally happened above.
Note that this cannot happen in a secured system, since these registers
are protected with range registers.
Dani Liberman [Thu, 2 Jun 2022 13:15:03 +0000 (16:15 +0300)]
habanalabs: avoid unnecessary error print
When sending a packet to FW right after it has reset, we will get
packet timeout. Since it is expected behavior, we don't need to
print an error in such case.
Hence, when the driver is in hard reset it will avoid printing error
messages about packet timeout.
Tal Cohen [Thu, 19 May 2022 15:00:55 +0000 (18:00 +0300)]
habanalabs: send an event notification when CS timeout occurs
The driver needs to inform the user process whenever one of its
CSs times out. The driver shall recognize the CS timeout and shall
send an eventfd notification to user space whenever a timeout
expires on a CS.
Tal Cohen [Wed, 8 Jun 2022 14:34:54 +0000 (17:34 +0300)]
habanalabs/gaudi: send device reset notification
A device reset event indicates that the device shall be reset
after a short delay. In such a case, the driver sends a notification
towards the User process. This allows the User process
to be able to take several debug actions for system
diagnostic purposes.
Tal Cohen [Wed, 8 Jun 2022 13:02:09 +0000 (16:02 +0300)]
habanalabs/gaudi: invoke device reset from one code block
In order to prepare the driver code for device reset event
notification, change the event handler function flow to call
device reset from one code block.
In addition, the commit fixes an issue that reset was performed
w/o checking the 'hard_reset_on_fw_event' state and w/o setting
the HL_DRV_RESET_DELAY flag.
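A rough sketch of the "reset from one code block" structure. Everything here is hypothetical (event names, flag bits, struct demo_dev); only the 'hard_reset_on_fw_event' state and the role of a delay flag come from the message above.

```c
#include <stdbool.h>

#define DEMO_RESET_HARD  (1u << 0)  /* hypothetical flag bit */
#define DEMO_RESET_DELAY (1u << 1)  /* stands in for HL_DRV_RESET_DELAY */

enum demo_event { DEMO_EVENT_INFO, DEMO_EVENT_ECC_FATAL, DEMO_EVENT_AXI_ERR };

struct demo_dev {
    bool hard_reset_on_fw_event;
    unsigned int last_reset_flags;
    int reset_count;
};

/* Mock reset: only records that a reset was requested and with which flags. */
void demo_device_reset(struct demo_dev *dev, unsigned int flags)
{
    dev->last_reset_flags = flags;
    dev->reset_count++;
}

/* Returns true if a reset was actually issued. */
bool demo_handle_event(struct demo_dev *dev, enum demo_event ev)
{
    bool reset_required = false;

    switch (ev) {
    case DEMO_EVENT_ECC_FATAL:
    case DEMO_EVENT_AXI_ERR:
        reset_required = true;   /* only record the need for a reset */
        break;
    case DEMO_EVENT_INFO:
        break;
    }

    /* Single reset site: the state check and the delay flag live here. */
    if (reset_required && dev->hard_reset_on_fw_event) {
        demo_device_reset(dev, DEMO_RESET_HARD | DEMO_RESET_DELAY);
        return true;
    }
    return false;
}
```

With one reset site, a later step such as sending a notification before the reset needs to be added in only one place.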