Due to firmware bugs on the Q6, the hardware watchdog irq can be
triggered multiple times. The remoteproc framework schedules a work item
for each reported crash, so if the other threads do not get a chance to
run before the first recovery completes, the subsequent threads see the
remoteproc as running again and kill it while it is running. This can
result in various SMMU and NOC errors. This change sets the state of the
remoteproc to offline whenever a watchdog irq is received, and makes the
fatal irq handler return early while the remoteproc is offline.

Signed-off-by: Siddharth Gupta <sidgup@codeaurora.org>
Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/1657022900-2049-6-git-send-email-quic_sibis@quicinc.com
 	else
 		dev_err(q6v5->dev, "watchdog without message\n");
 
+	q6v5->running = false;
 	rproc_report_crash(q6v5->rproc, RPROC_WATCHDOG);
 	return IRQ_HANDLED;

 	size_t len;
 	char *msg;
 
+	if (!q6v5->running)
+		return IRQ_HANDLED;
+
 	msg = qcom_smem_get(QCOM_SMEM_HOST_ANY, q6v5->crash_reason, &len);
 	if (!IS_ERR(msg) && len > 0 && msg[0])
 		dev_err(q6v5->dev, "fatal error received: %s\n", msg);
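
The effect of the `running` flag described in the commit message can be illustrated with a small user-space model. This is only a sketch, not the driver code: `struct model_q6v5`, `model_wdog_irq()`, `model_fatal_irq()` and the `crash_reports` counter are hypothetical names invented for illustration, and the early return at the top of the watchdog model is an assumption about handler code that lies outside the lines shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct qcom_q6v5. */
struct model_q6v5 {
	bool running;       /* mirrors q6v5->running */
	int crash_reports;  /* counts calls that would reach rproc_report_crash() */
};

/* Models the watchdog handler after this change. */
static void model_wdog_irq(struct model_q6v5 *m)
{
	assert(m != NULL);

	/* Assumed pre-existing guard: ignore the irq once marked offline. */
	if (!m->running)
		return;

	/* The added line: mark the remoteproc offline before reporting. */
	m->running = false;
	m->crash_reports++;
}

/* Models only the early return added to the fatal handler. */
static void model_fatal_irq(struct model_q6v5 *m)
{
	assert(m != NULL);

	/* The added guard: a fatal irq on an offline remoteproc is dropped. */
	if (!m->running)
		return;

	m->crash_reports++;
}
```

In this model, calling `model_wdog_irq()` twice on an instance that starts with `running` set, followed by `model_fatal_irq()`, leaves `crash_reports` at 1: only the first watchdog irq leads to a crash report, and the later irqs are ignored until the flag is set again.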