scv support introduced the notion of code that is implicitly soft-masked
based on its instruction address. This is required because scv enters
the kernel with MSR[EE]=1.

If an NMI (including a soft-NMI) hits while we are implicitly
soft-masked, its regs->softe does not reflect this, because regs->softe
is derived from the explicit soft-mask state (paca->irq_soft_mask). As a
result, arch_irq_disabled_regs(regs) returns false.

This can trigger a warning in the soft-NMI watchdog code (shown below).
Fix it by having NMI interrupts set regs->softe to disabled when they
interrupt an implicitly soft-masked region.
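
For illustration only, the check described above can be modelled in
userspace C. This is a sketch under stated assumptions, not the kernel
code: struct model_regs, the MODEL_* constants and both helpers are
hypothetical stand-ins, and the MODEL_END_INTERRUPTS address is made up.
Only the shape of the test (kernel mode and NIP below the end of the
interrupt vectors) follows the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the values are illustrative assumptions. */
#define MODEL_IRQS_ENABLED      0ULL
#define MODEL_IRQS_ALL_DISABLED 3ULL
#define MODEL_MSR_PR            0x4000ULL              /* problem (user) state */
#define MODEL_END_INTERRUPTS    0xc000000000008000ULL  /* made-up address */

/* Minimal model of the saved register frame. */
struct model_regs {
	uint64_t nip;   /* interrupted instruction address */
	uint64_t msr;   /* interrupted machine state */
	uint64_t softe; /* saved soft-mask state */
};

/* Models arch_irq_disabled_regs(): any non-zero softe means masked. */
static bool model_irq_disabled_regs(const struct model_regs *regs)
{
	return regs->softe != MODEL_IRQS_ENABLED;
}

/*
 * Models the fix: kernel code interrupted below the end of the
 * interrupt vectors is treated as soft-masked, whatever the explicit
 * soft-mask state said at the time.
 */
static void model_nmi_fixup_softe(struct model_regs *regs)
{
	if (!(regs->msr & MODEL_MSR_PR) && regs->nip < MODEL_END_INTERRUPTS)
		regs->softe = MODEL_IRQS_ALL_DISABLED;
}

/* Returns true if the model behaves as the description says it should. */
static bool model_demo(void)
{
	/* NIP and MSR taken from the interrupted frame in the log. */
	struct model_regs regs = {
		.nip = 0xc000000000003000ULL,
		.msr = 0x9000000000009033ULL,
		.softe = MODEL_IRQS_ENABLED,
	};
	bool before = model_irq_disabled_regs(&regs);

	model_nmi_fixup_softe(&regs);
	return !before && model_irq_disabled_regs(&regs);
}
```

In this model the frame looks unmasked before the fixup and masked
after it, while a frame with MSR[PR] set is left untouched.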

------------[ cut here ]------------
WARNING: CPU: 41 PID: 1103 at arch/powerpc/kernel/watchdog.c:259 soft_nmi_interrupt+0x3e4/0x5f0
CPU: 41 PID: 1103 Comm: (spawn) Not tainted
NIP:  c000000000039534 LR: c000000000039234 CTR: c000000000009a00
REGS: c000007fffbcf940 TRAP: 0700   Not tainted
MSR:  9000000000021033 <SF,HV,ME,IR,DR,RI,LE>  CR: 22042482  XER: 200400ad
CFAR: c000000000039260 IRQMASK: 3
GPR00: c000000000039204 c000007fffbcfbe0 c000000001d6c300 0000000000000003
GPR04: 00007ffffa45d078 0000000000000000 0000000000000008 0000000000000020
GPR08: 0000007ffd4e0000 0000000000000000 c000007ffffceb00 7265677368657265
GPR12: 9000000000009033 c000007ffffceb00 00000f7075bf4480 000000000000002a
GPR16: 00000f705745a528 00007ffffa45ddd8 00000f70574d0008 0000000000000000
GPR20: 00000f7075c58d70 00000f7057459c38 0000000000000001 0000000000000040
GPR24: 0000000000000000 0000000000000029 c000000001dae058 0000000000000029
GPR28: 0000000000000000 0000000000000800 0000000000000009 c000007fffbcfd60
NIP [c000000000039534] soft_nmi_interrupt+0x3e4/0x5f0
LR [c000000000039234] soft_nmi_interrupt+0xe4/0x5f0
Call Trace:
[c000007fffbcfbe0] [c000000000039204] soft_nmi_interrupt+0xb4/0x5f0 (unreliable)
[c000007fffbcfcf0] [c00000000000c0e8] soft_nmi_common+0x138/0x1c4
--- interrupt: 900 at end_real_trampolines+0x0/0x1000
NIP:  c000000000003000 LR: 00007ca426adb03c CTR: 900000000280f033
REGS: c000007fffbcfd60 TRAP: 0900
MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 44042482  XER: 200400ad
CFAR: 00007ca426946020 IRQMASK: 0
GPR00: 00000000000000ad 00007ffffa45d050 00007ca426b07f00 0000000000000035
GPR04: 00007ffffa45d078 0000000000000000 0000000000000008 0000000000000020
GPR08: 0000000000000000 0000000000100000 0000000010000000 00007ffffa45d110
GPR12: 0000000000000001 00007ca426d4e680 00000f7075bf4480 000000000000002a
GPR16: 00000f705745a528 00007ffffa45ddd8 00000f70574d0008 0000000000000000
GPR20: 00000f7075c58d70 00000f7057459c38 0000000000000001 0000000000000040
GPR24: 0000000000000000 00000f7057473f68 0000000000000003 000000000000041b
GPR28: 00007ffffa45d4c4 0000000000000035 0000000000000000 00000f7057473f68
NIP [c000000000003000] end_real_trampolines+0x0/0x1000
LR [00007ca426adb03c] 0x7ca426adb03c
--- interrupt: 900
Instruction dump:
60000000 60000000 60420000 38600001 482b3ae5 60000000 e93f0138 a36d0008
7daa6b78 71290001 7f7907b4 4082fd34 <0fe00000> 4bfffd2c 60420000 ea6100a8
---[ end trace dc75f67d819779da ]---

Fixes: 118178e62e2e ("powerpc: move NMI entry/exit code into wrapper")
Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210503111708.758261-1-npiggin@gmail.com
 	local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
 	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;

+	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && !(regs->msr & MSR_PR) &&
+	    regs->nip < (unsigned long)__end_interrupts) {
+		// Kernel code running below __end_interrupts is
+		// implicitly soft-masked.
+		regs->softe = IRQS_ALL_DISABLED;
+	}
+
 	/* Don't do any per-CPU operations until interrupt state is fixed */
 	if (nmi_disables_ftrace(regs)) {