arm64: atomics: remove LL/SC trampolines
[ Upstream commit 4b14091e7e654a8f10f31a1a8fd161d6ee1a036f ]
When CONFIG_ARM64_LSE_ATOMICS=y, each use of an LL/SC atomic results in
a fragment of code being generated in a subsection without a clear
association with its caller. A trampoline in the caller branches to the
LL/SC atomic with a direct branch, and the atomic directly branches
back into its trampoline.
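For reference, the pattern being removed looks roughly like the
__LL_SC_FALLBACK() wrapper in arch/arm64/include/asm/atomic_ll_sc.h
(a simplified sketch; the asm_ops body differs per atomic operation):

  #define __LL_SC_FALLBACK(asm_ops)                                   \
  "	b	3f\n"           /* trampoline: direct B to the fragment */ \
  "	.subsection	1\n"    /* fragment lives out-of-line           */ \
  "3:\n"                                                              \
  asm_ops "\n"                  /* the LL/SC loop itself              */ \
  "	b	4f\n"           /* direct B back into the trampoline   */ \
  "	.subsection	0\n"                                          \
  "4:\n"

Since both branches are plain B instructions rather than BL/RET pairs,
no return address is ever recorded for the fragment.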
This breaks backtracing, as any PC within the out-of-line fragment will
be symbolized as an offset from the nearest prior symbol (which may not
be the function using the atomic), and since the atomic returns with a
direct branch, the caller's PC may be missing from the backtrace.
For example, with secondary_start_kernel() hacked to contain
atomic_inc(NULL), the resulting exception can be reported as being taken
from cpus_are_stuck_in_kernel():
| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
| Mem abort info:
|   ESR = 0x0000000096000004
|   EC = 0x25: DABT (current EL), IL = 32 bits
|   SET = 0, FnV = 0
|   EA = 0, S1PTW = 0
|   FSC = 0x04: level 0 translation fault
| Data abort info:
|   ISV = 0, ISS = 0x00000004
|   CM = 0, WnR = 0
| [0000000000000000] user address but active_mm is swapper
| Internal error: Oops: 96000004 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.19.0-11219-geb555cb5b794-dirty #3
| Hardware name: linux,dummy-virt (DT)
| pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : cpus_are_stuck_in_kernel+0xa4/0x120
| lr : secondary_start_kernel+0x164/0x170
| sp : ffff80000a4cbe90
| x29: ffff80000a4cbe90 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: 0000000000000000 x22: 0000000000000000 x21: 0000000000000000
| x20: 0000000000000001 x19: 0000000000000001 x18: 0000000000000008
| x17: 3030383832343030 x16: 3030303030307830 x15: ffff80000a4cbab0
| x14: 0000000000000001 x13: 5d31666130663133 x12: 3478305b20313030
| x11: 3030303030303078 x10: 3020726f73736563 x9 : 726f737365636f72
| x8 : ffff800009ff2ef0 x7 : 0000000000000003 x6 : 0000000000000000
| x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000100
| x2 : 0000000000000000 x1 : ffff0000029bd880 x0 : 0000000000000000
| Call trace:
|  cpus_are_stuck_in_kernel+0xa4/0x120
|  __secondary_switched+0xb0/0xb4
| Code: 35ffffa3 17fffc6c d53cd040 f9800011 (885f7c01)
| ---[ end trace 0000000000000000 ]---
This is confusing and hinders debugging, and will be problematic for
CONFIG_LIVEPATCH as these cases cannot be unwound reliably.
This is very similar to recent issues with out-of-line exception fixups,
which were removed in commits:
1fcf583d34968663 ("arm64: lib: __arch_clear_user(): fold fixups into body")
4e1064229b104697 ("arm64: lib: __arch_copy_from_user(): fold fixups into body")
f71bd23dc604bd26 ("arm64: lib: __arch_copy_to_user(): fold fixups into body")
When the trampolines were introduced in commit:
b81ef975a065a5ef ("arm64: atomics: avoid out-of-line ll/sc atomics")
The rationale was to improve icache performance by grouping the LL/SC
atomics together. This has never been measured, and this theoretical
benefit is outweighed by other factors:
* As the subsections are collapsed into sections at object file
granularity, these are spread out throughout the kernel and can share
cachelines with unrelated code regardless.
* GCC 12.1.0 has been observed to place the trampoline out-of-line in
specialised __ll_sc_*() functions, introducing more branching than was
intended.
* Removing the trampolines has been observed to shrink a defconfig
kernel Image by 64KiB when building with GCC 12.1.0.
This patch removes the LL/SC trampolines, meaning that the LL/SC atomics
will be inlined into their callers (or placed in out-of-line functions
using regular BL/RET pairs). When CONFIG_ARM64_LSE_ATOMICS=y, the LL/SC
atomics are always called in an unlikely branch, and will be placed in a
cold portion of the function, so this should have minimal impact on the
hot paths.
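For context, the LSE/LL-SC selection itself is untouched: with
CONFIG_ARM64_LSE_ATOMICS=y each operation still dispatches via a
static-key check, roughly as in the __lse_ll_sc_body() helper in
arch/arm64/include/asm/lse.h (simplified sketch):

  #define __lse_ll_sc_body(op, ...)                                   \
  ({                                                                  \
          /* static-key test: LSE fast path, LL/SC in the cold arm */ \
          system_uses_lse_atomics() ?                                 \
                  __lse_##op(__VA_ARGS__) :                           \
                  __ll_sc_##op(__VA_ARGS__);                          \
  })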
Other than the improved backtracing, there should be no functional
change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220817155914.3975112-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Stable-dep-of: 031af50045ea ("arm64: cmpxchg_double*: hazard against entire exchange variable")
Signed-off-by: Sasha Levin <sashal@kernel.org>