git.baikalelectronics.ru Git - kernel.git/log

]> git.baikalelectronics.ru Git - kernel.git/log

projects / kernel.git / log

summary | shortlog | log | commit | commitdiff | tree
first ⋅ prev ⋅ next

commit | commitdiff | tree

Michael Ellerman [Mon, 21 Jun 2021 06:49:38 +0000 (16:49 +1000)]

powerpc/prom_init: Pass linux_banner to firmware via option vector 7

Pass the value of linux_banner to firmware via option vector 7.

Option vector 7 is described in "LoPAR" Linux on Power Architecture
Reference v2.9, in table B.7 on page 824:

  An ASCII character formatted null terminated string that describes
  the client operating system. The string shall be human readable and
  may be displayed on the console.

The string can be up to 256 bytes total, including the nul terminator.

linux_banner contains lots of information, and should make it possible
to identify the exact kernel version that is running:

  const char linux_banner[] =
  "Linux version " UTS_RELEASE " (" LINUX_COMPILE_BY "@"
  LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION "\n";

For example:
  Linux version 4.15.0-144-generic (buildd@bos02-ppc64el-018) (gcc
  version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #148-Ubuntu SMP Sat May 8
  02:32:13 UTC 2021 (Ubuntu 4.15.0-144.148-generic 4.15.18)

It's also printed at boot to the console/dmesg, which should make it
possible to correlate what firmware receives with the console/dmesg on
the machine.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621064938.2021419-2-mpe@ellerman.id.au

commit | commitdiff | tree

Michael Ellerman [Mon, 21 Jun 2021 06:49:37 +0000 (16:49 +1000)]

powerpc/prom_init: Convert prom_strcpy() into prom_strscpy_pad()

In a subsequent patch we'd like to have something like a strscpy_pad()
implementation usable in prom_init.c.

Currently we have a strcpy() implementation with only one caller, so
convert it into strscpy_pad() and update the caller.

Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621064938.2021419-1-mpe@ellerman.id.au

commit | commitdiff | tree

Michael Ellerman [Thu, 24 Jun 2021 12:34:20 +0000 (22:34 +1000)]

powerpc/64s: Fix boot failure with 4K Radix

When using the Radix MMU our PGD is always 64K, and must be naturally
aligned.

For a 4K page size kernel that means page alignment of swapper_pg_dir is
not sufficient, leading to failure to boot.

Use the existing MAX_PTRS_PER_PGD which has the correct value, and
avoids us hard-coding 64K here.

Fixes: 914348b31399 ("powerpc: Define swapper_pg_dir[] in C")
Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210624123420.2784187-1-mpe@ellerman.id.au

commit | commitdiff | tree

Michael Ellerman [Tue, 22 Jun 2021 14:19:08 +0000 (00:19 +1000)]

Merge branch 'topic/ppc-kvm' into next

Pull in some more ppc KVM patches we are keeping in our topic branch.

In particular this brings in the series to add H_RPT_INVALIDATE.

commit | commitdiff | tree

Nathan Chancellor [Mon, 21 Jun 2021 18:24:40 +0000 (11:24 -0700)]

KVM: PPC: Book3S HV: Workaround high stack usage with clang

LLVM does not emit optimal byteswap assembly, which results in high
stack usage in kvmhv_enter_nested_guest() due to the inlining of
byteswap_pt_regs(). With LLVM 12.0.0:

arch/powerpc/kvm/book3s_hv_nested.c:289:6: error: stack frame size of
2512 bytes in function 'kvmhv_enter_nested_guest' [-Werror,-Wframe-larger-than=]
long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
^
1 error generated.

While this gets fixed in LLVM, mark byteswap_pt_regs() as
noinline_for_stack so that it does not get inlined and break the build
due to -Werror by default in arch/powerpc/. Not inlining saves
approximately 800 bytes with LLVM 12.0.0:

arch/powerpc/kvm/book3s_hv_nested.c:290:6: warning: stack frame size of
1728 bytes in function 'kvmhv_enter_nested_guest' [-Wframe-larger-than=]
long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
^
1 warning generated.

Cc: stable@vger.kernel.org # v4.20+
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/ClangBuiltLinux/linux/issues/1292
Link: https://bugs.llvm.org/show_bug.cgi?id=49610
Link: https://lore.kernel.org/r/202104031853.vDT0Qjqj-lkp@intel.com/
Link: https://gist.github.com/ba710e3703bf45043a31e2806c843ffd
Link: https://lore.kernel.org/r/20210621182440.990242-1-nathan@kernel.org

commit | commitdiff | tree

Bharata B Rao [Mon, 21 Jun 2021 08:50:03 +0000 (14:20 +0530)]

KVM: PPC: Book3S HV: Use H_RPT_INVALIDATE in nested KVM

In the nested KVM case, replace H_TLB_INVALIDATE by the new hcall
H_RPT_INVALIDATE if available. The availability of this hcall
is determined from "hcall-rpt-invalidate" string in ibm,hypertas-functions
DT property.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-7-bharata@linux.ibm.com

commit | commitdiff | tree

Bharata B Rao [Mon, 21 Jun 2021 08:50:02 +0000 (14:20 +0530)]

KVM: PPC: Book3S HV: Add KVM_CAP_PPC_RPT_INVALIDATE capability

Now that we have H_RPT_INVALIDATE fully implemented, enable
support for the same via KVM_CAP_PPC_RPT_INVALIDATE KVM capability

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-6-bharata@linux.ibm.com

commit | commitdiff | tree

Bharata B Rao [Mon, 21 Jun 2021 08:50:01 +0000 (14:20 +0530)]

KVM: PPC: Book3S HV: Nested support in H_RPT_INVALIDATE

Enable support for process-scoped invalidations from nested
guests and partition-scoped invalidations for nested guests.

Process-scoped invalidations for any level of nested guests
are handled by implementing H_RPT_INVALIDATE handler in the
nested guest exit path in L0.

Partition-scoped invalidation requests are forwarded to the
right nested guest, handled there and passed down to L0
for eventual handling.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
[aneesh: Nested guest partition-scoped invalidation changes]
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Squash in fixup patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-5-bharata@linux.ibm.com

commit | commitdiff | tree

Bharata B Rao [Mon, 21 Jun 2021 08:50:00 +0000 (14:20 +0530)]

KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE

H_RPT_INVALIDATE does two types of TLB invalidations:

1. Process-scoped invalidations for guests when LPCR[GTSE]=0.
   This is currently not used in KVM as GTSE is not usually
   disabled in KVM.
2. Partition-scoped invalidations that an L1 hypervisor does on
   behalf of an L2 guest. This is currently handled
   by H_TLB_INVALIDATE hcall and this new replaces the old that.

This commit enables process-scoped invalidations for L1 guests.
Support for process-scoped and partition-scoped invalidations
from/for nested guests will be added separately.

Process scoped tlbie invalidations from L1 and nested guests
need RS register for TLBIE instruction to contain both PID and
LPID.  This patch introduces primitives that execute tlbie
instruction with both PID and LPID set in prepartion for
H_RPT_INVALIDATE hcall.

A description of H_RPT_INVALIDATE follows:

int64   /* H_Success: Return code on successful completion */
        /* H_Busy - repeat the call with the same */
        /* H_Parameter, H_P2, H_P3, H_P4, H_P5 : Invalid
   parameters */
hcall(const uint64 H_RPT_INVALIDATE, /* Invalidate RPT
translation
lookaside information */
      uint64 id,        /* PID/LPID to invalidate */
      uint64 target,    /* Invalidation target */
      uint64 type,      /* Type of lookaside information */
      uint64 pg_sizes,  /* Page sizes */
      uint64 start,     /* Start of Effective Address (EA)
   range (inclusive) */
      uint64 end)       /* End of EA range (exclusive) */

Invalidation targets (target)
-----------------------------
Core MMU        0x01 /* All virtual processors in the
partition */
Core local MMU  0x02 /* Current virtual processor */
Nest MMU        0x04 /* All nest/accelerator agents
in use by the partition */

A combination of the above can be specified,
except core and core local.

Type of translation to invalidate (type)
---------------------------------------
NESTED       0x0001  /* invalidate nested guest partition-scope */
TLB          0x0002  /* Invalidate TLB */
PWC          0x0004  /* Invalidate Page Walk Cache */
PRT          0x0008  /* Invalidate caching of Process Table
Entries if NESTED is clear */
PAT          0x0008  /* Invalidate caching of Partition Table
Entries if NESTED is set */

A combination of the above can be specified.

Page size mask (pages)
----------------------
4K              0x01
64K             0x02
2M              0x04
1G              0x08
All sizes       (-1UL)

A combination of the above can be specified.
All page sizes can be selected with -1.

Semantics: Invalidate radix tree lookaside information
           matching the parameters given.
* Return H_P2, H_P3 or H_P4 if target, type, or pageSizes parameters
  are different from the defined values.
* Return H_PARAMETER if NESTED is set and pid is not a valid nested
  LPID allocated to this partition
* Return H_P5 if (start, end) doesn't form a valid range. Start and
  end should be a valid Quadrant address and  end > start.
* Return H_NotSupported if the partition is not in running in radix
  translation mode.
* May invalidate more translation information than requested.
* If start = 0 and end = -1, set the range to cover all valid
  addresses. Else start and end should be aligned to 4kB (lower 11
  bits clear).
* If NESTED is clear, then invalidate process scoped lookaside
  information. Else pid specifies a nested LPID, and the invalidation
  is performed   on nested guest partition table and nested guest
  partition scope real addresses.
* If pid = 0 and NESTED is clear, then valid addresses are quadrant 3
  and quadrant 0 spaces, Else valid addresses are quadrant 0.
* Pages which are fully covered by the range are to be invalidated.
  Those which are partially covered are considered outside
  invalidation range, which allows a caller to optimally invalidate
  ranges that may   contain mixed page sizes.
* Return H_SUCCESS on success.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-4-bharata@linux.ibm.com

commit | commitdiff | tree

Bharata B Rao [Mon, 21 Jun 2021 08:49:59 +0000 (14:19 +0530)]

powerpc/book3s64/radix: Add H_RPT_INVALIDATE pgsize encodings to mmu_psize_def

Add a field to mmu_psize_def to store the page size encodings
of H_RPT_INVALIDATE hcall. Initialize this while scanning the radix
AP encodings. This will be used when invalidating with required
page size encoding in the hcall.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-3-bharata@linux.ibm.com

commit | commitdiff | tree

Aneesh Kumar K.V [Mon, 21 Jun 2021 08:49:58 +0000 (14:19 +0530)]

KVM: PPC: Book3S HV: Fix comments of H_RPT_INVALIDATE arguments

The type values H_RPTI_TYPE_PRT and H_RPTI_TYPE_PAT indicate
invalidating the caching of process and partition scoped entries
respectively.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-2-bharata@linux.ibm.com

commit | commitdiff | tree

Joel Stanley [Fri, 18 Jun 2021 03:49:43 +0000 (13:49 +1000)]

powerpc/boot: Add a boot wrapper for Microwatt

This allows microwatt's kernel to be built with an embedded device tree.

Load to arch/powerpc/boot/dtbImage.microwatt to 0x500000:

mw_debug -b fpga stop load arch/powerpc/boot/dtbImage.microwatt 500000 start

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwX19wym3kQ7guu@thinks.paulus.ozlabs.org

commit | commitdiff | tree

Benjamin Herrenschmidt [Fri, 18 Jun 2021 03:49:00 +0000 (13:49 +1000)]

powerpc/boot: Fixup device-tree on little endian

This fixes the core devtree.c functions and the ns16550 UART backend.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwXrPT8nc4YUdJ9@thinks.paulus.ozlabs.org

commit | commitdiff | tree

Paul Mackerras [Fri, 18 Jun 2021 03:48:12 +0000 (13:48 +1000)]

powerpc/microwatt: Add microwatt_defconfig

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwXfL8hOpReIiiP@thinks.paulus.ozlabs.org

commit | commitdiff | tree

Paul Mackerras [Fri, 18 Jun 2021 03:47:08 +0000 (13:47 +1000)]

powerpc/microwatt: Add support for hardware random number generator

Microwatt's hardware RNG is accessed using the DARN instruction.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwXPHlV/ZleiQUY@thinks.paulus.ozlabs.org

commit | commitdiff | tree

Benjamin Herrenschmidt [Fri, 18 Jun 2021 03:46:32 +0000 (13:46 +1000)]

powerpc/microwatt: Use standard 16550 UART for console

This adds support to the Microwatt platform to use the standard
16550-style UART which available in the standalone Microwatt FPGA.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwXGCTzedpQje7r@thinks.paulus.ozlabs.org

commit | commitdiff | tree

Benjamin Herrenschmidt [Fri, 18 Jun 2021 03:45:53 +0000 (13:45 +1000)]

powerpc/xics: Add a native ICS backend for microwatt

This is a simple native ICS backend that matches the layout of
the Microwatt implementation of ICS.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
[mpe: Add empty ics_native_init() to unbreak non-microwatt builds]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
fixup-ics
Link: https://lore.kernel.org/r/YMwW8cxrwB2W5EUN@thinks.paulus.ozlabs.org

commit | commitdiff | tree

Benjamin Herrenschmidt [Fri, 18 Jun 2021 03:45:11 +0000 (13:45 +1000)]

powerpc/microwatt: Populate platform bus from device-tree

Just like any other embedded platform.

Add an empty soc node.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwWx98+PMibZq/G@thinks.paulus.ozlabs.org

commit | commitdiff | tree

Paul Mackerras [Fri, 18 Jun 2021 03:44:16 +0000 (13:44 +1000)]

powerpc: Add Microwatt device tree

Microwatt currently runs with MSR[HV] = 0, hence the usable-privilege
properties don't have bit 2 (for HV support) set, and we need the
/chosen/ibm,architecture-vec-5 property.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwWkPcXlGDSQ9Q3@thinks.paulus.ozlabs.org

commit | commitdiff | tree

Paul Mackerras [Fri, 18 Jun 2021 03:43:41 +0000 (13:43 +1000)]

powerpc: Add Microwatt platform

Microwatt is a FPGA-based implementation of the Power ISA.  It
currently only implements little-endian 64-bit mode, and does
not (yet) support SMP, VMX, VSX or transactional memory.  It has an
optional FPU, and an optional MMU (required for running Linux,
obviously) which implements a configurable radix tree but not
hypervisor mode or nested radix translation.

This adds a new machine type to support FPGA-based SoCs with a
Microwatt core.  CONFIG_MATH_EMULATION can be selected for Microwatt
SOCs which don't have the FPU.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwWbZVREsVug9R0@thinks.paulus.ozlabs.org

commit | commitdiff | tree

Christophe Leroy [Wed, 9 Jun 2021 01:34:31 +0000 (11:34 +1000)]

powerpc/32: use set_memory_attr()

Use set_memory_attr() instead of the PPC32 specific change_page_attr()

change_page_attr() was checking that the address was not mapped by
blocks and was handling highmem, but that's unneeded because the
affected pages can't be in highmem and block mapping verification
is already done by the callers.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[ruscur: rebase on powerpc/merge with Christophe's new patches]
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-10-jniethe5@gmail.com

commit | commitdiff | tree

Christophe Leroy [Wed, 9 Jun 2021 01:34:30 +0000 (11:34 +1000)]

powerpc/mm: implement set_memory_attr()

In addition to the set_memory_xx() functions which allows to change
the memory attributes of not (yet) used memory regions, implement a
set_memory_attr() function to:
- set the final memory protection after init on currently used
kernel regions.
- enable/disable kernel memory regions in the scope of DEBUG_PAGEALLOC.

Unlike the set_memory_xx() which can act in three step as the regions
are unused, this function must modify 'on the fly' as the kernel is
executing from them. At the moment only PPC32 will use it and changing
page attributes on the fly is not an issue.

Reported-by: kbuild test robot <lkp@intel.com>
[ruscur: cast "data" to unsigned long instead of int]
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-9-jniethe5@gmail.com

commit | commitdiff | tree

Russell Currey [Wed, 9 Jun 2021 01:34:29 +0000 (11:34 +1000)]

powerpc: Set ARCH_HAS_STRICT_MODULE_RWX

To enable strict module RWX on powerpc, set:

CONFIG_STRICT_MODULE_RWX=y

You should also have CONFIG_STRICT_KERNEL_RWX=y set to have any real
security benefit.

ARCH_HAS_STRICT_MODULE_RWX is set to require ARCH_HAS_STRICT_KERNEL_RWX.
This is due to a quirk in arch/Kconfig and arch/powerpc/Kconfig that
makes STRICT_MODULE_RWX *on by default* in configurations where
STRICT_KERNEL_RWX is *unavailable*.

Since this doesn't make much sense, and module RWX without kernel RWX
doesn't make much sense, having the same dependencies as kernel RWX
works around this problem.

Book3s/32 603 and 604 core processors are not able to write protect
kernel pages so do not set ARCH_HAS_STRICT_MODULE_RWX for Book3s/32.

[jpn: - predicate on !PPC_BOOK3S_604
- make module_alloc() use PAGE_KERNEL protection]

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-8-jniethe5@gmail.com

commit | commitdiff | tree

Jordan Niethe [Wed, 9 Jun 2021 01:34:28 +0000 (11:34 +1000)]

powerpc/bpf: Write protect JIT code

Add the necessary call to bpf_jit_binary_lock_ro() to remove write and
add exec permissions to the JIT image after it has finished being
written.

Without CONFIG_STRICT_MODULE_RWX the image will be writable and
executable until the call to bpf_jit_binary_lock_ro().

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-7-jniethe5@gmail.com

commit | commitdiff | tree

Jordan Niethe [Wed, 9 Jun 2021 01:34:27 +0000 (11:34 +1000)]

powerpc/bpf: Remove bpf_jit_free()

Commit c2d5f4374f26 ("bpf: make jited programs visible in traces") added
a default bpf_jit_free() implementation. Powerpc did not use the default
bpf_jit_free() as powerpc did not set the images read-only. The default
bpf_jit_free() called bpf_jit_binary_unlock_ro() is why it could not be
used for powerpc.

Commit 40a74445a465 ("bpf: Use vmalloc special flag") moved keeping
track of read-only memory to vmalloc. This included removing
bpf_jit_binary_unlock_ro(). Therefore there is no reason powerpc needs
its own bpf_jit_free(). Remove it.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-6-jniethe5@gmail.com

commit | commitdiff | tree

Russell Currey [Wed, 9 Jun 2021 01:34:26 +0000 (11:34 +1000)]

powerpc/kprobes: Mark newly allocated probes as ROX

Add the arch specific insn page allocator for powerpc. This allocates
ROX pages if STRICT_KERNEL_RWX is enabled. These pages are only written
to with patch_instruction() which is able to write RO pages.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[jpn: Reword commit message, switch to __vmalloc_node_range()]
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-5-jniethe5@gmail.com

commit | commitdiff | tree

Jordan Niethe [Wed, 9 Jun 2021 01:34:25 +0000 (11:34 +1000)]

powerpc/modules: Make module_alloc() Strict Module RWX aware

Make module_alloc() use PAGE_KERNEL protections instead of
PAGE_KERNEL_EXEX if Strict Module RWX is enabled.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-4-jniethe5@gmail.com

commit | commitdiff | tree

Jordan Niethe [Wed, 9 Jun 2021 01:34:24 +0000 (11:34 +1000)]

powerpc/lib/code-patching: Set up Strict RWX patching earlier

setup_text_poke_area() is a late init call so it runs before
mark_rodata_ro() and after the init calls. This lets all the init code
patching simply write to their locations. In the future, kprobes is
going to allocate its instruction pages RO which means they will need
setup_text__poke_area() to have been already called for their code
patching. However, init_kprobes() (which allocates and patches some
instruction pages) is an early init call so it happens before
setup_text__poke_area().

start_kernel() calls poking_init() before any of the init calls. On
powerpc, poking_init() is currently a nop. setup_text_poke_area() relies
on kernel virtual memory, cpu hotplug and per_cpu_areas being setup.
setup_per_cpu_areas(), boot_cpu_hotplug_init() and mm_init() are called
before poking_init().

Turn setup_text_poke_area() into poking_init().

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Russell Currey <ruscur@russell.cc>
[mpe: Fold in missing prototype for poking_init() from lkp]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-3-jniethe5@gmail.com

commit | commitdiff | tree

Russell Currey [Wed, 9 Jun 2021 01:34:23 +0000 (11:34 +1000)]

powerpc/mm: Implement set_memory() routines

The set_memory_{ro/rw/nx/x}() functions are required for
STRICT_MODULE_RWX, and are generally useful primitives to have.  This
implementation is designed to be generic across powerpc's many MMUs.
It's possible that this could be optimised to be faster for specific
MMUs.

This implementation does not handle cases where the caller is attempting
to change the mapping of the page it is executing from, or if another
CPU is concurrently using the page being altered.  These cases likely
shouldn't happen, but a more complex implementation with MMU-specific code
could safely handle them.

On hash, the linear mapping is not kept in the linux pagetable, so this
will not change the protection if used on that range. Currently these
functions are not used on the linear map so just WARN for now.

apply_to_existing_page_range() does not work on huge pages so for now
disallow changing the protection of huge pages.

[jpn: - Allow set memory functions to be used without Strict RWX
      - Hash: Disallow certain regions
      - Have change_page_attr() take function pointers to manipulate ptes
      - Radix: Add ptesync after set_pte_at()]

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-2-jniethe5@gmail.com

commit | commitdiff | tree

Nicholas Piggin [Mon, 3 May 2021 13:02:42 +0000 (23:02 +1000)]

powerpc/pesries: Get STF barrier requirement from H_GET_CPU_CHARACTERISTICS

This allows the hypervisor / firmware to describe this workarounds to
the guest.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210503130243.891868-4-npiggin@gmail.com

commit | commitdiff | tree

Nicholas Piggin [Mon, 3 May 2021 13:02:41 +0000 (23:02 +1000)]

powerpc/security: Add a security feature for STF barrier

Rather than tying this mitigation to RFI L1D flush requirement, add a
new bit for it.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210503130243.891868-3-npiggin@gmail.com

commit | commitdiff | tree

Nicholas Piggin [Mon, 3 May 2021 13:02:40 +0000 (23:02 +1000)]

powerpc/pseries: Get entry and uaccess flush required bits from H_GET_CPU_CHARACTERISTICS

This allows the hypervisor / firmware to describe these workarounds to
the guest.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210503130243.891868-2-npiggin@gmail.com

commit | commitdiff | tree

Nicholas Piggin [Fri, 11 Jun 2021 11:11:04 +0000 (21:11 +1000)]

powerpc/boot: add zImage.lds to targets

This prevents spurious rebuilds of the lds and then wrappers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210611111104.1058991-1-npiggin@gmail.com

commit | commitdiff | tree

Nicholas Piggin [Mon, 17 May 2021 14:03:55 +0000 (00:03 +1000)]

powerpc/powernv: Fix machine check reporting of async store errors

POWER9 and POWER10 asynchronous machine checks due to stores have their
cause reported in SRR1 but SRR1[42] is set, which in other cases
indicates DSISR cause.

Check for these cases and clear SRR1[42], so the cause matching uses
the i-side (SRR1) table.

Fixes: 3877610a434b ("powerpc/64s: POWER9 machine check handler")
Fixes: b3aa0d3be821 ("powerpc/powernv: Machine check handler for POWER10")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210517140355.2325406-1-npiggin@gmail.com

commit | commitdiff | tree

Suraj Jitindar Singh [Wed, 2 Jun 2021 04:04:41 +0000 (14:04 +1000)]

KVM: PPC: Book3S HV: Fix TLB management on SMT8 POWER9 and POWER10 processors

The POWER9 vCPU TLB management code assumes all threads in a core share
a TLB, and that TLBIEL execued by one thread will invalidate TLBs for
all threads. This is not the case for SMT8 capable POWER9 and POWER10
(big core) processors, where the TLB is split between groups of threads.
This results in TLB multi-hits, random data corruption, etc.

Fix this by introducing cpu_first_tlb_thread_sibling etc., to determine
which siblings share TLBs, and use that in the guest TLB flushing code.

[npiggin@gmail.com: add changelog and comment]

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210602040441.3984352-1-npiggin@gmail.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:39:41 +0000 (13:39 -0700)]

crypto/nx: Register and unregister VAS interface on PowerVM

The user space uses /dev/crypto/nx-gzip interface to setup VAS
windows, create paste mapping and close windows. This patch adds
changes to create/remove this interface with VAS register/unregister
functions on PowerVM platform.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/121ea1f4eb3004f3b8f4fe8abefaecc88b292efd.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:39:08 +0000 (13:39 -0700)]

crypto/nx: Add sysfs interface to export NX capabilities

Export NX-GZIP capabilities to usrespace in sysfs
/sys/devices/vio/ibm,compression-v1/nx_gzip_caps directory.
These are queried by userspace accelerator libraries to set
minimum length heuristics and maximum limits on request sizes.

NX-GZIP capabilities:
min_compress_len /*Recommended minimum compress length in bytes*/
min_decompress_len /*Recommended minimum decompress length in bytes*/
req_max_processed_len /* Maximum number of bytes processed in one
request */

NX will return RMA_Reject if the request buffer size is greater
than req_max_processed_len.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/510da86abbd904878d5f13d74aba72603c37d783.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:38:36 +0000 (13:38 -0700)]

crypto/nx: Get NX capabilities for GZIP coprocessor type

The hypervisor provides different NX capabilities that it
supports. These capabilities such as recommended minimum
compression / decompression lengths and the maximum request
buffer size in bytes are used to define the user space NX
request.

NX will reject the request if the buffer size is more than
the maximum buffer size. Whereas compression / decompression
lengths are recommended values for better performance.

Changes to get NX overall capabilities which points to the
specific features that the hypervisor supports. Then retrieve
the capabilities for the specific feature (available only
for NXGZIP).

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f2b6a1fb8b6112595a73d81c67a35af4e7f5d0a3.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:37:42 +0000 (13:37 -0700)]

crypto/nx: Rename nx-842-pseries file name to nx-common-pseries

Rename nx-842-pseries.c to nx-common-pseries.c to add code for new
GZIP compression type. The actual functionality is not changed in
this patch.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1fcf672209a14ea8944bd3e49c8a7381c8f450f8.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:37:06 +0000 (13:37 -0700)]

powerpc/pseries/vas: Setup IRQ and fault handling

NX generates an interrupt when sees a fault on the user space
buffer and the hypervisor forwards that interrupt to OS. Then
the kernel handles the interrupt by issuing H_GET_NX_FAULT hcall
to retrieve the fault CRB information.

This patch also adds changes to setup and free IRQ per each
window and also handles the fault by updating the CSB.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b8fc66dcb783d06a099a303e5cfc69087bb3357a.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:36:28 +0000 (13:36 -0700)]

powerpc/pseries/vas: Integrate API with open/close windows

This patch adds VAS window allocatioa/close with the corresponding
hcalls. Also changes to integrate with the existing user space VAS
API and provide register/unregister functions to NX pseries driver.

The driver register function is used to create the user space
interface (/dev/crypto/nx-gzip) and unregister to remove this entry.

The user space process opens this device node and makes an ioctl
to allocate VAS window. The close interface is used to deallocate
window.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e8d956bace3f182c4d2e66e343ff37cb0391d1fd.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:35:54 +0000 (13:35 -0700)]

powerpc/pseries/vas: Implement getting capabilities from hypervisor

The hypervisor provides VAS capabilities for GZIP default and QoS
features. These capabilities gives information for the specific
features such as total number of credits available in LPAR,
maximum credits allowed per window, maximum credits allowed in
LPAR, whether usermode copy/paste is supported, and etc.

This patch adds the following:
- Retrieve all features that are provided by hypervisor using
H_QUERY_VAS_CAPABILITIES hcall with 0 as feature type.
- Retrieve capabilities for the specific feature using the same
hcall and the feature type (1 for QoS and 2 for default type).

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/177c88608cb88f7b87d1c546103f18cec6c056b4.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:35:22 +0000 (13:35 -0700)]

powerpc/pseries/vas: Add hcall wrappers for VAS handling

This patch adds the following hcall wrapper functions to allocate,
modify and deallocate VAS windows, and retrieve VAS capabilities.

H_ALLOCATE_VAS_WINDOW: Allocate VAS window
H_DEALLOCATE_VAS_WINDOW: Close VAS window
H_MODIFY_VAS_WINDOW: Setup window before using
H_QUERY_VAS_CAPABILITIES: Get VAS capabilities

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/40fb02a4d56ca4e240b074a15029082055be5997.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:34:43 +0000 (13:34 -0700)]

powerpc/vas: Define QoS credit flag to allocate window

PowerVM introduces two different type of credits: Default and Quality
of service (QoS).

The total number of default credits available on each LPAR depends
on CPU resources configured. But these credits can be shared or
over-committed across LPARs in shared mode which can result in
paste command failure (RMA_busy). To avoid NX HW contention, the
hypervisor ntroduces QoS credit type which makes sure guaranteed
access to NX esources. The system admins can assign QoS credits
or each LPAR via HMC.

Default credit type is used to allocate a VAS window by default as
on PowerVM implementation. But the process can pass
VAS_TX_WIN_FLAG_QOS_CREDIT flag with VAS_TX_WIN_OPEN ioctl to open
QoS type window.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/aa950b7b8e8077364267720274a7b9ec34e76e73.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:34:05 +0000 (13:34 -0700)]

powerpc/pseries/vas: Define VAS/NXGZIP hcalls and structs

This patch adds hcalls and other definitions. Also define structs
that are used in VAS implementation on PowerVM.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b4b8c594c27ee4aa6be9dc6dc4ee7331571cbbe8.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:33:28 +0000 (13:33 -0700)]

powerpc/vas: Define and use common vas_window struct

Many elements in vas_struct are used on PowerNV and PowerVM
platforms. vas_window is used for both TX and RX windows on
PowerNV and for TX windows on PowerVM. So some elements are
specific to these platforms.

So this patch defines common vas_window and platform
specific window structs (pnv_vas_window on PowerNV). Also adds
the corresponding changes in PowerNV vas code.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1698c35c158dfe52c6d2166667823d3d4a463353.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:32:38 +0000 (13:32 -0700)]

powerpc/vas: Move update_csb/dump_crb to common book3s platform

If a coprocessor encounters an error translating an address, the
VAS will cause an interrupt in the host. The kernel processes
the fault by updating CSB. This functionality is same for both
powerNV and pseries. So this patch moves these functions to
common vas-api.c and the actual functionality is not changed.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bf8d5b0770fa1ef5cba88c96580caa08d999d3b5.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:31:43 +0000 (13:31 -0700)]

powerpc/vas: Create take/drop pid and mm reference functions

Take pid and mm references when each window opens and drops during
close. This functionality is needed for powerNV and pseries. So
this patch defines the existing code as functions in common book3s
platform vas-api.c

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/2fa40df962250a737c804e58202924717b39e381.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:31:06 +0000 (13:31 -0700)]

powerpc/vas: Add platform specific user window operations

PowerNV uses registers to open/close VAS windows, and getting the
paste address. Whereas the hypervisor calls are used on PowerVM.

This patch adds the platform specific user space window operations
and register with the common VAS user space interface.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f85091f4ace67f951ac04d60394d67b21e2f5d3c.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:30:24 +0000 (13:30 -0700)]

powerpc/powernv/vas: Rename register/unregister functions

powerNV and pseries drivers register / unregister to the corresponding
platform specific VAS separately. Then these VAS functions call the
common API with the specific window operations. So rename powerNV VAS
API register/unregister functions.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/9db00d58dbdcb7cfc07a1df95f3d2a9e3e5d746a.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:29:48 +0000 (13:29 -0700)]

powerpc/vas: Move VAS API to book3s common platform

The pseries platform will share vas and nx code and interfaces
with the PowerNV platform, so create the
arch/powerpc/platforms/book3s/ directory and move VAS API code
there. Functionality is not changed.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e05c8db17b9eabe3545b902d034238e4c6c08180.camel@linux.ibm.com

commit | commitdiff | tree

Haren Myneni [Thu, 17 Jun 2021 20:29:05 +0000 (13:29 -0700)]

powerpc/powernv/vas: Release reference to tgid during window close

The kernel handles the NX fault by updating CSB or sending
signal to process. In multithread applications, children can
open VAS windows and can exit without closing them. But the
parent can continue to send NX requests with these windows. To
prevent pid reuse, reference will be taken on pid and tgid
when the window is opened and release them during window close.

The current code is not releasing the tgid reference which can
cause pid leak and this patch fixes the issue.

Fixes: cba19bdc7a73b ("powerpc/vas: Take reference to PID and mm for user space windows")
Cc: stable@vger.kernel.org # 5.8+
Reported-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/6020fc4d444864fe20f7dcdc5edfe53e67480a1c.camel@linux.ibm.com

commit | commitdiff | tree

Michael Ellerman [Thu, 17 Jun 2021 06:51:38 +0000 (16:51 +1000)]

Merge branch 'topic/ppc-kvm' into next

Merge some powerpc KVM patches from our topic branch.

In particular this brings in Nick's big series rewriting parts of the
guest entry/exit path in C.

Conflicts:
arch/powerpc/kernel/security.c
arch/powerpc/kvm/book3s_hv_rmhandlers.S

commit | commitdiff | tree

Aneesh Kumar K.V [Thu, 10 Jun 2021 08:36:39 +0000 (14:06 +0530)]

powerpc/mm/book3s64: Fix possible build error

Update _tlbiel_pid() such that we can avoid build errors like below when
using this function in other places.

arch/powerpc/mm/book3s64/radix_tlb.c: In function ‘__radix__flush_tlb_range_psize’:
arch/powerpc/mm/book3s64/radix_tlb.c:114:2: warning: ‘asm’ operand 3 probably does not match constraints
114 | asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
| ^~~
arch/powerpc/mm/book3s64/radix_tlb.c:114:2: error: impossible constraint in ‘asm’
make[4]: *** [scripts/Makefile.build:271: arch/powerpc/mm/book3s64/radix_tlb.o] Error 1
m

With this fix, we can also drop the __always_inline in __radix_flush_tlb_range_psize
which was added by commit d7164dcd2baf ("powerpc/mm/radix: mark __radix__flush_tlb_range_psize() as __always_inline")

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210610083639.387365-1-aneesh.kumar@linux.ibm.com

commit | commitdiff | tree

Michael Ellerman [Thu, 10 Jun 2021 07:29:49 +0000 (17:29 +1000)]

powerpc/signal64: Don't read sigaction arguments back from user memory

When delivering a signal to a sigaction style handler (SA_SIGINFO), we
pass pointers to the siginfo and ucontext via r4 and r5.

Currently we populate the values in those registers by reading the
pointers out of the sigframe in user memory, even though the values in
user memory were written by the kernel just prior:

  unsafe_put_user(&frame->info, &frame->pinfo, badframe_block);
  unsafe_put_user(&frame->uc, &frame->puc, badframe_block);
  ...
  if (ksig->ka.sa.sa_flags & SA_SIGINFO) {
   err |= get_user(regs->gpr[4], (unsigned long __user *)&frame->pinfo);
   err |= get_user(regs->gpr[5], (unsigned long __user *)&frame->puc);

ie. we write &frame->info into frame->pinfo, and then read frame->pinfo
back into r4, and similarly for &frame->uc.

The code has always been like this, since linux-fullhistory commit
d4f2d95eca2c ("Forward port of 2.4 ppc64 signal changes.").

There's no reason for us to read the values back from user memory,
rather than just setting the value in the gpr[4/5] directly. In fact
reading the value back from user memory opens up the possibility of
another user thread changing the values before we read them back.
Although any process doing that would be racing against the kernel
delivering the signal, and would risk corrupting the stack, so that
would be a userspace bug.

Note that this is 64-bit only code, so there's no subtlety with the size
of pointers differing between kernel and user. Also the frame variable
is not modified to point elsewhere during the function.

In the past reading the values back from user memory was not costly, but
now that we have KUAP on some CPUs it is, so we'd rather avoid it for
that reason too.

So change the code to just set the values directly, using the same
values we have written to the sigframe previously in the function.

Note also that this matches what our 32-bit signal code does.

Using a version of will-it-scale's signal1_threads that sets SA_SIGINFO,
this results in a ~4% increase in signals per second on a Power9, from
229,777 to 239,766.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210610072949.3198522-1-mpe@ellerman.id.au

commit | commitdiff | tree

Sudeep Holla [Fri, 11 Jun 2021 19:10:58 +0000 (19:10 +0000)]

powerpc/watchdog: include linux/processor.h for spin_until_cond

This implementation uses spin_until_cond in wd_smp_lock including
neither linux/processor.h nor asm/processor.h

This patch includes linux/processor.h here for spin_until_cond usage.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5e8d2d50f301a346040362028c2ecba40685de9e.1623438544.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Sudeep Holla [Fri, 11 Jun 2021 19:10:57 +0000 (19:10 +0000)]

powerpc/64: drop redundant defination of spin_until_cond

linux/processor.h has exactly same defination for spin_until_cond.
Drop the redundant defination in asm/processor.h

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1fff2054e5dfc00329804dbd3f2a91667c9a8aff.1623438544.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 10 Jun 2021 15:58:34 +0000 (15:58 +0000)]

powerpc/signal32: Remove impossible #ifdef combinations

PPC_TRANSACTIONAL_MEM is only on book3s/64
SPE is only on booke

PPC_TRANSACTIONAL_MEM selects ALTIVEC and VSX

Therefore, within PPC_TRANSACTIONAL_MEM sections,
ALTIVEC and VSX are always defined while SPE never is.

Remove all SPE code and all #ifdef ALTIVEC and VSX in tm
functions.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a069a348ee3c2fe3123a5a93695c2b35dc42cb40.1623340691.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Fri, 11 Jun 2021 19:08:54 +0000 (19:08 +0000)]

powerpc/32: Display modules range in virtual memory layout

book3s/32 and 8xx don't use vmalloc for modules.

Print the modules area at startup as part of the virtual memory layout:

[    0.000000] Kernel virtual memory layout:
[    0.000000]   * 0xffafc000..0xffffc000  : fixmap
[    0.000000]   * 0xc9000000..0xffafc000  : vmalloc & ioremap
[    0.000000]   * 0xb0000000..0xc0000000  : modules
[    0.000000] Memory: 118480K/131072K available (7152K kernel code, 2320K rwdata, 1328K rodata, 368K init, 854K bss, 12592K reserved, 0K cma-reserved)

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/98394503e92d6fd6d8f657e0b263b32f21cf2790.1623438478.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Daniel Axtens [Mon, 14 Jun 2021 12:09:07 +0000 (22:09 +1000)]

powerpc: make stack walking KASAN-safe

Make our stack-walking code KASAN-safe by using __no_sanitize_address.
Generic code, arm64, s390 and x86 all make accesses unchecked for similar
sorts of reasons: when unwinding a stack, we might touch memory that KASAN
has marked as being out-of-bounds. In ppc64 KASAN development, I hit this
sometimes when checking for an exception frame - because we're checking
an arbitrary offset into the stack frame.

See commit d5f68614a55c ("s390/kasan: avoid false positives during stack
unwind"), commit f4fede9c9de6 ("arm64: disable kasan when accessing
frame->fp in unwind_frame"), commit 34a27c0dd03b ("x86/dumpstack:
Prevent KASAN false positive warnings") and commit cdf32bbe54b2
("tracing, kasan: Silence Kasan warning in check_stack of stack_tracer").

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210614120907.1952321-1-dja@axtens.net

commit | commitdiff | tree

Athira Rajeev [Tue, 25 May 2021 13:51:43 +0000 (09:51 -0400)]

selftests/powerpc: EBB selftest for MMCR0 control for PMU SPRs in ISA v3.1

With the MMCR0 control bit (PMCCEXT) in ISA v3.1, read access to
group B registers is restricted when MMCR0 PMCC=0b00. In other
platforms (like power9), the older behaviour works where group B
PMU SPRs are readable.

Patch creates a selftest which verifies that the test takes a
SIGILL when attempting to read PMU registers via helper function
"dump_ebb_state" for ISA v3.1.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com <mailto:rnsastry@linux.ibm.com>>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1621950703-1532-3-git-send-email-atrajeev@linux.vnet.ibm.com

commit | commitdiff | tree

Athira Rajeev [Tue, 25 May 2021 13:51:42 +0000 (09:51 -0400)]

selftests/powerpc: Fix "no_handler" EBB selftest

The "no_handler_test" in ebb selftests attempts to read the PMU
registers twice via helper function "dump_ebb_state". First dump is
just before closing of event and the second invocation is done after
closing of the event. The original intention of second
dump_ebb_state was to dump the state of registers at the end of
the test when the counters are frozen. But this will be achieved
with the first call itself since sample period is set to low value
and PMU will be frozen by then. Hence patch removes the
dump which was done before closing of the event.

Reported-by: Shirisha Ganta <shirisha.ganta1@ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com <mailto:rnsastry@linux.ibm.com>>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1621950703-1532-2-git-send-email-atrajeev@linux.vnet.ibm.com

commit | commitdiff | tree

Christophe Leroy [Wed, 9 Jun 2021 06:10:29 +0000 (06:10 +0000)]

powerpc: Move update_power8_hid0() into its only user

update_power8_hid0() is used only by powernv platform subcore.c

Move it there.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/37f41d74faa0c66f90b373e243e8b1ee37a1f6fa.1623219019.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Wed, 9 Jun 2021 05:52:50 +0000 (05:52 +0000)]

powerpc: Remove proc_trap()

proc_trap() has never been used, remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/827944ea12d470c2f862635f48b5ee6c1520351f.1623217909.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Tue, 8 Jun 2021 17:22:51 +0000 (17:22 +0000)]

powerpc/32: Remove __main()

Comment says that __main() is there to make GCC happy.

It's been there since the implementation of ppc arch in Linux 1.3.45.

ppc32 is the only architecture having that. Even ppc64 doesn't have it.

Seems like GCC is still happy without it.

Drop it for good.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d01028f8166b98584eec536b52f14c5e3f98ff6b.1623172922.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Mon, 7 Jun 2021 10:56:06 +0000 (10:56 +0000)]

powerpc/32s: Rename PTE_SIZE to PTE_T_SIZE

PTE_SIZE means PTE page table size in most placed, whereas
in hash_low.S in means size of one entry in the table.

Rename it PTE_T_SIZE, and define it directly in hash_low.S
instead of going through asm-offsets.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/83a008a9fd6cc3f2bbcb470f592555d260ed7a3d.1623063174.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Mon, 7 Jun 2021 10:56:05 +0000 (10:56 +0000)]

powerpc: Define swapper_pg_dir[] in C

Don't duplicate swapper_pg_dir[] in each platform's head.S

Define it in mm/pgtable.c

Define MAX_PTRS_PER_PGD because on book3s/64 PTRS_PER_PGD is
not a constant.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5e3f1b8a4695c33ccc80aa3870e016bef32b85e1.1623063174.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Mon, 7 Jun 2021 10:56:04 +0000 (10:56 +0000)]

powerpc: Define empty_zero_page[] in C

At the time being, empty_zero_page[] is defined in each
platform head.S.

Define it in mm/mem.c instead, and put it in BSS section instead
of the DATA section. Commit 00d03ac84053 ("arm64: mm: place
empty_zero_page in bss") explains why it is interesting to have
it in BSS.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5838caffa269e0957c5a50cc85477876220298b0.1623063174.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Fri, 4 Jun 2021 12:31:09 +0000 (12:31 +0000)]

powerpc/selftests: Use gettid() instead of getppid() for null_syscall

gettid() is 10% lighter than getppid(), use it for null_syscall selftest.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0ad62673d3e063f848e7c99d719bb966efd433e8.1622809833.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 09:29:07 +0000 (09:29 +0000)]

powerpc/nohash: Remove DEBUG_HARDER

DEBUG_HARDER is not user selectable.

Remove it together with related messages.

Also remove two pr_devel() messages that should
likely have been pr_hard().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0f25109b0e12fdd1e6541dedbb2212cc53526a57.1622712515.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 09:29:06 +0000 (09:29 +0000)]

powerpc/nohash: Remove DEBUG_CLAMP_LAST_CONTEXT

DEBUG_CLAMP_LAST_CONTEXT was there in the old days to reduce
number of contexts in order to ease debugging implementation
of context switching, but that's been quite stable during
years now.

As it is not user selectable, remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/da81837b452e8b9f1657b529b9c3050dc10b9770.1622712515.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 09:29:05 +0000 (09:29 +0000)]

powerpc/nohash: Remove DEBUG_MAP_CONSISTENCY

mmu_context handling has been there for years, so we
would know if there was problems with maps.

DEBUG_MAP_CONSISTENCY is not user selectable, remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/6fe2b88956db53f8d6ee221525b2c5dc6aec82c6.1622712515.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 09:29:04 +0000 (09:29 +0000)]

powerpc/nohash: Remove CONFIG_SMP #ifdefery in mmu_context.h

Everything can be done even when CONFIG_SMP is not selected.

Just use IS_ENABLED() where relevant and rely on GCC to
opt out unneeded code and variables when CONFIG_SMP is not set.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/cc13b87b0f750a538621876ecc24c22a07e7c8be.1622712515.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 09:29:03 +0000 (09:29 +0000)]

powerpc/nohash: Convert set_context() to C

ppc8xx already has set_context() in C.

Other ones have it in assembly. The only thing it does is to
write the context id into SPRN_PID.

Do it in C.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a5d0759064f3831c6b88af49ef5d3b05ba1c4dad.1622712515.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 09:29:02 +0000 (09:29 +0000)]

powerpc/nohash: Refactor update of BDI2000 pointers in switch_mmu_context()

Instead of duplicating the update of BDI2000 pointers in
set_context(), do it directly from switch_mmu_context().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4c54997edd3548fa54717915e7c6ebaf60f208c0.1622712515.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 09:13:54 +0000 (09:13 +0000)]

powerpc/kuap: Force inlining of all first level KUAP helpers.

All KUAP helpers defined in asm/kup.h are single line functions
that should be inlined. But on book3s/32 build, we get many
instances of <prevent_write_to_user.constprop.0>.

Force inlining of those helpers.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8479a862e165a57a855292d47e24c259a578f5a0.1622711627.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 08:41:48 +0000 (08:41 +0000)]

powerpc/kuap: Remove to/from/size parameters of prevent_user_access()

prevent_user_access() doesn't use anymore to/from/size parameters.

Remove them.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b7113662fd2c26e4c33e9d705de324bd3860822e.1622708530.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 08:41:46 +0000 (08:41 +0000)]

powerpc/kuap: Remove KUAP_CURRENT_XXX

book3s/32 was the only user of KUAP_CURRENT_XXX.

After rework of book3s/32 KUAP, it is not used anymore.

Remove them.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/549214ecf6887d965645e664520d4886663c5ffb.1622708530.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 08:41:45 +0000 (08:41 +0000)]

powerpc/32s: Activate KUAP and KUEP by default

Now that KUAP and KUEP have been significantly optimised and can be
disabled at boot time using 'nosmap' and 'nosmep' kernel parameters,
them can be active by default like in other powerpc platforms.

It is still possible to disable them completely in the configuration.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/86c7c74a3ba5312daea7e9658b096e2bcc6f4b64.1622708530.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 08:41:44 +0000 (08:41 +0000)]

powerpc/32s: Rework Kernel Userspace Access Protection

On book3s/32, KUAP is provided by toggling Ks bit in segment registers.
One segment register addresses 256M of virtual memory.

At the time being, KUAP implements a complex logic to apply the
unlock/lock on the exact number of segments covering the user range
to access, with saving the boundaries of the range of segments in
a member of thread struct.

But most if not all user accesses are within a single segment.

Rework KUAP with a different approach:
- Open only one segment, the one corresponding to the starting
address of the range to be accessed.
- If a second segment is involved, it will generate a page fault. The
segment will then be open by the page fault handler.

The kuap member of thread struct will now contain:
- The start address of the current on going user access, that will be
used to know which segment to lock at the end of the user access.
- ~0 when no user access is open
- ~1 when additionnal segments are opened by a page fault.

Then, at lock time
- When only one segment is open, close it.
- When several segments are open, close all user segments.

Almost 100% of the time, only one segment will be involved.

In interrupts, inline the function that unlock/lock all segments,
because not inlining them implies a lot of register save/restore.

With the patch, writing value 128 in userspace in perf_copy_attr() is
done with 16 instructions:

    3890: 93 82 04 dc stw     r28,1244(r2)
    3894: 7d 20 e5 26 mfsrin  r9,r28
    3898: 55 29 00 80 rlwinm  r9,r9,0,2,0
    389c: 7d 20 e1 e4 mtsrin  r9,r28
    38a0: 4c 00 01 2c isync

    38a4: 39 20 00 80 li      r9,128
    38a8: 91 3c 00 00 stw     r9,0(r28)

    38ac: 81 42 04 dc lwz     r10,1244(r2)
    38b0: 39 00 ff ff li      r8,-1
    38b4: 91 02 04 dc stw     r8,1244(r2)
    38b8: 2c 0a ff fe cmpwi   r10,-2
    38bc: 41 82 00 88 beq     3944 <perf_copy_attr+0x36c>
    38c0: 7d 20 55 26 mfsrin  r9,r10
    38c4: 65 29 40 00 oris    r9,r9,16384
    38c8: 7d 20 51 e4 mtsrin  r9,r10
    38cc: 4c 00 01 2c isync
...
    3944: 48 00 00 01 bl      3944 <perf_copy_attr+0x36c>
3944: R_PPC_REL24 kuap_lock_all_ool

Before the patch it was 118 instructions. In reality only 42 are
executed in most cases, but GCC is not able to see that a properly
aligned user access cannot involve more than one segment.

    5060: 39 1d 00 04 addi    r8,r29,4
    5064: 3d 20 b0 00 lis     r9,-20480
    5068: 7c 08 48 40 cmplw   r8,r9
    506c: 40 81 00 08 ble     5074 <perf_copy_attr+0x2cc>
    5070: 3d 00 b0 00 lis     r8,-20480
    5074: 39 28 ff ff addi    r9,r8,-1
    5078: 57 aa 00 06 rlwinm  r10,r29,0,0,3
    507c: 55 29 27 3e rlwinm  r9,r9,4,28,31
    5080: 39 29 00 01 addi    r9,r9,1
    5084: 7d 29 53 78 or      r9,r9,r10
    5088: 91 22 04 dc stw     r9,1244(r2)
    508c: 7d 20 ed 26 mfsrin  r9,r29
    5090: 55 29 00 80 rlwinm  r9,r9,0,2,0
    5094: 7c 08 50 40 cmplw   r8,r10
    5098: 40 81 00 c0 ble     5158 <perf_copy_attr+0x3b0>
    509c: 7d 46 50 f8 not     r6,r10
    50a0: 7c c6 42 14 add     r6,r6,r8
    50a4: 54 c6 27 be rlwinm  r6,r6,4,30,31
    50a8: 7d 20 51 e4 mtsrin  r9,r10
    50ac: 3c ea 10 00 addis   r7,r10,4096
    50b0: 39 29 01 11 addi    r9,r9,273
    50b4: 7f 88 38 40 cmplw   cr7,r8,r7
    50b8: 55 29 02 06 rlwinm  r9,r9,0,8,3
    50bc: 40 9d 00 9c ble     cr7,5158 <perf_copy_attr+0x3b0>

    50c0: 2f 86 00 00 cmpwi   cr7,r6,0
    50c4: 41 9e 00 4c beq     cr7,5110 <perf_copy_attr+0x368>
    50c8: 2f 86 00 01 cmpwi   cr7,r6,1
    50cc: 41 9e 00 2c beq     cr7,50f8 <perf_copy_attr+0x350>
    50d0: 2f 86 00 02 cmpwi   cr7,r6,2
    50d4: 41 9e 00 14 beq     cr7,50e8 <perf_copy_attr+0x340>
    50d8: 7d 20 39 e4 mtsrin  r9,r7
    50dc: 39 29 01 11 addi    r9,r9,273
    50e0: 3c e7 10 00 addis   r7,r7,4096
    50e4: 55 29 02 06 rlwinm  r9,r9,0,8,3
    50e8: 7d 20 39 e4 mtsrin  r9,r7
    50ec: 39 29 01 11 addi    r9,r9,273
    50f0: 3c e7 10 00 addis   r7,r7,4096
    50f4: 55 29 02 06 rlwinm  r9,r9,0,8,3
    50f8: 7d 20 39 e4 mtsrin  r9,r7
    50fc: 3c e7 10 00 addis   r7,r7,4096
    5100: 39 29 01 11 addi    r9,r9,273
    5104: 7f 88 38 40 cmplw   cr7,r8,r7
    5108: 55 29 02 06 rlwinm  r9,r9,0,8,3
    510c: 40 9d 00 4c ble     cr7,5158 <perf_copy_attr+0x3b0>
    5110: 7d 20 39 e4 mtsrin  r9,r7
    5114: 39 29 01 11 addi    r9,r9,273
    5118: 3c c7 10 00 addis   r6,r7,4096
    511c: 55 29 02 06 rlwinm  r9,r9,0,8,3
    5120: 7d 20 31 e4 mtsrin  r9,r6
    5124: 39 29 01 11 addi    r9,r9,273
    5128: 3c c6 10 00 addis   r6,r6,4096
    512c: 55 29 02 06 rlwinm  r9,r9,0,8,3
    5130: 7d 20 31 e4 mtsrin  r9,r6
    5134: 39 29 01 11 addi    r9,r9,273
    5138: 3c c7 30 00 addis   r6,r7,12288
    513c: 55 29 02 06 rlwinm  r9,r9,0,8,3
    5140: 7d 20 31 e4 mtsrin  r9,r6
    5144: 3c e7 40 00 addis   r7,r7,16384
    5148: 39 29 01 11 addi    r9,r9,273
    514c: 7f 88 38 40 cmplw   cr7,r8,r7
    5150: 55 29 02 06 rlwinm  r9,r9,0,8,3
    5154: 41 9d ff bc bgt     cr7,5110 <perf_copy_attr+0x368>

    5158: 4c 00 01 2c isync
    515c: 39 20 00 80 li      r9,128
    5160: 91 3d 00 00 stw     r9,0(r29)

    5164: 38 e0 00 00 li      r7,0
    5168: 90 e2 04 dc stw     r7,1244(r2)
    516c: 7d 20 ed 26 mfsrin  r9,r29
    5170: 65 29 40 00 oris    r9,r9,16384
    5174: 40 81 00 c0 ble     5234 <perf_copy_attr+0x48c>
    5178: 7d 47 50 f8 not     r7,r10
    517c: 7c e7 42 14 add     r7,r7,r8
    5180: 54 e7 27 be rlwinm  r7,r7,4,30,31
    5184: 7d 20 51 e4 mtsrin  r9,r10
    5188: 3d 4a 10 00 addis   r10,r10,4096
    518c: 39 29 01 11 addi    r9,r9,273
    5190: 7c 08 50 40 cmplw   r8,r10
    5194: 55 29 02 06 rlwinm  r9,r9,0,8,3
    5198: 40 81 00 9c ble     5234 <perf_copy_attr+0x48c>

    519c: 2c 07 00 00 cmpwi   r7,0
    51a0: 41 82 00 4c beq     51ec <perf_copy_attr+0x444>
    51a4: 2c 07 00 01 cmpwi   r7,1
    51a8: 41 82 00 2c beq     51d4 <perf_copy_attr+0x42c>
    51ac: 2c 07 00 02 cmpwi   r7,2
    51b0: 41 82 00 14 beq     51c4 <perf_copy_attr+0x41c>
    51b4: 7d 20 51 e4 mtsrin  r9,r10
    51b8: 39 29 01 11 addi    r9,r9,273
    51bc: 3d 4a 10 00 addis   r10,r10,4096
    51c0: 55 29 02 06 rlwinm  r9,r9,0,8,3
    51c4: 7d 20 51 e4 mtsrin  r9,r10
    51c8: 39 29 01 11 addi    r9,r9,273
    51cc: 3d 4a 10 00 addis   r10,r10,4096
    51d0: 55 29 02 06 rlwinm  r9,r9,0,8,3
    51d4: 7d 20 51 e4 mtsrin  r9,r10
    51d8: 3d 4a 10 00 addis   r10,r10,4096
    51dc: 39 29 01 11 addi    r9,r9,273
    51e0: 7c 08 50 40 cmplw   r8,r10
    51e4: 55 29 02 06 rlwinm  r9,r9,0,8,3
    51e8: 40 81 00 4c ble     5234 <perf_copy_attr+0x48c>
    51ec: 7d 20 51 e4 mtsrin  r9,r10
    51f0: 39 29 01 11 addi    r9,r9,273
    51f4: 3c ea 10 00 addis   r7,r10,4096
    51f8: 55 29 02 06 rlwinm  r9,r9,0,8,3
    51fc: 7d 20 39 e4 mtsrin  r9,r7
    5200: 39 29 01 11 addi    r9,r9,273
    5204: 3c e7 10 00 addis   r7,r7,4096
    5208: 55 29 02 06 rlwinm  r9,r9,0,8,3
    520c: 7d 20 39 e4 mtsrin  r9,r7
    5210: 39 29 01 11 addi    r9,r9,273
    5214: 3c ea 30 00 addis   r7,r10,12288
    5218: 55 29 02 06 rlwinm  r9,r9,0,8,3
    521c: 7d 20 39 e4 mtsrin  r9,r7
    5220: 3d 4a 40 00 addis   r10,r10,16384
    5224: 39 29 01 11 addi    r9,r9,273
    5228: 7c 08 50 40 cmplw   r8,r10
    522c: 55 29 02 06 rlwinm  r9,r9,0,8,3
    5230: 41 81 ff bc bgt     51ec <perf_copy_attr+0x444>

    5234: 4c 00 01 2c isync

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Export the ool handlers to fix build errors]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d9121f96a7c4302946839a0771f5d1daeeb6968c.1622708530.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 08:41:43 +0000 (08:41 +0000)]

powerpc/32s: Allow disabling KUAP at boot time

PPC64 uses MMU features to enable/disable KUAP at boot time.
But feature fixups are applied way too early on PPC32.

Now that all KUAP related actions are in C following the
conversion of KUAP initial setup and context switch in C,
static branches can be used to enable/disable KUAP.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Export disable_kuap_key to fix build errors]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/cd79e8008455fba5395d099f9bb1305c039b931c.1622708530.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 08:41:42 +0000 (08:41 +0000)]

powerpc/32s: Allow disabling KUEP at boot time

PPC64 uses MMU features to enable/disable KUEP at boot time.
But feature fixups are applied way too early on PPC32.

Now that all KUEP related actions are in C following the
conversion of KUEP initial setup and context switch in C,
static branches can be used to enable/disable KUEP.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7745a2c3a08ec46302920a3f48d1cb9b5469dbbb.1622708530.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 08:41:41 +0000 (08:41 +0000)]

powerpc/32s: Initialise KUAP and KUEP in C

In order to selectively activate KUAP and KUEP in a following patch,
perform KUAP and KUEP initialisation in C.

Unlike PPC64, PPC32 doesn't have an early_setup_secondary(),
so do it in start_secondary().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/87be72023448dd4e476744ed279b8c04b8d08a1c.1622708530.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 08:41:40 +0000 (08:41 +0000)]

powerpc/32s: Simplify calculation of segment register content

segment register has VSID on bits 8-31.
Bits 4-7 are reserved, there is no requirement to set them to 0.

VSIDs are calculated from VSID of SR0 by adding 0x111.

Even with highest possible VSID which would be 0xFFFFF0,
adding 16 times 0x111 results in 0x1001100.

So, the reserved bits are never overflowed, no need to clear
the reserved bits after each calculation.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ddc1cfd2ec8f3b2395c6a4d7f2b0c1aa1b1e64fb.1622708530.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 08:41:39 +0000 (08:41 +0000)]

powerpc/32s: Convert switch_mmu_context() to C

switch_mmu_context() does things that can easily be done in C.

For updating user segments, we have update_user_segments().

As mentionned in commit 46d348ea9f96 ("powerpc/32s: Move KUEP
locking/unlocking in C"), update_user_segments() has the loop
unrolled which is a significant performance gain.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/05c0875ad8220c03452c3a334946e207c6ca04d6.1622708530.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 08:41:38 +0000 (08:41 +0000)]

powerpc/32s: move CTX_TO_VSID() into mmu-hash.h

In order to reuse it in switch_mmu_context(), this
patch moves CTX_TO_VSID() macro into asm/book3s/32/mmu-hash.h

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/26b36ef2939234a04b37baf6ffe50cba81f5d1b7.1622708530.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 08:41:37 +0000 (08:41 +0000)]

powerpc/32s: Refactor update of user segment registers

KUEP implements the update of user segment registers.

Move it into mmu-hash.h in order to use it from other places.

And inline kuep_lock() and kuep_unlock(). Inlining kuep_lock() is
important for system_call_exception(), otherwise system_call_exception()
has to save into stack the system call parameters that are used just
after, and doing that takes more instructions than kuep_lock() itself.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/24591ca480d14a62ef910e38a5273d551262c4a2.1622708530.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 08:41:36 +0000 (08:41 +0000)]

powerpc/32s: Move setup_{kuep/kuap}() into {kuep/kuap}.c

Avoids the #ifdef in mmu.c

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0b7a13d414837e58264edc336b89c2fe9f35f9bc.1622708530.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Fri, 4 Jun 2021 04:49:25 +0000 (04:49 +0000)]

powerpc/8xx: Allow disabling KUAP at boot time

PPC64 uses MMU features to enable/disable KUAP at boot time.
But feature fixups are applied way too early on PPC32.

But since commit 3b7972c585ee ("powerpc/32: Manage KUAP in C"),
all KUAP is in C so it is now possible to use static branches.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3dca510ce555335261a47c4799167da698f569c0.1622782111.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Wed, 2 Jun 2021 06:42:10 +0000 (06:42 +0000)]

powerpc/44x: Implement Kernel Userspace Exec Protection (KUEP)

Powerpc 44x has two bits for exec protection in TLBs: one
for user (UX) and one for superviser (SX).

Clear SX on user pages in TLB miss handlers to provide KUEP.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/169310e08152aa1d96c979770291d165ec6896ae.1622616032.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 3 Jun 2021 07:53:49 +0000 (07:53 +0000)]

powerpc: Remove CONFIG_PPC_MMU_NOHASH_32

Since commit Fixes: 7b0ede696b53 ("powerpc/8xx: MM_SLICE is not needed anymore"),
CONFIG_PPC_MMU_NOHASH_32 has not been used.

Remove it.

Reported-by: Tom Rix <trix@redhat.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bf1e074f6fb213a1c4cc4964370bdce4b648d647.1622706812.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 20 May 2021 13:50:49 +0000 (13:50 +0000)]

powerpc/optprobes: use PPC_RAW_ macros

Use PPC_RAW_ macros to simplify the code.

And use PPC_LO/PPC_HI instead of IMM_L/IMM_H which are for
internal use inside ppc-opcode.h

Those macros are self explanatory, comments can go as well.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5a167b8ba4d33a5c09cd504f0c862e25ffe85459.1621516826.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 20 May 2021 13:50:48 +0000 (13:50 +0000)]

powerpc/optprobes: Compact code source a bit.

Now that lines can be up to 100 chars long, minimise the
amount of split lines to increase readability.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8ebbd977ea8cf8d706d82458f2a21acd44562a99.1621516826.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 20 May 2021 13:50:47 +0000 (13:50 +0000)]

powerpc/optprobes: Minimise casts

nip is already an unsigned long, no cast needed.

op_callback_addr and emulate_step_addr are kprobe_opcode_t *.
There value is obtained with ppc_kallsyms_lookup_name() which
returns 'unsigned long', and there values are used create_branch()
which expects 'unsigned long'. So change them to 'unsigned long'
to avoid casting them back and forth.

can_optimize() used p->addr several times as 'unsigned long'.
Use a local 'unsigned long' variable and avoid casting multiple times.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e03192a6d4123242a275e71ce2ba0bb4d90700c1.1621516826.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 20 May 2021 13:50:46 +0000 (13:50 +0000)]

powerpc/inst: Refactor PPC32 and PPC64 versions

ppc_inst() ppc_inst_prefixed() ppc_inst_swab() can easily be made common
to both PPC32 and PPC64.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d54c63dcac6d190e1cc0d2fe3259d6e621928cdf.1621516826.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 20 May 2021 13:50:45 +0000 (13:50 +0000)]

powerpc: Don't use 'struct ppc_inst' to reference instruction location

'struct ppc_inst' is an internal representation of an instruction, but
in-memory instructions are and will remain a table of 'u32' forever.

Replace all 'struct ppc_inst *' used for locating an instruction in
memory by 'u32 *'. This removes a lot of undue casts to 'struct
ppc_inst *'.

It also helps locating ab-use of 'struct ppc_inst' dereference.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Fix ppc_inst_next(), use u32 instead of unsigned int]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7062722b087228e42cbd896e39bfdf526d6a340a.1621516826.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 20 May 2021 13:50:44 +0000 (13:50 +0000)]

powerpc/lib/code-patching: Don't use struct 'ppc_inst' for runnable code in tests.

'struct ppc_inst' is meant to represent an instruction internally, it
is not meant to dereference code in memory.

For testing code patching, use patch_instruction() to properly
write into memory the code to be tested.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d8425fb42a4adebc35b7509f121817eeb02fac31.1621516826.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 20 May 2021 13:50:43 +0000 (13:50 +0000)]

powerpc/lib/code-patching: Make instr_is_branch_to_addr() static

instr_is_branch_to_addr() is only used in code-patching.c

Make it static.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5f6b9c8c83170ed310953eac2f5b14539bfc964a.1621516826.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 20 May 2021 13:50:42 +0000 (13:50 +0000)]

powerpc: Do not dereference code as 'struct ppc_inst' (uprobe, code-patching, feature-fixups)

'struct ppc_inst' is an internal structure to represent an instruction,
it is not directly the representation of that instruction in text code.
It is not meant to map and dereference code.

Dereferencing code directly through 'struct ppc_inst' has two main issues:
- On powerpc, structs are expected to be 8 bytes aligned while code is
spread every 4 byte.
- Should a non prefixed instruction lie at the end of the page and the
following page not be mapped, it would generate a page fault.

In-memory code must be accessed with ppc_inst_read().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c9a1201dd0a66b4a0f91f0fb46d9385cbf030feb.1621516826.git.christophe.leroy@csgroup.eu

commit | commitdiff | tree

Christophe Leroy [Thu, 20 May 2021 13:50:41 +0000 (13:50 +0000)]

powerpc/inst: Avoid pointer dereferencing in ppc_inst_equal()

Avoid casting/dereferencing ppc_inst() as u64* , check each member
of the struct when relevant.

And remove the 0xff initialisation of the suffix for non
prefixed instruction. An instruction with 0xff as a suffix
might be invalid, but still is a prefixed instruction and
has to be considered as this.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d8b155e930b7a9708ca110e8ff0ace6713a7af75.1621516826.git.christophe.leroy@csgroup.eu

Linux Kernel Source Tree.