Nicholas Piggin [Wed, 5 Jul 2017 03:56:27 +0000 (13:56 +1000)]
powerpc: Do not send system reset request through the oops path
A system reset is a request to crash / debug the system rather than
necessarily caused by encountering a BUG. So there is no need to
serialize all CPUs behind the die lock, adding taints to all
subsequent traces beyond the first, breaking console locks, etc.
The system reset is NMI context which has its own printk buffers to
prevent output being interleaved. Then it's better to have all
secondaries print out their debug as quickly as possible and the
primary will flush out all printk buffers during panic().
So remove the 0x100 path from die, and move it into system_reset. Name
the crash/dump reasons "System Reset".
This gives "not tained" traces when crashing an untainted kernel. It
also gives the panic reason as "System Reset" as opposed to "Fatal
exception in interrupt" (or "die oops" for fadump).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Wed, 5 Jul 2017 03:56:26 +0000 (13:56 +1000)]
powerpc/pseries/le: Work around a firmware quirk
Some PowerVM firmware when delivering a system reset interrupt to a
little endian OS will mess up SRR registers. They are byteswapped, and
SRR1 is incorrect. An example from a crash:
It's possible to detect this pattern in SRR1 (that would never happen
in normal operation), and at least fix the NIP. After this patch, the
same interrupt reports NIP properly:
Nicholas Piggin [Wed, 5 Jul 2017 03:56:25 +0000 (13:56 +1000)]
powerpc: Do not call ppc_md.panic in fadump panic notifier
If fadump is not registered, and no other crash or debug handlers are
registered, the powerpc panic handler stops the guest before the
generic panic code can push out debug information to the console.
Currently, system reset injection causes the guest to silently stop.
Stop calling ppc_md.panic in the panic notifier. crash_fadump already
does rtas_os_term() to terminate the guest if fadump is registered.
Remove ppc_md.panic. Move fadump panic notifier into fadump code.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This fixes a couple more bits of fallout from the new hard lockup watchdog
patch.
It restores the required hw_nmi_get_sample_period() function for the
perf watchdog, and removes some function declarations on 64e that are only
defined for 64s. This fixes the 64e build when the hardlockup detector is
enabled.
It restores the default behaviour of disabling the perf watchdog, and also
fixes disabling the 64s watchdog when running as a guest.
Fixes: f5a97cad75 ("powerpc/64s: implement arch-specific hardlockup watchdog") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 25 Aug 2017 04:30:35 +0000 (14:30 +1000)]
powerpc/64s: idle POWER9 can execute stop in virtual mode
The hardware can execute stop in any context, and KVM does not
require real mode because siblings do not share MMU state. This
saves a switch to real-mode when going idle.
Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Tue, 29 Aug 2017 11:40:35 +0000 (21:40 +1000)]
powerpc/64s: Drop no longer used IDLE_STATE_ENTER_SEQ
There are no longer any callers of IDLE_STATE_ENTER_SEQ, all callers
use IDLE_STATE_ENTER_SEQ_NORET. So drop the former.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch, write change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Tue, 29 Aug 2017 11:34:40 +0000 (21:34 +1000)]
powerpc/64s: POWER9 can execute stop without a sync sequence
We don't need to use IDLE_STATE_ENTER_SEQ_NORET on Power9.
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Tue, 29 Aug 2017 11:36:35 +0000 (21:36 +1000)]
powerpc/64s: Move IDLE_STATE_ENTER_SEQ[_NORET] into idle_book3s.S
This macro is only used in idle_book3s.S, move it in there and add a
more descriptive comment.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out of larger patch and write change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 25 Aug 2017 04:30:33 +0000 (14:30 +1000)]
KVM: PPC: Book3S HV: POWER9 does not require secondary thread management
POWER9 CPUs have independent MMU contexts per thread, so KVM does not
need to quiesce secondary threads, so the hwthread_req/hwthread_state
protocol does not have to be used. So patch it away on POWER9, and patch
away the branch from the Linux idle wakeup to kvm_start_guest that is
never used.
Add a warning and error out of kvmppc_grab_hwthread in case it is ever
called on POWER9.
This avoids a hwsync in the idle wakeup path on POWER9.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Paul Mackerras <paulus@ozlabs.org>
[mpe: Use WARN(...) instead of WARN_ON()/pr_err(...)] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:38:03 +0000 (15:38 +1000)]
powerpc/configs/6xx: Switch CONFIG_USB_EHCI_FSL to =m
In commit 78e134fb66b5 ("drivers:usb:fsl:Make fsl ehci drv an
independent driver module"), CONFIG_USB_EHCI_FSL was switched from
built-in to modular. Update the defconfig.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:38:02 +0000 (15:38 +1000)]
powerpc/configs/6xx: Drop no longer needed CONFIG_BT_HCIUART_H4
Since commit 457ec384b02a ("Bluetooth: bpa10x: Use h4_recv_buf helper
for frame reassembly") we no longer need to set CONFIG_BT_HCIUART_H4
in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:38:01 +0000 (15:38 +1000)]
powerpc/configs/6xx: Drop no longer needed CONFIG_NETFILTER_XT_MATCH_SOCKET
Since commit 67d000b30e13 ("netfilter: move socket lookup
infrastructure to nf_socket_ipv{4,6}.c") we no longer need to set
CONFIG_NETFILTER_XT_MATCH_SOCKET in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In commit 2bb7f19bd4cf ("cpufreq: stats: Make the stats code
non-modular"), the CPU_FREQ_STAT code was made non-modular. Our
defconfig still said =m though, which meant we no longer got the
code at all. Switch the defconfig to =y.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:59 +0000 (15:37 +1000)]
powerpc/configs/6xx: Drop no longer needed CONFIG_NF_CONNTRACK_PROC_COMPAT
Since commit f3055fda3412 ("netfilter: remove ip_conntrack* sysctl
compat code") we no longer need to set CONFIG_NF_CONNTRACK_PROC_COMPAT
in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:55 +0000 (15:37 +1000)]
powerpc/configs/6xx: Turn CONFIG_DRM_RADEON back on
In commit 9e3a74f2b512 ("drm: hide legacy drivers with CONFIG_DRM_LEGACY")
CONFIG_DRM_RADEON was moved behind CONFIG_DRM_LEGACY meaning it
stopped being enabled by ppc6xx_defconfig. Although no one has
noticed, given this is basically a legacy platform, it seems anyone
who is using it probably still wants this driver. So turn it back on
for now.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:52 +0000 (15:37 +1000)]
powerpc/configs: Update for CONFIG_INPUT_MOUSEDEV=n
In commit 428e9da8ebc9 ("Input: mousedev - stop offering PS/2 to
userspace by default") the symbol INPUT_MOUSEDEV went from being
'default y' to 'default n' (implied).
That means we no longer need to explicitly disable it in our
defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:50 +0000 (15:37 +1000)]
powerpc/configs: Turn CONFIG_R128 back in pmac32_defconfig
In commit 9e3a74f2b512 ("drm: hide legacy drivers with
CONFIG_DRM_LEGACY") CONFIG_R128 was moved behind CONFIG_DRM_LEGACY
meaning it stopped being enabled by pmac32_defconfig. Although no one
has noticed, given this is basically a legacy platform, it seems
anyone who is using it probably still wants this driver. So turn it
back on for now.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:45 +0000 (15:37 +1000)]
powerpc/configs: Add CONFIG_RAS now required for CONFIG_EDAC
In commit b87e32d5643a ("EDAC: Remove EDAC_MM_EDAC") CONFIG_EDAC grew
a dependency on CONFIG_RAS. Some of our defconfigs don't have the
latter, which means we lose CONFIG_EDAC, so add CONFIG_RAS to fix
that.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:44 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_AUDITSYSCALL
Since commit 1c020e2c2c9d ("audit: always enable syscall auditing when
supported and audit is enabled") we no longer need to set
CONFIG_AUDITSYSCALL in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:43 +0000 (15:37 +1000)]
powerpc/configs: Drop CONFIG_SERIAL_TXX9_* from cell/ppc64
In commit 0f977b272333 ("powerpc: Remove the celleb support") we
dropped the celleb support, which made these symbols unselectable
because we no longer select HAS_TX99_SERIAL. So drop them from the
defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:41 +0000 (15:37 +1000)]
powerpc/configs: Drop unnecessary CONFIG_POWERNV_OP_PANEL
In commit 4b634438fece ("powerpc/powernv: Add driver for operator
panel on FSP machines") we added CONFIG_POWERNV_OP_PANEL=m to the
powernv defconfig, but it's default m so that's no necessary.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:40 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed PCI_MSI on powernv
In commit a8c5e285a7f6 ("powerpc/powernv: Make PCI non-optional") we
made PCI (and therefore PCI_MSI) non-optional on powernv, so it
doesn't need to be in the defconfig anymore.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:39 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_SMP for pseries/ppc64/powernv
In commit 28511e4b4f13 ("powerpc/powernv: Always enable SMP when
building powernv") and 5d3643f6142e ("powerpc/pseries: Always enable
SMP when building pseries") we forced CONFIG_SMP on for some configs.
Therefore we don't need to set it in those configs anymore.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:37 +0000 (15:37 +1000)]
powerpc/configs: Drop unnecessary CONFIG_NUMA_BALANCING_DEFAULT_ENABLED
In commit f90c908dd96f ("powerpc: Enable NUMA balancing in
pseries[_le]_defconfig") we added CONFIG_NUMA_BALANCING_DEFAULT_ENABLED
to our defconfigs. But it's already enabled by default, so drop it.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:36 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_DEVPTS_MULTIPLE_INSTANCES
Since commit b8bc9698d2f4 ("devpts: Make each mount of devpts an
independent filesystem.") we no longer need to set
CONFIG_DEVPTS_MULTIPLE_INSTANCES in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:29 +0000 (15:37 +1000)]
powerpc/configs: Drop no longer needed CONFIG_CRYPTO_DEV_VMX_ENCRYPT
Since commit 0258ccf4849f ("crypto: vmx - Convert to CPU feature based
module autoloading") we no longer need to set
CONFIG_CRYPTO_DEV_VMX_ENCRYPT in our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:28 +0000 (15:37 +1000)]
powerpc/configs: Update for CONFIG_NF_CT_PROTO_(SCTP|UDPLITE)=y
In commit 17d931ceeac1 ("netfilter: conntrack: built-in support for
SCTP"), NF_CT_PROTO_SCTP switched from tristate to bool and became
default y. Similarly in commit f2974481187c ("netfilter: conntrack:
built-in support for UDPlite"), NF_CT_PROTO_UDPLITE switched from
tristate to bool and became default y.
We had a few configs which set them to =m, which is no longer valid.
We don't need to change them to =y because both symbols are default y
and are enabled automatically based on the other symbols in the
affected defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:26 +0000 (15:37 +1000)]
powerpc/configs: Update for CONFIG_DEBUG_FS being selected via CONFIG_RCU_TRACE
In commit 8698132872e7 ("rcu: Enable RCU tracepoints by default to aid
in debugging"), CONFIG_RCU_TRACE was made default y (if CONFIG_TREE_RCU=y,
which it is for some of our configs).
That in turn causes CONFIG_TREE_RCU_TRACE to be enabled, which selects
CONFIG_DEBUG_FS. The end result is that CONFIG_DEBUG_FS is forced on,
meaning we don't have to enable it in some of our configs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 05:37:23 +0000 (15:37 +1000)]
powerpc/configs: Explicitly drop CONFIG_INPUT_MOUSEDEV
In commit 428e9da8ebc9 ("Input: mousedev - stop offering PS/2 to userspace by
default") (Jan 2017), CONFIG_INPUT_MOUSEDEV was switched from default y to
default n, with the explanation:
Evdev interface has been available for many years and by now everyone
is switched to using it, so let's stop offering /dev/input/mouseN
and /dev/psaux by default.
We had a number of configs which had it enabled, but going by the above
explanation probably don't need it enabled anymore.
So drop the last remnants of it from our defconfigs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 13:56:21 +0000 (23:56 +1000)]
powerpc/oops: Print the kernel's endian in the oops
Although the MSR tells you what endian you're in it's possible that
isn't the same endian the kernel was built for, and if that happens
you're usually having a very bad day. So print a marker to make
it 100% clear which endian the kernel was built for.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 13:56:20 +0000 (23:56 +1000)]
powerpc/oops: Fix the oops markers to use pr_cont()
When we oops we print a few markers for significant config options
such as PREEMPT, SMP etc. Currently these appear on separate lines
because we're not using pr_cont() properly. Fix it.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rashmica Gupta [Thu, 1 Jun 2017 05:34:40 +0000 (15:34 +1000)]
Add documentation for the powerpc memtrace debugfs files
CONFIG_PPC_MEMTRACE must be set to use this feature. This can only be
used on powernv platforms.
Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com>
[mpe: Update dates and kernel versions, mention size is in bytes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rashmica Gupta [Thu, 1 Jun 2017 05:34:38 +0000 (15:34 +1000)]
powerpc/powernv: Enable removal of memory for in memory tracing
The hardware trace macro feature requires access to a chunk of real
memory. This patch provides a debugfs interface to do this. By
writing an integer containing the size of memory to be unplugged into
/sys/kernel/debug/powerpc/memtrace/enable, the code will attempt to
remove that much memory from the end of each NUMA node.
This patch also adds additional debugsfs files for each node that
allows the tracer to interact with the removed memory, as well as
a trace file that allows userspace to read the generated trace.
Note that this patch does not invoke the hardware trace macro, it
only allows memory to be removed during runtime for the trace macro
to utilise.
Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com>
[mpe: Minor formatting etc fixups] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This helper is used to detect if a uprobe'd function has returned
through a setjmp/longjmp, rather than branching to the LR that was
updated previously by us. This fixes a SIGSEGV that gets generated when
programs use setjmp/longjmp with uretprobes.
We use the arm64 model (arch/arm64/kernel/probes/uprobes.c:
arch_uretprobe_is_alive()) for detecting when stack frames have been
removed from under us.
Reference:
https://marc.info/?l=linux-kernel&m=143748610330073
commit 8e02fca0c9575 ("uprobes/x86: Reimplement arch_uretprobe_is_alive()")
commit 2763f8c19412c ("uprobes/x86: Make arch_uretprobe_is_alive(RP_CHECK_CALL) more
clever")
Tested with the test program from:
https://sourceware.org/git/gitweb.cgi?p=systemtap.git;a=blob;f=testsuite/systemtap.base/bz5274.c;hb=HEAD
And this script:
$ cat test.sh
#!/bin/bash
perf probe -x ./bz5274 -a bz5274_main_return=main%return
perf probe -x ./bz5274 -a bz5274_funca_return=funca%return
perf probe -x ./bz5274 -a bz5274_funcb_return=funcb%return
perf probe -x ./bz5274 -a bz5274_funcc_return=funcc%return
perf probe -x ./bz5274 -a bz5274_funcd_return=funcd%return
perf record -e 'probe_bz5274:*' -aR ./bz5274
Reported-by: Gustavo Luiz Duarte <gduarte@redhat.com> Reported-by: zsun@redhat.com Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This happens because we're being called with our affinity mask set to
irq_default_affinity. That in turn was populated using
cpumask_setall(), which sets NR_CPUs worth of bits, not nr_cpu_ids
worth. Finally cpumask_weight() will return > nr_cpu_ids when passed a
mask which has > nr_cpu_ids bits set.
Fix it by limiting the value returned by cpumask_weight().
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[mpe: Add change log details on actual cause] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:07 +0000 (02:39 +1000)]
powerpc/64: Optimise set/clear of CTRL[RUN] (runlatch)
On modern CPUs the CTRL register is read-only except bit 63 which is
the run latch control. This means it can be updated with a mtspr
rather than mfspr/mtspr.
To accomodate older CPUs (Cell at least), where there are other bits
in the register, we still do a read/modify/write on pre 2.06 CPUs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Update change log to mention 2.06 workaround] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:04 +0000 (02:39 +1000)]
powerpc/64s: Use the HV handler for external IRQ replay in HV mode on POWER9
POWER9 host external interrupts use the h_virt_irq_common handler, so
use that to replay them rather than using the hardware_interrupt_common
handler. Both call do_IRQ, but using the correct handler reduces
i-cache footprint.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:03 +0000 (02:39 +1000)]
powerpc/64s: Merge HV and non-HV paths for doorbell IRQ replay
This results in smaller code, and fewer branches. This relies on the
fact that both the 0xe80 and 0xa00 handlers call the same upper level
code, namely doorbell_exception().
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Mention we rely on the implementation of the 0xe80/0xa00 handlers] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:02 +0000 (02:39 +1000)]
powerpc/64: Cleanup __check_irq_replay()
Move the clearing of irq_happened bits into the condition where they
were found to be set. This reduces instruction count slightly, and
reduces stores into irq_happened.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Nicholas Piggin [Fri, 11 Aug 2017 16:39:01 +0000 (02:39 +1000)]
powerpc/64s: masked_interrupt() returns to kernel so avoid restoring r13
Places in the kernel where r13 is not the PACA pointer must have
maskable interrupts disabled, so r13 does not have to be restored when
returning from a soft-masked interrupt. We should never have
interrupts soft disabled when we're in user space.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
It's too big to be inline, there is no reason to keep it
that way.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Rework to incorporate the comment changes via fixes branch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
powerpc/mm: Optimize detection of thread local mm's
Instead of comparing the whole CPU mask every time, let's
keep a counter of how many bits are set in the mask. Thus
testing for a local mm only requires testing if that counter
is 1 and the current CPU bit is set in the mask.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Tue, 22 Aug 2017 01:51:37 +0000 (11:51 +1000)]
powerpc/64s: Fix replay interrupt return label name
In __replay_interrupt() we take the address of a local label so we can
return to it later. However the assembler turns the local label into a
symbol with a name like ".L1^B42" - where "^B" is literally "\002".
This does not make for pleasant stack traces. Fix it by giving the
label a sensible name.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rob Herring [Mon, 21 Aug 2017 15:16:49 +0000 (10:16 -0500)]
powerpc: pseries: remove dlpar_attach_node dependency on full path
In preparation to stop storing the full node path in full_name, remove the
dependency on full_name from dlpar_attach_node(). Callers of
dlpar_attach_node() already have the parent device_node, so just pass the
parent node into dlpar_attach_node instead of the path. This avoids doing
a lookup of the parent node by the path.
Signed-off-by: Rob Herring <robh@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rob Herring [Mon, 21 Aug 2017 15:16:47 +0000 (10:16 -0500)]
powerpc: Convert to using %pOF instead of full_name
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Anatolij Gustschin <agust@denx.de> Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linuxppc-dev@lists.ozlabs.org Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Tue, 22 Aug 2017 05:14:50 +0000 (15:14 +1000)]
powerpc/vio: Use device_type to detect family
Currently in the vio.c code we use a comparision against the parent
device node's full path to decide if the device is a PFO or VIO family
device.
Both the ibm,platform-facilities and vdevice nodes are defined by PAPR,
and must have a matching device_type. So instead of using the path we
can instead compare the device_type.
I've checked Qemu and kvmtool both do this correctly, and all the
PowerVM systems I have access to do also. So it seems to be safe.
This removes the dependency on full_name, which is being removed
upstream.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Wed, 23 Aug 2017 12:20:10 +0000 (22:20 +1000)]
Merge branch 'fixes' into next
There's a non-trivial dependency between some commits we want to put in
next and the KVM prefetch work around that went into fixes. So merge
fixes into next.
Arvind Yadav [Mon, 19 Jun 2017 05:44:25 +0000 (11:14 +0530)]
macintosh/rack-meter: Make of_device_ids const
of_device_ids are not supposed to change at runtime. All functions
working with of_device_ids provided by <linux/of.h> work with const
of_device_ids. So mark the non-const structs as const.
File size before:
text data bss dec hex filename
407 576 0 983 3d7 drivers/macintosh/rack-meter.o
File size after constify rackmeter_match.
text data bss dec hex filename
807 176 0 983 3d7 drivers/macintosh/rack-meter.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There is no guarantee that the various isync's involved with
the context switch will order the update of the CPU mask with
the first TLB entry for the new context being loaded by the HW.
Be safe here and add a memory barrier to order any subsequent
load/store which may bring entries into the TLB.
The corresponding barrier on the other side already exists as
pte updates use pte_xchg() which uses __cmpxchg_u64 which has
a sync after the atomic operation.
Cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add comments in the code] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
powerpc/mm/cxl: Add the fault handling cpu to mm cpumask
We use mm cpumask for serializing against lockless page table walk.
Anybody who is doing a lockless page table walk is expected to disable
irq and only cpus in mm cpumask is expected do the lockless walk. This
ensure that a THP split can send IPI to only cpus in the mm cpumask,
to make sure there are no parallel lockless page table walk.
Add the CAPI fault handling cpu to the mm cpumask so that we can do
the lockless page table walk while inserting hash page table entries.
powerpc/mm: Don't send IPI to all cpus on THP updates
Now that we made sure that lockless walk of linux page table is mostly
limitted to current task(current->mm->pgdir) we can update the THP
update sequence to only send IPI to CPUs on which this task has run.
This helps in reducing the IPI overload on systems with large number
of CPUs.
WRT kvm even though kvm is walking page table with vpc->arch.pgdir,
it is done only on secondary CPUs and in that case we have primary CPU
added to task's mm cpumask. Sending an IPI to primary will force the
secondary to do a vm exit and hence this mm cpumask usage is safe
here.
WRT CAPI, we still end up walking linux page table with capi context
MM. For now the pte lookup serialization sends an IPI to all CPUs in
CPI is in use. We can further improve this by adding the CAPI
interrupt handling CPU to task mm cpumask. That will be done in a
later patch.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Michael Ellerman [Thu, 17 Aug 2017 13:14:17 +0000 (23:14 +1000)]
Merge branch 'topic/ppc-kvm' into next
Bring in the commit to rename find_linux_pte_or_hugepte() which touches
arch and KVM code, and might need to be merged with the kvmppc tree to
avoid conflicts.
Add newer helpers to make the function usage simpler. It is always
recommended to use find_current_mm_pte() for walking the page table.
If we cannot use find_current_mm_pte(), it should be documented why
the said usage of __find_linux_pte() is safe against a parallel THP
split.
For now we have KVM code using __find_linux_pte(). This is because kvm
code ends up calling __find_linux_pte() in real mode with MSR_EE=0 but
with PACA soft_enabled = 1. We may want to fix that later and make
sure we keep the MSR_EE and PACA soft_enabled in sync. When we do that
we can switch kvm to use find_linux_pte().
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Geoff Levand [Mon, 7 Aug 2017 20:09:20 +0000 (20:09 +0000)]
block/ps3vram: Check return of ps3vram_cache_init
Cc: Markus Elfring <elfring@users.sourceforge.net> Cc: Jim Paris <jim@jtan.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Sam Bobroff [Thu, 17 Aug 2017 01:06:47 +0000 (11:06 +1000)]
selftests/powerpc: Improve tm-resched-dscr
The tm-resched-dscr self test can, in some situations, run for
several minutes before being successfully interrupted by the context
switch it needs in order to perform the test. This often seems to
occur when the test is being run in a virtual machine.
Improve the test by running it under eat_cpu() to guarantee
contention for the CPU and increase the chance of a context switch.
In practice this seems to reduce the test time, in some cases, from
more than two minutes to under a second.
Also remove the "progress dots" so that if the test does run for a
long time, it doesn't produce large amounts of unnecessary output.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
nest_imc_refc is a reference count struct, used to track number of
active perf sessions using the nest units.
Currently the code accesses nest_imc_refc using node_id, which is
incorrect, the array is indexed by node number. Meaning in the case of
sparse node ids we index off the end of the array.
Fix it to use get_nest_pmu_ref() which uses the existing per-cpu
variable local_nest_imc_refc.