tools arch x86: Sync asm/cpufeatures.h with the with the kernel
To pick up the changes in:
6dbbf5ec9e1e ("x86/cpufeatures: Enumerate user wait instructions") b302e4b176d0 ("x86/cpufeatures: Enumerate the new AVX512 BFLOAT16 instructions") acec0ce081de ("x86/cpufeatures: Combine word 11 and 12 into a new scattered features word") cbb99c0f5887 ("x86/cpufeatures: Add FDP_EXCPTN_ONLY and ZERO_FCS_FDS")
That don't affect anything in tools/.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Aaron Lewis <aaronlewis@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/n/tip-y60wnyg2fuxi0hx7icruo9po@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools build: Check if gettid() is available before providing helper
Laura reported that the perf build failed in fedora when we got a glibc
that provides gettid(), which I reproduced using fedora rawhide with the
glibc-devel-2.29.9000-26.fc31.x86_64 package.
Add a feature check to avoid providing a gettid() helper in such
systems.
On a fedora rawhide system with this patch applied we now get:
[acme@quaco perf]$ grep gettid /tmp/build/perf/FEATURE-DUMP
feature-gettid=0
[acme@quaco perf]$ cat /tmp/build/perf/feature/test-gettid.make.output
test-gettid.c: In function ‘main’:
test-gettid.c:8:9: error: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Werror=implicit-function-declaration]
return gettid();
^~~~~~
getgid
cc1: all warnings being treated as errors
[acme@quaco perf]$
Reported-by: Laura Abbott <labbott@redhat.com> Tested-by: Laura Abbott <labbott@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lkml.kernel.org/n/tip-yfy3ch53agmklwu9o7rlgf9c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Fri, 31 May 2019 13:13:21 +0000 (15:13 +0200)]
perf jvmti: Address gcc string overflow warning for strncpy()
We are getting false positive gcc warning when we compile with gcc9 (9.1.1):
CC jvmti/libjvmti.o
In file included from /usr/include/string.h:494,
from jvmti/libjvmti.c:5:
In function ‘strncpy’,
inlined from ‘copy_class_filename.constprop’ at jvmti/libjvmti.c:166:3:
/usr/include/bits/string_fortified.h:106:10: error: ‘__builtin_strncpy’ specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
106 | return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
jvmti/libjvmti.c: In function ‘copy_class_filename.constprop’:
jvmti/libjvmti.c:165:26: note: length computed here
165 | size_t file_name_len = strlen(file_name);
| ^~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
As per Arnaldo's suggestion use strlcpy(), which does the same thing and keeps
gcc silent.
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190531131321.GB1281@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf python: Remove -fstack-protector-strong if clang doesn't have it
Some distros put -fstack-protector-strong in the compiler flags to be
used to build python extensions, but then, the clang version in that
distro doesn't know about that, only gcc does.
Check if that is the case and remove it from the set of options used to
build the python binding with clang.
Case at hand:
oraclelinux:7
$ head -2 /etc/os-release
NAME="Oracle Linux Server"
VERSION="7.6"
$ grep stack-protector /usr/lib64/python2.7/_sysconfigdata.py | head -1 | cut -c-120
'CFLAGS': '-fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --para
$
gcc version 4.8.5 20150623 (Red Hat 4.8.5-36.0.1) (GCC)
clang version 3.4.2 (tags/RELEASE_34/dot2-final)
clang: error: unknown argument: '-fstack-protector-strong'
clang: error: unknown argument: '-fstack-protector-strong'
error: command 'clang' failed with exit status 1
cp: cannot stat '/tmp/build/perf/python_ext_build/lib/perf*.so': No such file or directory
make[2]: *** [/tmp/build/perf/python/perf.so] Error 1
perf annotate TUI browser: Do not use member from variable within its own initialization
Some compilers will complain when using a member of a struct to
initialize another member, in the same struct initialization.
For instance:
debian:8 Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
oraclelinux:7 clang version 3.4.2 (tags/RELEASE_34/dot2-final)
Produce:
ui/browsers/annotate.c:104:12: error: variable 'ops' is uninitialized when used within its own initialization [-Werror,-Wuninitialized]
(!ops.current_entry ||
^~~
1 error generated.
So use an extra variable, initialized just before that struct, to have
the value used in the expressions used to init two of the struct
members.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: c298304bd747 ("perf annotate: Use a ops table for annotation_line__write()") Link: https://lkml.kernel.org/n/tip-f9nexro58q62l3o9hez8hr0i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'for-linus-20190706' of git://git.kernel.dk/linux-block
Pull block fix from Jens Axboe:
"Just a single fix for a patch from Greg KH, which reportedly break
block debugfs locations for certain setups. Trivial enough that I
think we should include it now, rather than wait and release 5.2 with
it, since it's a regression in this series"
* tag 'for-linus-20190706' of git://git.kernel.dk/linux-block:
blk-mq: fix up placement of debugfs directory of queue files
Merge tag 'mips_fixes_5.2_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
"A few more MIPS fixes:
- Fix a silly typo in virt_addr_valid which led to completely bogus
behavior (that happened to stop tripping up hardened usercopy
despite being broken).
- Fix UART parity setup on AR933x systems.
- A build fix for non-Linux build machines.
- Have the 'all' make target build DTBs, primarily to fit in with the
behavior of scripts/package/builddeb.
- Handle an execution hazard in TLB exceptions that use KScratch
registers, which could inadvertently clobber the $1 register on
some generally higher-end out-of-order CPUs.
- A MAINTAINERS update to fix the path to the NAND driver for Ingenic
systems"
* tag 'mips_fixes_5.2_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MAINTAINERS: Correct path to moved files
MIPS: Add missing EHB in mtc0 -> mfc0 sequence.
MIPS: have "plain" make calls build dtbs for selected platforms
MIPS: fix build on non-linux hosts
MIPS: ath79: fix ar933x uart parity mode
MIPS: Fix bounds check virt_addr_valid
perf tests: Fix record+probe_libc_inet_pton.sh for powerpc64
'probe libc's inet_pton & backtrace it with ping' testcase sometimes
fails on powerpc because distro ping binary does not have symbol
information and thus it prints "[unknown]" function name in the
backtrace.
Accept "[unknown]" as valid function name for powerpc as well.
# perf test -v "probe libc's inet_pton & backtrace it with ping"
Before:
59: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 79695
ping 79718 [077] 96483.787025: probe_libc:inet_pton: (7fff83a754c8) 7fff83a754c8 __GI___inet_pton+0x8 (/usr/lib64/power9/libc-2.28.so) 7fff83a2b7a0 gaih_inet.constprop.7+0x1020
(/usr/lib64/power9/libc-2.28.so) 7fff83a2c170 getaddrinfo+0x160 (/usr/lib64/power9/libc-2.28.so) 1171830f4 [unknown] (/usr/bin/ping)
FAIL: expected backtrace entry
".*\+0x[[:xdigit:]]+[[:space:]]\(.*/bin/ping.*\)$"
got "1171830f4 [unknown] (/usr/bin/ping)"
test child finished with -1
---- end ----
probe libc's inet_pton & backtrace it with ping: FAILED!
After:
59: probe libc's inet_pton & backtrace it with ping :
--- start ---
test child forked, pid 79085
ping 79108 [045] 96400.214177: probe_libc:inet_pton: (7fffbb9654c8) 7fffbb9654c8 __GI___inet_pton+0x8 (/usr/lib64/power9/libc-2.28.so) 7fffbb91b7a0 gaih_inet.constprop.7+0x1020
(/usr/lib64/power9/libc-2.28.so) 7fffbb91c170 getaddrinfo+0x160 (/usr/lib64/power9/libc-2.28.so) 132e830f4 [unknown] (/usr/bin/ping)
test child finished with 0
---- end ----
probe libc's inet_pton & backtrace it with ping: Ok
Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com> Reviewed-by: Kim Phillips <kim.phillips@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan@linux.ibm.com> Fixes: 1632936480a5 ("perf tests: Fix record+probe_libc_inet_pton.sh without ping's debuginfo") Link: http://lkml.kernel.org/r/1561630614-3216-1-git-send-email-s1seetee@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 3 Jul 2019 08:09:49 +0000 (10:09 +0200)]
perf evsel: Do not rely on errno values for precise_ip fallback
Konstantin reported problem with default perf record command, which
fails on some AMD servers, because of the default maximum precise
config.
The current fallback mechanism counts on getting ENOTSUP errno for
precise_ip fails, but that's not the case on some AMD servers.
We can fix this by removing the errno check completely, because the
precise_ip fallback is separated. We can just try (if requested by
evsel->precise_max) all possible precise_ip, and if one succeeds we win,
if not, we continue with standard fallback.
Reported-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin.monnet@netronome.com> Cc: Kim Phillips <kim.phillips@amd.com> Link: http://lkml.kernel.org/r/20190703080949.10356-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf thread: Allow references to thread objects after machine__exit()
Threads are created when we either synthesize PERF_RECORD_FORK events
for pre-existing threads or when we receive PERF_RECORD_FORK events from
the kernel as new threads get created.
We then keep them in machine->threads[].entries rb trees till when we
receive a PERF_RECORD_EXIT, i.e. that thread terminated.
The thread object has a reference count that is grabbed when, for
instance, we keep that thread referenced in struct hist_entry, in 'perf
report' and 'perf top'.
When we receive a PERF_RECORD_EXIT we remove the thread object from the
rb tree and move it to the corresponding machine->threads[].dead list,
then we do a thread__put(), dropping the reference we had for keeping it
in the rb tree.
In thread__put() we were assuming that when the reference count hit zero
we should remove it from the dead list by simply doing a
list_del_init(&thread->node).
That works well when all the thread lifetime is during the machine that
has the list heads lifetime, since we know that we can do the
list_del_init() and it will update the 'dead' list_head.
But in 'perf sched lat' we were doing:
machine__new() (via perf_session__new)
process events, grabbing refcounts to keep those thread objects
in 'perf sched' local data structures.
machine__exit() (via perf_session__delete) which would delete the
'dead' list heads.
And then doing the final thread__put() for the refcounts 'perf sched'
rightfully obtained for keeping those thread object references.
b00m, since thread__put() would do the list_del_init() touching
a dead dead list head.
Fix it by removing all the dead threads from machine->threads[].dead at
machine__exit(), since whatever is there should have refcounts taken by
things like 'perf sched lat', and make thread__put() check if the thread
is in a linked list before removing it from that list.
Song Liu [Thu, 20 Jun 2019 01:04:53 +0000 (18:04 -0700)]
perf header: Assign proper ff->ph in perf_event__synthesize_features()
bpf/btf write_* functions need ff->ph->env.
With this missing, pipe-mode (perf record -o -) would crash like:
Program terminated with signal SIGSEGV, Segmentation fault.
This patch assign proper ph value to ff.
Committer testing:
(gdb) run record -o -
Starting program: /root/bin/perf record -o -
PERFILE2
<SNIP start of perf.data headers>
Thread 1 "perf" received signal SIGSEGV, Segmentation fault.
__do_write_buf (size=4, buf=0x160, ff=0x7fffffff8f80) at util/header.c:126
126 memcpy(ff->buf + ff->offset, buf, size);
(gdb) bt
#0 __do_write_buf (size=4, buf=0x160, ff=0x7fffffff8f80) at util/header.c:126
#1 do_write (ff=ff@entry=0x7fffffff8f80, buf=buf@entry=0x160, size=4) at util/header.c:137
#2 0x00000000004eddba in write_bpf_prog_info (ff=0x7fffffff8f80, evlist=<optimized out>) at util/header.c:912
#3 0x00000000004f69d7 in perf_event__synthesize_features (tool=tool@entry=0x97cc00 <record>, session=session@entry=0x7fffe9c6d010,
evlist=0x7fffe9cae010, process=process@entry=0x4435d0 <process_synthesized_event>) at util/header.c:3695
#4 0x0000000000443c79 in record__synthesize (tail=tail@entry=false, rec=0x97cc00 <record>) at builtin-record.c:1214
#5 0x0000000000444ec9 in __cmd_record (rec=0x97cc00 <record>, argv=<optimized out>, argc=0) at builtin-record.c:1435
#6 cmd_record (argc=0, argv=<optimized out>) at builtin-record.c:2450
#7 0x00000000004ae3e9 in run_builtin (p=p@entry=0x98e058 <commands+216>, argc=argc@entry=3, argv=0x7fffffffd670) at perf.c:304
#8 0x000000000042eded in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:356
#9 run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:400
#10 main (argc=3, argv=<optimized out>) at perf.c:522
(gdb)
After the patch the SEGSEGV is gone.
Reported-by: David Carrillo Cisneros <davidca@fb.com> Signed-off-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kernel-team@fb.com Cc: stable@vger.kernel.org # v5.1+ Fixes: 606f972b1361 ("perf bpf: Save bpf_prog_info information as headers to perf.data") Link: http://lkml.kernel.org/r/20190620010453.4118689-1-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools arch kvm: Sync kvm headers with the kernel sources
To pick up the changes from:
41040cf7c5f0 ("arm64/sve: Fix missing SVE/FPSIMD endianness conversions") 6ca00dfafda7 ("KVM: x86: Modify struct kvm_nested_state to have explicit fields for data")
None entail changes in tooling.
This silences these tools/perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Liran Alon <liran.alon@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Link: https://lkml.kernel.org/n/tip-1cdbq5ulr4d6cx3iv2ye5wdv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two iscsi fixes.
One for an oops in the client which can be triggered by the server
authentication protocol and the other in the target code which causes
data corruption"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: iscsi: set auth_protocol back to NULL if CHAP_A value is not supported
scsi: target/iblock: Fix overrun in WRITE SAME emulation
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixlet from Al Viro:
"Fix bogus default y in Kconfig (VALIDATE_FS_PARSER)
That thing should not be turned on by default, especially since it's
not quiet in case it finds no problems. Geert has sent the obvious fix
quite a few times, but it fell through the cracks"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: VALIDATE_FS_PARSER should default to n
blk-mq: fix up placement of debugfs directory of queue files
When the blk-mq debugfs file creation logic was "cleaned up" it was
cleaned up too much, causing the queue file to not be created in the
correct location. Turns out the check for the directory being present
is needed as if that has not happened yet, the files should not be
created, and the function will be called later on in the initialization
code so that the files can be created in the correct location.
Fixes: 6cfc0081b046 ("blk-mq: no need to check return value of debugfs_create functions") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: linux-block@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"x86 bugfix patches and one compilation fix for ARM"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: arm64/sve: Fix vq_present() macro to yield a bool
KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC
KVM: nVMX: Change KVM_STATE_NESTED_EVMCS to signal vmcs12 is copied from eVMCS
KVM: nVMX: Allow restore nested-state to enable eVMCS when vCPU in SMM
KVM: x86: degrade WARN to pr_warn_ratelimited
Merge tag 'mtd/fixes-for-5.2-final' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtf fixes from Miquel Raynal:
- Fix the memory organization structure of a Macronix SPI-NAND chip.
- Fix a build dependency wrongly described.
- Fix the sunxi NAND driver for A23/A33 SoCs by (a) reverting the
faulty commit introducing broken DMA support and (b) applying another
commit bringing working DMA support.
* tag 'mtd/fixes-for-5.2-final' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: rawnand: sunxi: Add A23/A33 DMA support with extra MBUS configuration
Revert "mtd: rawnand: sunxi: Add A23/A33 DMA support"
mtd: rawnand: ingenic: Fix ingenic_ecc dependency
mtd: spinand: Fix max_bad_eraseblocks_per_lun info in memorg
Merge tag 'nfsd-5.2-2' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
"Two more quick bugfixes for nfsd: fixing a regression causing mount
failures on high-memory machines and fixing the DRC over RDMA"
* tag 'nfsd-5.2-2' of git://linux-nfs.org/~bfields/linux:
nfsd: Fix overflow causing non-working mounts on 1 TB machines
svcrdma: Ignore source port when computing DRC hash
mtd: rawnand: sunxi: Add A23/A33 DMA support with extra MBUS configuration
Allwinner NAND controllers can make use of DMA to enhance the I/O
throughput thanks to ECC pipelining. DMA handling with A23/A33 NAND IP
is a bit different than with the older SoCs, hence the introduction of
a new compatible to handle:
* the differences between register offsets,
* the burst length change from 4 to minimum 8,
* manage SRAM accesses through MBUS with extra configuration.
Dmitry Osipenko [Sun, 23 Jun 2019 17:46:55 +0000 (20:46 +0300)]
i2c: tegra: Add Dmitry as a reviewer
I'm contributing to Tegra's upstream development in general and happened
to review the Tegra's I2C patches for awhile because I'm actively using
upstream kernel on all of my Tegra-powered devices and initially some of
the submitted patches were getting my attention since they were causing
problems. Recently Wolfram Sang asked whether I'm interested in becoming
a reviewer for the driver and I don't mind at all.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
[wsa: ack was expressed by Thierry Reding in a mail thread] Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
CONFIG_VALIDATE_FS_PARSER is a debugging tool to check that the parser
tables are vaguely sane. It was set to default to 'Y' for the moment to
catch errors in upcoming fs conversion development.
Make sure it is not enabled by default in the final release of v5.1.
Zhang Lei [Wed, 3 Jul 2019 17:42:50 +0000 (18:42 +0100)]
KVM: arm64/sve: Fix vq_present() macro to yield a bool
The original implementation of vq_present() relied on aggressive
inlining in order for the compiler to know that the code is
correct, due to some const-casting issues. This was causing sparse
and clang to complain, while GCC compiled cleanly.
Commit 0c529ff789bc addressed this problem, but since vq_present()
is no longer a function, there is now no implicit casting of the
returned value to the return type (bool).
In set_sve_vls(), this uncast bit value is compared against a bool,
and so may spuriously compare as unequal when both are nonzero. As
a result, KVM may reject valid SVE vector length configurations as
invalid, and vice versa.
Fix it by forcing the returned value to a bool.
Signed-off-by: Zhang Lei <zhang.lei@jp.fujitsu.com> Fixes: 0c529ff789bc ("KVM: arm64: Implement vq_present() as a macro") Signed-off-by: Dave Martin <Dave.Martin@arm.com> [commit message rewrite] Cc: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
One space is left unused in circular FIFO to differentiate
'full' and 'empty' cases. So take that in to account while
counting for the descriptors completed.
Fixes the issue reported here,
https://lkml.org/lkml/2019/6/18/669
Robin Gong [Fri, 21 Jun 2019 08:23:06 +0000 (16:23 +0800)]
dmaengine: imx-sdma: remove BD_INTR for channel0
It is possible for an irq triggered by channel0 to be received later
after clks are disabled once firmware loaded during sdma probe. If
that happens then clearing them by writing to SDMA_H_INTR won't work
and the kernel will hang processing infinite interrupts. Actually,
don't need interrupt triggered on channel0 since it's pollling
SDMA_H_STATSTOP to know channel0 done rather than interrupt in
current code, just clear BD_INTR to disable channel0 interrupt to
avoid the above case.
This issue was brought by commit 1d069bfa3c78 ("dmaengine: imx-sdma:
ack channel 0 IRQ in the interrupt handler") which didn't take care
the above case.
Fixes: 1d069bfa3c78 ("dmaengine: imx-sdma: ack channel 0 IRQ in the interrupt handler") Cc: stable@vger.kernel.org #5.0+ Signed-off-by: Robin Gong <yibin.gong@nxp.com> Reported-by: Sven Van Asbroeck <thesven73@gmail.com> Tested-by: Sven Van Asbroeck <thesven73@gmail.com> Reviewed-by: Michael Olbrich <m.olbrich@pengutronix.de> Signed-off-by: Vinod Koul <vkoul@kernel.org>
dmaengine: imx-sdma: fix use-after-free on probe error path
If probe() fails anywhere beyond the point where
sdma_get_firmware() is called, then a kernel oops may occur.
Problematic sequence of events:
1. probe() calls sdma_get_firmware(), which schedules the
firmware callback to run when firmware becomes available,
using the sdma instance structure as the context
2. probe() encounters an error, which deallocates the
sdma instance structure
3. firmware becomes available, firmware callback is
called with deallocated sdma instance structure
4. use after free - kernel oops !
Solution: only attempt to load firmware when we're certain
that probe() will succeed. This guarantees that the firmware
callback's context will remain valid.
Note that the remove() path is unaffected by this issue: the
firmware loader will increment the driver module's use count,
ensuring that the module cannot be unloaded while the
firmware callback is pending or running.
Signed-off-by: Sven Van Asbroeck <TheSven73@gmail.com> Reviewed-by: Robin Gong <yibin.gong@nxp.com>
[vkoul: fixed braces for if condition] Signed-off-by: Vinod Koul <vkoul@kernel.org>
Dan Carpenter [Mon, 24 Jun 2019 13:49:40 +0000 (16:49 +0300)]
dmaengine: jz4780: Fix an endian bug in IRQ handler
The "pending" variable was a u32 but we cast it to an unsigned long
pointer when we do the for_each_set_bit() loop. The problem is that on
big endian 64bit systems that results in an out of bounds read.
Fixes: 4e4106f5e942 ("dmaengine: jz4780: Fix transfers being ACKed too soon") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Vinod Koul <vkoul@kernel.org>
Merge tag 'drm-fixes-2019-07-05-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"I skipped last week because there wasn't much worth doing, this week
got a few more fixes in.
amdgpu:
- default register value change
- runpm regression fix
- fan control fix
i915:
- fix Ironlake regression
panfrost:
- fix a double free
virtio:
- fix a locking bug
imx:
- crtc disable fixes"
* tag 'drm-fixes-2019-07-05-1' of git://anongit.freedesktop.org/drm/drm:
drm/imx: only send event on crtc disable if kept disabled
drm/imx: notify drm core before sending event during crtc disable
drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context
drm/amdgpu/gfx9: use reset default for PA_SC_FIFO_SIZE
drm/amdgpu: Don't skip display settings in hwmgr_resume()
drm/amd/powerplay: use hardware fan control if no powerplay fan table
drm/panfrost: Fix a double-free error
drm/etnaviv: add missing failure path to destroy suballoc
drm/virtio: move drm_connector_update_edid_property() call
Dave Airlie [Fri, 5 Jul 2019 02:54:48 +0000 (12:54 +1000)]
Merge tag 'imx-drm-fixes-2019-07-04' of git://git.pengutronix.de/git/pza/linux into drm-fixes
drm/imx: fix stale vblank timestamp after a modeset
This series fixes stale vblank timestamps in the first event sent after
a crtc was disabled. The core now is notified via drm_crtc_vblank_off
before sending the last pending event in atomic_disable. If the crtc is
reenabled right away during to a modeset, the event is not sent at all,
as the next vblank will take care of it.
Merge tag 'dax-fix-5.2-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull dax fix from Dan Williams:
"A single dax fix that has been soaking awaiting other fixes under
discussion to join it. As it is getting late in the cycle lets proceed
with this fix and save follow-on changes for post-v5.3-rc1.
- Fix xarray entry association for mixed mappings"
* tag 'dax-fix-5.2-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dax: Fix xarray entry association for mixed mappings
swap_readpage(): avoid blk_wake_io_task() if !synchronous
swap_readpage() sets waiter = bio->bi_private even if synchronous = F,
this means that the caller can get the spurious wakeup after return.
This can be fatal if blk_wake_io_task() does
set_current_state(TASK_RUNNING) after the caller does
set_special_state(), in the worst case the kernel can crash in
do_task_dead().
devm_ioremap_resource() does not currently take 'const' arguments, which
results in a warning from the first driver trying to do it anyway:
drivers/gpio/gpio-amd-fch.c: In function 'amd_fch_gpio_probe':
drivers/gpio/gpio-amd-fch.c:171:49: error: passing argument 2 of 'devm_ioremap_resource' discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
priv->base = devm_ioremap_resource(&pdev->dev, &amd_fch_gpio_iores);
^~~~~~~~~~~~~~~~~~~
Change the prototype to allow it, as there is no real reason not to.
In production we have noticed hard lockups on large machines running
large jobs due to kswaps hoarding lru lock within isolate_lru_pages when
sc->reclaim_idx is 0 which is a small zone. The lru was couple hundred
GiBs and the condition (page_zonenum(page) > sc->reclaim_idx) in
isolate_lru_pages() was basically skipping GiBs of pages while holding
the LRU spinlock with interrupt disabled.
On further inspection, it seems like there are two issues:
(1) If kswapd on the return from balance_pgdat() could not sleep (i.e.
node is still unbalanced), the classzone_idx is unintentionally set
to 0 and the whole reclaim cycle of kswapd will try to reclaim only
the lowest and smallest zone while traversing the whole memory.
(2) Fundamentally isolate_lru_pages() is really bad when the
allocation has woken kswapd for a smaller zone on a very large machine
running very large jobs. It can hoard the LRU spinlock while skipping
over 100s of GiBs of pages.
This patch only fixes (1). (2) needs a more fundamental solution. To
fix (1), in the kswapd context, if pgdat->kswapd_classzone_idx is
invalid use the classzone_idx of the previous kswapd loop otherwise use
the one the waker has requested.
Link: http://lkml.kernel.org/r/20190701201847.251028-1-shakeelb@google.com Fixes: e716f2eb24de ("mm, vmscan: prevent kswapd sleeping prematurely due to mismatched classzone_idx") Signed-off-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Hillf Danton <hdanton@sina.com> Cc: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Biggers [Thu, 4 Jul 2019 22:14:39 +0000 (15:14 -0700)]
fs/userfaultfd.c: disable irqs for fault_pending and event locks
When IOCB_CMD_POLL is used on a userfaultfd, aio_poll() disables IRQs
and takes kioctx::ctx_lock, then userfaultfd_ctx::fd_wqh.lock.
This may have to wait for userfaultfd_ctx::fd_wqh.lock to be released by
userfaultfd_ctx_read(), which in turn can be waiting for
userfaultfd_ctx::fault_pending_wqh.lock or
userfaultfd_ctx::event_wqh.lock.
But elsewhere the fault_pending_wqh and event_wqh locks are taken with
IRQs enabled. Since the IRQ handler may take kioctx::ctx_lock, lockdep
reports that a deadlock is possible.
Fix it by always disabling IRQs when taking the fault_pending_wqh and
event_wqh locks.
Commit ae62c16e105a ("userfaultfd: disable irqs when taking the
waitqueue lock") didn't fix this because it only accounted for the
fd_wqh lock, not the other locks nested inside it.
Link: http://lkml.kernel.org/r/20190627075004.21259-1-ebiggers@kernel.org Fixes: bfe4037e722e ("aio: implement IOCB_CMD_POLL") Signed-off-by: Eric Biggers <ebiggers@google.com> Reported-by: syzbot+fab6de82892b6b9c6191@syzkaller.appspotmail.com Reported-by: syzbot+53c0b767f7ca0dc0c451@syzkaller.appspotmail.com Reported-by: syzbot+a3accb352f9c22041cfa@syzkaller.appspotmail.com Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> [4.19+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/page_alloc.c: fix regression with deferred struct page init
Commit 0e56acae4b4d ("mm: initialize MAX_ORDER_NR_PAGES at a time
instead of doing larger sections") is causing a regression on some
systems when the kernel is booted as Xen dom0.
The system will just hang in early boot.
Reason is an endless loop in get_page_from_freelist() in case the first
zone looked at has no free memory. deferred_grow_zone() is always
returning true due to the following code snipplet:
/* If the zone is empty somebody else may have cleared out the zone */
if (!deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn,
first_deferred_pfn)) {
pgdat->first_deferred_pfn = ULONG_MAX;
pgdat_resize_unlock(pgdat, &flags);
return true;
}
This in turn results in the loop as get_page_from_freelist() is assuming
forward progress can be made by doing some more struct page
initialization.
Link: http://lkml.kernel.org/r/20190620160821.4210-1-jgross@suse.com Fixes: 0e56acae4b4d ("mm: initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections") Signed-off-by: Juergen Gross <jgross@suse.com> Suggested-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'sound-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here are a collection of small fixes for:
- A race with ASoC HD-audio registration
- LINE6 usb-audio memory overwrite by malformed descriptor
- FireWire MIDI handling
- Missing cast for bit shifts in a few USB-audio quirks
- The wrong function calls in minor OSS sequencer code paths
- A couple of HD-audio quirks"
* tag 'sound-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: line6: Fix write on zero-sized buffer
ALSA: hda: Fix widget_mutex incomplete protection
ALSA: firewire-lib/fireworks: fix miss detection of received MIDI messages
ALSA: seq: fix incorrect order of dest_client/dest_ports arguments
ALSA: hda/realtek - Change front mic location for Lenovo M710q
ALSA: usb-audio: fix sign unintended sign extension on left shifts
ALSA: hda/realtek: Add quirks for several Clevo notebook barebones
ptrace: Fix ->ptracer_cred handling for PTRACE_TRACEME
Fix two issues:
When called for PTRACE_TRACEME, ptrace_link() would obtain an RCU
reference to the parent's objective credentials, then give that pointer
to get_cred(). However, the object lifetime rules for things like
struct cred do not permit unconditionally turning an RCU reference into
a stable reference.
PTRACE_TRACEME records the parent's credentials as if the parent was
acting as the subject, but that's not the case. If a malicious
unprivileged child uses PTRACE_TRACEME and the parent is privileged, and
at a later point, the parent process becomes attacker-controlled
(because it drops privileges and calls execve()), the attacker ends up
with control over two processes with a privileged ptrace relationship,
which can be abused to ptrace a suid binary and obtain root privileges.
Fix both of these by always recording the credentials of the process
that is requesting the creation of the ptrace relationship:
current_cred() can't change under us, and current is the proper subject
for access control.
This change is theoretically userspace-visible, but I am not aware of
any code that it will actually break.
Robert Beckett [Tue, 25 Jun 2019 17:59:13 +0000 (18:59 +0100)]
drm/imx: notify drm core before sending event during crtc disable
Notify drm core before sending pending events during crtc disable.
This fixes the first event after disable having an old stale timestamp
by having drm_crtc_vblank_off update the timestamp to now.
This was seen while debugging weston log message:
Warning: computed repaint delay is insane: -8212 msec
This occurred due to:
1. driver starts up
2. fbcon comes along and restores fbdev, enabling vblank
3. vblank_disable_fn fires via timer disabling vblank, keeping vblank
seq number and time set at current value
(some time later)
4. weston starts and does a modeset
5. atomic commit disables crtc while it does the modeset
6. ipu_crtc_atomic_disable sends vblank with old seq number and time
Fixes: a474478642d5 ("drm/imx: fix crtc vblank state regression") Signed-off-by: Robert Beckett <bob.beckett@collabora.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Merge tag 'trace-v5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"This includes three fixes:
- Fix a deadlock from a previous fix to keep module loading and
function tracing text modifications from stepping on each other
(this has a few patches to help document the issue in comments)
- Fix a crash when the snapshot buffer gets out of sync with the main
ring buffer
- Fix a memory leak when reading the memory logs"
* tag 'trace-v5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace/x86: Anotate text_mutex split between ftrace_arch_code_modify_post_process() and ftrace_arch_code_modify_prepare()
tracing/snapshot: Resize spare buffer if size changed
tracing: Fix memory leak in tracing_err_log_open()
ftrace/x86: Add a comment to why we take text_mutex in ftrace_arch_code_modify_prepare()
ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()
Paul Menzel [Wed, 3 Jul 2019 11:28:15 +0000 (13:28 +0200)]
nfsd: Fix overflow causing non-working mounts on 1 TB machines
Since commit 10a68cdf10 (nfsd: fix performance-limiting session
calculation) (Linux 5.1-rc1 and 4.19.31), shares from NFS servers with
1 TB of memory cannot be mounted anymore. The mount just hangs on the
client.
The gist of commit 10a68cdf10 is the change below.
`total_avail` is 8,434,659,328 on the 1 TB machine. `clamp_t()` casts
the values to `int`, which for 32-bit integers can only hold values
−2,147,483,648 (−2^31) through 2,147,483,647 (2^31 − 1).
`avail` (in the function signature) is just 65536, so that no overflow
was happening. Before the commit the assignment would result in 21845,
and `num = 4`.
When using `total_avail`, it is causing the assignment to be 18446744072226137429 (printed as %lu), and `num` is then 4164608182.
My next guess is, that `nfsd_drc_mem_used` is then exceeded, and the
server thinks there is no memory available any more for this client.
Updating the arguments of `clamp_t()` and `min_t()` to `unsigned long`
fixes the issue.
Now, `avail = 65536` (before commit 10a68cdf10 `avail = 21845`), but
`num = 4` remains the same.
Fixes: c54f24e338ed (nfsd: fix performance-limiting session calculation) Cc: stable@vger.kernel.org Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Thomas Gleixner [Wed, 3 Jul 2019 12:19:36 +0000 (14:19 +0200)]
x86/fsgsbase: Revert FSGSBASE support
The FSGSBASE series turned out to have serious bugs and there is still an
open issue which is not fully understood yet.
The confidence in those changes has become close to zero especially as the
test cases which have been shipped with that series were obviously never
run before sending the final series out to LKML.
./fsgsbase_64 >/dev/null
Segmentation fault
As the merge window is close, the only sane decision is to revert FSGSBASE
support. The revert is necessary as this branch has been merged into
perf/core already and rebasing all of that a few days before the merge
window is not the most brilliant idea.
I could definitely slap myself for not noticing the test case fail when
merging that series, but TBH my expectations weren't that low back
then. Won't happen again.
Revert the following commits: 539bca535dec ("x86/entry/64: Fix and clean up paranoid_exit") 2c7b5ac5d5a9 ("Documentation/x86/64: Add documentation for GS/FS addressing mode") f987c955c745 ("x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2") 2032f1f96ee0 ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit") 5bf0cab60ee2 ("x86/entry/64: Document GSBASE handling in the paranoid path") 708078f65721 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit") 79e1932fa3ce ("x86/entry/64: Introduce the FIND_PERCPU_BASE macro") 1d07316b1363 ("x86/entry/64: Switch CR3 before SWAPGS in paranoid entry") f60a83df4593 ("x86/process/64: Use FSGSBASE instructions on thread copy and ptrace") 1ab5f3f7fe3d ("x86/process/64: Use FSBSBASE in switch_to() if available") a86b4625138d ("x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions") 8b71340d702e ("x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions") b64ed19b93c3 ("x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Chang S. Bae <chang.seok.bae@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com>
Eric Biggers [Tue, 2 Jul 2019 21:17:00 +0000 (14:17 -0700)]
crypto: user - prevent operating on larval algorithms
Michal Suchanek reported [1] that running the pcrypt_aead01 test from
LTP [2] in a loop and holding Ctrl-C causes a NULL dereference of
alg->cra_users.next in crypto_remove_spawns(), via crypto_del_alg().
The test repeatedly uses CRYPTO_MSG_NEWALG and CRYPTO_MSG_DELALG.
The crash occurs when the instance that CRYPTO_MSG_DELALG is trying to
unregister isn't a real registered algorithm, but rather is a "test
larval", which is a special "algorithm" added to the algorithms list
while the real algorithm is still being tested. Larvals don't have
initialized cra_users, so that causes the crash. Normally pcrypt_aead01
doesn't trigger this because CRYPTO_MSG_NEWALG waits for the algorithm
to be tested; however, CRYPTO_MSG_NEWALG returns early when interrupted.
Everything else in the "crypto user configuration" API has this same bug
too, i.e. it inappropriately allows operating on larval algorithms
(though it doesn't look like the other cases can cause a crash).
Fix this by making crypto_alg_match() exclude larval algorithms.
cryptd_skcipher_free() fails to free the struct skcipher_instance
allocated in cryptd_create_skcipher(), leading to a memory leak. This
is detected by kmemleak on bootup on ARM64 platforms:
Fixes: 4e0958d19bd8 ("crypto: cryptd - Add support for skcipher") Cc: <stable@vger.kernel.org> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Mon, 24 Jun 2019 10:32:26 +0000 (18:32 +0800)]
lib/mpi: Fix karactx leak in mpi_powm
Sometimes mpi_powm will leak karactx because a memory allocation
failure causes a bail-out that skips the freeing of karactx. This
patch moves the freeing of karactx to the end of the function like
everything else so that it can't be skipped.
Reported-by: syzbot+f7baccc38dcc1e094e77@syzkaller.appspotmail.com Fixes: cdec9cb5167a ("crypto: GnuPG based MPI lib - source files...") Cc: <stable@vger.kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Merge tag 'perf-core-for-mingo-5.3-20190701' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf annotate:
Mao Han:
- Add support for the csky processor architecture.
perf stat:
Andi Kleen:
- Fix metrics with --no-merge.
- Don't merge events in the same PMU.
- Fix group lookup for metric group.
Intel PT:
Adrian Hunter:
- Improve CBR (Core to Bus Ratio) packets support.
- Fix thread stack return from kernel for kernel only case.
- Export power and ptwrite events to sqlite and postgresql.
core libraries:
Arnaldo Carvalho de Melo:
- Find routines in tools/perf/util/ that have implementations in the kernel
libraries (lib/*.c), such as strreplace(), strim(), skip_spaces() and reuse
them after making a copy into tools/lib and tools/include/.
This continues the effort of having tools/ code looking as much as possible
like kernel source code, to help encourage people to work on both the kernel
and in tools hosted in the kernel sources.
That in turn will help moving stuff that uses those routines to
tools/lib/perf/ where they will be made available for use in other tools.
In the process ditch old cruft, remove unused variables and add missing
include directives for headers providing things used in places that were
building by sheer luck.
Kyle Meyer:
- Bump MAX_NR_CPUS and MAX_CACHES to get these tools to work on more machines.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge tag 'for-linus-20190701' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull pidfd fork() fix from Christian Brauner:
"A single small fix for copy_process() in kernel/fork.c:
With Al's removal of ksys_close() from cleanup paths in copy_process()
a bug was introduced. When anon_inode_getfile() failed the cleanup was
correctly performed but the error code was not propagated to callers
of copy_process() causing them to operate on a nonsensical pointer.
The fix is a simple on-liner which makes sure that a proper negative
error code is returned from copy_process().
syzkaller has also verified that the bug is not reproducible with this
fix"
* tag 'for-linus-20190701' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
fork: return proper negative error code
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Fix a build failure with the LLVM linker and a module allocation
failure when KASLR is active:
- Fix module allocation when running with KASLR enabled
- Fix broken build due to bug in LLVM linker (ld.lld)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/efi: Mark __efistub_stext_offset as an absolute symbol explicitly
arm64: kaslr: keep modules inside module region when KASAN is enabled
perf script: Allow specifying the files to process guest samples
The 'perf kvm' command set up things so that we can record, report, top,
etc, but not 'script', so make 'perf script' be able to process samples
by allowing to pass guest kallsyms, vmlinux, modules, etc, and if at
least one of those is provided, set perf_guest to true so that guest
samples get properly resolved.
Testing it:
# perf kvm --guest --guestkallsyms /wb/rhel6.kallsyms --guestmodules /wb/rhel6.modules record -e cycles:Gk
^C[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 3.602 MB perf.data.guest (10492 samples) ]
Jiri Olsa noticed we need to set 'perf_guest' to true if we want to
process guest samples and I made it be set if one of the guest files
settings get set via the command line options added in this patch, that
match those present in the 'perf kvm' command.
We probably want to have 'perf record', 'perf report' etc to notice that
there are guest samples and do the right thing, which is to look for
files with some suffix that make it be associated with the guest used to
collect the samples, i.e. if a vmlinux file is passed, we can get the
build-id from it, if not some other identifier or simply looking for
"kallsyms.guest", for instance, in the current directory.
Reported-by: Mariano Pache <npache@redhat.com> Tested-by: Mariano Pache <npache@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Cc: Ali Raza <alirazabhutta.10@gmail.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Mario <jmario@redhat.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Orran Krieger <okrieger@redhat.com> Cc: Ramkumar Ramachandra <artagnon@gmail.com> Cc: Yunlong Song <yunlong.song@huawei.com> Link: https://lkml.kernel.org/n/tip-d54gj64rerlxcqsrod05biwn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
scsi: iscsi: set auth_protocol back to NULL if CHAP_A value is not supported
If the CHAP_A value is not supported, the chap_server_open() function
should free the auth_protocol pointer and set it to NULL, or we will leave
a dangling pointer around.
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Roman Bolshakov [Tue, 2 Jul 2019 19:16:38 +0000 (22:16 +0300)]
scsi: target/iblock: Fix overrun in WRITE SAME emulation
WRITE SAME corrupts data on the block device behind iblock if the command
is emulated. The emulation code issues (M - 1) * N times more bios than
requested, where M is the number of 512 blocks per real block size and N is
the NUMBER OF LOGICAL BLOCKS specified in WRITE SAME command. So, for a
device with 4k blocks, 7 * N more LBAs gets written after the requested
range.
The issue happens because the number of 512 byte sectors to be written is
decreased one by one while the real bios are typically from 1 to 8 512 byte
sectors per bio.
Fixes: c66ac9db8d4a ("[SCSI] target: Add LIO target core v4.0.0-rc6") Cc: <stable@vger.kernel.org> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
gpio/spi: Fix spi-gpio regression on active high CS
I ran into an intriguing bug caused by
commit ""spi: gpio: Don't request CS GPIO in DT use-case"
affecting all SPI GPIO devices with an active high
chip select line.
The commit switches the CS gpio handling over to the GPIO
core, which will parse and handle "cs-gpios" from the OF
node without even calling down to the driver to get the
job done.
However the GPIO core handles the standard bindings in
Documentation/devicetree/bindings/spi/spi-controller.yaml
that specifies that active high CS needs to be specified
using "spi-cs-high" in the DT node.
The code in drivers/spi/spi-gpio.c never respected this
and never tried to inspect subnodes to see if they contained
"spi-cs-high" like the gpiolib OF quirks does. Instead the
only way to get an active high CS was to tag it in the
device tree using the flags cell such as
cs-gpios = <&gpio 4 GPIO_ACTIVE_HIGH>;
This alters the quirks to not inspect the subnodes of SPI
masters on "spi-gpio" for the standard attribute "spi-cs-high",
making old device trees work as expected.
This semantic is a bit ambigous, but just allowing the
flags on the GPIO descriptor to modify polarity is what
the kernel at large mostly uses so let's encourage that.
Jiri Kosina [Sat, 29 Jun 2019 21:22:33 +0000 (23:22 +0200)]
ftrace/x86: Anotate text_mutex split between ftrace_arch_code_modify_post_process() and ftrace_arch_code_modify_prepare()
ftrace_arch_code_modify_prepare() is acquiring text_mutex, while the
corresponding release is happening in ftrace_arch_code_modify_post_process().
This has already been documented in the code, but let's also make the fact
that this is intentional clear to the semantic analysis tools such as sparse.
Link: http://lkml.kernel.org/r/nycvar.YFH.7.76.1906292321170.27227@cbobk.fhfr.pm Fixes: 39611265edc1a ("ftrace/x86: Add a comment to why we take text_mutex in ftrace_arch_code_modify_prepare()") Fixes: d5b844a2cf507 ("ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()") Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
John Garry [Fri, 28 Jun 2019 14:35:49 +0000 (22:35 +0800)]
perf pmu: Support more complex PMU event aliasing
The jevent "Unit" field is used for uncore PMU alias definition.
The form uncore_pmu_example_X is supported, where "X" is a wildcard, to
support multiple instances of the same PMU in a system.
Unfortunately this format not suitable for all uncore PMUs; take the
Hisi DDRC uncore PMU for example, where the name is in the form
hisi_scclX_ddrcY.
For for current jevent parsing, we would be required to hardcode an
uncore alias translation for each possible value of X. This is not
scalable.
Instead, add support for "Unit" field in the form "hisi_sccl,ddrc",
where we can match by hisi_scclX and ddrcY. Tokens in Unit field are
delimited by ','.
Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1561732552-143038-2-git-send-email-john.garry@huawei.com
[ Shut up older gcc complianing about the last arg to strtok_r() being uninitialized, set that tmp to NULL ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LINE6 drivers allocate the buffers based on the value returned from
usb_maxpacket() calls. The manipulated device may return zero for
this, and this results in the kmalloc() with zero size (and it may
succeed) while the other part of the driver code writes the packet
data with the fixed size -- which eventually overwrites.
This patch adds a simple sanity check for the invalid buffer size for
avoiding that problem.
Wanpeng Li [Tue, 2 Jul 2019 09:25:02 +0000 (17:25 +0800)]
KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC
Thomas reported that:
| Background:
|
| In preparation of supporting IPI shorthands I changed the CPU offline
| code to software disable the local APIC instead of just masking it.
| That's done by clearing the APIC_SPIV_APIC_ENABLED bit in the APIC_SPIV
| register.
|
| Failure:
|
| When the CPU comes back online the startup code triggers occasionally
| the warning in apic_pending_intr_clear(). That complains that the IRRs
| are not empty.
|
| The offending vector is the local APIC timer vector who's IRR bit is set
| and stays set.
|
| It took me quite some time to reproduce the issue locally, but now I can
| see what happens.
|
| It requires apicv_enabled=0, i.e. full apic emulation. With apicv_enabled=1
| (and hardware support) it behaves correctly.
|
| Here is the series of events:
|
| Guest CPU
|
| goes down
|
| native_cpu_disable()
|
| apic_soft_disable();
|
| play_dead()
|
| ....
|
| startup()
|
| if (apic_enabled())
| apic_pending_intr_clear() <- Not taken
|
| enable APIC
|
| apic_pending_intr_clear() <- Triggers warning because IRR is stale
|
| When this happens then the deadline timer or the regular APIC timer -
| happens with both, has fired shortly before the APIC is disabled, but the
| interrupt was not serviced because the guest CPU was in an interrupt
| disabled region at that point.
|
| The state of the timer vector ISR/IRR bits:
|
| ISR IRR
| before apic_soft_disable() 0 1
| after apic_soft_disable() 0 1
|
| On startup 0 1
|
| Now one would assume that the IRR is cleared after the INIT reset, but this
| happens only on CPU0.
|
| Why?
|
| Because our CPU0 hotplug is just for testing to make sure nothing breaks
| and goes through an NMI wakeup vehicle because INIT would send it through
| the boots-trap code which is not really working if that CPU was not
| physically unplugged.
|
| Now looking at a real world APIC the situation in that case is:
|
| ISR IRR
| before apic_soft_disable() 0 1
| after apic_soft_disable() 0 1
|
| On startup 0 0
|
| Why?
|
| Once the dying CPU reenables interrupts the pending interrupt gets
| delivered as a spurious interupt and then the state is clear.
|
| While that CPU0 hotplug test case is surely an esoteric issue, the APIC
| emulation is still wrong, Even if the play_dead() code would not enable
| interrupts then the pending IRR bit would turn into an ISR .. interrupt
| when the APIC is reenabled on startup.
From SDM 10.4.7.2 Local APIC State After It Has Been Software Disabled
* Pending interrupts in the IRR and ISR registers are held and require
masking or handling by the CPU.
In Thomas's testing, hardware cpu will not respect soft disable LAPIC
when IRR has already been set or APICv posted-interrupt is in flight,
so we can skip soft disable APIC checking when clearing IRR and set ISR,
continue to respect soft disable APIC when attempting to set IRR.
Reported-by: Rong Chen <rong.a.chen@intel.com> Reported-by: Feng Tang <feng.tang@intel.com> Reported-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Thomas Gleixner <tglx@linutronix.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rong Chen <rong.a.chen@intel.com> Cc: Feng Tang <feng.tang@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Liran Alon [Wed, 26 Jun 2019 13:09:27 +0000 (16:09 +0300)]
KVM: nVMX: Change KVM_STATE_NESTED_EVMCS to signal vmcs12 is copied from eVMCS
Currently KVM_STATE_NESTED_EVMCS is used to signal that eVMCS
capability is enabled on vCPU.
As indicated by vmx->nested.enlightened_vmcs_enabled.
This is quite bizarre as userspace VMM should make sure to expose
same vCPU with same CPUID values in both source and destination.
In case vCPU is exposed with eVMCS support on CPUID, it is also
expected to enable KVM_CAP_HYPERV_ENLIGHTENED_VMCS capability.
Therefore, KVM_STATE_NESTED_EVMCS is redundant.
KVM_STATE_NESTED_EVMCS is currently used on restore path
(vmx_set_nested_state()) only to enable eVMCS capability in KVM
and to signal need_vmcs12_sync such that on next VMEntry to guest
nested_sync_from_vmcs12() will be called to sync vmcs12 content
into eVMCS in guest memory.
However, because restore nested-state is rare enough, we could
have just modified vmx_set_nested_state() to always signal
need_vmcs12_sync.
From all the above, it seems that we could have just removed
the usage of KVM_STATE_NESTED_EVMCS. However, in order to preserve
backwards migration compatibility, we cannot do that.
(vmx_get_nested_state() needs to signal flag when migrating from
new kernel to old kernel).
Returning KVM_STATE_NESTED_EVMCS when just vCPU have eVMCS enabled
have a bad side-effect of userspace VMM having to send nested-state
from source to destination as part of migration stream. Even if
guest have never used eVMCS as it doesn't even run a nested
hypervisor workload. This requires destination userspace VMM and
KVM to support setting nested-state. Which make it more difficult
to migrate from new host to older host.
To avoid this, change KVM_STATE_NESTED_EVMCS to signal eVMCS is
not only enabled but also active. i.e. Guest have made some
eVMCS active via an enlightened VMEntry. i.e. vmcs12 is copied
from eVMCS and therefore should be restored into eVMCS resident
in memory (by copy_vmcs12_to_enlightened()).
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Maran Wilson <maran.wilson@oracle.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Liran Alon [Tue, 25 Jun 2019 11:26:42 +0000 (14:26 +0300)]
KVM: nVMX: Allow restore nested-state to enable eVMCS when vCPU in SMM
As comment in code specifies, SMM temporarily disables VMX so we cannot
be in guest mode, nor can VMLAUNCH/VMRESUME be pending.
However, code currently assumes that these are the only flags that can be
set on kvm_state->flags. This is not true as KVM_STATE_NESTED_EVMCS
can also be set on this field to signal that eVMCS should be enabled.
Therefore, fix code to check for guest-mode and pending VMLAUNCH/VMRESUME
explicitly.
Paolo Bonzini [Wed, 26 Jun 2019 12:16:13 +0000 (14:16 +0200)]
KVM: x86: degrade WARN to pr_warn_ratelimited
This warning can be triggered easily by userspace, so it should certainly not
cause a panic if panic_on_warn is set.
Reported-by: syzbot+c03f30b4f4c46bdf8575@syzkaller.appspotmail.com Suggested-by: Alexander Potapenko <glider@google.com> Acked-by: Alexander Potapenko <glider@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Jin Yao [Fri, 28 Jun 2019 09:23:04 +0000 (17:23 +0800)]
perf diff: Documentation -c cycles option
Documentation the new computation selection 'cycles'.
v4:
---
Change the column 'Block cycles diff [start:end]' to
'[Program Block Range] Cycles Diff'
Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-8-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
"[Program Block Range]" indicates the range of program basic block
(start -> end). If we can find the source line it prints the source line
otherwise it prints the symbol+offset instead.
v4:
---
Use source lines or symbol+offset to indicate the basic block. It should
be easier to understand.
v3:
---
Cast 'struct hist_entry' to 'struct block_hist' in hist_entry__block_fprintf.
Use symbol_conf.report_block to check if executing hist_entry__block_fprintf.
v2:
---
Keep standard perf diff format and display the 'Baseline' and
'Shared Object'.
The output is sorted by "Baseline" and the basic blocks in the same
function are sorted by cycles diff.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-7-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jin Yao [Fri, 28 Jun 2019 09:23:02 +0000 (17:23 +0800)]
perf diff: Link same basic blocks among different data
The target is to compare the performance difference (cycles diff) for
the same basic blocks in different data files.
The same basic block means same function, same start address and same
end address. This patch finds the same basic blocks from different data
files and link them together and resort by the cycles diff.
v3:
---
The block stuffs are maintained by new structure 'block_hist',
so this patch is update accordingly.
v2:
---
Since now the basic block hists is changed to per symbol,
the patch only links the basic block hists for the same
symbol in different data files.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-6-git-send-email-yao.jin@linux.intel.com
[ sym->name is an array, not a pointer, so no need to check it for NULL, fixes de build in some distros ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jin Yao [Fri, 28 Jun 2019 09:23:01 +0000 (17:23 +0800)]
perf diff: Use hists to manage basic blocks per symbol
The hist__account_cycles() can account cycles per basic block. The basic
block information is saved in cycles_hist structure.
This patch processes each symbol, get basic blocks from cycles_hist and
add the basic block entries to a new hists (in 'struct block_hist').
Using a hists is because we need to compare, sort and print the basic
blocks later.
v6:
---
Since 'ops' argument is removed from hists__add_entry_block,
update the code accordingly. No functional change.
v5:
---
Since now we still carry block_info in 'struct hist_entry'
we don't need to use our own new/free ops for hist entries.
And the block_info is released in hist_entry__delete.
v3:
---
1. In v2, we put block stuffs in 'struct hist_entry', but
it's not a good design. In v3, we create a new
'struct block_hist' and cast the 'struct hist_entry' to
'struct block_hist' in some places, which can avoid adding
new stuffs in 'struct hist_entry'.
2. abs() -> labs(), in block_cycles_diff_cmp().
v2:
---
v1 adds the basic block entries to per data-file hists
but v2 adds the basic block entries to per symbol hists.
That is to keep current perf-diff format. Will show the
result in next patches.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-5-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jin Yao [Fri, 28 Jun 2019 09:23:00 +0000 (17:23 +0800)]
perf diff: Check if all data files with branch stacks
We will expand perf diff to support diff cycles of individual programs
blocks, so it requires all data files having branch stacks.
This patch checks HEADER_BRANCH_STACK in header, and only set the flag
has_br_stack when HEADER_BRANCH_STACK are set in all data files.
v2:
---
Move check_file_brstack() from __cmd_diff() to cmd_diff().
Because later patch will check flag 'has_br_stack' before
ui_init().
Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jin Yao [Fri, 28 Jun 2019 09:22:59 +0000 (17:22 +0800)]
perf hists: Add block_info in hist_entry
The block_info contains the program basic block information, i.e,
contains the start address and the end address of this basic block and
how much cycles it takes.
We need to compare, sort and even print out the basic block by some
orders, i.e. sort by cycles.
For this purpose, we add block_info field to hist_entry. In order not to
impact current interface, we creates a new function
hists__add_entry_block.
v6:
---
Remove the 'ops' argument in hists__add_entry_block
Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jin Yao [Fri, 28 Jun 2019 09:22:58 +0000 (17:22 +0800)]
perf symbol: Create block_info structure
'perf diff' currently can only diff symbols(functions).
We should expand it to diff cycles of individual programs blocks as
reported by timed LBR. This would allow to identify changes in specific
code accurately.
We need a new structure to maintain the basic block information, such as,
symbol(function), start/end address of this block, cycles. This patch
creates this structure and with some ops.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1561713784-30533-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Tue, 2 Jul 2019 12:12:40 +0000 (14:12 +0200)]
objtool: Fix build by linking against tools/lib/ctype.o sources
Fix objtool build, because it adds _ctype dependency via isspace call patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: André Goddard Rosa <andre.goddard@gmail.com> Cc: Clark Williams <williams@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 7bd330de43fd ("tools lib: Adopt skip_spaces() from the kernel sources") Link: http://lkml.kernel.org/r/20190702121240.GB12694@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Despite what I think the prm recommends, commit f2253bd9859b
("drm/i915/ringbuffer: EMIT_INVALIDATE after switch context") turned out
to be a huge mistake when enabling Ironlake contexts as the GPU would
hang on either a MI_FLUSH or PIPE_CONTROL immediately following the
MI_SET_CONTEXT of an active mesa context (more vanilla contexts, e.g.
simple rendercopies with igt, do not suffer).
Ville found the following clue,
"[DevCTG+]: For the invalidate operation of the pipe control, the
following pointers are affected. The
invalidate operation affects the restore of these packets. If the pipe
control invalidate operation is completed
before the context save, the indirect pointers will not be restored from
memory.
1. Pipeline State Pointer
2. Media State Pointer
3. Constant Buffer Packet"
which suggests by us emitting the INVALIDATE prior to the MI_SET_CONTEXT,
we prevent the context-restore from chasing the dangling pointers within
the image, and explains why this likely prevents the GPU hang.
Andy Lutomirski [Tue, 2 Jul 2019 03:43:21 +0000 (20:43 -0700)]
x86/entry/64: Fix and clean up paranoid_exit
paranoid_exit needs to restore CR3 before GSBASE. Doing it in the opposite
order crashes if the exception came from a context with user GSBASE and
user CR3 -- RESTORE_CR3 cannot resture user CR3 if run with user GSBASE.
This results in infinitely recursing exceptions if user code does SYSENTER
with TF set if both FSGSBASE and PTI are enabled.
The old code worked if user code just set TF without SYSENTER because #DB
from user mode is special cased in idtentry and paranoid_exit doesn't run.
Fix it by cleaning up the spaghetti code. All that paranoid_exit needs to
do is to disable IRQs, handle IRQ tracing, then restore CR3, and restore
GSBASE. Simply do those actions in that order.
Fixes: 708078f65721 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit") Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Link: https://lkml.kernel.org/r/59725ceb08977359489fbed979716949ad45f616.1562035429.git.luto@kernel.org