Merge tag 'trace-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Check kprobe is enabled before unregistering from ftrace as it isn't
registered when disabled.
- Remove kprobes enabled via command-line that is on init text when
freed.
- Add missing RCU synchronization for ftrace trampoline symbols removed
from kallsyms.
- Free trampoline on error path if ftrace_startup() fails.
- Give more space for the longer PID numbers in trace output.
- Fix a possible double free in the histogram code.
- A couple of fixes that were discovered by sparse.
* tag 'trace-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
bootconfig: init: make xbc_namebuf static
kprobes: tracing/kprobes: Fix to kill kprobes on initmem after boot
tracing: fix double free
ftrace: Let ftrace_enable_sysctl take a kernel pointer buffer
tracing: Make the space reserved for the pid wider
ftrace: Fix missing synchronize_rcu() removing trampoline from kallsyms
ftrace: Free the trampoline when ftrace_startup() fails
kprobes: Fix to check probe enabled before disarm_kprobe_ftrace()
Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU fix from Paul McKenney:
"This contains a single commit that fixes a bug that was introduced in
the last merge window. This bug causes a compiler warning complaining
about show_rcu_tasks_classic_gp_kthread() being an unused static
function in !SMP kernels.
The fix is straightforward, just adding an 'inline' to make this a
static inline function, thus avoiding the warning.
This bug was reported by Laurent Pinchart, who would like it fixed
sooner rather than later"
* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
rcu-tasks: Prevent complaints of unused show_rcu_tasks_classic_gp_kthread()
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- fix fault on page table writes during instruction fetch
s390:
- doc improvement
x86:
- The obvious patches are always the ones that turn out to be
completely broken. /me hangs his head in shame"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
Revert "KVM: Check the allocation of pv cpu mask"
KVM: arm64: Remove S1PTW check from kvm_vcpu_dabt_iswrite()
KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch
docs: kvm: add documentation for KVM_CAP_S390_DIAG318
Jan Kara [Mon, 21 Sep 2020 09:33:23 +0000 (11:33 +0200)]
dax: Fix compilation for CONFIG_DAX && !CONFIG_FS_DAX
dax_supported() is defined whenever CONFIG_DAX is enabled. So dummy
implementation should be defined only in !CONFIG_DAX case, not in
!CONFIG_FS_DAX case.
Fixes: b7222edc3ff9 ("dm: Call proper helper to determine dax support") Cc: <stable@vger.kernel.org> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Merge tag 'core_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull syscall tracing fix from Borislav Petkov:
"Fix the seccomp syscall rewriting so that trace and audit see the
rewritten syscall number, from Kees Cook"
* tag 'core_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
core/entry: Report syscall correctly for trace and audit
- Make percpu-rwsem operations on the semaphore's ->read_count
IRQ-safe because it can be used in an IRQ context (Hou Tao)"
* tag 'locking_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/percpu-rwsem: Use this_cpu_{inc,dec}() for read_count
locking/lockdep: Fix "USED" <- "IN-NMI" inversions
Merge tag 'efi-urgent-for-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fix from Borislav Petkov:
"Ensure that the EFI bootloader control module only probes successfully
on systems that support the EFI SetVariable runtime service"
[ Tag and commit from Ard Biesheuvel, forwarded by Borislav ]
* tag 'efi-urgent-for-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: efibc: check for efivars write capability
Merge tag 'x86_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- A defconfig fix (Daniel Díaz)
- Disable relocation relaxation for the compressed kernel when not
built as -pie as in that case kernels built with clang and linked
with LLD fail to boot due to the linker optimizing some instructions
in non-PIE form; the gory details in the commit message (Arvind
Sankar)
- A fix for the "bad bp value" warning issued by the frame-pointer
unwinder (Josh Poimboeuf)
* tag 'x86_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/unwind/fp: Fix FP unwinding in ret_from_fork
x86/boot/compressed: Disable relocation relaxation
x86/defconfigs: Explicitly unset CONFIG_64BIT in i386_defconfig
Merge tag 'libnvdimm-fixes-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"A handful of fixes to address a string of mistakes in the mechanism
for device-mapper to determine if its component devices are dax
capable.
- Fix an original bug in device-mapper table reference counting when
interrogating dax capability in the component device. This bug was
hidden by the following bug.
- Fix device-mapper to use the proper helper (dax_supported() instead
of the leaf helper generic_fsdax_supported()) to determine dax
operation of a stacked block device configuration. The original
implementation is only valid for one level of dax-capable block
device stacking. This bug was discovered while fixing the below
regression.
- Fix an infinite recursion regression introduced by broken attempts
to quiet the generic_fsdax_supported() path and make it bail out
before logging "dax capability not found" errors"
* tag 'libnvdimm-fixes-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dax: Fix stack overflow when mounting fsdax pmem device
dm: Call proper helper to determine dax support
dm/dax: Fix table reference counts
The commit 0a614374e9e1 ("KVM: Check the allocation of pv cpu mask") we
have in 5.9-rc5 has two issue:
1) Compilation fails for !CONFIG_SMP, see:
https://bugzilla.kernel.org/show_bug.cgi?id=209285
2) This commit completely disables PV TLB flush, see
https://lore.kernel.org/kvm/87y2lrnnyf.fsf@vitty.brq.redhat.com/
The allocation problem is likely a theoretical one, if we don't
have memory that early in boot process we're likely doomed anyway.
Let's solve it properly later.
Merge tag 'riscv-for-linus-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A fix for a lockdep issue to avoid an asserting triggering during
early boot. There shouldn't be any incorrect behavior as the system
isn't concurrent at the time.
- The addition of a missing fence when installing early fixmap
mappings.
- A corretion to the K210 device tree's interrupt map.
- A fix for M-mode timer handling on the K210.
* tag 'riscv-for-linus-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Resurrect the MMIO timer implementation for M-mode systems
riscv: Fix Kendryte K210 device tree
riscv: Add sfence.vma after early page table changes
RISC-V: Take text_mutex in ftrace_init_nop()
Merge tag 'tty-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial/fbcon fixes from Greg KH:
"Here are some small tty/serial and one more fbcon fix.
They include:
- serial core locking regression fixes
- new device ids for 8250_pci driver
- fbcon fix for syzbot found issue
All have been in linux-next with no reported issues"
* tag 'tty-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
fbcon: Fix user font detection test at fbcon_resize().
serial: 8250_pci: Add Realtek 816a and 816b
serial: core: fix console port-lock regression
serial: core: fix port-lock initialisation
Merge tag 'edac_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fixes from Borislav Petkov:
"Two fixes for resulting from CONFIG_DEBUG_TEST_DRIVER_REMOVE=y
experiments:
- complete a previous fix to reset a local structure containing
scanned system data properly so that the driver rescans, as it
should, on a second load.
- address a refcount underflow due to not paying attention to the
driver whitelest on unregister"
* tag 'edac_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/ghes: Check whether the driver is on the safe list correctly
EDAC/ghes: Clear scanned data on unload
Merge tag 'kbuild-fixes-v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
"Fix qconf warnings and revive help message"
* tag 'kbuild-fixes-v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: qconf: revive help message in the info view
kconfig: qconf: fix incomplete type 'struct gstr' warning
kconfig: qconf: use delete[] instead of delete to free array (again)
Adrian Huang [Thu, 17 Sep 2020 11:15:49 +0000 (19:15 +0800)]
dax: Fix stack overflow when mounting fsdax pmem device
When mounting fsdax pmem device, commit e806b0104b50 ("dax: fix
detection of dax support for non-persistent memory block devices")
introduces the stack overflow [1][2]. Here is the call path for
mounting ext4 file system:
ext4_fill_super
bdev_dax_supported
__bdev_dax_supported
dax_supported
generic_fsdax_supported
__generic_fsdax_supported
bdev_dax_supported
The call path leads to the infinite calling loop, so we cannot
call bdev_dax_supported() in __generic_fsdax_supported(). The sanity
checking of the variable 'dax_dev' is moved prior to the two
bdev_dax_pgoff() checks [3][4].
Fixes: e806b0104b50 ("dax: fix detection of dax support for non-persistent memory block devices") Reported-by: Yi Zhang <yi.zhang@redhat.com> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Adrian Huang <ahuang12@lenovo.com> Reviewed-by: Jan Kara <jack@suse.cz> Tested-by: Ritesh Harjani <riteshh@linux.ibm.com> Cc: Coly Li <colyli@suse.de> Cc: Ira Weiny <ira.weiny@intel.com> Cc: John Pittman <jpittman@redhat.com> Link: https://lore.kernel.org/r/20200917111549.6367-1-adrianhuang0701@gmail.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jan Kara [Sun, 20 Sep 2020 15:54:42 +0000 (08:54 -0700)]
dm: Call proper helper to determine dax support
DM was calling generic_fsdax_supported() to determine whether a device
referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that
they don't have to duplicate common generic checks. High level code
should call dax_supported() helper which that calls into appropriate
helper for the particular device. This problem manifested itself as
kernel messages:
dm-3: error: dax access failed (-95)
when lvm2-testsuite run in cases where a DM device was stacked on top of
another DM device.
Fixes: 7c99699a32f4 ("dax: Arrange for dax_supported check to span multiple devices") Cc: <stable@vger.kernel.org> Tested-by: Adrian Huang <ahuang12@lenovo.com> Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Mike Snitzer <snitzer@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/160061715195.13131.5503173247632041975.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Fri, 18 Sep 2020 19:51:15 +0000 (12:51 -0700)]
dm/dax: Fix table reference counts
A recent fix to the dm_dax_supported() flow uncovered a latent bug. When
dm_get_live_table() fails it is still required to drop the
srcu_read_lock(). Without this change the lvm2 test-suite triggers this
warning:
# lvm2-testsuite --only pvmove-abort-all.sh
WARNING: lock held when returning to user space!
5.9.0-rc5+ #251 Tainted: G OE
------------------------------------------------
lvm/1318 is leaving the kernel with locks still held!
1 lock held by lvm/1318:
#0: ffff9372abb5a340 (&md->io_barrier){....}-{0:0}, at: dm_get_live_table+0x5/0xb0 [dm_mod]
Fixes: 7c99699a32f4 ("dax: Arrange for dax_supported check to span multiple devices") Cc: <stable@vger.kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Reported-by: Adrian Huang <ahuang12@lenovo.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Adrian Huang <ahuang12@lenovo.com> Link: https://lore.kernel.org/r/160045867590.25663.7548541079217827340.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
kconfig: qconf: revive help message in the info view
Since commit 2a0d14d5cf20 ("kconfig: qconf: remove redundant help in
the info view"), the help message is no longer displayed.
I intended to drop duplicated "Symbol:", "Type:", but precious info
about help and reverse dependencies was lost too.
Revive it now.
"defined at" is contained in menu_get_ext_help(), so I made sure
to not display it twice.
Fixes: 2a0d14d5cf20 ("kconfig: qconf: remove redundant help in the info view") Reported-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
kconfig: qconf: fix incomplete type 'struct gstr' warning
"make HOSTCXX=clang++ xconfig" reports the following:
HOSTCXX scripts/kconfig/qconf.o
In file included from scripts/kconfig/qconf.cc:23:
In file included from scripts/kconfig/lkc.h:15:
scripts/kconfig/lkc_proto.h:26:13: warning: 'get_relations_str' has C-linkage specified, but returns incomplete type 'struct gstr' which could be incompatible with C [-Wreturn-type-c-linkage]
struct gstr get_relations_str(struct symbol **sym_arr, struct list_head *head);
^
Currently, get_relations_str() is declared before the struct gstr
definition.
Move all declarations of menu.c functions below.
BTW, some are declared in lkc.h and some in lkc_proto.h, but the
difference is unclear. I guess some refactoring is needed.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Boris Kolpackov <boris@codesynthesis.com>
Subsystems affected by this patch series: mailmap, mm/hotfixes,
mm/thp, mm/memory-hotplug, misc, kcsan"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
kcsan: kconfig: move to menu 'Generic Kernel Debugging Instruments'
fs/fs-writeback.c: adjust dirtytime_interval_handler definition to match prototype
stackleak: let stack_erasing_sysctl take a kernel pointer buffer
ftrace: let ftrace_enable_sysctl take a kernel pointer buffer
mm/memory_hotplug: drain per-cpu pages again during memory offline
selftests/vm: fix display of page size in map_hugetlb
mm/thp: fix __split_huge_pmd_locked() for migration PMD
kprobes: fix kill kprobe which has been marked as gone
tmpfs: restore functionality of nr_inodes=0
mlock: fix unevictable_pgs event counts on THP
mm: fix check_move_unevictable_pages() on THP
mm: migration of hugetlbfs page skip memcg
ksm: reinstate memcg charge on copied pages
mailmap: add older email addresses for Kees Cook
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Another bunch of fixes for I2C.
Jean's i801 patch is a cleanup on top of Volker's i801 patch, but it
will make dependency handling much easier if those two go together"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: mxs: use MXS_DMA_CTRL_WAIT4END instead of DMA_CTRL_ACK
i2c: mediatek: Send i2c master code at more than 1MHz
i2c: mediatek: Fix generic definitions for bus frequency
i2c: core: Call i2c_acpi_install_space_handler() before i2c_acpi_register_devices()
i2c: i801: Simplify the suspend callback
i2c: i801: Fix resume bug
i2c: aspeed: Mask IRQ status to relevant bits
RISC-V: Resurrect the MMIO timer implementation for M-mode systems
The K210 doesn't implement rdtime in M-mode, and since that's where Linux runs
in the NOMMU systems that means we can't use rdtime. The K210 is the only
system that anyone is currently running NOMMU or M-mode on, so here we're just
inlining the timer read directly.
This also adds the CLINT driver as an !MMU dependency, as it's currently the
only timer driver availiable for these systems and without it we get a build
failure for some configurations.
Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Damien Le Moal [Wed, 16 Sep 2020 07:59:41 +0000 (16:59 +0900)]
riscv: Fix Kendryte K210 device tree
The Kendryte K210 SoC CLINT is compatible with Sifive clint v0
(sifive,clint0). Fix the Kendryte K210 device tree clint entry to be
inline with the sifive timer definition documented in
Documentation/devicetree/bindings/timer/sifive,clint.yaml.
The device tree clint entry is renamed similarly to u-boot device tree
definition to improve compatibility with u-boot defined device tree.
To ensure correct initialization, the interrup-cells attribute is added
and the interrupt-extended attribute definition fixed.
This fixes boot failures with Kendryte K210 SoC boards.
Note that the clock referenced is kept as K210_CLK_ACLK, which does not
necessarilly match the clint MTIME increment rate. This however does not
seem to cause any problem for now.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
fs/fs-writeback.c: adjust dirtytime_interval_handler definition to match prototype
Commit ecad013df758 ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer. Adjust the
definition of dirtytime_interval_handler to match its prototype in
linux/writeback.h which fixes the following sparse error/warning:
fs/fs-writeback.c:2189:50: warning: incorrect type in argument 3 (different address spaces)
fs/fs-writeback.c:2189:50: expected void *
fs/fs-writeback.c:2189:50: got void [noderef] __user *buffer
fs/fs-writeback.c:2184:5: error: symbol 'dirtytime_interval_handler' redeclared with different type (incompatible argument 3 (different address spaces)):
fs/fs-writeback.c:2184:5: int extern [addressable] [signed] [toplevel] dirtytime_interval_handler( ... )
fs/fs-writeback.c: note: in included file:
./include/linux/writeback.h:374:5: note: previously declared as:
./include/linux/writeback.h:374:5: int extern [addressable] [signed] [toplevel] dirtytime_interval_handler( ... )
Fixes: ecad013df758 ("sysctl: pass kernel pointers to ->proc_handler") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Link: https://lkml.kernel.org/r/20200907093140.13434-1-tklauser@distanz.ch Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
stackleak: let stack_erasing_sysctl take a kernel pointer buffer
Commit ecad013df758 ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer. Adjust the
signature of stack_erasing_sysctl to match ctl_table.proc_handler which
fixes the following sparse warning:
kernel/stackleak.c:31:50: warning: incorrect type in argument 3 (different address spaces)
kernel/stackleak.c:31:50: expected void *
kernel/stackleak.c:31:50: got void [noderef] __user *buffer
Fixes: ecad013df758 ("sysctl: pass kernel pointers to ->proc_handler") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Link: https://lkml.kernel.org/r/20200907093253.13656-1-tklauser@distanz.ch Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ftrace: let ftrace_enable_sysctl take a kernel pointer buffer
Commit ecad013df758 ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer. Adjust the
signature of ftrace_enable_sysctl to match ctl_table.proc_handler which
fixes the following sparse warning:
kernel/trace/ftrace.c:7544:43: warning: incorrect type in argument 3 (different address spaces)
kernel/trace/ftrace.c:7544:43: expected void *
kernel/trace/ftrace.c:7544:43: got void [noderef] __user *buffer
Fixes: ecad013df758 ("sysctl: pass kernel pointers to ->proc_handler") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Link: https://lkml.kernel.org/r/20200907093207.13540-1-tklauser@distanz.ch Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pavel Tatashin [Sat, 19 Sep 2020 04:20:31 +0000 (21:20 -0700)]
mm/memory_hotplug: drain per-cpu pages again during memory offline
There is a race during page offline that can lead to infinite loop:
a page never ends up on a buddy list and __offline_pages() keeps
retrying infinitely or until a termination signal is received.
Thread#1 - a new process:
load_elf_binary
begin_new_exec
exec_mmap
mmput
exit_mmap
tlb_finish_mmu
tlb_flush_mmu
release_pages
free_unref_page_list
free_unref_page_prepare
set_pcppage_migratetype(page, migratetype);
// Set page->index migration type below MIGRATE_PCPTYPES
Thread#2 - hot-removes memory
__offline_pages
start_isolate_page_range
set_migratetype_isolate
set_pageblock_migratetype(page, MIGRATE_ISOLATE);
Set migration type to MIGRATE_ISOLATE-> set
drain_all_pages(zone);
// drain per-cpu page lists to buddy allocator.
Thread#1 - continue
free_unref_page_commit
migratetype = get_pcppage_migratetype(page);
// get old migration type
list_add(&page->lru, &pcp->lists[migratetype]);
// add new page to already drained pcp list
Thread#2
Never drains pcp again, and therefore gets stuck in the loop.
The fix is to try to drain per-cpu lists again after
check_pages_isolated_cb() fails.
Ralph Campbell [Sat, 19 Sep 2020 04:20:24 +0000 (21:20 -0700)]
mm/thp: fix __split_huge_pmd_locked() for migration PMD
A migrating transparent huge page has to already be unmapped. Otherwise,
the page could be modified while it is being copied to a new page and data
could be lost. The function __split_huge_pmd() checks for a PMD migration
entry before calling __split_huge_pmd_locked() leading one to think that
__split_huge_pmd_locked() can handle splitting a migrating PMD.
However, the code always increments the page->_mapcount and adjusts the
memory control group accounting assuming the page is mapped.
Also, if the PMD entry is a migration PMD entry, the call to
is_huge_zero_pmd(*pmd) is incorrect because it calls pmd_pfn(pmd) instead
of migration_entry_to_pfn(pmd_to_swp_entry(pmd)). Fix these problems by
checking for a PMD migration entry.
Fixes: 46d0e68d9ae8 ("mm: thp: check pmd migration entry in common path") Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Bharata B Rao <bharata@linux.ibm.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> [4.14+] Link: https://lkml.kernel.org/r/20200903183140.19055-1-rcampbell@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Muchun Song [Sat, 19 Sep 2020 04:20:21 +0000 (21:20 -0700)]
kprobes: fix kill kprobe which has been marked as gone
If a kprobe is marked as gone, we should not kill it again. Otherwise, we
can disarm the kprobe more than once. In that case, the statistics of
kprobe_ftrace_enabled can unbalance which can lead to that kprobe do not
work.
Fixes: a1929f61d48b ("kprobes: support probing module __exit function") Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Song Liu <songliubraving@fb.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200822030055.32383-1-songmuchun@bytedance.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 3098fc782acb ("tmpfs: per-superblock i_ino support") made changes
to shmem_reserve_inode() in mm/shmem.c, however the original test for
(sbinfo->max_inodes) got dropped. This causes mounting tmpfs with option
nr_inodes=0 to fail:
# mount -ttmpfs -onr_inodes=0 none /ext0
mount: /ext0: mount(2) system call failed: Cannot allocate memory.
This patch restores the nr_inodes=0 functionality.
Fixes: 3098fc782acb ("tmpfs: per-superblock i_ino support") Signed-off-by: Byron Stanoszek <gandalf@winds.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Chris Down <chris@chrisdown.name> Link: https://lkml.kernel.org/r/20200902035715.16414-1-gandalf@winds.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5.8 commit b19967fe25d2 ("mm: swap: fix vmstats for huge page") has
established that vm_events should count every subpage of a THP, including
unevictable_pgs_culled and unevictable_pgs_rescued; but
lru_cache_add_inactive_or_unevictable() was not doing so for
unevictable_pgs_mlocked, and mm/mlock.c was not doing so for
unevictable_pgs mlocked, munlocked, cleared and stranded.
Fix them; but THPs don't go the pagevec way in mlock.c, so no fixes needed
on that path.
Fixes: b19967fe25d2 ("mm: swap: fix vmstats for huge page") Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Yang Shi <shy828301@gmail.com> Cc: Alex Shi <alex.shi@linux.alibaba.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Qian Cai <cai@lca.pw> Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008301408230.5954@eggly.anvils Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
check_move_unevictable_pages() is used in making unevictable shmem pages
evictable: by shmem_unlock_mapping(), drm_gem_check_release_pagevec() and
i915/gem check_release_pagevec(). Those may pass down subpages of a huge
page, when /sys/kernel/mm/transparent_hugepage/shmem_enabled is "force".
That does not crash or warn at present, but the accounting of vmstats
unevictable_pgs_scanned and unevictable_pgs_rescued is inconsistent:
scanned being incremented on each subpage, rescued only on the head (since
tails already appear evictable once the head has been updated).
5.8 commit b19967fe25d2 ("mm: swap: fix vmstats for huge page") has
established that vm_events in general (and unevictable_pgs_rescued in
particular) should count every subpage: so follow that precedent here.
Do this in such a way that if mem_cgroup_page_lruvec() is made stricter
(to check page->mem_cgroup is always set), no problem: skip the tails
before calling it, and add thp_nr_pages() to vmstats on the head.
Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Yang Shi <shy828301@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Qian Cai <cai@lca.pw> Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008301405000.5954@eggly.anvils Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hugetlbfs pages do not participate in memcg: so although they do find most
of migrate_page_states() useful, it would be better if they did not call
into mem_cgroup_migrate() - where Qian Cai reported that LTP's
move_pages12 triggers the warning in Alex Shi's prospective commit
"mm/memcg: warning on !memcg after readahead page charged".
Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxch.org> Cc: Alex Shi <alex.shi@linux.alibaba.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Qian Cai <cai@lca.pw> Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008301359460.5954@eggly.anvils Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "mm: fixes to past from future testing".
Here's a set of independent fixes against 5.9-rc2: prompted by
testing Alex Shi's "warning on !memcg" and lru_lock series, but
I think fit for 5.9 - though maybe only the first for stable.
This patch (of 5):
In 5.8 some instances of memcg charging in do_swap_page() and unuse_pte()
were removed, on the understanding that swap cache is now already charged
at those points; but a case was missed, when ksm_might_need_to_copy() has
decided it must allocate a substitute page: such pages were never charged.
Fix it inside ksm_might_need_to_copy().
This was discovered by Alex Shi's prospective commit "mm/memcg: warning on
!memcg after readahead page charged".
But there is a another surprise: this also fixes some rarer uncharged
PageAnon cases, when KSM is configured in, but has never been activated.
ksm_might_need_to_copy()'s anon_vma->root and linear_page_index() check
sometimes catches a case which would need to have been copied if KSM were
turned on. Or that's my optimistic interpretation (of my own old code),
but it leaves some doubt as to whether everything is working as intended
there - might it hint at rare anon ptes which rmap cannot find? A
question not easily answered: put in the fix for missed memcg charges.
Cc; Matthew Wilcox <willy@infradead.org>
Fixes: 1c31fa2a2941 ("mm: memcontrol: charge swapin pages on instantiation") Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Alex Shi <alex.shi@linux.alibaba.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Qian Cai <cai@lca.pw> Cc: <stable@vger.kernel.org> [5.8] Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008301343270.5954@eggly.anvils Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008301358020.5954@eggly.anvils Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 's390-5.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Fix order in trace_hardirqs_off_caller() to make locking state
consistent even if the IRQ tracer calls into lockdep again. Touches
common code. Acked-by Peter Zijlstra.
- Correctly handle secure storage violation exception to avoid kernel
panic triggered by user space misbehaviour.
- Switch the idle->seqcount over to using raw_write_*() to avoid
"suspicious RCU usage".
- Fix memory leaks on hard unplug in pci code.
- Use kvmalloc instead of kmalloc for larger allocations in zcrypt.
- Add few missing __init annotations to static functions to avoid
section mismatch complains when functions are not inlined.
* tag 's390-5.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: add 3f program exception handler
lockdep: fix order in trace_hardirqs_off_caller()
s390/pci: fix leak of DMA tables on hard unplug
s390/init: add missing __init annotations
s390/zcrypt: fix kmalloc 256k failure
s390/idle: fix suspicious RCU usage
i2c: mxs: use MXS_DMA_CTRL_WAIT4END instead of DMA_CTRL_ACK
The driver-specific usage of the DMA_CTRL_ACK flag was replaced with a
custom flag in commit 26bf76b933d9 ("dmaengine: mxs: rename custom flag"),
but i2c-mxs was not updated to use the new flag, completely breaking I2C
transactions using DMA.
Qii Wang [Thu, 17 Sep 2020 11:55:42 +0000 (19:55 +0800)]
i2c: mediatek: Send i2c master code at more than 1MHz
The master code needs to being sent when the speed is more than
I2C_MAX_FAST_MODE_PLUS_FREQ, not I2C_MAX_FAST_MODE_FREQ in the
latest I2C-bus specification and user manual.
Signed-off-by: Qii Wang <qii.wang@mediatek.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
Qii Wang [Thu, 17 Sep 2020 11:55:41 +0000 (19:55 +0800)]
i2c: mediatek: Fix generic definitions for bus frequency
The max frequency of mediatek i2c controller driver is
I2C_MAX_HIGH_SPEED_MODE_FREQ, not I2C_MAX_FAST_MODE_PLUS_FREQ.
Fix it.
Fixes: c759d24d8ce3 ("i2c: drivers: Use generic definitions for bus frequencies") Reviewed-by: Yingjoe Chen <yingjoe.chen@mediatek.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Qii Wang <qii.wang@mediatek.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Allow CPUs affected by erratum 1418040 to come online late
(previously we only fixed the other case - CPUs not affected by the
erratum coming up late).
- Fix branch offset in BPF JIT.
- Defer the stolen time initialisation to the CPU online time from the
CPU starting time to avoid a (sleep-able) memory allocation in an
atomic context.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: paravirt: Initialize steal time when cpu is online
arm64: bpf: Fix branch offset in JIT
arm64: Allow CPUs unffected by ARM erratum 1418040 to come in late
Merge tag 'powerpc-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 5.9:
- Opt us out of the DEBUG_VM_PGTABLE support for now as it's causing
crashes.
- Fix a long standing bug in our DMA mask handling that was hidden
until recently, and which caused problems with some drivers.
- Fix a boot failure on systems with large amounts of RAM, and no
hugepage support and using Radix MMU, only seen in the lab.
- A few other minor fixes.
Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Gautham R. Shenoy,
Hari Bathini, Ira Weiny, Nick Desaulniers, Shirisha Ganta, Vaibhav
Jain, and Vaidyanathan Srinivasan"
* tag 'powerpc-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute
cpuidle: pseries: Fix CEDE latency conversion from tb to us
powerpc/dma: Fix dma_map_ops::get_required_mask
Revert "powerpc/build: vdso linker warning for orphan sections"
powerpc/mm: Remove DEBUG_VM_PGTABLE support on powerpc
selftests/powerpc: Skip PROT_SAO test in guests/LPARS
powerpc/book3s64/radix: Fix boot failure with large amount of guest memory
Merge tag 'pm-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add a new CPU ID to the RAPL power capping driver and prevent
the ACPI processor idle driver from triggering RCU-lockdep complaints.
Specifics:
- Add support for the Lakefield chip to the RAPL power capping driver
(Ricardo Neri).
- Modify the ACPI processor idle driver to prevent it from triggering
RCU-lockdep complaints which has started to happen after recent
changes in that area (Peter Zijlstra)"
* tag 'pm-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: processor: Take over RCU-idle for C3-BM idle
cpuidle: Allow cpuidle drivers to take over RCU-idle
ACPI: processor: Use CPUIDLE_FLAG_TLB_FLUSHED
ACPI: processor: Use CPUIDLE_FLAG_TIMER_STOP
powercap: RAPL: Add support for Lakefield
Merge tag 'sound-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here is a collection of fixes for 5.9. All look small and are nothing
scary.
The majority of changes are about ASoC driver- specific fixes, while
there are a couple of ASoC core fixes (DAI lookup and lockdep stuff)
and usual HD-audio quirks"
* tag 'sound-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits)
ALSA: hda/realtek - The Mic on a RedmiBook doesn't work
ASoC: tlv320adcx140: Wake up codec before accessing register
ASoC: core: Do not cleanup uninitialized dais on soc_pcm_open failure
ALSA: hda: fixup headset for ASUS GX502 laptop
ASoC: Intel: bytcr_rt5640: Add quirk for MPMAN Converter9 2-in-1
ASoC: Intel: haswell: Fix power transition refactor
ASoC: tlv320adcx140: Fix accessing uninitialized adcx140->dev
ASoC: wm8994: Ensure the device is resumed in wm89xx_mic_detect functions
ASoC: wm8994: Skip setting of the WM8994_MICBIAS register for WM1811
ASoC: meson: axg-toddr: fix channel order on g12 platforms
ASoC: soc-core: add snd_soc_find_dai_with_mutex()
ASoC: qcom: common: Fix refcount imbalance on error
ASoC: rt700: Fix return check for devm_regmap_init_sdw()
ASoC: rt715: Fix return check for devm_regmap_init_sdw()
ASoC: rt711: Fix return check for devm_regmap_init_sdw()
ASoC: rt1308-sdw: Fix return check for devm_regmap_init_sdw()
ASoC: max98373: Fix return check for devm_regmap_init_sdw()
ASoC: ti: fixup ams_delta_mute() function name
ASoC: pcm3168a: ignore 0 Hz settings
ASoC: Intel: tgl_max98373: fix a runtime pm issue in multi-thread case
...
kprobes: tracing/kprobes: Fix to kill kprobes on initmem after boot
Since kprobe_event= cmdline option allows user to put kprobes on the
functions in initmem, kprobe has to make such probes gone after boot.
Currently the probes on the init functions in modules will be handled
by module callback, but the kernel init text isn't handled.
Without this, kprobes may access non-exist text area to disable or
remove it.
Merge tag 'iommu-fixes-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"Two fixes for the AMD IOMMU driver:
- Fix a potential NULL-ptr dereference found by smatch
- Fix interrupt remapping when a device is assigned to a guest and
AVIC is enabled"
* tag 'iommu-fixes-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Restore IRTE.RemapEn bit for amd_iommu_activate_guest_mode
iommu/amd: Fix potential @entry null deref
Merge tag 'mtd/fixes-for-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD/SPI NOR fixes from Vignesh Raghavendra:
"Revert patches that caused non volatile Quad Enable bit to be cleared
for certain SPI NOR flashes during module remove or during shutdown,
thus breaking backward compatibility"
Acked-by: Miquel Raynal <miquel.raynal@bootlin.com>
* tag 'mtd/fixes-for-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
Revert "mtd: spi-nor: Add capability to disable flash quad mode"
Revert "mtd: spi-nor: Disable the flash quad mode in spi_nor_restore()"
objtool: Fix noreturn detection for ignored functions
When a function is annotated with STACK_FRAME_NON_STANDARD, objtool
doesn't validate its code paths. It also skips sibling call detection
within the function.
But sibling call detection is actually needed for the case where the
ignored function doesn't have any return instructions. Otherwise
objtool naively marks the function as implicit static noreturn, which
affects the reachability of its callers, resulting in "unreachable
instruction" warnings.
Fix it by just enabling sibling call detection for ignored functions.
The 'insn->ignore' check in add_jump_destinations() is no longer needed
after
b93843f49370 ("objtool: Don't use ignore flag for fake jumps").
Tom Rix [Mon, 7 Sep 2020 13:58:45 +0000 (06:58 -0700)]
tracing: fix double free
clang static analyzer reports this problem
trace_events_hist.c:3824:3: warning: Attempt to free
released memory
kfree(hist_data->attrs->var_defs.name[i]);
In parse_var_defs() if there is a problem allocating
var_defs.expr, the earlier var_defs.name is freed.
This free is duplicated by free_var_defs() which frees
the rest of the list.
Because free_var_defs() has to run anyway, remove the
second free fom parse_var_defs().
Link: https://lkml.kernel.org/r/20200907135845.15804-1-trix@redhat.com Cc: stable@vger.kernel.org Fixes: 872211e935bd ("tracing: Add variable support to hist triggers") Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
ftrace: Let ftrace_enable_sysctl take a kernel pointer buffer
Commit ecad013df758 ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer. Adjust the
signature of ftrace_enable_sysctl to match ctl_table.proc_handler which
fixes the following sparse warning:
kernel/trace/ftrace.c:7544:43: warning: incorrect type in argument 3 (different address spaces)
kernel/trace/ftrace.c:7544:43: expected void *
kernel/trace/ftrace.c:7544:43: got void [noderef] __user *buffer
Link: https://lkml.kernel.org/r/20200907093207.13540-1-tklauser@distanz.ch Fixes: ecad013df758 ("sysctl: pass kernel pointers to ->proc_handler") Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Marc Zyngier [Tue, 15 Sep 2020 10:42:18 +0000 (11:42 +0100)]
KVM: arm64: Remove S1PTW check from kvm_vcpu_dabt_iswrite()
Now that kvm_vcpu_trap_is_write_fault() checks for S1PTW, there
is no need for kvm_vcpu_dabt_iswrite() to do the same thing, as
we already check for this condition on all existing paths.
Marc Zyngier [Tue, 15 Sep 2020 10:42:17 +0000 (11:42 +0100)]
KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch
KVM currently assumes that an instruction abort can never be a write.
This is in general true, except when the abort is triggered by
a S1PTW on instruction fetch that tries to update the S1 page tables
(to set AF, for example).
This can happen if the page tables have been paged out and brought
back in without seeing a direct write to them (they are thus marked
read only), and the fault handling code will make the PT executable(!)
instead of writable. The guest gets stuck forever.
In these conditions, the permission fault must be considered as
a write so that the Stage-1 update can take place. This is essentially
the I-side equivalent of the problem fixed by bdef27af87c7 ("arm64: KVM:
Take S1 walks into account when determining S2 write faults").
Update kvm_is_write_fault() to return true on IABT+S1PTW, and introduce
kvm_vcpu_trap_is_exec_fault() that only return true when no faulting
on a S1 fault. Additionally, kvm_vcpu_dabt_iss1tw() is renamed to
kvm_vcpu_abt_iss1tw(), as the above makes it plain that it isn't
specific to data abort.
tracing: Make the space reserved for the pid wider
For 64bit CONFIG_BASE_SMALL=0 systems PID_MAX_LIMIT is set by default to 4194304. During boot the kernel sets a new value based on number of CPUs
but no lower than 32768. It is 1024 per CPU so with 128 CPUs the default
becomes 131072 which needs six digits.
This value can be increased during run time but must not exceed the
initial upper limit.
Systemd sometime after v241 sets it to the upper limit during boot. The
result is that when the pid exceeds five digits, the trace output is a
little hard to read because it is no longer properly padded (same like
on big iron with 98+ CPUs).
Adrian Hunter [Tue, 1 Sep 2020 09:16:17 +0000 (12:16 +0300)]
ftrace: Fix missing synchronize_rcu() removing trampoline from kallsyms
Add synchronize_rcu() after list_del_rcu() in
ftrace_remove_trampoline_from_kallsyms() to protect readers of
ftrace_ops_trampoline_list (in ftrace_get_trampoline_kallsym)
which is used when kallsyms is read.
Miroslav Benes [Mon, 31 Aug 2020 12:26:31 +0000 (14:26 +0200)]
ftrace: Free the trampoline when ftrace_startup() fails
Commit 26e652e899b2 ("ftrace: Add symbols for ftrace trampolines")
missed to remove ops from new ftrace_ops_trampoline_list in
ftrace_startup() if ftrace_hash_ipmodify_enable() fails there. It may
lead to BUG if such ops come from a module which may be removed.
Moreover, the trampoline itself is not freed in this case.
Fix it by calling ftrace_trampoline_free() during the rollback.
Link: https://lkml.kernel.org/r/20200831122631.28057-1-mbenes@suse.cz Fixes: 26e652e899b2 ("ftrace: Add symbols for ftrace trampolines") Fixes: ce5a5b5ca18d ("ftrace, kprobes: Support IPMODIFY flag to find IP modify conflict") Signed-off-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Masami Hiramatsu [Mon, 31 Aug 2020 15:12:07 +0000 (00:12 +0900)]
kprobes: Fix to check probe enabled before disarm_kprobe_ftrace()
Commit ca5b77d4fbb4 ("kprobes: Fix NULL pointer dereference at
kprobe_ftrace_handler") fixed one bug but not completely fixed yet.
If we run a kprobe_module.tc of ftracetest, kernel showed a warning
as below.
This is because the kill_kprobe() calls disarm_kprobe_ftrace() even
if the given probe is not enabled. In that case, ftrace_set_filter_ip()
fails because the given probe point is not registered to ftrace.
Fix to check the given (going) probe is enabled before invoking
disarm_kprobe_ftrace().
Link: https://lkml.kernel.org/r/159888672694.1411785.5987998076694782591.stgit@devnote2 Fixes: ca5b77d4fbb4 ("kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler") Cc: Ingo Molnar <mingo@kernel.org> Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David Miller <davem@davemloft.net> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Chengming Zhou <zhouchengming@bytedance.com> Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
* pm-cpuidle:
ACPI: processor: Take over RCU-idle for C3-BM idle
cpuidle: Allow cpuidle drivers to take over RCU-idle
ACPI: processor: Use CPUIDLE_FLAG_TLB_FLUSHED
ACPI: processor: Use CPUIDLE_FLAG_TIMER_STOP
kconfig: qconf: use delete[] instead of delete to free array (again)
Commit 1099e21f140b ("kconfig: qconf: use delete[] instead of delete
to free array") fixed two lines, but there is one more.
(cppcheck does not report it for some reason...)
This was detected by Clang.
"make HOSTCXX=clang++ xconfig" reports the following:
scripts/kconfig/qconf.cc:1279:2: warning: 'delete' applied to a pointer that was allocated with 'new[]'; did you mean 'delete[]'? [-Wmismatched-new-delete]
delete data;
^
[]
scripts/kconfig/qconf.cc:1239:15: note: allocated with 'new[]' here
char *data = new char[count + 1];
^
Fixes: da3ebf8b1332 ("kconfig: qconf: make debug links work again") Fixes: 1099e21f140b ("kconfig: qconf: use delete[] instead of delete to free array") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
iommu/amd: Restore IRTE.RemapEn bit for amd_iommu_activate_guest_mode
Commit 76ac501d57ec ("iommu/amd: Use cmpxchg_double() when updating
128-bit IRTE") removed an assumption that modify_irte_ga always set
the valid bit, which requires the callers to set the appropriate value
for the struct irte_ga.valid bit before calling the function.
Similar to the commit d04af2d92c7d ("iommu/amd: Restore IRTE.RemapEn
bit after programming IRTE"), which is for the function
amd_iommu_deactivate_guest_mode().
The same change is also needed for the amd_iommu_activate_guest_mode().
Otherwise, this could trigger IO_PAGE_FAULT for the VFIO based VMs with
AVIC enabled.
This warning happens when unwinding from an interrupt in
ret_from_fork(). If entry code gets interrupted, the state of the
frame pointer (rbp) may be undefined, which can confuse the unwinder,
resulting in warnings like the above.
There's an in_entry_code() check which normally silences such
warnings for entry code. But in this case, ret_from_fork() is getting
interrupted. It recently got moved out of .entry.text, so the
in_entry_code() check no longer works.
It could be moved back into .entry.text, but that would break the
noinstr validation because of the call to schedule_tail().
Instead, initialize each new task's RBP to point to the task's entry
regs via an encoded frame pointer. That will allow the unwinder to
reach the end of the stack gracefully.
Fixes: 54e1e3218b5a ("x86/entry/64: Move non entry code into .text section") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/f366bbf5a8d02e2318ee312f738112d0af74d16f.1600103007.git.jpoimboe@redhat.com
Merge branch 'for-5.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu
Pull percpu fix from Dennis Zhou:
"This is a fix for the first chunk size calculation where the variable
length array incorrectly used the number of longs instead of bytes of
longs.
This came in as a code fix and not a bug report, so I don't think it
was widely problematic. I believe it worked out due to it being
memblock memory and alignment requirements working in our favor"
* 'for-5.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
percpu: fix first chunk size calculation for populated bitmap
Merge tag 'drm-fixes-2020-09-18' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"A bunch of small fixes, some of the i915 ones have been out for a
while and got better commit msg explaining some better reasoning
behind them (hopefully this trend continues).
Otherwise there a few AMD related ones mostly small, one radeon PLL
regression fix and a bunch of small mediatek fixes.
amdgpu:
- Sienna Cichlid fixes
- Navy Flounder fixes
- DC fixes
amdkfd:
- Fix a GPU reset crash
- Fix a memory leak
radeon:
- Revert a PLL fix that broke other boards
i915:
- Avoid exposing a partially constructed context
- Use RCU instead of mutex for context termination list iteration
- Avoid data race reported by KCSAN
- Filter wake_flags passed to default_wake_function
mediatek:
- Fix scrolling of panel
- Remove duplicated include
- Use CPU when fail to get cmdq event
- Add missing put_device() call"
* tag 'drm-fixes-2020-09-18' of git://anongit.freedesktop.org/drm/drm: (21 commits)
drm/amd/display: Don't log hdcp module warnings in dmesg
drm/amdgpu: declare ta firmware for navy_flounder
drm/mediatek: Add missing put_device() call in mtk_hdmi_dt_parse_pdata()
drm/mediatek: Add missing put_device() call in mtk_drm_kms_init()
drm/mediatek: Add exception handing in mtk_drm_probe() if component init fail
drm/mediatek: Add missing put_device() call in mtk_ddp_comp_init()
drm/mediatek: Use CPU when fail to get cmdq event
drm/mediatek: Remove duplicated include
drm/i915: Filter wake_flags passed to default_wake_function
drm/i915: Be wary of data races when reading the active execlists
drm/i915/gem: Reduce context termination list iteration guard to RCU
drm/i915/gem: Delay tracking the GEM context until it is registered
drm/amdgpu/dc: Require primary plane to be enabled whenever the CRTC is
drm/radeon: revert "Prefer lower feedback dividers"
drm/amdgpu: Include sienna_cichlid in USBC PD FW support.
drm/amd/display: update nv1x stutter latencies
drm/amd/display: Don't use DRM_ERROR() for DTM add topology
drm/amd/pm: support runtime pptable update for sienna_cichlid etc.
drm/amdkfd: fix a memory leak issue
drm/kfd: fix a system crash issue during GPU recovery
...
Dave Airlie [Thu, 17 Sep 2020 22:37:31 +0000 (08:37 +1000)]
Merge tag 'drm-intel-fixes-2020-09-17' of ssh://git.freedesktop.org/git/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.9-rc6:
- Avoid exposing a partially constructed context
- Use RCU instead of mutex for context termination list iteration
- Avoid data race reported by KCSAN
- Filter wake_flags passed to default_wake_function
Hans de Goede [Wed, 9 Sep 2020 10:32:33 +0000 (12:32 +0200)]
i2c: core: Call i2c_acpi_install_space_handler() before i2c_acpi_register_devices()
Some ACPI i2c-devices _STA method (which is used to detect if the device
is present) use autodetection code which probes which device is present
over i2c. This requires the I2C ACPI OpRegion handler to be registered
before we enumerate i2c-clients under the i2c-adapter.
This fixes the i2c touchpad on the Lenovo ThinkBook 14-IIL and
ThinkBook 15 IIL not getting an i2c-client instantiated and thus not
working.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1842039 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
Sunghyun Jin [Thu, 3 Sep 2020 12:41:16 +0000 (21:41 +0900)]
percpu: fix first chunk size calculation for populated bitmap
Variable populated, which is a member of struct pcpu_chunk, is used as a
unit of size of unsigned long.
However, size of populated is miscounted. So, I fix this minor part.
Fixes: adabd144a9cf ("percpu: change the number of pages marked in the first_chunk pop bitmap") Cc: <stable@vger.kernel.org> # 4.14+ Signed-off-by: Sunghyun Jin <mcsmonk@gmail.com> Signed-off-by: Dennis Zhou <dennis@kernel.org>
mm: allow a controlled amount of unfairness in the page lock
Commit 1fdb11d3af01 ("mm: rewrite wait_on_page_bit_common() logic") made
the page locking entirely fair, in that if a waiter came in while the
lock was held, the lock would be transferred to the lockers strictly in
order.
That was intended to finally get rid of the long-reported watchdog
failures that involved the page lock under extreme load, where a process
could end up waiting essentially forever, as other page lockers stole
the lock from under it.
It also improved some benchmarks, but it ended up causing huge
performance regressions on others, simply because fair lock behavior
doesn't end up giving out the lock as aggressively, causing better
worst-case latency, but potentially much worse average latencies and
throughput.
Instead of reverting that change entirely, this introduces a controlled
amount of unfairness, with a sysctl knob to tune it if somebody needs
to. But the default value should hopefully be good for any normal load,
allowing a few rounds of lock stealing, but enforcing the strict
ordering before the lock has been stolen too many times.
There is also a hint from Matthieu Baerts that the fair page coloring
may end up exposing an ABBA deadlock that is hidden by the usual
optimistic lock stealing, and while the unfairness doesn't fix the
fundamental issue (and I'm still looking at that), it avoids it in
practice.
The amount of unfairness can be modified by writing a new value to the
'sysctl_page_lock_unfairness' variable (default value of 5, exposed
through /proc/sys/vm/page_lock_unfairness), but that is hopefully
something we'd use mainly for debugging rather than being necessary for
any deep system tuning.
This whole issue has exposed just how critical the page lock can be, and
how contended it gets under certain locks. And the main contention
doesn't really seem to be anything related to IO (which was the origin
of this lock), but for things like just verifying that the page file
mapping is stable while faulting in the page into a page table.
Andrew Jones [Wed, 16 Sep 2020 15:45:30 +0000 (17:45 +0200)]
arm64: paravirt: Initialize steal time when cpu is online
Steal time initialization requires mapping a memory region which
invokes a memory allocation. Doing this at CPU starting time results
in the following trace when CONFIG_DEBUG_ATOMIC_SLEEP is enabled:
However we don't need to initialize steal time at CPU starting time.
We can simply wait until CPU online time, just sacrificing a bit of
accuracy by returning zero for steal time until we know better.
While at it, add __init to the functions that are only called by
pv_time_init() which is __init.
Signed-off-by: Andrew Jones <drjones@redhat.com> Fixes: 16b631200723 ("arm64: Retrieve stolen time as paravirtualized guest") Cc: stable@vger.kernel.org Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20200916154530.40809-1-drjones@redhat.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The reason is the offset[] creation and later usage, while building
the eBPF body. The code currently omits the first instruction, since
build_insn() will increase our ctx->idx before saving it.
That was fine up until bounded eBPF loops were introduced. After that
introduction, offset[0] must be the offset of the end of prologue which
is the start of the 1st insn while, offset[n] holds the
offset of the end of n-th insn.
When "taken loop with back jump to 1st insn" test runs, it will
eventually call bpf2a64_offset(-1, 2, ctx). Since negative indexing is
permitted, the current outcome depends on the value stored in
ctx->offset[-1], which has nothing to do with our array.
If the value happens to be 0 the tests will work. If not this error
triggers.
commit 8e8dae1e8bf1 ("bpf: fix x64 JIT code generation for jmp to 1st insn")
fixed an indentical bug on x86 when eBPF bounded loops were introduced.
So let's fix it by creating the ctx->offset[] differently. Track the
beginning of instruction and account for the extra instruction while
calculating the arm instruction offsets.
The CRC calculation done by genksyms is triggered when the parser hits
EXPORT_SYMBOL*() macros. At this point, genksyms recursively expands the
types of the function parameters, and uses that as the input for the CRC
calculation. In the case of forward-declared structs, the type expands
to 'UNKNOWN'. Following this, it appears that the result of the
expansion of each type is cached somewhere, and seems to be re-used
when/if the same type is seen again for another exported symbol in the
same C file.
Unfortunately, this can cause CRC 'stability' issues when a struct
definition becomes visible in the middle of a C file. For example, let's
assume code with the following pattern:
struct foo;
int bar(struct foo *arg)
{
/* Do work ... */
}
EXPORT_SYMBOL_GPL(bar);
/* This contains struct foo's definition */
#include "foo.h"
int baz(struct foo *arg)
{
/* Do more work ... */
}
EXPORT_SYMBOL_GPL(baz);
Here, baz's CRC will be computed using the expansion of struct foo that
was cached after bar's CRC calculation ('UNKOWN' here). But if
EXPORT_SYMBOL_GPL(bar) is removed from the file (because of e.g. symbol
trimming using CONFIG_TRIM_UNUSED_KSYMS), struct foo will be expanded
late, during baz's CRC calculation, which now has visibility over the
full struct definition, hence resulting in a different CRC for baz.
The proper fix for this certainly is in genksyms, but that will take me
some time to get right. In the meantime, we have seen one occurrence of
this in the ehci-hcd code which hits this problem because of the way it
includes C files halfway through the code together with an unlucky mix
of symbol trimming.
In order to workaround this, move the include done in ehci-hub.c early
in ehci-hcd.c, hence making sure the struct definitions are visible to
the entire file. This improves CRC stability of the ehci-hcd exports
even when symbol trimming is enabled.
drm/amd/display: Don't log hdcp module warnings in dmesg
[Why]
DTM topology updates happens by default now. This results in DTM
warnings when hdcp is not even being enabled. This spams the dmesg
and doesn't effect normal display functionality so it is better to log it
using DRM_DEBUG_KMS()
[How]
Change the DRM_WARN() to DRM_DEBUG_KMS()
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The firmware provided via MODULE_FIRMWARE appears in the
module information. External tools(eg. dracut) may use the
list of fw files to include them as appropriate in an initramfs,
thus missing declaration will lead to request firmware failure
in boot time.
Paul E. McKenney [Tue, 25 Aug 2020 15:09:40 +0000 (08:09 -0700)]
rcu-tasks: Prevent complaints of unused show_rcu_tasks_classic_gp_kthread()
Commit c5411467e36a ("rcu-tasks: Conditionally compile
show_rcu_tasks_gp_kthreads()") introduced conditional
compilation of several functions, but forgot one occurrence of
show_rcu_tasks_classic_gp_kthread() that causes the compiler to warn of
an unused static function. This commit uses "static inline" to avoid
these complaints and possibly also to avoid emitting an actual definition
of this function.
Fixes: c5411467e36a ("rcu-tasks: Conditionally compile show_rcu_tasks_gp_kthreads()") Cc: <stable@vger.kernel.org> # 5.8.x Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
drm/mediatek: Add missing put_device() call in mtk_hdmi_dt_parse_pdata()
if of_find_device_by_node() succeed, mtk_drm_kms_init() doesn't have
a corresponding put_device(). Thus add jump target to fix the exception
handling for this function implementation.
drm/mediatek: Add missing put_device() call in mtk_drm_kms_init()
if of_find_device_by_node() succeed, mtk_drm_kms_init() doesn't have
a corresponding put_device(). Thus add jump target to fix the exception
handling for this function implementation.
Fixes: 6a252d99de5c ("drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
drm/mediatek: Add missing put_device() call in mtk_ddp_comp_init()
if of_find_device_by_node() succeed, mtk_ddp_comp_init() doesn't have
a corresponding put_device(). Thus add put_device() to fix the exception
handling for this function implementation.
Fixes: 7c508b8d0b1f ("drm/mediatek: support CMDQ interface in ddp component") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Wang Hai [Wed, 19 Aug 2020 02:58:29 +0000 (10:58 +0800)]
drm/mediatek: Remove duplicated include
Remove mtk_drm_ddp.h which is included more than once
Fixes: 50a21bc3aa80 ("drm/mediatek: drop use of drmP.h") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
On A20R machines the interrupt pending bits in cause register need to be
updated by requesting the chipset to do it. This needs to be done to
find the interrupt cause and after interrupt service. In
commit d41e6d016057 ("MIPS: SNI: Convert to new irq_chip functions") the
function to do after service update got lost, which caused spurious
interrupts.
Fixes: d41e6d016057 ("MIPS: SNI: Convert to new irq_chip functions") Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Merge tag 'perf-tools-fixes-for-v5.9-2020-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Set PERF_SAMPLE_PERIOD if attr->freq is set.
- Remove trailing commas from AMD JSON vendor event files.
- Don't clear event's period if set by a event definition term.
- Leader sampling shouldn't clear sample period in 'perf test'.
- Fix the "signal" test inline assembly when built with DEBUG=1.
- Fix memory leaks detected by ASAN, some in normal paths, some in
error paths.
- Fix 2 memory sanitizer warnings in 'perf bench'.
- Fix the ratio comments of miss-events in 'perf stat'.
- Prevent override of attr->sample_period for libpfm4 events.
- Sync kvm.h and in.h headers with the kernel sources.
* tag 'perf-tools-fixes-for-v5.9-2020-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf stat: Fix the ratio comments of miss-events
perf test: Free formats for perf pmu parse test
perf metric: Do not free metric when failed to resolve
perf metric: Free metric when it failed to resolve
perf metric: Release expr_parse_ctx after testing
perf test: Fix memory leaks in parse-metric test
perf parse-event: Fix memory leak in evsel->unit
perf evlist: Fix cpu/thread map leak
perf metric: Fix some memory leaks - part 2
perf metric: Fix some memory leaks
perf test: Free aliases for PMU event map aliases test
perf vendor events amd: Remove trailing commas
perf test: Leader sampling shouldn't clear sample period
perf record: Don't clear event's period if set by a term
tools headers UAPI: update linux/in.h copy
tools headers UAPI: Sync kvm.h headers with the kernel sources
perf record: Prevent override of attr->sample_period for libpfm4 events
perf record: Set PERF_RECORD_PERIOD if attr->freq is set.
perf bench: Fix 2 memory sanitizer warnings
perf test: Fix the "signal" test inline assembly
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk driver fixes from Stephen Boyd:
"A handful of clk driver fixes. Mostly they're for error paths or
improper memory allocations sizes. Nothing as exciting as a wildfire"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: qcom: lpass: Correct goto target in lpass_core_sc7180_probe()
clk: versatile: Add of_node_put() before return statement
clk: bcm: dvp: Select the reset framework
clk: rockchip: Fix initialization of mux_pll_src_4plls_p
clk: davinci: Use the correct size when allocating memory
Leon Romanovsky [Tue, 15 Sep 2020 08:27:41 +0000 (11:27 +0300)]
MAINTAINERS: Fix Max's and Shravan's emails
Max's and Shravan's usernames were changed while @mellanox.com emails
were transferred to be @nvidia.com.
Fixes: 315e9a8f43c7 ("MAINTAINERS: Update Mellanox and Cumulus Network addresses to new domain") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Tue, 15 Sep 2020 10:32:01 +0000 (12:32 +0200)]
ACPI: processor: Take over RCU-idle for C3-BM idle
The C3 BusMaster idle code takes lock in a number of places, some deep
inside the ACPI code. Instead of wrapping it all in RCU_NONIDLE, have
the driver take over RCU-idle duty and avoid flipping RCU state back
and forth a lot.
( by marking 'C3 && bm_check' as RCU_IDLE, we _must_ call enter_bm() for
that combination, otherwise we'll loose RCU-idle, this requires
shuffling some code around )
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>