tools/power/turbostat: Fix turbostat for AMD Zen CPUs

It was reported that on a Zen+ system turbostat started exiting,
which was tracked down to the MSR_PKG_ENERGY_STAT read failing because
offset_to_idx wasn't returning a non-negative index.

This patch combines the modifications from Bingsong Si and
Bas Nieuwenhuizen and adds the MSR to the index system as an alternative to
MSR_PKG_ENERGY_STATUS.

Fixes: 219b781fe605 ("tools/power turbostat: Enable accumulate RAPL display")
Reported-by: youling257 <youling257@gmail.com>
Tested-by: youling257 <youling257@gmail.com>
Tested-by: Kurt Garloff <kurt@garloff.de>
Tested-by: Bingsong Si <owen.si@ucloud.cn>
Tested-by: Artem S. Tashkinov <aros@gmx.com>
Co-developed-by: Bingsong Si <owen.si@ucloud.cn>
Co-developed-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
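
The failure mode described above can be sketched as follows. This is a
minimal illustration in Python, not turbostat's C code: the two MSR
addresses are the ones used for the Intel and AMD package energy counters,
while IDX_PKG_ENERGY and the shape of the function are assumptions made
for the example.

```python
# MSR addresses for the package energy counters.
MSR_PKG_ENERGY_STATUS = 0x611       # Intel RAPL package energy
MSR_PKG_ENERGY_STAT = 0xC001029B    # AMD package energy

# Hypothetical slot number, used only in this sketch.
IDX_PKG_ENERGY = 0


def offset_to_idx(offset):
    """Map an MSR offset to a slot index, or -1 if the offset is unknown."""
    # Treat the AMD MSR as an alternative to the Intel one so that both
    # resolve to the same slot.
    if offset in (MSR_PKG_ENERGY_STATUS, MSR_PKG_ENERGY_STAT):
        return IDX_PKG_ENERGY
    return -1
```

Without the AMD entry in the lookup, the function would return -1 for
MSR_PKG_ENERGY_STAT and the caller would treat the read as failed.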

tools/power turbostat: update version number

tools/power turbostat: Fix DRAM Energy Unit on SKX

SKX uses a fixed DRAM Energy Unit, just like HSX and BDX.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

Revert "tools/power turbostat: adjust for temperature offset"

This reverts commit 7a4cc180adb5e3fbf6c993c8227d031206b3cf6f.

Apparently the TCC offset should not be used to adjust what temperature
we show the user after all.

(on most systems, TCC offset is 0, FWIW)

Fixes: 7a4cc180adb5
Signed-off-by: Len Brown <len.brown@intel.com>

tools/power turbostat: Support Ice Lake D

Ice Lake D is the low-end server version of Ice Lake X, so reuse
the code accordingly.

Tested-by: Wendy Wang <wendy.wang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

tools/power turbostat: Support Alder Lake Mobile

Share the code between Alder Lake Mobile and Alder Lake Desktop.

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

tools/power turbostat: print microcode patch level

(also available via "grep microcode /proc/cpuinfo")

Signed-off-by: Len Brown <len.brown@intel.com>

tools/power turbostat: add built-in-counter for IPC -- Instructions per Cycle

Use linux-perf to access the hardware instructions-retired counter.
This is necessary because the counter is not enabled by default,
and also the counter is prone to roll-over -- both of which
perf manages.

It is not necessary to use perf for the cycle counter,
because turbostat already needs to collect delta-aperf
to calculate frequency.

Signed-off-by: Len Brown <len.brown@intel.com>
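
As a rough illustration of the arithmetic, IPC over one interval is the
change in retired instructions divided by the change in APERF. The sketch
below is in Python with made-up sample values; the function name and
argument layout are assumptions, not turbostat's code.

```python
def ipc(instr_before, instr_after, aperf_before, aperf_after):
    """Instructions per cycle over one measurement interval."""
    delta_cycles = aperf_after - aperf_before
    if delta_cycles <= 0:
        # No cycles elapsed (for example, the CPU stayed idle).
        return 0.0
    return (instr_after - instr_before) / delta_cycles
```

For example, 3000 instructions retired over 2000 APERF cycles gives an
IPC of 1.5.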

Linux 5.12

Merge tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

- Fix potential NULL pointer dereference in the auxtrace option parser

- Fix access to PID in an array when setting a PID filter in 'perf ftrace'

- Fix error return code in the 'perf data' tool and in maps__clone(),
   found using a static analysis tool from Huawei

* tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf map: Fix error return code in maps__clone()
  perf ftrace: Fix access to pid in array when setting a pid filter
  perf auxtrace: Fix potential NULL pointer dereference
  perf data: Fix error return code in perf_data__create_dir()

Merge tag 'perf_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 perf fixes from Borislav Petkov:

- Fix Broadwell Xeon's stepping in the PEBS isolation table of CPUs

- Fix a panic when initializing perf uncore machinery on Haswell and
   Broadwell servers

* tag 'perf_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[]
  perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3

Merge tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Borislav Petkov:
"Fix ordering in the queued writer lock's slowpath"

* tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/qrwlock: Fix ordering in queued_write_lock_slowpath()

Merge tag 'sched_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Borislav Petkov:
"Fix a typo in a macro ifdeffery"

* tag 'sched_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
preempt/dynamic: Fix typo in macro conditional statement

Merge tag 'x86_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Borislav Petkov:
"Fix an out-of-bounds memory access when setting up a crash kernel with
kexec"

* tag 'x86_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/crash: Fix crash_setup_memmap_entries() out-of-bounds access

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fix from Paolo Bonzini:
"Fix SRCU bug introduced in the merge window"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/xen: Take srcu lock when accessing kvm_memslots()

Revert "net/rds: Avoid potential use after free in rds_send_remove_from_sock"

This reverts commit 4d841c260c3052a9667a21a86396de22fc80bcdf.

The games with 'rm' are on (two separate instances) of a local variable,
and make no difference.

Quoting Aditya Pakki:
"I was the author of the patch and it was the cause of the giant UMN
  revert.

  The patch is garbage and I was unaware of the steps involved in
  retracting it. I *believed* the maintainers would pull it, given it
  was already under Greg's list. The patch does not introduce any bugs
  but is pointless and is stupid. I accept my incompetence and for not
  requesting a revert earlier."

Link: https://lwn.net/Articles/854319/
Requested-by: Aditya Pakki <pakki001@umn.edu>
Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'pinctrl-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
"Late pin control fixes, would have been in the main pull request
  normally but hey I got lucky and we got another week to polish up
  v5.12 so here we go.

  One driver fix and one making the core debugfs work:

   - Fix the number of pins in the community of the Intel Lewisburg SoC

   - Show pin numbers for controllers with base = 0 in the new debugfs
     feature"

* tag 'pinctrl-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: core: Show pin numbers for the controllers with base = 0
  pinctrl: lewisburg: Update number of pins in community

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"5 patches.

  Subsystems affected by this patch series: coda, overlayfs, and
  mm (pagecache and memcg)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  tools/cgroup/slabinfo.py: updated to work on current kernel
  mm/filemap: fix mapping_seek_hole_data on THP & 32-bit
  mm/filemap: fix find_lock_entries hang on 32-bit THP
  ovl: fix reference counting in ovl_mmap error path
  coda: fix reference counting in coda_file_mmap error path

Merge tag 'block-5.12-2021-04-23' of git://git.kernel.dk/linux-block

Pull block fix from Jens Axboe:
"A single fix for a behavioral regression in this series, when
re-reading the partition table with partitions open"

* tag 'block-5.12-2021-04-23' of git://git.kernel.dk/linux-block:
block: return -EBUSY when there are open partitions in blkdev_reread_part

tools/cgroup/slabinfo.py: updated to work on current kernel

The slabinfo.py script does not work with the current kernel version.

First, it was unable to recognise the SLUB subsystem, and when I specified
it manually it failed again with

  AttributeError: 'struct page' has no member 'obj_cgroups'

.. and then again with

  File "tools/cgroup/memcg_slabinfo.py", line 221, in main
    memcg.kmem_caches.address_of_(),
  AttributeError: 'struct mem_cgroup' has no member 'kmem_caches'

Link: https://lkml.kernel.org/r/cec1a75e-43b4-3d64-2084-d9f98fda037f@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Tested-by: Roman Gushchin <guro@fb.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/filemap: fix mapping_seek_hole_data on THP & 32-bit

No problem on 64-bit, or without huge pages, but xfstests generic/285
and other SEEK_HOLE/SEEK_DATA tests have regressed on huge tmpfs, and on
32-bit architectures, with the new mapping_seek_hole_data(). Several
different bugs turned out to need fixing.

u64 cast to stop losing bits when converting unsigned long to loff_t
(and let's use shifts throughout, rather than mixed with * and /).

Use round_up() when advancing pos, to stop assuming that pos was already
THP-aligned when advancing it by THP-size. (This use of round_up()
assumes that any THP has THP-aligned index: true at present and true
going forward, but could be recoded to avoid the assumption.)

Use xas_set() when iterating away from a THP, so that xa_index stays in
synch with start, instead of drifting away to return bogus offset.

Check start against end to avoid wrapping 32-bit xa_index to 0 (and to
handle these additional cases, seek_data or not, it's easier to break
the loop than goto: so rearrange exit from the function).

[hughd@google.com: remove unneeded u64 casts, per Matthew]
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104221347240.1170@eggly.anvils
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104211737410.3299@eggly.anvils
Fixes: 4c1d288be54a ("mm/filemap: add mapping_seek_hole_data")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
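
Two of the bugs above come down to plain arithmetic and can be sketched
outside the kernel. The Python below models a 32-bit unsigned long with a
mask; PAGE_SHIFT = 12, the 2 MiB huge page size, and all function names
are assumptions for the example rather than the kernel's code.

```python
PAGE_SHIFT = 12                      # assumes 4 KiB pages
ULONG_MASK = (1 << 32) - 1           # models a 32-bit unsigned long
U64_MASK = (1 << 64) - 1


def pos_truncated(index):
    """index << PAGE_SHIFT evaluated in 32-bit arithmetic: high bits are lost."""
    return (index << PAGE_SHIFT) & ULONG_MASK


def pos_widened(index):
    """The same shift after widening the index to 64 bits first."""
    return (index << PAGE_SHIFT) & U64_MASK


def round_up(x, align):
    """Round x up to the next multiple of align (align is a power of two)."""
    return (x + align - 1) & ~(align - 1)
```

With index 1 << 20, the truncated form yields 0 while the widened form
yields 4 GiB. For a pos of 0x201000 inside a 2 MiB huge page, adding the
huge page size would overshoot to 0x401000, whereas
round_up(pos + 1, 0x200000) lands on the next boundary at 0x400000.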

mm/filemap: fix find_lock_entries hang on 32-bit THP

No problem on 64-bit, or without huge pages, but xfstests generic/308
hung uninterruptibly on 32-bit huge tmpfs.

Since commit 87c3695e6356 ("Clarify (and fix) in 4.13 MAX_LFS_FILESIZE
macros"), MAX_LFS_FILESIZE is only a PAGE_SIZE away from wrapping 32-bit
xa_index to 0, so the new find_lock_entries() has to be extra careful
when handling a THP.

Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104211735430.3299@eggly.anvils
Fixes: ce869ae5cf28 ("mm: add and use find_lock_entries")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
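
The wrap can be illustrated with a small model. This Python sketch assumes
512 pages per huge page and a 32-bit index; the function names are
hypothetical and the check shown is only one way to be careful, not
necessarily what find_lock_entries() does.

```python
ULONG_MAX = 0xFFFFFFFF     # largest value of a 32-bit unsigned long
THP_NR = 512               # pages per huge page (assumes 2 MiB THP, 4 KiB pages)


def next_index_wrapping(index):
    """Step past the huge page containing index using 32-bit arithmetic."""
    head = index & ~(THP_NR - 1)
    return (head + THP_NR) & ULONG_MAX


def next_index_checked(index, end):
    """Return the next index to visit, or None once the walk must stop."""
    head = index & ~(THP_NR - 1)
    nxt = head + THP_NR
    if nxt > end or nxt > ULONG_MAX:
        return None
    return nxt
```

For the last huge page in the index space (head 0xFFFFFE00), the wrapping
version returns 0 and a loop built on it would start over from the
beginning, while the checked version reports that the walk is finished.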

ovl: fix reference counting in ovl_mmap error path

mmap_region() now calls fput() on the vma->vm_file.

Fix this by using vma_set_file() so it doesn't need to be handled
manually here any more.

Link: https://lkml.kernel.org/r/20210421132012.82354-2-christian.koenig@amd.com
Fixes: 0e759ccdeeb1 ("mm: mmap: fix fput in error path v2")
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: <stable@vger.kernel.org> [5.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

coda: fix reference counting in coda_file_mmap error path

mmap_region() now calls fput() on the vma->vm_file.

So we need to drop the extra reference on the coda file instead of the
host file.

Link: https://lkml.kernel.org/r/20210421132012.82354-1-christian.koenig@amd.com
Fixes: 0e759ccdeeb1 ("mm: mmap: fix fput in error path v2")
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: <stable@vger.kernel.org> [5.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

KVM: x86/xen: Take srcu lock when accessing kvm_memslots()

kvm_memslots() will be called by kvm_write_guest_offset_cached(), so we should
take the srcu lock. Let's pull the srcu lock operation out of kvm_steal_time_set_preempted()
again to fix the Xen part.

Fixes: 652519a15a3 ("KVM: x86/xen: Add support for vCPU runstate information")
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1619166200-9215-1-git-send-email-wanpengli@tencent.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'arm-fixes-5.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
"These should be the final fixes for v5.12.

  There is one fix for SD card detection on one Allwinner board, and a
  few fixes for the Tegra platform that I had already queued up for
  v5.13 due to a communication problem. This addresses MMC device
  ordering on multiple machines, audio support on Jetson AGX Xavier and
  suspend/resume on Jetson TX2"

* tag 'arm-fixes-5.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTS
  arm64: tegra: Move clocks from RT5658 endpoint to device node
  arm64: tegra: Fix mmc0 alias for Jetson Xavier NX
  arm64: tegra: Set fw_devlink=on for Jetson TX2
  arm64: tegra: Add unit-address for ACONNECT on Tegra186

perf map: Fix error return code in maps__clone()

Although 'err' is initialized to -ENOMEM, it is reassigned by the
"err = unwind__prepare_access(...)" statement in the for loop. As a
result, the value of 'err' is unknown when map__clone() fails.

Fixes: e70e821fcb160e0e ("perf unwind: Call unwind__prepare_access for forked thread")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: zhen lei <thunder.leizhen@huawei.com>
Link: http://lore.kernel.org/lkml/20210415092744.3793-1-thunder.leizhen@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
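
The pattern is easy to reproduce in isolation. The Python sketch below
mirrors the control flow described above with hypothetical callables
standing in for map__clone() and unwind__prepare_access(); it is an
illustration of the bug class, not the perf source.

```python
ENOMEM = 12


def clone_maps_buggy(maps, clone, prepare_access):
    err = -ENOMEM
    for m in maps:
        new = clone(m)
        if new is None:
            # err may hold 0 from a previous, successful iteration.
            return err
        err = prepare_access(new)
        if err:
            return err
    return 0


def clone_maps_fixed(maps, clone, prepare_access):
    for m in maps:
        new = clone(m)
        if new is None:
            # Set the error explicitly at the point of failure.
            return -ENOMEM
        err = prepare_access(new)
        if err:
            return err
    return 0
```

If the second clone fails after the first iteration succeeded, the buggy
version returns 0 (success) and the fixed version returns -ENOMEM.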

perf ftrace: Fix access to pid in array when setting a pid filter

The command 'perf ftrace -v -- ls' fails on s390 (at least on 5.12.0rc6).

The root cause is a missing pointer dereference which causes an
array element address to be used as PID.

Fix this by extracting the PID.

Output before:
  # ./perf ftrace -v -- ls
  function_graph tracer is used
  write '-263732416' to tracing/set_ftrace_pid failed: Invalid argument
  failed to set ftrace pid
  #

Output after:
   ./perf ftrace -v -- ls
   function_graph tracer is used
   # tracer: function_graph
   #
   # CPU  DURATION                  FUNCTION CALLS
   # |     |   |                     |   |   |   |
   4)               |  rcu_read_lock_sched_held() {
   4)   0.552 us    |    rcu_lockdep_current_cpu_online();
   4)   6.124 us    |  }

Reported-by: Alexander Schmidt <alexschm@de.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20210421120400.2126433-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf auxtrace: Fix potential NULL pointer dereference

In the function auxtrace_parse_snapshot_options(), the callback pointer
"itr->parse_snapshot_options" can be NULL if it has not been set during
the AUX record initialization. This can crash the tool if the
callback pointer "itr->parse_snapshot_options" is dereferenced without
a NULL check.

Add a NULL check for the pointer "itr->parse_snapshot_options" before
invoking the callback.

Fixes: 86fe783d403ec67f ("perf tools: Add AUX area tracing Snapshot Mode")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: http://lore.kernel.org/lkml/20210420151554.2031768-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
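
The guard can be sketched in Python, with None playing the role of NULL.
The class, the argument names, and the -1 error value are assumptions for
the example; only the idea of checking the callback before calling it
comes from the description above.

```python
class AuxtraceRecord:
    """Toy stand-in for the AUX record structure."""

    def __init__(self, parse_snapshot_options=None):
        # The callback stays None unless initialization sets it.
        self.parse_snapshot_options = parse_snapshot_options


def auxtrace_parse_snapshot_options(itr, opts, text):
    # Call the callback only if both the record and the callback exist.
    if itr is not None and itr.parse_snapshot_options is not None:
        return itr.parse_snapshot_options(itr, opts, text)
    return -1  # assumed error value for "nothing to snapshot"
```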

Merge tag 'drm-fixes-2021-04-23' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"Just some small i915 and amdgpu fixes this week, should be all until
  you open the merge window.

  amdgpu:
   - Fix gpuvm page table update issue
   - Modifier fixes
   - Register fix for dimgrey cavefish

  i915:
   - GVT's BDW regression fix for cmd parser
   - Fix modesetting in case of unexpected AUX timeouts"

* tag 'drm-fixes-2021-04-23' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu: fix GCR_GENERAL_CNTL offset for dimgrey_cavefish
  amd/display: allow non-linear multi-planar formats
  drm/amd/display: Update modifier list for gfx10_3
  drm/amdgpu: reserve fence slot to update page table
  drm/i915: Fix modesetting in case of unexpected AUX timeouts
  drm/i915/gvt: Fix BDW command parser regression

Merge tag 'gpio-fixes-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fix from Bartosz Golaszewski:
"Save and restore the sysconfig register in gpio-omap to fix a
power-management issue"

* tag 'gpio-fixes-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: omap: Save and restore sysconfig

Merge branch 'tegra/dt64' into arm/fixes

arm64: tegra: Device tree fixes for v5.12-rc6

This contains a couple of device tree fixes for the v5.12 release cycle.
These are needed for proper audio support on Jetson AGX Xavier, to boot
the Jetson Xavier NX from an SD card and to be able to suspend/resume
the Jetson TX2.

* tegra/dt64:
  arm64: tegra: Move clocks from RT5658 endpoint to device node
  arm64: tegra: Fix mmc0 alias for Jetson Xavier NX
  arm64: tegra: Set fw_devlink=on for Jetson TX2
  arm64: tegra: Add unit-address for ACONNECT on Tegra186

Link: https://lore.kernel.org/linux-arm-kernel/YILD4yyPXuiYbHW1@orome.fritz.box/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge tag 'drm-intel-fixes-2021-04-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- GVT's BDW regression fix for cmd parser (Zhenyu)
- Fix modesetting in case of unexpected AUX timeouts (Imre)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YIGZ3pQPgPQtZtyI@intel.com

Merge tag 'amd-drm-fixes-5.12-2021-04-21' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-5.12-2021-04-21:

amdgpu:
- Fix gpuvm page table update issue
- Modifier fixes
- Register fix for dimgrey cavefish

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210421220456.3839-1-alexander.deucher@amd.com

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
"Very late in the cycle but both risky if left unfixed and more or less
  obvious.."

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vdpa/mlx5: Set err = -ENOMEM in case dma_map_sg_attrs fails
  vhost-vdpa: protect concurrent access to vhost device iotlb

vdpa/mlx5: Set err = -ENOMEM in case dma_map_sg_attrs fails

Set err = -ENOMEM if dma_map_sg_attrs() fails so the function returns
an error.

Fixes: eb3156f4b78f ("vdpa/mlx5: Add shared memory registration code")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20210411083646.910546-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

vhost-vdpa: protect concurrent access to vhost device iotlb

Protect vhost device iotlb by vhost_dev->mutex. Otherwise,
it might cause corruption of the list and interval tree in
struct vhost_iotlb if userspace sends the VHOST_IOTLB_MSG_V2
message concurrently.

Fixes: 747a8757 ("vhost: introduce vDPA-based backend")
Cc: stable@vger.kernel.org
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210412095512.178-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
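
The locking idea can be sketched with Python's threading module. The
class below is a toy model in which a sorted list stands in for the list
and interval tree; the names are assumptions and nothing here reflects
the actual vhost data structures.

```python
import threading


class VhostDev:
    """Toy device whose iotlb must only be changed under the mutex."""

    def __init__(self):
        self.mutex = threading.Lock()
        self.iotlb = []  # stands in for the list and interval tree

    def process_iotlb_msg(self, start, last):
        # Serialize every update so concurrent messages cannot interleave.
        with self.mutex:
            self.iotlb.append((start, last))
            self.iotlb.sort()
```

Each message handler takes the same lock, so the two-step update (insert,
then restore ordering) is never observed half done by another thread.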

Merge tag 'sunxi-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes

One fix for the MMC card detect on the Pine H64 board

* tag 'sunxi-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
arm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTS

Link: https://lore.kernel.org/r/45fc5e4d-ef48-4729-a869-79a8f288bb83.lettre@localhost
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/tpmdd

Pull tpm fix from James Bottomley:
"This is an urgent regression fix for a tpm patch set that went in this
  merge window. It looks like a rebase before the original pull request
  lost a tpm_try_get_ops() so we have a lock imbalance in our code which
  is causing oopses. The original patch was correct on the mailing list.

  I'm sending this in agreement with Mimi (as joint maintainers of
  trusted keys) because Jarkko is off communing with the Reindeer or
  whatever it is Finns do when on holiday"

* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/tpmdd:
  KEYS: trusted: Fix TPM reservation for seal/unseal

perf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[]

The only stepping of Broadwell Xeon parts is stepping 1. Fix the
relevant isolation_ucodes[] entry, which previously enumerated
stepping 2.

Although the original commit was characterized as an optimization, it
is also a workaround for a correctness issue.

If a PMI arrives between kvm's call to perf_guest_get_msrs() and the
subsequent VM-entry, a stale value for the IA32_PEBS_ENABLE MSR may be
restored at the next VM-exit. This is because, unbeknownst to kvm, PMI
throttling may clear bits in the IA32_PEBS_ENABLE MSR. CPUs with "PEBS
isolation" don't suffer from this issue, because perf_guest_get_msrs()
doesn't report the IA32_PEBS_ENABLE value.

Fixes: 52065cbf12293 ("perf/x86/kvm: Avoid unnecessary work in guest filtering")
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Peter Shier <pshier@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/20210422001834.1748319-1-jmattson@google.com

arm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTS

Commit 20ca0fdccb37 ("arm64: dts: allwinner: Drop non-removable from
SoPine/LTS SD card") enabled the card detect GPIO for the SOPine module
and, along the way, for the Pine64-LTS, which shares the same base .dtsi.

This was based on the observation that the Pine64-LTS has a "push-push"
SD card socket, and that the schematic mentions the card detect GPIO.

After having received two reports about failing SD card access with that
patch, some more research and polls on that subject revealed that there
are at least two different versions of the Pine64-LTS out there:
- On some boards (including mine) the card detect pin is "stuck" at
high, regardless of whether a microSD card is inserted or not.
- On other boards the card-detect is working, but is active-high, by
virtue of an explicit inverter circuit, as shown in the schematic.

To cover all versions of the board out there, and to avoid taking any
chances, let's revert the introduction of the active-low CD GPIO, but
let's use the broken-cd property for the Pine64-LTS this time. That should
avoid regressions and should work for everyone, even allowing SD card
changes now.
The SOPine card detect has proven to be working, so let's keep that
GPIO in place.

Fixes: 20ca0fdccb37 ("arm64: dts: allwinner: Drop non-removable from SoPine/LTS SD card")
Reported-by: Michael Weiser <michael.weiser@gmx.de>
Reported-by: Daniel Kulesz <kuleszdl@posteo.org>
Suggested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Michael Weiser <michael.weiser@gmx.de>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20210414104740.31497-1-andre.przywara@arm.com

pinctrl: core: Show pin numbers for the controllers with base = 0

Commit dad75e14adcc ("pinctrl: core: print gpio in pins debugfs file")
enabled the GPIO pin number and label in debugfs for pin controllers. However,
it limited that feature to chips where base is a positive number. This,
in particular, excluded chips where base is 0 for historical or backward
compatibility reasons. Refactor the code to include the latter as well.

Fixes: dad75e14adcc ("pinctrl: core: print gpio in pins debugfs file")
Cc: Drew Fustini <drew@beagleboard.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Drew Fustini <drew@beagleboard.org>
Reviewed-by: Drew Fustini <drew@beagleboard.org>
Link: https://lore.kernel.org/r/20210415130356.15885-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
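
The difference between "base is positive" and "base is valid" can be shown
with a small Python sketch. The function names, the use of None for "no
GPIO chip", and the "number:label" output format are assumptions for the
example, not the pinctrl core's code.

```python
def gpio_column_before(gpio_num, chip_base, label):
    # A truthiness test treats base 0 the same as "no chip found".
    if chip_base:
        return f"{gpio_num - chip_base}:{label}"
    return "0:?"


def gpio_column_after(gpio_num, chip_base, label):
    # None means "no GPIO chip"; any base >= 0, including 0, is valid.
    if chip_base is not None and chip_base >= 0:
        return f"{gpio_num - chip_base}:{label}"
    return "0:?"
```

For a chip with base 0, the first version prints the placeholder while
the second prints the real pin number and label.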

KEYS: trusted: Fix TPM reservation for seal/unseal

The original patch 41f9915b74d8 ("KEYS: trusted: Reserve TPM for seal
and unseal operations") was correct on the mailing list:

https://lore.kernel.org/linux-integrity/20210128235621.127925-4-jarkko@kernel.org/

But it somehow got rebased so that the tpm_try_get_ops() in
tpm2_seal_trusted() got lost. This causes an imbalanced put of the
TPM ops and causes oopses on TIS based hardware.

This fix puts back the lost tpm_try_get_ops().

Fixes: 41f9915b74d8 ("KEYS: trusted: Reserve TPM for seal and unseal operations")
Reported-by: Mimi Zohar <zohar@linux.ibm.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
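
The imbalance can be modelled with a simple counter. In this Python
sketch the class and both seal functions are hypothetical; the point is
only that a release without a matching acquire is an error, which is the
situation the lost tpm_try_get_ops() created.

```python
class TpmChip:
    """Toy stand-in for a chip whose ops must be held while in use."""

    def __init__(self):
        self.ops_held = 0

    def try_get_ops(self):
        self.ops_held += 1
        return 0

    def put_ops(self):
        if self.ops_held == 0:
            raise RuntimeError("put_ops() without matching try_get_ops()")
        self.ops_held -= 1


def seal_unbalanced(chip):
    # The acquire step is missing, but the release still runs.
    chip.put_ops()


def seal_balanced(chip):
    rc = chip.try_get_ops()
    if rc:
        return rc
    try:
        return 0  # the seal commands would be sent here
    finally:
        chip.put_ops()
```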

Merge tag 'mmc-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fix from Ulf Hansson:
"Replace WARN_ONCE with dev_warn_once for non-optimal sg-alignment in
the meson-gx host driver"

* tag 'mmc-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: meson-gx: replace WARN_ONCE with dev_warn_once about scatterlist size alignment in block mode

block: return -EBUSY when there are open partitions in blkdev_reread_part

The switch to go through blkdev_get_by_dev means we now ignore the
return value from bdev_disk_changed in __blkdev_get. Add a manual
check to restore the old semantics.

Fixes: 89cbb96206b2 ("block: reopen the device in blkdev_reread_part")
Reported-by: Karel Zak <kzak@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210421160502.447418-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drm/amdgpu: fix GCR_GENERAL_CNTL offset for dimgrey_cavefish

dimgrey_cavefish has a gc_10_3 IP similar to that of sienna_cichlid,
so follow its register offset settings.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

amd/display: allow non-linear multi-planar formats

Accept non-linear buffers which use a multi-planar format, as long
as they don't use DCC.

Tested on GFX9 with NV12.

Signed-off-by: Simon Ser <contact@emersion.fr>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amd/display: Update modifier list for gfx10_3

[Why]
Current list supports modifiers that have DCC_MAX_COMPRESSED_BLOCK
set to AMD_FMT_MOD_DCC_BLOCK_128B, while AMD_FMT_MOD_DCC_BLOCK_64B
is used instead by userspace.

[How]
Replace AMD_FMT_MOD_DCC_BLOCK_128B with AMD_FMT_MOD_DCC_BLOCK_64B
for modifiers with DCC supported.

Fixes: f43793d5d6df75 ("drm/amd/display: Expose modifiers")
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu: reserve fence slot to update page table

A fence slot was not reserved before using SDMA to update the page table,
which causes the kernel BUG backtrace below when handling a VM retry fault
while the application is exiting.

[  133.048143] kernel BUG at /home/yangp/git/compute_staging/kernel/drivers/dma-buf/dma-resv.c:281!
[  133.048487] Workqueue: events amdgpu_irq_handle_ih1 [amdgpu]
[  133.048506] RIP: 0010:dma_resv_add_shared_fence+0x204/0x280
[  133.048672]  amdgpu_vm_sdma_commit+0x134/0x220 [amdgpu]
[  133.048788]  amdgpu_vm_bo_update_range+0x220/0x250 [amdgpu]
[  133.048905]  amdgpu_vm_handle_fault+0x202/0x370 [amdgpu]
[  133.049031]  gmc_v9_0_process_interrupt+0x1ab/0x310 [amdgpu]
[  133.049165]  ? kgd2kfd_interrupt+0x9a/0x180 [amdgpu]
[  133.049289]  ? amdgpu_irq_dispatch+0xb6/0x240 [amdgpu]
[  133.049408]  amdgpu_irq_dispatch+0xb6/0x240 [amdgpu]
[  133.049534]  amdgpu_ih_process+0x9b/0x1c0 [amdgpu]
[  133.049657]  amdgpu_irq_handle_ih1+0x21/0x60 [amdgpu]
[  133.049669]  process_one_work+0x29f/0x640
[  133.049678]  worker_thread+0x39/0x3f0
[  133.049685]  ? process_one_work+0x640/0x640

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.11.x

gpio: omap: Save and restore sysconfig

As we are using cpu_pm to save and restore context, we must also save and
restore the GPIO sysconfig register. This is needed because we are not
calling PM runtime functions at all with cpu_pm.

We need to save the sysconfig on idle as its value can get reconfigured by
PM runtime and can be different from the init time value. Device specific
flags like "ti,no-idle-on-init" can affect the init value.

Fixes: 86c740e8eaea ("gpio: omap: Remove custom PM calls and use cpu_pm instead")
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: Adam Ford <aford173@gmail.com>
Cc: Andreas Kemnade <andreas@kemnade.info>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>

perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3

There may be a kernel panic on the Haswell server and the Broadwell
server if snbep_pci2phy_map_init() returns an error.

The uncore_extra_pci_dev[HSWEP_PCI_PCU_3] is used in the cpu_init() to
detect the existence of the SBOX, which is a MSR type of PMON unit.
The uncore_extra_pci_dev is allocated in the uncore_pci_init(). If the
snbep_pci2phy_map_init() returns error, perf doesn't initialize the
PCI type of the PMON units, so the uncore_extra_pci_dev will not be
allocated. But perf may continue initializing the MSR type of PMON
units. A null dereference kernel panic will be triggered.

The sockets in a Haswell server or a Broadwell server are identical.
Only need to detect the existence of the SBOX once.
Current perf probes all available PCU devices and stores them into the
uncore_extra_pci_dev. It's unnecessary.
Use the pci_get_device() to replace the uncore_extra_pci_dev. Only
detect the existence of the SBOX on the first available PCU device once.

Factor out hswep_has_limit_sbox(), since the Haswell server and the
Broadwell server use the same way to detect the existence of the SBOX.

Add some macros to replace the magic number.

Fixes: 708222196218 ("perf/x86/uncore/hsw-ep: Handle systems with only two SBOXes")
Reported-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Steve Wahl <steve.wahl@hpe.com>
Link: https://lkml.kernel.org/r/1618521764-100923-1-git-send-email-kan.liang@linux.intel.com

Merge tag 'trace-v5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
"Fix tp_printk command line and trace events

  Masami added a wrapper to be able to unhash trace event pointers as
  they are only read by root anyway, and they can also be extracted by
  the raw trace data buffers. But this wrapper utilized the iterator to
  have a temporary buffer to manipulate the text with.

  tp_printk is a kernel command line option that will send the trace
  output of a trace event to the console on boot up (useful when the
  system crashes before finishing the boot). But the code used the same
  wrapper that Masami added, and its iterator did not have a buffer, and
  this caused the system to crash.

  Have the wrapper just print the trace event normally if the iterator
  has no temporary buffer"

* tag 'trace-v5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix checking event hash pointer logic when tp_printk is enabled

capabilities: require CAP_SETFCAP to map uid 0

cap_setfcap is required to create file capabilities.

Since commit 8e3c7cc45997 ("Introduce v3 namespaced file capabilities"),
a process running as uid 0 but without cap_setfcap is able to work
around this as follows: unshare a new user namespace which maps parent
uid 0 into the child namespace.

While this task will not have new capabilities against the parent
namespace, there is a loophole due to the way namespaced file
capabilities are represented as xattrs.  File capabilities valid in
userns 1 are distinguished from file capabilities valid in userns 2 by
the kuid which underlies uid 0.  Therefore the restricted root process
can unshare a new self-mapping namespace, add a namespaced file
capability onto a file, then use that file capability in the parent
namespace.

To prevent that, do not allow mapping parent uid 0 if the process which
opened the uid_map file does not have CAP_SETFCAP, which is the
capability for setting file capabilities.

As a further wrinkle: a task can unshare its user namespace, then open
its uid_map file itself, and map (only) its own uid.  In this case we do
not have the credential from before unshare, which was potentially more
restricted.  So, when creating a user namespace, we record whether the
creator had CAP_SETFCAP.  Then we can use that during map_write().
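
The decision described above can be modelled in a few lines of userspace C.
This is only a sketch under stated assumptions: the struct and function
names below (userns_model, uid_extent_model, uid_map_write_allowed) are
hypothetical and are not the kernel's data structures or API.

```c
#include <stdbool.h>

/* Hypothetical, simplified model; not the kernel's data structures. */
struct userns_model {
	bool creator_had_setfcap;	/* recorded when the namespace is created */
};

struct uid_extent_model {
	unsigned int first;		/* first uid inside the new namespace */
	unsigned int lower_first;	/* first uid in the parent namespace */
	unsigned int count;
};

/* True if the extent maps parent uid 0 into the child namespace. */
static bool maps_parent_root(const struct uid_extent_model *e)
{
	return e->count > 0 && e->lower_first == 0;
}

/*
 * Decide whether a uid_map write is allowed.
 * opener_has_setfcap:  the opener of uid_map holds CAP_SETFCAP.
 * opener_is_ns_member: the writer already lives in the new namespace, so
 *                      its pre-unshare credential is gone and the flag
 *                      recorded at creation time is used instead.
 */
bool uid_map_write_allowed(const struct userns_model *ns,
			   const struct uid_extent_model *e,
			   bool opener_has_setfcap,
			   bool opener_is_ns_member)
{
	if (!maps_parent_root(e))
		return true;
	if (opener_is_ns_member)
		return ns->creator_had_setfcap;
	return opener_has_setfcap;
}
```

In this model an unprivileged "unshare -Ur" maps a non-zero parent uid and
is always allowed, while a self-mapping of parent uid 0 depends on the flag
recorded when the namespace was created.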

With this patch:

1. Unprivileged user can still unshare -Ur

   ubuntu@caps:~$ unshare -Ur
   root@caps:~# logout

2. Root user can still unshare -Ur

   ubuntu@caps:~$ sudo bash
   root@caps:/home/ubuntu# unshare -Ur
   root@caps:/home/ubuntu# logout

3. Root user without CAP_SETFCAP cannot unshare -Ur:

   root@caps:/home/ubuntu# /sbin/capsh --drop=cap_setfcap --
   root@caps:/home/ubuntu# /sbin/setcap cap_setfcap=p /sbin/setcap
   unable to set CAP_SETFCAP effective capability: Operation not permitted
   root@caps:/home/ubuntu# unshare -Ur
   unshare: write failed /proc/self/uid_map: Operation not permitted

Note: an alternative solution would be to allow uid 0 mappings by
processes without CAP_SETFCAP, but to prevent such a namespace from
writing any file capabilities.  This approach can be seen at [1].

Background history: commit cccdc282870 ("capabilities: Don't allow
writing ambiguous v3 file capabilities") tried to fix the issue by
preventing v3 fscaps from being written to disk when the root uid would map
to the same uid in nested user namespaces.  This led to regressions for
various workloads.  For example, see [2].  Ultimately this is a valid
use-case we have to support, meaning we had to revert this change in
5947a3fbdb41 ("Revert cccdc2828707 ("capabilities: Don't allow writing
ambiguous v3 file capabilities")").

Link: https://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux.git/log/?h=2021-04-15/setfcap-nsfscaps-v4
Link: https://github.com/containers/buildah/issues/3071
Signed-off-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Andrew G. Morgan <morgan@kernel.org>
Tested-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

perf data: Fix error return code in perf_data__create_dir()

Although 'ret' is initialized to -1, it is reassigned by the
"ret = open(...)" statement in the for loop. As a result, 'ret' does not
reliably hold an error code when asprintf() fails.
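
The pattern can be reproduced in a small, self-contained C sketch. It is not
the perf source: fake_asprintf_fails() and fake_open() are hypothetical
stand-ins for asprintf() and open(), and returning -ENOMEM is one plausible
choice of error code, assumed here for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for asprintf() and open(); not the perf code. */
static bool fake_asprintf_fails(int i, int fail_at) { return i == fail_at; }
static int fake_open(int i) { return 3 + i; }	/* pretend fd, always >= 0 */

/* Buggy shape: 'ret' is overwritten by the open() result in the loop. */
int create_dir_buggy(int nr, int fail_at)
{
	int i, ret = -1;

	for (i = 0; i < nr; i++) {
		if (fake_asprintf_fails(i, fail_at))
			goto out;	/* 'ret' may hold a stale fd here */
		ret = fake_open(i);
		if (ret < 0)
			goto out;
	}
	ret = 0;
out:
	return ret;
}

/* Fixed shape: set the error code explicitly before bailing out. */
int create_dir_fixed(int nr, int fail_at)
{
	int i, ret = -1;

	for (i = 0; i < nr; i++) {
		if (fake_asprintf_fails(i, fail_at)) {
			ret = -ENOMEM;
			goto out;
		}
		ret = fake_open(i);
		if (ret < 0)
			goto out;
	}
	ret = 0;
out:
	return ret;
}
```

With a failure injected on the second iteration, the buggy shape returns the
non-negative descriptor from the first iteration, which a caller would read
as success, while the fixed shape returns a negative error code.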

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210415083417.3740-1-thunder.leizhen@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

x86/crash: Fix crash_setup_memmap_entries() out-of-bounds access

Commit in Fixes: added support for kexec-ing a kernel on panic using a
new system call. As part of it, it does prepare a memory map for the new
kernel.

However, while doing so, it wrongly accesses memory it has not
allocated: it accesses the first element of the cmem->ranges[] array in
memmap_exclude_ranges() but it has not allocated the memory for it in
crash_setup_memmap_entries(). As KASAN reports:

  BUG: KASAN: vmalloc-out-of-bounds in crash_setup_memmap_entries+0x17e/0x3a0
  Write of size 8 at addr ffffc90000426008 by task kexec/1187

  (gdb) list *crash_setup_memmap_entries+0x17e
  0xffffffff8107cafe is in crash_setup_memmap_entries (arch/x86/kernel/crash.c:322).
  317                                      unsigned long long mend)
  318     {
  319             unsigned long start, end;
  320
  321             cmem->ranges[0].start = mstart;
  322             cmem->ranges[0].end = mend;
  323             cmem->nr_ranges = 1;
  324
  325             /* Exclude elf header region */
  326             start = image->arch.elf_load_addr;
  (gdb)

Make sure the ranges array has a single element allocated.
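
A userspace sketch of the allocation issue, assuming a structure that ends in
a flexible array member. The type and function names (range_model,
crash_mem_model, crash_mem_alloc, set_first_range) are hypothetical, and
calloc() stands in for the kernel allocator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel structures. */
struct range_model {
	unsigned long long start, end;
};

struct crash_mem_model {
	unsigned int max_nr_ranges;
	unsigned int nr_ranges;
	struct range_model ranges[];	/* flexible array member */
};

/*
 * Allocate the header plus room for 'nr' ranges. Allocating only
 * sizeof(struct crash_mem_model) would leave ranges[0] out of bounds.
 */
struct crash_mem_model *crash_mem_alloc(unsigned int nr)
{
	struct crash_mem_model *cmem;

	cmem = calloc(1, sizeof(*cmem) + nr * sizeof(cmem->ranges[0]));
	if (cmem)
		cmem->max_nr_ranges = nr;
	return cmem;
}

/* Mirrors the first stores shown in the gdb listing above. */
void set_first_range(struct crash_mem_model *cmem,
		     unsigned long long mstart, unsigned long long mend)
{
	cmem->ranges[0].start = mstart;
	cmem->ranges[0].end = mend;
	cmem->nr_ranges = 1;
}
```

Calling crash_mem_alloc(1) before set_first_range() keeps the stores to
ranges[0] inside the allocation.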

[ bp: Write a proper commit message. ]

Fixes: e8e898f679f1 ("kexec: support for kexec on panic using new system call")
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Young <dyoung@redhat.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/725fa3dc1da2737f0f6188a1a9701bead257ea9d.camel@gmx.de

tracing: Fix checking event hash pointer logic when tp_printk is enabled

Pointers in events that are printed are unhashed if the flags allow it,
and the logic to do so is called before processing the event output from
the raw ring buffer. In most cases, this is done when a user reads one of
the trace files.

But if tp_printk is added on the kernel command line, this logic is done
for trace events when they are triggered, and their output goes out via
printk. The unhash logic (and even the validation of the output) did not
support the tp_printk output, and would crash.

Link: https://lore.kernel.org/linux-tegra/9835d9f1-8d3a-3440-c53f-516c2606ad07@nvidia.com/
Fixes: ce133cf6c0cd ("tracing: Show real address for trace event arguments")
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Merge tag 'gvt-fixes-2021-04-20' of https://github.com/intel/gvt-linux into drm-intel-fixes

gvt-fixes-2021-04-20

- Fix cmd parser regression on BDW (Zhenyu)

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210420023312.GL1551@zhen-hp.sh.intel.com

Revert "gcov: clang: fix clang-11+ build"

This reverts commit b7747c243b288fa07b7f602b01aa3195cd171c69.

Nathan Chancellor points out that it should not have been merged into
mainline by itself. It was a fix for "gcov: use kvmalloc()", which is
still in -mm/-next. Merging it alone has broken the build.

Link: https://github.com/ClangBuiltLinux/continuous-integration2/runs/2384465683?check_suite_focus=true
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drm/i915: Fix modesetting in case of unexpected AUX timeouts

In case AUX failures happen unexpectedly during a modeset, the driver
should still complete the modeset. In particular the driver should
perform the link training sequence steps even in case of an AUX failure,
as this sequence also includes port initialization steps. Not doing that
can leave the port/pipe in a broken state and lead for instance to a
flip done timeout.

Fix this by continuing with link training (in a no-LTTPR mode) if the
DPRX DPCD readout failed for some reason at the beginning of link
training. After a successful connector detection we already have the
DPCD read out and cached, so the failed repeated read for it should not
cause a problem. Note that a partial AUX read could in theory partly
overwrite the cached DPCD (and return error) but this overwrite should
not happen if the returned values are corrupted (due to a timeout or
some other IO error).

Kudos to Ville for root-causing the problem.

Fixes: 356b8153e8cf ("drm/i915: Disable LTTPR support when the DPCD rev < 1.4")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/3308
Cc: stable@vger.kernel.org # 5.11
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210412232413.2755054-1-imre.deak@intel.com
(cherry picked from commit e42e7e585984b85b0fb9dd1fefc85ee4800ca629)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[adjusted Fixes: tag]

preempt/dynamic: Fix typo in macro conditional statement

Commit f254f85feffc ("preempt/dynamic: Provide irqentry_exit_cond_resched()
static call") tried to provide irqentry_exit_cond_resched() static call
in irqentry_exit, but has a typo in macro conditional statement.

Fixes: f254f85feffc ("preempt/dynamic: Provide irqentry_exit_cond_resched() static call")
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210410073523.5493-1-zhouzhouyi@gmail.com

mmc: meson-gx: replace WARN_ONCE with dev_warn_once about scatterlist size alignment in block mode

Since commit ac6df0198111 ("mmc: meson-gx: check for scatterlist size alignment in block mode"),
SDIO SD_IO_RW_EXTENDED transfers are properly filtered, but some drivers
like brcmfmac still give a block sg buffer size not aligned with the SDIO block size,
triggering a WARN_ONCE() with a scary stack trace even if the transfer works fine
but with possibly degraded performance.

Simply replace with dev_warn_once() to inform user this should be fixed to avoid
degraded performance.

This should be ultimately fixed in brcmfmac, but since it's only a performance issue
the warning should be removed.

Fixes: ac6df0198111 ("mmc: meson-gx: check for scatterlist size alignment in block mode")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://lore.kernel.org/r/20210416094347.2015896-1-narmstrong@baylibre.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

Linux 5.12-rc8

Merge tag 'arm-fixes-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
"Another smaller set of fixes for three of the Arm platforms:

  TI OMAP:

     Fix swapped mmc device order also for omap3 that got changed with
     the recent PROBE_PREFER_ASYNCHRONOUS changes. While eventually the
     aliases should be board specific, all the mmc device instances are
     all there in the SoC, and we do probe them by default so that PM
     runtime can idle the devices if left enabled from the bootloader.

  Qualcomm Snapdragon:

     This bypasses the recently introduced interconnect handling in
     the GENI (serial engine) driver when running off ACPI, as this
     causes the GENI probe to fail and the Lenovo Yoga C630 to boot
     without keyboard and touchpad.

  Allwinner:

     One 32kHz clock fix for the beelink gs1, a CD polarity fix for the
     SoPine, some MAINTAINERS maintenance, and a clk / reset switch to
     our headers"

* tag 'arm-fixes-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: dts: allwinner: h6: beelink-gs1: Remove ext. 32 kHz osc reference
  MAINTAINERS: Match on allwinner keyword
  MAINTAINERS: Add our new mailing-list
  arm64: dts: allwinner: Fix SD card CD GPIO for SOPine systems
  arm64: dts: allwinner: h6: Switch to macros for RSB clock/reset indices
  ARM: OMAP2+: Fix uninitialized sr_inst
  ARM: dts: Fix swapped mmc order for omap3
  ARM: OMAP2+: Fix warning for omap_init_time_of()
  soc: qcom: geni: shield geni_icc_get() for ACPI boot

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:

- Halve maximum number of CPUs if DEBUG_KMAP_LOCAL is enabled

- Fix conversion for_each_membock() to for_each_mem_range()

- Fix footbridge PCI mapping

- Avoid uprobes hooking on thumb instructions

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9071/1: uprobes: Don't hook on thumb instructions
  ARM: footbridge: fix PCI interrupt mapping
  ARM: 9069/1: NOMMU: Fix conversion for_each_membock() to for_each_mem_range()
  ARM: 9063/1: mm: reduce maximum number of CPUs if DEBUG_KMAP_LOCAL is enabled

ARM: 9071/1: uprobes: Don't hook on thumb instructions

Since uprobes is not supported for thumb, check that the thumb bit is
not set when matching the uprobes instruction hooks.

The Arm UDF instructions used for uprobes triggering
(UPROBE_SWBP_ARM_INSN and UPROBE_SS_ARM_INSN) coincidentally share the
same encoding as a pair of unallocated 32-bit thumb instructions (not
UDF) when the condition code is 0b1111 (0xf). This in effect makes it
possible to trigger the uprobes functionality from thumb, and at that
using two unallocated instructions which are not permanently undefined.
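
A small model of the match rule may help. It assumes a hook that matches on
masked instruction bits and masked CPSR bits; the struct name, the function
name and EXAMPLE_INSN are illustrative only and should not be read as the
kernel's definitions. The CPSR constants follow the Arm architecture (T bit
is bit 5, mode field is bits 4:0, user mode is 0x10).

```c
#include <stdbool.h>
#include <stdint.h>

/* CPSR bit/field values as defined by the Arm architecture. */
#define PSR_T_BIT	0x00000020u	/* Thumb state */
#define MODE_MASK	0x0000001fu
#define USR_MODE	0x00000010u

/* Illustrative encoding only; consult the kernel headers for real values. */
#define EXAMPLE_INSN	0x07f001f9u

/* Hypothetical model of an undefined-instruction hook's match rule. */
struct undef_hook_model {
	uint32_t instr_mask, instr_val;
	uint32_t cpsr_mask, cpsr_val;
};

/* A hook fires only if both the instruction and the CPSR bits match. */
bool hook_matches(const struct undef_hook_model *h,
		  uint32_t instr, uint32_t cpsr)
{
	return (instr & h->instr_mask) == h->instr_val &&
	       (cpsr & h->cpsr_mask) == h->cpsr_val;
}
```

If cpsr_mask covers only the mode field, a user-mode Thumb context still
matches. Adding PSR_T_BIT to cpsr_mask while keeping cpsr_val at USR_MODE
requires the Thumb bit to be clear.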

Signed-off-by: Fredrik Strupe <fredrik@strupe.net>
Cc: stable@vger.kernel.org
Fixes: 73b18eed65b1 ("ARM: add uprobes support")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Two fixes: the libsas fix is for a problem that occurs when trying to
  change the cache type of an ATA device and the libiscsi one is a
  regression fix from this merge window"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: libsas: Reset num_scatter if libata marks qc as NODATA
  scsi: iscsi: Fix iSCSI cls conn state

Merge tag 'drm-fixes-2021-04-18' of git://anongit.freedesktop.org/drm/drm

Pull vmwgfx fixes from Dave Airlie:
"This contains two regression fixes for vmwgfx, one due to a refactor
  which meant locks were being used before initialisation, and the other
  in fixing up some warnings from the core when destroying pinned
  buffers.

  vmwgfx:

   - fixed unpinning before destruction

   - lockdep init reordering"

* tag 'drm-fixes-2021-04-18' of git://anongit.freedesktop.org/drm/drm:
  drm/vmwgfx: Make sure bo's are unpinned before putting them back
  drm/vmwgfx: Fix the lockdep breakage
  drm/vmwgfx: Make sure we unpin no longer needed buffers

Merge tag 'vmwgfx-fixes-2021-04-14' of gitlab.freedesktop.org:zack/vmwgfx into drm-fixes

vmwgfx fixes for regressions in 5.12

Here's a set of 3 patches fixing ugly regressions
in the vmwgfx driver. We broke lock initialization
code and ended up using spinlocks before initialization
breaking lockdep.
Also there was a bit of a fallout from drm changes
which made the core validate that unreferenced buffers
have been unpinned. vmwgfx pinning code predates a lot
of the core drm and wasn't written to account for those
semantics. Fortunately changes required to fix it
are not too intrusive.
The changes have been validated by our internal ci.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Zack Rusin <zackr@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f7add0a2-162e-3bd2-b1be-344a94f2acbf@vmware.com

Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fix from Wolfram Sang:
"One more driver bugfix for I2C"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: mv64xxx: Fix random system lock caused by runtime PM

readdir: make sure to verify directory entry for legacy interfaces too

This does the directory entry name verification for the legacy
"fillonedir" (and compat) interface that goes all the way back to the
dark ages before we had a proper dirent, and the readdir() system call
returned just a single entry at a time.

Nobody should use this interface unless you still have binaries from
1991, but let's do it right.

This came up during discussions about unsafe_copy_to_user() and proper
checking of all the inputs to it, as the networking layer is looking to
use it in a few new places.  So let's make sure the _old_ users do it
all right and proper, before we add new ones.

See also commit 2ec5a2ba96d1 ("Make filldir[64]() verify the directory
entry filename is valid") which did the proper modern interfaces that
people actually use. It had a note:

    Note that I didn't bother adding the checks to any legacy interfaces
    that nobody uses.

which this now corrects.  Note that we really don't care about POSIX and
the presence of '/' in a directory entry, but verify_dirent_name() also
ends up doing the proper name length verification which is what the
input checking discussion was about.
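
For illustration, a name check of this kind can be sketched in userspace C as
below. check_dirent_name() is a hypothetical name, and the exact limits are
an assumption of this sketch rather than a statement about the kernel helper.

```c
#include <errno.h>
#include <limits.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096	/* fallback if <limits.h> does not provide it */
#endif

/*
 * Userspace sketch of a directory entry name check: reject empty or
 * over-long names and names containing '/'.
 */
int check_dirent_name(const char *name, int len)
{
	if (len <= 0 || len >= PATH_MAX)
		return -EIO;
	if (memchr(name, '/', (size_t)len))
		return -EIO;
	return 0;
}
```

The length test is what matters for the input checking discussion: a caller
that copies 'len' bytes to user space has a validated bound before doing so.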

[ Another option would be to remove the support for this particular very
  old interface: any binaries that use it are likely a.out binaries, and
  they will no longer run anyway since we removed a.out binfmt support
  in commit d3a95a38a391 ("x86: Deprecate a.out support").

  But I'm not sure which came first: getdents() or ELF support, so let's
  pretend somebody might still have a working binary that uses the
  legacy readdir() case.. ]

Link: https://lore.kernel.org/lkml/CAHk-=wjbvzCAhAtvG0d81W5o0-KT5PPTHhfJ5ieDFq+bGtgOYg@mail.gmail.com/
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.12-rc8, including fixes from netfilter, and
  bpf. BPF verifier changes stand out, otherwise things have slowed
  down.

  Current release - regressions:

   - gro: ensure frag0 meets IP header alignment

   - Revert "net: stmmac: re-init rx buffers when mac resume back"

   - ethernet: macb: fix the restore of cmp registers

  Previous releases - regressions:

   - ixgbe: Fix NULL pointer dereference in ethtool loopback test

   - ixgbe: fix unbalanced device enable/disable in suspend/resume

   - phy: marvell: fix detection of PHY on Topaz switches

   - make tcp_allowed_congestion_control readonly in non-init netns

   - xen-netback: Check for hotplug-status existence before watching

  Previous releases - always broken:

   - bpf: mitigate a speculative oob read of up to map value size by
     tightening the masking window

   - sctp: fix race condition in sctp_destroy_sock

   - sit, ip6_tunnel: Unregister catch-all devices

   - netfilter: nftables: clone set element expression template

   - netfilter: flowtable: fix NAT IPv6 offload mangling

   - net: geneve: check skb is large enough for IPv4/IPv6 header

   - netlink: don't call ->netlink_bind with table lock held"

* tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
  netlink: don't call ->netlink_bind with table lock held
  MAINTAINERS: update my email
  bpf: Update selftests to reflect new error states
  bpf: Tighten speculative pointer arithmetic mask
  bpf: Move sanitize_val_alu out of op switch
  bpf: Refactor and streamline bounds check into helper
  bpf: Improve verifier error messages for users
  bpf: Rework ptr_limit into alu_limit and add common error path
  bpf: Ensure off_reg has no mixed signed bounds for all types
  bpf: Move off_reg into sanitize_ptr_alu
  bpf: Use correct permission flag for mixed signed bounds arithmetic
  ch_ktls: do not send snd_una update to TCB in middle
  ch_ktls: tcb close causes tls connection failure
  ch_ktls: fix device connection close
  ch_ktls: Fix kernel panic
  i40e: fix the panic when running bpf in xdpdrv mode
  net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta
  net/mlx5e: Fix setting of RS FEC mode
  net/mlx5: Fix setting of devlink traps in switchdev mode
  Revert "net: stmmac: re-init rx buffers when mac resume back"
  ...

Merge tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:
"The largest change is for a regression that landed during -rc1 for
  block-device read-only handling. Vaibhav found a new use for the
  ability (originally introduced by virtio_pmem) to call back to the
  platform to flush data, but also found an original bug in that
  implementation. Lastly, Arnd cleans up some compile warnings in dax.

  This has all appeared in -next with no reported issues.

  Summary:

   - Fix a regression of read-only handling in the pmem driver

   - Fix a compile warning

   - Fix support for platform cache flush commands on powerpc/papr"

* tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  libnvdimm/region: Fix nvdimm_has_flush() to handle ND_REGION_ASYNC
  libnvdimm: Notify disk drivers to revalidate region read-only
  dax: avoid -Wempty-body warnings

Merge tag 'cxl-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL memory class fixes from Dan Williams:
"A collection of fixes for the CXL memory class driver introduced in
  this release cycle.

  The driver was primarily developed on a work-in-progress QEMU
  emulation of the interface and we have since found a couple places
  where it hid spec compliance bugs in the driver, or had a spec
  implementation bug itself.

  The biggest change here is replacing a percpu_ref with an rwsem to
  clean up a couple of bugs in the error unwind path during ioctl device
  init. Lastly there were some minor cleanups to not export the
  power-management sysfs-ABI for the ioctl device, use the proper sysfs
  helper for emitting values, and prevent subtle bugs as new
  administration commands are added to the supported list.

  The bulk of it has appeared in -next save for the top commit which was
  found today and validated on a fixed-up QEMU model.

  Summary:

   - Fix support for CXL memory devices with registers offset from the
     BAR base.

   - Fix the reporting of device capacity.

   - Fix the driver commands list definition to be disconnected from the
     UAPI command list.

   - Replace percpu_ref with rwsem to fix initialization error path.

   - Fix leaks in the driver initialization error path.

   - Drop the power/ directory from CXL device sysfs.

   - Use the recommended sysfs helper for attribute 'show'
     implementations"

* tag 'cxl-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/mem: Fix memory device capacity probing
  cxl/mem: Fix register block offset calculation
  cxl/mem: Force array size of mem_commands[] to CXL_MEM_COMMAND_ID_MAX
  cxl/mem: Disable cxl device power management
  cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures
  cxl/mem: Fix synchronization mechanism for device removal vs ioctl operations
  cxl/mem: Use sysfs_emit() for attribute show routines

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"12 patches.

  Subsystems affected by this patch series: mm (documentation, kasan,
  and pagemap), csky, ia64, gcov, and lib"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  lib: remove "expecting prototype" kernel-doc warnings
  gcov: clang: fix clang-11+ build
  mm: ptdump: fix build failure
  mm/mapping_dirty_helpers: guard hugepage pud's usage
  ia64: tools: remove duplicate definition of ia64_mf() on ia64
  ia64: tools: remove inclusion of ia64-specific version of errno.h header
  ia64: fix discontig.c section mismatches
  ia64: remove duplicate entries in generic_defconfig
  csky: change a Kconfig symbol name to fix e1000 build error
  kasan: remove redundant config option
  kasan: fix hwasan build for gcc
  mm: eliminate "expecting prototype" kernel-doc warnings

locking/qrwlock: Fix ordering in queued_write_lock_slowpath()

While this code is executed with the wait_lock held, a reader can
acquire the lock without holding wait_lock.  The writer side loops
checking the value with the atomic_cond_read_acquire(), but only truly
acquires the lock when the compare-and-exchange is completed
successfully which isn’t ordered. This exposes the window between the
acquire and the cmpxchg to an A-B-A problem which allows reads
following the lock acquisition to observe values speculatively before
the write lock is truly acquired.

We've seen a problem in epoll where the reader does a xchg while
holding the read lock, but the writer can see a value change out from
under it.

  Writer                                | Reader
  --------------------------------------------------------------------------------
  ep_scan_ready_list()                  |
  |- write_lock_irq()                   |
      |- queued_write_lock_slowpath()   |
        |- atomic_cond_read_acquire()   |
                                        | read_lock_irqsave(&ep->lock, flags);
     --> (observes value before unlock) |  chain_epi_lockless()
     |                                  |    epi->next = xchg(&ep->ovflist, epi);
     |                                  | read_unlock_irqrestore(&ep->lock, flags);
     |                                  |
     |     atomic_cmpxchg_relaxed()     |
     |-- READ_ONCE(ep->ovflist);        |

A core can order the read of the ovflist ahead of the
atomic_cmpxchg_relaxed(). Switching the cmpxchg to use acquire
semantics addresses this issue at which point the atomic_cond_read can
be switched to use relaxed semantics.
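
The ordering change can be sketched with C11 atomics. This is a userspace
model, not the kernel code: rwlock_model and write_lock_finish() are
hypothetical names, and the two lock-word constants are assumed values chosen
for the sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define QW_WAITING	0x100u	/* a writer is waiting */
#define QW_LOCKED	0x0ffu	/* a writer holds the lock */

/* Hypothetical, simplified lock word; not the kernel's struct qrwlock. */
struct rwlock_model {
	atomic_uint cnts;
};

/*
 * Writer slow path, final step: spin until only the waiting bit is set,
 * then claim the lock. The compare-and-exchange carries the acquire
 * ordering, so loads after it cannot be satisfied before the lock is
 * actually taken; the spinning load can then be relaxed.
 */
void write_lock_finish(struct rwlock_model *l)
{
	unsigned int expected;

	do {
		/* Wait for readers to drain. */
		while (atomic_load_explicit(&l->cnts,
					    memory_order_relaxed) != QW_WAITING)
			;
		expected = QW_WAITING;
	} while (!atomic_compare_exchange_strong_explicit(
			&l->cnts, &expected, QW_LOCKED,
			memory_order_acquire, memory_order_relaxed));
}
```

If the acquire sat only on the spinning load and the exchange were relaxed, a
read placed after write_lock_finish() could be ordered before the exchange,
which is the window shown in the diagram above.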

Fixes: 3e70590156539 ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwlock")
Signed-off-by: Ali Saidi <alisaidi@amazon.com>
[peterz: use try_cmpxchg()]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Tested-by: Steve Capper <steve.capper@arm.com>

cxl/mem: Fix memory device capacity probing

The CXL Identify Memory Device output payload emits capacity in 256MB
units. The driver is treating the capacity field as bytes. This was
missed because QEMU reports bytes when it should report bytes / 256MB.
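
As a worked example of the unit conversion, a minimal sketch follows.
cxl_capacity_bytes() is a hypothetical helper name, and the sketch ignores
the endianness conversion a real driver would apply to the payload field.

```c
#include <stdint.h>

#define SZ_256M (256ull * 1024 * 1024)

/* Convert a capacity field expressed in 256MB units to bytes. */
uint64_t cxl_capacity_bytes(uint64_t units_256m)
{
	return units_256m * SZ_256M;
}
```

A field value of 4 therefore describes 1 GiB of capacity, not 4 bytes.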

Fixes: fc78f3260985 ("cxl/mem: Find device capabilities")
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/161862021044.3259705.7008520073059739760.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

netlink: don't call ->netlink_bind with table lock held

When I added support to allow generic netlink multicast groups to be
restricted to subscribers with CAP_NET_ADMIN I was unaware that a
genl_bind implementation already existed in the past.

It was reverted due to ABBA deadlock:

1. ->netlink_bind gets called with the table lock held.
2. genetlink bind callback is invoked, it grabs the genl lock.

But when a new genl subsystem is (un)registered, these two locks are
taken in reverse order.

One solution would be to revert again and add a comment in genl
referring to 5ea67840fe5df ("genetlink: remove genl_bind").

This would need a second change in mptcp to not expose the raw token
value anymore, e.g. by hashing the token with a secret key so userspace
can still associate subflow events with the correct mptcp connection.

However, Paolo Abeni reminded me to double-check why the netlink table is
locked in the first place.

I can't find one. netlink_bind() is already called without this lock
when userspace joins a group via NETLINK_ADD_MEMBERSHIP setsockopt.
Same holds for the netlink_unbind operation.

Digging through the history, commit 253eed5da8b21
("netlink: access nlk groups safely in netlink bind and getname")
expanded the lock scope.

commit 4827b25b4667a9f ("net: netlink: cap max groups which will be considered in netlink_bind()")
... removed the nlk->ngroups access that the lock scope
extension was all about.

Reduce the lock scope again and always call ->netlink_bind without
the table lock.

The Fixes tag should be vs. the patch mentioned in the link below,
but that one got squash-merged into the patch that came earlier in the
series.

Fixes: 5520854b6d5516 ("mptcp: avoid lock_fast usage in accept path")
Link: https://lore.kernel.org/mptcp/20210213000001.379332-8-mathew.j.martineau@linux.intel.com/T/#u
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Sean Tranchetti <stranche@codeaurora.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'io_uring-5.12-2021-04-16' of git://git.kernel.dk/linux-block

Pull io_uring fix from Jens Axboe:
"Fix for a potential hang at exit with SQPOLL from Pavel"

* tag 'io_uring-5.12-2021-04-16' of git://git.kernel.dk/linux-block:
  io_uring: fix early sqd_list removal sqpoll hangs

lib: remove "expecting prototype" kernel-doc warnings

Fix various kernel-doc warnings in lib/ due to missing or erroneous
function names.

Add kernel-doc for some function parameters that were missing.  Use
kernel-doc "Return:" notation in earlycpio.c.

Quietens the following warnings:

  lib/earlycpio.c:61: warning: expecting prototype for cpio_data find_cpio_data(). Prototype was for find_cpio_data() instead

  lib/lru_cache.c:640: warning: expecting prototype for lc_dump(). Prototype was for lc_seq_dump_details() instead
  lru_cache.c:90: warning: Function parameter or member 'cache' not described in 'lc_create'

  lib/parman.c:368: warning: expecting prototype for parman_item_del(). Prototype was for parman_item_remove() instead
  parman.c:309: warning: Excess function parameter 'prority' description in 'parman_prio_init'

  lib/radix-tree.c:703: warning: expecting prototype for __radix_tree_insert(). Prototype was for radix_tree_insert() instead
  radix-tree.c:180: warning: Excess function parameter 'addr' description in 'radix_tree_find_next_bit'
  radix-tree.c:180: warning: Excess function parameter 'size' description in 'radix_tree_find_next_bit'
  radix-tree.c:931: warning: Function parameter or member 'iter' not described in 'radix_tree_iter_replace'

Link: https://lkml.kernel.org/r/20210411221756.15461-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jiri Pirko <jiri@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gcov: clang: fix clang-11+ build

With clang-11+, the code is broken due to my kvmalloc() conversion
(which predated the clang-11 support code) leaving one vmalloc() in
place. Fix that.

Link: https://lkml.kernel.org/r/20210412214210.6e1ecca9cdc5.I24459763acf0591d5e6b31c7e3a59890d802f79c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: ptdump: fix build failure

READ_ONCE() cannot be used for reading PTEs.  Use ptep_get() instead, to
avoid the following errors:

    CC      mm/ptdump.o
  In file included from <command-line>:
  mm/ptdump.c: In function 'ptdump_pte_entry':
  include/linux/compiler_types.h:320:38: error: call to '__compiletime_assert_207' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
    320 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
        |                                      ^
  include/linux/compiler_types.h:301:4: note: in definition of macro '__compiletime_assert'
    301 |    prefix ## suffix();    \
        |    ^~~~~~
  include/linux/compiler_types.h:320:2: note: in expansion of macro '_compiletime_assert'
    320 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
        |  ^~~~~~~~~~~~~~~~~~~
  include/asm-generic/rwonce.h:36:2: note: in expansion of macro 'compiletime_assert'
     36 |  compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
        |  ^~~~~~~~~~~~~~~~~~
  include/asm-generic/rwonce.h:49:2: note: in expansion of macro 'compiletime_assert_rwonce_type'
     49 |  compiletime_assert_rwonce_type(x);    \
        |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  mm/ptdump.c:114:14: note: in expansion of macro 'READ_ONCE'
    114 |  pte_t val = READ_ONCE(*pte);
        |              ^~~~~~~~~
  make[2]: *** [mm/ptdump.o] Error 1

See commit e61384029eed ("mm: Allow arches to provide ptep_get()") and
commit db570a7b15ae ("powerpc/8xx: Provide ptep_get() with 16k pages")
for details.

Link: https://lkml.kernel.org/r/912b349e2bcaa88939904815ca0af945740c6bd4.1618478922.git.christophe.leroy@csgroup.eu
Fixes: 5d4917838a09 ("mm: add generic ptdump")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Steven Price <steven.price@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/mapping_dirty_helpers: guard hugepage pud's usage

Mapping dirty helpers have, so far, been only used on X86, but a port of
vmwgfx to ARM64 exposed a problem which results in a compilation error
on ARM64 systems:

  mm/mapping_dirty_helpers.c: In function `wp_clean_pud_entry':
  mm/mapping_dirty_helpers.c:172:32: error: implicit declaration of function `pud_dirty'; did you mean `pmd_dirty'? [-Werror=implicit-function-declaration]

This is due to the fact that mapping_dirty_helpers code assumes that
pud_dirty is always defined, which is not the case for architectures
that don't define CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.

ARM64 arch is a little inconsistent when it comes to PUD hugepage
helpers, e.g. it defines pud_young but not pud_dirty but regardless of
that the core kernel code shouldn't assume that any of the PUD hugepage
helpers are available unless CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
is defined.  This prevents compilation errors whenever one of the
drivers is ported to new architectures.
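
The guard pattern described above can be sketched in userspace C. This is
only an illustration, not the actual patch: pud_t here is a hypothetical
stand-in type, the dirty bit value is arbitrary, and wp_clean_pud_sketch()
is an invented name.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's PUD entry type. */
typedef struct { unsigned long val; } pud_t;

#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
/* Only architectures with PUD-level huge pages provide this helper.
 * The bit tested here is arbitrary and chosen for illustration. */
static inline bool pud_dirty(pud_t pud)
{
	return (pud.val & 0x40UL) != 0;
}
#endif

/* Core code touches pud_dirty() only when the config symbol is set. */
static bool wp_clean_pud_sketch(pud_t pud)
{
#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
	return pud_dirty(pud);
#else
	(void)pud;	/* no PUD huge pages: nothing to check */
	return false;
#endif
}
```

Built without the config symbol, the sketch never references pud_dirty(),
so the implicit-declaration error cannot occur.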

Link: https://lkml.kernel.org/r/20210409165151.694574-1-zackr@vmware.com
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Thomas Hellström (Intel) <thomas_os@shipmail.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ia64: tools: remove duplicate definition of ia64_mf() on ia64

The ia64_mf() macro defined in tools/arch/ia64/include/asm/barrier.h is
already defined in <asm/gcc_intrin.h> on ia64, which causes libbpf
to fail to build:

    CC       /usr/src/linux/tools/bpf/bpftool//libbpf/staticobjs/libbpf.o
  In file included from /usr/src/linux/tools/include/asm/barrier.h:24,
                   from /usr/src/linux/tools/include/linux/ring_buffer.h:4,
                   from libbpf.c:37:
  /usr/src/linux/tools/include/asm/../../arch/ia64/include/asm/barrier.h:43: error: "ia64_mf" redefined [-Werror]
     43 | #define ia64_mf()       asm volatile ("mf" ::: "memory")
        |
  In file included from /usr/include/ia64-linux-gnu/asm/intrinsics.h:20,
                   from /usr/include/ia64-linux-gnu/asm/swab.h:11,
                   from /usr/include/linux/swab.h:8,
                   from /usr/include/linux/byteorder/little_endian.h:13,
                   from /usr/include/ia64-linux-gnu/asm/byteorder.h:5,
                   from /usr/src/linux/tools/include/uapi/linux/perf_event.h:20,
                   from libbpf.c:36:
  /usr/include/ia64-linux-gnu/asm/gcc_intrin.h:382: note: this is the location of the previous definition
    382 | #define ia64_mf() __asm__ volatile ("mf" ::: "memory")
        |
  cc1: all warnings being treated as errors

Thus, remove the definition from tools/arch/ia64/include/asm/barrier.h.

Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ia64: tools: remove inclusion of ia64-specific version of errno.h header

There is no longer an ia64-specific version of the errno.h header below
arch/ia64/include/uapi/asm/, so trying to build tools/bpf fails with:

    CC       /usr/src/linux/tools/bpf/bpftool/btf_dumper.o
  In file included from /usr/src/linux/tools/include/linux/err.h:8,
                   from btf_dumper.c:11:
  /usr/src/linux/tools/include/uapi/asm/errno.h:13:10: fatal error: ../../../arch/ia64/include/uapi/asm/errno.h: No such file or directory
     13 | #include "../../../arch/ia64/include/uapi/asm/errno.h"
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  compilation terminated.

Thus, just remove the inclusion of the ia64-specific errno.h so that the
build will use the generic errno.h header on this target which was used
there anyway as the ia64-specific errno.h was just a wrapper for the
generic header.

Fixes: 21b352a8bda9 ("ia64: remove unneeded uapi asm-generic wrappers")
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ia64: fix discontig.c section mismatches

Fix IA64 discontig.c Section mismatch warnings.

When CONFIG_SPARSEMEM=y and CONFIG_MEMORY_HOTPLUG=y, the functions
compute_pernodesize() and scatter_node_data() should not be marked as
__meminit because they are needed after init, on any memory hotplug
event.  Also, early_nr_cpus_node() is called by compute_pernodesize(),
so early_nr_cpus_node() cannot be __meminit either.

  WARNING: modpost: vmlinux.o(.text.unlikely+0x1612): Section mismatch in reference from the function arch_alloc_nodedata() to the function .meminit.text:compute_pernodesize()
  The function arch_alloc_nodedata() references the function __meminit compute_pernodesize().
  This is often because arch_alloc_nodedata lacks a __meminit annotation or the annotation of compute_pernodesize is wrong.

  WARNING: modpost: vmlinux.o(.text.unlikely+0x1692): Section mismatch in reference from the function arch_refresh_nodedata() to the function .meminit.text:scatter_node_data()
  The function arch_refresh_nodedata() references the function __meminit scatter_node_data().
  This is often because arch_refresh_nodedata lacks a __meminit annotation or the annotation of scatter_node_data is wrong.

  WARNING: modpost: vmlinux.o(.text.unlikely+0x1502): Section mismatch in reference from the function compute_pernodesize() to the function .meminit.text:early_nr_cpus_node()
  The function compute_pernodesize() references the function __meminit early_nr_cpus_node().
  This is often because compute_pernodesize lacks a __meminit annotation or the annotation of early_nr_cpus_node is wrong.

Link: https://lkml.kernel.org/r/20210411001201.3069-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ia64: remove duplicate entries in generic_defconfig

Fix ia64 generic_defconfig duplicate entries, as warned by:

arch/ia64/configs/generic_defconfig: warning: override: reassigning to symbol ATA: => 58
arch/ia64/configs/generic_defconfig: warning: override: reassigning to symbol ATA_PIIX: => 59

These 2 symbols still have the same value as in the removed lines.

Link: https://lkml.kernel.org/r/20210411020255.18052-1-rdunlap@infradead.org
Fixes: eec87680b9fc ("ia64: Use libata instead of the legacy ide driver in defconfigs")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

csky: change a Kconfig symbol name to fix e1000 build error

e1000's #define of CONFIG_RAM_BASE conflicts with a Kconfig symbol in
arch/csky/Kconfig.

The symbol in e1000 has been around longer, so change arch/csky/ to use
DRAM_BASE instead of RAM_BASE to remove the conflict. (although e1000
is also a 2-line change)

Link: https://lkml.kernel.org/r/20210411055335.7111-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Guo Ren <guoren@kernel.org>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: remove redundant config option

CONFIG_KASAN_STACK and CONFIG_KASAN_STACK_ENABLE both enable KASAN stack
instrumentation, but only one config option should be needed, so remove
CONFIG_KASAN_STACK_ENABLE and make CONFIG_KASAN_STACK workable; see [1].

When KASAN stack instrumentation is enabled, gcc gets no prompt and a
default value of y, while clang gets a prompt and a default value of n.

This patch fixes the following compilation warning:

include/linux/kasan.h:333:30: warning: 'CONFIG_KASAN_STACK' is not defined, evaluates to 0 [-Wundef]

[akpm@linux-foundation.org: fix merge snafu]

Link: https://bugzilla.kernel.org/show_bug.cgi?id=210221
Link: https://lkml.kernel.org/r/20210226012531.29231-1-walter-zh.wu@mediatek.com
Fixes: 09f2b352e270 ("kasan: fix KASAN_STACK dependency for HW_TAGS")
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: fix hwasan build for gcc

gcc-11 adds support for -fsanitize=kernel-hwaddress, so it becomes
possible to enable CONFIG_KASAN_SW_TAGS.

Unfortunately this fails to build at the moment, because the
corresponding command line arguments use llvm specific syntax.

Change it to use the cc-param macro instead, which works on both clang
and gcc.

[elver@google.com: fixup for "kasan: fix hwasan build for gcc"]
Link: https://lkml.kernel.org/r/YHQZVfVVLE/LDK2v@elver.google.com
Link: https://lkml.kernel.org/r/20210323124112.1229772-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: eliminate "expecting prototype" kernel-doc warnings

Fix stray kernel-doc warnings in mm/ due to mis-typed or missing function
names.

Quietens these kernel-doc warnings:

  mm/mmu_gather.c:264: warning: expecting prototype for tlb_gather_mmu(). Prototype was for __tlb_gather_mmu() instead
  mm/oom_kill.c:180: warning: expecting prototype for Check whether unreclaimable slab amount is greater than(). Prototype was for should_dump_unreclaim_slab() instead
  mm/shuffle.c:155: warning: expecting prototype for shuffle_free_memory(). Prototype was for __shuffle_free_memory() instead

Link: https://lkml.kernel.org/r/20210411210642.11362-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2021-04-17

The following pull-request contains BPF updates for your *net* tree.

We've added 10 non-merge commits during the last 9 day(s) which contain
a total of 8 files changed, 175 insertions(+), 111 deletions(-).

The main changes are:

1) Fix a potential NULL pointer dereference in libbpf's xsk
umem handling, from Ciara Loftus.

2) Mitigate a speculative oob read of up to map value size by
tightening the masking window, from Daniel Borkmann.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

MAINTAINERS: update my email

Update my email and change myself to Reviewer.

Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: Update selftests to reflect new error states

Update various selftest error messages:

* The 'Rx tried to sub from different maps, paths, or prohibited types'
   is reworked into more specific/differentiated error messages for better
   guidance.

* The change into 'value -4294967168 makes map_value pointer be out of
   bounds' is due to moving the mixed bounds check into the speculation
   handling and thus occurring slightly later than the above-mentioned sanity
   check.

* The change into 'math between map_value pointer and register with
   unbounded min value' is similarly due to register sanity check coming
   before the mixed bounds check.

* The case of 'map access: known scalar += value_ptr from different maps'
   now loads fine given masks are the same from the different paths (despite
   max map value size being different).

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>

bpf: Tighten speculative pointer arithmetic mask

This work tightens the offset mask we use for unprivileged pointer arithmetic
in order to mitigate a corner case reported by Piotr and Benedict where in
the speculative domain it is possible to advance, for example, the map value
pointer by up to value_size-1 out-of-bounds in order to leak kernel memory
via side-channel to user space.

Before this change, the computed ptr_limit for retrieve_ptr_limit() helper
represents largest valid distance when moving pointer to the right or left
which is then fed as aux->alu_limit to generate masking instructions against
the offset register. After the change, the derived aux->alu_limit represents
the largest potential value of the offset register which we mask against which
is just a narrower subset of the former limit.
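
To illustrate what masking an offset against a limit means, here is a
minimal userspace sketch of a branchless mask. It is not the verifier's
code: mask_offset() is a hypothetical name, and the sketch assumes the
limit is below 2^63 so that the top bit can serve as a sign bit.

```c
#include <stdint.h>

/*
 * Return off unchanged when 0 <= off <= limit, and 0 otherwise,
 * without taking a branch on off. Assumes limit < 2^63.
 */
static uint64_t mask_offset(uint64_t limit, uint64_t off)
{
	uint64_t ax = limit;

	ax -= off;		/* top bit set when off exceeds limit */
	ax |= off;		/* top bit also set when off is negative */
	ax = 0 - ax;		/* negate: top bit set only for nonzero in-range values */
	ax = 0 - (ax >> 63);	/* spread the top bit: all ones or all zeros */
	return ax & off;	/* keep off when in range, else force 0 */
}
```

With a limit of 16, offsets 0 through 16 pass through unchanged, while 17
or any negative offset is forced to 0. A tighter limit therefore leaves
less room for an out-of-bounds offset to survive the mask.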

For minimal complexity, we call sanitize_ptr_alu() from 2 observation points
in adjust_ptr_min_max_vals(), that is, before and after the simulated alu
operation. In the first step, we retrieve the alu_state and alu_limit before
the operation, and we branch off a verifier path and push it to the
verification stack as we did before which checks the dst_reg under truncation,
in other words, when the speculative domain would attempt to move the pointer
out-of-bounds.

In the second step, we retrieve the new alu_limit and calculate the absolute
distance between both. Moreover, we commit the alu_state and final alu_limit
via update_alu_sanitation_state() to the env's instruction aux data, and bail
out from there if there is a mismatch due to coming from different verification
paths with different states.

Reported-by: Piotr Krysiuk <piotras@gmail.com>
Reported-by: Benedict Schlueter <benedict.schlueter@rub.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Benedict Schlueter <benedict.schlueter@rub.de>

bpf: Move sanitize_val_alu out of op switch

Add a small sanitize_needed() helper function and move sanitize_val_alu()
out of the main opcode switch. In upcoming work, we'll move sanitize_ptr_alu()
as well out of its opcode switch so this helps to streamline both.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>

bpf: Refactor and streamline bounds check into helper

Move the bounds check in adjust_ptr_min_max_vals() into a small helper named
sanitize_check_bounds() in order to simplify the former a bit.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>

bpf: Improve verifier error messages for users

Consolidate all error handling and provide more user-friendly error messages
from sanitize_ptr_alu() and sanitize_val_alu().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>

bpf: Rework ptr_limit into alu_limit and add common error path

Small refactor with no semantic changes in order to consolidate the max
ptr_limit boundary check.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>

bpf: Ensure off_reg has no mixed signed bounds for all types

The mixed signed bounds check really belongs into retrieve_ptr_limit()
instead of outside of it in adjust_ptr_min_max_vals(). The reason is
that this check is not tied to PTR_TO_MAP_VALUE only, but to all pointer
types that we handle in retrieve_ptr_limit() and given errors from the latter
propagate back to adjust_ptr_min_max_vals() and lead to rejection of the
program, it's a better place to reside to avoid anything slipping through
for future types. The reason why we must reject such off_reg is that we
otherwise would not be able to derive a mask, see details in a28322fd184c
("bpf: restrict unknown scalars of mixed signed bounds for unprivileged").
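
As a rough sketch of what "mixed signed bounds" means, the check can be
written as a small predicate on the signed minimum and maximum of the
offset register. The function name and the standalone form are
assumptions made for illustration; the verifier's own check lives in
retrieve_ptr_limit() as described above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * An offset whose signed range spans both negative and non-negative
 * values has no single direction, so no mask can be derived for it.
 */
static bool has_mixed_signed_bounds(int64_t smin, int64_t smax)
{
	return (smin < 0) != (smax < 0);
}
```

A range such as [-1, 1] is mixed and would be rejected, whereas [0, 5]
and [-5, -1] each have a single sign and can be masked.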

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>

bpf: Move off_reg into sanitize_ptr_alu

Small refactor to drag off_reg into sanitize_ptr_alu(), so we later on can
use off_reg for generalizing some of the checks for all pointer types.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>