git.baikalelectronics.ru Git

epoll: fix compat syscall wire up of epoll_pwait2

Commit b0a0c2615f6f ("epoll: wire up syscall epoll_pwait2") wired up
the 64 bit syscall instead of the compat variant in a couple of places.

Fixes: b0a0c2615f6f ("epoll: wire up syscall epoll_pwait2")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'close-range-cloexec-unshare-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull close_range fix from Christian Brauner:
"syzbot reported a bug when asking close_range() to unshare the file
  descriptor table and making all fds close-on-exec.

  If CLOSE_RANGE_UNSHARE the caller will receive a private file
  descriptor table in case their file descriptor table is currently
  shared before operating on the requested file descriptor range.

  For the case where the caller has requested all file descriptors to be
  actually closed via e.g. close_range(3, ~0U, CLOSE_RANGE_UNSHARE) the
  kernel knows that the caller does not need any of the file descriptors
  anymore and will optimize the close operation by only copying all
  files in the range from 0 to 3 and no others.

  However, if the caller requested CLOSE_RANGE_CLOEXEC together with
  CLOSE_RANGE_UNSHARE the caller wants to still make use of the file
  descriptors so the kernel needs to copy all of them and can't
  optimize.

  The original patch didn't account for this and thus could cause oopses
  as evidenced by the syzbot report because it assumed that all fds had
  been copied. Fix this by handling the CLOSE_RANGE_CLOEXEC case and
  copying all fds if the two flags are specified together.

  This should've been caught in the selftests but the original patch
  didn't cover this case and I didn't catch it during review. So in
  addition to the bugfix I'm also adding selftests. They will reliably
  reproduce the bug on a non-fixed kernel and allows us to catch
  regressions and verify correct behavior.

  Note, the kernel selftest tree contained a bunch of changes that made
  the original selftest fail to compile so there are small fixups in
  here make them compile without warnings"

* tag 'close-range-cloexec-unshare-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  selftests/core: add regression test for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC
  selftests/core: add test for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC
  selftests/core: handle missing syscall number for close_range
  selftests/core: fix close_range_test build after XFAIL removal
  close_range: unshare all fds for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC

Merge tag 'for-linus-5.11-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull more xen updates from Juergen Gross:
"Some minor cleanup patches and a small series disentangling some Xen
  related Kconfig options"

* tag 'for-linus-5.11-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: Kconfig: remove X86_64 depends from XEN_512GB
  xen/manage: Fix fall-through warnings for Clang
  xen-blkfront: Fix fall-through warnings for Clang
  xen: remove trailing semicolon in macro definition
  xen: Kconfig: nest Xen guest options
  xen: Remove Xen PVH/PVHVM dependency on PCI
  x86/xen: Convert to DEFINE_SHOW_ATTRIBUTE

Merge branch 'pcmcia-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux

Pull pcmcia updates from Dominik Brodowski:
"Besides a few PCMCIA odd fixes, the NEC VRC4173 CARDU driver is
  removed, as it has not compiled in ages"

* 'pcmcia-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux:
  pcmcia: omap: Fix error return code in omap_cf_probe()
  pcmcia: Remove NEC VRC4173 CARDU
  pcmcia: db1xxx_ss: remove unneeded semicolon
  pcmcia/electra_cf: Fix some return values in 'electra_cf_probe()' in case of error

Merge tag 'i3c/for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux

Pull i3c updates from Boris Brezillon:

- Add the HCI driver

- Add a missing destroy_workqueue() in an error path

- Flag Alexandre Belloni as the new maintainer

* tag 'i3c/for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
  i3c/master/mipi-i3c-hci: quiet maybe-unused variable warning
  i3c: Resign from my maintainer role
  i3c/master: Fix uninitialized variable next_addr
  i3c/master: introduce the mipi-i3c-hci driver
  dt-bindings: i3c: MIPI I3C Host Controller Interface
  i3c master: fix missing destroy_workqueue() on error in i3c_master_register

Merge tag 'for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply and reset updates from Sebastian Reichel:
"Battery/charger driver changes:

   - collie_battery, generic-adc-battery, s3c-adc-battery: convert to
     GPIO descriptors (incl ARM board files)

   - misc cleanup and fixes

  Reset drivers:

   - new poweroff driver for force disabling a regulator

   - use printk format symbol resolver

   - ocelot: add support for Luton and Jaguar2"

* tag 'for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (31 commits)
  power: supply: Fix a typo in warning message
  Documentation: DT: binding documentation for regulator-poweroff
  power: reset: new driver regulator-poweroff
  power: supply: ab8500: Use dev_err_probe() for IIO channels
  power: supply: ab8500_fg: Request all IRQs as threaded
  power: supply: ab8500_charger: Oneshot threaded IRQs
  power: supply: ab8500: Convert to dev_pm_ops
  power: supply: ab8500: Use local helper
  power: supply: wm831x_power: remove unneeded break
  power: supply: bq24735: Drop unused include
  power: supply: bq24190_charger: Drop unused include
  power: supply: generic-adc-battery: Use GPIO descriptors
  power: supply: collie_battery: Convert to GPIO descriptors
  power: supply: bq24190_charger: fix reference leak
  power: supply: s3c-adc-battery: Convert to GPIO descriptors
  power: reset: Use printk format symbol resolver
  power: supply: axp20x_usb_power: Use power efficient workqueue for debounce
  power: supply: axp20x_usb_power: fix typo
  power: supply: max8997-charger: Improve getting charger status
  power: supply: max8997-charger: Fix platform data retrieval
  ...

Merge tag 'hsi-for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi

Pull HSI updates from Sebastian Reichel:
"Misc cleanups"

* tag 'hsi-for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi:
HSI: core: fix a kernel-doc markup
HSI: omap_ssi: Don't jump to free ID in ssi_add_controller()

Merge tag 'pwm/for-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm updates from Thierry Reding:
"This is a fairly big release cycle from the PWM framework's point of
  view.

  There's a large patcheset here which converts drivers to use the new
  devm_platform_ioremap_resource() helper and a bunch of minor fixes to
  existing drivers. Some of the existing drivers also add support for
  more hardware, such as Atmel SAMA 5D2 and Mediatek MT8183.

  Finally there's a couple of new drivers for Intel Keem Bay and LGM
  SoCs as well as the DesignWare PWM controller"

* tag 'pwm/for-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (66 commits)
  pwm: sun4i: Remove erroneous else branch
  pwm: sl28cpld: Set driver data before registering the PWM chip
  pwm: Remove unused function pwmchip_add_inversed()
  pwm: imx27: Fix overflow for bigger periods
  pwm: bcm2835: Support apply function for atomic configuration
  pwm: keembay: Fix build failure with -Os
  pwm: core: Use octal permission
  pwm: lpss: Make compilable with COMPILE_TEST
  pwm: Fix dependencies on HAS_IOMEM
  pwm: Use -EINVAL for unsupported polarity
  pwm: sti: Remove unnecessary blank line
  pwm: sti: Avoid conditional gotos
  pwm: Add PWM fan controller driver for LGM SoC
  Add DT bindings YAML schema for PWM fan controller of LGM SoC
  pwm: Add DesignWare PWM Controller Driver
  dt-bindings: pwm: mtk-disp: add MT8167 SoC binding
  pwm: mediatek: Add MT8183 SoC support
  pwm: mediatek: Always use bus clock
  dt-bindings: pwm: pwm-mediatek: Add documentation for MT8183 SoC
  pwm: Add PWM driver for Intel Keem Bay
  ...

Merge branch 'akpm' (patches from Andrew)

Merge still more updates from Andrew Morton:
"18 patches.

  Subsystems affected by this patch series: mm (memcg and cleanups) and
  epoll"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/Kconfig: fix spelling mistake "whats" -> "what's"
  selftests/filesystems: expand epoll with epoll_pwait2
  epoll: wire up syscall epoll_pwait2
  epoll: add syscall epoll_pwait2
  epoll: convert internal api to timespec64
  epoll: eliminate unnecessary lock for zero timeout
  epoll: replace gotos with a proper loop
  epoll: pull all code between fetch_events and send_event into the loop
  epoll: simplify and optimize busy loop logic
  epoll: move eavail next to the list_empty_careful check
  epoll: pull fatal signal checks into ep_send_events()
  epoll: simplify signal handling
  epoll: check for events when removing a timed out thread from the wait queue
  mm/memcontrol:rewrite mem_cgroup_page_lruvec()
  mm, kvm: account kvm_vcpu_mmap to kmemcg
  mm/memcg: remove unused definitions
  mm/memcg: warning on !memcg after readahead page charged
  mm/memcg: bail early from swap accounting if memcg disabled

mm/Kconfig: fix spelling mistake "whats" -> "what's"

There is a spelling mistake in the Kconfig help text. Fix it.

Link: https://lkml.kernel.org/r/20201217172717.58203-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

selftests/filesystems: expand epoll with epoll_pwait2

Code coverage for the epoll_pwait2 syscall.

epoll62: Repeat basic test epoll1, but exercising the new syscall.
epoll63: Pass a timespec and exercise the timeout wakeup path.

Link: https://lkml.kernel.org/r/20201121144401.3727659-5-willemdebruijn.kernel@gmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

epoll: wire up syscall epoll_pwait2

Split off from prev patch in the series that implements the syscall.

Link: https://lkml.kernel.org/r/20201121144401.3727659-4-willemdebruijn.kernel@gmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

epoll: add syscall epoll_pwait2

Add syscall epoll_pwait2, an epoll_wait variant with nsec resolution that
replaces int timeout with struct timespec.  It is equivalent otherwise.

    int epoll_pwait2(int fd, struct epoll_event *events,
                     int maxevents,
                     const struct timespec *timeout,
                     const sigset_t *sigset);

The underlying hrtimer is already programmed with nsec resolution.
pselect and ppoll also set nsec resolution timeout with timespec.

The sigset_t in epoll_pwait has a compat variant. epoll_pwait2 needs
the same.

For timespec, only support this new interface on 2038 aware platforms
that define __kernel_timespec_t. So no CONFIG_COMPAT_32BIT_TIME.

Link: https://lkml.kernel.org/r/20201121144401.3727659-3-willemdebruijn.kernel@gmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

epoll: convert internal api to timespec64

Patch series "add epoll_pwait2 syscall", v4.

Enable nanosecond timeouts for epoll.

Analogous to pselect and ppoll, introduce an epoll_wait syscall
variant that takes a struct timespec instead of int timeout.

This patch (of 4):

Make epoll more consistent with select/poll: pass along the timeout as
timespec64 pointer.

In anticipation of additional changes affecting all three polling
mechanisms:

- add epoll_pwait2 syscall with timespec semantics,
and share poll_select_set_timeout implementation.
- compute slack before conversion to absolute time,
to save one ktime_get_ts64 call.

Link: https://lkml.kernel.org/r/20201121144401.3727659-1-willemdebruijn.kernel@gmail.com
Link: https://lkml.kernel.org/r/20201121144401.3727659-2-willemdebruijn.kernel@gmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

epoll: eliminate unnecessary lock for zero timeout

We call ep_events_available() under lock when timeout is 0, and then call
it without locks in the loop for the other cases.

Instead, call ep_events_available() without lock for all cases. For
non-zero timeouts, we will recheck after adding the thread to the wait
queue. For zero timeout cases, by definition, user is opportunistically
polling and will have to call epoll_wait again in the future.

Note that this lock was kept in c5a282e9635e9 because the whole loop was
historically under lock.

This patch results in a 1% CPU/RPC reduction in RPC benchmarks.

Link: https://lkml.kernel.org/r/20201106231635.3528496-9-soheil.kdev@gmail.com
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Khazhismel Kumykov <khazhy@google.com>
Cc: Guantao Liu <guantaol@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

epoll: replace gotos with a proper loop

The existing loop is pointless, and the labels make it really hard to
follow the structure.

Replace that control structure with a simple loop that returns when there
are new events, there is a signal, or the thread has timed out.

Link: https://lkml.kernel.org/r/20201106231635.3528496-8-soheil.kdev@gmail.com
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Khazhismel Kumykov <khazhy@google.com>
Cc: Guantao Liu <guantaol@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

epoll: pull all code between fetch_events and send_event into the loop

This is a no-op change which simplifies the follow up patches.

Link: https://lkml.kernel.org/r/20201106231635.3528496-7-soheil.kdev@gmail.com
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Khazhismel Kumykov <khazhy@google.com>
Cc: Guantao Liu <guantaol@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

epoll: simplify and optimize busy loop logic

ep_events_available() is called multiple times around the busy loop logic,
even though the logic is generally not used.  ep_reset_busy_poll_napi_id()
is similarly always called, even when busy loop is not used.

Eliminate ep_reset_busy_poll_napi_id() and inline it inside
ep_busy_loop().  Make ep_busy_loop() return whether there are any events
available after the busy loop.  This will eliminate unnecessary loads and
branches, and simplifies the loop.

Link: https://lkml.kernel.org/r/20201106231635.3528496-6-soheil.kdev@gmail.com
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Khazhismel Kumykov <khazhy@google.com>
Cc: Guantao Liu <guantaol@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

epoll: move eavail next to the list_empty_careful check

This is a no-op change and simply to make the code more coherent.

Link: https://lkml.kernel.org/r/20201106231635.3528496-5-soheil.kdev@gmail.com
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Khazhismel Kumykov <khazhy@google.com>
Cc: Guantao Liu <guantaol@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

epoll: pull fatal signal checks into ep_send_events()

To simplify the code, pull in checking the fatal signals into
ep_send_events(). ep_send_events() is called only from ep_poll().

Note that, previously, we were always checking fatal events, but it is
checked only if eavail is true. This should be fine because the goal of
that check is to quickly return from epoll_wait() when there is a pending
fatal signal.

Link: https://lkml.kernel.org/r/20201106231635.3528496-4-soheil.kdev@gmail.com
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Suggested-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Khazhismel Kumykov <khazhy@google.com>
Cc: Guantao Liu <guantaol@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

epoll: simplify signal handling

Check signals before locking ep->lock, and immediately return -EINTR if
there is any signal pending.

This saves a few loads, stores, and branches from the hot path and
simplifies the loop structure for follow up patches.

Link: https://lkml.kernel.org/r/20201106231635.3528496-3-soheil.kdev@gmail.com
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Khazhismel Kumykov <khazhy@google.com>
Cc: Guantao Liu <guantaol@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

epoll: check for events when removing a timed out thread from the wait queue

Patch series "simplify ep_poll".

This patch series is a followup based on the suggestions and feedback by
Linus:
https://lkml.kernel.org/r/CAHk-=wizk=OxUyQPbO8MS41w2Pag1kniUV5WdD5qWL-gq1kjDA@mail.gmail.com

The first patch in the series is a fix for the epoll race in presence of
timeouts, so that it can be cleanly backported to all affected stable
kernels.

The rest of the patch series simplify the ep_poll() implementation.  Some
of these simplifications result in minor performance enhancements as well.
We have kept these changes under self tests and internal benchmarks for a
few days, and there are minor (1-2%) performance enhancements as a result.

This patch (of 8):

After abc610e01c66 ("fs/epoll: avoid barrier after an epoll_wait(2)
timeout"), we break out of the ep_poll loop upon timeout, without checking
whether there is any new events available.  Prior to that patch-series we
always called ep_events_available() after exiting the loop.

This can cause races and missed wakeups.  For example, consider the
following scenario reported by Guantao Liu:

Suppose we have an eventfd added using EPOLLET to an epollfd.

Thread 1: Sleeps for just below 5ms and then writes to an eventfd.
Thread 2: Calls epoll_wait with a timeout of 5 ms. If it sees an
          event of the eventfd, it will write back on that fd.
Thread 3: Calls epoll_wait with a negative timeout.

Prior to abc610e01c66, it is guaranteed that Thread 3 will wake up either
by Thread 1 or Thread 2.  After abc610e01c66, Thread 3 can be blocked
indefinitely if Thread 2 sees a timeout right before the write to the
eventfd by Thread 1.  Thread 2 will be woken up from
schedule_hrtimeout_range and, with evail 0, it will not call
ep_send_events().

To fix this issue:
1) Simplify the timed_out case as suggested by Linus.
2) while holding the lock, recheck whether the thread was woken up
   after its time out has reached.

Note that (2) is different from Linus' original suggestion: It do not set
"eavail = ep_events_available(ep)" to avoid unnecessary contention (when
there are too many timed-out threads and a small number of events), as
well as races mentioned in the discussion thread.

This is the first patch in the series so that the backport to stable
releases is straightforward.

Link: https://lkml.kernel.org/r/20201106231635.3528496-1-soheil.kdev@gmail.com
Link: https://lkml.kernel.org/r/CAHk-=wizk=OxUyQPbO8MS41w2Pag1kniUV5WdD5qWL-gq1kjDA@mail.gmail.com
Link: https://lkml.kernel.org/r/20201106231635.3528496-2-soheil.kdev@gmail.com
Fixes: abc610e01c66 ("fs/epoll: avoid barrier after an epoll_wait(2) timeout")
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Tested-by: Guantao Liu <guantaol@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Guantao Liu <guantaol@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Khazhismel Kumykov <khazhy@google.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/memcontrol:rewrite mem_cgroup_page_lruvec()

mem_cgroup_page_lruvec() in memcontrol.c and mem_cgroup_lruvec() in
memcontrol.h is very similar except for the param(page and memcg) which
also can be convert to each other.

So rewrite mem_cgroup_page_lruvec() with mem_cgroup_lruvec().

[alex.shi@linux.alibaba.com: add missed warning in mem_cgroup_lruvec]
Link: https://lkml.kernel.org/r/94f17bb7-ec61-5b72-3555-fabeb5a4d73b@linux.alibaba.com
[lstoakes@gmail.com: warn on missing memcg on mem_cgroup_page_lruvec()]
Link: https://lkml.kernel.org/r/20201125112202.387009-1-lstoakes@gmail.com
Link: https://lkml.kernel.org/r/20201108143731.GA74138@rlk
Signed-off-by: Hui Su <sh_def@163.com>
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Cc: Chris Down <chris@chrisdown.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm, kvm: account kvm_vcpu_mmap to kmemcg

A VCPU of a VM can allocate couple of pages which can be mmap'ed by the
user space application. At the moment this memory is not charged to the
memcg of the VMM. On a large machine running large number of VMs or
small number of VMs having large number of VCPUs, this unaccounted
memory can be very significant. So, charge this memory to the memcg of
the VMM. Please note that lifetime of these allocations corresponds to
the lifetime of the VMM.

Link: https://lkml.kernel.org/r/20201106202923.2087414-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/memcg: remove unused definitions

Some definitions are left unused, just clean them.

Link: https://lkml.kernel.org/r/20201108003834.12669-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/memcg: warning on !memcg after readahead page charged

Add VM_WARN_ON_ONCE_PAGE() macro.

Since readahead page is charged on memcg too, in theory we don't have to
check this exception now. Before safely remove them all, add a warning
for the unexpected !memcg.

Link: https://lkml.kernel.org/r/1604283436-18880-3-git-send-email-alex.shi@linux.alibaba.com
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/memcg: bail early from swap accounting if memcg disabled

Patch series "bail out early for memcg disable".

These 2 patches are indepenedent from per memcg lru lock, and may
encounter unexpected warning, so let's move out them from per memcg
lru locking patchset.

This patch (of 2):

We could bail out early when memcg wasn't enabled.

Link: https://lkml.kernel.org/r/1604283436-18880-1-git-send-email-alex.shi@linux.alibaba.com
Link: https://lkml.kernel.org/r/1604283436-18880-2-git-send-email-alex.shi@linux.alibaba.com
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

selftests/core: add regression test for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC

This test is a minimalized version of the reproducer given by syzbot
(cf. [1]).

After introducing CLOSE_RANGE_CLOEXEC syzbot reported a crash when
CLOSE_RANGE_CLOEXEC is specified in conjunction with
CLOSE_RANGE_UNSHARE. When CLOSE_RANGE_UNSHARE is specified the caller
will receive a private file descriptor table in case their file
descriptor table is currently shared.
For the case where the caller has requested all file descriptors to be
actually closed via e.g. close_range(3, ~0U, 0) the kernel knows that
the caller does not need any of the file descriptors anymore and will
optimize the close operation by only copying all files in the range from
0 to 3 and no others.

However, if the caller requested CLOSE_RANGE_CLOEXEC together with
CLOSE_RANGE_UNSHARE the caller wants to still make use of the file
descriptors so the kernel needs to copy all of them and can't optimize.

The original patch didn't account for this and thus could cause oopses
as evidenced by the syzbot report. Add tests for this regression.

We first create a huge gap in the fd table. When we now call
CLOSE_RANGE_UNSHARE with a shared fd table and and with ~0U as upper
bound the kernel will only copy up to fd1 file descriptors into the new
fd table. If the kernel is buggy and doesn't handle CLOSE_RANGE_CLOEXEC
correctly it will not have copied all file descriptors and we will oops!

This test passes on a fixed kernel and will trigger an oops on a buggy
kernel.

[1]: https://syzkaller.appspot.com/text?tag=KernelConfig&x=db720fe37a6a41d8

Cc: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: linux-fsdevel@vger.kernel.org
Link: syzbot+96cfd2b22b3213646a93@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20201218145415.801063-4-christian.brauner@ubuntu.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

selftests/core: add test for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC

Add a test to verify that CLOSE_RANGE_UNSHARE works correctly when combined
with CLOSE_RANGE_CLOEXEC for the single-threaded case.

Cc: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20201218145415.801063-3-christian.brauner@ubuntu.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

selftests/core: handle missing syscall number for close_range

This improves the syscall number handling in the close_range()
selftests. This should handle any architecture.

Cc: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20201218145415.801063-2-christian.brauner@ubuntu.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

selftests/core: fix close_range_test build after XFAIL removal

XFAIL was removed in commit 9847d24af95c ("selftests/harness: Refactor
XFAIL into SKIP") and its use in close_range_test was already replaced
by commit 1d44d0dd61b6 ("selftests: core: use SKIP instead of XFAIL in
close_range_test.c"). However, commit 23afeaeff3d9 ("selftests: core:
add tests for CLOSE_RANGE_CLOEXEC") introduced usage of XFAIL in
TEST(close_range_cloexec). Use SKIP there as well.

Fixes: 23afeaeff3d9 ("selftests: core: add tests for CLOSE_RANGE_CLOEXEC")
Cc: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20201218112428.13662-1-tklauser@distanz.ch
Link: https://lore.kernel.org/r/20201218145415.801063-1-christian.brauner@ubuntu.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

close_range: unshare all fds for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC

After introducing CLOSE_RANGE_CLOEXEC syzbot reported a crash when
CLOSE_RANGE_CLOEXEC is specified in conjunction with CLOSE_RANGE_UNSHARE.
When CLOSE_RANGE_UNSHARE is specified the caller will receive a private
file descriptor table in case their file descriptor table is currently
shared.

For the case where the caller has requested all file descriptors to be
actually closed via e.g. close_range(3, ~0U, 0) the kernel knows that
the caller does not need any of the file descriptors anymore and will
optimize the close operation by only copying all files in the range from
0 to 3 and no others.

However, if the caller requested CLOSE_RANGE_CLOEXEC together with
CLOSE_RANGE_UNSHARE the caller wants to still make use of the file
descriptors so the kernel needs to copy all of them and can't optimize.

The original patch didn't account for this and thus could cause oopses
as evidenced by the syzbot report because it assumed that all fds had
been copied. Fix this by handling the CLOSE_RANGE_CLOEXEC case.

syzbot reported
==================================================================
BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:71 [inline]
BUG: KASAN: null-ptr-deref in atomic64_read include/asm-generic/atomic-instrumented.h:837 [inline]
BUG: KASAN: null-ptr-deref in atomic_long_read include/asm-generic/atomic-long.h:29 [inline]
BUG: KASAN: null-ptr-deref in filp_close+0x22/0x170 fs/open.c:1274
Read of size 8 at addr 0000000000000077 by task syz-executor511/8522

CPU: 1 PID: 8522 Comm: syz-executor511 Not tainted 5.10.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:120
__kasan_report mm/kasan/report.c:549 [inline]
kasan_report.cold+0x5/0x37 mm/kasan/report.c:562
check_memory_region_inline mm/kasan/generic.c:186 [inline]
check_memory_region+0x13d/0x180 mm/kasan/generic.c:192
instrument_atomic_read include/linux/instrumented.h:71 [inline]
atomic64_read include/asm-generic/atomic-instrumented.h:837 [inline]
atomic_long_read include/asm-generic/atomic-long.h:29 [inline]
filp_close+0x22/0x170 fs/open.c:1274
close_files fs/file.c:402 [inline]
put_files_struct fs/file.c:417 [inline]
put_files_struct+0x1cc/0x350 fs/file.c:414
exit_files+0x12a/0x170 fs/file.c:435
do_exit+0xb4f/0x2a00 kernel/exit.c:818
do_group_exit+0x125/0x310 kernel/exit.c:920
get_signal+0x428/0x2100 kernel/signal.c:2792
arch_do_signal_or_restart+0x2a8/0x1eb0 arch/x86/kernel/signal.c:811
handle_signal_work kernel/entry/common.c:147 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0x124/0x200 kernel/entry/common.c:201
__syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:302
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x447039
Code: Unable to access opcode bytes at RIP 0x44700f.
RSP: 002b:00007f1b1225cdb8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: 0000000000000001 RBX: 00000000006dbc28 RCX: 0000000000447039
RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 00000000006dbc2c
RBP: 00000000006dbc20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc2c
R13: 00007fff223b6bef R14: 00007f1b1225d9c0 R15: 00000000006dbc2c
==================================================================

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+96cfd2b22b3213646a93@syzkaller.appspotmail.com
Tested on:

commit:         10f7cddd selftests/core: add regression test for CLOSE_RAN..
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git vfs
kernel config:  https://syzkaller.appspot.com/x/.config?x=5d42216b510180e3
link: https://syzkaller.appspot.com/bug?extid=96cfd2b22b3213646a93
compiler:       gcc (GCC) 10.1.0-syz 20200507

Reported-by: syzbot+96cfd2b22b3213646a93@syzkaller.appspotmail.com
Fixes: 582f1fb6b721 ("fs, close_range: add flag CLOSE_RANGE_CLOEXEC")
Cc: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20201217213303.722643-1-christian.brauner@ubuntu.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

xen: Kconfig: remove X86_64 depends from XEN_512GB

commit bfda93aee0ec ("xen: Kconfig: nest Xen guest options")
accidentally re-added X86_64 as a dependency to XEN_512GB. It was
originally removed in commit a13f2ef168cb ("x86/xen: remove 32-bit Xen
PV guest support"). Remove it again.

Fixes: bfda93aee0ec ("xen: Kconfig: nest Xen guest options")
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20201216140838.16085-1-jandryuk@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>

mm/filemap: fix infinite loop in generic_file_buffered_read()

If iter->count is 0 and iocb->ki_pos is page aligned, this causes
nr_pages to be 0.

Then in generic_file_buffered_read_get_pages() find_get_pages_contig()
returns 0 - because we asked for 0 pages, so we call
generic_file_buffered_read_no_cached_page() which attempts to add a page
to the page cache, which fails with -EEXIST, and then we loop. Oops...

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Reported-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'xfs-5.11-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Darrick Wong:
"In this release we add the ability to set a 'needsrepair' flag
  indicating that we /know/ the filesystem requires xfs_repair, but
  other than that, it's the usual strengthening of metadata validation
  and miscellaneous cleanups.

  Summary:

   - Introduce a "needsrepair" "feature" to flag a filesystem as needing
     a pass through xfs_repair. This is key to enabling filesystem
     upgrades (in xfs_db) that require xfs_repair to make minor
     adjustments to metadata.

   - Refactor parameter checking of recovered log intent items so that
     we actually use the same validation code as them that generate the
     intent items.

   - Various fixes to online scrub not reacting correctly to directory
     entries pointing to inodes that cannot be igetted.

   - Refactor validation helpers for data and rt volume extents.

   - Refactor XFS_TRANS_DQ_DIRTY out of existence.

   - Fix a longstanding bug where mounting with "uqnoenforce" would
     start user quotas in non-enforcing mode but /proc/mounts would
     display "usrquota", implying that they are being enforced.

   - Don't flag dax+reflink inodes as corruption since that is a valid
     (but not fully functional) combination right now.

   - Clean up raid stripe validation functions.

   - Refactor the inode allocation code to be more straightforward.

   - Small prep cleanup for idmapping support.

   - Get rid of the xfs_buf_t typedef"

* tag 'xfs-5.11-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (40 commits)
  xfs: remove xfs_buf_t typedef
  fs/xfs: convert comma to semicolon
  xfs: open code updating i_mode in xfs_set_acl
  xfs: remove xfs_vn_setattr_nonsize
  xfs: kill ialloced in xfs_dialloc()
  xfs: spilt xfs_dialloc() into 2 functions
  xfs: move xfs_dialloc_roll() into xfs_dialloc()
  xfs: move on-disk inode allocation out of xfs_ialloc()
  xfs: introduce xfs_dialloc_roll()
  xfs: convert noroom, okalloc in xfs_dialloc() to bool
  xfs: don't catch dax+reflink inodes as corruption in verifier
  xfs: fix the forward progress assertion in xfs_iwalk_run_callbacks
  xfs: remove unneeded return value check for *init_cursor()
  xfs: introduce xfs_validate_stripe_geometry()
  xfs: show the proper user quota options
  xfs: remove the unused XFS_B_FSB_OFFSET macro
  xfs: remove unnecessary null check in xfs_generic_create
  xfs: directly return if the delta equal to zero
  xfs: check tp->t_dqinfo value instead of the XFS_TRANS_DQ_DIRTY flag
  xfs: delete duplicated tp->t_dqinfo null check and allocation
  ...

Merge tag 'ktest-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest

Pull ktest updates from Steven Rostedt:
"No new features. Just a couple of fixes that I had in my local
  repository that fixed issues with sending the result emails"

* tag 'ktest-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
  ktest.pl: Fix the logic for truncating the size of the log file for email
  ktest.pl: If size of log is too big to email, email error message

Merge tag 'drm-next-2020-12-18' of git://anongit.freedesktop.org/drm/drm

Pull more drm updates from Daniel Vetter:
"UAPI Changes:

   - Only enable char/agp uapi when CONFIG_DRM_LEGACY is set

  Cross-subsystem Changes:

   - vma_set_file helper to make vma->vm_file changing less brittle,
     acked by Andrew

  Core Changes:

   - dma-buf heaps improvements

   - pass full atomic modeset state to driver callbacks

   - shmem helpers: cached bo by default

   - cleanups for fbdev, fb-helpers

   - better docs for drm modes and SCALING_FITLER uapi

   - ttm: fix dma32 page pool regression

  Driver Changes:

   - multi-hop regression fixes for amdgpu, radeon, nouveau

   - lots of small amdgpu hw enabling fixes (display, pm, ...)

   - fixes for imx, mcde, meson, some panels, virtio, qxl, i915, all
     fairly minor

   - some cleanups for legacy drm/fbdev drivers"

* tag 'drm-next-2020-12-18' of git://anongit.freedesktop.org/drm/drm: (117 commits)
  drm/qxl: don't allocate a dma_address array
  drm/nouveau: fix multihop when move doesn't work.
  drm/i915/tgl: Fix REVID macros for TGL to fetch correct stepping
  drm/i915: Fix mismatch between misplaced vma check and vma insert
  drm/i915/perf: also include Gen11 in OATAILPTR workaround
  Revert "drm/i915: re-order if/else ladder for hpd_irq_setup"
  drm/amdgpu/disply: fix documentation warnings in display manager
  drm/amdgpu: print mmhub client name for dimgrey_cavefish
  drm/amdgpu: set mode1 reset as default for dimgrey_cavefish
  drm/amd/display: Add get_dig_frontend implementation for DCEx
  drm/radeon: remove h from printk format specifier
  drm/amdgpu: remove h from printk format specifier
  drm/amdgpu: Fix spelling mistake "Heterogenous" -> "Heterogeneous"
  drm/amdgpu: fix regression in vbios reservation handling on headless
  drm/amdgpu/SRIOV: Extend VF reset request wait period
  drm/amdkfd: correct amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu log.
  drm/amd/display: Adding prototype for dccg21_update_dpp_dto()
  drm/amdgpu: print what method we are using for runtime pm
  drm/amdgpu: simplify logic in atpx resume handling
  drm/amdgpu: no need to call pci_ignore_hotplug for _PR3
  ...

Merge tag 'thermal-v5.11-2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Pull thermal fixlet from Daniel Lezcano:
"A trivial change which fell through the cracks:

Add Alder Lake support ACPI ids (Srinivas Pandruvada)"

* tag 'thermal-v5.11-2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal: int340x: Support Alder Lake

Merge tag 's390-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Heiko Carstens:
"This is mainly to decouple udelay() and arch_cpu_idle() and simplify
  both of them.

  Summary:

   - Always initialize kernel stack backchain when entering the kernel,
     so that unwinding works properly.

   - Fix stack unwinder test case to avoid rare interrupt stack
     corruption.

   - Simplify udelay() and just let it busy loop instead of implementing
     a complex logic.

   - arch_cpu_idle() cleanup.

   - Some other minor improvements"

* tag 's390-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/zcrypt: convert comma to semicolon
  s390/idle: allow arch_cpu_idle() to be kprobed
  s390/idle: remove raw_local_irq_save()/restore() from arch_cpu_idle()
  s390/idle: merge enabled_wait() and arch_cpu_idle()
  s390/delay: remove udelay_simple()
  s390/irq: select HAVE_IRQ_EXIT_ON_IRQ_STACK
  s390/delay: simplify udelay
  s390/test_unwind: use timer instead of udelay
  s390/test_unwind: fix CALL_ON_STACK tests
  s390: make calls to TRACE_IRQS_OFF/TRACE_IRQS_ON balanced
  s390: always clear kernel stack backchain before calling functions

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull more arm64 updates from Catalin Marinas:
"These are some some trivial updates that mostly fix/clean-up code
  pushed during the merging window:

   - Work around broken GCC 4.9 handling of "S" asm constraint

   - Suppress W=1 missing prototype warnings

   - Warn the user when a small VA_BITS value cannot map the available
     memory

   - Drop the useless update to per-cpu cycles"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Work around broken GCC 4.9 handling of "S" constraint
  arm64: Warn the user when a small VA_BITS value wastes memory
  arm64: entry: suppress W=1 prototype warnings
  arm64: topology: Drop the useless update to per-cpu cycles

Merge tag 'riscv-for-linus-5.11-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:
"We have a handful of new kernel features for 5.11:

   - Support for the contiguous memory allocator.

   - Support for IRQ Time Accounting

   - Support for stack tracing

   - Support for strict /dev/mem

   - Support for kernel section protection

  I'm being a bit conservative on the cutoff for this round due to the
  timing, so this is all the new development I'm going to take for this
  cycle (even if some of it probably normally would have been OK). There
  are, however, some fixes on the list that I will likely be sending
  along either later this week or early next week.

  There is one issue in here: one of my test configurations
  (PREEMPT{,_DEBUG}=y) fails to boot on QEMU 5.0.0 (from April) as of
  the .text.init alignment patch.

  With any luck we'll sort out the issue, but given how many bugs get
  fixed all over the place and how unrelated those features seem my
  guess is that we're just running into something that's been lurking
  for a while and has already been fixed in the newer QEMU (though I
  wouldn't be surprised if it's one of these implicit assumptions we
  have in the boot flow). If it was hardware I'd be strongly inclined to
  look more closely, but given that users can upgrade their simulators
  I'm less worried about it"

* tag 'riscv-for-linus-5.11-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  arm64: Use the generic devmem_is_allowed()
  arm: Use the generic devmem_is_allowed()
  RISC-V: Use the new generic devmem_is_allowed()
  lib: Add a generic version of devmem_is_allowed()
  riscv: Fixed kernel test robot warning
  riscv: kernel: Drop unused clean rule
  riscv: provide memmove implementation
  RISC-V: Move dynamic relocation section under __init
  RISC-V: Protect all kernel sections including init early
  RISC-V: Align the .init.text section
  RISC-V: Initialize SBI early
  riscv: Enable ARCH_STACKWALK
  riscv: Make stack walk callback consistent with generic code
  riscv: Cleanup stacktrace
  riscv: Add HAVE_IRQ_TIME_ACCOUNTING
  riscv: Enable CMA support
  riscv: Ignore Image.* and loader.bin
  riscv: Clean up boot dir
  riscv: Fix compressed Image formats build
  RISC-V: Add kernel image sections to the resource tree

Merge tag 'drm-intel-next-fixes-2020-12-18' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

drm/i915 fixes for the merge window

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87zh2bp34m.fsf@intel.com

drm/qxl: don't allocate a dma_address array

That seems to be unused.

Daniel: Mike reported a warning when booting with qxl, which this
patch fixes:

[ 1.815561] WARNING: CPU: 7 PID: 355 at drivers/gpu/drm/ttm/ttm_pool.c:365 ttm_pool_alloc+0x41b/0x540 [ttm]

Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-by: Mike Galbraith <efault@gmx.de>
Tested-by: Mike Galbraith <efault@gmx.de>
References: https://lore.kernel.org/lkml/7cb43d5b-4e6a-defc-1ab6-5f713ad5a963@amd.com/
Reviewed-by: David Airlie <airlied@redhat.com>
[davnet: bring commit message up to par.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20201218134243.110884-1-christian.koenig@amd.com

drm/nouveau: fix multihop when move doesn't work.

As per the radeon/amdgpu fix don't use multihop if hw moves
aren't enabled.

Reported-by: Mike Galbraith <efault@gmx.de>
Tested-by: Mike Galbraith <efault@gmx.de>
Fixes: 0c8c0659d747 ("drm/nouveau/ttm: use multihop")
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20201217200943.30511-1-airlied@gmail.com

drm/i915/tgl: Fix REVID macros for TGL to fetch correct stepping

Fix TGL REVID macros to fetch correct display/gt stepping based
on SOC rev id from INTEL_REVID() macro. Previously, we were just
returning the first element of the revid array instead of using
the correct index based on SOC rev id.

Fixes: c33298cb34f5 ("drm/i915/tgl: Fix stepping WA matching")
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203072359.156682-1-aditya.swarup@intel.com
(cherry picked from commit 83dbd74f8243f020d1ad8a3a3b3cd0795067920e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: Fix mismatch between misplaced vma check and vma insert

When inserting a VMA, we restrict the placement to the low 4G unless the
caller opts into using the full range. This was done to allow usersapce
the opportunity to transition slowly from a 32b address space, and to
avoid breaking inherent 32b assumptions of some commands.

However, for insert we limited ourselves to 4G-4K, but on verification
we allowed the full 4G. This causes some attempts to bind a new buffer
to sporadically fail with -ENOSPC, but at other times be bound
successfully.

commit 48ea1e32c39d ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1
page") suggests that there is a genuine problem with stateless addressing
that cannot utilize the last page in 4G and so we purposefully excluded
it. This means that the quick pin pass may cause us to utilize a buggy
placement.

Reported-by: CQ Tang <cq.tang@intel.com>
Testcase: igt/gem_exec_params/larger-than-life-batch
Fixes: 48ea1e32c39d ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: CQ Tang <cq.tang@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v4.5+
Link: https://patchwork.freedesktop.org/patch/msgid/20201216092951.7124-1-chris@chris-wilson.co.uk
(cherry picked from commit 5f22cc0b134ab702d7f64b714e26018f7288ffee)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/perf: also include Gen11 in OATAILPTR workaround

CI shows this workaround is also needed on Gen11.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 059a0beb486344 ("drm/i915/perf: workaround register corruption in OATAILPTR")
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126105155.540350-1-lionel.g.landwerlin@intel.com
(cherry picked from commit fa5d598b8cbab0af92bac48fd60e74a893550923)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

Revert "drm/i915: re-order if/else ladder for hpd_irq_setup"

We now use ilk_hpd_irq_setup for all GMCH platforms that do not have
hotplug. These are early gen3 and gen2 devices that now explode on boot
as they try to access non-existent registers.

Fixes: 794d61a19090 ("drm/i915: re-order if/else ladder for hpd_irq_setup")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127145748.29491-1-chris@chris-wilson.co.uk
(cherry picked from commit e5346a1ff38a405c14ce8e595269e9b7dcfbb2e9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

Merge tag 'gpio-v5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
"This is the bulk of the GPIO changes for the v5.11 kernel cycle:

  Core changes:

   - Retired the old set-up function for GPIO IRQ chips. All chips now
     use the template struct gpio_irq_chip and pass that to the core to
     be set up alongside the gpio_chip. We can finally get rid of the
     old cruft.

   - Some refactoring and clean up of the core code.

   - Support edge event timestamps to be stamped using REALTIME (wall
     clock) timestamps. We have found solid use cases for this, so we
     support it.

  New drivers:

   - MStar MSC313 GPIO driver.

   - HiSilicon GPIO driver.

  Driver improvements:

   - The PCA953x driver now also supports the NXP PCAL9554B/C chips.

   - The mockup driver can now be probed from the device tree which is
     pretty useful for virtual prototyping of devices.

   - The Rcar driver now supports .get_multiple()

   - The MXC driver dropped some legacy and became a pure device tree
     client.

   - The Exar driver was moved over to the IDA interface for
     enumerating, and also switched over to using regmap for register
     access"

* tag 'gpio-v5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (87 commits)
  MAINTAINERS: Remove reference to non-existing file
  gpio: hisi: Do not require ACPI for COMPILE_TEST
  MAINTAINERS: Add maintainer for HiSilicon GPIO driver
  gpio: gpio-hisi: Add HiSilicon GPIO support
  gpio: cs5535: Simplify the return expression of cs5535_gpio_probe()
  gpiolib: irq hooks: fix recursion in gpiochip_irq_unmask
  dt-bindings: mt7621-gpio: convert bindings to YAML format
  gpiolib: cdev: Flag invalid GPIOs as used
  gpio: put virtual gpio device into their own submenu
  drivers: gpio: amd8111: use SPDX-License-Identifier
  drivers: gpio: amd8111: prefer dev_err()/dev_info() over raw printk
  drivers: gpio: bt8xx: prefer dev_err()/dev_warn() over of raw printk
  gpio: Add TODO item for debugfs interface
  gpio: just plain warning when nonexisting gpio requested
  tools: gpio: add option to report wall-clock time to gpio-event-mon
  tools: gpio: add support for reporting realtime event clock to lsgpio
  gpiolib: cdev: allow edge event timestamps to be configured as REALTIME
  gpio: msc313: MStar MSC313 GPIO driver
  dt-bindings: gpio: Binding for MStar MSC313 GPIO controller
  dt-bindings: gpio: Add a binding header for the MSC313 GPIO driver
  ...

Merge tag 'for-linus-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml

Pull UML updates from Richard Weinberger:

- IRQ handling cleanups

- Support for suspend

- Various fixes for UML specific drivers: ubd, vector, xterm

* tag 'for-linus-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: (32 commits)
  um: Fix build w/o CONFIG_PM_SLEEP
  um: time-travel: Correct time event IRQ delivery
  um: irq/sigio: Support suspend/resume handling of workaround IRQs
  um: time-travel: Actually apply "free-until" optimisation
  um: chan_xterm: Fix fd leak
  um: tty: Fix handling of close in tty lines
  um: Monitor error events in IRQ controller
  um: allocate a guard page to helper threads
  um: support some of ARCH_HAS_SET_MEMORY
  um: time-travel: avoid multiple identical propagations
  um: Fetch registers only for signals which need them
  um: Support suspend to RAM
  um: Allow PM with suspend-to-idle
  um: time: Fix read_persistent_clock64() in time-travel
  um: Simplify os_idle_sleep() and sleep longer
  um: Simplify IRQ handling code
  um: Remove IRQ_NONE type
  um: irq: Reduce irq_reg allocation
  um: irq: Clean up and rename struct irq_fd
  um: Clean up alarm IRQ chip name
  ...

Merge tag 'for-linus-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs

Pull jffs2, ubi and ubifs updates from Richard Weinberger:
"JFFS2:
   - Fix for a remount regression
   - Fix for an abnormal GC exit
   - Fix for a possible NULL pointer issue while mounting

  UBI:
   - Add support ECC-ed NOR flash
   - Removal of dead code

  UBIFS:
   - Make node dumping debug code more reliable
   - Various cleanups: less ifdefs, less typos
   - Fix for an info leak"

* tag 'for-linus-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
  ubifs: ubifs_dump_node: Dump all branches of the index node
  ubifs: ubifs_dump_sleb: Remove unused function
  ubifs: Pass node length in all node dumping callers
  Revert "ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len"
  ubifs: Limit dumping length by size of memory which is allocated for the node
  ubifs: Remove the redundant return in dbg_check_nondata_nodes_order
  jffs2: Fix NULL pointer dereference in rp_size fs option parsing
  ubifs: Fixed print foramt mismatch in ubifs
  ubi: Do not zero out EC and VID on ECC-ed NOR flashes
  jffs2: remove trailing semicolon in macro definition
  ubifs: Fix error return code in ubifs_init_authentication()
  ubifs: wbuf: Don't leak kernel memory to flash
  ubi: Remove useless code in bytes_str_to_int
  ubifs: Fix the printing type of c->big_lpt
  jffs2: Allow setting rp_size to zero during remounting
  jffs2: Fix ignoring mounting options problem during remounting
  jffs2: Fix GC exit abnormally
  ubifs: Code cleanup by removing ifdef macro surrounding
  jffs2: Fix if/else empty body warnings
  ubifs: Delete duplicated words + other fixes

Merge tag '5.11-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs updates from Steve French:
"The largest part are for support of the newer mount API which has been
  needed for cifs/smb3 mounts for a long time due to the new API's
  better handling of remount, and better error reporting. There are
  three additional small cleanup patches for this being tested, that are
  not included yet.

  This series also includes addition of support for the SMB3 witness
  protocol which can provide important notifications from the server to
  client on server address or export or network changes. This can be
  useful for example in order to be notified before the failure - when a
  server's IP address changes (in the future it will allow us to support
  server notifications of when a share is moved).

  It also includes three patches for stable e.g. some that better handle
  some confusing error messages during session establishment"

* tag '5.11-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6: (55 commits)
  cifs: update internal module version number
  cifs: Fix support for remount when not changing rsize/wsize
  cifs: handle "guest" mount parameter
  cifs: correct four aliased mount parms to allow use of previous names
  cifs: Tracepoints and logs for tracing credit changes.
  cifs: fix use after free in cifs_smb3_do_mount()
  cifs: fix rsize/wsize to be negotiated values
  cifs: Fix some error pointers handling detected by static checker
  smb3: remind users that witness protocol is experimental
  cifs: update super_operations to show_devname
  cifs: fix uninitialized variable in smb3_fs_context_parse_param
  cifs: update mnt_cifs_flags during reconfigure
  cifs: move update of flags into a separate function
  cifs: remove ctx argument from cifs_setup_cifs_sb
  cifs: do not allow changing posix_paths during remount
  cifs: uncomplicate printing the iocharset parameter
  cifs: don't create a temp nls in cifs_setup_ipc
  cifs: simplify handling of cifs_sb/ctx->local_nls
  cifs: we do not allow changing username/password/unc/... during remount
  cifs: add initial reconfigure support
  ...

Merge tag 'net-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Current release - always broken:

   - net/smc: fix access to parent of an ib device

   - devlink: use _BITUL() macro instead of BIT() in the UAPI header

   - handful of mptcp fixes

  Previous release - regressions:

   - intel: AF_XDP: clear the status bits for the next_to_use descriptor

   - dpaa2-eth: fix the size of the mapped SGT buffer

  Previous release - always broken:

   - mptcp: fix security context on server socket

   - ethtool: fix string set id check

   - ethtool: fix error paths in ethnl_set_channels()

   - lan743x: fix rx_napi_poll/interrupt ping-pong

   - qca: ar9331: fix sleeping function called from invalid context bug"

* tag 'net-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (32 commits)
  net/sched: sch_taprio: reset child qdiscs before freeing them
  nfp: move indirect block cleanup to flower app stop callback
  octeontx2-af: Fix undetected unmap PF error check
  net: nixge: fix spelling mistake in Kconfig: "Instuments" -> "Instruments"
  qlcnic: Fix error code in probe
  mptcp: fix pending data accounting
  mptcp: push pending frames when subflow has free space
  mptcp: properly annotate nested lock
  mptcp: fix security context on server socket
  net/mlx5: Fix compilation warning for 32-bit platform
  mptcp: clear use_ack and use_map when dropping other suboptions
  devlink: use _BITUL() macro instead of BIT() in the UAPI header
  net: korina: fix return value
  net/smc: fix access to parent of an ib device
  ethtool: fix error paths in ethnl_set_channels()
  nfc: s3fwrn5: Remove unused NCI prop commands
  nfc: s3fwrn5: Remove the delay for NFC sleep
  phy: fix kdoc warning
  tipc: do sanity check payload of a netlink message
  use __netdev_notify_peers in hyperv
  ...

Merge tag 'for-linus' of git://github.com/openrisc/linux

Pull OpenRISC updates from Stafford Horne:

- New drivers and OpenRISC support for the LiteX platform

- A bug fix to support userspace gdb debugging

- Fixes one compile issue with blk-iocost

* tag 'for-linus' of git://github.com/openrisc/linux:
  openrisc: add local64.h to fix blk-iocost build
  openrisc: fix trap for debugger breakpoint signalling
  openrisc: add support for LiteX
  drivers/tty/serial: add LiteUART driver
  dt-bindings: serial: document LiteUART bindings
  drivers/soc/litex: add LiteX SoC Controller driver
  dt-bindings: soc: document LiteX SoC Controller bindings
  dt-bindings: vendor: add vendor prefix for LiteX

Merge tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

- Switch to the generic C VDSO, as well as some cleanups of our VDSO
   setup/handling code.

- Support for KUAP (Kernel User Access Prevention) on systems using the
   hashed page table MMU, using memory protection keys.

- Better handling of PowerVM SMT8 systems where all threads of a core
   do not share an L2, allowing the scheduler to make better scheduling
   decisions.

- Further improvements to our machine check handling.

- Show registers when unwinding interrupt frames during stack traces.

- Improvements to our pseries (PowerVM) partition migration code.

- Several series from Christophe refactoring and cleaning up various
   parts of the 32-bit code.

- Other smaller features, fixes & cleanups.

Thanks to: Alan Modra, Alexey Kardashevskiy, Andrew Donnellan, Aneesh
Kumar K.V, Ard Biesheuvel, Athira Rajeev, Balamuruhan S, Bill Wendling,
Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King,
Daniel Axtens, David Hildenbrand, Frederic Barrat, Ganesh Goudar,
Gautham R. Shenoy, Geert Uytterhoeven, Giuseppe Sacco, Greg Kurz,
Harish, Jan Kratochvil, Jordan Niethe, Kaixu Xia, Laurent Dufour,
Leonardo Bras, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu
Desnoyers, Nathan Lynch, Nicholas Piggin, Oleg Nesterov, Oliver
O'Halloran, Oscar Salvador, Po-Hsu Lin, Qian Cai, Qinglang Miao, Randy
Dunlap, Ravi Bangoria, Sachin Sant, Sandipan Das, Sebastian Andrzej
Siewior , Segher Boessenkool, Srikar Dronamraju, Tyrel Datwyler, Uwe
Kleine-König, Vincent Stehlé, Youling Tang, and Zhang Xiaoxu.

* tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (304 commits)
  powerpc/32s: Fix cleanup_cpu_mmu_context() compile bug
  powerpc: Add config fragment for disabling -Werror
  powerpc/configs: Add ppc64le_allnoconfig target
  powerpc/powernv: Rate limit opal-elog read failure message
  powerpc/pseries/memhotplug: Quieten some DLPAR operations
  powerpc/ps3: use dma_mapping_error()
  powerpc: force inlining of csum_partial() to avoid multiple csum_partial() with GCC10
  powerpc/perf: Fix Threshold Event Counter Multiplier width for P10
  powerpc/mm: Fix hugetlb_free_pmd_range() and hugetlb_free_pud_range()
  KVM: PPC: Book3S HV: Fix mask size for emulated msgsndp
  KVM: PPC: fix comparison to bool warning
  KVM: PPC: Book3S: Assign boolean values to a bool variable
  powerpc: Inline setup_kup()
  powerpc/64s: Mark the kuap/kuep functions non __init
  KVM: PPC: Book3S HV: XIVE: Add a comment regarding VP numbering
  powerpc/xive: Improve error reporting of OPAL calls
  powerpc/xive: Simplify xive_do_source_eoi()
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG
  ...

Merge tag 'trace-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
"The major update to this release is that there's a new arch config
  option called CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS.

  Currently, only x86_64 enables it. All the ftrace callbacks now take a
  struct ftrace_regs instead of a struct pt_regs. If the architecture
  has HAVE_DYNAMIC_FTRACE_WITH_ARGS enabled, then the ftrace_regs will
  have enough information to read the arguments of the function being
  traced, as well as access to the stack pointer.

  This way, if a user (like live kernel patching) only cares about the
  arguments, then it can avoid using the heavier weight "regs" callback,
  that puts in enough information in the struct ftrace_regs to simulate
  a breakpoint exception (needed for kprobes).

  A new config option that audits the timestamps of the ftrace ring
  buffer at most every event recorded.

  Ftrace recursion protection has been cleaned up to move the protection
  to the callback itself (this saves on an extra function call for those
  callbacks).

  Perf now handles its own RCU protection and does not depend on ftrace
  to do it for it (saving on that extra function call).

  New debug option to add "recursed_functions" file to tracefs that
  lists all the places that triggered the recursion protection of the
  function tracer. This will show where things need to be fixed as
  recursion slows down the function tracer.

  The eval enum mapping updates done at boot up are now offloaded to a
  work queue, as it caused a noticeable pause on slow embedded boards.

  Various clean ups and last minute fixes"

* tag 'trace-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
  tracing: Offload eval map updates to a work queue
  Revert: "ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS"
  ring-buffer: Add rb_check_bpage in __rb_allocate_pages
  ring-buffer: Fix two typos in comments
  tracing: Drop unneeded assignment in ring_buffer_resize()
  tracing: Disable ftrace selftests when any tracer is running
  seq_buf: Avoid type mismatch for seq_buf_init
  ring-buffer: Fix a typo in function description
  ring-buffer: Remove obsolete rb_event_is_commit()
  ring-buffer: Add test to validate the time stamp deltas
  ftrace/documentation: Fix RST C code blocks
  tracing: Clean up after filter logic rewriting
  tracing: Remove the useless value assignment in test_create_synth_event()
  livepatch: Use the default ftrace_ops instead of REGS when ARGS is available
  ftrace/x86: Allow for arguments to be passed in to ftrace_regs by default
  ftrace: Have the callbacks receive a struct ftrace_regs instead of pt_regs
  MAINTAINERS: assign ./fs/tracefs to TRACING
  tracing: Fix some typos in comments
  ftrace: Remove unused varible 'ret'
  ring-buffer: Add recording of ring buffer recursion into recursed_functions
  ...

Merge tag 'modules-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull modules updates from Jessica Yu:
"Summary of modules changes for the 5.11 merge window:

   - Fix a race condition between systemd/udev and the module loader.

     The module loader was sending a uevent before the module was fully
     initialized (i.e., before its init function has been called). This
     means udev can start processing the module uevent before the module
     has finished initializing, and some udev rules expect that the
     module has initialized already upon receiving the uevent.

     This resulted in some systemd mount units failing if udev processes
     the event faster than the module can finish init. This is fixed by
     delaying the uevent until after the module has called its init
     routine.

   - Make the linker array sections for kernel params and module version
     attributes more robust by switching to use the alignment of the
     type in question.

     Namely, linker section arrays will be constructed using the
     alignment required by the struct (using __alignof__()) as opposed
     to a specific value such as sizeof(void *) or sizeof(long). This is
     less likely to cause breakages should the size of the type ever
     change (Johan Hovold)

   - Fix module state inconsistency by setting it back to GOING when a
     module fails to load and is on its way out (Miroslav Benes)

   - Some comment and code cleanups (Sergey Shtylyov)"

* tag 'modules-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: delay kobject uevent until after module init call
  module: drop semicolon from version macro
  init: use type alignment for kernel parameters
  params: clean up module-param macros
  params: use type alignment for kernel parameters
  params: drop redundant "unused" attributes
  module: simplify version-attribute handling
  module: drop version-attribute alignment
  module: fix comment style
  module: add more 'kernel-doc' comments
  module: fix up 'kernel-doc' comments
  module: only handle errors with the *switch* statement in module_sig_check()
  module: avoid *goto*s in module_sig_check()
  module: merge repetitive strings in module_sig_check()
  module: set MODULE_STATE_GOING state when a module fails to load

Merge tag 'dmaengine-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine updates from Vinod Koul:
"The last dmaengine updates for this year :)

  This contains couple of new drivers, new device support and updates to
  bunch of drivers.

  New drivers/devices:
   - Qualcomm ADM driver
   - Qualcomm GPI driver
   - Allwinner A100 DMA support
   - Microchip Sama7g5 support
   - Mediatek MT8516 apdma

  Updates:
   - more updates to idxd driver and support for IAX config
   - runtime PM support for dw driver
   - TI drivers"

* tag 'dmaengine-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (75 commits)
  soc: ti: k3-ringacc: Use correct error casting in k3_ringacc_dmarings_init
  dmaengine: ti: k3-udma-glue: Add support for K3 PKTDMA
  dmaengine: ti: k3-udma: Initial support for K3 PKTDMA
  dmaengine: ti: k3-udma: Add support for BCDMA channel TPL handling
  dmaengine: ti: k3-udma: Initial support for K3 BCDMA
  soc: ti: k3-ringacc: add AM64 DMA rings support.
  dmaengine: ti: Add support for k3 event routers
  dmaengine: ti: k3-psil: Add initial map for AM64
  dmaengine: ti: k3-psil: Extend psil_endpoint_config for K3 PKTDMA
  dt-bindings: dma: ti: Add document for K3 PKTDMA
  dt-bindings: dma: ti: Add document for K3 BCDMA
  dmaengine: dmatest: Use dmaengine_get_dma_device
  dmaengine: doc: client: Update for dmaengine_get_dma_device() usage
  dmaengine: Add support for per channel coherency handling
  dmaengine: of-dma: Add support for optional router configuration callback
  dmaengine: ti: k3-udma-glue: Configure the dma_dev for rings
  dmaengine: ti: k3-udma-glue: Get the ringacc from udma_dev
  dmaengine: ti: k3-udma-glue: Add function to get device pointer for DMA API
  dmaengine: ti: k3-udma: Add support for second resource range from sysfw
  dmaengine: ti: k3-udma: Wait for peer teardown completion if supported
  ...

Merge tag 'mailbox-v5.11' of git://git.linaro.org/landing-teams/working/fujitsu/integration

Pull mailbox updates from Jassi Brar:

- arm: added mhu-v2 controller driver

- arm_mhu_db: fix kfree by using devm_ variant

- stm32-ipcc: misc cleanup

* tag 'mailbox-v5.11' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  mailbox: arm_mhuv2: Add driver
  dt-bindings: mailbox : arm,mhuv2: Add bindings
  mailbox: stm32-ipcc: cast void pointers to unsigned long
  mailbox: stm32-ipcc: remove duplicate error message
  mailbox: stm32-ipcc: add COMPILE_TEST dependency
  mailbox: arm_mhu_db: Fix mhu_db_shutdown by replacing kfree with devm_kfree

Merge tag 'nfs-for-5.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
"Highlights include:

  Features:

   - NFSv3: Add emulation of lookupp() to improve open_by_filehandle()
     support

   - A series of patches to improve readdir performance, particularly
     with large directories

   - Basic support for using NFS/RDMA with the pNFS files and flexfiles
     drivers

   - Micro-optimisations for RDMA

   - RDMA tracing improvements

  Bugfixes:

   - Fix a long standing bug with xs_read_xdr_buf() when receiving
     partial pages (Dan Aloni)

   - Various fixes for getxattr and listxattr, when used over non-TCP
     transports

   - Fixes for containerised NFS from Sargun Dhillon

   - switch nfsiod to be an UNBOUND workqueue (Neil Brown)

   - READDIR should not ask for security label information if there is
     no LSM policy (Olga Kornievskaia)

   - Avoid using interval-based rebinding with TCP in lockd (Calum
     Mackay)

   - A series of RPC and NFS layer fixes to support the NFSv4.2
     READ_PLUS code

   - A couple of fixes for pnfs/flexfiles read failover

  Cleanups:

   - Various cleanups for the SUNRPC xdr code in conjunction with the
     READ_PLUS fixes"

* tag 'nfs-for-5.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (90 commits)
  NFS/pNFS: Fix a typo in ff_layout_resend_pnfs_read()
  pNFS/flexfiles: Avoid spurious layout returns in ff_layout_choose_ds_for_read
  NFSv4/pnfs: Add tracing for the deviceid cache
  fs/lockd: convert comma to semicolon
  NFSv4.2: fix error return on memory allocation failure
  NFSv4.2/pnfs: Don't use READ_PLUS with pNFS yet
  NFSv4.2: Deal with potential READ_PLUS data extent buffer overflow
  NFSv4.2: Don't error when exiting early on a READ_PLUS buffer overflow
  NFSv4.2: Handle hole lengths that exceed the READ_PLUS read buffer
  NFSv4.2: decode_read_plus_hole() needs to check the extent offset
  NFSv4.2: decode_read_plus_data() must skip padding after data segment
  NFSv4.2: Ensure we always reset the result->count in decode_read_plus()
  SUNRPC: When expanding the buffer, we may need grow the sparse pages
  SUNRPC: Cleanup - constify a number of xdr_buf helpers
  SUNRPC: Clean up open coded setting of the xdr_stream 'nwords' field
  SUNRPC: _copy_to/from_pages() now check for zero length
  SUNRPC: Cleanup xdr_shrink_bufhead()
  SUNRPC: Fix xdr_expand_hole()
  SUNRPC: Fixes for xdr_align_data()
  SUNRPC: _shift_data_left/right_pages should check the shift length
  ...

Merge tag 'ceph-for-5.11-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
"The big ticket item here is support for msgr2 on-wire protocol, which
  adds the option of full in-transit encryption using AES-GCM algorithm
  (myself).

  On top of that we have a series to avoid intermittent errors during
  recovery with recover_session=clean and some MDS request encoding work
  from Jeff, a cap handling fix and assorted observability improvements
  from Luis and Xiubo and a good number of cleanups.

  Luis also ran into a corner case with quotas which sadly means that we
  are back to denying cross-quota-realm renames"

* tag 'ceph-for-5.11-rc1' of git://github.com/ceph/ceph-client: (59 commits)
  libceph: drop ceph_auth_{create,update}_authorizer()
  libceph, ceph: make use of __ceph_auth_get_authorizer() in msgr1
  libceph, ceph: implement msgr2.1 protocol (crc and secure modes)
  libceph: introduce connection modes and ms_mode option
  libceph, rbd: ignore addr->type while comparing in some cases
  libceph, ceph: get and handle cluster maps with addrvecs
  libceph: factor out finish_auth()
  libceph: drop ac->ops->name field
  libceph: amend cephx init_protocol() and build_request()
  libceph, ceph: incorporate nautilus cephx changes
  libceph: safer en/decoding of cephx requests and replies
  libceph: more insight into ticket expiry and invalidation
  libceph: move msgr1 protocol specific fields to its own struct
  libceph: move msgr1 protocol implementation to its own file
  libceph: separate msgr1 protocol implementation
  libceph: export remaining protocol independent infrastructure
  libceph: export zero_page
  libceph: rename and export con->flags bits
  libceph: rename and export con->state states
  libceph: make con->state an int
  ...

Merge tag 'ovl-update-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs updates from Miklos Szeredi:

- Allow unprivileged mounting in a user namespace.

   For quite some time the security model of overlayfs has been that
   operations on underlying layers shall be performed with the
   privileges of the mounting task.

   This way an unprvileged user cannot gain privileges by the act of
   mounting an overlayfs instance. A full audit of all function calls
   made by the overlayfs code has been performed to see whether they
   conform to this model, and this branch contains some fixes in this
   regard.

- Support running on copied filesystem images by optionally disabling
   UUID verification.

- Bug fixes as well as documentation updates.

* tag 'ovl-update-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: unprivieged mounts
  ovl: do not get metacopy for userxattr
  ovl: do not fail because of O_NOATIME
  ovl: do not fail when setting origin xattr
  ovl: user xattr
  ovl: simplify file splice
  ovl: make ioctl() safe
  ovl: check privs before decoding file handle
  vfs: verify source area in vfs_dedupe_file_range_one()
  vfs: move cap_convert_nscap() call into vfs_setxattr()
  ovl: fix incorrect extent info in metacopy case
  ovl: expand warning in ovl_d_real()
  ovl: document lower modification caveats
  ovl: warn about orphan metacopy
  ovl: doc clarification
  ovl: introduce new "uuid=off" option for inodes index feature
  ovl: propagate ovl_fs to ovl_decode_real_fh and ovl_encode_real_fh

Merge tag 'fuse-update-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse updates from Miklos Szeredi:

- Improve performance of virtio-fs in mixed read/write workloads

- Try to revalidate cache before returning EEXIST on exclusive create

- Add a couple of miscellaneous bug fixes as well as some code cleanups

* tag 'fuse-update-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix bad inode
  fuse: support SB_NOSEC flag to improve write performance
  fuse: add a flag FUSE_OPEN_KILL_SUIDGID for open() request
  fuse: don't send ATTR_MODE to kill suid/sgid for handle_killpriv_v2
  fuse: setattr should set FATTR_KILL_SUIDGID
  fuse: set FUSE_WRITE_KILL_SUIDGID in cached write path
  fuse: rename FUSE_WRITE_KILL_PRIV to FUSE_WRITE_KILL_SUIDGID
  fuse: introduce the notion of FUSE_HANDLE_KILLPRIV_V2
  fuse: always revalidate if exclusive create
  virtiofs: clean up error handling in virtio_fs_get_tree()
  fuse: add fuse_sb_destroy() helper
  fuse: simplify get_fuse_conn*()
  fuse: get rid of fuse_mount refcount
  virtiofs: simplify sb setup
  virtiofs fix leak in setup
  fuse: launder page should wait for page writeback

Merge tag 'f2fs-for-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
"In this round, we've made more work into per-file compression support.

  For example, F2FS_IOC_GET | SET_COMPRESS_OPTION provides a way to
  change the algorithm or cluster size per file. F2FS_IOC_COMPRESS |
  DECOMPRESS_FILE provides a way to compress and decompress the existing
  normal files manually.

  There is also a new mount option, compress_mode=fs|user, which can
  control who compresses the data.

  Chao also added a checksum feature with a mount option so that
  we are able to detect any corrupted cluster.

  In addition, Daniel contributed casefolding with encryption patch,
  which will be used for Android devices.

  Summary:

  Enhancements:
   - add ioctls and mount option to manage per-file compression feature
   - support casefolding with encryption
   - support checksum for compressed cluster
   - avoid IO starvation by replacing mutex with rwsem
   - add sysfs, max_io_bytes, to control max bio size

  Bug fixes:
   - fix use-after-free issue when compression and fsverity are enabled
   - fix consistency corruption during fault injection test
   - fix data offset for lseek
   - get rid of buffer_head which has 32bits limit in fiemap
   - fix some bugs in multi-partitions support
   - fix nat entry count calculation in shrinker
   - fix some stat information

  And, we've refactored some logics and fix minor bugs as well"

* tag 'f2fs-for-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (36 commits)
  f2fs: compress: fix compression chksum
  f2fs: fix shift-out-of-bounds in sanity_check_raw_super()
  f2fs: fix race of pending_pages in decompression
  f2fs: fix to account inline xattr correctly during recovery
  f2fs: inline: fix wrong inline inode stat
  f2fs: inline: correct comment in f2fs_recover_inline_data
  f2fs: don't check PAGE_SIZE again in sanity_check_raw_super()
  f2fs: convert to F2FS_*_INO macro
  f2fs: introduce max_io_bytes, a sysfs entry, to limit bio size
  f2fs: don't allow any writes on readonly mount
  f2fs: avoid race condition for shrinker count
  f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE
  f2fs: add compress_mode mount option
  f2fs: Remove unnecessary unlikely()
  f2fs: init dirty_secmap incorrectly
  f2fs: remove buffer_head which has 32bits limit
  f2fs: fix wrong block count instead of bytes
  f2fs: use new conversion functions between blks and bytes
  f2fs: rename logical_to_blk and blk_to_logical
  f2fs: fix kbytes written stat for multi-device case
  ...

Merge tag 'for_v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull ext2, reiserfs, quota and writeback updates from Jan Kara:

- a couple of quota fixes (mostly for problems found by syzbot)

- several ext2 cleanups

- one fix for reiserfs crash on corrupted image

- a fix for spurious warning in writeback code

* tag 'for_v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  writeback: don't warn on an unregistered BDI in __mark_inode_dirty
  fs: quota: fix array-index-out-of-bounds bug by passing correct argument to vfs_cleanup_quota_inode()
  reiserfs: add check for an invalid ih_entry_count
  ext2: Fix fall-through warnings for Clang
  fs/ext2: Use ext2_put_page
  docs: filesystems: Reduce ext2.rst to one top-level heading
  quota: Sanity-check quota file headers on load
  quota: Don't overflow quota file offsets
  ext2: Remove unnecessary blank
  fs/quota: update quota state flags scheme with project quota flags

net/sched: sch_taprio: reset child qdiscs before freeing them

syzkaller shows that packets can still be dequeued while taprio_destroy()
is running. Let sch_taprio use the reset() function to cancel the advance
timer and drop all skbs from the child qdiscs.

Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
Link: https://syzkaller.appspot.com/bug?id=f362872379bf8f0017fb667c1ab158f2d1e764ae
Reported-by: syzbot+8971da381fb5a31f542d@syzkaller.appspotmail.com
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://lore.kernel.org/r/63b6d79b0e830ebb0283e020db4df3cdfdfb2b94.1608142843.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nfp: move indirect block cleanup to flower app stop callback

The indirect block cleanup may cause control messages to be sent
if offloaded flows are present. However, by the time the flower app
cleanup callback is called txbufs are no longer available and attempts
to send control messages result in a NULL-pointer dereference in
nfp_ctrl_tx_one().

This problem may be resolved by moving the indirect block cleanup
to the stop callback, where txbufs are still available.

As suggested by Jakub Kicinski and Louis Peens.

Fixes: a1db217861f3 ("net: flow_offload: fix flow_indr_dev_unregister path")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Link: https://lore.kernel.org/r/20201216145701.30005-1-simon.horman@netronome.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

octeontx2-af: Fix undetected unmap PF error check

Currently the check for an unmap PF error is always going to be false
because intr_val is a 32 bit int and is being bit-mask checked against
1ULL << 32. Fix this by making intr_val a u64 to match the type at it
is copied from, namely npa_event_context->npa_af_rvu_ge.

Addresses-Coverity: ("Operands don't affect result")
Fixes: f1168d1e207c ("octeontx2-af: Add devlink health reporters for NPA")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: George Cherian <george.cherian@marvell.com>
Link: https://lore.kernel.org/r/20201216123604.15369-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'fsnotify_for_v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify updates from Jan Kara:
"A few fsnotify fixes from Amir fixing fallout from big fsnotify
  overhaul a few months back and an improvement of defaults limiting
  maximum number of inotify watches from Waiman"

* tag 'fsnotify_for_v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: fix events reported to watching parent and child
  inotify: convert to handle_inode_event() interface
  fsnotify: generalize handle_inode_event()
  inotify: Increase default inotify.max_user_watches limit to 1048576

net: nixge: fix spelling mistake in Kconfig: "Instuments" -> "Instruments"

There is a spelling mistake in the Kconfig. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20201216120020.13149-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

qlcnic: Fix error code in probe

Return -EINVAL if we can't find the correct device. Currently it
returns success.

Fixes: 13159183ec7a ("qlcnic: 83xx base driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/X9nHbMqEyI/xPfGd@mwanda
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'mptcp-a-bunch-of-assorted-fixes'

Paolo Abeni says:

====================
mptcp: a bunch of assorted fixes

This series pulls a few fixes for the MPTCP datapath.
Most issues addressed here has been recently introduced
with the recent reworks, with the notable exception of
the first patch, which addresses an issue present since
the early days
====================

Link: https://lore.kernel.org/r/cover.1608114076.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: fix pending data accounting

When sendmsg() needs to wait for memory, the pending data
is not updated. That causes a drift in forward memory allocation,
leading to stall and/or warnings at socket close time.

This change addresses the above issue moving the pending data
counter update inside the sendmsg() main loop.

Fixes: 6e628cd3a8f7 ("mptcp: use mptcp release_cb for delayed tasks")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: push pending frames when subflow has free space

When multiple subflows are active, we can receive a
window update on subflow with no write space available.
MPTCP will try to push frames on such subflow and will
fail. Pending frames will be pushed only after receiving
a window update on a subflow with some wspace available.

Overall the above could lead to suboptimal aggregate
bandwidth usage.

Instead, we should try to push pending frames as soon as
the subflow reaches both conditions mentioned above.

We can finally enable self-tests with asymmetric links,
as the above makes them finally pass.

Fixes: 6f8a612a33e4 ("mptcp: keep track of advertised windows right edge")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: properly annotate nested lock

MPTCP closes the subflows while holding the msk-level lock.
While acquiring the subflow socket lock we need to use the
correct nested annotation, or we can hit a lockdep splat
at runtime.

Reported-and-tested-by: Geliang Tang <geliangtang@gmail.com>
Fixes: e16163b6e2b7 ("mptcp: refactor shutdown and close")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: fix security context on server socket

Currently MPTCP is not propagating the security context
from the ingress request socket to newly created msk
at clone time.

Address the issue invoking the missing security helper.

Fixes: cf7da0d66cc1 ("mptcp: Create SUBFLOW socket for incoming connections")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Fix compilation warning for 32-bit platform

MLX5_GENERAL_OBJECT_TYPES types bitfield is 64-bit field.

Defining an enum for such bit fields on 32-bit platform results in below
warning.

./include/vdso/bits.h:7:26: warning: left shift count >= width of type [-Wshift-count-overflow]
^
./include/linux/mlx5/mlx5_ifc.h:10716:46: note: in expansion of macro ‘BIT’
MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_SAMPLER = BIT(0x20),
^~~

Use 32-bit friendly BIT_ULL macro.

Fixes: 2a2970891647 ("net/mlx5: Add sample offload hardware bits and structures")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20201213120641.216032-1-leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

drm/edid: fix objtool warning in drm_cvt_modes()

Commit 991fcb77f490 ("drm/edid: Fix uninitialized variable in
drm_cvt_modes()") just replaced one warning with another.

The original warning about a possibly uninitialized variable was due to
the compiler not being smart enough to see that the case statement
actually enumerated all possible cases.  And the initial fix was just to
add a "default" case that had a single "unreachable()", just to tell the
compiler that that situation cannot happen.

However, that doesn't actually fix the fundamental reason for the
problem: the compiler still doesn't see that the existing case
statements enumerate all possibilities, so the compiler will still
generate code to jump to that unreachable case statement.  It just won't
complain about an uninitialized variable any more.

So now the compiler generates code to our inline asm marker that we told
it would not fall through, and end end result is basically random.  We
have created a bridge to nowhere.

And then, depending on the random details of just exactly what the
compiler ends up doing, 'objtool' might end up complaining about the
conditional branches (for conditions that cannot happen, and that thus
will never be taken - but if the compiler was not smart enough to figure
that out, we can't expect objtool to do so) going off in the weeds.

So depending on how the compiler has laid out the result, you might see
something like this:

    drivers/gpu/drm/drm_edid.o: warning: objtool: do_cvt_mode() falls through to next function drm_mode_detailed.isra.0()

and now you have a truly inscrutable warning that makes no sense at all
unless you start looking at whatever random code the compiler happened
to generate for our bare "unreachable()" statement.

IOW, don't use "unreachable()" unless you have an _active_ operation
that generates code that actually makes it obvious that something is not
reachable (ie an UD instruction or similar).

Solve the "compiler isn't smart enough" problem by just marking one of
the cases as "default", so that even when the compiler doesn't otherwise
see that we've enumerated all cases, the compiler will feel happy and
safe about there always being a valid case that initializes the 'width'
variable.

This also generates better code, since now the compiler doesn't generate
comparisons for five different possibilities (the four real ones and the
one that can't happen), but just for the three real ones and "the rest"
(which is that last one).

A smart enough compiler that sees that we cover all the cases won't care.

Cc: Lyude Paul <lyude@redhat.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

thermal: int340x: Support Alder Lake

Add ACPI IDs for thermal drivers for Alder Lake support.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201117194802.503337-1-srinivas.pandruvada@linux.intel.com

pwm: sun4i: Remove erroneous else branch

Commit d3817a647059 ("pwm: sun4i: Remove redundant needs_delay") changed
the logic of an else branch so that the PWM_EN and PWM_CLK_GATING bits
are now cleared if the PWM is to be disabled, whereas previously the
condition was always false, and hence the branch never got executed.

This code is reported causing backlight issues on boards based on the
Allwinner A20 SoC. Fix this by removing the else branch, which restores
the behaviour prior to the offending commit.

Note that the PWM_EN and PWM_CLK_GATING bits still get cleared later in
sun4i_pwm_apply() if the PWM is to be disabled.

Fixes: d3817a647059 ("pwm: sun4i: Remove redundant needs_delay")
Reported-by: Taras Galchenko <tpgalchenko@gmail.com>
Suggested-by: Taras Galchenko <tpgalchenko@gmail.com>
Tested-by: Taras Galchenko <tpgalchenko@gmail.com>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: sl28cpld: Set driver data before registering the PWM chip

It is good practice to set the driver data before registering a device
with a subsystem because the subsystem or the driver core may call back
into the driver implementation. This is not currently an issue, but to
prevent future changes from causing this to break unexpectedly, make
sure that the driver data is set before the PWM chip registration.

Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: Remove unused function pwmchip_add_inversed()

This is only defined with CONFIG_PWM unset and was introduced together
with pwmchip_add_with_polarity() (which is only defined with CONFIG_PWM
enabled). I guess the series that introduced pwmchip_add_with_polarity()
had a different concept in earlier revisions and the !CONFIG_PWM part
was just not updated accordingly.

Given that there is no implementation for pwmchip_add_with_polarity()
without CONFIG_PWM, just drop pwmchip_add_inversed() instead of renaming
it to pwmchip_add_with_polarity().

Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: imx27: Fix overflow for bigger periods

The second parameter of do_div is an u32 and NSEC_PER_SEC * prescale
overflows this for bigger periods. Assuming the usual pwm input clk rate
of 66 MHz this happens starting at requested period > 606060 ns.

Splitting the division into two operations doesn't loose any precision.
It doesn't need to be feared that c / NSEC_PER_SEC doesn't fit into the
unsigned long variable "duty_cycles" because in this case the assignment
above to period_cycles would already have been overflowing as
period >= duty_cycle and then the calculation is moot anyhow.

Fixes: aef1a3799b5c ("pwm: imx27: Fix rounding behavior")
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Tested-by: Johannes Pointner <johannes.pointner@br-automation.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: bcm2835: Support apply function for atomic configuration

Use the newer .apply function of pwm_ops instead of .config, .enable,
.disable and .set_polarity. This guarantees atomic changes of the pwm
controller configuration. It also reduces the size of the driver.

Since now period is a 64 bit value, add an extra check to reject periods
that exceed the possible max value for the 32 bit register.

This has been tested on a Raspberry PI 4.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: keembay: Fix build failure with -Os

The driver used this construct:

#define KMB_PWM_LEADIN_MASK             GENMASK(30, 0)

static inline void keembay_pwm_update_bits(struct keembay_pwm *priv, u32 mask,
   u32 val, u32 offset)
{
u32 buff = readl(priv->base + offset);

buff = u32_replace_bits(buff, val, mask);
writel(buff, priv->base + offset);
}

...
keembay_pwm_update_bits(priv, KMB_PWM_LEADIN_MASK, 0,
KMB_PWM_LEADIN_OFFSET(pwm->hwpwm));

With CONFIG_CC_OPTIMIZE_FOR_SIZE the compiler (here: gcc 10.2.0) this
triggers:

In file included from /home/uwe/gsrc/linux/drivers/pwm/pwm-keembay.c:16:
In function ‘field_multiplier’,
    inlined from ‘keembay_pwm_update_bits’ at /home/uwe/gsrc/linux/include/linux/bitfield.h:124:17:
/home/uwe/gsrc/linux/include/linux/bitfield.h:119:3: error: call to ‘__bad_mask’ declared with attribute error: bad bitfield mask
  119 |   __bad_mask();
      |   ^~~~~~~~~~~~
In function ‘field_multiplier’,
    inlined from ‘keembay_pwm_update_bits’ at /home/uwe/gsrc/linux/include/linux/bitfield.h:154:1:
/home/uwe/gsrc/linux/include/linux/bitfield.h:119:3: error: call to ‘__bad_mask’ declared with attribute error: bad bitfield mask
  119 |   __bad_mask();
      |   ^~~~~~~~~~~~

The compiler doesn't seem to be able to notice that with field being
0x3ffffff the expression

if ((field | (field - 1)) & ((field | (field - 1)) + 1))
__bad_mask();

can be optimized away.

So use __always_inline and document the problem in a comment to fix
this.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: core: Use octal permission

Permission bits are easier readable in octal than with using the
symbolic names.

Fixes the following warning generated by checkpatch:

    WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
    #1341: FILE: drivers/pwm/core.c:1341:
    +       debugfs_create_file("pwm", S_IFREG | S_IRUGO, NULL, NULL,

Signed-off-by: Soham Biswas <sohambiswas41@gmail.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: lpss: Make compilable with COMPILE_TEST

All used ACPI functions have dummy implementations, and there is no hard
dependency on x86.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: Fix dependencies on HAS_IOMEM

Drivers making use of IO remapping must depend on HAS_IOMEM.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: Use -EINVAL for unsupported polarity

Instead of using a mix of -EOPNOTSUPP and -ENOTSUPP, use the more
standard -EINVAL to signal that the specified polarity value was
invalid.

Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: sti: Remove unnecessary blank line

A single blank line is enough to separate logical code blocks.

Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: sti: Avoid conditional gotos

Using gotos for conditional code complicates this code significantly.
Convert the code to simple conditional blocks to increase readability.

Suggested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: Add PWM fan controller driver for LGM SoC

Intel Lightning Mountain(LGM) SoC contains a PWM fan controller. This
PWM controller does not have any other consumer, it is a dedicated PWM
controller for fan attached to the system. Add driver for this PWM fan
controller.

Signed-off-by: Rahul Tanwar <rahul.tanwar@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

Add DT bindings YAML schema for PWM fan controller of LGM SoC

Intel's LGM(Lightning Mountain) SoC contains a PWM fan controller
which is only used to control the fan attached to the system. This
PWM controller does not have any other consumer other than fan.
Add DT bindings documentation for this PWM fan controller.

Signed-off-by: Rahul Tanwar <rahul.tanwar@linux.intel.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: Add DesignWare PWM Controller Driver

Introduce driver for Synopsys DesignWare PWM Controller used on Intel
Elkhart Lake.

Initial implementation is done by Felipe Balbi while he was working at
Intel with later changes from Raymond Tan and me.

Co-developed-by: Felipe Balbi (Intel) <balbi@kernel.org>
Signed-off-by: Felipe Balbi (Intel) <balbi@kernel.org>
Co-developed-by: Raymond Tan <raymond.tan@intel.com>
Signed-off-by: Raymond Tan <raymond.tan@intel.com>
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

dt-bindings: pwm: mtk-disp: add MT8167 SoC binding

Add binding for MT8167 SoC. The IP is compatible with MT8173.

Signed-off-by: Fabien Parent <fparent@baylibre.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: mediatek: Add MT8183 SoC support

Add PWM support for the MT8183 SoC.

Signed-off-by: Fabien Parent <fparent@baylibre.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: mediatek: Always use bus clock

The MediaTek PWM IP can sometimes use the 26 MHz source clock to
generate the PWM signal, but the driver currently assumes that we always
use the PWM bus clock to generate the PWM signal.

This commit modifies the PWM driver in order to force the PWM IP to
always use the bus clock as source clock.

I do not have the datasheet of all the MediaTek SoC, so I don't know if
the register to choose the source clock is present in all the SoCs or
only in subset. As a consequence I made this change optional by using a
platform data paremeter to says whether this register is supported or
not. On all the SoCs I don't have the datasheet (MT2712, MT7622, MT7623,
MT7628, MT7629) I kept the behavior to be the same as before this
change.

Signed-off-by: Fabien Parent <fparent@baylibre.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

dt-bindings: pwm: pwm-mediatek: Add documentation for MT8183 SoC

Add binding documentation for the MT8183 SoC.

Signed-off-by: Fabien Parent <fparent@baylibre.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

pwm: Add PWM driver for Intel Keem Bay

The Intel Keem Bay SoC requires PWM support.
Add the pwm-keembay driver to enable this.

Signed-off-by: Lai, Poey Seng <poey.seng.lai@intel.com>
Co-developed-by: Vineetha G. Jaya Kumaran <vineetha.g.jaya.kumaran@intel.com>
Signed-off-by: Vineetha G. Jaya Kumaran <vineetha.g.jaya.kumaran@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Co-developed-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com>
Signed-off-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>

dt-bindings: pwm: keembay: Add bindings for Intel Keem Bay PWM

Add PWM Device Tree bindings documentation for the Intel Keem Bay SoC.

Signed-off-by: Vineetha G. Jaya Kumaran <vineetha.g.jaya.kumaran@intel.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>