git.baikalelectronics.ru Git

Merge tag 'powerpc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:
"One fix for a boot crash on some platforms introduced by the recent
  pkey refactoring.

  Thanks to Christian Zigotzky and Aneesh Kumar K.V"

* tag 'powerpc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pkeys: Fix boot failures with Nemo board (A-EON AmigaOne X1000)

Merge tag 'for-linus-5.9-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull more xen updates from Juergen Gross:

- Remove support for running as 32-bit Xen PV-guest.

   32-bit PV guests are rarely used, are lacking security fixes for
   Meltdown, and can be easily replaced by PVH mode. Another series for
   doing more cleanup will follow soon (removal of 32-bit-only pvops
   functionality).

- Fixes and additional features for the Xen display frontend driver.

* tag 'for-linus-5.9-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  drm/xen-front: Pass dumb buffer data offset to the backend
  xen: Sync up with the canonical protocol definition in Xen
  drm/xen-front: Add YUYV to supported formats
  drm/xen-front: Fix misused IS_ERR_OR_NULL checks
  xen/gntdev: Fix dmabuf import with non-zero sgt offset
  x86/xen: drop tests for highmem in pv code
  x86/xen: eliminate xen-asm_64.S
  x86/xen: remove 32-bit Xen PV guest support

Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyper-v fixes from Wei Liu:

- fix oops reporting on Hyper-V

- make objtool happy

* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Make hv_setup_sched_clock inline
Drivers: hv: vmbus: Only notify Hyper-V for die events that are oops

x86/fsgsbase/64: Fix NULL deref in 86_fsgsbase_read_task

syzbot found its way in 86_fsgsbase_read_task() and triggered this oops:

   KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
   CPU: 0 PID: 6866 Comm: syz-executor262 Not tainted 5.8.0-syzkaller #0
   RIP: 0010:x86_fsgsbase_read_task+0x16d/0x310 arch/x86/kernel/process_64.c:393
   Call Trace:
     putreg32+0x3ab/0x530 arch/x86/kernel/ptrace.c:876
     genregs32_set arch/x86/kernel/ptrace.c:1026 [inline]
     genregs32_set+0xa4/0x100 arch/x86/kernel/ptrace.c:1006
     copy_regset_from_user include/linux/regset.h:326 [inline]
     ia32_arch_ptrace arch/x86/kernel/ptrace.c:1061 [inline]
     compat_arch_ptrace+0x36c/0xd90 arch/x86/kernel/ptrace.c:1198
     __do_compat_sys_ptrace kernel/ptrace.c:1420 [inline]
     __se_compat_sys_ptrace kernel/ptrace.c:1389 [inline]
     __ia32_compat_sys_ptrace+0x220/0x2f0 kernel/ptrace.c:1389
     do_syscall_32_irqs_on arch/x86/entry/common.c:84 [inline]
     __do_fast_syscall_32+0x57/0x80 arch/x86/entry/common.c:126
     do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:149
     entry_SYSENTER_compat_after_hwframe+0x4d/0x5c

This can happen if ptrace() or sigreturn() pokes an LDT selector into FS
or GS for a task with no LDT and something tries to read the base before
a return to usermode notices the bad selector and fixes it.

The fix is to make sure ldt pointer is not NULL.

Fixes: 66cee9012dd8 ("x86/fsgsbase/64: Fix ptrace() to read the FS/GS base accurately")
Co-developed-by: Jann Horn <jannh@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
"This fixes a regression in af_alg"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: algif_aead - fix uninitialized ctx->init

Merge tag 'modules-for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull module updates from Jessica Yu:
"The most important change would be Christoph Hellwig's patch
  implementing proprietary taint inheritance, in an effort to discourage
  the creation of GPL "shim" modules that interface between GPL symbols
  and proprietary symbols.

  Summary:

   - Have modules that use symbols from proprietary modules inherit the
     TAINT_PROPRIETARY_MODULE taint, in an effort to prevent GPL shim
     modules that are used to circumvent _GPL exports. These are modules
     that claim to be GPL licensed while also using symbols from
     proprietary modules. Such modules will be rejected while non-GPL
     modules will inherit the proprietary taint.

   - Module export space cleanup. Unexport symbols that are unused
     outside of module.c or otherwise used in only built-in code"

* tag 'modules-for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  modules: inherit TAINT_PROPRIETARY_MODULE
  modules: return licensing information from find_symbol
  modules: rename the licence field in struct symsearch to license
  modules: unexport __module_address
  modules: unexport __module_text_address
  modules: mark each_symbol_section static
  modules: mark find_symbol static
  modules: mark ref_module static
  modules: linux/moduleparam.h: drop duplicated word in a comment

Merge tag 'kconfig-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kconfig updates from Masahiro Yamada:

- remove '---help---' keyword support

- fix mouse events for 'menuconfig' symbols in search view of qconf

- code cleanups of qconf

* tag 'kconfig-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (24 commits)
  kconfig: qconf: move setOptionMode() to ConfigList from ConfigView
  kconfig: qconf: do not limit the pop-up menu to the first row
  kconfig: qconf: refactor icon setups
  kconfig: qconf: remove unused voidPix, menuInvPix
  kconfig: qconf: remove ConfigItem::text/setText
  kconfig: qconf: remove ConfigList::addColumn/removeColumn
  kconfig: qconf: remove ConfigItem::pixmap/setPixmap
  kconfig: qconf: drop more localization code
  kconfig: qconf: remove 'parent' from ConfigList::updateMenuList()
  kconfig: qconf: remove unused argument from ConfigView::updateList()
  kconfig: qconf: remove unused argument from ConfigList::updateList()
  kconfig: qconf: omit parent to QHBoxLayout()
  kconfig: qconf: remove name from ConfigSearchWindow constructor
  kconfig: qconf: remove unused ConfigList::listView()
  kconfig: qconf: overload addToolBar() to create and insert toolbar
  kconfig: qconf: remove toolBar from ConfigMainWindow members
  kconfig: qconf: use 'menu' variable for (QMenu *)
  kconfig: qconf: do not use 'menu' variable for (QMenuBar *)
  kconfig: qconf: remove ->addSeparator() to menuBar
  kconfig: add 'static' to some file-local data
  ...

kconfig: qconf: move setOptionMode() to ConfigList from ConfigView

ConfigView::setOptionMode() only gets access to the 'list' member.

Move it to the more relevant ConfigList class.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: do not limit the pop-up menu to the first row

If you right-click the first row in the option tree, the pop-up menu
shows up, but if you right-click the second row or below, the event
is ignored due to the following check:

if (e->y() <= header()->geometry().bottom()) {

Perhaps, the intention was to show the pop-menu only when the tree
header was right-clicked, but this handler is not called in that case.

Since the origin of e->y() starts from the bottom of the header,
this check is odd.

Going forward, you can right-click anywhere in the tree to get the
pop-up menu.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: refactor icon setups

These icon data are used by ConfigItem, but stored in each instance
of ConfigView. There is no point to keep the same data in each of 3
instances, "menu", "config", and "search".

Move the icon data to the more relevant ConfigItem class, and make
them static members.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: remove unused voidPix, menuInvPix

These are initialized, but not used by anyone.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: remove ConfigItem::text/setText

Use QTreeWidgetItem::text/setText directly

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: remove ConfigList::addColumn/removeColumn

Use QTreeView::showColumn/hideColumn directly.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: remove ConfigItem::pixmap/setPixmap

Use QTreeWidgetItem::icon/setIcon directly.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: drop more localization code

This is a remnant of commit 53d40b99219b ("kconfig: drop localization
support").

Get it back to the code prior to commit 0ddd5d63e171 ("[PATCH] Kconfig
i18n support").

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: remove 'parent' from ConfigList::updateMenuList()

All the call-sites of this function pass 'this' to the first argument.

So, 'parent' is always the 'this' pointer.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: remove unused argument from ConfigView::updateList()

Now that ConfigList::updateList() takes no argument, the 'item' argument
ConfigView::updateList() is no longer used.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: remove unused argument from ConfigList::updateList()

This function allocates 'item' before using it, so the argument 'item'
is always shadowed.

Remove the meaningless argument.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: omit parent to QHBoxLayout()

Instead of passing 0 (i.e. nullptr), leave it empty.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: remove name from ConfigSearchWindow constructor

This constructor is only called with "search" as the second argument.

Hard-code the name in the constructor, and drop it from the function
argument.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: remove unused ConfigList::listView()

I do not know how this function can be useful. In fact, it is unsed.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: overload addToolBar() to create and insert toolbar

Use the overloaded function, addToolBar(const QString &title)
to create a QToolBar object, setting its window title, and inserts
it into the toolbar area.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: remove toolBar from ConfigMainWindow members

This pointer is only used in the ConfigMainWindow constructor.

Drop it from the private members.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: use 'menu' variable for (QMenu *)

The variable 'config' for the file menu is inconsistent.

You do not need to use different variables. Use 'menu' for every menu.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: do not use 'menu' variable for (QMenuBar *)

I think it is a bit confusing to use 'menu' to hold a QMenuBar pointer.
I want to use 'menu' for a QMenu pointer.

You do not need to use a local variable here. Use menuBar() directly.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: remove ->addSeparator() to menuBar

I do not understand the purpose of this ->addSeparator().
It does not make any difference.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: add 'static' to some file-local data

Fix some warnings from sparce like follows:

warning: symbol '...' was not declared. Should it be static?

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: qconf: Fix mouse events in search view

On menu properties mouse events didn't do anything in search view
(listMode).

As there are no menus in listMode we can add an exception in tests to
always change the value on mouse events if we are in listMode.

Signed-off-by: Maxime Chretien <maxime.chretien@bootlin.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: constify XPM data

Constify arrays as well as strings.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

Revert "checkpatch: kconfig: prefer 'help' over '---help---'"

This reverts commit d0ebe97b779f5b22dd489d4c42841d2c67b03861.

The conversion is done.

Cc: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

kconfig: remove '---help---' support

The conversion is done. No more user of '---help---'.

Cc: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from David Miller:
"Some merge window fallout, some longer term fixes:

   1) Handle headroom properly in lapbether and x25_asy drivers, from
      Xie He.

   2) Fetch MAC address from correct r8152 device node, from Thierry
      Reding.

   3) In the sw kTLS path we should allow MSG_CMSG_COMPAT in sendmsg,
      from Rouven Czerwinski.

   4) Correct fdputs in socket layer, from Miaohe Lin.

   5) Revert troublesome sockptr_t optimization, from Christoph Hellwig.

   6) Fix TCP TFO key reading on big endian, from Jason Baron.

   7) Missing CAP_NET_RAW check in nfc, from Qingyu Li.

   8) Fix inet fastreuse optimization with tproxy sockets, from Tim
      Froidcoeur.

   9) Fix 64-bit divide in new SFC driver, from Edward Cree.

  10) Add a tracepoint for prandom_u32 so that we can more easily
      perform usage analysis. From Eric Dumazet.

  11) Fix rwlock imbalance in AF_PACKET, from John Ogness"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits)
  net: openvswitch: introduce common code for flushing flows
  af_packet: TPACKET_V3: fix fill status rwlock imbalance
  random32: add a tracepoint for prandom_u32()
  Revert "ipv4: tunnel: fix compilation on ARCH=um"
  net: accept an empty mask in /sys/class/net/*/queues/rx-*/rps_cpus
  net: ethernet: stmmac: Disable hardware multicast filter
  net: stmmac: dwmac1000: provide multicast filter fallback
  ipv4: tunnel: fix compilation on ARCH=um
  vsock: fix potential null pointer dereference in vsock_poll()
  sfc: fix ef100 design-param checking
  net: initialize fastreuse on inet_inherit_port
  net: refactor bind_bucket fastreuse into helper
  net: phy: marvell10g: fix null pointer dereference
  net: Fix potential memory leak in proto_register()
  net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init
  ionic_lif: Use devm_kcalloc() in ionic_qcq_alloc()
  net/nfc/rawsock.c: add CAP_NET_RAW check.
  hinic: fix strncpy output truncated compile warnings
  drivers/net/wan/x25_asy: Added needed_headroom and a skb->len check
  net/tls: Fix kmap usage
  ...

Merge branch 'i2c/for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c updates from Wolfram Sang:

- bus recovery can now be given a pinctrl handle and the I2C core will
   do all the steps to switch to/from GPIO which can save quite some
   boilerplate code from drivers

- "fallthrough" conversion

- driver updates, mostly ID additions

* 'i2c/for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (32 commits)
  i2c: iproc: fix race between client unreg and isr
  i2c: eg20t: use generic power management
  i2c: eg20t: Drop PCI wakeup calls from .suspend/.resume
  i2c: mediatek: Fix i2c_spec_values description
  i2c: mediatek: Add i2c compatible for MediaTek MT8192
  dt-bindings: i2c: update bindings for MT8192 SoC
  i2c: mediatek: Add access to more than 8GB dram in i2c driver
  i2c: mediatek: Add apdma sync in i2c driver
  i2c: i801: Add support for Intel Tiger Lake PCH-H
  i2c: i801: Add support for Intel Emmitsburg PCH
  i2c: bcm2835: Replace HTTP links with HTTPS ones
  Documentation: i2c: dev: 'block process call' is supported
  i2c: at91: Move to generic GPIO bus recovery
  i2c: core: treat EPROBE_DEFER when acquiring SCL/SDA GPIOs
  i2c: core: add generic I2C GPIO recovery
  dt-bindings: i2c: add generic properties for GPIO bus recovery
  i2c: rcar: avoid race when unregistering slave
  i2c: tegra: Avoid tegra_i2c_init_dma() for Tegra210 vi i2c
  i2c: tegra: Fix runtime resume to re-init VI I2C
  i2c: tegra: Fix the error path in tegra_i2c_runtime_resume
  ...

net: openvswitch: introduce common code for flushing flows

To avoid some issues, for example RCU usage warning and double free,
we should flush the flows under ovs_lock. This patch refactors
table_instance_destroy and introduces table_instance_flow_flush
which can be invoked by __dp_destroy or ovs_flow_tbl_flush.

Fixes: 67df6806f2fb ("net: openvswitch: fix possible memleak on destroy flow-table")
Reported-by: Johan Knöös <jknoos@google.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2020-August/050489.html
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_packet: TPACKET_V3: fix fill status rwlock imbalance

After @blk_fill_in_prog_lock is acquired there is an early out vnet
situation that can occur. In that case, the rwlock needs to be
released.

Also, since @blk_fill_in_prog_lock is only acquired when @tp_version
is exactly TPACKET_V3, only release it on that exact condition as
well.

And finally, add sparse annotation so that it is clearer that
prb_fill_curr_block() and prb_clear_blk_fill_status() are acquiring
and releasing @blk_fill_in_prog_lock, respectively. sparse is still
unable to understand the balance, but the warnings are now on a
higher level that make more sense.

Fixes: 39d698d2d507 ("af_packet: TPACKET_V3: replace busy-wait loop")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

random32: add a tracepoint for prandom_u32()

There has been some heat around prandom_u32() lately, and some people
were wondering if there was a simple way to determine how often
it was used, before considering making it maybe 10 times more expensive.

This tracepoint exports the generated pseudo random value.

Tested:

perf list | grep prandom_u32
  random:prandom_u32                                 [Tracepoint event]

perf record -a [-g] [-C1] -e random:prandom_u32 sleep 1
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 259.748 MB perf.data (924087 samples) ]

perf report --nochildren
    ...
    97.67%  ksoftirqd/1     [kernel.vmlinux]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |
               |--48.86%--tcp_v4_syn_recv_sock
               |          tcp_check_req
               |          tcp_v4_rcv
               |          ...
                --48.81%--tcp_conn_request
                          tcp_v4_conn_request
                          tcp_rcv_state_process
                          ...
perf script

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'docs-5.9-2' of git://git.lwn.net/linux

Pull documentation fixes from Jonathan Corbet:
"A handful of obvious fixes that wandered in during the merge window"

* tag 'docs-5.9-2' of git://git.lwn.net/linux:
  Documentation/locking/locktypes: fix the typo
  doc/zh_CN: resolve undefined label warning in admin-guide index
  doc/zh_CN: fix title heading markup in admin-guide cpu-load
  docs: remove the 2.6 "Upgrading I2C Drivers" guide
  docs: Correct the release date of 5.2 stable
  mailmap: Update comments for with format and more detalis
  docs: cdrom: Fix a typo and rst markup
  Doc: admin-guide: use correct legends in kernel-parameters.txt
  Documentation/features: refresh RISC-V arch support files
  documentation: coccinelle: Improve command example for make C={1,2}
  Core-api: Documentation: Replace deprecated :c:func: Usage
  Dev-tools: Documentation: Replace deprecated :c:func: Usage
  Filesystems: Documentation: Replace deprecated :c:func: Usage
  docs: trace: fix a typo

Documentation/locking/locktypes: fix the typo

We have three categories locks, not two.

Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200813060220.18199-1-sjhuang@iluvatar.ai
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Merge tag 's390-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Heiko Carstens:

- Allow s390 debug feature to handle finally more than 256 CPU numbers,
   instead of truncating the most significant bits.

- Improve THP splitting required by qemu processes by making use of
   walk_page_vma() instead of calling follow_page() for every single
   page within each vma.

- Add missing ZCRYPT dependency to VFIO_AP to fix potential compile
   problems.

- Remove not required select CLOCKSOURCE_VALIDATE_LAST_CYCLE again.

- Set node distance to LOCAL_DISTANCE instead of 0, since e.g. libnuma
   translates a node distance of 0 to "no NUMA support available".

- Couple of other minor fixes and improvements.

* tag 's390-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/numa: move code to arch/s390/kernel
  s390/time: remove select CLOCKSOURCE_VALIDATE_LAST_CYCLE again
  s390/debug: debug feature version 3
  s390/Kconfig: add missing ZCRYPT dependency to VFIO_AP
  s390/numa: set node distance to LOCAL_DISTANCE
  s390/pkey: remove redundant variable initialization
  s390/test_unwind: fix possible memleak in test_unwind()
  s390/gmap: improve THP splitting
  s390/atomic: circumvent gcc 10 build regression

Merge tag 'for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull more btrfs updates from David Sterba:
"One minor update, the rest are fixes that have arrived a bit late for
  the first batch. There are also some recent fixes for bugs that were
  discovered during the merge window and pop up during testing.

  User visible change:

   - show correct subvolume path in /proc/mounts for bind mounts

  Fixes:

   - fix compression messages when remounting with different level or
     compression algorithm

   - tree-log: fix some memory leaks on error handling paths

   - restore I_VERSION on remount

   - fix return values and error code mixups

   - fix umount crash with quotas enabled when removing sysfs files

   - fix trim range on a shrunk device"

* tag 'for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: trim: fix underflow in trim length to prevent access beyond device boundary
  btrfs: fix return value mixup in btrfs_get_extent
  btrfs: sysfs: fix NULL pointer dereference at btrfs_sysfs_del_qgroups()
  btrfs: check correct variable after allocation in btrfs_backref_iter_alloc
  btrfs: make sure SB_I_VERSION doesn't get unset by remount
  btrfs: fix memory leaks after failure to lookup checksums during inode logging
  btrfs: don't show full path of bind mounts in subvol=
  btrfs: fix messages after changing compression level by remount
  btrfs: only search for left_info if there is no right_info in try_merge_free_space
  btrfs: inode: fix NULL pointer dereference if inode doesn't need compression

Merge tag 'xfs-5.9-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
"Two small fixes that have come in during the past week:

   - Fix duplicated words in comments

   - Fix an ubsan complaint about null pointer arithmetic"

* tag 'xfs-5.9-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: Fix UBSAN null-ptr-deref in xfs_sysfs_init
  xfs: delete duplicated words + other fixes

Merge tag 'exfat-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat

Pull exfat updates from Namjae Jeon:

- don't clear MediaFailure and VolumeDirty bit in volume flags if these
   were already set before mounting

- write multiple dirty buffers at once in sync mode

- remove unneeded EXFAT_SB_DIRTY bit set

* tag 'exfat-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: retain 'VolumeFlags' properly
  exfat: optimize exfat_zeroed_cluster()
  exfat: add error check when updating dir-entries
  exfat: write multiple sectors at once
  exfat: remove EXFAT_SB_DIRTY flag

mm: memcontrol: fix warning when allocating the root cgroup

Commit 9209067f96cf ("mm: memcg: charge memcg percpu memory to the
parent cgroup") adds memory tracking to the memcg kernel structures
themselves to make cgroups liable for the memory they are consuming
through the allocation of child groups (which can be significant).

This code is a bit awkward as it's spread out through several functions:
The outermost function does memalloc_use_memcg(parent) to set up
current->active_memcg, which designates which cgroup to charge, and the
inner functions pass GFP_ACCOUNT to request charging for specific
allocations.  To make sure this dependency is satisfied at all times -
to make sure we don't randomly charge whoever is calling the functions -
the inner functions warn on !current->active_memcg.

However, this triggers a false warning when the root memcg itself is
allocated.  No parent exists in this case, and so current->active_memcg
is rightfully NULL.  It's a false positive, not indicative of a bug.

Delete the warnings for now, we can revisit this later.

Fixes: 9209067f96cf ("mm: memcg: charge memcg percpu memory to the parent cgroup")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drm/xen-front: Pass dumb buffer data offset to the backend

While importing a dmabuf it is possible that the data of the buffer
is put with offset which is indicated by the SGT offset.
Respect the offset value and forward it to the backend.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Juergen Gross <jgross@suse.com>

xen: Sync up with the canonical protocol definition in Xen

This is the sync up with the canonical definition of the
display protocol in Xen.

1. Add protocol version as an integer

Version string, which is in fact an integer, is hard to handle in the
code that supports different protocol versions. To simplify that
also add the version as an integer.

2. Pass buffer offset with XENDISPL_OP_DBUF_CREATE

There are cases when display data buffer is created with non-zero
offset to the data start. Handle such cases and provide that offset
while creating a display buffer.

3. Add XENDISPL_OP_GET_EDID command

Add an optional request for reading Extended Display Identification
Data (EDID) structure which allows better configuration of the
display connectors over the configuration set in XenStore.
With this change connectors may have multiple resolutions defined
with respect to detailed timing definitions and additional properties
normally provided by displays.

If this request is not supported by the backend then visible area
is defined by the relevant XenStore's "resolution" property.

If backend provides extended display identification data (EDID) with
XENDISPL_OP_GET_EDID request then EDID values must take precedence
over the resolutions defined in XenStore.

4. Bump protocol version to 2.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20200813062113.11030-5-andr2000@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>

drm/xen-front: Add YUYV to supported formats

Add YUYV to supported formats, so the frontend can work with the
formats used by cameras and other HW.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Link: https://lore.kernel.org/r/20200813062113.11030-4-andr2000@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>

drm/xen-front: Fix misused IS_ERR_OR_NULL checks

The patch 33f79d15f6b5: "drm/xen-front: Add support for Xen PV
display frontend" from Apr 3, 2018, leads to the following static
checker warning:

drivers/gpu/drm/xen/xen_drm_front_gem.c:140 xen_drm_front_gem_create()
warn: passing zero to 'ERR_CAST'

drivers/gpu/drm/xen/xen_drm_front_gem.c
   133  struct drm_gem_object *xen_drm_front_gem_create(struct drm_device *dev,
   134                                                  size_t size)
   135  {
   136          struct xen_gem_object *xen_obj;
   137
   138          xen_obj = gem_create(dev, size);
   139          if (IS_ERR_OR_NULL(xen_obj))
   140                  return ERR_CAST(xen_obj);

Fix this and the rest of misused places with IS_ERR_OR_NULL in the
driver.

Fixes: 33f79d15f6b5: "drm/xen-front: Add support for Xen PV display frontend"
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200813062113.11030-3-andr2000@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>

xen/gntdev: Fix dmabuf import with non-zero sgt offset

It is possible that the scatter-gather table during dmabuf import has
non-zero offset of the data, but user-space doesn't expect that.
Fix this by failing the import, so user-space doesn't access wrong data.

Fixes: e1386868affe ("xen/gntdev: Implement dma-buf import functionality")
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200813062113.11030-2-andr2000@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>

crypto: algif_aead - fix uninitialized ctx->init

In skcipher_accept_parent_nokey() the whole af_alg_ctx structure is
cleared by memset() after allocation, so add such memset() also to
aead_accept_parent_nokey() so that the new "init" field is also
initialized to zero. Without that the initial ctx->init checks might
randomly return true and cause errors.

While there, also remove the redundant zero assignments in both
functions.

Found via libkcapi testsuite.

Cc: Stephan Mueller <smueller@chronox.de>
Fixes: b7dad8488f83 ("crypto: algif_aead - Only wake up when ctx->more is zero")
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Merge tag 'rtc-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
"Not much this cycle - mostly non urgent driver fixes:

   - ds1374: use watchdog core

   - pcf2127: add alarm and pcf2129 support"

* tag 'rtc-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: pcf2127: fix alarm handling
  rtc: pcf2127: add alarm support
  rtc: pcf2127: add pca2129 device id
  rtc: max77686: Fix wake-ups for max77620
  rtc: ds1307: provide an indication that the watchdog has fired
  rtc: ds1374: remove unused define
  rtc: ds1374: fix RTC_DRV_DS1374_WDT dependencies
  rtc: cleanup obsolete comment about struct rtc_class_ops
  rtc: pl031: fix set_alarm by adding back call to alarm_irq_enable
  rtc: ds1374: wdt: Use watchdog core for watchdog part
  rtc: Replace HTTP links with HTTPS ones
  rtc: goldfish: Enable interrupt in set_alarm() when necessary
  rtc: max77686: Do not allow interrupt to fire before system resume
  rtc: imxdi: fix trivial typos
  rtc: cpcap: fix range

Revert "ipv4: tunnel: fix compilation on ARCH=um"

This reverts commit 66a86880dcc899ad27a7355e1dfd6606241cbceb.

The bug was already fixed, this added a dup include.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: accept an empty mask in /sys/class/net/*/queues/rx-*/rps_cpus

We must accept an empty mask in store_rps_map(), or we are not able
to disable RPS on a queue.

Fixes: 42063ee416e0 ("net: Restrict receive packets queuing to housekeeping CPUs")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Cc: Alex Belits <abelits@marvell.com>
Cc: Nitesh Narayan Lal <nitesh@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-stmmac-Fix-multicast-filter-on-IPQ806x'

Jonathan McDowell says:

====================
net: stmmac: Fix multicast filter on IPQ806x

This pair of patches are the result of discovering a failure to
correctly receive IPv6 multicast packets on such a device (in particular
DHCPv6 requests and RA solicitations). Putting the device into
promiscuous mode, or allmulti, both resulted in such packets correctly
being received. Examination of the vendor driver (nss-gmac from the
qsdk) shows that it does not enable the multicast filter and instead
falls back to allmulti.

Extend the base dwmac1000 driver to fall back when there's no suitable
hardware filter, and update the ipq806x platform to request this.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: stmmac: Disable hardware multicast filter

The IPQ806x does not appear to have a functional multicast ethernet
address filter. This was observed as a failure to correctly receive IPv6
packets on a LAN to the all stations address. Checking the vendor driver
shows that it does not attempt to enable the multicast filter and
instead falls back to receiving all multicast packets, internally
setting ALLMULTI.

Use the new fallback support in the dwmac1000 driver to correctly
achieve the same with the mainline IPQ806x driver. Confirmed to fix IPv6
functionality on an RB3011 router.

Cc: stable@vger.kernel.org
Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: dwmac1000: provide multicast filter fallback

If we don't have a hardware multicast filter available then instead of
silently failing to listen for the requested ethernet broadcast
addresses fall back to receiving all multicast packets, in a similar
fashion to other drivers with no multicast filter.

Cc: stable@vger.kernel.org
Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David S. Miller <davem@davemloft.net>

i2c: iproc: fix race between client unreg and isr

When i2c client unregisters, synchronize irq before setting
iproc_i2c->slave to NULL.

(1) disable_irq()
(2) Mask event enable bits in control reg
(3) Erase slave address (avoid further writes to rx fifo)
(4) Flush tx and rx FIFOs
(5) Clear pending event (interrupt) bits in status reg
(6) enable_irq()
(7) Set client pointer to NULL

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000318

[  371.020421] pc : bcm_iproc_i2c_isr+0x530/0x11f0
[  371.025098] lr : __handle_irq_event_percpu+0x6c/0x170
[  371.030309] sp : ffff800010003e40
[  371.033727] x29: ffff800010003e40 x28: 0000000000000060
[  371.039206] x27: ffff800010ca9de0 x26: ffff800010f895df
[  371.044686] x25: ffff800010f18888 x24: ffff0008f7ff3600
[  371.050165] x23: 0000000000000003 x22: 0000000001600000
[  371.055645] x21: ffff800010f18888 x20: 0000000001600000
[  371.061124] x19: ffff0008f726f080 x18: 0000000000000000
[  371.066603] x17: 0000000000000000 x16: 0000000000000000
[  371.072082] x15: 0000000000000000 x14: 0000000000000000
[  371.077561] x13: 0000000000000000 x12: 0000000000000001
[  371.083040] x11: 0000000000000000 x10: 0000000000000040
[  371.088519] x9 : ffff800010f317c8 x8 : ffff800010f317c0
[  371.093999] x7 : ffff0008f805b3b0 x6 : 0000000000000000
[  371.099478] x5 : ffff0008f7ff36a4 x4 : ffff8008ee43d000
[  371.104957] x3 : 0000000000000000 x2 : ffff8000107d64c0
[  371.110436] x1 : 00000000c00000af x0 : 0000000000000000

[  371.115916] Call trace:
[  371.118439]  bcm_iproc_i2c_isr+0x530/0x11f0
[  371.122754]  __handle_irq_event_percpu+0x6c/0x170
[  371.127606]  handle_irq_event_percpu+0x34/0x88
[  371.132189]  handle_irq_event+0x40/0x120
[  371.136234]  handle_fasteoi_irq+0xcc/0x1a0
[  371.140459]  generic_handle_irq+0x24/0x38
[  371.144594]  __handle_domain_irq+0x60/0xb8
[  371.148820]  gic_handle_irq+0xc0/0x158
[  371.152687]  el1_irq+0xb8/0x140
[  371.155927]  arch_cpu_idle+0x10/0x18
[  371.159615]  do_idle+0x204/0x290
[  371.162943]  cpu_startup_entry+0x24/0x60
[  371.166990]  rest_init+0xb0/0xbc
[  371.170322]  arch_call_rest_init+0xc/0x14
[  371.174458]  start_kernel+0x404/0x430

Fixes: 75a1580dbd54 ("i2c: iproc: Add multi byte read-write support for slave mode")
Signed-off-by: Dhananjay Phadke <dphadke@linux.microsoft.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>

ipv4: tunnel: fix compilation on ARCH=um

With certain configurations, a 64-bit ARCH=um errors
out here with an unknown csum_ipv6_magic() function.
Include the right header file to always have it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vsock: fix potential null pointer dereference in vsock_poll()

syzbot reported this issue where in the vsock_poll() we find the
socket state at TCP_ESTABLISHED, but 'transport' is null:
  general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN
  KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097]
  CPU: 0 PID: 8227 Comm: syz-executor.2 Not tainted 5.8.0-rc7-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  RIP: 0010:vsock_poll+0x75a/0x8e0 net/vmw_vsock/af_vsock.c:1038
  Call Trace:
   sock_poll+0x159/0x460 net/socket.c:1266
   vfs_poll include/linux/poll.h:90 [inline]
   do_pollfd fs/select.c:869 [inline]
   do_poll fs/select.c:917 [inline]
   do_sys_poll+0x607/0xd40 fs/select.c:1011
   __do_sys_poll fs/select.c:1069 [inline]
   __se_sys_poll fs/select.c:1057 [inline]
   __x64_sys_poll+0x18c/0x440 fs/select.c:1057
   do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This issue can happen if the TCP_ESTABLISHED state is set after we read
the vsk->transport in the vsock_poll().

We could put barriers to synchronize, but this can only happen during
connection setup, so we can simply check that 'transport' is valid.

Fixes: 377128ab2681 ("vsock: add multi-transports support")
Reported-and-tested-by: syzbot+a61bac2fcc1a7c6623fe@syzkaller.appspotmail.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: fix ef100 design-param checking

The handling of the RXQ/TXQ size granularity design-params had two
problems: it had a 64-bit divide that didn't build on 32-bit platforms,
and it could divide by zero if the NIC supplied 0 as the value of the
design-param. Fix both by checking for 0 and for a granularity bigger
than our min-size; if the granularity <= EFX_MIN_DMAQ_SIZE then it fits
in 32 bits, so we can cast it to u32 for the divide.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'ceph-for-5.9-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
"Xiubo has completed his work on filesystem client metrics, they are
  sent to all available MDSes once per second now.

  Other than that, we have a lot of fixes and cleanups all around the
  filesystem, including a tweak to cut down on MDS request resends in
  multi-MDS setups from Yanhu and fixups for SELinux symlink labeling
  and MClientSession message decoding from Jeff"

* tag 'ceph-for-5.9-rc1' of git://github.com/ceph/ceph-client: (22 commits)
  ceph: handle zero-length feature mask in session messages
  ceph: use frag's MDS in either mode
  ceph: move sb->wb_pagevec_pool to be a global mempool
  ceph: set sec_context xattr on symlink creation
  ceph: remove redundant initialization of variable mds
  ceph: fix use-after-free for fsc->mdsc
  ceph: remove unused variables in ceph_mdsmap_decode()
  ceph: delete repeated words in fs/ceph/
  ceph: send client provided metric flags in client metadata
  ceph: periodically send perf metrics to MDSes
  ceph: check the sesion state and return false in case it is closed
  libceph: replace HTTP links with HTTPS ones
  ceph: remove unnecessary cast in kfree()
  libceph: just have osd_req_op_init() return a pointer
  ceph: do not access the kiocb after aio requests
  ceph: clean up and optimize ceph_check_delayed_caps()
  ceph: fix potential mdsc use-after-free crash
  ceph: switch to WARN_ON_ONCE in encode_supported_features()
  ceph: add global total_caps to count the mdsc's total caps number
  ceph: add check_session_state() helper and make it global
  ...

Merge branch 'parisc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull more parisc updates from Helge Deller:

- Oscar Carter contributed a patch which fixes parisc's usage of
   dereference_function_descriptor() and thus will allow using the
   -Wcast-function-type compiler option in the top-level Makefile

- Sven Schnelle fixed a bug in the SBA code to prevent crashes during
   kexec

- John David Anglin provided implementations for __smp_store_release()
   and __smp_load_acquire barriers() which avoids using the sync
   assembler instruction and thus speeds up barrier paths

- Some whitespace cleanups in parisc's atomic.h header file

* 'parisc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Implement __smp_store_release and __smp_load_acquire barriers
  parisc: mask out enable and reserved bits from sba imask
  parisc: Whitespace cleanups in atomic.h
  parisc/kernel/ftrace: Remove function callback casts
  sections.h: dereference_function_descriptor() returns void pointer

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more KVM updates from Paolo Bonzini:
"PPC:
   - Improvements and bugfixes for secure VM support, giving reduced
     startup time and memory hotplug support.

   - Locking fixes in nested KVM code

   - Increase number of guests supported by HV KVM to 4094

   - Preliminary POWER10 support

  ARM:
   - Split the VHE and nVHE hypervisor code bases, build the EL2 code
     separately, allowing for the VHE code to now be built with
     instrumentation

   - Level-based TLB invalidation support

   - Restructure of the vcpu register storage to accomodate the NV code

   - Pointer Authentication available for guests on nVHE hosts

   - Simplification of the system register table parsing

   - MMU cleanups and fixes

   - A number of post-32bit cleanups and other fixes

  MIPS:
   - compilation fixes

  x86:
   - bugfixes

   - support for the SERIALIZE instruction"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (70 commits)
  KVM: MIPS/VZ: Fix build error caused by 'kvm_run' cleanup
  x86/kvm/hyper-v: Synic default SCONTROL MSR needs to be enabled
  MIPS: KVM: Convert a fallthrough comment to fallthrough
  MIPS: VZ: Only include loongson_regs.h for CPU_LOONGSON64
  x86: Expose SERIALIZE for supported cpuid
  KVM: x86: Don't attempt to load PDPTRs when 64-bit mode is enabled
  KVM: arm64: Move S1PTW S2 fault logic out of io_mem_abort()
  KVM: arm64: Don't skip cache maintenance for read-only memslots
  KVM: arm64: Handle data and instruction external aborts the same way
  KVM: arm64: Rename kvm_vcpu_dabt_isextabt()
  KVM: arm: Add trace name for ARM_NISV
  KVM: arm64: Ensure that all nVHE hyp code is in .hyp.text
  KVM: arm64: Substitute RANDOMIZE_BASE for HARDEN_EL2_VECTORS
  KVM: arm64: Make nVHE ASLR conditional on RANDOMIZE_BASE
  KVM: PPC: Book3S HV: Rework secure mem slot dropping
  KVM: PPC: Book3S HV: Move kvmppc_svm_page_out up
  KVM: PPC: Book3S HV: Migrate hot plugged memory
  KVM: PPC: Book3S HV: In H_SVM_INIT_DONE, migrate remaining normal-GFNs to secure-GFNs
  KVM: PPC: Book3S HV: Track the state GFNs associated with secure VMs
  KVM: PPC: Book3S HV: Disable page merging in H_SVM_INIT_START
  ...

Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull more clk updates from Stephen Boyd:
"Here's some more updates that missed the last pull request because I
  happened to tag the tree at an earlier point in the history of
  clk-next. I must have fat fingered it and checked out an older version
  of clk-next on this second computer I'm using.

  This time it actually includes more code for Qualcomm SoCs, the AT91
  major updates, and some Rockchip SoC clk driver updates as well. I've
  corrected this flow so this shouldn't happen again"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (83 commits)
  clk: bcm2835: Do not use prediv with bcm2711's PLLs
  clk: drop unused function __clk_get_flags
  clk: hsdk: Fix bad dependency on IOMEM
  dt-bindings: clock: Fix YAML schemas for LPASS clocks on SC7180
  clk: mmp: avoid missing prototype warning
  clk: sparx5: Add Sparx5 SoC DPLL clock driver
  dt-bindings: clock: sparx5: Add bindings include file
  clk: qoriq: add LS1021A core pll mux options
  clk: clk-atlas6: fix return value check in atlas6_clk_init()
  clk: tegra: pll: Improve PLLM enable-state detection
  clk: X1000: Add support for calculat REFCLK of USB PHY.
  clk: JZ4780: Reformat the code to align it.
  clk: JZ4780: Add functions for enable and disable USB PHY.
  clk: Ingenic: Add RTC related clocks for Ingenic SoCs.
  dt-bindings: clock: Add tabs to align code.
  dt-bindings: clock: Add RTC related clocks for Ingenic SoCs.
  clk: davinci: Use fallthrough pseudo-keyword
  clk: imx: Use fallthrough pseudo-keyword
  clk: qcom: gcc-sdm660: Fix up gcc_mss_mnoc_bimc_axi_clk
  clk: qcom: gcc-sdm660: Add missing modem reset
  ...

Merge tag 'linux-watchdog-5.9-rc1' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:

- f71808e_wdt imporvements

- dw_wdt improvements

- mlx-wdt: support new watchdog type with longer timeout period

- fallthrough pseudo-keyword replacements

- overall small fixes and improvements

* tag 'linux-watchdog-5.9-rc1' of git://www.linux-watchdog.org/linux-watchdog: (35 commits)
  watchdog: rti-wdt: balance pm runtime enable calls
  watchdog: rti-wdt: attach to running watchdog during probe
  watchdog: add support for adjusting last known HW keepalive time
  watchdog: use __watchdog_ping in startup
  watchdog: softdog: Add options 'soft_reboot_cmd' and 'soft_active_on_boot'
  watchdog: pcwd_usb: remove needless check before usb_free_coherent()
  watchdog: Replace HTTP links with HTTPS ones
  dt-bindings: watchdog: renesas,wdt: Document r8a774e1 support
  watchdog: initialize device before misc_register
  watchdog: booke_wdt: Add common nowayout parameter driver
  watchdog: scx200_wdt: Use fallthrough pseudo-keyword
  watchdog: Use fallthrough pseudo-keyword
  watchdog: f71808e_wdt: do stricter parameter validation
  watchdog: f71808e_wdt: clear watchdog timeout occurred flag
  watchdog: f71808e_wdt: remove use of wrong watchdog_info option
  watchdog: f71808e_wdt: indicate WDIOF_CARDRESET support in watchdog_info.options
  docs: watchdog: codify ident.options as superset of possible status flags
  dt-bindings: watchdog: Add compatible for QCS404, SC7180, SDM845, SM8150
  dt-bindings: watchdog: Convert QCOM watchdog timer bindings to YAML
  watchdog: dw_wdt: Add DebugFS files
  ...

Merge tag 'vfio-v5.9-rc1' of git://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

- Inclusive naming updates (Alex Williamson)

- Intel X550 INTx quirk (Alex Williamson)

- Error path resched between unmaps (Xiang Zheng)

- SPAPR IOMMU pin_user_pages() conversion (John Hubbard)

- Trivial mutex simplification (Alex Williamson)

- QAT device denylist (Giovanni Cabiddu)

- type1 IOMMU ioctl refactor (Liu Yi L)

* tag 'vfio-v5.9-rc1' of git://github.com/awilliam/linux-vfio:
  vfio/type1: Refactor vfio_iommu_type1_ioctl()
  vfio/pci: Add QAT devices to denylist
  vfio/pci: Add device denylist
  PCI: Add Intel QuickAssist device IDs
  vfio/pci: Hold igate across releasing eventfd contexts
  vfio/spapr_tce: convert get_user_pages() --> pin_user_pages()
  vfio/type1: Add conditional rescheduling after iommu map failed
  vfio/pci: Add Intel X550 to hidden INTx devices
  vfio: Cleanup allowed driver naming

Merge tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"This has a few vmwgfx regression fixes we hit from the merge window
  (one in TTM), it also has a bunch of amdgpu fixes along with a
  scattering everywhere else.

  core:
   - Fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi
   - Remove null check for kfree in drm_dev_release.
   - Fix DRM_FORMAT_MOD_AMLOGIC_FBC definition.
   - re-added docs for drm_gem_flink_ioctl()
   - add orientation quirk for ASUS T103HAF

  ttm:
   - ttm: fix page-offset calculation within TTM
   - revert patch causing vmwgfx regressions

  fbcon:
   - Fix a fbcon OOB read in fbdev, found by syzbot.

  vga:
   - Mark vga_tryget static as it's not used elsewhere.

  amdgpu:
   - Re-add spelling typo fix
   - Sienna Cichlid fixes
   - Navy Flounder fixes
   - DC fixes
   - SMU i2c fix
   - Power fixes

  vmwgfx:
   - regression fixes for modesetting crashes
   - misc fixes

  xlnx:
   - Small fixes to xlnx.

  omap:
   - Fix mode initialization in omap_connector_mode_valid().
   - force runtime PM suspend on system suspend

  tidss:
   - fix modeset init for DPI panels"

* tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drm: (70 commits)
  drm/ttm: revert "drm/ttm: make TT creation purely optional v3"
  drm/vmwgfx: fix spelling mistake "Cant" -> "Can't"
  drm/vmwgfx: fix spelling mistake "Cound" -> "Could"
  drm/vmwgfx/ldu: Use drm_mode_config_reset
  drm/vmwgfx/sou: Use drm_mode_config_reset
  drm/vmwgfx/stdu: Use drm_mode_config_reset
  drm/vmwgfx: Fix two list_for_each loop exit tests
  drm/vmwgfx: Use correct vmw_legacy_display_unit pointer
  drm/vmwgfx: Use struct_size() helper
  drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume
  drm/amd/powerplay: put VCN/JPEG into PG ungate state before dpm table setup(V3)
  drm/amd/powerplay: update swSMU VCN/JPEG PG logics
  drm/amdgpu: use mode1 reset by default for sienna_cichlid
  drm/amdgpu/smu: rework i2c adpater registration
  drm/amd/display: Display goes blank after inst
  drm/amd/display: Change null plane state swizzle mode to 4kb_s
  drm/amd/display: Use helper function to check for HDMI signal
  drm/amd/display: AMD OUI (DPCD 0x00300) skipped on some sink
  drm/amd/display: Fix logger context
  drm/amd/display: populate new dml variable
  ...

Merge branch 'akpm' (patches from Andrew)

Merge more updates from Andrew Morton:

- most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction,
   mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util,
   memory-hotplug, cleanups, uaccess, migration, gup, pagemap),

- various other subsystems (alpha, misc, sparse, bitmap, lib, bitops,
   checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump,
   exec, kdump, rapidio, panic, kcov, kgdb, ipc).

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits)
  mm/gup: remove task_struct pointer for all gup code
  mm: clean up the last pieces of page fault accountings
  mm/xtensa: use general page fault accounting
  mm/x86: use general page fault accounting
  mm/sparc64: use general page fault accounting
  mm/sparc32: use general page fault accounting
  mm/sh: use general page fault accounting
  mm/s390: use general page fault accounting
  mm/riscv: use general page fault accounting
  mm/powerpc: use general page fault accounting
  mm/parisc: use general page fault accounting
  mm/openrisc: use general page fault accounting
  mm/nios2: use general page fault accounting
  mm/nds32: use general page fault accounting
  mm/mips: use general page fault accounting
  mm/microblaze: use general page fault accounting
  mm/m68k: use general page fault accounting
  mm/ia64: use general page fault accounting
  mm/hexagon: use general page fault accounting
  mm/csky: use general page fault accounting
  ...

mm/gup: remove task_struct pointer for all gup code

After the cleanup of page fault accounting, gup does not need to pass
task_struct around any more. Remove that parameter in the whole gup
stack.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: clean up the last pieces of page fault accountings

Here're the last pieces of page fault accounting that were still done
outside handle_mm_fault() where we still have regs==NULL when calling
handle_mm_fault():

arch/powerpc/mm/copro_fault.c:   copro_handle_mm_fault
arch/sparc/mm/fault_32.c:        force_user_fault
arch/um/kernel/trap.c:           handle_page_fault
mm/gup.c:                        faultin_page
                                 fixup_user_fault
mm/hmm.c:                        hmm_vma_fault
mm/ksm.c:                        break_ksm

Some of them has the issue of duplicated accounting for page fault
retries.  Some of them didn't do the accounting at all.

This patch cleans all these up by letting handle_mm_fault() to do per-task
page fault accounting even if regs==NULL (though we'll still skip the perf
event accountings).  With that, we can safely remove all the outliers now.

There's another functional change in that now we account the page faults
to the caller of gup, rather than the task_struct that passed into the gup
code.  More information of this can be found at [1].

After this patch, below things should never be touched again outside
handle_mm_fault():

  - task_struct.[maj|min]_flt
  - PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]

[1] https://lore.kernel.org/lkml/CAHk-=wj_V2Tps2QrMn20_W0OJF9xqNh52XSGA42s-ZJ8Y+GyKw@mail.gmail.com/

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200707225021.200906-25-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/xtensa: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Remove the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf events because it's
now also done in handle_mm_fault().

Move the PERF_COUNT_SW_PAGE_FAULTS event higher before taking mmap_sem for
the fault, then it'll match with the rest of the archs.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Link: http://lkml.kernel.org/r/20200707225021.200906-24-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/x86: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-23-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/sparc64: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David S. Miller <davem@davemloft.net>
Link: http://lkml.kernel.org/r/20200707225021.200906-22-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/sparc32: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David S. Miller <davem@davemloft.net>
Link: http://lkml.kernel.org/r/20200707225021.200906-21-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/sh: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Link: http://lkml.kernel.org/r/20200707225021.200906-20-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/s390: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-19-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/riscv: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Link: http://lkml.kernel.org/r/20200707225021.200906-18-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/powerpc: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20200707225021.200906-17-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/parisc: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Link: http://lkml.kernel.org/r/20200707225021.200906-16-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/openrisc: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Stafford Horne <shorne@gmail.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Link: http://lkml.kernel.org/r/20200707225021.200906-15-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/nios2: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-14-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/nds32: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Fix PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault retries,
by moving it before taking mmap_sem.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Greentime Hu <green.hu@gmail.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-13-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/mips: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Fix PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault retries,
by moving it before taking mmap_sem.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: http://lkml.kernel.org/r/20200707225021.200906-12-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/microblaze: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Simek <monstr@monstr.eu>
Link: http://lkml.kernel.org/r/20200707225021.200906-11-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/m68k: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Link: http://lkml.kernel.org/r/20200707225021.200906-10-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/ia64: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-9-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/hexagon: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Brian Cain <bcain@codeaurora.org>
Link: http://lkml.kernel.org/r/20200707225021.200906-8-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/csky: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Guo Ren <guoren@kernel.org>
Link: http://lkml.kernel.org/r/20200707225021.200906-7-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/arm64: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened. To do this, we pass pt_regs
pointer into __do_page_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-6-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/arm: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened. To do this, we need to pass
the pt_regs pointer into __do_page_fault().

Fix PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault retries,
by moving it before taking mmap_sem.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200707225021.200906-5-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/arc: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Fix PERF_COUNT_SW_PAGE_FAULTS perf event manually for page fault retries,
by moving it before taking mmap_sem.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-4-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/alpha: use general page fault accounting

Use the general page fault accounting by passing regs into
handle_mm_fault().

Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the
other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in
handle_mm_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-3-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: do page fault accounting in handle_mm_fault

Patch series "mm: Page fault accounting cleanups", v5.

This is v5 of the pf accounting cleanup series.  It originates from Gerald
Schaefer's report on an issue a week ago regarding to incorrect page fault
accountings for retried page fault after commit 28f57e5a325c ("mm: allow
VM_FAULT_RETRY for multiple times"):

  https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

What this series did:

  - Correct page fault accounting: we do accounting for a page fault
    (no matter whether it's from #PF handling, or gup, or anything else)
    only with the one that completed the fault.  For example, page fault
    retries should not be counted in page fault counters.  Same to the
    perf events.

  - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an adhoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g.  errornous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): there're still quite some archs that have not enabled
    this perf event.

    Since this series will touch merely all the archs, we unify this
    perf event to always follow case (1), which is the one that makes most
    sense.  And since we moved the accounting into handle_mm_fault, the
    other two MAJ/MIN perf events are well taken care of naturally.

  - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR).  More information in patch 1.

  - Always account the page fault onto the one that triggered the page
    fault.  This does not matter much for #PF handlings, but mostly for
    gup.  More information on this in patch 25.

Patchset layout:

Patch 1:     Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23:  Enable the new accounting for arch #PF handlers one by one.
Patch 24:    Enable the new accounting for the rest outliers (gup, iommu, etc.)
Patch 25:    Cleanup GUP task_struct pointer since it's not needed any more

This patch (of 25):

This is a preparation patch to move page fault accountings into the
general code in handle_mm_fault().  This includes both the per task
flt_maj/flt_min counters, and the major/minor page fault perf events.  To
do this, the pt_regs pointer is passed into handle_mm_fault().

PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
handlers.

So far, all the pt_regs pointer that passed into handle_mm_fault() is
NULL, which means this patch should have no intented functional change.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/gup: use a standard migration target allocation callback

There is a well-defined migration target allocation callback. Use it.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Link: http://lkml.kernel.org/r/1596180906-8442-3-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/hugetlb: make hugetlb migration callback CMA aware

new_non_cma_page() in gup.c requires to allocate the new page that is not
on the CMA area.  new_non_cma_page() implements it by using allocation
scope APIs.

However, there is a work-around for hugetlb.  Normal hugetlb page
allocation API for migration is alloc_huge_page_nodemask().  It consists
of two steps.  First is dequeing from the pool.  Second is, if there is no
available page on the queue, allocating by using the page allocator.

new_non_cma_page() can't use this API since first step (deque) isn't aware
of scope API to exclude CMA area.  So, new_non_cma_page() exports hugetlb
internal function for the second step, alloc_migrate_huge_page(), to
global scope and uses it directly.  This is suboptimal since hugetlb pages
on the queue cannot be utilized.

This patch tries to fix this situation by making the deque function on
hugetlb CMA aware.  In the deque function, CMA memory is skipped if
PF_MEMALLOC_NOCMA flag is found.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Link: http://lkml.kernel.org/r/1596180906-8442-2-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/gup: restrict CMA region by using allocation scope API

We have well defined scope API to exclude CMA region.  Use it rather than
manipulating gfp_mask manually.  With this change, we can now restore
__GFP_MOVABLE for gfp_mask like as usual migration target allocation.  It
would result in that the ZONE_MOVABLE is also searched by page allocator.
For hugetlb, gfp_mask is redefined since it has a regular allocation mask
filter for migration target.  __GPF_NOWARN is added to hugetlb gfp_mask
filter since a new user for gfp_mask filter, gup, want to be silent when
allocation fails.

Note that this can be considered as a fix for the commit 9fd8488f295d
("mm: update get_user_pages_longterm to migrate pages allocated from CMA
region").  However, "Fixes" tag isn't added here since it is just
suboptimal but it doesn't cause any problem.

Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Link: http://lkml.kernel.org/r/1596180906-8442-1-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/page_alloc: remove a wrapper for alloc_migration_target()

There is a well-defined standard migration target callback. Use it
directly.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Link: http://lkml.kernel.org/r/1594622517-20681-8-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/mempolicy: use a standard migration target allocation callback

There is a well-defined migration target allocation callback. Use it.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Link: http://lkml.kernel.org/r/1594622517-20681-7-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/migrate: introduce a standard migration target allocation function

There are some similar functions for migration target allocation.  Since
there is no fundamental difference, it's better to keep just one rather
than keeping all variants.  This patch implements base migration target
allocation function.  In the following patches, variants will be converted
to use this function.

Changes should be mechanical, but, unfortunately, there are some
differences.  First, some callers' nodemask is assgined to NULL since NULL
nodemask will be considered as all available nodes, that is,
&node_states[N_MEMORY].  Second, for hugetlb page allocation, gfp_mask is
redefined as regular hugetlb allocation gfp_mask plus __GFP_THISNODE if
user provided gfp_mask has it.  This is because future caller of this
function requires to set this node constaint.  Lastly, if provided nodeid
is NUMA_NO_NODE, nodeid is set up to the node where migration source
lives.  It helps to remove simple wrappers for setting up the nodeid.

Note that PageHighmem() call in previous function is changed to open-code
"is_highmem_idx()" since it provides more readability.

[akpm@linux-foundation.org: tweak patch title, per Vlastimil]
[akpm@linux-foundation.org: fix typo in comment]

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Link: http://lkml.kernel.org/r/1594622517-20681-6-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/migrate: clear __GFP_RECLAIM to make the migration callback consistent with regular THP allocations

new_page_nodemask is a migration callback and it tries to use a common gfp
flags for the target page allocation whether it is a base page or a THP.
The later only adds GFP_TRANSHUGE to the given mask.  This results in the
allocation being slightly more aggressive than necessary because the
resulting gfp mask will contain also __GFP_RECLAIM_KSWAPD.  THP
allocations usually exclude this flag to reduce over eager background
reclaim during a high THP allocation load which has been seen during large
mmaps initialization.  There is no indication that this is a problem for
migration as well but theoretically the same might happen when migrating
large mappings to a different node.  Make the migration callback
consistent with regular THP allocations.

[akpm@linux-foundation.org: fix comment typo, per Vlastimil]

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Link: http://lkml.kernel.org/r/1594622517-20681-5-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/hugetlb: unify migration callbacks

There is no difference between two migration callback functions,
alloc_huge_page_node() and alloc_huge_page_nodemask(), except
__GFP_THISNODE handling.  It's redundant to have two almost similar
functions in order to handle this flag.  So, this patch tries to remove
one by introducing a new argument, gfp_mask, to
alloc_huge_page_nodemask().

After introducing gfp_mask argument, it's caller's job to provide correct
gfp_mask.  So, every callsites for alloc_huge_page_nodemask() are changed
to provide gfp_mask.

Note that it's safe to remove a node id check in alloc_huge_page_node()
since there is no caller passing NUMA_NO_NODE as a node id.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Roman Gushchin <guro@fb.com>
Link: http://lkml.kernel.org/r/1594622517-20681-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>