Florian Westphal [Thu, 19 May 2022 22:02:04 +0000 (00:02 +0200)]
netfilter: conntrack: re-fetch conntrack after insertion
In case the conntrack is clashing, insertion can free skb->_nfct and
set skb->_nfct to the already-confirmed entry.
This wasn't found before because the conntrack entry and the extension
space used to be freed after an rcu grace period, plus the race needs
events enabled to trigger.
Reported-by: <syzbot+793a590957d9c1b96620@syzkaller.appspotmail.com>
Fixes: 82ae35d224ae ("netfilter: conntrack: introduce clash resolution on insertion race")
Fixes: 0b331c716b85 ("netfilter: conntrack: free extension area immediately")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Miaoqian Lin [Thu, 26 May 2022 14:52:08 +0000 (18:52 +0400)]
net: dsa: mv88e6xxx: Fix refcount leak in mv88e6xxx_mdios_register
of_get_child_by_name() returns a node pointer with its refcount
incremented; we should call of_node_put() on it when done.
mv88e6xxx_mdio_register() passes the device node to of_mdiobus_register().
We don't need the device node after that.
Add missing of_node_put() to avoid refcount leak.
Fixes: b34277b7205d ("net: dsa: mv88e6xxx: Support multiple MDIO busses")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
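The get/put pairing described above can be sketched in user space. This is only an illustration: struct node, node_get(), node_put(), register_bus() and register_external_bus() are hypothetical stand-ins, not the kernel's device-tree API or the driver's code.

```c
#include <stddef.h>

/* Hypothetical user-space stand-in for a refcounted device-tree node. */
struct node {
	int refcount;
};

/* Models a lookup that returns the child with its refcount raised. */
static struct node *node_get(struct node *child)
{
	if (child)
		child->refcount++;
	return child;
}

/* Models dropping one reference; NULL is accepted. */
static void node_put(struct node *child)
{
	if (child)
		child->refcount--;
}

/* Stand-in for a registration step that only needs the node while it runs. */
static int register_bus(const struct node *child)
{
	return child ? 0 : -1;
}

/* Look up the child, use it, and release the reference on every path. */
static int register_external_bus(struct node *child)
{
	struct node *np = node_get(child);
	int err;

	if (!np)
		return 0;	/* nothing to register */

	err = register_bus(np);
	node_put(np);		/* balanced on success and on failure alike */
	return err;
}
```

After register_external_bus() returns, the node's refcount is back at the value it had before the call, which is the property the missing put broke.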
Miaoqian Lin [Thu, 26 May 2022 08:52:08 +0000 (12:52 +0400)]
net: ethernet: ti: am65-cpsw-nuss: Fix some refcount leaks
of_get_child_by_name() returns a node pointer with its refcount
incremented; we should call of_node_put() on it when it is no longer needed.
am65_cpsw_init_cpts() and am65_cpsw_nuss_probe() don't release
the refcount in the error case.
Add missing of_node_put() to avoid refcount leak.
Fixes: 1448b012fa6c ("net: ethernet: ti: am65-cpsw-nuss: enable packet timestamping support")
Fixes: e7364a21077b ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 26 May 2022 08:02:42 +0000 (11:02 +0300)]
net: ethernet: mtk_eth_soc: out of bounds read in mtk_hwlro_get_fdir_entry()
The "fsp->location" variable comes from the user via ethtool_get_rxnfc().
Check that it is valid to prevent an out of bounds read.
Fixes: 754bb21bfd99 ("net: ethernet: mediatek: add ethtool functions to configure RX flows of HW LRO")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
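A minimal sketch of validating a user-supplied index before it is used to read a table. The table layout, LRO_TABLE_SIZE and lro_get_entry() are assumptions made for illustration; they are not taken from the mtk_eth_soc driver.

```c
#include <errno.h>
#include <stddef.h>

#define LRO_TABLE_SIZE 2	/* illustrative size, not the driver's constant */

/* Hypothetical table indexed by a user-controlled "location". */
struct lro_table {
	unsigned int ip[LRO_TABLE_SIZE];
};

/*
 * Copy one entry out of the table. The index is rejected before it is
 * used, so an out-of-range value can never reach the array access.
 */
static int lro_get_entry(const struct lro_table *table, unsigned int location,
			 unsigned int *out)
{
	if (location >= LRO_TABLE_SIZE)
		return -EINVAL;

	*out = table->ip[location];
	return 0;
}
```

On a rejected index the output variable is left untouched, so the caller only sees data that really came from the table.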
Vincent Ray [Thu, 26 May 2022 00:17:46 +0000 (17:17 -0700)]
net: sched: fixed barrier to prevent skbuff sticking in qdisc backlog
In qdisc_run_begin(), smp_mb__before_atomic() used before test_bit()
does not provide any ordering guarantee as test_bit() is not an atomic
operation. This, added to the fact that the spin_trylock() call at
the beginning of qdisc_run_begin() does not guarantee acquire
semantics if it does not grab the lock, makes it possible for the
following statement:
if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
to be executed before an enqueue operation called before
qdisc_run_begin().
In the above scenario, CPU 1 and CPU 2 both try to grab the
qdisc->seqlock at the same time. Only CPU 2 succeeds and enters the
bypass code path, where it emits its skb then calls __qdisc_run().
CPU 1 fails, sets MISSED and goes down the traditional enqueue() +
dequeue() code path. But when executing qdisc_run_begin() for the
second time, after enqueuing its skbuff, it sees the MISSED bit still
set (by itself) and consequently chooses to exit early without setting
it again nor trying to grab the spinlock again.
Meanwhile CPU 2 has seen MISSED = 1, cleared it, checked the queue
and found it empty, so it returned.
At the end of the sequence, we end up with skb1 enqueued in the
backlog, both CPUs out of __dev_xmit_skb(), the MISSED bit not set,
and no __netif_schedule() call made. skb1 will now linger in the
qdisc until somebody later performs a full __qdisc_run(). Combined
with the bypass capability of the qdisc, and the ability of the TCP layer
to avoid resending packets which it knows are still in the qdisc, this
can lead to serious traffic "holes" in a TCP connection.
We fix this by replacing the smp_mb__before_atomic() / test_bit() /
set_bit() / smp_mb__after_atomic() sequence inside qdisc_run_begin()
with a single test_and_set_bit() call, which is more concise and
enforces the needed memory barriers.
Fixes: e19f219577e6 ("net: sched: add barrier to ensure correct ordering for lockless qdisc")
Signed-off-by: Vincent Ray <vray@kalrayinc.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220526001746.2437669-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
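The idea of folding a separate test and set into one atomic read-modify-write can be sketched with C11 atomics in user space. This is an analogy only: STATE_MISSED and test_and_set_missed() are hypothetical names, and atomic_fetch_or() with its default sequentially consistent ordering merely stands in for the kernel's test_and_set_bit().

```c
#include <stdatomic.h>
#include <stdbool.h>

#define STATE_MISSED 0x1u	/* hypothetical flag bit */

/*
 * Set the flag and report whether it was already set, as one atomic
 * read-modify-write. There is no window between the test and the set,
 * and the default memory order of atomic_fetch_or() is sequentially
 * consistent, so no separate barrier calls are needed around it.
 */
static bool test_and_set_missed(atomic_uint *state)
{
	unsigned int old = atomic_fetch_or(state, STATE_MISSED);

	return (old & STATE_MISSED) != 0;
}
```

A caller can exit early when the function returns true (somebody already flagged the miss) and otherwise retry taking the lock, knowing the flag is now set.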
Michael Walle [Wed, 25 May 2022 23:12:39 +0000 (01:12 +0200)]
net: lan966x: check devm_of_phy_get() for -EDEFER_PROBE
At the moment, if devm_of_phy_get() returns an error the serdes
simply isn't set. While ignoring an error is bad in general, there
is a particular bug: the network doesn't work if the serdes driver is
compiled as a module. In that case, devm_of_phy_get() returns
-EPROBE_DEFER and the error is silently ignored.
The serdes is optional; it is not there if the port is using RGMII, in
which case devm_of_phy_get() returns -ENODEV. Rearrange the error
handling so that -ENODEV will be handled but other error codes will
abort the probing.
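The rearranged error handling can be sketched as follows, assuming a lookup that yields either an error code or a PHY pointer. struct port and port_set_serdes() are hypothetical; the probe-deferral code is spelled EPROBE_DEFER in the kernel headers and is defined locally here because user-space errno.h does not provide it.

```c
#include <errno.h>
#include <stddef.h>

/* Kernel-internal probe-deferral code; absent from user-space errno.h. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical port state: serdes stays NULL when the port has none. */
struct port {
	const void *serdes;
};

/*
 * err is the result of the (hypothetical) PHY lookup, phy the handle it
 * produced on success. Only "no such device" is tolerated, because the
 * serdes is optional; every other error, probe deferral included, is
 * passed up so that the probe is aborted and can be retried later.
 */
static int port_set_serdes(struct port *port, int err, const void *phy)
{
	if (err == -ENODEV) {
		port->serdes = NULL;
		return 0;
	}
	if (err)
		return err;

	port->serdes = phy;
	return 0;
}
```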
Phil Sutter [Tue, 24 May 2022 12:50:01 +0000 (14:50 +0200)]
netfilter: nft_limit: Clone packet limits' cost value
When cloning a packet-based limit expression, copy the cost value as
well. Otherwise the new limit is not functional anymore.
Fixes: 062d62c00def9 ("netfilter: nft_limit: move stateful fields out of expression data")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: nf_tables: disallow non-stateful expression in sets earlier
Since bf2e6126db7d ("netfilter: nft_dynset: dynamic stateful expression
instantiation"), it is possible to attach stateful expressions to set
elements.
ea3cf8101c0d ("netfilter: nf_tables: split set destruction in deactivate
and destroy phase") introduces conditional destruction on the object to
accommodate transaction semantics.
nft_expr_init() calls expr->ops->init() first, then checks for
NFT_STATEFUL_EXPR. This still allows initializing a non-stateful
lookup expression which points to a set, which might lead to UAF since
the set is not properly detached from the set->binding for this case.
Anyway, this combination is nonsense from the nf_tables perspective.
This patch fixes this problem by checking for NFT_STATEFUL_EXPR before
expr->ops->init() is called.
The reporter provides a KASAN splat and a poc reproducer (similar to
those autogenerated by syzbot to report use-after-free errors). It is
unknown to me if they are using syzbot or a similar automated
tool to locate the bug that they are reporting.
For the record, this is the KASAN splat.
[ 85.431824] ==================================================================
[ 85.432901] BUG: KASAN: use-after-free in nf_tables_bind_set+0x81b/0xa20
[ 85.433825] Write of size 8 at addr ffff8880286f0e98 by task poc/776
[ 85.434756]
[ 85.434999] CPU: 1 PID: 776 Comm: poc Tainted: G W 5.18.0+ #2
[ 85.436023] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Fixes: 29610df11a8f ("netfilter: nf_tables: add helper functions for expression handling")
Reported-and-tested-by: Aaron Adams <edg-e@nccgroup.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Daniel Borkmann [Tue, 24 May 2022 22:56:18 +0000 (00:56 +0200)]
net, neigh: Set lower cap for neigh_managed_work rearming
Yuwei reported that plain reuse of DELAY_PROBE_TIME to rearm the work queue
in neigh_managed_work is problematic if the user explicitly configures
DELAY_PROBE_TIME to 0 for a neighbor table. Such a misconfiguration can then
hog a CPU at 100% processing the system work queue. Instead, set the lower
interval bound to HZ, which is totally sufficient. Yuwei is additionally looking
into making the interval separately configurable from DELAY_PROBE_TIME.
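A minimal sketch of clamping a configurable rearm delay to a lower bound. The HZ value and rearm_interval() are illustrative assumptions; the real tick rate depends on the kernel configuration.

```c
#define HZ 100UL	/* illustrative tick rate: ticks per second */

/*
 * Pick the delay, in ticks, used to rearm the periodic work. A configured
 * value of 0 would requeue the work immediately and spin the CPU, so the
 * result is never allowed to drop below one second's worth of ticks.
 */
static unsigned long rearm_interval(unsigned long delay_probe_time)
{
	return delay_probe_time > HZ ? delay_probe_time : HZ;
}
```

With these assumptions rearm_interval(0) yields HZ, while any configured value above HZ is used unchanged.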
liuyacan [Wed, 25 May 2022 08:54:08 +0000 (16:54 +0800)]
net/smc: set ini->smcrv2.ib_dev_v2 to NULL if SMC-Rv2 is unavailable
In the process of checking whether RDMAv2 is available, the current
implementation first sets ini->smcrv2.ib_dev_v2, and then allocates
the smc buf desc and registers the rmb, but the latter may fail. In this case,
the pointer should be reset.
luyun [Wed, 25 May 2022 03:18:19 +0000 (11:18 +0800)]
selftests/net: enable lo.accept_local in psock_snd test
The psock_snd test sends and receives packets over loopback, and
the test results depend on parameter settings:
Set rp_filter=0,
or set rp_filter=1 and accept_local=1
so that the test will pass. Otherwise, this test will fail with
Resource temporarily unavailable:
sudo ./psock_snd.sh
dgram
tx: 128
rx: 142
./psock_snd: recv: Resource temporarily unavailable
For most distro kernel releases (like Ubuntu or CentOS), the parameter
rp_filter is enabled by default, so it's necessary to enable the
parameter lo.accept_local in the psock_snd test. Since this test runs
inside a netns, changing a sysctl is fine.
Signed-off-by: luyun <luyun@kylinos.cn>
Reviewed-by: Jackie Liu <liuyun01@kylinos.cn>
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20220525031819.866684-1-luyun_611@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: ethernet: ti: am65-cpsw: Fix fwnode passed to phylink_create()
am65-cpsw-nuss driver incorrectly uses fwnode member of common
ethernet device's "struct device_node" instead of using fwnode
member of the port's "struct device_node" in phylink_create().
This results in all ports having the same phy data when there
are multiple ports with their phy properties populated in their
respective nodes rather than the common ethernet device node.
Fix it here by using fwnode member of the port's node.
Jakub Kicinski [Thu, 26 May 2022 04:36:19 +0000 (21:36 -0700)]
Merge branch 'amt-fix-several-bugs'
Taehee Yoo says:
====================
amt: fix several bugs
This patchset fixes several bugs in the amt module.
The first patch fixes a typo.
The second patch fixes the wrong return value of amt_update_handler().
A relay finds a tunnel if it receives an update message from the gateway.
If it can't find a tunnel, amt_update_handler() should return an error,
not success. But it always returns success.
The third patch fixes a possible memory leak in amt_rcv().
An skb would not be freed if an amt interface doesn't have a socket.
====================
Taehee Yoo [Mon, 23 May 2022 16:17:08 +0000 (16:17 +0000)]
amt: fix possible memory leak in amt_rcv()
When an amt interface receives a packet, it looks up its socket.
If it can't find a socket, it should free the received skb.
But it doesn't.
So, a memory leak could occur.
Fixes: 19284cc854b3 ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
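The ownership rule behind this fix, that a receive path must release the buffer on every exit, can be sketched in user space. Everything here is hypothetical: struct iface, buf_alloc(), buf_free() and iface_receive() are illustration names, and the live_buffers counter exists only so a leak would be observable.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins: none of these names come from the amt driver. */
struct iface {
	bool has_socket;
};

static int live_buffers;	/* buffers allocated and not yet freed */

static void *buf_alloc(size_t len)
{
	void *buf = malloc(len);

	if (buf)
		live_buffers++;
	return buf;
}

static void buf_free(void *buf)
{
	if (buf) {
		free(buf);
		live_buffers--;
	}
}

/*
 * The receive path owns buf once it is called. Every exit, including the
 * early "no socket" one, goes through the same release; an early return
 * that skipped it would leak the buffer.
 */
static int iface_receive(const struct iface *ifc, void *buf)
{
	int dropped = 0;

	if (!ifc->has_socket) {
		dropped = 1;
		goto out;
	}

	/* ... a real handler would parse and deliver the packet here ... */

out:
	buf_free(buf);
	return dropped;
}
```

After any call to iface_receive(), live_buffers is back to its previous value, whether or not the interface had a socket.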
Taehee Yoo [Mon, 23 May 2022 16:17:07 +0000 (16:17 +0000)]
amt: fix return value of amt_update_handler()
If a relay receives an update message, it looks up a tunnel,
and if there is no tunnel for that message, it should be treated
as an error, not a success.
But amt_update_handler() returns false, which means success.
Fixes: 19284cc854b3 ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Taehee Yoo [Mon, 23 May 2022 16:17:06 +0000 (16:17 +0000)]
amt: fix typo in amt
AMT_MSG_TEARDOWM is defined,
but it should be AMT_MSG_TEARDOWN.
Fixes: 1470160287c4 ("amt: add control plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Wed, 25 May 2022 19:22:58 +0000 (12:22 -0700)]
Merge tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core
----
- Support TCPv6 segmentation offload with super-segments larger than
64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP).
- Generalize skb freeing deferral to per-cpu lists, instead of
per-socket lists.
- Add a netdev statistic for packets dropped due to L2 address
mismatch (rx_otherhost_dropped).
- Continue work annotating skb drop reasons.
- Accept alternative netdev names (ALT_IFNAME) in more netlink
requests.
- Add VLAN support for AF_PACKET SOCK_RAW GSO.
- Allow receiving skb mark from the socket as a cmsg.
- Enable memcg accounting for veth queues, sysctl tables and IPv6.
BPF
---
- Add libbpf support for User Statically-Defined Tracing (USDTs).
- Speed up symbol resolution for kprobes multi-link attachments.
- Support storing typed pointers to referenced and unreferenced
objects in BPF maps.
- Add support for BPF link iterator.
- Introduce access to remote CPU map elements in BPF per-cpu map.
- Allow middle-of-the-road settings for the
kernel.unprivileged_bpf_disabled sysctl.
- Implement basic types of dynamic pointers e.g. to allow for
dynamically sized ringbuf reservations without extra memory copies.
Protocols
---------
- Retire port only listening_hash table, add a second bind table
hashed by port and address. Avoid linear list walk when binding to
very popular ports (e.g. 443).
- Add bridge FDB bulk flush filtering support allowing user space to
remove all FDB entries matching a condition.
- Introduce accept_unsolicited_na sysctl for IPv6 to implement
router-side changes for RFC9131.
- Support for MPTCP path manager in user space.
- Add MPTCP support for fallback to regular TCP for connections that
have never connected additional subflows or transmitted
out-of-sequence data (partial support for RFC8684 fallback).
- Avoid races in MPTCP-level window tracking, stabilize and improve
throughput.
- Support lockless operation of GRE tunnels with seq numbers enabled.
- WiFi support for host based BSS color collision detection.
- Add support for SO_TXTIME/SCM_TXTIME on CAN sockets.
- Support transmission w/o flow control in CAN ISOTP (ISO 15765-2).
- Support zero-copy Tx with TLS 1.2 crypto offload (sendfile).
- Allow matching on the number of VLAN tags via tc-flower.
- Add tracepoint for tcp_set_ca_state().
Driver API
----------
- Improve error reporting from classifier and action offload.
- Add support for listing line cards in switches (devlink).
- Add helpers for reporting page pool statistics with ethtool -S.
- Add support for reading clock cycles when using PTP virtual clocks,
instead of having the driver convert to time before reporting. This
makes it possible to report time from different vclocks.
- Support configuring low-latency Tx descriptor push via ethtool.
- Separate Clause 22 and Clause 45 MDIO accesses more explicitly.
New hardware / drivers
----------------------
- Ethernet:
- Marvell's Octeon NIC PCI Endpoint support (octeon_ep)
- Sunplus SP7021 SoC (sp7021_emac)
- Add support for Renesas RZ/V2M (in ravb)
- Add support for MediaTek mt7986 switches (in mtk_eth_soc)
- Ethernet PHYs:
- ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting)
- TI DP83TD510 PHY
- Microchip LAN8742/LAN88xx PHYs
- WiFi:
- Driver for pureLiFi X, XL, XC devices (plfxlc)
- Driver for Silicon Labs devices (wfx)
- Support for WCN6750 (in ath11k)
- Support Realtek 8852ce devices (in rtw89)
- CAN:
- ctucanfd: add support for CTU CAN FD open-source IP core from
Czech Technical University in Prague
Drivers
-------
- Delete a number of old drivers still using virt_to_bus().
- Ethernet NICs:
- intel: support TSO on tunnels MPLS
- broadcom: support multi-buffer XDP
- nfp: support VF rate limiting
- sfc: use hardware tx timestamps for more than PTP
- mlx5: multi-port eswitch support
- hyper-v: add support for XDP_REDIRECT
- atlantic: XDP support (including multi-buffer)
- macb: improve real-time perf by deferring Tx processing to NAPI
- High-speed Ethernet switches:
- mlxsw: implement basic line card information querying
- prestera: add support for traffic policing on ingress and egress
- Embedded Ethernet switches:
- lan966x: add support for packet DMA (FDMA)
- lan966x: add support for PTP programmable pins
- ti: cpsw_new: enable bc/mc storm prevention
- Qualcomm 802.11ax WiFi (ath11k):
- Wake-on-WLAN support for QCA6390 and WCN6855
- device recovery (firmware restart) support
- support setting Specific Absorption Rate (SAR) for WCN6855
- read country code from SMBIOS for WCN6855/QCA6390
- enable keep-alive during WoWLAN suspend
- implement remain-on-channel support
- MediaTek WiFi (mt76):
- support Wireless Ethernet Dispatch offloading packet movement
between the Ethernet switch and WiFi interfaces
- non-standard VHT MCS10-11 support
- mt7921 AP mode support
- mt7921 IPv6 NS offload support
- Ethernet PHYs:
- micrel: ksz9031/ksz9131: cabletest support
- lan87xx: SQI support for T1 PHYs
- lan937x: add interrupt support for link detection"
* tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits)
ptp: ocp: Add firmware header checks
ptp: ocp: fix PPS source selector debugfs reporting
ptp: ocp: add .init function for sma_op vector
ptp: ocp: vectorize the sma accessor functions
ptp: ocp: constify selectors
ptp: ocp: parameterize input/output sma selectors
ptp: ocp: revise firmware display
ptp: ocp: add Celestica timecard PCI ids
ptp: ocp: Remove #ifdefs around PCI IDs
ptp: ocp: 32-bit fixups for pci start address
Revert "net/smc: fix listen processing for SMC-Rv2"
ath6kl: Use cc-disable-warning to disable -Wdangling-pointer
selftests/bpf: Dynptr tests
bpf: Add dynptr data slices
bpf: Add bpf_dynptr_read and bpf_dynptr_write
bpf: Dynptr support for ring buffers
bpf: Add bpf_dynptr_from_mem for local dynptrs
bpf: Add verifier support for dynptrs
bpf: Suppress 'passing zero to PTR_ERR' warning
bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack
...
Linus Torvalds [Wed, 25 May 2022 18:47:25 +0000 (11:47 -0700)]
Merge branch 'for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
"Nothing too interesting. This adds cpu controller selftests and there
are a couple code cleanup patches"
* 'for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: remove the superfluous judgment
cgroup: Make cgroup_debug static
kseltest/cgroup: Make test_stress.sh work if run interactively
kselftest/cgroup: fix test_stress.sh to use OUTPUT dir
cgroup: Add config file to cgroup selftest suite
cgroup: Add test_cpucg_max_nested() testcase
cgroup: Add test_cpucg_max() testcase
cgroup: Add test_cpucg_nested_weight_underprovisioned() testcase
cgroup: Adding test_cpucg_nested_weight_overprovisioned() testcase
cgroup: Add test_cpucg_weight_underprovisioned() testcase
cgroup: Add test_cpucg_weight_overprovisioned() testcase
cgroup: Add test_cpucg_stats() testcase to cgroup cpu selftests
cgroup: Add new test_cpu.c test suite in cgroup selftests
Linus Torvalds [Wed, 25 May 2022 18:32:53 +0000 (11:32 -0700)]
Merge tag 'linux-kselftest-kunit-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull KUnit updates from Shuah Khan:
"Several fixes, cleanups, and enhancements to tests and framework:
- introduce _NULL and _NOT_NULL macros to pointer error checks
- rework kunit_resource allocation policy to fix memory leaks when
caller doesn't specify free() function to be used when allocating
memory using kunit_add_resource() and kunit_alloc_resource() funcs.
- add ability to specify suite-level init and exit functions"
* tag 'linux-kselftest-kunit-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (41 commits)
kunit: tool: Use qemu-system-i386 for i386 runs
kunit: fix executor OOM error handling logic on non-UML
kunit: tool: update riscv QEMU config with new serial dependency
kcsan: test: use new suite_{init,exit} support
kunit: tool: Add list of all valid test configs on UML
kunit: take `kunit_assert` as `const`
kunit: tool: misc cleanups
kunit: tool: minor cosmetic cleanups in kunit_parser.py
kunit: tool: make parser stop overwriting status of suites w/ no_tests
kunit: tool: remove dead parse_crash_in_log() logic
kunit: tool: print clearer error message when there's no TAP output
kunit: tool: stop using a shell to run kernel under QEMU
kunit: tool: update test counts summary line format
kunit: bail out of test filtering logic quicker if OOM
lib/Kconfig.debug: change KUnit tests to default to KUNIT_ALL_TESTS
kunit: Rework kunit_resource allocation policy
kunit: fix debugfs code to use enum kunit_status, not bool
kfence: test: use new suite_{init/exit} support, add .kunitconfig
kunit: add ability to specify suite-level init and exit functions
kunit: rename print_subtest_{start,end} for clarity (s/subtest/suite)
...
Linus Torvalds [Wed, 25 May 2022 18:30:21 +0000 (11:30 -0700)]
Merge tag 'linux-kselftest-next-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest updates from Shuah Khan:
"Several fixes, cleanups, and enhancements to tests:
- add mips support for kprobe args string and syntax tests
- updates to resctrl test to use kselftest framework
- fixes, cleanups, and enhancements to tests"
* tag 'linux-kselftest-next-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kselftests/ir : Improve readability of modprobe error message
selftests/resctrl: Fix null pointer dereference on open failed
selftests/resctrl: Add missing SPDX license to Makefile
selftests/resctrl: Update README about using kselftest framework to build/run resctrl_tests
selftests/resctrl: Make resctrl_tests run using kselftest framework
selftests/resctrl: Fix resctrl_tests' return code to work with selftest framework
selftests/resctrl: Change the default limited time to 120 seconds
selftests/resctrl: Kill child process before parent process terminates if SIGTERM is received
selftests/resctrl: Print a message if the result of MBM&CMT tests is failed on Intel CPU
selftests/resctrl: Extend CPU vendor detection
selftests/x86/corrupt_xstate_header: Use provided __cpuid_count() macro
selftests/x86/amx: Use provided __cpuid_count() macro
selftests/vm/pkeys: Use provided __cpuid_count() macro
selftests: Provide local define of __cpuid_count()
selftests/damon: add damon to selftests root Makefile
selftests/binderfs: Improve message to provide more info
selftests: mqueue: drop duplicate min definition
selftests/ftrace: add mips support for kprobe args syntax tests
selftests/ftrace: add mips support for kprobe args string tests
Linus Torvalds [Wed, 25 May 2022 18:17:41 +0000 (11:17 -0700)]
Merge tag 'docs-5.19' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"It was a moderately busy cycle for documentation; highlights include:
- After a long period of inactivity, the Japanese translations are
seeing some much-needed maintenance and updating.
- Reworked IOMMU documentation
- Some new documentation for static-analysis tools
- A new overall structure for the memory-management documentation.
This is an LSFMM outcome that, it is hoped, will help encourage
developers to fill in the many gaps. Optimism is eternal...but
hopefully it will work.
- More Chinese translations.
Plus the usual typo fixes, updates, etc"
* tag 'docs-5.19' of git://git.lwn.net/linux: (70 commits)
docs: pdfdocs: Add space for chapter counts >= 100 in TOC
docs/zh_CN: Add dev-tools/gdb-kernel-debugging.rst Chinese translation
input: Docs: correct ntrig.rst typo
input: Docs: correct atarikbd.rst typos
MAINTAINERS: Become the docs/zh_CN maintainer
docs/zh_CN: fix devicetree usage-model translation
mm,doc: Add new documentation structure
Documentation: drop more IDE boot options and ide-cd.rst
Documentation/process: use scripts/get_maintainer.pl on patches
MAINTAINERS: Add entry for DOCUMENTATION/JAPANESE
docs/trans/ja_JP/howto: Don't mention specific kernel versions
docs/ja_JP/SubmittingPatches: Request summaries for commit references
docs/ja_JP/SubmittingPatches: Add Suggested-by as a standard signature
docs/ja_JP/SubmittingPatches: Randy has moved
docs/ja_JP/SubmittingPatches: Suggest the use of scripts/get_maintainer.pl
docs/ja_JP/SubmittingPatches: Update GregKH links
Documentation/sysctl: document max_rcu_stall_to_panic
Documentation: add missing angle bracket in cgroup-v2 doc
Documentation: dev-tools: use literal block instead of code-block
docs/zh_CN: add vm numa translation
...
Linus Torvalds [Wed, 25 May 2022 17:32:08 +0000 (10:32 -0700)]
Merge tag 'printk-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Offload writing printk() messages on consoles to per-console
kthreads.
It prevents soft-lockups when an extensive amount of messages is
printed. It was observed, for example, during boot of large systems
with a lot of peripherals like disks or network interfaces.
It prevents live-lockups that were observed, for example, when
messages about allocation failures were reported and a CPU handled
consoles instead of reclaiming the memory. It was hard to solve even
with rate limiting because it would need to take into account the
amount of messages and the speed of all consoles.
It is a must-have for real time. Otherwise, any printk() might
break latency guarantees.
The per-console kthreads allow each console to be handled at its own
speed. Slow consoles no longer slow down faster ones, and
printk() no longer unpredictably slows down various code paths.
There are situations when the kthreads are either not available or
not reliable, for example, early boot, suspend, or panic. In these
situations, printk() uses the legacy mode and tries to handle
consoles immediately.
- Add documentation for the printk index.
* tag 'printk-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk, tracing: fix console tracepoint
printk: remove @console_locked
printk: extend console_lock for per-console locking
printk: add kthread console printers
printk: add functions to prefer direct printing
printk: add pr_flush()
printk: move buffer definitions into console_emit_next_record() caller
printk: refactor and rework printing logic
printk: add con_printk() macro for console details
printk: call boot_delay_msec() in printk_delay()
printk: get caller_id/timestamp after migration disable
printk: wake waiters for safe and NMI contexts
printk: wake up all waiters
printk: add missing memory barrier to wake_up_klogd()
printk: cpu sync always disable interrupts
printk: rename cpulock functions
printk/index: Printk index feature documentation
MAINTAINERS: Add printk indexing maintainers on mention of printk_index
Linus Torvalds [Wed, 25 May 2022 17:24:04 +0000 (10:24 -0700)]
Merge tag 'slab-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
- Conversion of slub_debug stack traces to stackdepot, allowing more
useful debugfs-based inspection for e.g. memory leak debugging.
Allocation and free debugfs info now includes full traces and is
sorted by the unique trace frequency.
The stackdepot conversion was already attempted last year but
reverted by 7e28eda20a24. The memory overhead (while not actually
enabled on boot) has been meanwhile solved by making the large
stackdepot allocation dynamic. The xfstest issues haven't been
reproduced on current kernel locally nor in -next, so the slab cache
layout changes that originally made that bug manifest were probably
not the root cause.
- Refactoring of dma-kmalloc caches creation.
- Trivial cleanups such as removal of unused parameters, fixes and
clarifications of comments.
- Hyeonggon Yoo joins as a reviewer.
* tag 'slab-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
MAINTAINERS: add myself as reviewer for slab
mm/slub: remove unused kmem_cache_order_objects max
mm: slab: fix comment for __assume_kmalloc_alignment
mm: slab: fix comment for ARCH_KMALLOC_MINALIGN
mm/slub: remove unneeded return value of slab_pad_check
mm/slab_common: move dma-kmalloc caches creation into new_kmalloc_cache()
mm/slub: remove meaningless node check in ___slab_alloc()
mm/slub: remove duplicate flag in allocate_slab()
mm/slub: remove unused parameter in setup_object*()
mm/slab.c: fix comments
slab, documentation: add description of debugfs files for SLUB caches
mm/slub: sort debugfs output by frequency of stack traces
mm/slub: distinguish and print stack traces in debugfs files
mm/slub: use stackdepot to save stack trace in objects
mm/slub: move struct track init out of set_track()
lib/stackdepot: allow requesting early initialization dynamically
mm/slub, kunit: Make slub_kunit unaffected by user specified flags
mm/slab: remove some unused functions
Linus Torvalds [Wed, 25 May 2022 16:02:19 +0000 (09:02 -0700)]
linux/types.h: reinstate "__bitwise__" macro for user space use
Commit dcde58fe2922 ("linux/types.h: remove unnecessary __bitwise__")
was right that there are no users of __bitwise__ in the kernel, but it
turns out there are user space users of it that do expect it.
It is, after all, in the uapi directory, so user space usage is to be
expected.
Instead of reverting the commit completely, let's just clarify the
situation so that it doesn't happen again, and have some in-code
explanations for why that "__bitwise__" still exists.
Sean Young [Wed, 25 May 2022 13:08:30 +0000 (14:08 +0100)]
media: lirc: revert removal of unused feature flags
Commit 741718ef582e ("media: lirc: remove unused lirc features") removed
feature flags which were never implemented, but they are still used by
the lirc daemon when built from source.
Reinstate these symbols in order not to break the lirc build.
Linus Torvalds [Wed, 25 May 2022 02:55:07 +0000 (19:55 -0700)]
Merge tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache
Pull page cache updates from Matthew Wilcox:
- Appoint myself page cache maintainer
- Fix how scsicam uses the page cache
- Use the memalloc_nofs_save() API to replace AOP_FLAG_NOFS
- Remove the AOP flags entirely
- Remove pagecache_write_begin() and pagecache_write_end()
- Documentation updates
- Convert several address_space operations to use folios:
- is_dirty_writeback
- readpage becomes read_folio
- releasepage becomes release_folio
- freepage becomes free_folio
- Change filler_t to require a struct file pointer be the first
argument like ->read_folio
* tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache: (107 commits)
nilfs2: Fix some kernel-doc comments
Appoint myself page cache maintainer
fs: Remove aops->freepage
secretmem: Convert to free_folio
nfs: Convert to free_folio
orangefs: Convert to free_folio
fs: Add free_folio address space operation
fs: Convert drop_buffers() to use a folio
fs: Change try_to_free_buffers() to take a folio
jbd2: Convert release_buffer_page() to use a folio
jbd2: Convert jbd2_journal_try_to_free_buffers to take a folio
reiserfs: Convert release_buffer_page() to use a folio
fs: Remove last vestiges of releasepage
ubifs: Convert to release_folio
reiserfs: Convert to release_folio
orangefs: Convert to release_folio
ocfs2: Convert to release_folio
nilfs2: Remove comment about releasepage
nfs: Convert to release_folio
jfs: Convert to release_folio
...
Linus Torvalds [Wed, 25 May 2022 02:21:30 +0000 (19:21 -0700)]
Merge tag 'iomap-5.19-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull iomap updates from Darrick Wong:
"There's a couple of corrections sent in by Andreas for some accounting
errors.
The biggest change this time around is that writeback errors no longer
clear pageuptodate nor does XFS invalidate the page cache anymore.
This brings XFS (and gfs2/zonefs) behavior in line with every other
Linux filesystem driver, and fixes some UAF bugs that only cropped up
after willy turned on multipage folios for XFS in 5.18-rc1.
Regrettably, it took all the way to the end of the 5.18 cycle to find
the source of these bugs and reach a consensus that XFS' writeback
failure behavior from 20 years ago is no longer necessary.
Summary:
- Fix a couple of accounting errors in the buffered io code.
- Discontinue the practice of marking folios !uptodate and
invalidating them when writeback fails.
This fixes some UAF bugs when multipage folios are enabled, and
brings the behavior of XFS/gfs2/zonefs into alignment with the
behavior of all the other Linux filesystems"
* tag 'iomap-5.19-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: don't invalidate folios after writeback errors
iomap: iomap_write_end cleanup
iomap: iomap_write_failed fix
Linus Torvalds [Wed, 25 May 2022 02:09:16 +0000 (19:09 -0700)]
Merge tag 'dlm-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
"This includes several large patches to improve endian handling and
remove sparse warnings. The code previously used in/out, in-place
endianness conversion functions.
Other code cleanup includes the list iterator changes.
Finally, a long standing bug was found and fixed, caused by a missed
decrement on a lock struct ref count"
* tag 'dlm-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: (28 commits)
dlm: use kref_put_lock in __put_lkb
dlm: use kref_put_lock in put_rsb
dlm: remove unnecessary error assign
dlm: fix missing lkb refcount handling
fs: dlm: cast resource pointer to uintptr_t
dlm: replace usage of found with dedicated list iterator variable
dlm: remove usage of list iterator for list_add() after the loop body
dlm: fix pending remove if msg allocation fails
dlm: fix wake_up() calls for pending remove
dlm: check required context while close
dlm: cleanup lock handling in dlm_master_lookup
dlm: remove found label in dlm_master_lookup
dlm: remove __user conversion warnings
dlm: move conversion to compile time
dlm: use __le types for dlm messages
dlm: use __le types for rcom messages
dlm: use __le types for dlm header
dlm: use __le types for options header
dlm: add __CHECKER__ for false positives
dlm: move global to static inits
...
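The endian rework in the dlm pull above moves away from converting message buffers in place and toward fields that stay little-endian in memory and are converted where they are read. A minimal userspace sketch of that idea follows. It is an illustration under assumptions, not the dlm code: struct header_sketch and the helper names are hypothetical, and the kernel itself uses the __le32 type with le32_to_cpu() rather than byte arrays.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wire header (not a dlm structure): the fields stay
 * little-endian in memory, mirroring the idea behind __le32. */
struct header_sketch {
	uint8_t version_le[4];
	uint8_t length_le[4];
};

/* Decode a little-endian 32-bit value regardless of host byte order. */
uint32_t le32_decode(const uint8_t b[4])
{
	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* Encode a host-order value as little-endian bytes. */
void le32_encode(uint8_t b[4], uint32_t v)
{
	b[0] = (uint8_t)(v & 0xff);
	b[1] = (uint8_t)((v >> 8) & 0xff);
	b[2] = (uint8_t)((v >> 16) & 0xff);
	b[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Accessor converts at the point of use; the buffer is never rewritten. */
uint32_t header_length(const struct header_sketch *h)
{
	return le32_decode(h->length_le);
}

/* Small usage example: write a header, then read it back. */
void header_selfcheck(void)
{
	struct header_sketch h;

	le32_encode(h.version_le, 0x00030002u);
	le32_encode(h.length_le, 4096u);
	assert(header_length(&h) == 4096u);
	assert(h.length_le[1] == 0x10);	/* 4096 == 0x1000, low byte first */
}
```

Because the stored bytes never change, a buffer can be read any number of times or retransmitted without tracking whether it has already been converted, which is the kind of mistake in-place conversion invites.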
Linus Torvalds [Wed, 25 May 2022 02:04:46 +0000 (19:04 -0700)]
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"Various bug fixes and cleanups for ext4.
In particular, move the crypto related functions from fs/ext4/super.c
into a new fs/ext4/crypto.c, and fix a number of bugs found by fuzzers
and error injection tools"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (25 commits)
ext4: only allow test_dummy_encryption when supported
ext4: fix bug_on in __es_tree_search
ext4: avoid cycles in directory h-tree
ext4: verify dir block before splitting it
ext4: filter out EXT4_FC_REPLAY from on-disk superblock field s_state
ext4: fix bug_on in ext4_writepages
ext4: refactor and move ext4_ioctl_get_encryption_pwsalt()
ext4: cleanup function defs from ext4.h into crypto.c
ext4: move ext4 crypto code to its own file crypto.c
ext4: fix memory leak in parse_apply_sb_mount_options()
ext4: reject the 'commit' option on ext2 filesystems
ext4: remove duplicated #include of dax.h in inode.c
ext4: fix race condition between ext4_write and ext4_convert_inline_data
ext4: convert symlink external data block mapping to bdev
ext4: add nowait mode for ext4_getblk()
ext4: fix journal_ioprio mount option handling
ext4: mark group as trimmed only if it was fully scanned
ext4: fix use-after-free in ext4_rename_dir_prepare
ext4: add unmount filesystem message
ext4: remove unnecessary conditionals
...
Linus Torvalds [Wed, 25 May 2022 02:00:41 +0000 (19:00 -0700)]
Merge tag 'gfs2-v5.18-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 updates from Andreas Gruenbacher:
- Clean up the allocation of glocks that have an address space attached
- Quota locking fix and quota iomap conversion
- Fix the FITRIM error reporting
- Some list iterator cleanups
* tag 'gfs2-v5.18-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Convert function bh_get to use iomap
gfs2: use i_lock spin_lock for inode qadata
gfs2: Return more useful errors from gfs2_rgrp_send_discards()
gfs2: Use container_of() for gfs2_glock(aspace)
gfs2: Explain some direct I/O oddities
gfs2: replace 'found' with dedicated list iterator variable
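The glock allocation cleanup in the gfs2 pull above leans on the container_of() pattern, which recovers a pointer to an enclosing structure from a pointer to one of its embedded members. The sketch below shows the pattern in userspace C. The macro is a simplified form of the kernel's, and struct glock_sketch, struct aspace_sketch, and glock_from_aspace() are hypothetical names chosen for illustration, not the gfs2 definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace version of container_of(): step back from a
 * member pointer to the start of the structure that embeds it. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins; these are not the real gfs2 structures. */
struct aspace_sketch {
	unsigned long nrpages;
};

struct glock_sketch {
	int state;
	struct aspace_sketch mapping;	/* embedded, not pointed to */
};

/* Recover the owning glock from a pointer to its embedded address space. */
struct glock_sketch *glock_from_aspace(struct aspace_sketch *mapping)
{
	return container_of(mapping, struct glock_sketch, mapping);
}

/* Small usage example: the round trip lands on the original object. */
void container_of_selfcheck(void)
{
	struct glock_sketch gl = { .state = 3 };

	assert(glock_from_aspace(&gl.mapping) == &gl);
	assert(glock_from_aspace(&gl.mapping)->state == 3);
}
```

Embedding the member and computing the offset at compile time avoids storing a back pointer and keeps both objects in a single allocation.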
Linus Torvalds [Wed, 25 May 2022 01:52:35 +0000 (18:52 -0700)]
Merge tag 'for-5.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"Features:
- subpage:
- support for PAGE_SIZE > 4K (previously only 64K)
- make it work with raid56
- repair super block num_devices automatically if it does not match
the number of device items
- defrag can convert inline extents to regular extents, up to now
inline files were skipped but the setting of mount option
max_inline could affect the decision logic
- zoned:
- minimal accepted zone size is explicitly set to 4MiB
- make zone reclaim less aggressive and don't reclaim if there are
enough free zones
- add per-profile sysfs tunable of the reclaim threshold
- allow automatic block group reclaim for non-zoned filesystems, with
sysfs tunables
- tree-checker: new check, compare extent buffer owner against owner
rootid
Performance:
- avoid blocking on space reservation when doing nowait direct io
writes (+7% throughput for reads and writes)
- NOCOW write throughput improvement due to refined locking (+3%)
- send: reduce pressure to page cache by dropping extent pages right
after they're processed
Core:
- convert all radix trees to xarray
- add iterators for b-tree node items
- support printk message index
- use bulk page allocation for extent buffers
- switch to bio_alloc API, use on-stack bios where convenient, other
bio cleanups
- use rw lock for block groups to favor concurrent reads
- simplify workqueues, don't allocate high priority threads for all
normal queues as we need only one
- refactor scrub, process chunks based on their constraints and
similarity
- allocate direct io structures on stack and pass around only
pointers, avoids allocation and reduces potential error handling
Fixes:
- fix count of reserved transaction items for various inode
operations
- fix deadlock between concurrent dio writes when low on free data
space
- fix a few cases when zones need to be finished
VFS, iomap:
- add helper to check if sb write has started (usable for assertions)
- new helper iomap_dio_alloc_bio, export iomap_dio_bio_end_io"
* tag 'for-5.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (173 commits)
btrfs: zoned: introduce a minimal zone size 4M and reject mount
btrfs: allow defrag to convert inline extents to regular extents
btrfs: add "0x" prefix for unsupported optional features
btrfs: do not account twice for inode ref when reserving metadata units
btrfs: zoned: fix comparison of alloc_offset vs meta_write_pointer
btrfs: send: avoid trashing the page cache
btrfs: send: keep the current inode open while processing it
btrfs: allocate the btrfs_dio_private as part of the iomap dio bio
btrfs: move struct btrfs_dio_private to inode.c
btrfs: remove the disk_bytenr in struct btrfs_dio_private
btrfs: allocate dio_data on stack
iomap: add per-iomap_iter private data
iomap: allow the file system to provide a bio_set for direct I/O
btrfs: add a btrfs_dio_rw wrapper
btrfs: zoned: zone finish unused block group
btrfs: zoned: properly finish block group on metadata write
btrfs: zoned: finish block group when there are no more allocatable bytes left
btrfs: zoned: consolidate zone finish functions
btrfs: zoned: introduce btrfs_zoned_bg_is_full
btrfs: improve error reporting in lookup_inline_extent_backref
...
Linus Torvalds [Wed, 25 May 2022 01:42:04 +0000 (18:42 -0700)]
Merge tag 'erofs-for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs (and fscache) updates from Gao Xiang:
"After working on it on the mailing list for more than half a year, we
finally got the 'erofs over fscache' feature into shape. Hopefully it
will open up more possibilities for the community.
The story mainly started from a new project which we called "RAFS v6" [1]
for Nydus image service almost a year ago, which enhances EROFS to be
a new form of one bootstrap (which includes metadata representing the
whole fs tree) + several data-deduplicated content addressable blobs
(actually treated as multiple devices). Each blob can represent one
container image layer, but not exactly, since all new data may already
exist in previous blobs, in which case there is no need to introduce
another new blob.
It is actually not a new idea (at least on my side it's much like a
simplified casync [2] for now) and has many benefits over per-file
blobs or other existing approaches, since typically each RAFS v6 image
only has dozens of device blobs instead of thousands of per-file blobs.
It's easy to sign with user keys as a golden image, transfer unmodified
with minimal overhead over the network, keep in some type of storage
conveniently, and run with (optional) runtime verification but without
involving too many irrelevant features crossing the system beyond EROFS
itself. At least that's our final goal and we keep working on it. There
was also a good summary of this approach from the casync author [3].
Further optimizations aside, this work was almost done in the
previous Linux release cycles. In this round, we'd like to introduce
on-demand load for EROFS with the fscache/cachefiles infrastructure,
considering the following advantages:
- Introduce new file-based backend to EROFS. Although each image only
contains dozens of blobs, on a densely deployed runC host, for
example, there could still be a massive number of blobs on a machine,
which is messy if each blob is treated as a device. In contrast,
fscache and cachefiles are really great interfaces for us to make
them work.
- Introduce on-demand load to fscache and EROFS. Previously, fscache
was mainly used for caching network-like filesystems; now it can
support on-demand downloading for local fses too with the exact
localfs on-disk format. It has many advantages, which are described
in the latest patchset cover letter [4]. In addition to that, most
importantly, the cached data is still stored in the original local fs
on-disk format, so it's still the one signed with private keys, though
it may be only partially available. Users can fully trust it at
runtime. Later, users can also back up cachefiles easily to another
machine.
- More reliable on-demand approach in principle. After data is all
available locally, the user daemon no longer needs to be online in
some use cases, which helps daemon crash recovery (filesystems can
remain in service) and hot-upgrade (user daemon can be upgraded more
frequently due to new features or protocols introduced.)
- Other formats can also be converted to the EROFS filesystem format
over the internet on the fly with the new on-demand load feature and
mounted. That is entirely possible with the on-demand load feature as
long as such archive format metadata can be fetched in advance, like
stargz.
In addition, although our current target user is the Nydus image
service [5], it can later be used for other use cases like on-demand
system booting, etc. As for the fscache on-demand load feature itself,
strictly it can be used for other local fses too. Later we could
promote most code to the iomap infrastructure and also enhance it in
the read-write way if other local fses are interested.
Thanks to David Howells for spending so much time and patience on
this over these months, many thanks with great respect here again!
Thanks to Jeffle for working on this feature and Xin Yin from Bytedance
for the asynchronous I/O implementation, as well as Zichen Tian, Jia
Zhu, and Yan Song for testing, much appreciated. We're also exploring
more possibilities for fscache cache management over FSDAX for secure
containers and working on more improvements and useful features for
fscache, cachefiles, and on-demand load.
In addition to "erofs over fscache", NFS export and idmapped mount are
completed in this cycle for container use cases as well.
Summary:
- Add erofs on-demand load support over fscache
- Support NFS export for erofs
- Support idmapped mounts for erofs
- Don't prompt for risk any more when using big pcluster
- Fix buffer copy overflow of ztailpacking feature
Linus Torvalds [Wed, 25 May 2022 01:30:27 +0000 (18:30 -0700)]
Merge tag 'exfat-for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:
- fix referencing wrong parent directory information during rename
- introduce a sys_tz mount option to use system timezone
- improve performance while zeroing a cluster with dirsync mount option
- fix slab-out-of-bounds in exfat_clear_bitmap() reported by syzbot
* tag 'exfat-for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: check if cluster num is valid
exfat: reduce block requests when zeroing a cluster
block: add sync_blockdev_range()
exfat: introduce mount option 'sys_tz'
exfat: fix referencing wrong parent directory information after renaming
Linus Torvalds [Wed, 25 May 2022 01:19:06 +0000 (18:19 -0700)]
Merge tag 'fs.idmapped.v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull fs idmapping updates from Christian Brauner:
"This contains two minor updates:
- An update to the idmapping documentation by Rodrigo making it
easier to understand that we first introduce several use-cases that
fail without idmapped mounts simply to explain how they can be
handled with idmapped mounts.
- When changing a mount's idmapping we now hold writers to make it
more robust.
This is similar to turning a mount ro with the difference that in
contrast to turning a mount ro changing the idmapping can only ever
be done once while a mount can transition between ro and rw as much
as it wants.
The vfs layer itself takes care to retrieve the idmapping of a
mount once ensuring that the idmapping used for vfs permission
checking is identical to the idmapping passed down to the
filesystem. All filesystems with FS_ALLOW_IDMAP raised take the
same precautions as the vfs in code-paths that are outside of
direct control of the vfs such as ioctl()s.
However, holding writers makes this more robust and predictable for
both the kernel and userspace.
This is a minor user-visible change. But it is extremely unlikely
to matter. The caller must've created a detached mount via
OPEN_TREE_CLONE and then handed that O_PATH fd to another process
or thread which then must've gotten a writable fd for that mount
and started creating files in there while the caller is still
changing mount properties. While not impossible it will be an
extremely rare corner-case and should in general be considered a
bug in the application. Consider making a mount MOUNT_ATTR_NOEXEC
or MOUNT_ATTR_NODEV while allowing someone else to perform lookups
or exec'ing in parallel by handing them a copy of the
OPEN_TREE_CLONE fd or another fd beneath that mount.
I've pinged all major users of idmapped mounts pointing out this
change and none of them have active writers on a mount while still
changing mount properties. It would've been strange if they did.
The rest and majority of the work will be coming through the overlayfs
tree this cycle. In addition to overlayfs this cycle should also see
support for idmapped mounts on erofs as I've acked a patch to this
effect a little while ago"
* tag 'fs.idmapped.v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
fs: hold writers when changing mount's idmapping
docs: Add small intro to idmap examples
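The idmapping pull above describes two rules: a mount's idmapping can only be changed once, and the change is now refused while writers are active on the mount. The sketch below models those two rules in single-threaded userspace C. It rests on stated assumptions: struct mount_sketch and all function names are hypothetical, the error codes are illustrative choices, and the real kernel performs these checks under its own mount locking rather than with a plain counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a mount; not the kernel's struct mount. */
struct mount_sketch {
	int writers;	/* number of active writers on this mount */
	bool idmapped;	/* true once the idmapping has been changed */
	int idmap_id;	/* stand-in for the attached idmapping */
};

/* Model a writer starting and finishing work on the mount. */
void writer_start(struct mount_sketch *m) { m->writers++; }
void writer_end(struct mount_sketch *m) { m->writers--; }

/* Change the idmapping only once, and only while no writer is active.
 * The error codes are illustrative assumptions. */
int set_idmapping(struct mount_sketch *m, int idmap_id)
{
	if (m->idmapped)
		return -EPERM;	/* one-way transition, unlike ro/rw */
	if (m->writers > 0)
		return -EBUSY;	/* refuse while writers are active */
	m->idmap_id = idmap_id;
	m->idmapped = true;
	return 0;
}

/* Small usage example covering the three outcomes. */
void idmapping_selfcheck(void)
{
	struct mount_sketch m = { 0 };

	writer_start(&m);
	assert(set_idmapping(&m, 7) == -EBUSY);
	writer_end(&m);
	assert(set_idmapping(&m, 7) == 0);
	assert(set_idmapping(&m, 8) == -EPERM);
}
```

The one-way flag is what separates this from a read-only toggle: once set_idmapping() has succeeded, every later call fails no matter how many writers exist.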
Linus Torvalds [Wed, 25 May 2022 01:09:16 +0000 (18:09 -0700)]
Merge tag 'media/v5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- dvb-usb drivers entries got reworked to avoid usage of magic numbers
to refer to data position inside tables
- vcodec driver has gained support for MT8186 and for vp8 and vp9
stateless codecs
- hantro has gained support for Hantro G1 on RK366x
- Added more h264 levels on coda960
- ccs gained support for MIPI CSI-2 28 bits per pixel raw data type
- venus driver gained support for Qualcomm custom compressed pixel
formats
- lots of driver fixes and updates
* tag 'media/v5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (308 commits)
media: hantro: Enable HOLD_CAPTURE_BUF for H.264
media: hantro: Add H.264 field decoding support
media: hantro: h264: Make dpb entry management more robust
media: hantro: Stop using H.264 parameter pic_num
media: rkvdec: Enable capture buffer holding for H264
media: rkvdec-h264: Add field decoding support
media: rkvdec: Ensure decoded resolution fit coded resolution
media: rkvdec: h264: Fix reference frame_num wrap for second field
media: rkvdec: h264: Validate and use pic width and height in mbs
media: rkvdec: Move H264 SPS validation in rkvdec-h264
media: rkvdec: h264: Fix bit depth wrap in pps packet
media: rkvdec: h264: Fix dpb_valid implementation
media: rkvdec: Stop overclocking the decoder
media: v4l2: Reorder field reflist
media: h264: Sort p/b reflist using frame_num
media: v4l2: Trace calculated p/b0/b1 initial reflist
media: h264: Store all fields into the unordered list
media: h264: Store current picture fields
media: h264: Increase reference lists size to 32
media: h264: Use v4l2_h264_reference for reflist
...
Linus Torvalds [Tue, 24 May 2022 23:34:14 +0000 (16:34 -0700)]
Merge tag 'devprop-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device properties framework updates from Rafael Wysocki:
"These mostly extend the device property API and make it easier to use
in some cases.
Specifics:
- Allow error pointer to be passed to fwnode APIs (Andy Shevchenko).
- Introduce fwnode_for_each_parent_node() (Andy Shevchenko, Douglas
Anderson).
- Advertise fwnode and device property count API calls (Andy
Shevchenko).
- Clean up fwnode_is_ancestor_of() (Andy Shevchenko).
- Convert device_{dma_supported,get_dma_attr} to fwnode (Sakari
Ailus).
- Release subnode properties with data nodes (Sakari Ailus).
- Add ->iomap() and ->irq_get() to fwnode operations (Sakari Ailus)"
* tag 'devprop-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
device property: Advertise fwnode and device property count API calls
device property: Fix recent breakage of fwnode_get_next_parent_dev()
device property: Drop 'test' prefix in parameters of fwnode_is_ancestor_of()
device property: Introduce fwnode_for_each_parent_node()
device property: Allow error pointer to be passed to fwnode APIs
ACPI: property: Release subnode properties with data nodes
device property: Add irq_get to fwnode operation
device property: Add iomap to fwnode operations
ACPI: property: Move acpi_fwnode_device_get_match_data() up
device property: Convert device_{dma_supported,get_dma_attr} to fwnode
Linus Torvalds [Tue, 24 May 2022 23:19:30 +0000 (16:19 -0700)]
Merge tag 'thermal-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These add a thermal library and thermal tools to wrap the netlink
interface into event-based callbacks, improve overheat condition
handling during suspend-to-idle on Intel SoCs, add some new hardware
support, fix bugs and clean up code.
Specifics:
- Add thermal library and thermal tools to encapsulate the netlink
into event based callbacks (Daniel Lezcano, Jiapeng Chong).
- Improve overheat condition handling during suspend-to-idle in the
Intel PCH thermal driver (Zhang Rui).
- Use local ops instead of global ops in devfreq_cooling (Kant Fan).
- Clean up _OSC handling in int340x (Davidlohr Bueso).
- Switch hisi_thermal from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
(Hesham Almatary).
- Add new k3 j72xx bandgap driver and the corresponding bindings
(Keerthy).
- Fix missing of_node_put() in the SC iMX driver at probe time
(Miaoqian Lin).
- Fix memory leak in __thermal_cooling_device_register()
when device_register() fails by calling
thermal_cooling_device_destroy_sysfs() (Yang Yingliang).
- Add sc8180x and sc8280xp compatible string in the DT bindings and
LMH support for QCom tsens driver (Bjorn Andersson).
- Fix OTP Calibration Register values conforming to the documentation
on RZ/G2L and bindings documentation for RZ/G2UL (Biju Das).
- Fix typo in kerneldoc description for __thermal_bind_params
(Corentin Labbe).
- Fix potential NULL dereference in sr_thermal_probe() on Broadcom
platform (Zheng Yongjun).
- Add change mode ops to the thermal-of sensor (Manaf Meethalavalappu
Pallikunhi).
- Fix non-negative value support by preventing the value from being
clamped to zero (Stefan Wahren).
- Add compatible string and DT bindings for MSM8960 tsens driver
(Dmitry Baryshkov).
- Add hwmon support for K3 driver (Massimiliano Minella).
- Refactor and add multiple generations support for QCom ADC driver
(Jishnu Prakash).
- Use platform_get_irq_optional() to get the interrupt on RCar driver
and document RZ/V2L bindings (Lad Prabhakar).
- Remove NULL check after container_of() call from the Intel HFI
thermal driver (Haowen Bai)"
* tag 'thermal-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (38 commits)
thermal: intel: pch: improve the cooling delay log
thermal: intel: pch: enhance overheat handling
thermal: intel: pch: move cooling delay to suspend_noirq phase
PM: wakeup: expose pm_wakeup_pending to modules
thermal: k3_j72xx_bandgap: Add the bandgap driver support
dt-bindings: thermal: k3-j72xx: Add VTM bindings documentation
thermal/drivers/imx_sc_thermal: Fix refcount leak in imx_sc_thermal_probe
thermal/core: Fix memory leak in __thermal_cooling_device_register()
dt-bindings: thermal: tsens: Add sc8280xp compatible
dt-bindings: thermal: lmh: Add Qualcomm sc8180x compatible
thermal/drivers/qcom/lmh: Add sc8180x compatible
thermal/drivers/rz2gl: Fix OTP Calibration Register values
dt-bindings: thermal: rzg2l-thermal: Document RZ/G2UL bindings
thermal: thermal_of: fix typo on __thermal_bind_params
tools/thermal: remove unneeded semicolon
tools/lib/thermal: remove unneeded semicolon
thermal/drivers/broadcom: Fix potential NULL dereference in sr_thermal_probe
tools/thermal: Add thermal daemon skeleton
tools/thermal: Add a temperature capture tool
tools/thermal: Add util library
...
Linus Torvalds [Tue, 24 May 2022 23:04:25 +0000 (16:04 -0700)]
Merge tag 'pm-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add support for 'artificial' Energy Models in which power
numbers for different entities may be in different scales, add support
for some new hardware, fix bugs and clean up code in multiple places.
Specifics:
- Update the Energy Model support code to allow the Energy Model to
be artificial, which means that the power values may not be on a
uniform scale with other devices providing power information, and
update the cpufreq_cooling and devfreq_cooling thermal drivers to
support artificial Energy Models (Lukasz Luba).
- Make DTPM check the Energy Model type (Lukasz Luba).
- Fix policy counter decrementation in cpufreq if Energy Model is in
use (Pierre Gondois).
- Add CPU-based scaling support to passive devfreq governor (Saravana
Kannan, Chanwoo Choi).
- Update the rk3399_dmc devfreq driver (Brian Norris).
- Export dev_pm_ops instead of suspend() and resume() in the IIO
chemical scd30 driver (Jonathan Cameron).
- Add namespace variants of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and
PM-runtime counterparts (Jonathan Cameron).
- Move symbol exports in the IIO chemical scd30 driver into the
IIO_SCD30 namespace (Jonathan Cameron).
- Allow dynamic debug to control printing of PM messages (David
Cohen).
- Fix some kernel-doc comments in hibernation code (Yang Li, Haowen
Bai).
- Preserve ACPI-table override during hibernation (Amadeusz
Sławiński).
- Improve support for suspend-to-RAM for PSCI OSI mode (Ulf Hansson).
- Make Intel RAPL power capping driver support the RaptorLake and
AlderLake N processors (Zhang Rui, Sumeet Pawnikar).
- Remove redundant store to value after multiply in the RAPL power
capping driver (Colin Ian King).
- Add AlderLake processor support to the intel_idle driver (Zhang
Rui).
- Fix regression leading to no genpd governor in the PSCI cpuidle
driver and fix the riscv-sbi cpuidle driver to allow a genpd
governor to be used (Ulf Hansson).
- Fix cpufreq governor clean up code to avoid using kfree() directly
to free kobject-based items (Kevin Hao).
- Prepare cpufreq for powerpc's asm/prom.h cleanup (Christophe
Leroy).
- Make intel_pstate notify frequency invariance code when no_turbo is
turned on and off (Chen Yu).
- Add Sapphire Rapids OOB mode support to intel_pstate (Srinivas
Pandruvada).
- Make cpufreq avoid unnecessary frequency updates due to mismatch
between hardware and the frequency table (Viresh Kumar).
- Make remove_cpu_dev_symlink() clear the real_cpus mask to simplify
code (Viresh Kumar).
- Rearrange cpufreq_offline() and cpufreq_remove_dev() to make the
calling convention for some driver callbacks consistent (Rafael
Wysocki).
- Avoid accessing half-initialized cpufreq policies from the show()
and store() sysfs functions (Schspa Shi).
- Rearrange cpufreq_offline() to make the calling convention for some
driver callbacks consistent (Schspa Shi).
- Update CPPC handling in cpufreq (Pierre Gondois).
- Move genpd's time-accounting to ktime_get_mono_fast_ns() (Ulf
Hansson).
- Improve the way genpd deals with its governors (Ulf Hansson).
- Update the turbostat utility to version 2022.04.16 (Len Brown, Dan
Merillat, Sumeet Pawnikar, Zephaniah E. Loss-Cutler-Hull, Chen Yu)"
* tag 'pm-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (94 commits)
PM: domains: Trust domain-idle-states from DT to be correct by genpd
PM: domains: Measure power-on/off latencies in genpd based on a governor
PM: domains: Allocate governor data dynamically based on a genpd governor
PM: domains: Clean up some code in pm_genpd_init() and genpd_remove()
PM: domains: Fix initialization of genpd's next_wakeup
PM: domains: Fixup QoS latency measurements for IRQ safe devices in genpd
PM: domains: Measure suspend/resume latencies in genpd based on governor
PM: domains: Move the next_wakeup variable into the struct gpd_timing_data
PM: domains: Allocate gpd_timing_data dynamically based on governor
PM: domains: Skip another warning in irq_safe_dev_in_sleep_domain()
PM: domains: Rename irq_safe_dev_in_no_sleep_domain() in genpd
PM: domains: Don't check PM_QOS_FLAG_NO_POWER_OFF in genpd
PM: domains: Drop redundant code for genpd always-on governor
PM: domains: Add GENPD_FLAG_RPM_ALWAYS_ON for the always-on governor
powercap: intel_rapl: remove redundant store to value after multiply
cpufreq: CPPC: Enable dvfs_possible_from_any_cpu
cpufreq: CPPC: Enable fast_switch
ACPI: CPPC: Assume no transition latency if no PCCT
ACPI: bus: Set CPPC _OSC bits for all and when CPPC_LIB is supported
ACPI: CPPC: Check _OSC for flexible address space
...
Linus Torvalds [Tue, 24 May 2022 22:46:55 +0000 (15:46 -0700)]
Merge tag 'acpi-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These update the ACPICA kernel code to upstream revision 20220331,
improve handling of PCI devices that are in D3cold during system
initialization, add support for a few features, fix bugs and clean up
code.
Specifics:
- Update ACPICA code in the kernel to upstream revision 20220331
including the following changes:
- Add support for the Windows 11 _OSI string (Mario Limonciello)
- Add the CFMWS subtable to the CEDT table (Lawrence Hileman).
- iASL: NHLT: Treat Terminator as specific_config (Piotr
Maziarz).
- iASL: NHLT: Fix parsing undocumented bytes at the end of
Endpoint Descriptor (Piotr Maziarz).
- iASL: NHLT: Rename Linux-specific structures to device_info
(Piotr Maziarz).
- Add new ACPI 6.4 semantics to Load() and LoadTable() (Bob
Moore).
- Clean up double word in comment (Tom Rix).
- Update copyright notices to the year 2022 (Bob Moore).
- Remove some tabs and // comments - automated cleanup (Bob
Moore).
- Replace zero-length array with flexible-array member (Gustavo
A. R. Silva).
- Interpreter: Add units to time variable names (Paul Menzel).
- Add support for ARM Performance Monitoring Unit Table (Besar
Wicaksono).
- Inform users about ACPI spec violation related to sleep length
(Paul Menzel).
- iASL/MADT: Add OEM-defined subtable (Bob Moore).
- Interpreter: Fix some typo mistakes (Selvarasu Ganesan).
- Updates for revision E.d of IORT (Shameer Kolothum).
- Use ACPI_FORMAT_UINT64 for 64-bit output (Bob Moore).
- Improve debug messages in the ACPI device PM code (Rafael Wysocki).
- Block ASUS B1400CEAE from suspend to idle by default (Mario
Limonciello).
- Improve handling of PCI devices that are in D3cold during system
initialization (Rafael Wysocki).
- Fix BERT error region memory mapping (Lorenzo Pieralisi).
- Add support for NVIDIA 16550-compatible port subtype to the SPCR
parsing code (Jeff Brasen).
- Use static for BGRT_SHOW kobj_attribute defines (Tom Rix).
- Fix missing prototype warning for acpi_agdi_init() (Ilkka
Koskinen).
- Fix missing ERST record ID in the APEI code (Liu Xinpeng).
- Make APEI error injection refuse to inject into the zero page
(Tony Luck).
- Correct description of INT3407 / INT3532 DPTF attributes in sysfs
(Sumeet Pawnikar).
- Add support for high frequency impedance notification to the DPTF
driver (Sumeet Pawnikar).
- Make mp_config_acpi_gsi() a void function (Li kunyu).
- Unify Package () representation for properties in the ACPI device
properties documentation (Andy Shevchenko).
- Include UUID in _DSM evaluation warning (Michael Niewöhner)"
* tag 'acpi-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (41 commits)
Revert "ACPICA: executer/exsystem: Warn about sleeps greater than 10 ms"
ACPI: utils: include UUID in _DSM evaluation warning
ACPI: PM: Block ASUS B1400CEAE from suspend to idle by default
x86: ACPI: Make mp_config_acpi_gsi() a void function
ACPI: DPTF: Add support for high frequency impedance notification
ACPI: AGDI: Fix missing prototype warning for acpi_agdi_init()
ACPI: bus: Avoid non-ACPI device objects in walks over children
ACPI: DPTF: Correct description of INT3407 / INT3532 attributes
ACPI: BGRT: use static for BGRT_SHOW kobj_attribute defines
ACPI, APEI, EINJ: Refuse to inject into the zero page
ACPI: PM: Always print final debug message in acpi_device_set_power()
ACPI: SPCR: Add support for NVIDIA 16550-compatible port subtype
ACPI: docs: enumeration: Unify Package () for properties (part 2)
ACPI: APEI: Fix missing ERST record id
ACPICA: Update version to 20220331
ACPICA: exsystem.c: Use ACPI_FORMAT_UINT64 for 64-bit output
ACPICA: IORT: Updates for revision E.d
ACPICA: executer/exsystem: Fix some typo mistakes
ACPICA: iASL/MADT: Add OEM-defined subtable
ACPICA: executer/exsystem: Warn about sleeps greater than 10 ms
...
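One of the ACPICA items in the pull above replaces a zero-length array with a flexible-array member. The sketch below shows what that C99 construct looks like and how such a structure is typically allocated. It is illustrative only: struct table_sketch and table_alloc() are hypothetical names, not ACPICA definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table layout; not an ACPICA structure. */
struct table_sketch {
	size_t count;
	unsigned int entries[];	/* C99 flexible-array member; the older
				 * GNU idiom was entries[0] */
};

/* Allocate the header plus room for 'count' trailing entries. */
struct table_sketch *table_alloc(size_t count)
{
	struct table_sketch *t;

	t = malloc(sizeof(*t) + count * sizeof(t->entries[0]));
	if (!t)
		return NULL;
	t->count = count;
	memset(t->entries, 0, count * sizeof(t->entries[0]));
	return t;
}

/* Small usage example: allocate, write the last entry, free. */
void table_selfcheck(void)
{
	struct table_sketch *t = table_alloc(4);

	assert(t != NULL);
	t->entries[3] = 42;
	assert(t->entries[3] == 42);
	free(t);
}
```

The flexible-array form is standard C and must be the last member of the structure, which lets the compiler reject layouts that the zero-length GNU extension would silently accept.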
Linus Torvalds [Tue, 24 May 2022 22:21:15 +0000 (15:21 -0700)]
Merge tag 'for-linus-2022052401' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID updates from Jiri Kosina:
- support for pens with 3 buttons with Wacom driver (Joshua Dickens)
- support for HID_DG_SCANTIME to report the timestamp for pen and touch
events in Wacom driver (Joshua Dickens)
- support for sensor discovery in amd-sfh driver (Basavaraj Natikar)
- support for wider variety of Huion tablets ported from DIGImend
project (José Expósito, Nikolai Kondrashov)
- new device IDs and other assorted small code cleanups
* tag 'for-linus-2022052401' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (44 commits)
HID: apple: Properly handle function keys on Keychron keyboards
HID: uclogic: Switch to Digitizer usage for styluses
HID: uclogic: Add pen support for XP-PEN Star 06
HID: uclogic: Differentiate touch ring and touch strip
HID: uclogic: Always shift touch reports to zero
HID: uclogic: Do not focus on touch ring only
HID: uclogic: Return raw parameters from v2 pen init
HID: uclogic: Move param printing to a function
HID: core: Display "SENSOR HUB" for sensor hub bus string in hid_info
HID: amd_sfh: Move bus declaration outside of amd-sfh
HID: amd_sfh: Add physical location to HID device
HID: amd_sfh: Modify the hid name
HID: amd_sfh: Modify the bus name
HID: amd_sfh: Add sensor name by index for debug info
HID: amd_sfh: Add support for sensor discovery
HID: bigben: fix slab-out-of-bounds Write in bigben_probe
Hid: wacom: Fix kernel test robot warning
HID: uclogic: Disable pen usage for Huion keyboard interfaces
HID: uclogic: Support disabling pen usage
HID: uclogic: Pass keyboard reports as is
...
Linus Torvalds [Tue, 24 May 2022 22:13:30 +0000 (15:13 -0700)]
Merge tag 'spi-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
"This is quite a quiet release but some new drivers mean that the
diffstat is fairly large. The new drivers include the aspeed driver
which is migrated from MTD as part of the ongoing move of controllers
with specialised support for SPI flashes into the SPI subsystem.
- Support for devices which flip CPHA during receive only transfers
(eg, if MOSI and MISO have inverted polarity).
- Overhaul of the i.MX driver, including the addition of PIO support
for better performance on small transfers.
- Migration of the Aspeed driver from MTD.
- Support for Aspeed AST2400, Ingenic JZ4775 and X1/2000 and MediaTek
IPM and SFI"
* tag 'spi-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (84 commits)
spi: spi-au1550: replace ternary operator with min()
mtd: spi-nor: aspeed: set the decoding size to at least 2MB for AST2600
spi: aspeed: Calibrate read timings
spi: aspeed: Add support for the AST2400 SPI controller
spi: aspeed: Workaround AST2500 limitations
spi: aspeed: Adjust direct mapping to device size
spi: aspeed: Add support for direct mapping
spi: spi-mem: Convert Aspeed SMC driver to spi-mem
spi: Convert the Aspeed SMC controllers device tree binding
spi: spi-cadence: Update ISR status variable type to irqreturn_t
spi: Doc fix - Describe add_lock and dma_map_dev in spi_controller
spi: cadence-quadspi: Handle spi_unregister_master() in remove()
spi: stm32-qspi: Remove SR_BUSY bit check before sending command
spi: stm32-qspi: Always check SR_TCF flags in stm32_qspi_wait_cmd()
spi: stm32-qspi: Fix wait_cmd timeout in APM mode
spi: cadence-quadspi: remove unnecessary (void *) casts
spi: cadence-quadspi: Add missing blank line in cqspi_request_mmap_dma()
spi: spi-imx: mx51_ecspi_prepare_message(): skip writing MX51_ECSPI_CONFIG register if unchanged
spi: spi-imx: add PIO polling support
spi: spi-imx: replace struct spi_imx_data::bitbang by pointer to struct spi_controller
...
Linus Torvalds [Tue, 24 May 2022 22:09:47 +0000 (15:09 -0700)]
Merge tag 'regulator-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
"This is mostly a drivers update including a couple of new drivers but
we do have some fixes and improvements to the core as well.
- Make sure we don't log spuriously about uncontrollable regulators.
- Don't use delays when we should use sleeps for regulators with
larger ramp times.
- Support for MediaTek MT6358 and MT6366, Richtek RT5759 and Silicon
Mitus SM5703"
* tag 'regulator-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (36 commits)
regulator: scmi: Fix refcount leak in scmi_regulator_probe
regulator: pfuze100: Fix refcount leak in pfuze_parse_regulators_dt
regulator: qcom_smd: Fix up PM8950 regulator configuration
regulator: core: Fix enable_count imbalance with EXCLUSIVE_GET
regulator: core: Add error flags to sysfs attributes
regulator: dt-bindings: qcom,rpmh: document vdd-l7-bob-supply on PMR735A
regulator: dt-bindings: qcom,rpmh: document supplies per variant
regulator: dt-bindings: qcom,rpmh: update maintainers
regulator: mt6315: Enforce regulator-compatible, not name
regulator: pca9450: Enable DVS control via PMIC_STBY_REQ
regulator: pca9450: Make warm reset on WDOG_B assertion
regulator: Add property for WDOG_B warm reset
regulator: pca9450: Make I2C Level Translator configurable
regulator: Add property for I2C level shifter
regulator: sm5703: Correct reference to the common regulator schema
regulator: sm5703-regulator: Add regulators support for SM5703 MFD
dt-bindings: regulator: Add bindings for Silicon Mitus SM5703 regulators
regulator: richtek,rt4801: parse GPIOs per regulator
regulator: dt-bindings: richtek,rt4801: use existing ena_gpiod feature
regulator: core: Sleep (not delay) in set_voltage()
...
Linus Torvalds [Tue, 24 May 2022 22:02:58 +0000 (15:02 -0700)]
Merge tag 'regmap-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"The main change here is Marek's addition of bulk read/write callbacks
for individual regmaps, we've supported single register operations for
a while but there's enough hardware out there which can use bulk
equivalents to make it worthwhile"
* tag 'regmap-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Add missing map->bus check
regmap: Add bulk read/write callbacks into regmap_config
regmap: cache: set max_register with reg_stride
regmap: Constify static regmap_bus structs
Linus Torvalds [Tue, 24 May 2022 21:56:38 +0000 (14:56 -0700)]
Merge tag 'mmc-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC updates from Ulf Hansson:
"MMC core:
- Support zero-out using TRIM for eMMC
- Allow overriding the busy-timeout for the ioctl-cmds
MMC host:
- Continued the conversion of DT bindings into the JSON schema
- jz4740: Apply DMA engine limits to maximum segment size
- mmci_stm32: Use a buffer for unaligned DMA requests
- mmc_spi: Enabled high-speed modes via parsing of DT
- omap: Make clock management compliant with CCF
- renesas_sdhi:
- Support eMMC HS400 mode for R-Car V3H ES2.0
- Don't allow support for eMMC HS400 for R-Car V3M/D3
- sdhci_am654: Fix problem when SD card slot lacks the card detect
line
- sdhci-esdhc-imx: Add support for the imx8dxl variant
- sdhci-brcmstb: Enable support for clock gating to save power
- sdhci-msm:
- Add support for the sdx65 variant
- Add support for the sm8150 variant
- sdhci-of-dwcmshc: Add support for the Rockchip rk3588 variant
- sdhci-pci-gli: Add workaround to allow GL9755 to enter ASPM L1.2"
* tag 'mmc-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (52 commits)
mmc: sdhci-of-arasan: Add NULL check for data field
mmc: core: Support zeroout using TRIM for eMMC
mmc: sdhci-brcmstb: Fix compiler warning
mmc: sdhci-msm: Add compatible string check for sdx65
dt-bindings: mmc: sdhci-msm: Document the SDX65 compatible
mmc: sdhci-msm: Add compatible string check for sm8150
dt-bindings: mmc: sdhci-msm: Add compatible string for sm8150
mmc: sdhci-msm: Add SoC specific compatibles
dt-bindings: mmc: sdhci-msm: Convert bindings to yaml
dt-bindings: mmc: brcm,sdhci-brcmstb: cleanup example
dt-bindings: mmc: brcm,sdhci-brcmstb: correct number of reg entries
mmc: sdhci-brcmstb: Enable Clock Gating to save power
mmc: sdhci-brcmstb: Re-organize flags
mmc: mmci: Remove custom ios handler
mmc: atmel-mci: Simplify if(chan) and if(!chan)
mmc: core: use kobj_to_dev()
dt-bindings: mmc: sdhci-of-dwcmhsc: Add rk3588
mmc: core: Add CIDs for cards to the entropy pool
mmc: core: Allows to override the timeout value for ioctl() path
mmc: sdhci-omap: Use of_device_get_match_data() helper
...
Linus Torvalds [Tue, 24 May 2022 21:50:54 +0000 (14:50 -0700)]
Merge tag 'for-linus-4.19-1' of https://github.com/cminyard/linux-ipmi
Pull IPMI update from Corey Minyard:
"Add limits on the number of users and messages, plus sysfs interfaces
to control those limits.
Other than that, little cleanups, use dev_xxx() instead of pr_xxx(),
create initializers for structures, fix a refcount leak, etc"
* tag 'for-linus-4.19-1' of https://github.com/cminyard/linux-ipmi:
ipmi:ipmb: Fix refcount leak in ipmi_ipmb_probe
ipmi: remove unnecessary type castings
ipmi: Make two logs unique
ipmi:si: Convert pr_debug() to dev_dbg()
ipmi: Convert pr_debug() to dev_dbg()
ipmi: Fix pr_fmt to avoid compilation issues
ipmi: Add an intializer for ipmi_recv_msg struct
ipmi: Add an intializer for ipmi_smi_msg struct
ipmi:ssif: Check for NULL msg when handling events and messages
ipmi: use simple i2c probe function
ipmi: Add a sysfs count of total outstanding messages for an interface
ipmi: Add a sysfs interface to view the number of users
ipmi: Limit the number of message a user may have outstanding
ipmi: Add a limit on the number of users that may use IPMI
Linus Torvalds [Tue, 24 May 2022 21:31:29 +0000 (14:31 -0700)]
Merge tag 'mtd/for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd updates from Miquel Raynal:
"MTD core changes:
- Call of_platform_populate() for MTD partitions
- Check devicetree alias for index
- mtdoops:
- Add a timestamp to the mtdoops header.
- Create a header structure for the saved mtdoops.
- Fix the size of the header read buffer.
- mtdblock: Warn if opened on NAND
- Bindings:
- reserved-memory: Support MTD/block device
- jedec,spi-nor: remove unneeded properties
- Extend fixed-partitions binding
- Add Sercomm (Suzhou) Corporation vendor prefix
MTD driver changes:
- st_spi_fsm: add missing clk_disable_unprepare() in stfsm_remove()
- phram:
- Allow cached mappings
- Allow probing via reserved-memory
- maps: ixp4xx: Drop driver
- bcm47xxpart: Print correct offset on read error
CFI driver changes:
- Rename chip_ready variables
- Add S29GL064N ID definition
- Use chip_ready() for write on S29GL064N
- Move and rename chip_check/chip_ready/chip_good_for_write
NAND core changes:
- Print offset instead of page number for bad blocks
Raw NAND controller drivers:
- Cadence: Fix possible null-ptr-deref in cadence_nand_dt_probe()
- CS553X: simplify the return expression of cs553x_write_ctrl_byte()
- Davinci: Remove redundant unsigned comparison to zero
- Denali: Use managed device resources
- GPMI:
- Add large oob bch setting support
- Rename the variable ecc_chunk_size
- Uninline the gpmi_check_ecc function
- Add strict ecc strength check
- Refactor BCH geometry settings function
- Intel: Fix possible null-ptr-deref in ebu_nand_probe()
- MPC5121: Check before clk_disable_unprepare() not needed
- Mtk:
- MTD_NAND_ECC_MEDIATEK should depend on ARCH_MEDIATEK
- Also parse the default nand-ecc-engine property if available
- Make mtk_ecc.c a separated module
- OMAP ELM:
- Convert the bindings to yaml
- Describe the bindings for AM64 ELM
- Add support for its compatible
- Renesas: Use runtime PM instead of the raw clock API and update the
bindings accordingly
- Rockchip: Check before clk_disable_unprepare() not needed
- TMIO: Check return value after calling platform_get_resource()
Raw NAND chip driver:
- Kioxia: Add support for TH58NVG3S0HBAI4 and TC58NVG0S3HTA00
SPI-NAND chip drivers:
- Gigadevice:
- Add support for:
- GD5FxGM7xExxG
- GD5F{2,4}GQ5xExxG
- GD5F1GQ5RExxG
- GD5FxGQ4xExxG
- Fix Quad IO for GD5F1GQ5UExxG
- XTX: Add support for XT26G0xA
SPI NOR core changes:
- Read back written SR value to make sure the write was done
correctly.
- Introduce a common function for Read ID that manufacturer drivers
can use to verify the Octal DTR switch worked correctly.
- Add helpers for read/write any register commands so manufacturer
drivers don't open code it every time.
- Clarify rdsr dummy cycles documentation.
- Add debugfs entry to expose internal flash parameters and state.
SPI NOR manufacturer drivers changes:
- Add support for Winbond W25Q512NW-IM, and Eon EN25QH256A.
- Move spi_nor_write_ear() to Winbond module since only Winbond
flashes use it.
- Rework Micron and Cypress Octal DTR enable methods to improve
readability.
- Use the common Read ID function to verify switch to Octal DTR mode
for Micron and Cypress flashes.
- Skip polling status on volatile register writes for Micron and
Cypress flashes since the operation is instant"
* tag 'mtd/for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (68 commits)
mtd: st_spi_fsm: add missing clk_disable_unprepare() in stfsm_remove()
dt-bindings: mtd: partitions: Extend fixed-partitions binding
dt-bindings: Add Sercomm (Suzhou) Corporation vendor prefix
mtd: phram: Allow cached mappings
mtd: call of_platform_populate() for MTD partitions
mtd: rawnand: renesas: Use runtime PM instead of the raw clock API
dt-bindings: mtd: renesas: Fix the NAND controller description
mtd: rawnand: mpc5121: Check before clk_disable_unprepare() not needed
mtd: rawnand: rockchip: Check before clk_disable_unprepare() not needed
mtd: nand: MTD_NAND_ECC_MEDIATEK should depend on ARCH_MEDIATEK
mtd: rawnand: cs553x: simplify the return expression of cs553x_write_ctrl_byte()
mtd: rawnand: kioxia: Add support for TH58NVG3S0HBAI4
mtd: spi-nor: debugfs: fix format specifier
mtd: spi-nor: support eon en25qh256a variant
mtd: spi-nor: winbond: add support for W25Q512NW-IM
mtd: spi-nor: expose internal parameters via debugfs
mtd: spi-nor: export spi_nor_hwcaps_pp2cmd()
mtd: spi-nor: move spi_nor_write_ear() to winbond module
mtd: spi-nor: amend the rdsr dummy cycles documentation
mtd: cfi_cmdset_0002: Rename chip_ready variables
...
Linus Torvalds [Tue, 24 May 2022 21:23:10 +0000 (14:23 -0700)]
Merge tag 'hwmon-for-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon updates from Guenter Roeck:
"New drivers:
- Driver for the Microchip LAN966x SoC
- PMBus driver for Infineon Digital Multi-phase xdp152 family
controllers
Chip support added to existing drivers:
- asus-ec-sensors:
- Support for ROG STRIX X570-E GAMING WIFI II, PRIME X470-PRO, and
ProArt X570 Creator WIFI
- External temperature sensor support for ASUS WS X570-ACE
- nct6775:
- Support for I2C driver
- Support for ASUS PRO H410T / PRIME H410M-R /
ROG X570-E GAMING WIFI II
- lm75:
- Support for Atmel AT30TS74
- pmbus/max16601:
- Support for MAX16602
- aquacomputer_d5next:
- Support for Aquacomputer Farbwerk
- Support for Aquacomputer Octo
- jc42:
- Support for S-34TS04A
Kernel API changes / clarifications:
- The chip parameter of with_info API is now mandatory
- New hwmon_device_register_for_thermal API call for use by the
thermal subsystem
Improvements:
- PMBus and JC42 drivers now register with thermal subsystem
- PMBus drivers now support get_voltage/set_voltage power operations
- The adt7475 driver now supports pin configuration
- The lm90 driver now supports setting extended range temperatures
configuration with a devicetree property
- The dell-smm driver now registers as cooling device
- The OCC driver delays hwmon registration until requested by
userspace
... and various other minor fixes and improvements"
* tag 'hwmon-for-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (71 commits)
hwmon: (aquacomputer_d5next) Fix an error handling path in aqc_probe()
hwmon: (sl28cpld) Fix typo in comment
hwmon: (pmbus) Check PEC support before reading other registers
hwmon: (dimmtemp) Fix bitmap handling
hwmon: (lm90) enable extended range according to DTS node
dt-bindings: hwmon: lm90: add ti,extended-range-enable property
dt-bindings: hwmon: lm90: add missing ti,tmp461
hwmon: (ibmaem) Directly use ida_alloc()/free()
hwmon: Directly use ida_alloc()/free()
hwmon: (asus-ec-sensors) fix Formula VIII definition
dt-bindings: trivial-devices: Add xdp152
hwmon: (sl28cpld-hwmon) Use HWMON_CHANNEL_INFO macro
hwmon: (pwm-fan) Use HWMON_CHANNEL_INFO macro
hwmon: (peci/dimmtemp) Use HWMON_CHANNEL_INFO macro
hwmon: (peci/cputemp) Use HWMON_CHANNEL_INFO macro
hwmon: (mr75203) Use HWMON_CHANNEL_INFO macro
hwmon: (ltc2992) Use HWMON_CHANNEL_INFO macro
hwmon: (as370-hwmon) Use HWMON_CHANNEL_INFO macro
hwmon: Make chip parameter for with_info API mandatory
thermal/drivers/thermal_hwmon: Use hwmon_device_register_for_thermal()
...
Linus Torvalds [Tue, 24 May 2022 20:50:39 +0000 (13:50 -0700)]
Merge tag 'integrity-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull IMA updates from Mimi Zohar:
"New is IMA support for including fs-verity file digests and signatures
in the IMA measurement list as well as verifying the fs-verity file
digest based signatures, both based on policy.
In addition, there are two bug fixes:
- avoid reading UEFI variables, which cause a page fault, on Apple
Macs with T2 chips.
- remove the original "ima" template Kconfig option to address a boot
command line ordering issue.
The rest is a mixture of code/documentation cleanup"
* tag 'integrity-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
integrity: Fix sparse warnings in keyring_handler
evm: Clean up some variables
evm: Return INTEGRITY_PASS for enum integrity_status value '0'
efi: Do not import certificates from UEFI Secure Boot for T2 Macs
fsverity: update the documentation
ima: support fs-verity file digest based version 3 signatures
ima: permit fsverity's file digests in the IMA measurement list
ima: define a new template field named 'd-ngv2' and templates
fs-verity: define a function to return the integrity protected file digest
ima: use IMA default hash algorithm for integrity violations
ima: fix 'd-ng' comments and documentation
ima: remove the IMA_TEMPLATE Kconfig option
ima: remove redundant initialization of pointer 'file'.
Linus Torvalds [Tue, 24 May 2022 20:16:50 +0000 (13:16 -0700)]
Merge tag 'tpmdd-next-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm updates from Jarkko Sakkinen:
- Tightened validation of key hashes for SYSTEM_BLACKLIST_HASH_LIST. An
invalid hash format now causes a compilation error. Previously, invalid
hashes were included in the kernel binary but silently ignored at run time.
- Allow root user to append new hashes to the blacklist keyring.
- Trusted keys backed with Cryptographic Acceleration and Assurance
Module (CAAM), which is part of some of the newer NXP SoCs. There are
now three hardware backends in total for trusted keys: TPM, ARM TEE and
CAAM.
- A scattered set of fixes and small improvements for the TPM driver.
* tag 'tpmdd-next-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
MAINTAINERS: add KEYS-TRUSTED-CAAM
doc: trusted-encrypted: describe new CAAM trust source
KEYS: trusted: Introduce support for NXP CAAM-based trusted keys
crypto: caam - add in-kernel interface for blob generator
crypto: caam - determine whether CAAM supports blob encap/decap
KEYS: trusted: allow use of kernel RNG for key material
KEYS: trusted: allow use of TEE as backend without TCG_TPM support
tpm: Add field upgrade mode support for Infineon TPM2 modules
tpm: Fix buffer access in tpm2_get_tpm_pt()
char: tpm: cr50_i2c: Suppress duplicated error message in .remove()
tpm: cr50: Add new device/vendor ID 0x504a6666
tpm: Remove read16/read32/write32 calls from tpm_tis_phy_ops
tpm: ibmvtpm: Correct the return value in tpm_ibmvtpm_probe()
tpm/tpm_ftpm_tee: Return true/false (not 1/0) from bool functions
certs: Explain the rationale to call panic()
certs: Allow root user to append signed hashes to the blacklist keyring
certs: Check that builtin blacklist hashes are valid
certs: Make blacklist_vet_description() more strict
certs: Factor out the blacklist hash creation
tools/certs: Add print-cert-tbs-hash.sh
Linus Torvalds [Tue, 24 May 2022 20:09:13 +0000 (13:09 -0700)]
Merge tag 'landlock-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull Landlock updates from Mickaël Salaün:
- improve the path_rename LSM hook implementations for RENAME_EXCHANGE;
- fix a too-restrictive filesystem control for a rare corner case;
- set the nested sandbox limitation to 16 layers;
- add a new LANDLOCK_ACCESS_FS_REFER access right to properly handle
file reparenting (i.e. full rename and link support);
- add new tests and documentation;
- format code with clang-format to make it easier to maintain and
contribute.
* tag 'landlock-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: (30 commits)
landlock: Explain how to support Landlock
landlock: Add design choices documentation for filesystem access rights
landlock: Document good practices about filesystem policies
landlock: Document LANDLOCK_ACCESS_FS_REFER and ABI versioning
samples/landlock: Add support for file reparenting
selftests/landlock: Add 11 new test suites dedicated to file reparenting
landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER
LSM: Remove double path_rename hook calls for RENAME_EXCHANGE
landlock: Move filesystem helpers and add a new one
landlock: Fix same-layer rule unions
landlock: Create find_rule() from unmask_layers()
landlock: Reduce the maximum number of layers to 16
landlock: Define access_mask_t to enforce a consistent access mask size
selftests/landlock: Test landlock_create_ruleset(2) argument check ordering
landlock: Change landlock_restrict_self(2) check ordering
landlock: Change landlock_add_rule(2) argument check ordering
selftests/landlock: Add tests for O_PATH
selftests/landlock: Fully test file rename with "remove" access
selftests/landlock: Extend access right tests to directories
selftests/landlock: Add tests for unknown access rights
...
Linus Torvalds [Tue, 24 May 2022 20:06:32 +0000 (13:06 -0700)]
Merge tag 'selinux-pr-20220523' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
"We've got twelve patches queued for v5.19, with most being fairly
minor. The highlights are below:
- The checkreqprot and runtime disable knobs have been deprecated for
some time with no active users that we can find. In an effort to
move things along we are adding a pause when the knobs are used to
help make the deprecation more noticeable in case anyone is still
using these hacks in the shadows.
- We've added the anonymous inode class name to the AVC audit records
when anonymous inodes are involved. This should make writing policy
easier when anonymous inodes are involved.
- More constification work. This is fairly straightforward and the
source of most of the diffstat.
- The usual minor cleanups: remove unnecessary assignments, assorted
style/checkpatch fixes, kdoc fixes, macro while-loop
encapsulations, #include tweaks, etc"
* tag 'selinux-pr-20220523' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
security: declare member holding string literal const
selinux: log anon inode class name
selinux: declare data arrays const
selinux: fix indentation level of mls_ops block
selinux: include necessary headers in headers
selinux: avoid extra semicolon
selinux: update parameter documentation
selinux: resolve checkpatch errors
selinux: don't sleep when CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE is true
selinux: checkreqprot is deprecated, add some ssleep() discomfort
selinux: runtime disable is deprecated, add some ssleep() discomfort
selinux: Remove redundant assignments
Linus Torvalds [Tue, 24 May 2022 19:49:48 +0000 (12:49 -0700)]
Merge tag 'execve-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve updates from Kees Cook:
- Fix binfmt_flat GOT handling for riscv (Niklas Cassel)
- Remove unused/broken binfmt_flat shared library and coredump code
(Eric W. Biederman)
* tag 'execve-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
binfmt_flat: Remove shared library support
binfmt_flat: Drop vestiges of coredump support
binfmt_flat: do not stop relocating GOT entries prematurely on riscv
- Gracefully handle failed unshare() in selftests (Yang Guang)
- Spelling fix (Colin Ian King)
* tag 'seccomp-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
selftests/seccomp: Fix spelling mistake "Coud" -> "Could"
selftests/seccomp: Add test for wait killable notifier
selftests/seccomp: Refactor get_proc_stat to split out file reading code
seccomp: Add wait_killable semantic to seccomp user notifier
selftests/seccomp: Ensure that notifications come in FIFO order
seccomp: Use FIFO semantics to order notifications
selftests/seccomp: Add SKIP for failed unshare()
selftests/seccomp: Test PTRACE_O_SUSPEND_SECCOMP without CAP_SYS_ADMIN
Eric Biggers [Thu, 19 May 2022 20:44:37 +0000 (13:44 -0700)]
ext4: only allow test_dummy_encryption when supported
Make the test_dummy_encryption mount option require that the encrypt
feature flag be already enabled on the filesystem, rather than
automatically enabling it. Practically, this means that "-O encrypt"
will need to be included in MKFS_OPTIONS when running xfstests with the
test_dummy_encryption mount option. (ext4/053 also needs an update.)
Moreover, as long as the preconditions for test_dummy_encryption are
being tightened anyway, take the opportunity to start rejecting it when
!CONFIG_FS_ENCRYPTION rather than ignoring it.
The motivation for requiring the encrypt feature flag is that:
- Having the filesystem auto-enable feature flags is problematic, as it
bypasses the usual sanity checks. The specific issue which came up
recently is that in kernel versions where ext4 supports casefold but
not encrypt+casefold (v5.1 through v5.10), the kernel will happily add
the encrypt flag to a filesystem that has the casefold flag, making it
unmountable -- but only for subsequent mounts, not the initial one.
This confused the casefold support detection in xfstests, causing
generic/556 to fail rather than be skipped.
- The xfstests-bld test runners (kvm-xfstests et al.) already use the
required mkfs flag, so they will not be affected by this change. Only
users of test_dummy_encryption alone will be affected. But, this
option has always been for testing only, so it should be fine to
require that the few users of this option update their test scripts.
- f2fs already requires it (for its equivalent feature flag).
In the ext4_valid_extent_entries function,
if prev is 0, no error is returned even if lblock<=prev.
This was intended to skip the check on the first extent, but
in the error image above, prev=0+1-1=0 when checking the second extent,
so even though lblock<=prev, the function does not return an error.
As a result, a BUG_ON triggers in __es_tree_search and the system panics.
To solve this problem, we only need to check that:
1. The lblock of the first extent is not less than 0.
2. The lblock of the next extent is not less than
the next block of the previous extent.
The same applies to extent_idx.
Cc: stable@kernel.org Fixes: 78ba52c464cb ("ext4: check for overlapping extents in ext4_valid_extent_entries()") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220518120816.1541863-1-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
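The two checks listed in the extent-overlap fix above can be sketched in user-space C. This is only an illustration of the idea, not the kernel code: `struct extent` and `extents_valid()` are hypothetical names, and the per-extent validity checks that the real function also performs are left out. Tracking the first block not yet covered (instead of the last covered block) removes the need for a special case when the previous extent ends at block 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified extent: first logical block and length in blocks. */
struct extent {
	uint32_t lblock;
	uint32_t len;
};

/*
 * Return true when the extents are sorted and do not overlap.
 * 'cur' is the first logical block not yet covered, so the first extent
 * is checked against 0 and every later extent against the end of the
 * previous one.
 */
static bool extents_valid(const struct extent *ext, size_t n)
{
	uint64_t cur = 0;

	assert(ext != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (ext[i].lblock < cur)
			return false;	/* starts inside the previous extent */
		cur = (uint64_t)ext[i].lblock + ext[i].len;
	}
	return true;
}
```

With this form, two extents that both start at block 0 with length 1 are rejected, which is the case the description says slipped through before.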
Jan Kara [Wed, 18 May 2022 09:33:29 +0000 (11:33 +0200)]
ext4: avoid cycles in directory h-tree
A maliciously corrupted filesystem can contain cycles in the h-tree
stored inside a directory. That can easily lead to the kernel corrupting
tree nodes that were already verified under its hands while doing a node
split and consequently accessing unallocated memory. Fix the problem by
verifying traversed block numbers are unique.
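A minimal user-space sketch of the "traversed block numbers are unique" idea from the h-tree commit above. It assumes a toy tree where `next[b]` gives the block a lookup descends to from index block `b`. The names `walk_is_acyclic`, `HTREE_LEAF` and `MAX_LEVELS` are hypothetical and do not come from ext4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LEVELS 4		/* hypothetical bound on tree depth */
#define HTREE_LEAF UINT32_MAX	/* hypothetical "reached a leaf" marker */

/*
 * Follow the descent from 'root' and remember every index block visited.
 * A block number seen twice on the same path means the tree has a cycle.
 */
static bool walk_is_acyclic(const uint32_t *next, size_t nblocks, uint32_t root)
{
	uint32_t visited[MAX_LEVELS];
	size_t depth = 0;
	uint32_t block = root;

	assert(next != NULL);
	while (block != HTREE_LEAF) {
		if (block >= nblocks || depth == MAX_LEVELS)
			return false;	/* out of range or deeper than allowed */
		for (size_t i = 0; i < depth; i++)
			if (visited[i] == block)
				return false;	/* block already on this path */
		visited[depth++] = block;
		block = next[block];
	}
	return true;
}
```

Because the path length is bounded by the tree depth, a small fixed-size array and a linear scan are enough. No hash set is needed.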
Theodore Ts'o [Tue, 17 May 2022 17:27:55 +0000 (13:27 -0400)]
ext4: filter out EXT4_FC_REPLAY from on-disk superblock field s_state
The EXT4_FC_REPLAY bit in sbi->s_mount_state is used to indicate that
we are in the middle of replaying the fast commit journal. This was
actually a mistake, since sbi->s_mount_state is initialized from
es->s_state. Arguably s_mount_state is misleadingly named, but the
name is historical --- s_mount_state and s_state date back to ext2.
What should have been used is the ext4_{set,clear,test}_mount_flag()
inline functions, which set EXT4_MF_* bits in sbi->s_mount_flags.
The problem with using EXT4_FC_REPLAY is that a maliciously corrupted
superblock could result in EXT4_FC_REPLAY getting set in
s_mount_state. This bypasses some sanity checks, and this can trigger
a BUG() in ext4_es_cache_extent(). As an easy-to-backport fix, filter
out the EXT4_FC_REPLAY bit for now. We should eventually transition
away from EXT4_FC_REPLAY to something like EXT4_MF_REPLAY.
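The filtering described in the EXT4_FC_REPLAY commit above amounts to masking one bit when the in-memory state is loaded from the on-disk field. The sketch below shows that masking in plain C. The macro names, the bit values and `load_mount_state()` are illustrative assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative state bits; treat the names and values as assumptions. */
#define STATE_VALID_FS   0x0001u
#define STATE_FC_REPLAY  0x0020u

static_assert((STATE_VALID_FS & STATE_FC_REPLAY) == 0,
	      "state bits must not overlap");

/*
 * Derive the in-memory state from the on-disk field, dropping the
 * runtime-only replay bit so a corrupted superblock cannot set it.
 */
static uint16_t load_mount_state(uint16_t on_disk_state)
{
	return (uint16_t)(on_disk_state & ~STATE_FC_REPLAY);
}
```

All other bits pass through unchanged, so a clean filesystem still reports as valid after the filter.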
Bob Peterson [Fri, 11 Feb 2022 15:50:36 +0000 (10:50 -0500)]
gfs2: Convert function bh_get to use iomap
Before this patch, function bh_get used block_map to figure out the
block it needed to read in from the quota_change file. This patch
changes it to use iomap directly to make it more efficient.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Bob Peterson [Fri, 11 Feb 2022 15:50:08 +0000 (10:50 -0500)]
gfs2: use i_lock spin_lock for inode qadata
Before this patch, functions gfs2_qa_get and _put used the i_rw_mutex to
prevent simultaneous access to its i_qadata. But i_rw_mutex is now used
for many other things, including iomap_begin and end, which causes a
conflict according to lockdep. We cannot just remove the lock since
simultaneous opens (gfs2_open -> gfs2_open_common -> gfs2_qa_get) can
then stomp on each other's values for i_qadata.
This patch solves the conflict by using the i_lock spin_lock in the inode
to prevent simultaneous access.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Andrew Price [Tue, 22 Mar 2022 18:49:19 +0000 (18:49 +0000)]
gfs2: Return more useful errors from gfs2_rgrp_send_discards()
The bug that 895ee5709 ("gfs2: Make sure FITRIM minlen is rounded up to
fs block size") fixes was a little confusing as the user saw
"Input/output error" which masked the -EINVAL that sb_issue_discard()
returned.
sb_issue_discard() can fail for various reasons, so we should return its
return value from gfs2_rgrp_send_discards() to avoid all errors being
reported as IO errors.
This improves error reporting for FITRIM and makes no difference to the
-o discard code path because the return value from
gfs2_rgrp_send_discards() gets thrown away in that case (and the option
switches off). Presumably that's why it was ok to just return -EIO in
the past, before FITRIM was implemented.
Tested with xfstests.
Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
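The change in error reporting described in the gfs2_rgrp_send_discards() commit above can be illustrated with a small user-space sketch. `issue_discard()` here is a hypothetical stand-in for the block-layer call, and both `send_discards*` functions are simplified models rather than the gfs2 code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the block layer: rejects misaligned lengths. */
static int issue_discard(uint64_t nr_blocks, uint64_t granularity)
{
	assert(granularity != 0);
	return (nr_blocks % granularity) ? -EINVAL : 0;
}

/* Old pattern: every failure is flattened to -EIO. */
static int send_discards_old(uint64_t nr_blocks, uint64_t granularity)
{
	if (issue_discard(nr_blocks, granularity))
		return -EIO;
	return 0;
}

/* New pattern: hand the callee's error code back to the caller. */
static int send_discards(uint64_t nr_blocks, uint64_t granularity)
{
	return issue_discard(nr_blocks, granularity);
}
```

With the old pattern a misaligned request surfaces as "Input/output error". With the new one the caller sees -EINVAL and can report an invalid argument instead.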
Kees Cook [Sun, 8 May 2022 10:06:30 +0000 (03:06 -0700)]
gfs2: Use container_of() for gfs2_glock(aspace)
Clang's structure layout randomization feature gets upset when it sees
struct address_space (which is randomized) cast to struct gfs2_glock.
This is due to seeing the mapping pointer as being treated as an array
of gfs2_glock, rather than "something else, before struct address_space":
In file included from fs/gfs2/acl.c:23:
fs/gfs2/meta_io.h:44:12: error: casting from randomized structure pointer type 'struct address_space *' to 'struct gfs2_glock *'
return (((struct gfs2_glock *)mapping) - 1)->gl_name.ln_sbd;
^
Replace the instances of open-coded pointer math with container_of()
usage, and update the allocator to match.
Some cleanups and conversion of gfs2_glock_get() and
gfs2_glock_dealloc() by Andreas.
Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/202205041550.naKxwCBj-lkp@intel.com Cc: Bob Peterson <rpeterso@redhat.com> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Bill Wendling <morbo@google.com> Cc: cluster-devel@redhat.com Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
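To make the container_of() conversion above concrete, here is a user-space sketch. It assumes a wrapper struct that embeds both objects. The struct names are stand-ins, and the `container_of` macro is a simplified user-space version without the kernel's type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space container_of(); the kernel macro adds type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct glock { int name; };	/* stand-in for struct gfs2_glock */
struct aspace { long host; };	/* stand-in for struct address_space */

/* Hypothetical wrapper that allocates the two objects together. */
struct glock_aspace {
	struct glock glock;
	struct aspace mapping;
};

/*
 * Recover the glock from its mapping through the enclosing struct,
 * instead of casting the mapping pointer and subtracting one element.
 */
static struct glock *mapping_to_glock(struct aspace *mapping)
{
	struct glock_aspace *gla;

	assert(mapping != NULL);
	gla = container_of(mapping, struct glock_aspace, mapping);
	return &gla->glock;
}
```

Because the offset comes from offsetof() on the wrapper, the lookup keeps working if the compiler pads the wrapper or the member order changes.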
Linus Torvalds [Tue, 24 May 2022 19:22:56 +0000 (12:22 -0700)]
Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fsverity updates from Eric Biggers:
"A couple small cleanups for fs/verity/"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fs-verity: Use struct_size() helper in enable_verity()
fs-verity: remove unused parameter desc_size in fsverity_create_info()
Linus Torvalds [Tue, 24 May 2022 19:17:45 +0000 (12:17 -0700)]
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fscrypt updates from Eric Biggers:
"Some cleanups for fs/crypto/:
- Split up the misleadingly-named FS_CRYPTO_BLOCK_SIZE constant.
- Consistently report the encryption implementation that is being
used.
- Add helper functions for the test_dummy_encryption mount option
that work properly with the new mount API. ext4 and f2fs will use
these"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fscrypt: add new helper functions for test_dummy_encryption
fscrypt: factor out fscrypt_policy_to_key_spec()
fscrypt: log when starting to use inline encryption
fscrypt: split up FS_CRYPTO_BLOCK_SIZE
Linus Torvalds [Tue, 24 May 2022 18:58:10 +0000 (11:58 -0700)]
Merge tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"These updates continue to refine the work begun in 5.17 and 5.18 of
modernizing the RNG's crypto and streamlining and documenting its
code.
New for 5.19, the updates aim to improve entropy collection methods
and make some initial decisions regarding the "premature next" problem
and our threat model. The cloc utility now reports that random.c is
931 lines of code and 466 lines of comments, not that basic metrics
like that mean all that much, but at the very least it tells you that
this is very much a manageable driver now.
Here's a summary of the various updates:
- The random_get_entropy() function now always returns something at
least minimally useful. This is the primary entropy source in most
collectors, which in the best case expands to something like RDTSC,
but prior to this change, in the worst case it would just return 0,
contributing nothing. For 5.19, additional architectures are wired
up, and architectures that are entirely missing a cycle counter now
have a generic fallback path, which uses the highest resolution
clock available from the timekeeping subsystem.
Some of those clocks can actually be quite good, despite the CPU
not having a cycle counter of its own, and going off-core for a
stamp is generally thought to increase jitter, something positive
from the perspective of entropy gathering. Done very early on in
the development cycle, this has been sitting in next getting some
testing for a while now and has relevant acks from the archs, so it
should be pretty well tested and fine, but is nonetheless the thing
I'll be keeping my eye on most closely.
- Of particular note with the random_get_entropy() improvements is
MIPS, which, on CPUs that lack the c0 count register, will now
combine the high-speed but short-cycle c0 random register with the
lower-speed but long-cycle generic fallback path.
- With random_get_entropy() now always returning something useful,
the interrupt handler now collects entropy in a consistent
construction.
- Rather than comparing two samples of random_get_entropy() for the
jitter dance, the algorithm now tests many samples, and uses the
amount of differing ones to determine whether or not jitter entropy
is usable and how laborious it must be. The problem with comparing
only two samples was that if the cycle counter was extremely slow,
but just so happened to be on the cusp of a change, the slowness
wouldn't be detected. Taking many samples fixes that to some
degree.
This, combined with the other improvements to random_get_entropy(),
should make future unification of /dev/random and /dev/urandom
maybe more possible. At the very least, were we to attempt it again
today (we're not), it wouldn't break any of Guenter's test rigs
that broke when we tried it with 5.18. So, not today, but perhaps
down the road, that's something we can revisit.
- We attempt to reseed the RNG immediately upon waking up from system
suspend or hibernation, making use of the various timestamps about
suspend time and such available, as well as the usual inputs such
as RDRAND when available.
- Batched randomness now falls back to ordinary randomness before the
RNG is initialized. This provides more consistent guarantees to the
types of random numbers being returned by the various accessors.
- The "pre-init injection" code is now gone for good. I suspect you
in particular will be happy to read that, as I recall you
expressing your distaste for it a few months ago. Instead, to avoid
a "premature first" issue, while still allowing for maximal amount
of entropy availability during system boot, the first 128 bits of
estimated entropy are used immediately as it arrives, with the next
128 bits being buffered. And, as before, after the RNG has been
fully initialized, it winds up reseeding anyway a few seconds later
in most cases. This resulted in a pretty big simplification of the
initialization code and let us remove various ad-hoc mechanisms
like the ugly crng_pre_init_inject().
- The RNG no longer pretends to handle the "premature next" security
model, something that various academics and other RNG designs have
tried to care about in the past. After an interesting mailing list
thread, these issues are thought to be a) mainly academic and not
practical at all, and b) actively harming the real security of the
RNG by delaying new entropy additions after a potential compromise,
making a potentially bad situation even worse. As well, in the
first place, our RNG never even properly handled the premature next
issue, so removing an incomplete solution to a fake problem was
particularly nice.
This allowed for numerous other simplifications in the code, which
is a lot cleaner as a consequence. If you didn't see it before,
https://lore.kernel.org/lkml/YmlMGx6+uigkGiZ0@zx2c4.com/ may be a
thread worth skimming through.
- While the interrupt handler received a separate code path years ago
that avoids locks by using per-cpu data structures and a faster
mixing algorithm, in order to reduce interrupt latency, input and
disk events that are triggered in hardirq handlers were still
hitting locks and more expensive algorithms. Those are now
redirected to use the faster per-cpu data structures.
- Rather than having the fake-crypto almost-siphash-based random32
implementation be used right and left, and in many places where
cryptographically secure randomness is desirable, the batched
entropy code is now fast enough to replace that.
- As usual, numerous code quality and documentation cleanups. For
example, the initialization state machine now uses enum symbolic
constants instead of just hard coding numbers everywhere.
- Since the RNG initializes once, and then is always initialized
thereafter, a pretty heavy amount of code used during that
initialization is never used again. It is now completely cordoned
off using static branches and it winds up in the .text.unlikely
section so that it doesn't reduce cache compactness after the RNG
is ready.
- A variety of functions meant for waiting on the RNG to be
initialized were only used by vsprintf, and in not a particularly
optimal way. Replacing that usage with a more ordinary setup made
it possible to remove those functions.
- A cleanup of how we warn userspace about the use of uninitialized
/dev/urandom and uninitialized get_random_bytes() usage.
Interestingly, with the change you merged for 5.18 that attempts to
use jitter (but does not block if it can't), the majority of users
should never see those warnings for /dev/urandom at all now, and
the one for in-kernel usage is mainly a debug thing.
- The file_operations struct for /dev/[u]random now implements
.read_iter and .write_iter instead of .read and .write, allowing it
to also implement .splice_read and .splice_write, which makes
splice(2) work again after it was broken here (and in many other
places in the tree) during the set_fs() removal. This was a bit of
a last minute arrival from Jens that hasn't had as much time to
bake, so I'll be keeping my eye on this as well, but it seems
fairly ordinary. Unfortunately, read_iter() is around 3% slower
than read() in my tests, which I'm not thrilled about. But Jens and
Al, spurred by this observation, seem to be making progress in
removing the bottlenecks on the iter paths in the VFS layer in
general, which should remove the performance gap for all drivers.
- Assorted other bug fixes, cleanups, and optimizations.
- A small SipHash cleanup"
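The many-sample jitter test described in the summary above can be
sketched roughly as follows. This is a userspace illustration under
stated assumptions, not the code in random.c: TRIAL_SAMPLES,
MAX_SAMPLES_PER_BIT, jitter_samples_per_bit() and the two demo counters
are all hypothetical names and values.

```c
#include <stddef.h>

/* Hypothetical tuning values chosen for illustration only. */
#define TRIAL_SAMPLES        1024u
#define MAX_SAMPLES_PER_BIT  32u

typedef unsigned long (*counter_fn)(void);

/*
 * Take TRIAL_SAMPLES readings of the counter and count how many differ
 * from the reading before them. Returns 0 if the counter changes too
 * rarely to be usable, otherwise the number of samples to spend per bit
 * (a lower value means less work).
 */
static unsigned int jitter_samples_per_bit(counter_fn read_counter)
{
	unsigned long last = read_counter();
	unsigned int different = 0;
	unsigned int per_bit;
	size_t i;

	for (i = 0; i < TRIAL_SAMPLES; i++) {
		unsigned long now = read_counter();

		if (now != last)
			different++;
		last = now;
	}

	/* Round up TRIAL_SAMPLES / (different + 1); the +1 avoids dividing by zero. */
	per_bit = (TRIAL_SAMPLES + different) / (different + 1);

	return per_bit > MAX_SAMPLES_PER_BIT ? 0 : per_bit;
}

/* Demo counter that advances on every read. */
static unsigned long fast_counter(void)
{
	static unsigned long ticks;

	return ticks++;
}

/* Demo counter that never moves. */
static unsigned long stuck_counter(void)
{
	return 42;
}
```

A two-sample comparison can be fooled by a slow counter that happens to
tick between the two reads; counting changes over many reads makes that
coincidence much less likely to hide the slowness.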
* tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (49 commits)
random: check for signals after page of pool writes
random: wire up fops->splice_{read,write}_iter()
random: convert to using fops->write_iter()
random: convert to using fops->read_iter()
random: unify batched entropy implementations
random: move randomize_page() into mm where it belongs
random: remove mostly unused async readiness notifier
random: remove get_random_bytes_arch() and add rng_has_arch_random()
random: move initialization functions out of hot pages
random: make consistent use of buf and len
random: use proper return types on get_random_{int,long}_wait()
random: remove extern from functions in header
random: use static branch for crng_ready()
random: credit architectural init the exact amount
random: handle latent entropy and command line from random_init()
random: use proper jiffies comparison macro
random: remove ratelimiting for in-kernel unseeded randomness
random: move initialization out of reseeding hot path
random: avoid initializing twice in credit race
random: use symbolic constants for crng_init states
...
Vadim Fedorenko [Thu, 19 May 2022 21:21:53 +0000 (14:21 -0700)]
ptp: ocp: Add firmware header checks
Right now it's possible to flash any kind of binary via devlink and
break the card easily. This diff adds an optional header check when
installing the firmware.
Signed-off-by: Vadim Fedorenko <vadfed@fb.com> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jonathan Lemon [Thu, 19 May 2022 21:21:48 +0000 (14:21 -0700)]
ptp: ocp: parameterize input/output sma selectors
Group the sma input/output tables together and select the correct
group from the bp information. This allows adding new groups with
different sma mappings.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jonathan Lemon [Thu, 19 May 2022 21:21:45 +0000 (14:21 -0700)]
ptp: ocp: Remove #ifdefs around PCI IDs
These #ifdefs are not required, so remove them.
Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Thompson [Mon, 23 May 2022 18:11:02 +0000 (19:11 +0100)]
lockdown: also lock down previous kgdb use
KGDB and KDB allow read and write access to kernel memory, and thus
should be restricted during lockdown. An attacker with access to a
serial port (for example, via a hypervisor console, which some cloud
vendors provide over the network) could trigger the debugger so it is
important that the debugger respect the lockdown mode when/if it is
triggered.
Fix this by integrating lockdown into kdb's existing permissions
mechanism. Unfortunately kgdb does not have any permissions mechanism
(although it certainly could be added later) so, for now, kgdb is simply
and brutally disabled by immediately exiting the gdb stub without taking
any action.
For lockdowns established early in the boot (e.g. the normal case) this
should be fine, but on systems where kgdb has set breakpoints before
the lockdown is enacted, "bad things" will happen.
CVE: CVE-2022-21499 Co-developed-by: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Updates to instrumentation/debugging:
- Remove sched_trace_*() helper functions - can be done via debug
info
- Fix double update_rq_clock() warnings
- Introduce & use "preemption model accessors" to simplify some of the
Kconfig complexity.
- Make softirq handling RT-safe.
- Misc smaller fixes & cleanups.
* tag 'sched-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
topology: Remove unused cpu_cluster_mask()
sched: Reverse sched_class layout
sched/deadline: Remove superfluous rq clock update in push_dl_task()
sched/core: Avoid obvious double update_rq_clock warning
smp: Make softirq handling RT safe in flush_smp_call_function_queue()
smp: Rename flush_smp_call_function_from_idle()
sched: Fix missing prototype warnings
sched/fair: Remove cfs_rq_tg_path()
sched/fair: Remove sched_trace_*() helper functions
sched/fair: Refactor cpu_util_without()
sched/fair: Revise comment about lb decision matrix
sched/psi: report zeroes for CPU full at the system level
sched/fair: Delete useless condition in tg_unthrottle_up()
sched/fair: Fix cfs_rq_clock_pelt() for throttled cfs_rq
sched/fair: Move calculate of avg_load to a better location
mailmap: Update my email address to @redhat.com
MAINTAINERS: Add myself as scheduler topology reviewer
psi: Fix trigger being fired unexpectedly at initial
ftrace: Use preemption model accessors for trace header printout
kcsan: Use preemption model accessors
Linus Torvalds [Tue, 24 May 2022 17:59:38 +0000 (10:59 -0700)]
Merge tag 'perf-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events updates from Ingo Molnar:
"Platform PMU changes:
- x86/intel:
- Add new Intel Alder Lake and Raptor Lake support
- x86/amd:
- AMD Zen4 IBS extensions support
- Add AMD PerfMonV2 support
- Add AMD Fam19h Branch Sampling support
Generic changes:
- signal: Deliver SIGTRAP on perf event asynchronously if blocked
Perf instrumentation can be driven via SIGTRAP, but this causes a
problem when SIGTRAP is blocked by a task: forcing the signal
terminates the task.
Allow user-space to request these signals asynchronously (after
they get unblocked) & also give the information to the signal
handler when this happens:
"To give user space the ability to clearly distinguish
synchronous from asynchronous signals, introduce
siginfo_t::si_perf_flags and TRAP_PERF_FLAG_ASYNC (opted for
flags in case more binary information is required in future).
The resolution to the problem is then to (a) no longer force the
signal (avoiding the terminations), but (b) tell user space via
si_perf_flags if the signal was synchronous or not, so that such
signals can be handled differently (e.g. let user space decide
to ignore or consider the data imprecise). "
- Unify/standardize the /sys/devices/cpu/events/* output format.
- Misc fixes & cleanups"
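As a rough illustration of how user space might act on the new flag, the
helper below classifies a SIGTRAP from the si_perf_flags value it
received. It is only a sketch: classify_perf_trap() and enum
perf_trap_action are hypothetical names, the flags are passed in as a
plain integer so the example does not depend on recent headers, and the
fallback value of TRAP_PERF_FLAG_ASYNC is an assumption for builds where
the uapi headers do not define it.

```c
/* Fallback for headers that predate the flag; the value is an assumption. */
#ifndef TRAP_PERF_FLAG_ASYNC
#define TRAP_PERF_FLAG_ASYNC (1u << 0)
#endif

enum perf_trap_action {
	PERF_TRAP_PRECISE,	/* delivered synchronously: data is exact */
	PERF_TRAP_IMPRECISE,	/* delivered after unblocking: data may be stale */
};

/* Decide how to treat a perf SIGTRAP based on its si_perf_flags value. */
static enum perf_trap_action classify_perf_trap(unsigned int si_perf_flags)
{
	if (si_perf_flags & TRAP_PERF_FLAG_ASYNC)
		return PERF_TRAP_IMPRECISE;
	return PERF_TRAP_PRECISE;
}
```

In a real SA_SIGINFO handler the argument would come from the siginfo_t
passed to the handler.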
* tag 'perf-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
perf/x86/amd/core: Fix reloading events for SVM
perf/x86/amd: Run AMD BRS code only on supported hw
perf/x86/amd: Fix AMD BRS period adjustment
perf/x86/amd: Remove unused variable 'hwc'
perf/ibs: Fix comment
perf/amd/ibs: Advertise zen4_ibs_extensions as pmu capability attribute
perf/amd/ibs: Add support for L3 miss filtering
perf/amd/ibs: Use ->is_visible callback for dynamic attributes
perf/amd/ibs: Cascade pmu init functions' return value
perf/x86/uncore: Add new Alder Lake and Raptor Lake support
perf/x86/uncore: Clean up uncore_pci_ids[]
perf/x86/cstate: Add new Alder Lake and Raptor Lake support
perf/x86/msr: Add new Alder Lake and Raptor Lake support
perf/x86: Add new Alder Lake and Raptor Lake support
perf/amd/ibs: Use interrupt regs ip for stack unwinding
perf/x86/amd/core: Add PerfMonV2 overflow handling
perf/x86/amd/core: Add PerfMonV2 counter control
perf/x86/amd/core: Detect available counters
perf/x86/amd/core: Detect PerfMonV2 support
x86/msr: Add PerfCntrGlobal* registers
...
- Several features are done unconditionally, without any way to
turn them off. Some of them might be surprising. This makes
objtool tricky to use, and prevents porting individual features
to other arches.
- The config dependencies are too coarse-grained. Objtool
enablement is tied to CONFIG_STACK_VALIDATION, but it has several
other features independent of that.
- The objtool subcmds ("check" and "orc") are clumsy: "check" is
really a subset of "orc", so it has all the same options.
The subcmd model has never really worked for objtool, as it only
has a single purpose: "do some combination of things on an object
file".
- The '--lto' and '--vmlinux' options are nonsensical and have
surprising behavior.
- Add scripts/objdump-func helper script to disassemble a single
function from an object file.
- Rewrite scripts/faddr2line to be section-aware, by basing it on
'readelf', moving it away from 'nm', which doesn't handle multiple
sections well, which can result in decoding failure.
- Rewrite & fix symbol handling - which had a number of bugs wrt.
object files that don't have global symbols - which is rare but
possible. Also fix a bunch of symbol handling bugs found along the
way.
* tag 'objtool-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
objtool: Fix objtool regression on x32 systems
objtool: Fix symbol creation
scripts/faddr2line: Fix overlapping text section failures
scripts: Create objdump-func helper script
objtool: Remove libsubcmd.a when make clean
objtool: Remove inat-tables.c when make clean
objtool: Update documentation
objtool: Remove --lto and --vmlinux in favor of --link
objtool: Add HAVE_NOINSTR_VALIDATION
objtool: Rename "VMLINUX_VALIDATION" -> "NOINSTR_VALIDATION"
objtool: Make noinstr hacks optional
objtool: Make jump label hack optional
objtool: Make static call annotation optional
objtool: Make stack validation frame-pointer-specific
objtool: Add CONFIG_OBJTOOL
objtool: Extricate sls from stack validation
objtool: Rework ibt and extricate from stack validation
objtool: Make stack validation optional
objtool: Add option to print section addresses
objtool: Don't print parentheses in function addresses
...
Linus Torvalds [Tue, 24 May 2022 17:18:23 +0000 (10:18 -0700)]
Merge tag 'locking-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
- rwsem cleanups & optimizations/fixes:
- Conditionally wake waiters in reader/writer slowpaths
- Always try to wake waiters in out_nolock path
- Add try_cmpxchg64() implementation, with arch optimizations - and use
it to micro-optimize sched_clock_{local,remote}()
- Various force-inlining fixes to address objdump instrumentation-check
warnings
- Add lock contention tracepoints:
lock:contention_begin
lock:contention_end
- Misc smaller fixes & cleanups
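For readers unfamiliar with the try_cmpxchg() family: the helper returns
a boolean and, on failure, writes the current value back through the
"old" pointer, so retry loops do not need a separate reload. The sketch
below models that contract in userspace with C11 atomics. It is not the
kernel implementation; try_cmpxchg64_model() and clock_advance() are
hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model of try_cmpxchg64() semantics using C11 atomics:
 * returns true on success; on failure, *old is refreshed with the
 * value currently stored at ptr.
 */
static bool try_cmpxchg64_model(_Atomic uint64_t *ptr, uint64_t *old,
				uint64_t new_val)
{
	return atomic_compare_exchange_strong(ptr, old, new_val);
}

/* Advance a shared clock value so that it never moves backwards. */
static uint64_t clock_advance(_Atomic uint64_t *clock, uint64_t now)
{
	uint64_t old = atomic_load(clock);
	uint64_t next;

	do {
		next = now > old ? now : old;
		/* On failure 'old' already holds the fresh value: no reload. */
	} while (!try_cmpxchg64_model(clock, &old, next));

	return next;
}
```

With a plain cmpxchg64() the loop would compare the returned value
against 'old' and reassign it by hand on every retry; folding that into
the helper is where the micro-optimization comes from.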
* tag 'locking-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/clock: Use try_cmpxchg64 in sched_clock_{local,remote}
locking/atomic/x86: Introduce arch_try_cmpxchg64
locking/atomic: Add generic try_cmpxchg64 support
futex: Remove a PREEMPT_RT_FULL reference.
locking/qrwlock: Change "queue rwlock" to "queued rwlock"
lockdep: Delete local_irq_enable_in_hardirq()
locking/mutex: Make contention tracepoints more consistent wrt adaptive spinning
locking: Apply contention tracepoints in the slow path
locking: Add lock contention tracepoints
locking/rwsem: Always try to wake waiters in out_nolock path
locking/rwsem: Conditionally wake waiters in reader/writer slowpaths
locking/rwsem: No need to check for handoff bit if wait queue empty
lockdep: Fix -Wunused-parameter for _THIS_IP_
x86/mm: Force-inline __phys_addr_nodebug()
x86/kvm/svm: Force-inline GHCB accessors
task_stack, x86/cea: Force-inline stack helpers
Damien Le Moal [Mon, 23 May 2022 23:29:39 +0000 (08:29 +0900)]
zonefs: Fix zonefs_init_file_inode() return value
Commit 76dff81de8fc ("zonefs: Add active seq file accounting") wrongly
changed zonefs_init_file_inode() to always return 0 even if the call to
zonefs_zone_mgmt() fails. Fix this by propagating zonefs_zone_mgmt()
return value as the return value for zonefs_init_file_inode().
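The shape of the bug and of the fix, reduced to a userspace sketch. None
of these functions are the zonefs code: zone_mgmt_model(),
init_inode_always_zero() and init_inode_propagate() are hypothetical
stand-ins.

```c
#include <errno.h>

/* Hypothetical stand-in for a zone management call that can fail. */
static int zone_mgmt_model(int should_fail)
{
	return should_fail ? -EIO : 0;
}

/* Buggy shape: the error is computed but 0 is returned regardless. */
static int init_inode_always_zero(int should_fail)
{
	int ret = zone_mgmt_model(should_fail);

	(void)ret;	/* error silently dropped */
	return 0;
}

/* Fixed shape: the callee's return value becomes the caller's. */
static int init_inode_propagate(int should_fail)
{
	int ret = zone_mgmt_model(should_fail);

	return ret;
}
```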
Fixes: 76dff81de8fc ("zonefs: Add active seq file accounting") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
drivers/net/ethernet/cadence/macb_main.c 0fcca792e5aa ("net: macb: Fix PTP one step sync support") 917adea31ce2 ("net: macb: use NAPI for TX completion path")
https://lore.kernel.org/all/20220523111021.31489367@canb.auug.org.au/
net/smc/af_smc.c 80a16c03a208 ("net/smc: postpone sk_refcnt increment in connect()") 9996f3911fc5 ("net/smc: align the connect behaviour with TCP")
https://lore.kernel.org/all/20220524114408.4bf1af38@canb.auug.org.au/