Linus Torvalds [Sat, 18 Dec 2021 21:23:55 +0000 (13:23 -0800)]
Merge tag 'tty-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are two small tty/serial fixes for 5.16-rc6. They include:
- n_hdlc fix for syzbot reported problem that you were previously
copied on.
- 8250_fintek driver fix that resolved a console problem by removing
a previous change.
Both have been in linux-next with no reported issues"
* tag 'tty-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: 8250_fintek: Fix garbled text for console
tty: n_hdlc: make n_hdlc_tty_wakeup() asynchronous
Linus Torvalds [Sat, 18 Dec 2021 21:16:43 +0000 (13:16 -0800)]
Merge tag 'usb-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of small USB driver fixes for reported problems.
They include:
- dwc2 driver fixes
- xhci driver fixes
- cdnsp driver fixes
- typec driver fix
- gadget u_ether driver fix
- new quirk additions
- usb gadget endpoint calculation fix
- usb serial new device ids
- revert of a xhci-dbg change that broke early debug booting
All changes, except for the revert, have been in linux-next with no
reported problems. The revert was from yesterday, and it was reported
by the developers affected that it resolved their problem"
* tag 'usb-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
Revert "usb: early: convert to readl_poll_timeout_atomic()"
usb: typec: tcpm: fix tcpm unregister port but leave a pending timer
usb: cdnsp: Fix lack of spin_lock_irqsave/spin_lock_restore
USB: NO_LPM quirk Lenovo USB-C to Ethernet Adapher(RTL8153-04)
usb: xhci: Extend support for runtime power management for AMD's Yellow carp.
usb: dwc2: fix STM ID/VBUS detection startup delay in dwc2_driver_probe
USB: gadget: bRequestType is a bitfield, not a enum
USB: serial: option: add Telit FN990 compositions
USB: serial: cp210x: fix CP2105 GPIO registration
usb: cdnsp: Fix incorrect status for control request
usb: cdnsp: Fix issue in cdnsp_log_ep trace event
usb: cdnsp: Fix incorrect calling of cdnsp_died function
usb: xhci-mtk: fix list_del warning when enable list debug
usb: gadget: u_ether: fix race in setting MAC address in setup phase
Linus Torvalds [Sat, 18 Dec 2021 19:53:14 +0000 (11:53 -0800)]
Merge tag 'perf-tools-fixes-for-v5.16-2021-12-18' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix segfaults in 'perf inject' related to usage of unopened files
- The return value of hashmap__new() should be checked using IS_ERR()
* tag 'perf-tools-fixes-for-v5.16-2021-12-18' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf inject: Fix segfault due to perf_data__fd() without open
perf inject: Fix segfault due to close without open
perf expr: Fix missing check for return value of hashmap__new()
Adrian Hunter [Mon, 13 Dec 2021 08:48:29 +0000 (10:48 +0200)]
perf inject: Fix segfault due to perf_data__fd() without open
The fixed commit attempts to get the output file descriptor even if the
file was never opened e.g.
$ perf record uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (7 samples) ]
$ perf inject -i perf.data --vm-time-correlation=dry-run
Segmentation fault (core dumped)
$ gdb --quiet perf
Reading symbols from perf...
(gdb) r inject -i perf.data --vm-time-correlation=dry-run
Starting program: /home/ahunter/bin/perf inject -i perf.data --vm-time-correlation=dry-run
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
__GI___fileno (fp=0x0) at fileno.c:35
35 fileno.c: No such file or directory.
(gdb) bt
#0 __GI___fileno (fp=0x0) at fileno.c:35
#1 0x00005621e48dd987 in perf_data__fd (data=0x7fff4c68bd08) at util/data.h:72
#2 perf_data__fd (data=0x7fff4c68bd08) at util/data.h:69
#3 cmd_inject (argc=<optimized out>, argv=0x7fff4c69c1f0) at builtin-inject.c:1017
#4 0x00005621e4936783 in run_builtin (p=0x5621e4ee6878 <commands+600>, argc=4, argv=0x7fff4c69c1f0) at perf.c:313
#5 0x00005621e4897d5c in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:365
#6 run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:409
#7 main (argc=4, argv=0x7fff4c69c1f0) at perf.c:539
(gdb)
Fixes: 298b2e0f8cd8c9b2 ("perf tools: Pass a fd to perf_file_header__read_pipe()") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20211213084829.114772-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Mon, 13 Dec 2021 08:48:28 +0000 (10:48 +0200)]
perf inject: Fix segfault due to close without open
The fixed commit attempts to close inject.output even if it was never
opened e.g.
$ perf record uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (7 samples) ]
$ perf inject -i perf.data --vm-time-correlation=dry-run
Segmentation fault (core dumped)
$ gdb --quiet perf
Reading symbols from perf...
(gdb) r inject -i perf.data --vm-time-correlation=dry-run
Starting program: /home/ahunter/bin/perf inject -i perf.data --vm-time-correlation=dry-run
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007eff8afeef5b in _IO_new_fclose (fp=0x0) at iofclose.c:48
48 iofclose.c: No such file or directory.
(gdb) bt
#0 0x00007eff8afeef5b in _IO_new_fclose (fp=0x0) at iofclose.c:48
#1 0x0000557fc7b74f92 in perf_data__close (data=data@entry=0x7ffcdafa6578) at util/data.c:376
#2 0x0000557fc7a6b807 in cmd_inject (argc=<optimized out>, argv=<optimized out>) at builtin-inject.c:1085
#3 0x0000557fc7ac4783 in run_builtin (p=0x557fc8074878 <commands+600>, argc=4, argv=0x7ffcdafb6a60) at perf.c:313
#4 0x0000557fc7a25d5c in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:365
#5 run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:409
#6 main (argc=4, argv=0x7ffcdafb6a60) at perf.c:539
(gdb)
Fixes: b386cf8558f10cd3 ("perf inject: Close inject.output on exit") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20211213084829.114772-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Miaoqian Lin [Sun, 12 Dec 2021 06:25:02 +0000 (06:25 +0000)]
perf expr: Fix missing check for return value of hashmap__new()
The hashmap__new() function may return ERR_PTR(-ENOMEM) when malloc()
fails, add IS_ERR() checking for ctx->ids.
Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20211212062504.25841-1-linmq006@gmail.com
[ s/kfree()/free()/ and add missing linux/err.h include ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Fri, 17 Dec 2021 21:50:58 +0000 (13:50 -0800)]
Merge tag 'for-5.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few more fixes, almost all error handling one-liners and for stable.
- regression fix in directory logging items
- regression fix of extent buffer status bits handling after an error
- fix memory leak in error handling path in tree-log
- fix freeing invalid anon device number when handling errors during
subvolume creation
- fix warning when freeing leaf after subvolume creation failure
- fix missing blkdev put in device scan error handling
- fix invalid delayed ref after subvolume creation failure"
* tag 'for-5.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix missing blkdev_put() call in btrfs_scan_one_device()
btrfs: fix warning when freeing leaf after subvolume creation failure
btrfs: fix invalid delayed ref after subvolume creation failure
btrfs: check WRITE_ERR when trying to read an extent buffer
btrfs: fix missing last dir item offset update when logging directory
btrfs: fix double free of anon_dev after failure to create subvolume
btrfs: fix memory leak in __add_inode_ref()
Linus Torvalds [Fri, 17 Dec 2021 20:03:14 +0000 (12:03 -0800)]
Merge tag 'selinux-pr-20211217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"Another small SELinux fix for v5.16 to ensure that we don't block on
memory allocations while holding a spinlock.
This passes all our tests without problem"
* tag 'selinux-pr-20211217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix sleeping function called from invalid context
Linus Torvalds [Fri, 17 Dec 2021 19:55:35 +0000 (11:55 -0800)]
Merge tag 'riscv-for-linus-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A handful of DT updates for the SiFive HiFive Unmatched, that fix the
regulator handling. These should stop some warning spew.
- A pair of fixes for both the SiFive Hifive Unleashed and Unmatched,
that correctly hook up the MMC card detect signal.
* tag 'riscv-for-linus-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: dts: sifive unmatched: Link the tmp451 with its power supply
riscv: dts: sifive unmatched: Fix regulator for board rev3
riscv: dts: sifive unmatched: Expose the PMIC sub-functions
riscv: dts: sifive unmatched: Expose the board ID eeprom
riscv: dts: sifive unmatched: Name gpio lines
riscv: dts: unmatched: Add gpio card detect to mmc-spi-slot
riscv: dts: unleashed: Add gpio card detect to mmc-spi-slot
Linus Torvalds [Fri, 17 Dec 2021 19:46:07 +0000 (11:46 -0800)]
Merge tag 'block-5.16-2021-12-17' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Fix for hammering on the delayed run queue timer (me)
- bcache regression fix for this merge window (Lin)
- Fix a divide-by-zero in the blk-iocost code (Tejun)
* tag 'block-5.16-2021-12-17' of git://git.kernel.dk/linux-block:
bcache: fix NULL pointer reference in cached_dev_detach_finish
block: reduce kblockd_mod_delayed_work_on() CPU consumption
iocost: Fix divide-by-zero on donation from low hweight cgroup
Linus Torvalds [Fri, 17 Dec 2021 19:07:13 +0000 (11:07 -0800)]
Merge tag 'drm-fixes-2021-12-17-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Mostly amdgpu fixes this week scattered around the driver, otherwise
one i915, one ast, one simpledrm. There is a revert in the fb-helper
for places userspace was using a string that we tried to change.
i915:
- Fix a bound check in the DMC fw load.
ast:
- NULL ptr deref fix
simpledrm:
- pixel clock units fix
fb-helper:
- userspace regression revert
amdgpu:
- Fix RLC register offset
- GMC fix
- Properly cache SMU FW version on Yellow Carp
- Fix missing callback on DCN3.1
- Reset DMCUB before HW init
- Fix for GMC powergating on PCO
- Fix a possible memory leak in GPU metrics table handling on RN"
* tag 'drm-fixes-2021-12-17-1' of git://anongit.freedesktop.org/drm/drm:
drm/amd/pm: fix a potential gpu_metrics_table memory leak
drm/amdgpu: correct the wrong cached state for GMC on PICASSO
drm/amd/display: Reset DMCUB before HW init
drm/amd/display: Set exit_optimized_pwr_state for DCN31
drm/amd/pm: fix reading SMU FW version from amdgpu_firmware_info on YC
drm/amdgpu: don't override default ECO_BITs setting
drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORE
drm/i915/display: Fix an unsigned subtraction which can never be negative.
drm/ast: potential dereference of null pointer
drm: simpledrm: fix wrong unit with pixel clock
Revert "drm/fb-helper: improve DRM fbdev emulation device names"
This change causes boot lockups when using "arlyprintk=xdbc" because
ktime can not be used at this point in time in the boot process. Also,
it is not needed for very small delays like this.
riscv: dts: sifive unmatched: Link the tmp451 with its power supply
Fixes the following probe warning:
lm90 0-004c: Looking up vcc-supply from device tree
lm90 0-004c: Looking up vcc-supply property in node /soc/i2c@10030000/temperature-sensor@4c failed
lm90 0-004c: supply vcc not found, using dummy regulator
Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
riscv: dts: sifive unmatched: Fix regulator for board rev3
The existing values are rejected by the da9063 regulator driver, as they
are unachievable with the declared chip setup (non-merged vcore and bmem
are unable to provide the declared curent).
Fix voltages to match rev3 schematics, which also matches their boot-up
configuration within the chip's available precision.
Declare bcore1/bcore2 and bmem/bio as merged.
Set ldo09 and ldo10 as always-on as their consumers are not declared but
exist.
Drop ldo current limits as there is no current limit feature for these
regulators in the DA9063. Fixes warnings like:
DA9063_LDO3: Operation of current configuration missing
Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
riscv: dts: sifive unmatched: Expose the board ID eeprom
Mark it as read-only as it is factory-programmed with identifying
information, and no executable nor configuration:
- eth MAC address
- board model (PCB version, BoM version)
- board serial number
Accidental modification would cause misidentification which could brick
the board, so marking read-only seem like both a safe and non-constraining
choice.
Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Dave Airlie [Fri, 17 Dec 2021 05:01:00 +0000 (15:01 +1000)]
Merge tag 'amd-drm-fixes-5.16-2021-12-15' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.16-2021-12-15:
amdgpu:
- Fix RLC register offset
- GMC fix
- Properly cache SMU FW version on Yellow Carp
- Fix missing callback on DCN3.1
- Reset DMCUB before HW init
- Fix for GMC powergating on PCO
- Fix a possible memory leak in GPU metrics table handling on RN
George Kennedy [Tue, 14 Dec 2021 14:45:10 +0000 (09:45 -0500)]
libata: if T_LENGTH is zero, dma direction should be DMA_NONE
Avoid data corruption by rejecting pass-through commands where
T_LENGTH is zero (No data is transferred) and the dma direction
is not DMA_NONE.
Cc: <stable@vger.kernel.org> Reported-by: syzkaller<syzkaller@googlegroups.com> Signed-off-by: George Kennedy<george.kennedy@oracle.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Linus Torvalds [Thu, 16 Dec 2021 23:02:14 +0000 (15:02 -0800)]
Merge tag 'net-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes, including fixes from mac80211, wifi, bpf.
Relatively large batches of fixes from BPF and the WiFi stack, calm in
general networking.
Current release - regressions:
- dpaa2-eth: fix buffer overrun when reporting ethtool statistics
Current release - new code bugs:
- bpf: fix incorrect state pruning for <8B spill/fill
- iavf:
- add missing unlocks in iavf_watchdog_task()
- do not override the adapter state in the watchdog task (again)
- mlxsw: spectrum_router: consolidate MAC profiles when possible
Previous releases - regressions:
- mac80211 fixes:
- rate control, avoid driver crash for retransmitted frames
- regression in SSN handling of addba tx
- a memory leak where sta_info is not freed
- marking TX-during-stop for TX in in_reconfig, prevent stall
- cfg80211: acquire wiphy mutex on regulatory work
- wifi drivers: fix build regressions and LED config dependency
- virtio_net: fix rx_drops stat for small pkts
- dsa: mv88e6xxx: unforce speed & duplex in mac_link_down()
Previous releases - always broken:
- bpf fixes:
- kernel address leakage in atomic fetch
- kernel address leakage in atomic cmpxchg's r0 aux reg
- signed bounds propagation after mov32
- extable fixup offset
- extable address check
- mac80211:
- fix the size used for building probe request
- send ADDBA requests using the tid/queue of the aggregation
session
- agg-tx: don't schedule_and_wake_txq() under sta->lock, avoid
deadlocks
- validate extended element ID is present
- mptcp:
- never allow the PM to close a listener subflow (null-defer)
- clear 'kern' flag from fallback sockets, prevent crash
- fix deadlock in __mptcp_push_pending()
- inet_diag: fix kernel-infoleak for UDP sockets
- xsk: do not sleep in poll() when need_wakeup set
- smc: avoid very long waits in smc_release()
- sch_ets: don't remove idle classes from the round-robin list
- netdevsim:
- zero-initialize memory for bpf map's value, prevent info leak
- don't let user space overwrite read only (max) ethtool parms
- ixgbe: set X550 MDIO speed before talking to PHY
- stmmac:
- fix null-deref in flower deletion w/ VLAN prio Rx steering
- dwmac-rk: fix oob read in rk_gmac_setup
- ice: time stamping fixes
- systemport: add global locking for descriptor life cycle"
* tag 'net-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (89 commits)
bpf, selftests: Fix racing issue in btf_skc_cls_ingress test
selftest/bpf: Add a test that reads various addresses.
bpf: Fix extable address check.
bpf: Fix extable fixup offset.
bpf, selftests: Add test case trying to taint map value pointer
bpf: Make 32->64 bounds propagation slightly more robust
bpf: Fix signed bounds propagation after mov32
sit: do not call ipip6_dev_free() from sit_init_net()
net: systemport: Add global locking for descriptor lifecycle
net/smc: Prevent smc_release() from long blocking
net: Fix double 0x prefix print in SKB dump
virtio_net: fix rx_drops stat for small pkts
dsa: mv88e6xxx: fix debug print for SPEED_UNFORCED
sfc_ef100: potential dereference of null pointer
net: stmmac: dwmac-rk: fix oob read in rk_gmac_setup
net: usb: lan78xx: add Allied Telesis AT29M2-AF
net/packet: rx_owner_map depends on pg_vec
netdevsim: Zero-initialize memory for new map's value in function nsim_bpf_map_alloc
dpaa2-eth: fix ethtool statistics
ixgbe: set X550 MDIO speed before talking to PHY
...
Linus Torvalds [Thu, 16 Dec 2021 22:48:57 +0000 (14:48 -0800)]
Merge tag 'soc-fixes-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"There are a number of DT fixes, mostly for mistakes found through
static checking of the dts files again, as well as a couple of minor
changes to address incorrect DT settings.
For i.MX, there is yet another series of devitree changes to update
RGMII delay settings for ethernet, which is an ongoing problem after
some driver changes.
For SoC specific device drivers, a number of smaller fixes came up:
- i.MX SoC identification was incorrectly registered non-i.MX
machines when the driver is built-in
- One fix on imx8m-blk-ctrl driver to get i.MX8MM MIPI reset work
properly
- a few compile fixes for warnings that get in the way of -Werror
- a string overflow in the scpi firmware driver
- a boot failure with FORTIFY_SOURCE on Rockchips machines
- broken error handling in the AMD TEE driver
- a revert for a tegra reset driver commit that broke HDA"
* tag 'soc-fixes-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (25 commits)
soc/tegra: fuse: Fix bitwise vs. logical OR warning
firmware: arm_scpi: Fix string overflow in SCPI genpd driver
soc: imx: Register SoC device only on i.MX boards
soc: imx: imx8m-blk-ctrl: Fix imx8mm mipi reset
ARM: dts: imx6ull-pinfunc: Fix CSI_DATA07__ESAI_TX0 pad name
arm64: dts: imx8mq: remove interconnect property from lcdif
ARM: socfpga: dts: fix qspi node compatible
arm64: dts: apple: add #interrupt-cells property to pinctrl nodes
dt-bindings: i2c: apple,i2c: allow multiple compatibles
arm64: meson: remove COMMON_CLK
arm64: meson: fix dts for JetHub D1
tee: amdtee: fix an IS_ERR() vs NULL bug
arm64: dts: apple: change ethernet0 device type to ethernet
arm64: dts: ten64: remove redundant interrupt declaration for gpio-keys
arm64: dts: rockchip: fix poweroff on helios64
arm64: dts: rockchip: fix audio-supply for Rock Pi 4
arm64: dts: rockchip: fix rk3399-leez-p710 vcc3v3-lan supply
arm64: dts: rockchip: fix rk3308-roc-cc vcc-sd supply
arm64: dts: rockchip: remove mmc-hs400-enhanced-strobe from rk3399-khadas-edge
ARM: rockchip: Use memcpy_toio instead of memcpy on smp bring-up
...
Cc: stable@vger.kernel.org Fixes: 97417fc8bc8f ("lsm,selinux: add new hook to compare new mount to an existing mount") Signed-off-by: Scott Mayhew <smayhew@redhat.com>
[PM: cleanup/line-wrap the backtrace] Signed-off-by: Paul Moore <paul@paul-moore.com>
We've added 15 non-merge commits during the last 7 day(s) which contain
a total of 12 files changed, 434 insertions(+), 30 deletions(-).
The main changes are:
1) Fix incorrect verifier state pruning behavior for <8B register spill/fill,
from Paul Chaignon.
2) Fix x86-64 JIT's extable handling for fentry/fexit when return pointer
is an ERR_PTR(), from Alexei Starovoitov.
3) Fix 3 different possibilities that BPF verifier missed where unprivileged
could leak kernel addresses, from Daniel Borkmann.
4) Fix xsk's poll behavior under need_wakeup flag, from Magnus Karlsson.
5) Fix an oob-write in test_verifier due to a missed MAX_NR_MAPS bump,
from Kumar Kartikeya Dwivedi.
6) Fix a race in test_btf_skc_cls_ingress selftest, from Martin KaFai Lau.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, selftests: Fix racing issue in btf_skc_cls_ingress test
selftest/bpf: Add a test that reads various addresses.
bpf: Fix extable address check.
bpf: Fix extable fixup offset.
bpf, selftests: Add test case trying to taint map value pointer
bpf: Make 32->64 bounds propagation slightly more robust
bpf: Fix signed bounds propagation after mov32
bpf, selftests: Update test case for atomic cmpxchg on r0 with pointer
bpf: Fix kernel address leakage in atomic cmpxchg's r0 aux reg
bpf, selftests: Add test case for atomic fetch on spilled pointer
bpf: Fix kernel address leakage in atomic fetch
selftests/bpf: Fix OOB write in test_verifier
xsk: Do not sleep in poll() when need_wakeup set
selftests/bpf: Tests for state pruning with u32 spill/fill
bpf: Fix incorrect state pruning for <8B spill/fill
====================
Martin KaFai Lau [Thu, 16 Dec 2021 19:16:30 +0000 (11:16 -0800)]
bpf, selftests: Fix racing issue in btf_skc_cls_ingress test
The libbpf CI reported occasional failure in btf_skc_cls_ingress:
test_syncookie:FAIL:Unexpected syncookie states gen_cookie:80326634 recv_cookie:0
bpf prog error at line 97
"error at line 97" means the bpf prog cannot find the listening socket
when the final ack is received. It then skipped processing
the syncookie in the final ack which then led to "recv_cookie:0".
The problem is the userspace program did not do accept() and went
ahead to close(listen_fd) before the kernel (and the bpf prog) had
a chance to process the final ack.
The fix is to add accept() call so that the userspace will wait for
the kernel to finish processing the final ack first before close()-ing
everything.
Fixes: 3261b8bd461e ("bpf: selftest: Add test_btf_skc_cls_ingress") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211216191630.466151-1-kafai@fb.com
selftest/bpf: Add a test that reads various addresses.
Add a function to bpf_testmod that returns invalid kernel and user addresses.
Then attach an fexit program to that function that tries to read
memory through these addresses.
This logic checks that bpf_probe_read_kernel and BPF_PROBE_MEM logic is sane.
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The verifier checks that PTR_TO_BTF_ID pointer is either valid or NULL,
but it cannot distinguish IS_ERR pointer from valid one.
When offset is added to IS_ERR pointer it may become small positive
value which is a user address that is not handled by extable logic
and has to be checked for at the runtime.
Tighten BPF_PROBE_MEM pointer check code to prevent this case.
Fixes: 643d483d7b91 ("bpf: Emit explicit NULL pointer checks for PROBE_LDX instructions.") Reported-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The prog - start_of_ldx is the offset before the faulting ldx to the location
after it, so this will be used to adjust pt_regs->ip for jumping over it and
continuing, and with old temp it would have been fixed up to the wrong offset,
causing crash.
Fixes: 643d483d7b91 ("bpf: Emit explicit NULL pointer checks for PROBE_LDX instructions.") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Linus Torvalds [Thu, 16 Dec 2021 19:48:59 +0000 (11:48 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
"A single fix for the clk framework that needed some more bake time in
linux-next.
The problem is that two clks being registered at the same time can
lead to a busted clk tree if the parent isn't fully registered by the
time the child finds the parent. We rejigger the place where we mark
the parent as fully registered so that the child can't find the parent
until things are proper"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: Don't parent clks until the parent is fully registered
Daniel Borkmann [Wed, 15 Dec 2021 22:28:48 +0000 (22:28 +0000)]
bpf: Make 32->64 bounds propagation slightly more robust
Make the bounds propagation in __reg_assign_32_into_64() slightly more
robust and readable by aligning it similarly as we did back in the
__reg_combine_64_into_32() counterpart. Meaning, only propagate or
pessimize them as a smin/smax pair.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Wed, 15 Dec 2021 22:02:19 +0000 (22:02 +0000)]
bpf: Fix signed bounds propagation after mov32
For the case where both s32_{min,max}_value bounds are positive, the
__reg_assign_32_into_64() directly propagates them to their 64 bit
counterparts, otherwise it pessimises them into [0,u32_max] universe and
tries to refine them later on by learning through the tnum as per comment
in mentioned function. However, that does not always happen, for example,
in mov32 operation we call zext_32_to_64(dst_reg) which invokes the
__reg_assign_32_into_64() as is without subsequent bounds update as
elsewhere thus no refinement based on tnum takes place.
Thus, not calling into the __update_reg_bounds() / __reg_deduce_bounds() /
__reg_bound_offset() triplet as we do, for example, in case of ALU ops via
adjust_scalar_min_max_vals(), will lead to more pessimistic bounds when
dumping the full register state:
Technically, the smin_value=0 and smax_value=4294967295 bounds are not
incorrect, but given the register is still a constant, they break assumptions
about const scalars that smin_value == smax_value and umin_value == umax_value.
Without the smin_value == smax_value and umin_value == umax_value invariant
being intact for const scalars, it is possible to leak out kernel pointers
from unprivileged user space if the latter is enabled. For example, when such
registers are involved in pointer arithmtics, then adjust_ptr_min_max_vals()
will taint the destination register into an unknown scalar, and the latter
can be exported and stored e.g. into a BPF map value.
Fixes: 576785048b47 ("bpf: Verifier, do explicit ALU32 bounds tracking") Reported-by: Kuee K1r0a <liulin063@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>
Linus Torvalds [Thu, 16 Dec 2021 18:44:20 +0000 (10:44 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Fix missing error code on kexec failure path"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: kexec: Fix missing error code 'ret' warning in load_other_segments()
Linus Torvalds [Thu, 16 Dec 2021 18:05:49 +0000 (10:05 -0800)]
Merge tag 'for-5.16/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix use after free in DM btree remove's rebalance_children()
- Fix DM integrity data corruption, introduced during 5.16 merge, due
to improper use of bvec_kmap_local()
* tag 'for-5.16/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm integrity: fix data corruption due to improper use of bvec_kmap_local
dm btree remove: fix use after free in rebalance_children()
Return code is not set to an error code in load_other_segments() when
of_kexec_alloc_and_setup_fdt() call returns a NULL dtb. This results
in status success (return code set to 0) being returned from
load_other_segments().
Set return code to -EINVAL if of_kexec_alloc_and_setup_fdt() returns
NULL dtb.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 843ef14bcdb1 ("arm64: Use common of_kexec_alloc_and_setup_fdt()") Link: https://lore.kernel.org/r/20211210010121.101823-1-nramas@linux.microsoft.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
David Howells [Tue, 14 Dec 2021 09:22:12 +0000 (09:22 +0000)]
afs: Fix mmap
Fix afs_add_open_map() to check that the vnode isn't already on the list
when it adds it. It's possible that afs_drop_open_mmap() decremented
the cb_nr_mmap counter, but hadn't yet got into the locked section to
remove it.
Also vnode->cb_mmap_link should be initialised, so fix that too.
Fixes: 5c5ec604ae2a ("afs: Fix mmap coherency vs 3rd-party changes") Reported-by: kafs-testing+fedora34_64checkkafs-build-300@auristor.com Suggested-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kafs-testing+fedora34_64checkkafs-build-300@auristor.com
cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/686465.1639435380@warthog.procyon.org.uk/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Florian Fainelli [Wed, 15 Dec 2021 20:24:49 +0000 (12:24 -0800)]
net: systemport: Add global locking for descriptor lifecycle
The descriptor list is a shared resource across all of the transmit queues, and
the locking mechanism used today only protects concurrency across a given
transmit queue between the transmit and reclaiming. This creates an opportunity
for the SYSTEMPORT hardware to work on corrupted descriptors if we have
multiple producers at once which is the case when using multiple transmit
queues.
This was particularly noticeable when using multiple flows/transmit queues and
it showed up in interesting ways in that UDP packets would get a correct UDP
header checksum being calculated over an incorrect packet length. Similarly TCP
packets would get an equally correct checksum computed by the hardware over an
incorrect packet length.
The SYSTEMPORT hardware maintains an internal descriptor list that it re-arranges
when the driver produces a new descriptor anytime it writes to the
WRITE_PORT_{HI,LO} registers, there is however some delay in the hardware to
re-organize its descriptors and it is possible that concurrent TX queues
eventually break this internal allocation scheme to the point where the
length/status part of the descriptor gets used for an incorrect data buffer.
The fix is to impose a global serialization for all TX queues in the short
section where we are writing to the WRITE_PORT_{HI,LO} registers which solves
the corruption even with multiple concurrent TX queues being used.
D. Wythe [Wed, 15 Dec 2021 12:29:21 +0000 (20:29 +0800)]
net/smc: Prevent smc_release() from long blocking
In nginx/wrk benchmark, there's a hung problem with high probability
on case likes that: (client will last several minutes to exit)
server: smc_run nginx
client: smc_run wrk -c 10000 -t 1 http://server
Client hangs with the following backtrace:
0 [ffffa7ce8Of3bbf8] __schedule at ffffffff9f9eOd5f
1 [ffffa7ce8Of3bc88] schedule at ffffffff9f9eløe6
2 [ffffa7ce8Of3bcaO] schedule_timeout at ffffffff9f9e3f3c
3 [ffffa7ce8Of3bd2O] wait_for_common at ffffffff9f9el9de
4 [ffffa7ce8Of3bd8O] __flush_work at ffffffff9fOfeOl3
5 [ffffa7ce8øf3bdfO] smc_release at ffffffffcO697d24 [smc]
6 [ffffa7ce8Of3be2O] __sock_release at ffffffff9f8O2e2d
7 [ffffa7ce8Of3be4ø] sock_close at ffffffff9f8ø2ebl
8 [ffffa7ce8øf3be48] __fput at ffffffff9f334f93
9 [ffffa7ce8Of3be78] task_work_run at ffffffff9flOlff5
10 [ffffa7ce8Of3beaO] do_exit at ffffffff9fOe5Ol2
11 [ffffa7ce8Of3bflO] do_group_exit at ffffffff9fOe592a
12 [ffffa7ce8Of3bf38] __x64_sys_exit_group at ffffffff9fOe5994
13 [ffffa7ce8Of3bf4O] do_syscall_64 at ffffffff9f9d4373
14 [ffffa7ce8Of3bfsO] entry_SYSCALL_64_after_hwframe at ffffffff9fa0007c
This issue dues to flush_work(), which is used to wait for
smc_connect_work() to finish in smc_release(). Once lots of
smc_connect_work() was pending or all executing work dangling,
smc_release() has to block until one worker comes to free, which
is equivalent to wait another smc_connnect_work() to finish.
In order to fix this, There are two changes:
1. For those idle smc_connect_work(), cancel it from the workqueue; for
executing smc_connect_work(), waiting for it to finish. For that
purpose, replace flush_work() with cancel_work_sync().
2. Since smc_connect() hold a reference for passive closing, if
smc_connect_work() has been cancelled, release the reference.
Fixes: 67f98918e75b ("net/smc: rebuild nonblocking connect") Reported-by: Tony Lu <tonylu@linux.alibaba.com> Tested-by: Dust Li <dust.li@linux.alibaba.com> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Link: https://lore.kernel.org/r/1639571361-101128-1-git-send-email-alibuda@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gal Pressman [Thu, 16 Dec 2021 09:28:25 +0000 (11:28 +0200)]
net: Fix double 0x prefix print in SKB dump
When printing netdev features %pNF already takes care of the 0x prefix,
remove the explicit one.
Fixes: 74fd0b22ca13 ("skbuff: increase verbosity when dumping skb data") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wenliang Wang [Thu, 16 Dec 2021 03:11:35 +0000 (11:11 +0800)]
virtio_net: fix rx_drops stat for small pkts
We found the stat of rx drops for small pkts does not increment when
build_skb fail, it's not coherent with other mode's rx drops stat.
Signed-off-by: Wenliang Wang <wangwenliang.1995@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrey Eremeev [Wed, 15 Dec 2021 17:30:32 +0000 (20:30 +0300)]
dsa: mv88e6xxx: fix debug print for SPEED_UNFORCED
Debug print uses invalid check to detect if speed is unforced:
(speed != SPEED_UNFORCED) should be used instead of (!speed).
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Andrey Eremeev <Axtone4all@yandex.ru> Fixes: f4b4622108c0 ("net: dsa: mv88e6xxx: add port's MAC speed setter") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiasheng Jiang [Wed, 15 Dec 2021 14:37:31 +0000 (22:37 +0800)]
sfc_ef100: potential dereference of null pointer
The return value of kmalloc() needs to be checked.
To avoid use in efx_nic_update_stats() in case of the failure of alloc.
Fixes: a55cfb214878 ("sfc_ef100: statistics gathering") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Keeping [Tue, 14 Dec 2021 19:10:09 +0000 (19:10 +0000)]
net: stmmac: dwmac-rk: fix oob read in rk_gmac_setup
KASAN reports an out-of-bounds read in rk_gmac_setup on the line:
while (ops->regs[i]) {
This happens for most platforms since the regs flexible array member is
empty, so the memory after the ops structure is being read here. It
seems that mostly this happens to contain zero anyway, so we get lucky
and everything still works.
To avoid adding redundant data to nearly all the ops structures, add a
new flag to indicate whether the regs field is valid and avoid this loop
when it is not.
Fixes: f6cbed57c8dd ("net: stmmac: Add RK3566/RK3568 SoC support") Signed-off-by: John Keeping <john@metanate.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Wed, 15 Dec 2021 14:39:37 +0000 (09:39 -0500)]
net/packet: rx_owner_map depends on pg_vec
Packet sockets may switch ring versions. Avoid misinterpreting state
between versions, whose fields share a union. rx_owner_map is only
allocated with a packet ring (pg_vec) and both are swapped together.
If pg_vec is NULL, meaning no packet ring was allocated, then neither
was rx_owner_map. And the field may be old state from a tpacket_v3.
Fixes: 6b1d969bc861 ("net/packet: tpacket_rcv: avoid a producer race condition") Reported-by: Syzbot <syzbot+1ac0994a0a0c55151121@syzkaller.appspotmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20211215143937.106178-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Haimin Zhang [Wed, 15 Dec 2021 11:15:30 +0000 (19:15 +0800)]
netdevsim: Zero-initialize memory for new map's value in function nsim_bpf_map_alloc
Zero-initialize memory for new map's value in function nsim_bpf_map_alloc
since it may cause a potential kernel information leak issue, as follows:
1. nsim_bpf_map_alloc calls nsim_map_alloc_elem to allocate elements for
a new map.
2. nsim_map_alloc_elem uses kmalloc to allocate map's value, but doesn't
zero it.
3. A user application can use IOCTL BPF_MAP_LOOKUP_ELEM to get specific
element's information in the map.
4. The kernel function map_lookup_elem will call bpf_map_copy_value to get
the information allocated at step-2, then use copy_to_user to copy to the
user buffer.
This can only leak information for an array map.
Fixes: c80225d11804 ("netdevsim: bpf: support fake map offload") Suggested-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Haimin Zhang <tcs.kernel@gmail.com> Link: https://lore.kernel.org/r/20211215111530.72103-1-tcs.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ioana Ciornei [Wed, 15 Dec 2021 10:58:31 +0000 (12:58 +0200)]
dpaa2-eth: fix ethtool statistics
Unfortunately, with the blamed commit I also added a side effect in the
ethtool stats shown. Because I added two more fields in the per channel
structure without verifying if its size is used in any way, part of the
ethtool statistics were off by 2.
Fix this by not looking up the size of the structure but instead on a
fixed value kept in a macro.
Xu Yang [Thu, 9 Dec 2021 10:15:07 +0000 (18:15 +0800)]
usb: typec: tcpm: fix tcpm unregister port but leave a pending timer
In current design, when the tcpm port is unregisterd, the kthread_worker
will be destroyed in the last step. Inside the kthread_destroy_worker(),
the worker will flush all the works and wait for them to end. However, if
one of the works calls hrtimer_start(), this hrtimer will be pending until
timeout even though tcpm port is removed. Once the hrtimer timeout, many
strange kernel dumps appear.
Thus, we can first complete kthread_destroy_worker(), then cancel all the
hrtimers. This will guarantee that no hrtimer is pending at the end.
Fixes: d29320bb0bc0 ("usb: typec: tcpm: Migrate workqueue to RT priority for processing events")
cc: <stable@vger.kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20211209101507.499096-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pawel Laszczak [Tue, 14 Dec 2021 04:55:27 +0000 (05:55 +0100)]
usb: cdnsp: Fix lack of spin_lock_irqsave/spin_lock_restore
Patch puts content of cdnsp_gadget_pullup function inside
spin_lock_irqsave and spin_lock_restore section.
This construction is required here to keep the data consistency,
otherwise some data can be changed e.g. from interrupt context.
Fixes: 7a5ffba5c283 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Reported-by: Ken (Jian) He <jianhe@ambarella.com>
cc: <stable@vger.kernel.org> Signed-off-by: Pawel Laszczak <pawell@cadence.com>
--
Changelog:
v2:
- added disable_irq/enable_irq as sugester by Peter Chen
Commit bd1df14897c8 ("serial: 8250_fintek: Enable high speed mode on Fintek F81866")
introduced support to use high baudrate with Fintek SuperIO UARTs. It'll
change clocksources when the UART probed.
But when user add kernel parameter "console=ttyS0,115200 console=tty0" to make
the UART as console output, the console will output garbled text after the
following kernel message.
The issue is occurs in following step:
probe_setup_port() -> fintek_8250_goto_highspeed()
It change clocksource from 115200 to 921600 with wrong time, it should change
clocksource in set_termios() not in probed. The following 3 patches are
implemented change clocksource in fintek_8250_set_termios().
Due to the high baud rate had implemented above 3 patches and the patch
Commit bd1df14897c8 ("serial: 8250_fintek: Enable high speed mode on Fintek F81866")
is bugged, So this patch will remove it.
Fixes: bd1df14897c8 ("serial: 8250_fintek: Enable high speed mode on Fintek F81866") Signed-off-by: Ji-Ze Hong (Peter Hong) <hpeter+linux_kernel@gmail.com> Link: https://lore.kernel.org/r/20211215075835.2072-1-hpeter+linux_kernel@gmail.com Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tetsuo Handa [Wed, 15 Dec 2021 11:52:40 +0000 (20:52 +0900)]
tty: n_hdlc: make n_hdlc_tty_wakeup() asynchronous
syzbot is reporting that an unprivileged user who logged in from tty
console can crash the system using a reproducer shown below [1], for
n_hdlc_tty_wakeup() is synchronously calling n_hdlc_send_frames().
int main(int argc, char *argv[])
{
const int disc = 0xd;
ioctl(1, TIOCSETD, &disc);
while (1) {
ioctl(1, TCXONC, 0);
write(1, "", 1);
ioctl(1, TCXONC, 1); /* Kernel panic - not syncing: scheduling while atomic */
}
}
----------
Linus suspected that "struct tty_ldisc"->ops->write_wakeup() must not
sleep, and Jiri confirmed it from include/linux/tty_ldisc.h. Thus, defer
n_hdlc_send_frames() from n_hdlc_tty_wakeup() to a WQ context like
net/nfc/nci/uart.c does.
Link: https://syzkaller.appspot.com/bug?extid=5f47a8cea6a12b77a876 Reported-by: syzbot <syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com> Cc: stable <stable@vger.kernel.org> Analyzed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Confirmed-by: Jiri Slaby <jirislaby@kernel.org> Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: https://lore.kernel.org/r/40de8b7e-a3be-4486-4e33-1b1d1da452f8@i-love.sakura.ne.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cyril Novikov [Tue, 2 Nov 2021 01:39:36 +0000 (18:39 -0700)]
ixgbe: set X550 MDIO speed before talking to PHY
The MDIO bus speed must be initialized before talking to the PHY the first
time in order to avoid talking to it using a speed that the PHY doesn't
support.
This fixes HW initialization error -17 (IXGBE_ERR_PHY_ADDR_INVALID) on
Denverton CPUs (a.k.a. the Atom C3000 family) on ports with a 10Gb network
plugged in. On those devices, HLREG0[MDCSPD] resets to 1, which combined
with the 10Gb network results in a 24MHz MDIO speed, which is apparently
too fast for the connected PHY. PHY register reads over MDIO bus return
garbage, leading to initialization failure.
Reproduced with Linux kernel 4.19 and 5.15-rc7. Can be reproduced using
the following setup:
* Use an Atom C3000 family system with at least one X552 LAN on the SoC
* Disable PXE or other BIOS network initialization if possible
(the interface must not be initialized before Linux boots)
* Connect a live 10Gb Ethernet cable to an X550 port
* Power cycle (not reset, doesn't always work) the system and boot Linux
* Observe: ixgbe interfaces w/ 10GbE cables plugged in fail with error -17
Fixes: bcd6384d7d25 ("ixgbe: Introduce function to control MDIO speed") Signed-off-by: Cyril Novikov <cnovikov@lynx.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Mike Snitzer [Wed, 15 Dec 2021 17:31:51 +0000 (12:31 -0500)]
dm integrity: fix data corruption due to improper use of bvec_kmap_local
Commit 9fd3b1ce0d4f ("dm integrity: use bvec_kmap_local in
__journal_read_write") didn't account for __journal_read_write() later
adding the biovec's bv_offset. As such using bvec_kmap_local() caused
the start of the biovec to be skipped.
Trivial test that illustrates data corruption:
# integritysetup format /dev/pmem0
# integritysetup open /dev/pmem0 integrityroot
# mkfs.xfs /dev/mapper/integrityroot
...
bad magic number
bad magic number
Metadata corruption detected at xfs_sb block 0x0/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x1000
releasing dirty buffer (bulk) to free list!
Fix this by using kmap_local_page() instead of bvec_kmap_local() in
__journal_read_write().
Fixes: 9fd3b1ce0d4f ("dm integrity: use bvec_kmap_local in __journal_read_write") Reported-by: Tony Asleson <tasleson@redhat.com> Reviewed-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Commit a9d844bfbf2b ("ixgbe: Add ethtool support to enable 2.5 and 5.0
Gbps support") introduced suppression of the advertisement of NBASE-T
speeds by default, according to Todd Fujinaka to accommodate customers
with network switches which could not cope with advertised NBASE-T
speeds, as posted in the E1000-devel mailing list:
However, the suppression was not documented at all, nor was how to
enable NBASE-T support.
Properly document the NBASE-T suppression and how to enable NBASE-T
support.
Fixes: a9d844bfbf2b ("ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support") Reported-by: Robert Schlabbach <robert_s@gmx.net> Signed-off-by: Robert Schlabbach <robert_s@gmx.net> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Sasha Neftin [Tue, 2 Nov 2021 07:20:06 +0000 (09:20 +0200)]
igc: Fix typo in i225 LTR functions
The LTR maximum value was incorrectly written using the scale from
the LTR minimum value. This would cause incorrect values to be sent,
in cases where the initial calculation lead to different min/max scales.
Letu Ren [Sat, 13 Nov 2021 03:42:34 +0000 (11:42 +0800)]
igbvf: fix double free in `igbvf_probe`
In `igbvf_probe`, if register_netdev() fails, the program will go to
label err_hw_init, and then to label err_ioremap. In free_netdev() which
is just below label err_ioremap, there is `list_for_each_entry_safe` and
`netif_napi_del` which aims to delete all entries in `dev->napi_list`.
The program has added an entry `adapter->rx_ring->napi` which is added by
`netif_napi_add` in igbvf_alloc_queues(). However, adapter->rx_ring has
been freed below label err_hw_init. So this a UAF.
In terms of how to patch the problem, we can refer to igbvf_remove() and
delete the entry before `adapter->rx_ring`.
Fixes: ac92e453a3681 (igbvf: add new driver to support 82576 virtual functions) Reported-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Letu Ren <fantasquex@gmail.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Karen Sornek [Tue, 31 Aug 2021 11:16:35 +0000 (13:16 +0200)]
igb: Fix removal of unicast MAC filters of VFs
Move checking condition of VF MAC filter before clearing
or adding MAC filter to VF to prevent potential blackout caused
by removal of necessary and working VF's MAC filter.
Fixes: 089e281baef6 ("igb: add VF trust infrastructure") Signed-off-by: Karen Sornek <karen.sornek@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Linus Torvalds [Wed, 15 Dec 2021 19:06:41 +0000 (11:06 -0800)]
Merge tag 'ceph-for-5.16-rc6' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"An SGID directory handling fix (marked for stable), a metrics
accounting fix and two fixups to appease static checkers"
* tag 'ceph-for-5.16-rc6' of git://github.com/ceph/ceph-client:
ceph: fix up non-directory creation in SGID directories
ceph: initialize pathlen variable in reconnect_caps_cb
ceph: initialize i_size variable in ceph_sync_read
ceph: fix duplicate increment of opened_inodes metric
Linus Torvalds [Wed, 15 Dec 2021 18:52:01 +0000 (10:52 -0800)]
Merge tag 's390-5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- Add missing handling of R_390_PLT32DBL relocation type in
arch_kexec_apply_relocations_add(). Clang and the upcoming gcc 11.3
generate such relocation entries, which our relocation code silently
ignores, and which finally will result in an endless loop within the
purgatory code in case of kexec.
- Add proper handling of errors and print error messages when applying
relocations
- Fix duplicate tracking of irq nesting level in entry code
- Let recordmcount.pl also look for jgnop mnemonic. Starting with
binutils 2.37 objdump emits a jgnop mnemonic instead of brcl, which
breaks mcount location detection. This is only a problem if used with
compilers older than gcc 9, since with gcc 9 and newer compilers
recordmcount.pl is not used anymore.
- Remove preempt_disable()/preempt_enable() pair in
kprobe_ftrace_handler() which was done for all architectures except
for s390.
- Update defconfig
* tag 's390-5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
recordmcount.pl: look for jgnop instruction as well as bcrl on s390
s390/entry: fix duplicate tracking of irq nesting level
s390: enable switchdev support in defconfig
s390/kexec: handle R_390_PLT32DBL rela in arch_kexec_apply_relocations_add()
s390/ftrace: remove preempt_disable()/preempt_enable() pair
s390/kexec_file: fix error handling when applying relocations
s390/kexec_file: print some more error messages
Linus Torvalds [Wed, 15 Dec 2021 18:46:03 +0000 (10:46 -0800)]
Merge tag 'hyperv-fixes-signed-20211214' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fix from Wei Liu:
"Build fix from Randy Dunlap"
* tag 'hyperv-fixes-signed-20211214' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hv: utils: add PTP_1588_CLOCK to Kconfig to fix build
Paul Moore [Thu, 9 Dec 2021 16:46:07 +0000 (11:46 -0500)]
audit: improve robustness of the audit queue handling
If the audit daemon were ever to get stuck in a stopped state the
kernel's kauditd_thread() could get blocked attempting to send audit
records to the userspace audit daemon. With the kernel thread
blocked it is possible that the audit queue could grow unbounded as
certain audit record generating events must be exempt from the queue
limits else the system enter a deadlock state.
This patch resolves this problem by lowering the kernel thread's
socket sending timeout from MAX_SCHEDULE_TIMEOUT to HZ/10 and tweaks
the kauditd_send_queue() function to better manage the various audit
queues when connection problems occur between the kernel and the
audit daemon. With this patch, the backlog may temporarily grow
beyond the defined limits when the audit daemon is stopped and the
system is under heavy audit pressure, but kauditd_thread() will
continue to make progress and drain the queues as it would for other
connection problems. For example, with the audit daemon put into a
stopped state and the system configured to audit every syscall it
was still possible to shutdown the system without a kernel panic,
deadlock, etc.; granted, the system was slow to shutdown but that is
to be expected given the extreme pressure of recording every syscall.
The timeout value of HZ/10 was chosen primarily through
experimentation and this developer's "gut feeling". There is likely
no one perfect value, but as this scenario is limited in scope (root
privileges would be needed to send SIGSTOP to the audit daemon), it
is likely not worth exposing this as a tunable at present. This can
always be done at a later date if it proves necessary.
Cc: stable@vger.kernel.org Fixes: ceee30414e577 ("audit: fix auditd/kernel connection state tracking") Reported-by: Gaosheng Cui <cuigaosheng1@huawei.com> Tested-by: Gaosheng Cui <cuigaosheng1@huawei.com> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
Amelie Delaunay [Tue, 7 Dec 2021 12:45:10 +0000 (13:45 +0100)]
usb: dwc2: fix STM ID/VBUS detection startup delay in dwc2_driver_probe
When activate_stm_id_vb_detection is enabled, ID and Vbus detection relies
on sensing comparators. This detection needs time to stabilize.
A delay was already applied in dwc2_resume() when reactivating the
detection, but it wasn't done in dwc2_probe().
This patch adds delay after enabling STM ID/VBUS detection. Then, ID state
is good when initializing gadget and host, and avoid to get a wrong
Connector ID Status Change interrupt.
Fixes: ebd56c34fed6 ("usb: dwc2: add support for STM32MP15 SoCs USB OTG HS and FS") Cc: stable <stable@vger.kernel.org> Acked-by: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com> Link: https://lore.kernel.org/r/20211207124510.268841-1-amelie.delaunay@foss.st.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
USB: gadget: bRequestType is a bitfield, not a enum
Szymon rightly pointed out that the previous check for the endpoint
direction in bRequestType was not looking at only the bit involved, but
rather the whole value. Normally this is ok, but for some request
types, bits other than bit 8 could be set and the check for the endpoint
length could not stall correctly.
soc/tegra: fuse: Fix bitwise vs. logical OR warning
A new warning in clang points out two instances where boolean
expressions are being used with a bitwise OR instead of logical OR:
drivers/soc/tegra/fuse/speedo-tegra20.c:72:9: warning: use of bitwise '|' with boolean operands [-Wbitwise-instead-of-logical]
reg = tegra_fuse_read_spare(i) |
^~~~~~~~~~~~~~~~~~~~~~~~~~
||
drivers/soc/tegra/fuse/speedo-tegra20.c:72:9: note: cast one or both operands to int to silence this warning
drivers/soc/tegra/fuse/speedo-tegra20.c:87:9: warning: use of bitwise '|' with boolean operands [-Wbitwise-instead-of-logical]
reg = tegra_fuse_read_spare(i) |
^~~~~~~~~~~~~~~~~~~~~~~~~~
||
drivers/soc/tegra/fuse/speedo-tegra20.c:87:9: note: cast one or both operands to int to silence this warning
2 warnings generated.
The motivation for the warning is that logical operations short circuit
while bitwise operations do not.
In this instance, tegra_fuse_read_spare() is not semantically returning
a boolean, it is returning a bit value. Use u32 for its return type so
that it can be used with either bitwise or boolean operators without any
warnings.
btrfs: fix missing blkdev_put() call in btrfs_scan_one_device()
The function btrfs_scan_one_device() calls blkdev_get_by_path() and
blkdev_put() to get and release its target block device. However, when
btrfs_sb_log_location_bdev() fails, blkdev_put() is not called and the
block device is left without clean up. This triggered failure of fstests
generic/085. Fix the failure path of btrfs_sb_log_location_bdev() to
call blkdev_put().
Fixes: 4c772e1914afa ("btrfs: implement log-structured superblock for ZONED mode") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Mon, 13 Dec 2021 08:45:13 +0000 (08:45 +0000)]
btrfs: fix warning when freeing leaf after subvolume creation failure
When creating a subvolume, at ioctl.c:create_subvol(), if we fail to
insert the root item for the new subvolume into the root tree, we can
trigger the following warning:
This is because we are calling btrfs_free_tree_block() on an extent
buffer that is dirty. Fix that by cleaning the extent buffer, with
btrfs_clean_tree_block(), before freeing it.
This was triggered by test case generic/475 from fstests.
Fixes: d39804a16fea12 ("btrfs: fix metadata extent leak after failure to create subvolume") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Mon, 13 Dec 2021 08:45:12 +0000 (08:45 +0000)]
btrfs: fix invalid delayed ref after subvolume creation failure
When creating a subvolume, at ioctl.c:create_subvol(), if we fail to
insert the new root's root item into the root tree, we are freeing the
metadata extent we reserved for the new root to prevent a metadata
extent leak, as we don't abort the transaction at that point (since
there is nothing at that point that is irreversible).
However we allocated the metadata extent for the new root which we are
creating for the new subvolume, so its delayed reference refers to the
ID of this new root. But when we free the metadata extent we pass the
root of the subvolume where the new subvolume is located to
btrfs_free_tree_block() - this is incorrect because this will generate
a delayed reference that refers to the ID of the parent subvolume's root,
and not to ID of the new root.
This results in a failure when running delayed references that leads to
a transaction abort and a trace like the following:
Fix this by passing the new subvolume's root ID to btrfs_free_tree_block().
This requires changing the root argument of btrfs_free_tree_block() from
struct btrfs_root * to a u64, since at this point during the subvolume
creation we have not yet created the struct btrfs_root for the new
subvolume, and btrfs_free_tree_block() only needs a root ID and nothing
else from a struct btrfs_root.
This was triggered by test case generic/475 from fstests.
Fixes: d39804a16fea12 ("btrfs: fix metadata extent leak after failure to create subvolume") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Josef Bacik [Mon, 13 Dec 2021 19:22:33 +0000 (14:22 -0500)]
btrfs: check WRITE_ERR when trying to read an extent buffer
Filipe reported a hang when we have errors on btrfs. This turned out to
be a side-effect of my fix 77b57e093feffa ("btrfs: clear extent buffer
uptodate when we fail to write it") which made it so we clear
EXTENT_BUFFER_UPTODATE on an eb when we fail to write it out.
Below is a paste of Filipe's analysis he got from using drgn to debug
the hang
"""
btree readahead code calls read_extent_buffer_pages(), sets ->io_pages to
a value while writeback of all pages has not yet completed:
--> writeback for the first 3 pages finishes, we clear
EXTENT_BUFFER_UPTODATE from eb on the first page when we get an
error.
--> at this point eb->io_pages is 1 and we cleared Uptodate bit from the
first 3 pages
--> read_extent_buffer_pages() does not see EXTENT_BUFFER_UPTODATE() so
it continues, it's able to lock the pages since we obviously don't
hold the pages locked during writeback
--> read_extent_buffer_pages() then computes 'num_reads' as 3, and sets
eb->io_pages to 3, since only the first page does not have Uptodate
bit set at this point
--> writeback for the remaining page completes, we ended decrementing
eb->io_pages by 1, resulting in eb->io_pages == 2, and therefore
never calling end_extent_buffer_writeback(), so
EXTENT_BUFFER_WRITEBACK remains in the eb's flags
--> of course, when the read bio completes, it doesn't and shouldn't
call end_extent_buffer_writeback()
--> we should clear EXTENT_BUFFER_UPTODATE only after all pages of
the eb finished writeback? or maybe make the read pages code
wait for writeback of all pages of the eb to complete before
checking which pages need to be read, touch ->io_pages, submit
read bio, etc
writeback bit never cleared means we can hang when aborting a
transaction, at:
This is a problem because our writes are not synchronized with reads in
any way. We clear the UPTODATE flag and then we can easily come in and
try to read the EB while we're still waiting on other bio's to
complete.
We have two options here, we could lock all the pages, and then check to
see if eb->io_pages != 0 to know if we've already got an outstanding
write on the eb.
Or we can simply check to see if we have WRITE_ERR set on this extent
buffer. We set this bit _before_ we clear UPTODATE, so if the read gets
triggered because we aren't UPTODATE because of a write error we're
guaranteed to have WRITE_ERR set, and in this case we can simply return
-EIO. This will fix the reported hang.
Reported-by: Filipe Manana <fdmanana@suse.com> Fixes: 77b57e093feffa ("btrfs: clear extent buffer uptodate when we fail to write it") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
Daniel Borkmann [Mon, 13 Dec 2021 22:25:23 +0000 (22:25 +0000)]
bpf, selftests: Update test case for atomic cmpxchg on r0 with pointer
Fix up unprivileged test case results for 'Dest pointer in r0' verifier tests
given they now need to reject R0 containing a pointer value, and add a couple
of new related ones with 32bit cmpxchg as well.
root@foo:~/bpf/tools/testing/selftests/bpf# ./test_verifier
#0/u invalid and of negative number OK
#0/p invalid and of negative number OK
[...]
#1268/p XDP pkt read, pkt_meta' <= pkt_data, bad access 1 OK
#1269/p XDP pkt read, pkt_meta' <= pkt_data, bad access 2 OK
#1270/p XDP pkt read, pkt_data <= pkt_meta', good access OK
#1271/p XDP pkt read, pkt_data <= pkt_meta', bad access 1 OK
#1272/p XDP pkt read, pkt_data <= pkt_meta', bad access 2 OK
Summary: 1900 PASSED, 0 SKIPPED, 0 FAILED
Given a BPF insn can only have two registers (dst, src), the R0 is fixed and
used as an auxilliary register for input (old value) as well as output (returning
old value from memory location). While the verifier performs a number of safety
checks, it misses to reject unprivileged programs where R0 contains a pointer as
old value.
Through brute-forcing it takes about ~16sec on my machine to leak a kernel pointer
with BPF_CMPXCHG. The PoC is basically probing for kernel addresses by storing the
guessed address into the map slot as a scalar, and using the map value pointer as
R0 while SRC_REG has a canary value to detect a matching address.
Fix it by checking R0 for pointers, and reject if that's the case for unprivileged
programs.
Daniel Borkmann [Tue, 7 Dec 2021 10:07:04 +0000 (10:07 +0000)]
bpf, selftests: Add test case for atomic fetch on spilled pointer
Test whether unprivileged would be able to leak the spilled pointer either
by exporting the returned value from the atomic{32,64} operation or by reading
and exporting the value from the stack after the atomic operation took place.
Note that for unprivileged, the below atomic cmpxchg test case named "Dest
pointer in r0 - succeed" is failing. The reason is that in the dst memory
location (r10 -8) there is the spilled register r10:
Daniel Borkmann [Tue, 7 Dec 2021 12:51:56 +0000 (12:51 +0000)]
bpf: Fix kernel address leakage in atomic fetch
The change in commit f33f93ce3900 ("bpf: Propagate stack bounds to registers
in atomics w/ BPF_FETCH") around check_mem_access() handling is buggy since
this would allow for unprivileged users to leak kernel pointers. For example,
an atomic fetch/and with -1 on a stack destination which holds a spilled
pointer will migrate the spilled register type into a scalar, which can then
be exported out of the program (since scalar != pointer) by dumping it into
a map value.
The original implementation of XADD was preventing this situation by using
a double call to check_mem_access() one with BPF_READ and a subsequent one
with BPF_WRITE, in both cases passing -1 as a placeholder value instead of
register as per XADD semantics since it didn't contain a value fetch. The
BPF_READ also included a check in check_stack_read_fixed_off() which rejects
the program if the stack slot is of __is_pointer_value() if dst_regno < 0.
The latter is to distinguish whether we're dealing with a regular stack spill/
fill or some arithmetical operation which is disallowed on non-scalars, see
also 071755280b8a ("bpf: Forbid XADD on spilled pointers for unprivileged
users") for more context on check_mem_access() and its handling of placeholder
value -1.
One minimally intrusive option to fix the leak is for the BPF_FETCH case to
initially check the BPF_READ case via check_mem_access() with -1 as register,
followed by the actual load case with non-negative load_reg to propagate
stack bounds to registers.
Fixes: f33f93ce3900 ("bpf: Propagate stack bounds to registers in atomics w/ BPF_FETCH") Reported-by: <n4ke4mry@gmail.com> Acked-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Lin Feng [Fri, 12 Nov 2021 05:36:29 +0000 (13:36 +0800)]
bcache: fix NULL pointer reference in cached_dev_detach_finish
Commit 9f5e3e7024ee ("bcache: move calc_cached_dev_sectors to proper
place on backing device detach") tries to fix calc_cached_dev_sectors
when bcache device detaches, but now we have:
cached_dev_detach_finish
...
bcache_device_detach(&dc->disk);
...
closure_put(&d->c->caching);
d->c = NULL; [*explicitly set dc->disk.c to NULL*]
list_move(&dc->list, &uncached_devices);
calc_cached_dev_sectors(dc->disk.c); [*passing a NULL pointer*]
...
Upper codeflows shows how bug happens, this patch fix the problem by
caching dc->disk.c beforehand, and cache_set won't be freed under us
because c->caching closure at least holds a reference count and closure
callback __cache_set_unregister only being called by bch_cache_set_stop
which using closure_queue(&c->caching), that means c->caching closure
callback for destroying cache_set won't be trigger by previous
closure_put(&d->c->caching).
So at this stage(while cached_dev_detach_finish is calling) it's safe to
access cache_set dc->disk.c.
Fixes: 9f5e3e7024ee ("bcache: move calc_cached_dev_sectors to proper place on backing device detach") Signed-off-by: Lin Feng <linf@wangsu.com> Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20211112053629.3437-2-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 14 Dec 2021 14:03:24 +0000 (07:03 -0700)]
block: reduce kblockd_mod_delayed_work_on() CPU consumption
Dexuan reports that he's seeing spikes of very heavy CPU utilization when
running 24 disks and using the 'none' scheduler. This happens off the
sched restart path, because SCSI requires the queue to be restarted async,
and hence we're hammering on mod_delayed_work_on() to ensure that the work
item gets run appropriately.
Avoid hammering on the timer and just use queue_work_on() if no delay
has been specified.
====================
mptcp: Fixes for ULP, a deadlock, and netlink docs
Two of the MPTCP fixes in this set are related to the TCP_ULP socket
option with MPTCP sockets operating in "fallback" mode (the connection
has reverted to regular TCP). The other issues are an observed deadlock
and missing parameter documentation in the MPTCP netlink API.
Patch 1 marks TCP_ULP as unsupported earlier in MPTCP setsockopt code,
so the fallback code path in the MPTCP layer does not pass the TCP_ULP
option down to the subflow TCP socket.
Patch 2 makes sure a TCP fallback socket returned to userspace by
accept()ing on a MPTCP listening socket does not allow use of the
"mptcp" TCP_ULP type. That ULP is intended only for use by in-kernel
MPTCP subflows.
Patch 3 fixes the possible deadlock when sending data and there are
socket option changes to sync to the subflows.
Patch 4 makes sure all MPTCP netlink event parameters are documented
in the MPTCP uapi header.
====================
Maxim Galaganov [Tue, 14 Dec 2021 23:16:03 +0000 (15:16 -0800)]
mptcp: fix deadlock in __mptcp_push_pending()
__mptcp_push_pending() may call mptcp_flush_join_list() with subflow
socket lock held. If such call hits mptcp_sockopt_sync_all() then
subsequently __mptcp_sockopt_sync() could try to lock the subflow
socket for itself, causing a deadlock.
Fix the issue by using __mptcp_flush_join_list() instead of plain
mptcp_flush_join_list() inside __mptcp_push_pending(), as suggested by
Florian. The sockopt sync will be deferred to the workqueue.
Fixes: 581d3c63be8a ("mptcp: setsockopt: handle SO_KEEPALIVE and SO_PRIORITY") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/244 Suggested-by: Florian Westphal <fw@strlen.de> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Maxim Galaganov <max@internet.ru> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Florian Westphal [Tue, 14 Dec 2021 23:16:02 +0000 (15:16 -0800)]
mptcp: clear 'kern' flag from fallback sockets
The mptcp ULP extension relies on sk->sk_sock_kern being set correctly:
It prevents setsockopt(fd, IPPROTO_TCP, TCP_ULP, "mptcp", 6); from
working for plain tcp sockets (any userspace-exposed socket).
But in case of fallback, accept() can return a plain tcp sk.
In such case, sk is still tagged as 'kernel' and setsockopt will work.
This will crash the kernel, The subflow extension has a NULL ctx->conn
mptcp socket:
BUG: KASAN: null-ptr-deref in subflow_data_ready+0x181/0x2b0
Call Trace:
tcp_data_ready+0xf8/0x370
[..]
Fixes: e16f0c73e49e ("mptcp: Create SUBFLOW socket for incoming connections") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Evan Quan [Mon, 13 Dec 2021 06:38:38 +0000 (14:38 +0800)]
drm/amdgpu: correct the wrong cached state for GMC on PICASSO
Pair the operations did in GMC ->hw_init and ->hw_fini. That
can help to maintain correct cached state for GMC and avoid
unintention gate operation dropping due to wrong cached state.
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828 Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
If the firmware wasn't reset by PSP or HW and is currently running
then the firmware will hang or perform underfined behavior when we
modify its firmware state underneath it.
[How]
Reset DMCUB before setting up cache windows and performing HW init.
Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com> Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>