Linus Torvalds [Fri, 17 Jun 2022 20:12:20 +0000 (15:12 -0500)]
Merge tag 'pci-v5.19-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci fix from Bjorn Helgaas:
"Revert clipping of PCI host bridge windows to avoid E820 regions,
which broke several machines by forcing unnecessary BAR reassignments
(Hans de Goede)"
* tag 'pci-v5.19-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
x86/PCI: Revert "x86/PCI: Clip only host bridge windows for E820 regions"
Linus Torvalds [Fri, 17 Jun 2022 19:57:42 +0000 (14:57 -0500)]
Merge tag 'printk-for-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fixes from Petr Mladek:
"Make the global console_sem available for CPU that is handling panic()
or shutdown.
This is an old problem when an existing console lock owner might block
console output, but it became more visible with the kthreads"
* tag 'printk-for-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: Wait for the global console lock when the system is going down
printk: Block console kthreads when direct printing will be required
Prior to 8540557c7060 ("x86/PCI: Clip only host bridge windows for E820
regions"), E820 regions did not affect PCI host bridge windows. We only
looked at E820 regions and avoided them when allocating new MMIO space.
If firmware PCI bridge window and BAR assignments used E820 regions, we
left them alone.
After 8540557c7060, we removed E820 regions from the PCI host bridge
windows before looking at BARs, so firmware assignments in E820 regions
looked like errors, and we moved things around to fit in the space left
(if any) after removing the E820 regions. This unnecessary BAR
reassignment broke several machines.
Guilherme reported that Steam Deck fails to boot after 8540557c7060. We
clipped the window that contained most 32-bit BARs:
BIOS-e820: [mem 0x00000000a0000000-0x00000000a00fffff] reserved
acpi PNP0A08:00: clipped [mem 0x80000000-0xf7ffffff window] to [mem 0xa0100000-0xf7ffffff window] for e820 entry [mem 0xa0000000-0xa00fffff]
which forced us to reassign all those BARs, for example, this NVMe BAR:
pci 0000:00:01.2: PCI bridge to [bus 01]
pci 0000:00:01.2: bridge window [mem 0x80600000-0x806fffff]
pci 0000:01:00.0: BAR 0: [mem 0x80600000-0x80603fff 64bit]
pci 0000:00:01.2: can't claim window [mem 0x80600000-0x806fffff]: no compatible bridge window
pci 0000:01:00.0: can't claim BAR 0 [mem 0x80600000-0x80603fff 64bit]: no compatible bridge window
All the reassignments were successful, so the devices should have been
functional at the new addresses, but some were not.
Andy reported a similar failure on an Intel MID platform. Benjamin
reported a similar failure on a VMWare Fusion VM.
Note: this is not a clean revert; this revert keeps the later change to
make the clipping dependent on a new pci_use_e820 bool, moving the checking
of this bool to arch_remove_reservations().
[bhelgaas: commit log, add more reporters and testers] BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216109 Reported-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Reported-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reported-by: Benjamin Coddington <bcodding@redhat.com> Reported-by: Jongman Heo <jongman.heo@gmail.com> Fixes: 8540557c7060 ("x86/PCI: Clip only host bridge windows for E820 regions") Link: https://lore.kernel.org/r/20220612144325.85366-1-hdegoede@redhat.com Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Linus Torvalds [Fri, 17 Jun 2022 18:55:19 +0000 (13:55 -0500)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Revert the moving of the jump labels initialisation before
setup_machine_fdt(). The bug was fixed in drivers/char/random.c.
- Ftrace fixes: branch range check and consistent handling of PLTs.
- Clean rather than invalidate FROM_DEVICE buffers at start of DMA
transfer (safer if such buffer is mapped in user space). A cache
invalidation is done already at the end of the transfer.
- A couple of clean-ups (unexport symbol, remove unused label).
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: Don't invalidate FROM_DEVICE buffers at start of DMA transfer
arm64/cpufeature: Unexport set_cpu_feature()
arm64: ftrace: remove redundant label
arm64: ftrace: consistently handle PLTs.
arm64: ftrace: fix branch range checks
Revert "arm64: Initialize jump labels before setup_machine_fdt()"
Linus Torvalds [Fri, 17 Jun 2022 18:50:24 +0000 (13:50 -0500)]
Merge tag 'loongarch-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Add missing ELF_DETAILS in vmlinux.lds.S and fix document rendering"
* tag 'loongarch-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
docs/zh_CN/LoongArch: Fix notes rendering by using reST directives
docs/LoongArch: Fix notes rendering by using reST directives
LoongArch: vmlinux.lds.S: Add missing ELF_DETAILS
Linus Torvalds [Fri, 17 Jun 2022 18:45:47 +0000 (13:45 -0500)]
Merge tag 'riscv-for-linus-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A fix for the PolarFire SOC's device tree
- A handful of fixes for the recently added Svpmbt support
- An improvement to the Kconfig text for Svpbmt
* tag 'riscv-for-linus-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Improve description for RISCV_ISA_SVPBMT Kconfig symbol
riscv: drop cpufeature_apply_feature tracking variable
riscv: fix dependency for t-head errata
riscv: dts: microchip: re-add pdma to mpfs device tree
* tag 'hyperv-fixes-signed-20220617' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/Hyper-V: Add SEV negotiate protocol support in Isolation VM
Drivers: hv: vmbus: Release cpu lock in error case
HID: hyperv: Correctly access fields declared as __le16
clocksource: hyper-v: unexport __init-annotated hv_init_clocksource()
Drivers: hv: Fix syntax errors in comments
Drivers: hv: vmbus: Don't assign VMbus channel interrupts to isolated CPUs
Linus Torvalds [Fri, 17 Jun 2022 18:22:58 +0000 (11:22 -0700)]
Merge tag 'block-5.19-2022-06-16' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- NVMe pull request from Christoph
- Quirks, quirks, quirks to work around buggy consumer grade
devices (Keith Bush, Ning Wang, Stefan Reiter, Rasheed Hsueh)
- Better kernel messages for devices that need quirking (Keith
Bush)
- Make a kernel message more useful (Thomas Weißschuh)
- MD pull request from Song, with a few fixes
- blk-mq sysfs locking fixes (Ming)
- BFQ stats fix (Bart)
- blk-mq offline queue fix (Bart)
- blk-mq flush request tag fix (Ming)
* tag 'block-5.19-2022-06-16' of git://git.kernel.dk/linux-block:
block/bfq: Enable I/O statistics
blk-mq: don't clear flush_rq from tags->rqs[]
blk-mq: avoid to touch q->elevator without any protection
blk-mq: protect q->elevator by ->sysfs_lock in blk_mq_elv_switch_none
block: Fix handling of offline queues in blk_mq_alloc_request_hctx()
md/raid5-ppl: Fix argument order in bio_alloc_bioset()
Revert "md: don't unregister sync_thread with reconfig_mutex held"
nvme-pci: disable write zeros support on UMIC and Samsung SSDs
nvme-pci: avoid the deepest sleep state on ZHITAI TiPro7000 SSDs
nvme-pci: sk hynix p31 has bogus namespace ids
nvme-pci: smi has bogus namespace ids
nvme-pci: phison e12 has bogus namespace ids
nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S50
nvme-pci: add trouble shooting steps for timeouts
nvme: add bug report info for global duplicate id
nvme: add device name to warning in uuid_show()
Linus Torvalds [Fri, 17 Jun 2022 18:14:07 +0000 (11:14 -0700)]
Merge tag 'io_uring-5.19-2022-06-16' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Bigger than usual at this time, both because we missed -rc2, but also
because of some reverts that we chose to do. In detail:
- Adjust mapped buffer API while we still can (Dylan)
- Mapped buffer fixes (Dylan, Hao, Pavel, me)
- Fix for uring_cmd wrong API usage for task_work (Dylan)
- Fix for bug introduced in fixed file closing (Hao)
- Fix race in buffer/file resource handling (Pavel)
- Revert the NOP support for CQE32 and buffer selection that was
brought up during the merge window (Pavel)
- Remove IORING_CLOSE_FD_AND_FILE_SLOT introduced in this merge
window. The API needs further refining, so just yank it for now and
we'll revisit for a later kernel.
- Series cleaning up the CQE32 support added in this merge window,
making it more integrated rather than sitting on the side (Pavel)"
* tag 'io_uring-5.19-2022-06-16' of git://git.kernel.dk/linux-block: (21 commits)
io_uring: recycle provided buffer if we punt to io-wq
io_uring: do not use prio task_work_add in uring_cmd
io_uring: commit non-pollable provided mapped buffers upfront
io_uring: make io_fill_cqe_aux honour CQE32
io_uring: remove __io_fill_cqe() helper
io_uring: fix ->extra{1,2} misuse
io_uring: fill extra big cqe fields from req
io_uring: unite fill_cqe and the 32B version
io_uring: get rid of __io_fill_cqe{32}_req()
io_uring: remove IORING_CLOSE_FD_AND_FILE_SLOT
Revert "io_uring: add buffer selection support to IORING_OP_NOP"
Revert "io_uring: support CQE32 for nop operation"
io_uring: limit size of provided buffer ring
io_uring: fix types in provided buffer ring
io_uring: fix index calculation
io_uring: fix double unlock for pbuf select
io_uring: kbuf: fix bug of not consuming ring buffer in partial io case
io_uring: openclose: fix bug of closing wrong fixed file
io_uring: fix not locked access to fixed buf table
io_uring: fix races with buffer table unregister
...
Will Deacon [Fri, 10 Jun 2022 15:12:27 +0000 (16:12 +0100)]
arm64: mm: Don't invalidate FROM_DEVICE buffers at start of DMA transfer
Invalidating the buffer memory in arch_sync_dma_for_device() for
FROM_DEVICE transfers
When using the streaming DMA API to map a buffer prior to inbound
non-coherent DMA (i.e. DMA_FROM_DEVICE), we invalidate any dirty CPU
cachelines so that they will not be written back during the transfer and
corrupt the buffer contents written by the DMA. This, however, poses two
potential problems:
(1) If the DMA transfer does not write to every byte in the buffer,
then the unwritten bytes will contain stale data once the transfer
has completed.
(2) If the buffer has a virtual alias in userspace, then stale data
may be visible via this alias during the period between performing
the cache invalidation and the DMA writes landing in memory.
Address both of these issues by cleaning (aka writing-back) the dirty
lines in arch_sync_dma_for_device(DMA_FROM_DEVICE) instead of discarding
them using invalidation.
Linus Torvalds [Fri, 17 Jun 2022 17:09:24 +0000 (10:09 -0700)]
Merge tag 'fs_for_v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull writeback and ext2 fixes from Jan Kara:
"A fix for writeback bug which prevented machines with kdevtmpfs from
booting and also one small ext2 bugfix in IO error handling"
* tag 'fs_for_v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
init: Initialize noop_backing_dev_info early
ext2: fix fs corruption when trying to remove a non-empty directory with IO error
Linus Torvalds [Fri, 17 Jun 2022 17:03:53 +0000 (10:03 -0700)]
Merge tag 'for-5.19/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix a race in DM core's dm_start_io_acct that could result in double
accounting for abnormal IO (e.g. discards, write zeroes, etc).
- Fix a use-after-free in DM core's dm_put_live_table_bio.
- Fix a race for REQ_NOWAIT bios being issued despite no support from
underlying DM targets (due to DM table reload at an "unlucky" time)
- Fix access beyond allocated bitmap in DM mirror's log.
* tag 'for-5.19/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm mirror log: round up region bitmap size to BITS_PER_LONG
dm: fix narrow race for REQ_NOWAIT bios being issued despite no support
dm: fix use-after-free in dm_put_live_table_bio
dm: fix race in dm_start_io_acct
Linus Torvalds [Fri, 17 Jun 2022 17:02:26 +0000 (10:02 -0700)]
Merge tag 'hwmon-for-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Add missing lock protection in occ driver
- Add missing comma in board name list in asus-ec-sensors driver
- Fix devicetree bindings for ti,tmp401
* tag 'hwmon-for-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (asus-ec-sensors) add missing comma in board name list.
hwmon: (occ) Lock mutex in shutdown to prevent race with occ_active
dt-bindings: hwmon: ti,tmp401: Drop 'items' from 'ti,n-factor' property
Linus Torvalds [Fri, 17 Jun 2022 14:58:39 +0000 (07:58 -0700)]
Merge tag 'char-misc-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char/misc driver fixes for 5.19-rc3 that resolve
some reported issues.
They include:
- mei driver fixes
- comedi driver fix
- rtsx build warning fix
- fsl-mc-bus driver fix
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
eeprom: at25: Split reads into chunks and cap write size
misc: atmel-ssc: Fix IRQ check in ssc_probe
char: lp: remove redundant initialization of err
Linus Torvalds [Fri, 17 Jun 2022 14:55:24 +0000 (07:55 -0700)]
Merge tag 'staging-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are some small staging driver fixes for 5.19-rc3 that resolve
reported issues:
- remove visorbus.h which was forgotten in the -rc1 merge where the
code that used it was removed
- olpc_dcon: mark as broken to allow the DRM developers to evolve the
fbdev api properly without having to deal with this obsolete
driver. It will be removed soon if no one steps up to adopt it and
fix the issues with it.
- rtl8723bs driver fix
- r8188eu driver fix to resolve many reports of the driver being
broken with -rc1.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: Also remove the Unisys visorbus.h
staging: rtl8723bs: Allocate full pwep structure
staging: olpc_dcon: mark driver as broken
staging: r8188eu: Fix warning of array overflow in ioctl_linux.c
staging: r8188eu: fix rtw_alloc_hwxmits error detection for now
Linus Torvalds [Fri, 17 Jun 2022 14:52:43 +0000 (07:52 -0700)]
Merge tag 'tty-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are some small tty and serial driver fixes for 5.19-rc3 to
resolve some reported problems:
- 8250 lsr read bugfix
- n_gsm line discipline allocation fix
- qcom serial driver fix for reported lockups that happened in -rc1
- goldfish tty driver fix
All have been in linux-next for a while now with no reported issues"
* tag 'tty-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: 8250: Store to lsr_save_flags after lsr read
tty: goldfish: Fix free_irq() on remove
tty: serial: qcom-geni-serial: Implement start_rx callback
serial: core: Introduce callback for start_rx and do stop_rx in suspend only if this callback implementation is present.
tty: n_gsm: Debug output allocation must use GFP_ATOMIC
Linus Torvalds [Fri, 17 Jun 2022 14:50:41 +0000 (07:50 -0700)]
Merge tag 'usb-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
"Here are some small USB driver fixes and new device ids for 5.19-rc3
They include:
- new usb-serial driver device ids
- usb gadget driver fixes for reported problems
- cdnsp driver fix
- dwc3 driver fixes for reported problems
- dwc3 driver fix for merge problem that I caused in 5.18
- xhci driver fixes
- dwc2 memory leak fix
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: gadget: f_fs: change ep->ep safe in ffs_epfile_io()
usb: gadget: f_fs: change ep->status safe in ffs_epfile_io()
xhci: Fix null pointer dereference in resume if xhci has only one roothub
USB: fixup for merge issue with "usb: dwc3: Don't switch OTG -> peripheral if extcon is present"
usb: cdnsp: Fixed setting last_trb incorrectly
usb: gadget: u_ether: fix regression in setting fixed MAC address
usb: gadget: lpc32xx_udc: Fix refcount leak in lpc32xx_udc_probe
usb: dwc2: Fix memory leak in dwc2_hcd_init
usb: dwc3: pci: Restore line lost in merge conflict resolution
usb: dwc3: gadget: Fix IN endpoint max packet size allocation
USB: serial: option: add support for Cinterion MV31 with new baseline
USB: serial: io_ti: add Agilent E5805A support
Youling Tang [Mon, 13 Jun 2022 10:54:12 +0000 (18:54 +0800)]
LoongArch: vmlinux.lds.S: Add missing ELF_DETAILS
Commit 6e17087a2a6 ("vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUG")
splits ELF_DETAILS from STABS_DEBUG, resulting in missing ELF_DETAILS
information in LoongArch architecture, so add it.
Jens Axboe [Fri, 17 Jun 2022 12:24:26 +0000 (06:24 -0600)]
io_uring: recycle provided buffer if we punt to io-wq
io_arm_poll_handler() will recycle the buffer appropriately if we end
up arming poll (or if we're ready to retry), but not for the io-wq case
if we have attempted poll first.
Explicitly recycle the buffer to avoid both hanging on to it too long,
but also to avoid multiple reads grabbing the same one. This can happen
for ring mapped buffers, since it hasn't necessarily been committed.
Linus Torvalds [Fri, 17 Jun 2022 04:39:51 +0000 (21:39 -0700)]
Merge tag 'drm-fixes-2022-06-17' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regular drm fixes for rc3. Nothing too serious, i915, amdgpu and
exynos all have a few small driver fixes, and two ttm fixes, and one
compiler warning.
i915:
- Fix page fault on error state read
- Fix memory leaks in per-gt sysfs
- Fix multiple fence handling
- Remove accidental static from a local variable
exynos:
- Check a null pointer instead of IS_ERR()
- Rework initialization code of Exynos MIC driver"
* tag 'drm-fixes-2022-06-17' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Cap OLED brightness per max frame-average luminance
drm/amdgpu: Fix GTT size reporting in amdgpu_ioctl
drm/exynos: mic: Rework initialization
drm/exynos: fix IS_ERR() vs NULL check in probe
drm/ttm: fix bulk move handling v2
drm/i915/uc: remove accidental static from a local variable
drm/i915: Individualize fences before adding to dma_resv obj
drm/i915/gt: Fix memory leaks in per-gt sysfs
drm/i915/reset: Fix error_state_read ptr + offset use
drm/ttm: fix missing NULL check in ttm_device_swapout
drm/atomic: fix warning of unused variable
Dave Airlie [Fri, 17 Jun 2022 00:24:42 +0000 (10:24 +1000)]
Merge tag 'drm-intel-fixes-2022-06-16' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.19-rc3:
- Fix page fault on error state read
- Fix memory leaks in per-gt sysfs
- Fix multiple fence handling
- Remove accidental static from a local variable
Mikulas Patocka [Thu, 16 Jun 2022 17:28:57 +0000 (13:28 -0400)]
dm mirror log: round up region bitmap size to BITS_PER_LONG
The code in dm-log rounds up bitset_size to 32 bits. It then uses
find_next_zero_bit_le on the allocated region. find_next_zero_bit_le
accesses the bitmap using unsigned long pointers. So, on 64-bit
architectures, it may access 4 bytes beyond the allocated size.
Fix this bug by rounding up bitset_size to BITS_PER_LONG.
This bug was found by running the lvm2 testsuite with kasan.
Mikulas Patocka [Thu, 16 Jun 2022 18:14:39 +0000 (14:14 -0400)]
dm: fix narrow race for REQ_NOWAIT bios being issued despite no support
Starting with the commit 63a225c9fd20, device mapper has an optimization
that it will take cheaper table lock (dm_get_live_table_fast instead of
dm_get_live_table) if the bio has REQ_NOWAIT. The bios with REQ_NOWAIT
must not block in the target request routine, if they did, we would be
blocking while holding rcu_read_lock, which is prohibited.
The targets that are suitable for REQ_NOWAIT optimization (and that don't
block in the map routine) have the flag DM_TARGET_NOWAIT set. Device
mapper will test if all the targets and all the devices in a table
support nowait (see the function dm_table_supports_nowait) and it will set
or clear the QUEUE_FLAG_NOWAIT flag on its request queue according to
this check.
There's a test in submit_bio_noacct: "if ((bio->bi_opf & REQ_NOWAIT) &&
!blk_queue_nowait(q)) goto not_supported" - this will make sure that
REQ_NOWAIT bios can't enter a request queue that doesn't support them.
This mechanism works to prevent REQ_NOWAIT bios from reaching dm targets
that don't support the REQ_NOWAIT flag (and that may block in the map
routine) - except that there is a small race condition:
submit_bio_noacct checks if the queue has the QUEUE_FLAG_NOWAIT without
holding any locks. Immediatelly after this check, the device mapper table
may be reloaded with a table that doesn't support REQ_NOWAIT (for example,
if we start moving the logical volume or if we activate a snapshot).
However the REQ_NOWAIT bio that already passed the check in
submit_bio_noacct would be sent to device mapper, where it could be
redirected to a dm target that doesn't support REQ_NOWAIT - the result is
sleeping while we hold rcu_read_lock.
In order to fix this race, we double-check if the target supports
REQ_NOWAIT while we hold the table lock (so that the table can't change
under us).
Fixes: 7cb885820065 ("dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Mikulas Patocka [Thu, 16 Jun 2022 17:21:27 +0000 (13:21 -0400)]
dm: fix use-after-free in dm_put_live_table_bio
dm_put_live_table_bio is called from the end of dm_submit_bio.
However, at this point, the bio may be already finished and the caller
may have freed the bio. Consequently, dm_put_live_table_bio accesses
the stale "bio" pointer.
Fix this bug by loading the bi_opf value and passing it to
dm_get_live_table_bio and dm_put_live_table_bio instead of the bio.
This bug was found by running the lvm2 testsuite with kasan.
Fixes: 7cb885820065 ("dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Dave Airlie [Thu, 16 Jun 2022 23:31:22 +0000 (09:31 +1000)]
Merge tag 'drm-misc-fixes-2022-06-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Two fixes for TTM, one for a NULL pointer dereference and one to make sure
the buffer is pinned prior to a bulk move, and a fix for a spurious
compiler warning.
Bart Van Assche [Mon, 13 Jun 2022 16:32:34 +0000 (09:32 -0700)]
block/bfq: Enable I/O statistics
BFQ uses io_start_time_ns. That member variable is only set if I/O
statistics are enabled. Hence this patch that enables I/O statistics
at the time BFQ is associated with a request queue.
Compile-tested only.
Reported-by: Cixi Geng <cixi.geng1@unisoc.com> Cc: Cixi Geng <cixi.geng1@unisoc.com> Cc: Yu Kuai <yukuai3@huawei.com> Cc: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Thu, 16 Jun 2022 22:53:38 +0000 (15:53 -0700)]
Merge tag 'audit-pr-20220616' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit fix from Paul Moore:
"A single audit patch to fix a problem where we were not properly
freeing memory allocated when recording information related to a
module load"
* tag 'audit-pr-20220616' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: free module name
Linus Torvalds [Thu, 16 Jun 2022 22:50:36 +0000 (15:50 -0700)]
Merge tag 'selinux-pr-20220616' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"A single SELinux patch to fix memory leaks when mounting filesystems
with SELinux mount options"
* tag 'selinux-pr-20220616' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: free contexts previously transferred in selinux_add_opt()
Heiko Stuebner [Thu, 26 May 2022 20:56:45 +0000 (22:56 +0200)]
riscv: fix dependency for t-head errata
alternatives only work correctly on non-xip-kernels and while the
selected alternative-symbol has the correct dependency the symbol
selecting it also needs that dependency.
So add the missing dependency to the T-Head errata Kconfig symbol.
Palmer Dabbelt [Thu, 16 Jun 2022 22:13:10 +0000 (15:13 -0700)]
Merge tag 'dt-fixes-for-palmer-5.19-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/conor/linux into fixes
Microchip RISC-V devicetree fixes for 5.19-rc3
A single fix for mpfs.dtsi:
- The sifive pdma entry fell through the cracks between versions of my
dt patches & I gave Zong the wrong conflict resolution, so it is
added back.
* tag 'dt-fixes-for-palmer-5.19-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/conor/linux:
riscv: dts: microchip: re-add pdma to mpfs device tree
Ming Lei [Thu, 16 Jun 2022 01:44:01 +0000 (09:44 +0800)]
blk-mq: don't clear flush_rq from tags->rqs[]
commit 3ff8e237de54 ("blk-mq: clearing flush request reference in
tags->rqs[]") is added to clear the to-be-free flush request from
tags->rqs[] for avoiding use-after-free on the flush rq.
Yu Kuai reported that blk_mq_clear_flush_rq_mapping() slows down boot time
by ~8s because running scsi probe which may create and remove lots of
unpresent LUNs on megaraid-sas which uses BLK_MQ_F_TAG_HCTX_SHARED and
each request queue has lots of hw queues.
Improve the situation by not running blk_mq_clear_flush_rq_mapping if
disk isn't added when there can't be any flush request issued.
Ming Lei [Thu, 16 Jun 2022 01:44:00 +0000 (09:44 +0800)]
blk-mq: avoid to touch q->elevator without any protection
q->elevator is referred in blk_mq_has_sqsched() without any protection,
no .q_usage_counter is held, no queue srcu and rcu read lock is held,
so potential use-after-free may be triggered.
Fix the issue by adding one queue flag for checking if the elevator
uses single queue style dispatch. Meantime the elevator feature flag
of ELEVATOR_F_MQ_AWARE isn't needed any more.
Ming Lei [Thu, 16 Jun 2022 01:43:59 +0000 (09:43 +0800)]
blk-mq: protect q->elevator by ->sysfs_lock in blk_mq_elv_switch_none
elevator can be tore down by sysfs switch interface or disk release, so
hold ->sysfs_lock before referring to q->elevator, then potential
use-after-free can be avoided.
Bart Van Assche [Wed, 15 Jun 2022 21:00:04 +0000 (14:00 -0700)]
block: Fix handling of offline queues in blk_mq_alloc_request_hctx()
This patch prevents that test nvme/004 triggers the following:
UBSAN: array-index-out-of-bounds in block/blk-mq.h:135:9
index 512 is out of range for type 'long unsigned int [512]'
Call Trace:
show_stack+0x52/0x58
dump_stack_lvl+0x49/0x5e
dump_stack+0x10/0x12
ubsan_epilogue+0x9/0x3b
__ubsan_handle_out_of_bounds.cold+0x44/0x49
blk_mq_alloc_request_hctx+0x304/0x310
__nvme_submit_sync_cmd+0x70/0x200 [nvme_core]
nvmf_connect_io_queue+0x23e/0x2a0 [nvme_fabrics]
nvme_loop_connect_io_queues+0x8d/0xb0 [nvme_loop]
nvme_loop_create_ctrl+0x58e/0x7d0 [nvme_loop]
nvmf_create_ctrl+0x1d7/0x4d0 [nvme_fabrics]
nvmf_dev_write+0xae/0x111 [nvme_fabrics]
vfs_write+0x144/0x560
ksys_write+0xb7/0x140
__x64_sys_write+0x42/0x50
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Fixes: e3edfdaaa77e ("blk-mq: simplify queue mapping & schedule with each possisble CPU") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220615210004.1031820-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Thu, 16 Jun 2022 18:51:32 +0000 (11:51 -0700)]
Merge tag 'net-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Mostly driver fixes.
Current release - regressions:
- Revert "net: Add a second bind table hashed by port and address",
needs more work
- amd-xgbe: use platform_irq_count(), static setup of IRQ resources
had been removed from DT core
- dts: at91: ksz9477_evb: add phy-mode to fix port/phy validation
Current release - new code bugs:
- hns3: modify the ring param print info
Previous releases - always broken:
- axienet: make the 64b addressable DMA depends on 64b architectures
- iavf: fix issue with MAC address of VF shown as zero
- ice: fix PTP TX timestamp offset calculation
- usb: ax88179_178a needs FLAG_SEND_ZLP
Misc:
- document some net.sctp.* sysctls"
* tag 'net-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (31 commits)
net: axienet: add missing error return code in axienet_probe()
Revert "net: Add a second bind table hashed by port and address"
net: ax25: Fix deadlock caused by skb_recv_datagram in ax25_recvmsg
net: usb: ax88179_178a needs FLAG_SEND_ZLP
MAINTAINERS: add include/dt-bindings/net to NETWORKING DRIVERS
ARM: dts: at91: ksz9477_evb: fix port/phy validation
net: bgmac: Fix an erroneous kfree() in bgmac_remove()
ice: Fix memory corruption in VF driver
ice: Fix queue config fail handling
ice: Sync VLAN filtering features for DVM
ice: Fix PTP TX timestamp offset calculation
mlxsw: spectrum_cnt: Reorder counter pools
docs: networking: phy: Fix a typo
amd-xgbe: Use platform_irq_count()
octeontx2-vf: Add support for adaptive interrupt coalescing
xilinx: Fix build on x86.
net: axienet: Use iowrite64 to write all 64b descriptor pointers
net: axienet: make the 64b addresable DMA depends on 64b archectures
net: hns3: fix tm port shapping of fibre port is incorrect after driver initialization
net: hns3: fix PF rss size initialization bug
...
Yang Yingliang [Thu, 16 Jun 2022 06:29:17 +0000 (14:29 +0800)]
net: axienet: add missing error return code in axienet_probe()
It should return error code in error path in axienet_probe().
Fixes: 0909279f6be4 ("net: axienet: make the 64b addresable DMA depends on 64b archectures") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220616062917.3601-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Joanne Koong [Wed, 15 Jun 2022 19:32:13 +0000 (12:32 -0700)]
Revert "net: Add a second bind table hashed by port and address"
This reverts:
commit 9da23f76f146 ("net: Add a second bind table hashed by port and address")
commit bcfa16c3ccf7 ("selftests: Add test for timing a bind request to a port with a populated bhash entry") Link: https://lore.kernel.org/netdev/20220520001834.2247810-1-kuba@kernel.org/
There are a few things that need to be fixed here:
* Updating bhash2 in cases where the socket's rcv saddr changes
* Adding bhash2 hashbucket locks
Fixes: 9da23f76f146 ("net: Add a second bind table hashed by port and address") Reported-by: syzbot+015d756bbd1f8b5c8f09@syzkaller.appspotmail.com Reported-by: syzbot+98fd2d1422063b0f8c44@syzkaller.appspotmail.com Reported-by: syzbot+0a847a982613c6438fba@syzkaller.appspotmail.com Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Link: https://lore.kernel.org/r/20220615193213.2419568-1-joannelkoong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mark Brown [Wed, 15 Jun 2022 19:15:04 +0000 (20:15 +0100)]
arm64/cpufeature: Unexport set_cpu_feature()
We currently export set_cpu_feature() to modules but there are no in tree
users that can be built as modules and it is hard to see cases where it
would make sense for there to be any such users. Remove the export to avoid
anyone else having to worry about why it is there and ensure that any users
that do get added get a bit more visiblity.
Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220615191504.626604-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Dylan Yudaken [Thu, 16 Jun 2022 13:50:11 +0000 (06:50 -0700)]
io_uring: do not use prio task_work_add in uring_cmd
io_req_task_prio_work_add has a strict assumption that it will only be
used with io_req_task_complete. There is a codepath that assumes this is
the case and will not even call the completion function if it is hit.
For uring_cmd with an arbitrary completion function change the call to the
correct non-priority version.
For recv/recvmsg, IO either completes immediately or gets queued for a
retry. This isn't the case for read/readv, if eg a normal file or a block
device is used. Here, an operation can get queued with the block layer.
If this happens, ring mapped buffers must get committed immediately to
avoid that the next read can consume the same buffer.
Check if we're dealing with pollable file, when getting a new ring mapped
provided buffer. If it's not, commit it immediately rather than wait post
issue. If we don't wait, we can race with completions coming in, or just
plain buffer reuse by committing after a retry where others could have
grabbed the same buffer.
Fixes: e790404d1245 ("io_uring: add support for ring mapped supplied buffers") Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jan Kara [Wed, 15 Jun 2022 13:22:29 +0000 (15:22 +0200)]
init: Initialize noop_backing_dev_info early
noop_backing_dev_info is used by superblocks of various
pseudofilesystems such as kdevtmpfs. After commit 2eacea2ab2b2
("writeback: Fix inode->i_io_list not be protected by inode->i_lock
error") this broke because __mark_inode_dirty() started to access more
fields from noop_backing_dev_info and this led to crashes inside
locked_inode_to_wb_and_lock_list() called from __mark_inode_dirty().
Fix the problem by initializing noop_backing_dev_info before the
filesystems get mounted.
Fixes: 2eacea2ab2b2 ("writeback: Fix inode->i_io_list not be protected by inode->i_lock error") Reported-and-tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reported-and-tested-by: Alexandru Elisei <alexandru.elisei@arm.com> Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
Ye Bin [Wed, 15 Jun 2022 09:00:10 +0000 (17:00 +0800)]
ext2: fix fs corruption when trying to remove a non-empty directory with IO error
We got issue as follows:
[home]# mount /dev/sdd test
[home]# cd test
[test]# ls
dir1 lost+found
[test]# rmdir dir1
ext2_empty_dir: inject fault
[test]# ls
lost+found
[test]# cd ..
[home]# umount test
[home]# fsck.ext2 -fn /dev/sdd
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Inode 4065, i_size is 0, should be 1024. Fix? no
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Unconnected directory inode 4065 (/???)
Connect to /lost+found? no
'..' in ... (4065) is / (2), should be <The NULL inode> (0).
Fix? no
Pass 4: Checking reference counts
Inode 2 ref count is 3, should be 4. Fix? no
Inode 4065 ref count is 2, should be 3. Fix? no
Pass 5: Checking group summary information
/dev/sdd: ********** WARNING: Filesystem still has errors **********
Cc: stable@vger.kernel.org Fixes: 52e24495d7ae ("selinux: parse contexts for mount options early") Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
Cc: stable@vger.kernel.org Fixes: 79ed1a2d5572 ("audit: prepare audit_context for use in calling contexts beyond syscalls") Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
Linus Torvalds [Wed, 15 Jun 2022 21:20:26 +0000 (14:20 -0700)]
Merge tag 'hardening-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:
- Correctly handle vm_map areas in hardened usercopy (Matthew Wilcox)
- Adjust CFI RCU usage to avoid boot splats with cpuidle (Sami Tolvanen)
* tag 'hardening-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
usercopy: Make usercopy resilient against ridiculously large copies
usercopy: Cast pointer to an integer once
usercopy: Handle vm_map_ram() areas
cfi: Fix __cfi_slowpath_diag RCU usage with cpuidle
Petr Mladek [Wed, 15 Jun 2022 16:28:05 +0000 (18:28 +0200)]
printk: Wait for the global console lock when the system is going down
There are reports that the console kthreads block the global console
lock when the system is going down, for example, reboot, panic.
First part of the solution was to block kthreads in these problematic
system states so they stopped handling newly added messages.
Second part of the solution is to wait when for the kthreads when
they are actively printing. It solves the problem when a message
was printed before the system entered the problematic state and
the kthreads managed to step in.
A busy waiting has to be used because panic() can be called in any
context and in an unknown state of the scheduler.
There must be a timeout because the kthread might get stuck or sleeping
and never release the lock. The timeout 10s is an arbitrary value
inspired by the softlockup timeout.
Petr Mladek [Wed, 15 Jun 2022 16:28:04 +0000 (18:28 +0200)]
printk: Block console kthreads when direct printing will be required
There are known situations when the console kthreads are not
reliable or does not work in principle, for example, early boot,
panic, shutdown.
For these situations there is the direct (legacy) mode when printk() tries
to get console_lock() and flush the messages directly. It works very well
during the early boot when the console kthreads are not available at all.
It gets more complicated in the other situations when console kthreads
might be actively printing and block console_trylock() in printk().
The same problem is in the legacy code as well. Any console_lock()
owner could block console_trylock() in printk(). It is solved by
a trick that the current console_lock() owner is responsible for
printing all pending messages. It is actually the reason why there
is the risk of softlockups and why the console kthreads were
introduced.
The console kthreads use the same approach. They are responsible
for printing the messages by definition. So that they handle
the messages anytime when they are awake and see new ones.
The global console_lock is available when there is nothing
to do.
It should work well when the problematic context is correctly
detected and printk() switches to the direct mode. But it seems
that it is not enough in practice. There are reports that
the messages are not printed during panic() or shutdown()
even though printk() tries to use the direct mode here.
The problem seems to be that console kthreads become active in these
situation as well. They steel the job before other CPUs are stopped.
Then they are stopped in the middle of the job and block the global
console_lock.
First part of the solution is to block console kthreads when
the system is in a problematic state and requires the direct
printk() mode.
Linus Torvalds [Wed, 15 Jun 2022 19:34:19 +0000 (12:34 -0700)]
Merge tag 'tpmdd-next-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm fixes from Jarkko Sakkinen:
"Two fixes for this merge window"
* tag 'tpmdd-next-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
certs: fix and refactor CONFIG_SYSTEM_BLACKLIST_HASH_LIST build
certs/blacklist_hashes.c: fix const confusion in certs blacklist
Tianyu Lan [Tue, 14 Jun 2022 01:45:53 +0000 (21:45 -0400)]
x86/Hyper-V: Add SEV negotiate protocol support in Isolation VM
Hyper-V Isolation VM current code uses sev_es_ghcb_hv_call()
to read/write MSR via GHCB page and depends on the sev code.
This may cause regression when sev code changes interface
design.
The latest SEV-ES code requires to negotiate GHCB version before
reading/writing MSR via GHCB page and sev_es_ghcb_hv_call() doesn't
work for Hyper-V Isolation VM. Add Hyper-V ghcb related implementation
to decouple SEV and Hyper-V code. Negotiate GHCB version in the
hyperv_init() and use the version to communicate with Hyper-V
in the ghcb hv call function.
Fixes: 8b389c85889c ("x86/sev: Save the negotiated GHCB version") Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220614014553.1915929-1-ltykernel@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
Jens Axboe [Wed, 15 Jun 2022 17:56:07 +0000 (11:56 -0600)]
Merge branch 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-5.19
Pull MD fixes from Song.
* 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md/raid5-ppl: Fix argument order in bio_alloc_bioset()
Revert "md: don't unregister sync_thread with reconfig_mutex held"
Logan Gunthorpe [Wed, 8 Jun 2022 16:27:46 +0000 (10:27 -0600)]
md/raid5-ppl: Fix argument order in bio_alloc_bioset()
bio_alloc_bioset() takes a block device, number of vectors, the
OP flags, the GFP mask and the bio set. However when the prototype
was changed, the callisite in ppl_do_flush() had the OP flags and
the GFP flags reversed. This introduced some sparse error:
drivers/md/raid5-ppl.c:632:57: warning: incorrect type in argument 3
(different base types)
drivers/md/raid5-ppl.c:632:57: expected unsigned int opf
drivers/md/raid5-ppl.c:632:57: got restricted gfp_t [usertype]
drivers/md/raid5-ppl.c:633:61: warning: incorrect type in argument 4
(different base types)
drivers/md/raid5-ppl.c:633:61: expected restricted gfp_t [usertype]
gfp_mask
drivers/md/raid5-ppl.c:633:61: got unsigned long long
The sparse error introduction may not have been reported correctly by
0day due to other work that was cleaning up other sparse errors in this
area.
Fixes: cffdbddd0da9 ("block: pass a block_device and opf to bio_alloc_bioset") Cc: stable@vger.kernel.org # 5.18+ Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org>
then mddev->reshape_position is set to MaxSector in raid5_finish_reshape
since MD_RECOVERY_INTR is cleared in md_check_recovery, which means
feature_map is not set with MD_FEATURE_RESHAPE_ACTIVE and superblock's
reshape_position can't be updated accordingly.
Fixes: 824237bd82758 ("md: don't unregister sync_thread with reconfig_mutex held") Reported-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: Song Liu <song@kernel.org>
Linus Torvalds [Wed, 15 Jun 2022 16:04:55 +0000 (09:04 -0700)]
Merge tag 'fs.fixes.v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull vfs idmapping fix from Christian Brauner:
"This fixes an issue where we fail to change the group of a file when
the caller owns the file and is a member of the group to change to.
This is only relevant on idmapped mounts.
There's a detailed description in the commit message and regression
tests have been added to xfstests"
* tag 'fs.fixes.v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
fs: account for group membership
After commit 4e3c2b86efb7c ("dm: switch dm_io booleans over to proper
flags") dm_start_io_acct stopped atomically checking and setting
was_accounted, which turned into the DM_IO_ACCOUNTED flag. This opened
the possibility for a race where IO accounting is started twice for
duplicate bios. To remove the race, check the flag while holding the
io->lock.
Fixes: 4e3c2b86efb7c ("dm: switch dm_io booleans over to proper flags") Cc: stable@vger.kernel.org Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Jens Axboe [Wed, 15 Jun 2022 15:39:05 +0000 (09:39 -0600)]
Merge tag 'nvme-5.19-2022-06-15' of git://git.infradead.org/nvme into block-5.19
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 5.19
- quirks, quirks, quirks to work around buggy consumer grade devices
(Keith Bush, Ning Wang, Stefan Reiter, Rasheed Hsueh)
- better kernel messages for devices that need quirking (Keith Bush)
- make a kernel message more useful (Thomas Weißschuh)"
* tag 'nvme-5.19-2022-06-15' of git://git.infradead.org/nvme:
nvme-pci: disable write zeros support on UMIC and Samsung SSDs
nvme-pci: avoid the deepest sleep state on ZHITAI TiPro7000 SSDs
nvme-pci: sk hynix p31 has bogus namespace ids
nvme-pci: smi has bogus namespace ids
nvme-pci: phison e12 has bogus namespace ids
nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S50
nvme-pci: add trouble shooting steps for timeouts
nvme: add bug report info for global duplicate id
nvme: add device name to warning in uuid_show()
Mark Rutland [Tue, 14 Jun 2022 08:09:43 +0000 (09:09 +0100)]
arm64: ftrace: consistently handle PLTs.
Sometimes it is necessary to use a PLT entry to call an ftrace
trampoline. This is handled by ftrace_make_call() and ftrace_make_nop(),
with each having *almost* identical logic, but this is not handled by
ftrace_modify_call() since its introduction in commit:
Due to this, if we ever were to call ftrace_modify_call() for a callsite
which requires a PLT entry for a trampoline, then either:
a) If the old addr requires a trampoline, ftrace_modify_call() will use
an out-of-range address to generate the 'old' branch instruction.
This will result in warnings from aarch64_insn_gen_branch_imm() and
ftrace_modify_code(), and no instructions will be modified. As
ftrace_modify_call() will return an error, this will result in
subsequent internal ftrace errors.
b) If the old addr does not require a trampoline, but the new addr does,
ftrace_modify_call() will use an out-of-range address to generate the
'new' branch instruction. This will result in warnings from
aarch64_insn_gen_branch_imm(), and ftrace_modify_code() will replace
the 'old' branch with a BRK. This will result in a kernel panic when
this BRK is later executed.
Practically speaking, case (a) is vastly more likely than case (b), and
typically this will result in internal ftrace errors that don't
necessarily affect the rest of the system. This can be demonstrated with
an out-of-tree test module which triggers ftrace_modify_call(), e.g.
We can solve this by consistently determining whether to use a PLT entry
for an address.
Note that since (the earlier) commit:
39a2ca3b674e5e62 ("arm64: module/ftrace: intialize PLT at load time")
... we can consistently determine the PLT address that a given callsite
will use, and therefore ftrace_make_nop() does not need to skip
validation when a PLT is in use.
This patch factors the existing logic out of ftrace_make_call() and
ftrace_make_nop() into a common ftrace_find_callable_addr() helper
function, which is used by ftrace_make_call(), ftrace_make_nop(), and
ftrace_modify_call(). In ftrace_make_nop() the patching is consistently
validated by ftrace_modify_code() as we can always determine what the
old instruction should have been.
Fixes: 193e43bdc775 ("arm64: implement ftrace with regs") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Will Deacon <will@kernel.org> Tested-by: "Ivan T. Ivanov" <iivanov@suse.de> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220614080944.1349146-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Mark Rutland [Tue, 14 Jun 2022 08:09:42 +0000 (09:09 +0100)]
arm64: ftrace: fix branch range checks
The branch range checks in ftrace_make_call() and ftrace_make_nop() are
incorrect, erroneously permitting a forwards branch of 128M and
erroneously rejecting a backwards branch of 128M.
This is because both functions calculate the offset backwards,
calculating the offset *from* the target *to* the branch, rather than
the other way around as the later comparisons expect.
If an out-of-range branch were erroeously permitted, this would later be
rejected by aarch64_insn_gen_branch_imm() as branch_imm_common() checks
the bounds correctly, resulting in warnings and the placement of a BRK
instruction. Note that this can only happen for a forwards branch of
exactly 128M, and so the caller would need to be exactly 128M bytes
below the relevant ftrace trampoline.
If an in-range branch were erroeously rejected, then:
* For modules when CONFIG_ARM64_MODULE_PLTS=y, this would result in the
use of a PLT entry, which is benign.
Note that this is the common case, as this is selected by
CONFIG_RANDOMIZE_BASE (and therefore RANDOMIZE_MODULE_REGION_FULL),
which distributions typically seelct. This is also selected by
CONFIG_ARM64_ERRATUM_843419.
* For modules when CONFIG_ARM64_MODULE_PLTS=n, this would result in
internal ftrace failures.
* For core kernel text, this would result in internal ftrace failues.
Note that for this to happen, the kernel text would need to be at
least 128M bytes in size, and typical configurations are smaller tha
this.
Fix this by calculating the offset *from* the branch *to* the target in
both functions.
Fixes: aa77dbc4da98 ("arm64: ftrace: don't validate branch via PLT in ftrace_make_nop()") Fixes: 3b5ecc7bd9fd ("arm64: ftrace: add support for far branches to dynamic ftrace") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Will Deacon <will@kernel.org> Tested-by: "Ivan T. Ivanov" <iivanov@suse.de> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220614080944.1349146-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The reverted patch was needed as a fix after commit 8f20d58ba28b
("random: use static branch for crng_ready()"). However, this was
already fixed by 2453a6eed43e ("random: do not use jump labels before
they are initialized") and hence no longer necessary to initialise jump
labels before setup_machine_fdt().
Duoming Zhou [Tue, 14 Jun 2022 09:25:57 +0000 (17:25 +0800)]
net: ax25: Fix deadlock caused by skb_recv_datagram in ax25_recvmsg
The skb_recv_datagram() in ax25_recvmsg() will hold lock_sock
and block until it receives a packet from the remote. If the client
doesn`t connect to server and calls read() directly, it will not
receive any packets forever. As a result, the deadlock will happen.
This patch replaces skb_recv_datagram() with an open-coded variant of it
releasing the socket lock before the __skb_wait_for_more_packets() call
and re-acquiring it after such call in order that other functions that
need socket lock could be executed.
what's more, the socket lock will be released only when recvmsg() will
block and that should produce nicer overall behavior.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Suggested-by: Thomas Osterried <thomas@osterried.de> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reported-by: Thomas Habets <thomas@@habets.se> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Begunkov [Wed, 15 Jun 2022 10:23:05 +0000 (11:23 +0100)]
io_uring: fix ->extra{1,2} misuse
We don't really know the state of req->extra{1,2] fields in
__io_fill_cqe_req(), if an opcode handler is not aware of CQE32 option,
it never sets them up properly. Track the state of those fields with a
request flag.
Pavel Begunkov [Wed, 15 Jun 2022 10:23:04 +0000 (11:23 +0100)]
io_uring: fill extra big cqe fields from req
The only user of io_req_complete32()-like functions is cmd
requests. Instead of keeping the whole complete32 family, remove them
and provide the extras in already added for inline completions
req->extra{1,2}. When fill_cqe_res() finds CQE32 option enabled
it'll use those fields to fill a 32B cqe.
Pavel Begunkov [Wed, 15 Jun 2022 10:23:03 +0000 (11:23 +0100)]
io_uring: unite fill_cqe and the 32B version
We want just one function that will handle both normal cqes and 32B
cqes. Combine __io_fill_cqe_req() and __io_fill_cqe_req32(). It's still
not entirely correct yet, but saves us from cases when we fill an CQE of
a wrong size.
Pavel Begunkov [Wed, 15 Jun 2022 10:23:02 +0000 (11:23 +0100)]
io_uring: get rid of __io_fill_cqe{32}_req()
There are too many cqe filling helpers, kill __io_fill_cqe{32}_req(),
use __io_fill_cqe{32}_req_filled() instead, and then rename it. It'll
simplify fixing in following patches.
Problems observed:
======================================================================
1) Using ssh/sshfs. The remote sshd daemon can abort with the message:
"message authentication code incorrect"
This happens because the tcp message sent is corrupted during the
USB "Bulk out". The device calculate the tcp checksum and send a
valid tcp message to the remote sshd. Then the encryption detects
the error and aborts.
2) NETDEV WATCHDOG: ... (ax88179_178a): transmit queue 0 timed out
3) Stop normal work without any log message.
The "Bulk in" continue receiving packets normally.
The host sends "Bulk out" and the device responds with -ECONNRESET.
(The netusb.c code tx_complete ignore -ECONNRESET)
Under normal conditions these errors take days to happen and in
intense usage take hours.
A test with ping gives packet loss, showing that something is wrong:
ping -4 -s 462 {destination} # 462 = 512 - 42 - 8
Not all packets fail.
My guess is that the device tries to find another packet starting
at the extra byte and will fail or not depending on the next
bytes (old buffer content).
======================================================================
Signed-off-by: Jose Alonso <joalonsof@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 15 Jun 2022 08:15:33 +0000 (09:15 +0100)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-06-14
This series contains updates to ice driver only.
Michal fixes incorrect Tx timestamp offset calculation for E822 devices.
Roman enforces required VLAN filtering settings for double VLAN mode.
Przemyslaw fixes memory corruption issues with VFs by ensuring
queues are disabled in the error path of VF queue configuration and to
disabled VFs during reset.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 14 Jun 2022 17:36:11 +0000 (10:36 -0700)]
netfs: fix up netfs_inode_init() docbook comment
Commit e8e91508b493 ("netfs: Further cleanups after struct netfs_inode
wrapper introduced") changed the argument types and names, and actually
updated the comment too (although that was thanks to David Howells, not
me: my original patch only changed the code).
But the comment fixup didn't go quite far enough, and didn't change the
argument name in the comment, resulting in
include/linux/netfs.h:314: warning: Function parameter or member 'ctx' not described in 'netfs_inode_init'
include/linux/netfs.h:314: warning: Excess function parameter 'inode' description in 'netfs_inode_init'
during htmldoc generation.
Fixes: e8e91508b493 ("netfs: Further cleanups after struct netfs_inode wrapper introduced") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roman Li [Thu, 19 May 2022 18:41:16 +0000 (14:41 -0400)]
drm/amd/display: Cap OLED brightness per max frame-average luminance
[Why]
For OLED eDP the Display Manager uses max_cll value as a limit
for brightness control.
max_cll defines the content light luminance for individual pixel.
Whereas max_fall defines frame-average level luminance.
The user may not observe the difference in brightness in between
max_fall and max_cll.
That negatively impacts the user experience.
[How]
Use max_fall value instead of max_cll as a limit for brightness control.
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Even though IORING_CLOSE_FD_AND_FILE_SLOT might save cycles for some
users, but it tries to do two things at a time and it's not clear how to
handle errors and what to return in a single result field when one part
fails and another completes well. Kill it for now.
CQE32 nops were used for debugging and benchmarking but it doesn't
target any real use case. Revert it, we can return it back if someone
finds a good way to use it.
Fixes: 823dac566650 ("ice: Check if VF is disabled for Opcode and other operations") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Co-developed-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Disable VF's RX/TX queues, when VIRTCHNL_OP_CONFIG_VSI_QUEUES fail.
Not disabling them might lead to scenario, where PF driver leaves VF
queues enabled, when VF's VSI failed queue config.
In this scenario VF should not have RX/TX queues enabled. If PF failed
to set up VF's queues, VF will reset due to TX timeouts in VF driver.
Initialize iterator 'i' to -1, so if error happens prior to configuring
queues then error path code will not disable queue 0. Loop that
configures queues will is using same iterator, so error path code will
only disable queues that were configured.
Fixes: 519d5f19ed82 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap") Suggested-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
VLAN filtering features, that is C-Tag and S-Tag, in DVM mode must be
both enabled or disabled.
In case of turning off/on only one of the features, another feature must
be turned off/on automatically with issuing an appropriate message to
the kernel log.
Fixes: 65f13e0c2a7b ("ice: Advertise 802.1ad VLAN filtering and offloads for PF netdev") Signed-off-by: Roman Storozhenko <roman.storozhenko@intel.com> Co-developed-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Michalik [Tue, 10 May 2022 11:03:43 +0000 (13:03 +0200)]
ice: Fix PTP TX timestamp offset calculation
The offset was being incorrectly calculated for E822 - that led to
collisions in choosing TX timestamp register location when more than
one port was trying to use timestamping mechanism.
In E822 one quad is being logically split between ports, so quad 0 is
having trackers for ports 0-3, quad 1 ports 4-7 etc. Each port should
have separate memory location for tracking timestamps. Due to error for
example ports 1 and 2 had been assigned to quad 0 with same offset (0),
while port 1 should have offset 0 and 1 offset 16.
Fix it by correctly calculating quad offset.
Fixes: ae5cb05910db ("ice: implement basic E822 PTP support") Signed-off-by: Michal Michalik <michal.michalik@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Linus Torvalds [Tue, 14 Jun 2022 14:57:18 +0000 (07:57 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"While last week's pull request contained miscellaneous fixes for x86,
this one covers other architectures, selftests changes, and a bigger
series for APIC virtualization bugs that were discovered during 5.20
development. The idea is to base 5.20 development for KVM on top of
this tag.
ARM64:
- Properly reset the SVE/SME flags on vcpu load
- Fix a vgic-v2 regression regarding accessing the pending state of a
HW interrupt from userspace (and make the code common with vgic-v3)
- Fix access to the idreg range for protected guests
- Ignore 'kvm-arm.mode=protected' when using VHE
- Return an error from kvm_arch_init_vm() on allocation failure
- A bunch of small cleanups (comments, annotations, indentation)
RISC-V:
- Typo fix in arch/riscv/kvm/vmid.c
- Remove broken reference pattern from MAINTAINERS entry
x86-64:
- Fix error in page tables with MKTME enabled
- Dirty page tracking performance test extended to running a nested
guest
- Disable APICv/AVIC in cases that it cannot implement correctly"
[ This merge also fixes a misplaced end parenthesis bug introduced in
commit 36e91d4510ec ("KVM: x86: inhibit APICv/AVIC on changes to APIC
ID or APIC base") pointed out by Sean Christopherson ]
Link: https://lore.kernel.org/all/20220610191813.371682-1-seanjc@google.com/
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (34 commits)
KVM: selftests: Restrict test region to 48-bit physical addresses when using nested
KVM: selftests: Add option to run dirty_log_perf_test vCPUs in L2
KVM: selftests: Clean up LIBKVM files in Makefile
KVM: selftests: Link selftests directly with lib object files
KVM: selftests: Drop unnecessary rule for STATIC_LIBS
KVM: selftests: Add a helper to check EPT/VPID capabilities
KVM: selftests: Move VMX_EPT_VPID_CAP_AD_BITS to vmx.h
KVM: selftests: Refactor nested_map() to specify target level
KVM: selftests: Drop stale function parameter comment for nested_map()
KVM: selftests: Add option to create 2M and 1G EPT mappings
KVM: selftests: Replace x86_page_size with PG_LEVEL_XX
KVM: x86: SVM: fix nested PAUSE filtering when L0 intercepts PAUSE
KVM: x86: SVM: drop preempt-safe wrappers for avic_vcpu_load/put
KVM: x86: disable preemption around the call to kvm_arch_vcpu_{un|}blocking
KVM: x86: disable preemption while updating apicv inhibition
KVM: x86: SVM: fix avic_kick_target_vcpus_fast
KVM: x86: SVM: remove avic's broken code that updated APIC ID
KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base
KVM: x86: document AVIC/APICv inhibit reasons
KVM: x86/mmu: Set memory encryption "value", not "mask", in shadow PDPTRs
...
Linus Torvalds [Tue, 14 Jun 2022 14:43:15 +0000 (07:43 -0700)]
Merge tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 MMIO stale data fixes from Thomas Gleixner:
"Yet another hw vulnerability with a software mitigation: Processor
MMIO Stale Data.
They are a class of MMIO-related weaknesses which can expose stale
data by propagating it into core fill buffers. Data which can then be
leaked using the usual speculative execution methods.
Mitigations include this set along with microcode updates and are
similar to MDS and TAA vulnerabilities: VERW now clears those buffers
too"
* tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/speculation/mmio: Print SMT warning
KVM: x86/speculation: Disable Fill buffer clear within guests
x86/speculation/mmio: Reuse SRBDS mitigation for SBDS
x86/speculation/srbds: Update SRBDS mitigation selection
x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data
x86/speculation/mmio: Enable CPU Fill buffer clearing on idle
x86/bugs: Group MDS, TAA & Processor MMIO Stale Data mitigations
x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data
x86/speculation: Add a common function for MD_CLEAR mitigation update
x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug
Documentation: Add documentation for Processor MMIO Stale Data
Petr Machata [Mon, 13 Jun 2022 12:50:17 +0000 (15:50 +0300)]
mlxsw: spectrum_cnt: Reorder counter pools
Both RIF and ACL flow counters use a 24-bit SW-managed counter address to
communicate which counter they want to bind.
In a number of Spectrum FW releases, binding a RIF counter is broken and
slices the counter index to 16 bits. As a result, on Spectrum-2 and above,
no more than about 410 RIF counters can be effectively used. This
translates to 205 netdevices for which L3 HW stats can be enabled. (This
does not happen on Spectrum-1, because there are fewer counters available
overall and the counter index never exceeds 16 bits.)
Binding counters to ACLs does not have this issue. Therefore reorder the
counter allocation scheme so that RIF counters come first and therefore get
lower indices that are below the 16-bit barrier.
Marek Szyprowski [Fri, 13 May 2022 08:31:05 +0000 (10:31 +0200)]
drm/exynos: mic: Rework initialization
Commit 91e812291b87 ("exynos: drm: dsi: Attach in_bridge in MIC driver")
moved Exynos MIC attaching from DSI to MIC driver. However the method
proposed there is incomplete and cannot really work. To properly attach
it to the bridge chain, access to the respective encoder is needed. The
Exynos MIC driver always attaches to the encoder created by the Exynos
DSI driver, so grab it via available helpers for getting access to the
CRTC and encoders. This also requires to change the order of driver
component binding to let DSI to be bound before MIC.
Fixes: 91e812291b87 ("exynos: drm: dsi: Attach in_bridge in MIC driver") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixed merge conflict. Signed-off-by: Inki Dae <inki.dae@samsung.com>
When calling setattr_prepare() to determine the validity of the
attributes the ia_{g,u}id fields contain the value that will be written
to inode->i_{g,u}id. This is exactly the same for idmapped and
non-idmapped mounts and allows callers to pass in the values they want
to see written to inode->i_{g,u}id.
When group ownership is changed a caller whose fsuid owns the inode can
change the group of the inode to any group they are a member of. When
searching through the caller's groups we need to use the gid mapped
according to the idmapped mount otherwise we will fail to change
ownership for unprivileged users.
Consider a caller running with fsuid and fsgid 1000 using an idmapped
mount that maps id 65534 to 1000 and 65535 to 1001. Consequently, a file
owned by 65534:65535 in the filesystem will be owned by 1000:1001 in the
idmapped mount.
The caller now requests the gid of the file to be changed to 1000 going
through the idmapped mount. In the vfs we will immediately map the
requested gid to the value that will need to be written to inode->i_gid
and place it in attr->ia_gid. Since this idmapped mount maps 65534 to
1000 we place 65534 in attr->ia_gid.
When we check whether the caller is allowed to change group ownership we
first validate that their fsuid matches the inode's uid. The
inode->i_uid is 65534 which is mapped to uid 1000 in the idmapped mount.
Since the caller's fsuid is 1000 we pass the check.
We now check whether the caller is allowed to change inode->i_gid to the
requested gid by calling in_group_p(). This will compare the passed in
gid to the caller's fsgid and search the caller's additional groups.
Since we're dealing with an idmapped mount we need to pass in the gid
mapped according to the idmapped mount. This is akin to checking whether
a caller is privileged over the future group the inode is owned by. And
that needs to take the idmapped mount into account. Note, all helpers
are nops without idmapped mounts.