]> git.baikalelectronics.ru Git - kernel.git/log
kernel.git
3 years agomtd: rawnand: marvell: Use correct logic for nand-keep-config
Tony O'Brien [Tue, 27 Sep 2022 02:47:28 +0000 (15:47 +1300)]
mtd: rawnand: marvell: Use correct logic for nand-keep-config

commit 51c7f684baf1fcf71ddf0cb61c1398bf8066f589 upstream.

Originally the absence of the marvell,nand-keep-config property caused
the setup_data_interface function to be provided. However when
setup_data_interface was moved into nand_controller_ops the logic was
unintentionally inverted. Update the logic so that only if the
marvell,nand-keep-config property is present the bootloader NAND config
kept.

Cc: stable@vger.kernel.org
Fixes: c2feabe3b9e5 ("mtd: rawnand: Move ->setup_data_interface() to nand_controller_ops")
Signed-off-by: Tony O'Brien <tony.obrien@alliedtelesis.co.nz>
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220927024728.28447-1-chris.packham@alliedtelesis.co.nz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: xhci: add XHCI_SPURIOUS_SUCCESS to ASM1042 despite being a V0.96 controller
Jens Glathe [Mon, 24 Oct 2022 14:27:17 +0000 (17:27 +0300)]
usb: xhci: add XHCI_SPURIOUS_SUCCESS to ASM1042 despite being a V0.96 controller

commit 07f48fe1b9fc3594470bfe0e9fb0d987ce89b1f1 upstream.

This appears to fix the error:
"xhci_hcd <address>; ERROR Transfer event TRB DMA ptr not part of
current TD ep_index 2 comp_code 13" that appear spuriously (or pretty
often) when using a r8152 USB3 ethernet adapter with integrated hub.

ASM1042 reports as a 0.96 controller, but appears to behave more like 1.0

Inspired by this email thread: https://markmail.org/thread/7vzqbe7t6du6qsw3

Cc: stable@vger.kernel.org
Signed-off-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20221024142720.4122053-2-mathias.nyman@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: bdc: change state when port disconnected
Justin Chen [Wed, 5 Oct 2022 19:13:55 +0000 (12:13 -0700)]
usb: bdc: change state when port disconnected

commit c6220e1480855e0a3bf0058335caabc1160c438e upstream.

When port is connected and then disconnected, the state stays as
configured. Which is incorrect as the port is no longer configured,
but in a not attached state.

Signed-off-by: Justin Chen <justinpopo6@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Fixes: 2cdfaf7a0925 ("usb: gadget: Add UDC driver for Broadcom USB3.0 device controller IP BDC")
Cc: stable <stable@kernel.org>
Link: https://lore.kernel.org/r/1664997235-18198-1-git-send-email-justinpopo6@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: dwc3: gadget: Don't set IMI for no_interrupt
Thinh Nguyen [Tue, 25 Oct 2022 22:10:20 +0000 (15:10 -0700)]
usb: dwc3: gadget: Don't set IMI for no_interrupt

commit c3811905e6edbd158c1fd59a47a86712d964f148 upstream.

The gadget driver may have a certain expectation of how the request
completion flow should be from to its configuration. Make sure the
controller driver respect that. That is, don't set IMI (Interrupt on
Missed Isoc) when usb_request->no_interrupt is set. Also, the driver
should only set IMI to the last TRB of a chain.

Fixes: 1e90ce98e30b ("usb: Introduce DesignWare USB3 DRD Driver")
Cc: stable@vger.kernel.org
Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Reviewed-by: Jeff Vanhoof <jdv1029@gmail.com>
Tested-by: Jeff Vanhoof <jdv1029@gmail.com>
Link: https://lore.kernel.org/r/ced336c84434571340c07994e3667a0ee284fefe.1666735451.git.Thinh.Nguyen@synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: dwc3: gadget: Stop processing more requests on IMI
Thinh Nguyen [Tue, 25 Oct 2022 22:10:14 +0000 (15:10 -0700)]
usb: dwc3: gadget: Stop processing more requests on IMI

commit ddad65a2f68f099c1ccd6eb805130e276fd13b76 upstream.

When servicing a transfer completion event, the dwc3 driver will reclaim
TRBs of started requests up to the request associated with the interrupt
event. Currently we don't check for interrupt due to missed isoc, and
the driver may attempt to reclaim TRBs beyond the associated event. This
causes invalid memory access when the hardware still owns the TRB. If
there's a missed isoc TRB with IMI (interrupt on missed isoc), make sure
to stop servicing further.

Note that only the last TRB of chained TRBs has its status updated with
missed isoc.

Fixes: 1e90ce98e30b ("usb: Introduce DesignWare USB3 DRD Driver")
Cc: stable@vger.kernel.org
Reported-by: Jeff Vanhoof <jdv1029@gmail.com>
Reported-by: Dan Vacura <w36195@motorola.com>
Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Reviewed-by: Jeff Vanhoof <jdv1029@gmail.com>
Tested-by: Jeff Vanhoof <jdv1029@gmail.com>
Link: https://lore.kernel.org/r/b29acbeab531b666095dfdafd8cb5c7654fbb3e1.1666735451.git.Thinh.Nguyen@synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoUSB: add RESET_RESUME quirk for NVIDIA Jetson devices in RCM
Hannu Hartikainen [Mon, 19 Sep 2022 17:16:10 +0000 (20:16 +0300)]
USB: add RESET_RESUME quirk for NVIDIA Jetson devices in RCM

commit 308638e7503b3d4c33256dbec7e3842772383f39 upstream.

NVIDIA Jetson devices in Force Recovery mode (RCM) do not support
suspending, ie. flashing fails if the device has been suspended. The
devices are still visible in lsusb and seem to work otherwise, making
the issue hard to debug. This has been discovered in various forum
posts, eg. [1].

The patch has been tested on NVIDIA Jetson AGX Xavier, but I'm adding
all the Jetson models listed in [2] on the assumption that they all
behave similarly.

[1]: https://forums.developer.nvidia.com/t/flashing-not-working/72365
[2]: https://docs.nvidia.com/jetson/archives/l4t-archived/l4t-3271/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/quick_start.html

Signed-off-by: Hannu Hartikainen <hannu@hrtk.in>
Cc: stable <stable@kernel.org> # after 6.1-rc3
Link: https://lore.kernel.org/r/20220919171610.30484-1-hannu@hrtk.in
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoALSA: au88x0: use explicitly signed char
Jason A. Donenfeld [Mon, 24 Oct 2022 16:29:29 +0000 (18:29 +0200)]
ALSA: au88x0: use explicitly signed char

commit a3956226b602ed177527f0dcf1b08916b2aee28b upstream.

With char becoming unsigned by default, and with `char` alone being
ambiguous and based on architecture, signed chars need to be marked
explicitly as such. This fixes warnings like:

sound/pci/au88x0/au88x0_core.c:2029 vortex_adb_checkinout() warn: signedness bug returning '(-22)'
sound/pci/au88x0/au88x0_core.c:2046 vortex_adb_checkinout() warn: signedness bug returning '(-12)'
sound/pci/au88x0/au88x0_core.c:2125 vortex_adb_allocroute() warn: 'vortex_adb_checkinout(vortex, (0), en, 0)' is unsigned
sound/pci/au88x0/au88x0_core.c:2170 vortex_adb_allocroute() warn: 'vortex_adb_checkinout(vortex, stream->resources, en, 4)' is unsigned

As well, since one function returns errnos, return an `int` rather than
a `signed char`.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20221024162929.536004-1-Jason@zx2c4.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoALSA: Use del_timer_sync() before freeing timer
Steven Rostedt (Google) [Thu, 27 Oct 2022 03:12:36 +0000 (23:12 -0400)]
ALSA: Use del_timer_sync() before freeing timer

commit 87feec0995cc7b45f84b0a9cb57ad012da51bf7e upstream.

The current code for freeing the emux timer is extremely dangerous:

  CPU0 CPU1
  ---- ----
snd_emux_timer_callback()
    snd_emux_free()
      spin_lock(&emu->voice_lock)
      del_timer(&emu->tlist); <-- returns immediately
      spin_unlock(&emu->voice_lock);
      [..]
      kfree(emu);

  spin_lock(&emu->voice_lock);

 [BOOM!]

Instead just use del_timer_sync() which will wait for the timer to finish
before continuing. No need to check if the timer is active or not when
doing so.

This doesn't fix the race of a possible re-arming of the timer, but at
least it won't use the data that has just been freed.

[ Fixed unused variable warning by tiwai ]

Cc: stable@vger.kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20221026231236.6834b551@gandalf.local.home
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agocan: kvaser_usb: Fix possible completions during init_completion
Anssi Hannula [Mon, 10 Oct 2022 18:52:27 +0000 (20:52 +0200)]
can: kvaser_usb: Fix possible completions during init_completion

commit 9975e58d217bbdf08d8e1ba2b61703a61c3c02e5 upstream.

kvaser_usb uses completions to signal when a response event is received
for outgoing commands.

However, it uses init_completion() to reinitialize the start_comp and
stop_comp completions before sending the start/stop commands.

In case the device sends the corresponding response just before the
actual command is sent, complete() may be called concurrently with
init_completion() which is not safe.

This might be triggerable even with a properly functioning device by
stopping the interface (CMD_STOP_CHIP) just after it goes bus-off (which
also causes the driver to send CMD_STOP_CHIP when restart-ms is off),
but that was not tested.

Fix the issue by using reinit_completion() instead.

Fixes: d9f21bf91c45 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices")
Tested-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://lore.kernel.org/all/20221010185237.319219-2-extja@kvaser.com
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agocan: j1939: transport: j1939_session_skb_drop_old(): spin_unlock_irqrestore() before...
Yang Yingliang [Thu, 27 Oct 2022 09:12:37 +0000 (17:12 +0800)]
can: j1939: transport: j1939_session_skb_drop_old(): spin_unlock_irqrestore() before kfree_skb()

commit a7cc53ac46d168ef5a70c79b1a13999bd7e504eb upstream.

It is not allowed to call kfree_skb() from hardware interrupt context
or with interrupts being disabled. The skb is unlinked from the queue,
so it can be freed after spin_unlock_irqrestore().

Fixes: 877b9bba3266 ("can: add support of SAE J1939 protocol")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/all/20221027091237.2290111-1-yangyingliang@huawei.com
Cc: stable@vger.kernel.org
[mkl: adjust subject]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoLinux 5.4.222
Greg Kroah-Hartman [Tue, 1 Nov 2022 18:06:42 +0000 (19:06 +0100)]
Linux 5.4.222

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoonce: fix section mismatch on clang builds
Greg Kroah-Hartman [Tue, 1 Nov 2022 17:12:39 +0000 (18:12 +0100)]
once: fix section mismatch on clang builds

On older kernels (5.4 and older), building the kernel with clang can
cause the section name to end up with "" in them, which can cause lots
of runtime issues as that is not normally a valid portion of the string.

This was fixed up in newer kernels with commit 9c79cdaf17c3 ("treewide:
Convert macro and uses of __section(foo) to __section("foo")") but
that's too heavy-handed for older kernels.

So for now, fix up the problem that commit aace4a56fac3 ("once: add
DO_ONCE_SLOW() for sleepable contexts") caused by being backported by
removing the "" characters from the section definition.

Reported-by: Oleksandr Tymoshenko <ovt@google.com>
Reported-by: Yongqin Liu <yongqin.liu@linaro.org>
Tested-by: Yongqin Liu <yongqin.liu@linaro.org>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Link: https://lore.kernel.org/r/20221029011211.4049810-1-ovt@google.com
Link: https://lore.kernel.org/r/CAMSo37XApZ_F5nSQYWFsSqKdMv_gBpfdKG3KN1TDB+QNXqSh0A@mail.gmail.com
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David S. Miller <davem@davemloft.net>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoLinux 5.4.221
Greg Kroah-Hartman [Sat, 29 Oct 2022 08:20:36 +0000 (10:20 +0200)]
Linux 5.4.221

Link: https://lore.kernel.org/r/20221027165049.817124510@linuxfoundation.org
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agomm: /proc/pid/smaps_rollup: fix no vma's null-deref
Seth Jenkins [Thu, 27 Oct 2022 15:36:52 +0000 (11:36 -0400)]
mm: /proc/pid/smaps_rollup: fix no vma's null-deref

Commit 2c058326072c ("mm: /proc/pid/smaps_rollup: convert to single value
seq_file") introduced a null-deref if there are no vma's in the task in
show_smaps_rollup.

Fixes: 2c058326072c ("mm: /proc/pid/smaps_rollup: convert to single value seq_file")
Signed-off-by: Seth Jenkins <sethjenkins@google.com>
Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>
Tested-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agohv_netvsc: Fix race between VF offering and VF association message from host
Gaurav Kohli [Thu, 6 Oct 2022 05:52:59 +0000 (22:52 -0700)]
hv_netvsc: Fix race between VF offering and VF association message from host

commit 84fe330a5d2598716767e39af49ec3eef4e381d5 upstream.

During vm boot, there might be possibility that vf registration
call comes before the vf association from host to vm.

And this might break netvsc vf path, To prevent the same block
vf registration until vf bind message comes from host.

Cc: stable@vger.kernel.org
Fixes: 29bd7df8e7a6d ("hv_netvsc: pair VF based on serial number")
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Gaurav Kohli <gauravkohli@linux.microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoMakefile.debug: re-enable debug info for .S files
Nick Desaulniers [Mon, 24 Oct 2022 20:34:14 +0000 (13:34 -0700)]
Makefile.debug: re-enable debug info for .S files

This is _not_ an upstream commit and just for 5.4.y only. It is based
on commit 4658eaed64f68b1ec029b23f6ae6d2076e451e4d upstream.

Alexey reported that the fraction of unknown filename instances in
kallsyms grew from ~0.3% to ~10% recently; Bill and Greg tracked it down
to assembler defined symbols, which regressed as a result of:

commit f57637f6477b ("Kbuild: do not emit debug info for assembly with LLVM_IAS=1")

In that commit, I allude to restoring debug info for assembler defined
symbols in a follow up patch, but it seems I forgot to do so in

commit 49f087762cc6 ("Kbuild: make DWARF version a choice")

Fixes: f57637f6477b ("Kbuild: do not emit debug info for assembly with LLVM_IAS=1")
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoACPI: video: Force backlight native for more TongFang devices
Werner Sembach [Wed, 26 Oct 2022 15:22:46 +0000 (17:22 +0200)]
ACPI: video: Force backlight native for more TongFang devices

commit e3ee8fe958f57360889f17dcfd384af66947b878 upstream.

This commit is very different from the upstream commit! It fixes the same
issue by adding more quirks, rather then the general fix from the 6.1
kernel, because the general fix from the 6.1 kernel is part of a larger
refactoring of the backlight code which is not suitable for the stable
series.

As described in "ACPI: video: Drop NL5x?U, PF4NU1F and PF5?U??
acpi_backlight=native quirks" (2626ba541463) the upstream commit "ACPI:
video: Make backlight class device registration a separate step (v2)"
(e3ee8fe958f5) makes these quirks unnecessary. However as mentioned in this
bugtracker ticket https://bugzilla.kernel.org/show_bug.cgi?id=215683#c17
the upstream fix is part of a larger patchset that is overall too complex
for stable.

The TongFang GKxNRxx, GMxNGxx, GMxZGxx, and GMxRGxx / TUXEDO
Stellaris/Polaris Gen 1-4, have the same problem as the Clevo NL5xRU and
NL5xNU / TUXEDO Aura 15 Gen1 and Gen2:
They have a working native and video interface for screen backlight.
However the default detection mechanism first registers the video interface
before unregistering it again and switching to the native interface during
boot. This results in a dangling SBIOS request for backlight change for
some reason, causing the backlight to switch to ~2% once per boot on the
first power cord connect or disconnect event. Setting the native interface
explicitly circumvents this buggy behaviour by avoiding the unregistering
process.

Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Werner Sembach <wse@tuxedocomputers.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoriscv: topology: fix default topology reporting
Conor Dooley [Wed, 19 Oct 2022 12:52:10 +0000 (13:52 +0100)]
riscv: topology: fix default topology reporting

commit 9bed98dfd4993a352db91246d621c9194909940e upstream.

RISC-V has no sane defaults to fall back on where there is no cpu-map
in the devicetree.
Without sane defaults, the package, core and thread IDs are all set to
-1. This causes user-visible inaccuracies for tools like hwloc/lstopo
which rely on the sysfs cpu topology files to detect a system's
topology.

On a PolarFire SoC, which should have 4 harts with a thread each,
lstopo currently reports:

Machine (793MB total)
  Package L#0
    NUMANode L#0 (P#0 793MB)
    Core L#0
      L1d L#0 (32KB) + L1i L#0 (32KB) + PU L#0 (P#0)
      L1d L#1 (32KB) + L1i L#1 (32KB) + PU L#1 (P#1)
      L1d L#2 (32KB) + L1i L#2 (32KB) + PU L#2 (P#2)
      L1d L#3 (32KB) + L1i L#3 (32KB) + PU L#3 (P#3)

Adding calls to store_cpu_topology() in {boot,smp} hart bringup code
results in the correct topolgy being reported:

Machine (793MB total)
  Package L#0
    NUMANode L#0 (P#0 793MB)
    L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
    L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
    L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
    L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)

CC: stable@vger.kernel.org # f925e1848b58: arm64: topology: move store_cpu_topology() to shared code
Fixes: 83d57b1d0686 ("RISC-V: Parse cpu topology during boot.")
Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
Link: https://github.com/open-mpi/hwloc/issues/536
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoarm64: topology: move store_cpu_topology() to shared code
Conor Dooley [Wed, 19 Oct 2022 12:52:09 +0000 (13:52 +0100)]
arm64: topology: move store_cpu_topology() to shared code

commit f925e1848b58df8c37936d7da3bd487f53fb7e77 upstream.

arm64's method of defining a default cpu topology requires only minimal
changes to apply to RISC-V also. The current arm64 implementation exits
early in a uniprocessor configuration by reading MPIDR & claiming that
uniprocessor can rely on the default values.

This is appears to be a hangover from prior to '206de32eec8a ("arm64:
topology: Stop using MPIDR for topology information")', because the
current code just assigns default values for multiprocessor systems.

With the MPIDR references removed, store_cpu_topolgy() can be moved to
the common arch_topology code.

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoiommu/vt-d: Clean up si_domain in the init_dmars() error path
Jerry Snitselaar [Wed, 19 Oct 2022 00:44:47 +0000 (08:44 +0800)]
iommu/vt-d: Clean up si_domain in the init_dmars() error path

[ Upstream commit ed779996dff2cd2b37b0b2018b9c2fb939af748a ]

A splat from kmem_cache_destroy() was seen with a kernel prior to
commit a08ebdc5824e ("iommu/vt-d: Remove domain and devinfo mempool")
when there was a failure in init_dmars(), because the iommu_domain
cache still had objects. While the mempool code is now gone, there
still is a leak of the si_domain memory if init_dmars() fails. So
clean up si_domain in the init_dmars() error path.

Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Fixes: 965666c5930a ("iommu/vt-d: Allocate si_domain in init_dmars()")
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20221010144842.308890-1-jsnitsel@redhat.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonet: hns: fix possible memory leak in hnae_ae_register()
Yang Yingliang [Tue, 18 Oct 2022 12:24:51 +0000 (20:24 +0800)]
net: hns: fix possible memory leak in hnae_ae_register()

[ Upstream commit 6ddb52af9b701f33ee9f2170fa1683545e8223bb ]

Inject fault while probing module, if device_register() fails,
but the refcount of kobject is not decreased to 0, the name
allocated in dev_set_name() is leaked. Fix this by calling
put_device(), so that name can be freed in callback function
kobject_cleanup().

unreferenced object 0xffff00c01aba2100 (size 128):
  comm "systemd-udevd", pid 1259, jiffies 4294903284 (age 294.152s)
  hex dump (first 32 bytes):
    68 6e 61 65 30 00 00 00 18 21 ba 1a c0 00 ff ff  hnae0....!......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<0000000034783f26>] slab_post_alloc_hook+0xa0/0x3e0
    [<00000000748188f2>] __kmem_cache_alloc_node+0x164/0x2b0
    [<00000000ab0743e8>] __kmalloc_node_track_caller+0x6c/0x390
    [<000000006c0ffb13>] kvasprintf+0x8c/0x118
    [<00000000fa27bfe1>] kvasprintf_const+0x60/0xc8
    [<0000000083e10ed7>] kobject_set_name_vargs+0x3c/0xc0
    [<000000000b87affc>] dev_set_name+0x7c/0xa0
    [<000000003fd8fe26>] hnae_ae_register+0xcc/0x190 [hnae]
    [<00000000fe97edc9>] hns_dsaf_ae_init+0x9c/0x108 [hns_dsaf]
    [<00000000c36ff1eb>] hns_dsaf_probe+0x548/0x748 [hns_dsaf]

Fixes: 8aa11e41557b ("net: add Hisilicon Network Subsystem hnae framework support")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20221018122451.1749171-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonet: sched: cake: fix null pointer access issue when cake_init() fails
Zhengchao Shao [Tue, 18 Oct 2022 06:31:59 +0000 (14:31 +0800)]
net: sched: cake: fix null pointer access issue when cake_init() fails

[ Upstream commit 8cc15401fed41148cae86805ce701d2dfaecbbbb ]

When the default qdisc is cake, if the qdisc of dev_queue fails to be
inited during mqprio_init(), cake_reset() is invoked to clear
resources. In this case, the tins is NULL, and it will cause gpf issue.

The process is as follows:
qdisc_create_dflt()
cake_init()
q->tins = kvcalloc(...)        --->failed, q->tins is NULL
...
qdisc_put()
...
cake_reset()
...
cake_dequeue_one()
b = &q->tins[...]   --->q->tins is NULL

The following is the Call Trace information:
general protection fault, probably for non-canonical address
0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
RIP: 0010:cake_dequeue_one+0xc9/0x3c0
Call Trace:
<TASK>
cake_reset+0xb1/0x140
qdisc_reset+0xed/0x6f0
qdisc_destroy+0x82/0x4c0
qdisc_put+0x9e/0xb0
qdisc_create_dflt+0x2c3/0x4a0
mqprio_init+0xa71/0x1760
qdisc_create+0x3eb/0x1000
tc_modify_qdisc+0x408/0x1720
rtnetlink_rcv_msg+0x38e/0xac0
netlink_rcv_skb+0x12d/0x3a0
netlink_unicast+0x4a2/0x740
netlink_sendmsg+0x826/0xcc0
sock_sendmsg+0xc5/0x100
____sys_sendmsg+0x583/0x690
___sys_sendmsg+0xe8/0x160
__sys_sendmsg+0xbf/0x160
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f89e5122d04
</TASK>

Fixes: 23f40d78028b ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonet: phy: dp83867: Extend RX strap quirk for SGMII mode
Harini Katakam [Fri, 14 Oct 2022 06:47:35 +0000 (12:17 +0530)]
net: phy: dp83867: Extend RX strap quirk for SGMII mode

[ Upstream commit cf6da8c8a4a4eafa754bb2535c31502e7235af42 ]

When RX strap in HW is not set to MODE 3 or 4, bit 7 and 8 in CF4
register should be set. The former is already handled in
dp83867_config_init; add the latter in SGMII specific initialization.

Fixes: 6d27bc44975f ("net: phy: dp83867: Add TI dp83867 phy")
Signed-off-by: Harini Katakam <harini.katakam@amd.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonet/atm: fix proc_mpc_write incorrect return value
Xiaobo Liu [Fri, 14 Oct 2022 02:05:40 +0000 (10:05 +0800)]
net/atm: fix proc_mpc_write incorrect return value

[ Upstream commit 32ebe52b40882a2477099e9fab3ed5ad655dbdb5 ]

Then the input contains '\0' or '\n', proc_mpc_write has read them,
so the return value needs +1.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xiaobo Liu <cppcoffee@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoHID: magicmouse: Do not set BTN_MOUSE on double report
José Expósito [Sun, 9 Oct 2022 18:27:47 +0000 (20:27 +0200)]
HID: magicmouse: Do not set BTN_MOUSE on double report

[ Upstream commit 6bbad19da7e1edb856053122aef3bb24fbde10ef ]

Under certain conditions the Magic Trackpad can group 2 reports in a
single packet. The packet is split and the raw event function is
invoked recursively for each part.

However, after processing each part, the BTN_MOUSE status is updated,
sending multiple click events. [1]

Return after processing double reports to avoid this issue.

Link: https://gitlab.freedesktop.org/libinput/libinput/-/issues/811
Fixes: c524c40cc083 ("HID: magicmouse: enable Magic Trackpad support")
Reported-by: Nulo <git@nulo.in>
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20221009182747.90730-1-jose.exposito89@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agotipc: fix an information leak in tipc_topsrv_kern_subscr
Alexander Potapenko [Wed, 12 Oct 2022 15:25:14 +0000 (17:25 +0200)]
tipc: fix an information leak in tipc_topsrv_kern_subscr

[ Upstream commit 4d88bf2342b7fe516c2dd4d2b8610e6f7c39deae ]

Use a 8-byte write to initialize sub.usr_handle in
tipc_topsrv_kern_subscr(), otherwise four bytes remain uninitialized
when issuing setsockopt(..., SOL_TIPC, ...).
This resulted in an infoleak reported by KMSAN when the packet was
received:

  =====================================================
  BUG: KMSAN: kernel-infoleak in copyout+0xbc/0x100 lib/iov_iter.c:169
   instrument_copy_to_user ./include/linux/instrumented.h:121
   copyout+0xbc/0x100 lib/iov_iter.c:169
   _copy_to_iter+0x5c0/0x20a0 lib/iov_iter.c:527
   copy_to_iter ./include/linux/uio.h:176
   simple_copy_to_iter+0x64/0xa0 net/core/datagram.c:513
   __skb_datagram_iter+0x123/0xdc0 net/core/datagram.c:419
   skb_copy_datagram_iter+0x58/0x200 net/core/datagram.c:527
   skb_copy_datagram_msg ./include/linux/skbuff.h:3903
   packet_recvmsg+0x521/0x1e70 net/packet/af_packet.c:3469
   ____sys_recvmsg+0x2c4/0x810 net/socket.c:?
   ___sys_recvmsg+0x217/0x840 net/socket.c:2743
   __sys_recvmsg net/socket.c:2773
   __do_sys_recvmsg net/socket.c:2783
   __se_sys_recvmsg net/socket.c:2780
   __x64_sys_recvmsg+0x364/0x540 net/socket.c:2780
   do_syscall_x64 arch/x86/entry/common.c:50
   do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
   entry_SYSCALL_64_after_hwframe+0x63/0xcd arch/x86/entry/entry_64.S:120

  ...

  Uninit was stored to memory at:
   tipc_sub_subscribe+0x42d/0xb50 net/tipc/subscr.c:156
   tipc_conn_rcv_sub+0x246/0x620 net/tipc/topsrv.c:375
   tipc_topsrv_kern_subscr+0x2e8/0x400 net/tipc/topsrv.c:579
   tipc_group_create+0x4e7/0x7d0 net/tipc/group.c:190
   tipc_sk_join+0x2a8/0x770 net/tipc/socket.c:3084
   tipc_setsockopt+0xae5/0xe40 net/tipc/socket.c:3201
   __sys_setsockopt+0x87f/0xdc0 net/socket.c:2252
   __do_sys_setsockopt net/socket.c:2263
   __se_sys_setsockopt net/socket.c:2260
   __x64_sys_setsockopt+0xe0/0x160 net/socket.c:2260
   do_syscall_x64 arch/x86/entry/common.c:50
   do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
   entry_SYSCALL_64_after_hwframe+0x63/0xcd arch/x86/entry/entry_64.S:120

  Local variable sub created at:
   tipc_topsrv_kern_subscr+0x57/0x400 net/tipc/topsrv.c:562
   tipc_group_create+0x4e7/0x7d0 net/tipc/group.c:190

  Bytes 84-87 of 88 are uninitialized
  Memory access of size 88 starts at ffff88801ed57cd0
  Data copied to user address 0000000020000400
  ...
  =====================================================

Signed-off-by: Alexander Potapenko <glider@google.com>
Fixes: b4b521da45fe16 ("tipc: rename tipc_server to tipc_topsrv")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agotipc: Fix recognition of trial period
Mark Tomlinson [Mon, 10 Oct 2022 02:46:13 +0000 (15:46 +1300)]
tipc: Fix recognition of trial period

[ Upstream commit 094f35875aa23d942e69f4e329d10884e4cc3c3a ]

The trial period exists until jiffies is after addr_trial_end. But as
jiffies will eventually overflow, just using time_after will eventually
give incorrect results. As the node address is set once the trial period
ends, this can be used to know that we are not in the trial period.

Fixes: 41f882000ba8 ("tipc: correct discovery message handling during address trial period")
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoACPI: extlog: Handle multiple records
Tony Luck [Mon, 10 Oct 2022 20:34:23 +0000 (13:34 -0700)]
ACPI: extlog: Handle multiple records

[ Upstream commit 94bdf1eb461f223f187c791e25f9a6859a474471 ]

If there is no user space consumer of extlog_mem trace records, then
Linux properly handles multiple error records in an ELOG block

extlog_print()
  print_extlog_rcd()
    __print_extlog_rcd()
      cper_estatus_print()
apei_estatus_for_each_section()

But the other code path hard codes looking for a single record to
output a trace record.

Fix by using the same apei_estatus_for_each_section() iterator
to step over all records.

Fixes: bc64f740108a ("trace, RAS: Add eMCA trace event interface")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agobtrfs: fix processing of delayed tree block refs during backref walking
Filipe Manana [Tue, 11 Oct 2022 12:16:52 +0000 (13:16 +0100)]
btrfs: fix processing of delayed tree block refs during backref walking

[ Upstream commit 71e122c29b0e1c9751d91673d4e31a90a2a0595a ]

During backref walking, when processing a delayed reference with a type of
BTRFS_TREE_BLOCK_REF_KEY, we have two bugs there:

1) We are accessing the delayed references extent_op, and its key, without
   the protection of the delayed ref head's lock;

2) If there's no extent op for the delayed ref head, we end up with an
   uninitialized key in the stack, variable 'tmp_op_key', and then pass
   it to add_indirect_ref(), which adds the reference to the indirect
   refs rb tree.

   This is wrong, because indirect references should have a NULL key
   when we don't have access to the key, and in that case they should be
   added to the indirect_missing_keys rb tree and not to the indirect rb
   tree.

   This means that if have BTRFS_TREE_BLOCK_REF_KEY delayed ref resulting
   from freeing an extent buffer, therefore with a count of -1, it will
   not cancel out the corresponding reference we have in the extent tree
   (with a count of 1), since both references end up in different rb
   trees.

   When using fiemap, where we often need to check if extents are shared
   through shared subtrees resulting from snapshots, it means we can
   incorrectly report an extent as shared when it's no longer shared.
   However this is temporary because after the transaction is committed
   the extent is no longer reported as shared, as running the delayed
   reference results in deleting the tree block reference from the extent
   tree.

   Outside the fiemap context, the result is unpredictable, as the key was
   not initialized but it's used when navigating the rb trees to insert
   and search for references (prelim_ref_compare()), and we expect all
   references in the indirect rb tree to have valid keys.

The following reproducer triggers the second bug:

   $ cat test.sh
   #!/bin/bash

   DEV=/dev/sdj
   MNT=/mnt/sdj

   mkfs.btrfs -f $DEV
   mount -o compress $DEV $MNT

   # With a compressed 128M file we get a tree height of 2 (level 1 root).
   xfs_io -f -c "pwrite -b 1M 0 128M" $MNT/foo

   btrfs subvolume snapshot $MNT $MNT/snap

   # Fiemap should output 0x2008 in the flags column.
   # 0x2000 means shared extent
   # 0x8 means encoded extent (because it's compressed)
   echo
   echo "fiemap after snapshot, range [120M, 120M + 128K):"
   xfs_io -c "fiemap -v 120M 128K" $MNT/foo
   echo

   # Overwrite one extent and fsync to flush delalloc and COW a new path
   # in the snapshot's tree.
   #
   # After this we have a BTRFS_DROP_DELAYED_REF delayed ref of type
   # BTRFS_TREE_BLOCK_REF_KEY with a count of -1 for every COWed extent
   # buffer in the path.
   #
   # In the extent tree we have inline references of type
   # BTRFS_TREE_BLOCK_REF_KEY, with a count of 1, for the same extent
   # buffers, so they should cancel each other, and the extent buffers in
   # the fs tree should no longer be considered as shared.
   #
   echo "Overwriting file range [120M, 120M + 128K)..."
   xfs_io -c "pwrite -b 128K 120M 128K" $MNT/snap/foo
   xfs_io -c "fsync" $MNT/snap/foo

   # Fiemap should output 0x8 in the flags column. The extent in the range
   # [120M, 120M + 128K) is no longer shared, it's now exclusive to the fs
   # tree.
   echo
   echo "fiemap after overwrite range [120M, 120M + 128K):"
   xfs_io -c "fiemap -v 120M 128K" $MNT/foo
   echo

   umount $MNT

Running it before this patch:

   $ ./test.sh
   (...)
   wrote 134217728/134217728 bytes at offset 0
   128 MiB, 128 ops; 0.1152 sec (1.085 GiB/sec and 1110.5809 ops/sec)
   Create a snapshot of '/mnt/sdj' in '/mnt/sdj/snap'

   fiemap after snapshot, range [120M, 120M + 128K):
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [245760..246015]: 34304..34559       256 0x2008

   Overwriting file range [120M, 120M + 128K)...
   wrote 131072/131072 bytes at offset 125829120
   128 KiB, 1 ops; 0.0001 sec (683.060 MiB/sec and 5464.4809 ops/sec)

   fiemap after overwrite range [120M, 120M + 128K):
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [245760..246015]: 34304..34559       256 0x2008

The extent in the range [120M, 120M + 128K) is still reported as shared
(0x2000 bit set) after overwriting that range and flushing delalloc, which
is not correct - an entire path was COWed in the snapshot's tree and the
extent is now only referenced by the original fs tree.

Running it after this patch:

   $ ./test.sh
   (...)
   wrote 134217728/134217728 bytes at offset 0
   128 MiB, 128 ops; 0.1198 sec (1.043 GiB/sec and 1068.2067 ops/sec)
   Create a snapshot of '/mnt/sdj' in '/mnt/sdj/snap'

   fiemap after snapshot, range [120M, 120M + 128K):
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [245760..246015]: 34304..34559       256 0x2008

   Overwriting file range [120M, 120M + 128K)...
   wrote 131072/131072 bytes at offset 125829120
   128 KiB, 1 ops; 0.0001 sec (694.444 MiB/sec and 5555.5556 ops/sec)

   fiemap after overwrite range [120M, 120M + 128K):
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [245760..246015]: 34304..34559       256   0x8

Now the extent is not reported as shared anymore.

So fix this by passing a NULL key pointer to add_indirect_ref() when
processing a delayed reference for a tree block if there's no extent op
for our delayed ref head with a defined key. Also access the extent op
only after locking the delayed ref head's lock.

The reproducer will be converted later to a test case for fstests.

Fixes: 11e540862165bd ("btrfs: convert prelimary reference tracking to use rbtrees")
Fixes: c395a32d92ba67 ("btrfs: Remove unused op_key var from add_delayed_refs")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agobtrfs: fix processing of delayed data refs during backref walking
Filipe Manana [Tue, 11 Oct 2022 12:16:51 +0000 (13:16 +0100)]
btrfs: fix processing of delayed data refs during backref walking

[ Upstream commit 3dd2b537865aa3f0cdd8c73a5786deee1a0918ba ]

When processing delayed data references during backref walking and we are
using a share context (we are being called through fiemap), whenever we
find a delayed data reference for an inode different from the one we are
interested in, then we immediately exit and consider the data extent as
shared. This is wrong, because:

1) This might be a DROP reference that will cancel out a reference in the
   extent tree;

2) Even if it's an ADD reference, it may be followed by a DROP reference
   that cancels it out.

In either case we should not exit immediately.

Fix this by never exiting when we find a delayed data reference for
another inode - instead add the reference and if it does not cancel out
other delayed reference, we will exit early when we call
extent_is_shared() after processing all delayed references. If we find
a drop reference, then signal the code that processes references from
the extent tree (add_inline_refs() and add_keyed_refs()) to not exit
immediately if it finds there a reference for another inode, since we
have delayed drop references that may cancel it out. In this later case
we exit once we don't have references in the rb trees that cancel out
each other and have two references for different inodes.

Example reproducer for case 1):

   $ cat test-1.sh
   #!/bin/bash

   DEV=/dev/sdj
   MNT=/mnt/sdj

   mkfs.btrfs -f $DEV
   mount $DEV $MNT

   xfs_io -f -c "pwrite 0 64K" $MNT/foo
   cp --reflink=always $MNT/foo $MNT/bar

   echo
   echo "fiemap after cloning:"
   xfs_io -c "fiemap -v" $MNT/foo

   rm -f $MNT/bar
   echo
   echo "fiemap after removing file bar:"
   xfs_io -c "fiemap -v" $MNT/foo

   umount $MNT

Running it before this patch, the extent is still listed as shared, it has
the flag 0x2000 (FIEMAP_EXTENT_SHARED) set:

   $ ./test-1.sh
   fiemap after cloning:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128 0x2001

   fiemap after removing file bar:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128 0x2001

Example reproducer for case 2):

   $ cat test-2.sh
   #!/bin/bash

   DEV=/dev/sdj
   MNT=/mnt/sdj

   mkfs.btrfs -f $DEV
   mount $DEV $MNT

   xfs_io -f -c "pwrite 0 64K" $MNT/foo
   cp --reflink=always $MNT/foo $MNT/bar

   # Flush delayed references to the extent tree and commit current
   # transaction.
   sync

   echo
   echo "fiemap after cloning:"
   xfs_io -c "fiemap -v" $MNT/foo

   rm -f $MNT/bar
   echo
   echo "fiemap after removing file bar:"
   xfs_io -c "fiemap -v" $MNT/foo

   umount $MNT

Running it before this patch, the extent is still listed as shared, it has
the flag 0x2000 (FIEMAP_EXTENT_SHARED) set:

   $ ./test-2.sh
   fiemap after cloning:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128 0x2001

   fiemap after removing file bar:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128 0x2001

After this patch, after deleting bar in both tests, the extent is not
reported with the 0x2000 flag anymore, it gets only the flag 0x1
(which is FIEMAP_EXTENT_LAST):

   $ ./test-1.sh
   fiemap after cloning:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128 0x2001

   fiemap after removing file bar:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128   0x1

   $ ./test-2.sh
   fiemap after cloning:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128 0x2001

   fiemap after removing file bar:
   /mnt/sdj/foo:
    EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
      0: [0..127]:        26624..26751       128   0x1

These tests will later be converted to a test case for fstests.

Fixes: 8fbbf0fa007721 ("Btrfs: make fiemap not blow when you have lots of snapshots")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agor8152: add PID for the Lenovo OneLink+ Dock
Jean-Francois Le Fillatre [Wed, 24 Aug 2022 19:14:36 +0000 (21:14 +0200)]
r8152: add PID for the Lenovo OneLink+ Dock

commit 6e344bde8b8cb92dcce3b03ca62c6a994e532224 upstream.

The Lenovo OneLink+ Dock contains an RTL8153 controller that behaves as
a broken CDC device by default. Add the custom Lenovo PID to the r8152
driver to support it properly.

Also, systems compatible with this dock provide a BIOS option to enable
MAC address passthrough (as per Lenovo document "ThinkPad Docking
Solutions 2017"). Add the custom PID to the MAC passthrough list too.

Tested on a ThinkPad 13 1st gen with the expected results:

passthrough disabled: Invalid header when reading pass-thru MAC addr
passthrough enabled:  Using pass-thru MAC addr XX:XX:XX:XX:XX:XX

Signed-off-by: Jean-Francois Le Fillatre <jflf_kernel@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoarm64: errata: Remove AES hwcap for COMPAT tasks
James Morse [Thu, 14 Jul 2022 16:15:23 +0000 (17:15 +0100)]
arm64: errata: Remove AES hwcap for COMPAT tasks

commit d776c9177611cacb6ca4d725671a35676e72e5b6 upstream.

Cortex-A57 and Cortex-A72 have an erratum where an interrupt that
occurs between a pair of AES instructions in aarch32 mode may corrupt
the ELR. The task will subsequently produce the wrong AES result.

The AES instructions are part of the cryptographic extensions, which are
optional. User-space software will detect the support for these
instructions from the hwcaps. If the platform doesn't support these
instructions a software implementation should be used.

Remove the hwcap bits on affected parts to indicate user-space should
not use the AES instructions.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20220714161523.279570-3-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
[florian: resolved conflicts in arch/arm64/tools/cpucaps and cpu_errata.c]
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agomedia: venus: dec: Handle the case where find_format fails
Bryan O'Donoghue [Tue, 26 Jul 2022 02:14:54 +0000 (04:14 +0200)]
media: venus: dec: Handle the case where find_format fails

commit 4998691a52a3027d1316fee85b3cc7d3aa1d0ddb upstream.

Debugging the decoder on msm8916 I noticed the vdec probe was crashing if
the fmt pointer was NULL.

A similar fix from Colin Ian King found by Coverity was implemented for the
encoder. Implement the same fix on the decoder.

Fixes: f9675b152242 ("[media] media: venus: vdec: add video decoder files")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoKVM: arm64: vgic: Fix exit condition in scan_its_table()
Eric Ren [Sat, 15 Oct 2022 03:19:28 +0000 (11:19 +0800)]
KVM: arm64: vgic: Fix exit condition in scan_its_table()

commit fced23a04a18fefcdf55ecebebe81d993ed6a3ab upstream.

With some PCIe topologies, restoring a guest fails while
parsing the ITS device tables.

Reproducer hints:
1. Create ARM virt VM with pxb-pcie bus which adds
   extra host bridges, with qemu command like:

```
  -device pxb-pcie,bus_nr=8,id=pci.x,numa_node=0,bus=pcie.0 \
  -device pcie-root-port,..,bus=pci.x \
  ...
  -device pxb-pcie,bus_nr=37,id=pci.y,numa_node=1,bus=pcie.0 \
  -device pcie-root-port,..,bus=pci.y \
  ...

```
2. Ensure the guest uses 2-level device table
3. Perform VM migration which calls save/restore device tables

In that setup, we get a big "offset" between 2 device_ids,
which makes unsigned "len" round up a big positive number,
causing the scan loop to continue with a bad GPA. For example:

1. L1 table has 2 entries;
2. and we are now scanning at L2 table entry index 2075 (pointed
   to by L1 first entry)
3. if next device id is 9472, we will get a big offset: 7397;
4. with unsigned 'len', 'len -= offset * esz', len will underflow to a
   positive number, mistakenly into next iteration with a bad GPA;
   (It should break out of the current L2 table scanning, and jump
   into the next L1 table entry)
5. that bad GPA fails the guest read.

Fix it by stopping the L2 table scan when the next device id is
outside of the current table, allowing the scan to continue from
the next L1 table entry.

Thanks to Eric Auger for the fix suggestion.

Fixes: ef7352ce7e4a ("KVM: arm64: vgic-its: Add infrastructure for tableookup")
Suggested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Eric Ren <renzhengeek@gmail.com>
[maz: commit message tidy-up]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/d9c3a564af9e2c5bf63f48a7dcbf08cd593c5c0b.1665802985.git.renzhengeek@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoata: ahci: Match EM_MAX_SLOTS with SATA_PMP_MAX_PORTS
Kai-Heng Feng [Tue, 11 Oct 2022 02:46:17 +0000 (10:46 +0800)]
ata: ahci: Match EM_MAX_SLOTS with SATA_PMP_MAX_PORTS

commit 7b4a7c18db55cead7707372ede59c4b9461d8d89 upstream.

UBSAN complains about array-index-out-of-bounds:
[ 1.980703] kernel: UBSAN: array-index-out-of-bounds in /build/linux-9H675w/linux-5.15.0/drivers/ata/libahci.c:968:41
[ 1.980709] kernel: index 15 is out of range for type 'ahci_em_priv [8]'
[ 1.980713] kernel: CPU: 0 PID: 209 Comm: scsi_eh_8 Not tainted 5.15.0-25-generic #25-Ubuntu
[ 1.980716] kernel: Hardware name: System manufacturer System Product Name/P5Q3, BIOS 1102 06/11/2010
[ 1.980718] kernel: Call Trace:
[ 1.980721] kernel: <TASK>
[ 1.980723] kernel: show_stack+0x52/0x58
[ 1.980729] kernel: dump_stack_lvl+0x4a/0x5f
[ 1.980734] kernel: dump_stack+0x10/0x12
[ 1.980736] kernel: ubsan_epilogue+0x9/0x45
[ 1.980739] kernel: __ubsan_handle_out_of_bounds.cold+0x44/0x49
[ 1.980742] kernel: ahci_qc_issue+0x166/0x170 [libahci]
[ 1.980748] kernel: ata_qc_issue+0x135/0x240
[ 1.980752] kernel: ata_exec_internal_sg+0x2c4/0x580
[ 1.980754] kernel: ? vprintk_default+0x1d/0x20
[ 1.980759] kernel: ata_exec_internal+0x67/0xa0
[ 1.980762] kernel: sata_pmp_read+0x8d/0xc0
[ 1.980765] kernel: sata_pmp_read_gscr+0x3c/0x90
[ 1.980768] kernel: sata_pmp_attach+0x8b/0x310
[ 1.980771] kernel: ata_eh_revalidate_and_attach+0x28c/0x4b0
[ 1.980775] kernel: ata_eh_recover+0x6b6/0xb30
[ 1.980778] kernel: ? ahci_do_hardreset+0x180/0x180 [libahci]
[ 1.980783] kernel: ? ahci_stop_engine+0xb0/0xb0 [libahci]
[ 1.980787] kernel: ? ahci_do_softreset+0x290/0x290 [libahci]
[ 1.980792] kernel: ? trace_event_raw_event_ata_eh_link_autopsy_qc+0xe0/0xe0
[ 1.980795] kernel: sata_pmp_eh_recover.isra.0+0x214/0x560
[ 1.980799] kernel: sata_pmp_error_handler+0x23/0x40
[ 1.980802] kernel: ahci_error_handler+0x43/0x80 [libahci]
[ 1.980806] kernel: ata_scsi_port_error_handler+0x2b1/0x600
[ 1.980810] kernel: ata_scsi_error+0x9c/0xd0
[ 1.980813] kernel: scsi_error_handler+0xa1/0x180
[ 1.980817] kernel: ? scsi_unjam_host+0x1c0/0x1c0
[ 1.980820] kernel: kthread+0x12a/0x150
[ 1.980823] kernel: ? set_kthread_struct+0x50/0x50
[ 1.980826] kernel: ret_from_fork+0x22/0x30
[ 1.980831] kernel: </TASK>

This happens because sata_pmp_init_links() initialize link->pmp up to
SATA_PMP_MAX_PORTS while em_priv is declared as 8 elements array.

I can't find the maximum Enclosure Management ports specified in AHCI
spec v1.3.1, but "12.2.1 LED message type" states that "Port Multiplier
Information" can utilize 4 bits, which implies it can support up to 16
ports. Hence, use SATA_PMP_MAX_PORTS as EM_MAX_SLOTS to resolve the
issue.

BugLink: https://bugs.launchpad.net/bugs/1970074
Cc: stable@vger.kernel.org
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoata: ahci-imx: Fix MODULE_ALIAS
Alexander Stein [Wed, 12 Oct 2022 13:11:05 +0000 (15:11 +0200)]
ata: ahci-imx: Fix MODULE_ALIAS

commit 8df9c34622fcf833e3835a9da25e5de3f375b204 upstream.

'ahci:' is an invalid prefix, preventing the module from autoloading.
Fix this by using the 'platform:' prefix and DRV_NAME.

Fixes: e826ed1599e1 ("ahci_imx: add ahci sata support on imx platforms")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agohwmon/coretemp: Handle large core ID value
Zhang Rui [Fri, 14 Oct 2022 09:01:45 +0000 (17:01 +0800)]
hwmon/coretemp: Handle large core ID value

commit 2a9e28e11a861dc5fdb91f25679e8a1d4819973d upstream.

The coretemp driver supports up to a hard-coded limit of 128 cores.

Today, the driver can not support a core with an ID above that limit.
Yet, the encoding of core ID's is arbitrary (BIOS APIC-ID) and so they
may be sparse and they may be large.

Update the driver to map arbitrary core ID numbers into appropriate
array indexes so that 128 cores can be supported, no matter the encoding
of core ID's.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Len Brown <len.brown@intel.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20221014090147.1836-3-rui.zhang@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agox86/microcode/AMD: Apply the patch early on every logical thread
Borislav Petkov [Wed, 5 Oct 2022 10:00:08 +0000 (12:00 +0200)]
x86/microcode/AMD: Apply the patch early on every logical thread

commit af86df31c6fa8c7a0ee948df012e5af7a4304d50 upstream.

Currently, the patch application logic checks whether the revision
needs to be applied on each logical CPU (SMT thread). Therefore, on SMT
designs where the microcode engine is shared between the two threads,
the application happens only on one of them as that is enough to update
the shared microcode engine.

However, there are microcode patches which do per-thread modification,
see Link tag below.

Therefore, drop the revision check and try applying on each thread. This
is what the BIOS does too so this method is very much tested.

Btw, change only the early paths. On the late loading paths, there's no
point in doing per-thread modification because if is it some case like
in the bugzilla below - removing a CPUID flag - the kernel cannot go and
un-use features it has detected are there early. For that, one should
use early loading anyway.

  [ bp: Fixes does not contain the oldest commit which did check for
    equality but that is good enough. ]

Fixes: 3f703fdbc470 ("x86/microcode/AMD: Rework container parsing")
Reported-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Ștefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: <stable@vger.kernel.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216211
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoocfs2: fix BUG when iput after ocfs2_mknod fails
Joseph Qi [Mon, 17 Oct 2022 13:02:26 +0000 (21:02 +0800)]
ocfs2: fix BUG when iput after ocfs2_mknod fails

commit c1ccdf4f58fc6e384a5d9f74748e3cc979bb31bd upstream.

Commit f19fba12ad86 "ocfs2: should reclaim the inode if
'__ocfs2_mknod_locked' returns an error" tried to reclaim the claimed
inode if __ocfs2_mknod_locked() fails later.  But this introduce a race,
the freed bit may be reused immediately by another thread, which will
update dinode, e.g.  i_generation.  Then iput this inode will lead to BUG:
inode->i_generation != le32_to_cpu(fe->i_generation)

We could make this inode as bad, but we did want to do operations like
wipe in some cases.  Since the claimed inode bit can only affect that an
dinode is missing and will return back after fsck, it seems not a big
problem.  So just leave it as is by revert the reclaim logic.

Link: https://lkml.kernel.org/r/20221017130227.234480-1-joseph.qi@linux.alibaba.com
Fixes: f19fba12ad86 ("ocfs2: should reclaim the inode if '__ocfs2_mknod_locked' returns an error")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Yan Wang <wangyan122@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoocfs2: clear dinode links count in case of error
Joseph Qi [Mon, 17 Oct 2022 13:02:27 +0000 (21:02 +0800)]
ocfs2: clear dinode links count in case of error

commit 5e7b6a73dabc5a3b56985276d122b91d46df2c42 upstream.

In ocfs2_mknod(), if error occurs after dinode successfully allocated,
ocfs2 i_links_count will not be 0.

So even though we clear inode i_nlink before iput in error handling, it
still won't wipe inode since we'll refresh inode from dinode during inode
lock.  So just like clear inode i_nlink, we clear ocfs2 i_links_count as
well.  Also do the same change for ocfs2_symlink().

Link: https://lkml.kernel.org/r/20221017130227.234480-2-joseph.qi@linux.alibaba.com
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Yan Wang <wangyan122@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: fix use-after-free on CIL context on shutdown
Dave Chinner [Wed, 26 Oct 2022 06:28:43 +0000 (11:58 +0530)]
xfs: fix use-after-free on CIL context on shutdown

commit 9ccd4e7daab006a8d3e17bf9c7a4284c7ef2ecbd upstream.

xlog_wait() on the CIL context can reference a freed context if the
waiter doesn't get scheduled before the CIL context is freed. This
can happen when a task is on the hard throttle and the CIL push
aborts due to a shutdown. This was detected by generic/019:

thread 1 thread 2

__xfs_trans_commit
 xfs_log_commit_cil
  <CIL size over hard throttle limit>
  xlog_wait
   schedule
xlog_cil_push_work
wake_up_all
<shutdown aborts commit>
xlog_cil_committed
kmem_free

   remove_wait_queue
    spin_lock_irqsave --> UAF

Fix it by moving the wait queue to the CIL rather than keeping it in
in the CIL context that gets freed on push completion. Because the
wait queue is now independent of the CIL context and we might have
multiple contexts in flight at once, only wake the waiters on the
push throttle when the context we are pushing is over the hard
throttle size threshold.

Fixes: 816b3545698d5 ("xfs: Throttle commits on delayed background CIL push")
Reported-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: move inode flush to the sync workqueue
Darrick J. Wong [Wed, 26 Oct 2022 06:28:42 +0000 (11:58 +0530)]
xfs: move inode flush to the sync workqueue

commit e5c7b41c9a42f4e6db035aa9d453c4f557c11213 upstream.

[ Modify fs/xfs/xfs_super.c to include the changes at locations suitable for
 5.4-lts kernel ]

Move the inode dirty data flushing to a workqueue so that multiple
threads can take advantage of a single thread's flushing work.  The
ratelimiting technique used in bdd4ee4 was not successful, because
threads that skipped the inode flush scan due to ratelimiting would
ENOSPC early, which caused occasional (but noticeable) changes in
behavior and sporadic fstest regressions.

Therefore, make all the writer threads wait on a single inode flush,
which eliminates both the stampeding hordes of flushers and the small
window in which a write could fail with ENOSPC because it lost the
ratelimit race after even another thread freed space.

Fixes: 8056c99cd46b ("xfs: ratelimit inode flush on buffered write ENOSPC")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: reflink should force the log out if mounted with wsync
Christoph Hellwig [Wed, 26 Oct 2022 06:28:41 +0000 (11:58 +0530)]
xfs: reflink should force the log out if mounted with wsync

commit 9700dd3e35d4f567ba1cfe3bd4382d129c00a3c6 upstream.

Reflink should force the log out to disk if the filesystem was mounted
with wsync, the same as most other operations in xfs.

[Note: XFS_MOUNT_WSYNC is set when the admin mounts the filesystem
with either the 'wsync' or 'sync' mount options, which effectively means
that we're classifying reflink/dedupe as IO operations and making them
synchronous when required.]

Fixes: 667b770843d06 ("xfs: remove xfs_reflink_remap_range")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
[darrick: add more to the changelog]
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: factor out a new xfs_log_force_inode helper
Christoph Hellwig [Wed, 26 Oct 2022 06:28:40 +0000 (11:58 +0530)]
xfs: factor out a new xfs_log_force_inode helper

commit fd7796afc3c18a0f441d02e0488afd166fc938dc upstream.

Create a new helper to force the log up to the last LSN touching an
inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: trylock underlying buffer on dquot flush
Brian Foster [Wed, 26 Oct 2022 06:28:39 +0000 (11:58 +0530)]
xfs: trylock underlying buffer on dquot flush

commit 60c63658a4e9c08b52ab41ec8a3d8cd47bc52ca8 upstream.

A dquot flush currently blocks on the buffer lock for the underlying
dquot buffer. In turn, this causes xfsaild to block rather than
continue processing other items in the meantime. Update
xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
are handled, and return -EAGAIN if the lock fails. Fix up any
callers that don't currently handle the error properly.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: don't write a corrupt unmount record to force summary counter recalc
Darrick J. Wong [Wed, 26 Oct 2022 06:28:38 +0000 (11:58 +0530)]
xfs: don't write a corrupt unmount record to force summary counter recalc

commit 6312e563129a43203ceffa4a758819fc14e32a2a upstream.

[ Modify fs/xfs/xfs_log.c to include the changes at locations suitable for
  5.4-lts kernel ]

In commit e9b8320d877b6, I added the ability to force a recalculation of
the filesystem summary counters if they seemed incorrect.  This was done
(not entirely correctly) by tweaking the log code to write an unmount
record without the UMOUNT_TRANS flag set.  At next mount, the log
recovery code will fail to find the unmount record and go into recovery,
which triggers the recalculation.

What actually gets written to the log is what ought to be an unmount
record, but without any flags set to indicate what kind of record it
actually is.  This worked to trigger the recalculation, but we shouldn't
write bogus log records when we could simply write nothing.

Fixes: e9b8320d877b6 ("xfs: force summary counter recalc at next mount")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: tail updates only need to occur when LSN changes
Dave Chinner [Wed, 26 Oct 2022 06:28:37 +0000 (11:58 +0530)]
xfs: tail updates only need to occur when LSN changes

commit cb48029f80104da2119a19773f79926d1d816e4d upstream.

We currently wake anything waiting on the log tail to move whenever
the log item at the tail of the log is removed. Historically this
was fine behaviour because there were very few items at any given
LSN. But with delayed logging, there may be thousands of items at
any given LSN, and we can't move the tail until they are all gone.

Hence if we are removing them in near tail-first order, we might be
waking up processes waiting on the tail LSN to change (e.g. log
space waiters) repeatedly without them being able to make progress.
This also occurs with the new sync push waiters, and can result in
thousands of spurious wakeups every second when under heavy direct
reclaim pressure.

To fix this, check that the tail LSN has actually changed on the
AIL before triggering wakeups. This will reduce the number of
spurious wakeups when doing bulk AIL removal and make this code much
more efficient.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: factor common AIL item deletion code
Dave Chinner [Wed, 26 Oct 2022 06:28:36 +0000 (11:58 +0530)]
xfs: factor common AIL item deletion code

commit c4003c2948904e9cff090e341b7b4b846ce129e2 upstream.

Factor the common AIL deletion code that does all the wakeups into a
helper so we only have one copy of this somewhat tricky code to
interface with all the wakeups necessary when the LSN of the log
tail changes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: Throttle commits on delayed background CIL push
Dave Chinner [Wed, 26 Oct 2022 06:28:35 +0000 (11:58 +0530)]
xfs: Throttle commits on delayed background CIL push

commit 816b3545698d5afb64d27a16210abd41d12348ad upstream.

In certain situations the background CIL push can be indefinitely
delayed. While we have workarounds from the obvious cases now, it
doesn't solve the underlying issue. This issue is that there is no
upper limit on the CIL where we will either force or wait for
a background push to start, hence allowing the CIL to grow without
bound until it consumes all log space.

To fix this, add a new wait queue to the CIL which allows background
pushes to wait for the CIL context to be switched out. This happens
when the push starts, so it will allow us to block incoming
transaction commit completion until the push has started. This will
only affect processes that are running modifications, and only when
the CIL threshold has been significantly overrun.

This has no apparent impact on performance, and doesn't even trigger
until over 45 million inodes had been created in a 16-way fsmark
test on a 2GB log. That was limiting at 64MB of log space used, so
the active CIL size is only about 3% of the total log in that case.
The concurrent removal of those files did not trigger the background
sleep at all.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: Lower CIL flush limit for large logs
Dave Chinner [Wed, 26 Oct 2022 06:28:34 +0000 (11:58 +0530)]
xfs: Lower CIL flush limit for large logs

commit b5d24c5e92ae3531b0c73c13b26e549043053087 upstream.

The current CIL size aggregation limit is 1/8th the log size. This
means for large logs we might be aggregating at least 250MB of dirty objects
in memory before the CIL is flushed to the journal. With CIL shadow
buffers sitting around, this means the CIL is often consuming >500MB
of temporary memory that is all allocated under GFP_NOFS conditions.

Flushing the CIL can take some time to do if there is other IO
ongoing, and can introduce substantial log force latency by itself.
It also pins the memory until the objects are in the AIL and can be
written back and reclaimed by shrinkers. Hence this threshold also
tends to determine the minimum amount of memory XFS can operate in
under heavy modification without triggering the OOM killer.

Modify the CIL space limit to prevent such huge amounts of pinned
metadata from aggregating. We can have 2MB of log IO in flight at
once, so limit aggregation to 16x this size. This threshold was
chosen as it little impact on performance (on 16-way fsmark) or log
traffic but pins a lot less memory on large logs especially under
heavy memory pressure.  An aggregation limit of 8x had 5-10%
performance degradation and a 50% increase in log throughput for
the same workload, so clearly that was too small for highly
concurrent workloads on large logs.

This was found via trace analysis of AIL behaviour. e.g. insertion
from a single CIL flush:

xfs_ail_insert: old lsn 0/0 new lsn 1/3033090 type XFS_LI_INODE flags IN_AIL

$ grep xfs_ail_insert /mnt/scratch/s.t |grep "new lsn 1/3033090" |wc -l
1721823
$

So there were 1.7 million objects inserted into the AIL from this
CIL checkpoint, the first at 2323.392108, the last at 2325.667566 which
was the end of the trace (i.e. it hadn't finished). Clearly a major
problem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: preserve default grace interval during quotacheck
Darrick J. Wong [Wed, 26 Oct 2022 06:28:33 +0000 (11:58 +0530)]
xfs: preserve default grace interval during quotacheck

commit 69f3ea28d03a3b505f94c10c3c5c50a958a819f4 upstream.

When quotacheck runs, it zeroes all the timer fields in every dquot.
Unfortunately, it also does this to the root dquot, which erases any
preconfigured grace intervals and warning limits that the administrator
may have set.  Worse yet, the incore copies of those variables remain
set.  This cache coherence problem manifests itself as the grace
interval mysteriously being reset back to the defaults at the /next/
mount.

Fix it by not resetting the root disk dquot's timer and warning fields.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: fix unmount hang and memory leak on shutdown during quotaoff
Brian Foster [Wed, 26 Oct 2022 06:28:32 +0000 (11:58 +0530)]
xfs: fix unmount hang and memory leak on shutdown during quotaoff

commit f00d6ec3e3c85f2f9222d3ed8d6cf469c044be39 upstream.

AIL removal of the quotaoff start intent and free of both quotaoff
intents is currently limited to the ->iop_committed() handler of the
end intent. This executes when the end intent is committed to the
on-disk log and marks the completion of the operation. The problem
with this is it assumes the success of the operation. If a shutdown
or other error occurs during the quotaoff, it's possible for the
quotaoff task to exit without removing the start intent from the
AIL. This results in an unmount hang as the AIL cannot be emptied.
Further, no other codepath frees the intents and so this is also a
memory leak vector.

First, update the high level quotaoff error path to directly remove
and free the quotaoff start intent if it still exists in the AIL at
the time of the error. Next, update both of the start and end
quotaoff intents with an ->iop_release() callback to properly handle
transaction abort.

This means that If the quotaoff start transaction aborts, it frees
the start intent in the transaction commit path. If the filesystem
shuts down before the end transaction allocates, the quotaoff
sequence removes and frees the start intent. If the end transaction
aborts, it removes the start intent and frees both. This ensures
that a shutdown does not result in a hung unmount and that memory is
not leaked regardless of when a quotaoff error occurs.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: factor out quotaoff intent AIL removal and memory free
Brian Foster [Wed, 26 Oct 2022 06:28:31 +0000 (11:58 +0530)]
xfs: factor out quotaoff intent AIL removal and memory free

commit 789a36fa57d76a1de9f7b5c2cb9c3d561b34a222 upstream.

AIL removal of the quotaoff start intent and free of both intents is
hardcoded to the ->iop_committed() handler of the end intent. Factor
out the start intent handling code so it can be used in a future
patch to properly handle quotaoff errors. Use xfs_trans_ail_remove()
instead of the _delete() variant to acquire the AIL lock and also
handle cases where an intent might not reside in the AIL at the
time of a failure.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: Replace function declaration by actual definition
Pavel Reichl [Wed, 26 Oct 2022 06:28:30 +0000 (11:58 +0530)]
xfs: Replace function declaration by actual definition

commit fc280759c4149c13c3e7dbac63f11cbd85a6311d upstream.

Signed-off-by: Pavel Reichl <preichl@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix typo in subject line]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: remove the xfs_qoff_logitem_t typedef
Pavel Reichl [Wed, 26 Oct 2022 06:28:29 +0000 (11:58 +0530)]
xfs: remove the xfs_qoff_logitem_t typedef

commit 9ea899e41391c5046bd66e04ab89df97797f76c7 upstream.

Signed-off-by: Pavel Reichl <preichl@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix a comment]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: remove the xfs_dq_logitem_t typedef
Pavel Reichl [Wed, 26 Oct 2022 06:28:28 +0000 (11:58 +0530)]
xfs: remove the xfs_dq_logitem_t typedef

commit 2d9b847adff65f98f4ebc7fb8aed51ef92d66922 upstream.

Signed-off-by: Pavel Reichl <preichl@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: remove the xfs_disk_dquot_t and xfs_dquot_t
Pavel Reichl [Wed, 26 Oct 2022 06:28:27 +0000 (11:58 +0530)]
xfs: remove the xfs_disk_dquot_t and xfs_dquot_t

commit 951509bd6992772df8da6277e154137ea0916017 upstream.

Signed-off-by: Pavel Reichl <preichl@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix some of the comments]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: Use scnprintf() for avoiding potential buffer overflow
Takashi Iwai [Wed, 26 Oct 2022 06:28:26 +0000 (11:58 +0530)]
xfs: Use scnprintf() for avoiding potential buffer overflow

commit 9df0d0b2dececf39f45bae96b2542362d9ae440a upstream.

Since snprintf() returns the would-be-output size instead of the
actual output size, the succeeding calls may go beyond the given
buffer limit.  Fix it by replacing with scnprintf().

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: check owner of dir3 blocks
Darrick J. Wong [Wed, 26 Oct 2022 06:28:25 +0000 (11:58 +0530)]
xfs: check owner of dir3 blocks

commit 26e740a9e0a69f8564e8a5575565b79745374fc6 upstream.

Check the owner field of dir3 block headers.  If it's corrupt, release
the buffer and return EFSCORRUPTED.  All callers handle this properly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: check owner of dir3 data blocks
Darrick J. Wong [Wed, 26 Oct 2022 06:28:24 +0000 (11:58 +0530)]
xfs: check owner of dir3 data blocks

commit 745191f2f73c5a696956593671376b1fb987ae7b upstream.

[Slightly edit xfs_dir3_data_read() to work with existing mapped_bno argument instead
of flag values introduced in later kernels]

Check the owner field of dir3 data block headers.  If it's corrupt,
release the buffer and return EFSCORRUPTED.  All callers handle this
properly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: fix buffer corruption reporting when xfs_dir3_free_header_check fails
Darrick J. Wong [Wed, 26 Oct 2022 06:28:23 +0000 (11:58 +0530)]
xfs: fix buffer corruption reporting when xfs_dir3_free_header_check fails

commit 685f9c29a662747addb45a4bef1afd0f310afe54 upstream.

xfs_verifier_error is supposed to be called on a corrupt metadata buffer
from within a buffer verifier function, whereas xfs_buf_mark_corrupt
is the function to be called when a piece of code has read a buffer and
catches something that a read verifier cannot.  The first function sets
b_error anticipating that the low level buffer handling code will see
the nonzero b_error and clear XBF_DONE on the buffer, whereas the second
function does not.

Since xfs_dir3_free_header_check examines fields in the dir free block
header that require more context than can be provided to read verifiers,
we must call xfs_buf_mark_corrupt when it finds a problem.

Switching the calls has a secondary effect that we no longer corrupt the
buffer state by setting b_error and leaving XBF_DONE set.  When /that/
happens, we'll trip over various state assertions (most commonly the
b_error check in xfs_buf_reverify) on a subsequent attempt to read the
buffer.

Fixes: 930c9b2e9fc92663 ("xfs: refactor verifier callers to print address of failing check")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: xfs_buf_corruption_error should take __this_address
Darrick J. Wong [Wed, 26 Oct 2022 06:28:22 +0000 (11:58 +0530)]
xfs: xfs_buf_corruption_error should take __this_address

commit 1b629ada58862c58f4203477117fc858c2f3b62d upstream.

Add a xfs_failaddr_t parameter to this function so that callers can
potentially pass in (and therefore report) the exact point in the code
where we decided that a metadata buffer was corrupt.  This enables us to
wire it up to checking functions that have to run outside of verifiers.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: add a function to deal with corrupt buffers post-verifiers
Darrick J. Wong [Wed, 26 Oct 2022 06:28:21 +0000 (11:58 +0530)]
xfs: add a function to deal with corrupt buffers post-verifiers

commit 82778c17bf5d56b9326d34fe0c096d4b0f465305 upstream.

Add a helper function to get rid of buffers that we have decided are
corrupt after the verifiers have run.  This function is intended to
handle metadata checks that can't happen in the verifiers, such as
inter-block relationship checking.  Note that we now mark the buffer
stale so that it will not end up on any LRU and will be purged on
release.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: rework collapse range into an atomic operation
Brian Foster [Wed, 26 Oct 2022 06:28:20 +0000 (11:58 +0530)]
xfs: rework collapse range into an atomic operation

commit 7e4db195a6e734ffb3c9fa7ff39f72630798b555 upstream.

The collapse range operation uses a unique transaction and ilock
cycle for the hole punch and each extent shift iteration of the
overall operation. While the hole punch is safe as a separate
operation due to the iolock, cycling the ilock after each extent
shift is risky w.r.t. concurrent operations, similar to insert range.

To avoid this problem, make collapse range atomic with respect to
ilock. Hold the ilock across the entire operation, replace the
individual transactions with a single rolling transaction sequence
and finish dfops on each iteration to perform pending frees and roll
the transaction. Remove the unnecessary quota reservation as
collapse range can only ever merge extents (and thus remove extent
records and potentially free bmap blocks). The dfops call
automatically relogs the inode to keep it moving in the log. This
guarantees that nothing else can change the extent mapping of an
inode while a collapse range operation is in progress.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: rework insert range into an atomic operation
Brian Foster [Wed, 26 Oct 2022 06:28:19 +0000 (11:58 +0530)]
xfs: rework insert range into an atomic operation

commit 02ed3ac858a927a17611bccb4e25424684801502 upstream.

The insert range operation uses a unique transaction and ilock cycle
for the extent split and each extent shift iteration of the overall
operation. While this works, it is risks racing with other
operations in subtle ways such as COW writeback modifying an extent
tree in the middle of a shift operation.

To avoid this problem, make insert range atomic with respect to
ilock. Hold the ilock across the entire operation, replace the
individual transactions with a single rolling transaction sequence
and relog the inode to keep it moving in the log. This guarantees
that nothing else can change the extent mapping of an inode while
an insert range operation is in progress.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoxfs: open code insert range extent split helper
Brian Foster [Wed, 26 Oct 2022 06:28:18 +0000 (11:58 +0530)]
xfs: open code insert range extent split helper

commit 8c5191ceec35a761368b430aa4e05def9c21a16e upstream.

The insert range operation currently splits the extent at the target
offset in a separate transaction and lock cycle from the one that
shifts extents. In preparation for reworking insert range into an
atomic operation, lift the code into the caller so it can be easily
condensed to a single rolling transaction and lock cycle and
eliminate the helper. No functional changes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandan.babu@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoLinux 5.4.220
Greg Kroah-Hartman [Wed, 26 Oct 2022 11:23:01 +0000 (13:23 +0200)]
Linux 5.4.220

Link: https://lore.kernel.org/r/20221024113002.471093005@linuxfoundation.org
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Allen Pais <apais@linux.microsoft.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Slade Watkins <srw@sladewatkins.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agothermal: intel_powerclamp: Use first online CPU as control_cpu
Rafael J. Wysocki [Thu, 13 Oct 2022 12:50:28 +0000 (14:50 +0200)]
thermal: intel_powerclamp: Use first online CPU as control_cpu

commit 5a5b475e1d09c15287638d2ba8185b9501178208 upstream.

Commit a167c3ae55df ("thermal: intel_powerclamp: Use get_cpu() instead
of smp_processor_id() to avoid crash") fixed an issue related to using
smp_processor_id() in preemptible context by replacing it with a pair
of get_cpu()/put_cpu(), but what is needed there really is any online
CPU and not necessarily the one currently running the code.  Arguably,
getting the one that's running the code in there is confusing.

For this reason, simply give the control CPU role to the first online
one which automatically will be CPU0 if it is online, so one check
can be dropped from the code for an added benefit.

Link: https://lore.kernel.org/linux-pm/20221011113646.GA12080@duo.ucw.cz/
Fixes: a167c3ae55df ("thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to avoid crash")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoinet: fully convert sk->sk_rx_dst to RCU rules
Eric Dumazet [Mon, 20 Dec 2021 14:33:30 +0000 (06:33 -0800)]
inet: fully convert sk->sk_rx_dst to RCU rules

commit 6c4bcc7b5b2eed0dbeb9a194a086e1d15ff450b8 upstream.

syzbot reported various issues around early demux,
one being included in this changelog [1]

sk->sk_rx_dst is using RCU protection without clearly
documenting it.

And following sequences in tcp_v4_do_rcv()/tcp_v6_do_rcv()
are not following standard RCU rules.

[a]    dst_release(dst);
[b]    sk->sk_rx_dst = NULL;

They look wrong because a delete operation of RCU protected
pointer is supposed to clear the pointer before
the call_rcu()/synchronize_rcu() guarding actual memory freeing.

In some cases indeed, dst could be freed before [b] is done.

We could cheat by clearing sk_rx_dst before calling
dst_release(), but this seems the right time to stick
to standard RCU annotations and debugging facilities.

[1]
BUG: KASAN: use-after-free in dst_check include/net/dst.h:470 [inline]
BUG: KASAN: use-after-free in tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792
Read of size 2 at addr ffff88807f1cb73a by task syz-executor.5/9204

CPU: 0 PID: 9204 Comm: syz-executor.5 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description.constprop.0.cold+0x8d/0x320 mm/kasan/report.c:247
 __kasan_report mm/kasan/report.c:433 [inline]
 kasan_report.cold+0x83/0xdf mm/kasan/report.c:450
 dst_check include/net/dst.h:470 [inline]
 tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792
 ip_rcv_finish_core.constprop.0+0x15de/0x1e80 net/ipv4/ip_input.c:340
 ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583
 ip_sublist_rcv net/ipv4/ip_input.c:609 [inline]
 ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644
 __netif_receive_skb_list_ptype net/core/dev.c:5508 [inline]
 __netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556
 __netif_receive_skb_list net/core/dev.c:5608 [inline]
 netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699
 gro_normal_list net/core/dev.c:5853 [inline]
 gro_normal_list net/core/dev.c:5849 [inline]
 napi_complete_done+0x1f1/0x880 net/core/dev.c:6590
 virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline]
 virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557
 __napi_poll+0xaf/0x440 net/core/dev.c:7023
 napi_poll net/core/dev.c:7090 [inline]
 net_rx_action+0x801/0xb40 net/core/dev.c:7177
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
 invoke_softirq kernel/softirq.c:432 [inline]
 __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
 common_interrupt+0x52/0xc0 arch/x86/kernel/irq.c:240
 asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:629
RIP: 0033:0x7f5e972bfd57
Code: 39 d1 73 14 0f 1f 80 00 00 00 00 48 8b 50 f8 48 83 e8 08 48 39 ca 77 f3 48 39 c3 73 3e 48 89 13 48 8b 50 f8 48 89 38 49 8b 0e <48> 8b 3e 48 83 c3 08 48 83 c6 08 eb bc 48 39 d1 72 9e 48 39 d0 73
RSP: 002b:00007fff8a413210 EFLAGS: 00000283
RAX: 00007f5e97108990 RBX: 00007f5e97108338 RCX: ffffffff81d3aa45
RDX: ffffffff81d3aa45 RSI: 00007f5e97108340 RDI: ffffffff81d3aa45
RBP: 00007f5e97107eb8 R08: 00007f5e97108d88 R09: 0000000093c2e8d9
R10: 0000000000000000 R11: 0000000000000000 R12: 00007f5e97107eb0
R13: 00007f5e97108338 R14: 00007f5e97107ea8 R15: 0000000000000019
 </TASK>

Allocated by task 13:
 kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:434 [inline]
 __kasan_slab_alloc+0x90/0xc0 mm/kasan/common.c:467
 kasan_slab_alloc include/linux/kasan.h:259 [inline]
 slab_post_alloc_hook mm/slab.h:519 [inline]
 slab_alloc_node mm/slub.c:3234 [inline]
 slab_alloc mm/slub.c:3242 [inline]
 kmem_cache_alloc+0x202/0x3a0 mm/slub.c:3247
 dst_alloc+0x146/0x1f0 net/core/dst.c:92
 rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613
 ip_route_input_slow+0x1817/0x3a20 net/ipv4/route.c:2340
 ip_route_input_rcu net/ipv4/route.c:2470 [inline]
 ip_route_input_noref+0x116/0x2a0 net/ipv4/route.c:2415
 ip_rcv_finish_core.constprop.0+0x288/0x1e80 net/ipv4/ip_input.c:354
 ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583
 ip_sublist_rcv net/ipv4/ip_input.c:609 [inline]
 ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644
 __netif_receive_skb_list_ptype net/core/dev.c:5508 [inline]
 __netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556
 __netif_receive_skb_list net/core/dev.c:5608 [inline]
 netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699
 gro_normal_list net/core/dev.c:5853 [inline]
 gro_normal_list net/core/dev.c:5849 [inline]
 napi_complete_done+0x1f1/0x880 net/core/dev.c:6590
 virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline]
 virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557
 __napi_poll+0xaf/0x440 net/core/dev.c:7023
 napi_poll net/core/dev.c:7090 [inline]
 net_rx_action+0x801/0xb40 net/core/dev.c:7177
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558

Freed by task 13:
 kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
 kasan_set_track+0x21/0x30 mm/kasan/common.c:46
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370
 ____kasan_slab_free mm/kasan/common.c:366 [inline]
 ____kasan_slab_free mm/kasan/common.c:328 [inline]
 __kasan_slab_free+0xff/0x130 mm/kasan/common.c:374
 kasan_slab_free include/linux/kasan.h:235 [inline]
 slab_free_hook mm/slub.c:1723 [inline]
 slab_free_freelist_hook+0x8b/0x1c0 mm/slub.c:1749
 slab_free mm/slub.c:3513 [inline]
 kmem_cache_free+0xbd/0x5d0 mm/slub.c:3530
 dst_destroy+0x2d6/0x3f0 net/core/dst.c:127
 rcu_do_batch kernel/rcu/tree.c:2506 [inline]
 rcu_core+0x7ab/0x1470 kernel/rcu/tree.c:2741
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558

Last potentially related work creation:
 kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
 __kasan_record_aux_stack+0xf5/0x120 mm/kasan/generic.c:348
 __call_rcu kernel/rcu/tree.c:2985 [inline]
 call_rcu+0xb1/0x740 kernel/rcu/tree.c:3065
 dst_release net/core/dst.c:177 [inline]
 dst_release+0x79/0xe0 net/core/dst.c:167
 tcp_v4_do_rcv+0x612/0x8d0 net/ipv4/tcp_ipv4.c:1712
 sk_backlog_rcv include/net/sock.h:1030 [inline]
 __release_sock+0x134/0x3b0 net/core/sock.c:2768
 release_sock+0x54/0x1b0 net/core/sock.c:3300
 tcp_sendmsg+0x36/0x40 net/ipv4/tcp.c:1441
 inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:819
 sock_sendmsg_nosec net/socket.c:704 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:724
 sock_write_iter+0x289/0x3c0 net/socket.c:1057
 call_write_iter include/linux/fs.h:2162 [inline]
 new_sync_write+0x429/0x660 fs/read_write.c:503
 vfs_write+0x7cd/0xae0 fs/read_write.c:590
 ksys_write+0x1ee/0x250 fs/read_write.c:643
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

The buggy address belongs to the object at ffff88807f1cb700
 which belongs to the cache ip_dst_cache of size 176
The buggy address is located 58 bytes inside of
 176-byte region [ffff88807f1cb700ffff88807f1cb7b0)
The buggy address belongs to the page:
page:ffffea0001fc72c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7f1cb
flags: 0xfff00000000200(slab|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000200 dead000000000100 dead000000000122 ffff8881413bb780
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_HARDWALL), pid 5, ts 108466983062, free_ts 108048976062
 prep_new_page mm/page_alloc.c:2418 [inline]
 get_page_from_freelist+0xa72/0x2f50 mm/page_alloc.c:4149
 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5369
 alloc_pages+0x1a7/0x300 mm/mempolicy.c:2191
 alloc_slab_page mm/slub.c:1793 [inline]
 allocate_slab mm/slub.c:1930 [inline]
 new_slab+0x32d/0x4a0 mm/slub.c:1993
 ___slab_alloc+0x918/0xfe0 mm/slub.c:3022
 __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3109
 slab_alloc_node mm/slub.c:3200 [inline]
 slab_alloc mm/slub.c:3242 [inline]
 kmem_cache_alloc+0x35c/0x3a0 mm/slub.c:3247
 dst_alloc+0x146/0x1f0 net/core/dst.c:92
 rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613
 __mkroute_output net/ipv4/route.c:2564 [inline]
 ip_route_output_key_hash_rcu+0x921/0x2d00 net/ipv4/route.c:2791
 ip_route_output_key_hash+0x18b/0x300 net/ipv4/route.c:2619
 __ip_route_output_key include/net/route.h:126 [inline]
 ip_route_output_flow+0x23/0x150 net/ipv4/route.c:2850
 ip_route_output_key include/net/route.h:142 [inline]
 geneve_get_v4_rt+0x3a6/0x830 drivers/net/geneve.c:809
 geneve_xmit_skb drivers/net/geneve.c:899 [inline]
 geneve_xmit+0xc4a/0x3540 drivers/net/geneve.c:1082
 __netdev_start_xmit include/linux/netdevice.h:4994 [inline]
 netdev_start_xmit include/linux/netdevice.h:5008 [inline]
 xmit_one net/core/dev.c:3590 [inline]
 dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3606
 __dev_queue_xmit+0x299a/0x3650 net/core/dev.c:4229
page last free stack trace:
 reset_page_owner include/linux/page_owner.h:24 [inline]
 free_pages_prepare mm/page_alloc.c:1338 [inline]
 free_pcp_prepare+0x374/0x870 mm/page_alloc.c:1389
 free_unref_page_prepare mm/page_alloc.c:3309 [inline]
 free_unref_page+0x19/0x690 mm/page_alloc.c:3388
 qlink_free mm/kasan/quarantine.c:146 [inline]
 qlist_free_all+0x5a/0xc0 mm/kasan/quarantine.c:165
 kasan_quarantine_reduce+0x180/0x200 mm/kasan/quarantine.c:272
 __kasan_slab_alloc+0xa2/0xc0 mm/kasan/common.c:444
 kasan_slab_alloc include/linux/kasan.h:259 [inline]
 slab_post_alloc_hook mm/slab.h:519 [inline]
 slab_alloc_node mm/slub.c:3234 [inline]
 kmem_cache_alloc_node+0x255/0x3f0 mm/slub.c:3270
 __alloc_skb+0x215/0x340 net/core/skbuff.c:414
 alloc_skb include/linux/skbuff.h:1126 [inline]
 alloc_skb_with_frags+0x93/0x620 net/core/skbuff.c:6078
 sock_alloc_send_pskb+0x783/0x910 net/core/sock.c:2575
 mld_newpack+0x1df/0x770 net/ipv6/mcast.c:1754
 add_grhead+0x265/0x330 net/ipv6/mcast.c:1857
 add_grec+0x1053/0x14e0 net/ipv6/mcast.c:1995
 mld_send_initial_cr.part.0+0xf6/0x230 net/ipv6/mcast.c:2242
 mld_send_initial_cr net/ipv6/mcast.c:1232 [inline]
 mld_dad_work+0x1d3/0x690 net/ipv6/mcast.c:2268
 process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
 worker_thread+0x658/0x11f0 kernel/workqueue.c:2445

Memory state around the buggy address:
 ffff88807f1cb600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88807f1cb680: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
>ffff88807f1cb700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                        ^
 ffff88807f1cb780: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
 ffff88807f1cb800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 6be1703d0c0f ("ipv4: Early TCP socket demux.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211220143330.680945-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[cmllamas: fixed trivial merge conflict]
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoefi: libstub: drop pointless get_memory_map() call
Ard Biesheuvel [Thu, 15 Sep 2022 17:00:24 +0000 (19:00 +0200)]
efi: libstub: drop pointless get_memory_map() call

commit a675f3657ecfcbd7609f41615e954b2d96dc52d3 upstream.

Currently, the non-x86 stub code calls get_memory_map() redundantly,
given that the data it returns is never used anywhere. So drop the call.

Cc: <stable@vger.kernel.org> # v4.14+
Fixes: 89d905412341 ("efi/arm-stub: Round up FDT allocation to mapping size")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agomd: Replace snprintf with scnprintf
Saurabh Sengar [Tue, 23 Aug 2022 18:51:04 +0000 (11:51 -0700)]
md: Replace snprintf with scnprintf

commit 647ad8c906051095bea32508a8eee941dc5bb41a upstream.

Current code produces a warning as shown below when total characters
in the constituent block device names plus the slashes exceeds 200.
snprintf() returns the number of characters generated from the given
input, which could cause the expression “200 – len” to wrap around
to a large positive number. Fix this by using scnprintf() instead,
which returns the actual number of characters written into the buffer.

[ 1513.267938] ------------[ cut here ]------------
[ 1513.267943] WARNING: CPU: 15 PID: 37247 at <snip>/lib/vsprintf.c:2509 vsnprintf+0x2c8/0x510
[ 1513.267944] Modules linked in:  <snip>
[ 1513.267969] CPU: 15 PID: 37247 Comm: mdadm Not tainted 5.4.0-1085-azure #90~18.04.1-Ubuntu
[ 1513.267969] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 05/09/2022
[ 1513.267971] RIP: 0010:vsnprintf+0x2c8/0x510
<-snip->
[ 1513.267982] Call Trace:
[ 1513.267986]  snprintf+0x45/0x70
[ 1513.267990]  ? disk_name+0x71/0xa0
[ 1513.267993]  dump_zones+0x114/0x240 [raid0]
[ 1513.267996]  ? _cond_resched+0x19/0x40
[ 1513.267998]  raid0_run+0x19e/0x270 [raid0]
[ 1513.268000]  md_run+0x5e0/0xc50
[ 1513.268003]  ? security_capable+0x3f/0x60
[ 1513.268005]  do_md_run+0x19/0x110
[ 1513.268006]  md_ioctl+0x195e/0x1f90
[ 1513.268007]  blkdev_ioctl+0x91f/0x9f0
[ 1513.268010]  block_ioctl+0x3d/0x50
[ 1513.268012]  do_vfs_ioctl+0xa9/0x640
[ 1513.268014]  ? __fput+0x162/0x260
[ 1513.268016]  ksys_ioctl+0x75/0x80
[ 1513.268017]  __x64_sys_ioctl+0x1a/0x20
[ 1513.268019]  do_syscall_64+0x5e/0x200
[ 1513.268021]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: ee8fa3f8d1359 ("md/raid0: replace printk() with pr_*()")
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoext4: continue to expand file system when the target size doesn't reach
Jerry Lee 李修賢 [Mon, 18 Jul 2022 10:25:19 +0000 (10:25 +0000)]
ext4: continue to expand file system when the target size doesn't reach

commit 8224229e1cf223e43798798977b45a55afde884b upstream.

When expanding a file system from (16TiB-2MiB) to 18TiB, the operation
exits early which leads to result inconsistency between resize2fs and
Ext4 kernel driver.

=== before ===
○ → resize2fs /dev/mapper/thin
resize2fs 1.45.5 (07-Jan-2020)
Filesystem at /dev/mapper/thin is mounted on /mnt/test; on-line resizing required
old_desc_blocks = 2048, new_desc_blocks = 2304
The filesystem on /dev/mapper/thin is now 4831837696 (4k) blocks long.

[  865.186308] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[  912.091502] dm-4: detected capacity change from 34359738368 to 38654705664
[  970.030550] dm-5: detected capacity change from 34359734272 to 38654701568
[ 1000.012751] EXT4-fs (dm-5): resizing filesystem from 4294966784 to 4831837696 blocks
[ 1000.012878] EXT4-fs (dm-5): resized filesystem to 4294967296

=== after ===
[  129.104898] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[  143.773630] dm-4: detected capacity change from 34359738368 to 38654705664
[  198.203246] dm-5: detected capacity change from 34359734272 to 38654701568
[  207.918603] EXT4-fs (dm-5): resizing filesystem from 4294966784 to 4831837696 blocks
[  207.918754] EXT4-fs (dm-5): resizing filesystem from 4294967296 to 4831837696 blocks
[  207.918758] EXT4-fs (dm-5): Converting file system to meta_bg
[  207.918790] EXT4-fs (dm-5): resizing filesystem from 4294967296 to 4831837696 blocks
[  221.454050] EXT4-fs (dm-5): resized to 4658298880 blocks
[  227.634613] EXT4-fs (dm-5): resized filesystem to 4831837696

Signed-off-by: Jerry Lee <jerrylee@qnap.com>
Link: https://lore.kernel.org/r/PU1PR04MB22635E739BD21150DC182AC6A18C9@PU1PR04MB2263.apcprd04.prod.outlook.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agonet/ieee802154: don't warn zero-sized raw_sendmsg()
Tetsuo Handa [Wed, 5 Oct 2022 01:47:50 +0000 (21:47 -0400)]
net/ieee802154: don't warn zero-sized raw_sendmsg()

[ Upstream commit 14cf495c44f79602f9123894ff6adcc1ea95d9ea ]

syzbot is hitting skb_assert_len() warning at __dev_queue_xmit() [1],
for PF_IEEE802154 socket's zero-sized raw_sendmsg() request is hitting
__dev_queue_xmit() with skb->len == 0.

Since PF_IEEE802154 socket's zero-sized raw_sendmsg() request was
able to return 0, don't call __dev_queue_xmit() if packet length is 0.

  ----------
  #include <sys/socket.h>
  #include <netinet/in.h>

  int main(int argc, char *argv[])
  {
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_addr.s_addr = htonl(INADDR_LOOPBACK) };
    struct iovec iov = { };
    struct msghdr hdr = { .msg_name = &addr, .msg_namelen = sizeof(addr), .msg_iov = &iov, .msg_iovlen = 1 };
    sendmsg(socket(PF_IEEE802154, SOCK_RAW, 0), &hdr, 0);
    return 0;
  }
  ----------

Note that this might be a sign that commit 214de3de5b084df0 ("bpf: Don't
redirect packets with invalid pkt_len") should be reverted, for
skb->len == 0 was acceptable for at least PF_IEEE802154 socket.

Link: https://syzkaller.appspot.com/bug?extid=5ea725c25d06fb9114c4
Reported-by: syzbot <syzbot+5ea725c25d06fb9114c4@syzkaller.appspotmail.com>
Fixes: 214de3de5b084df0 ("bpf: Don't redirect packets with invalid pkt_len")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20221005014750.3685555-2-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoRevert "net/ieee802154: reject zero-sized raw_sendmsg()"
Alexander Aring [Wed, 5 Oct 2022 01:47:49 +0000 (21:47 -0400)]
Revert "net/ieee802154: reject zero-sized raw_sendmsg()"

[ Upstream commit 7ea0dcd2b91a18944403defc561bd830ab0dd042 ]

This reverts commit ae6d6e494ba8f21fc08d2bbb7bad972a2ad5fe2f.

There is a v2 which does return zero if zero length is given.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20221005014750.3685555-1-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonet: ieee802154: return -EINVAL for unknown addr type
Alexander Aring [Thu, 6 Oct 2022 02:02:37 +0000 (22:02 -0400)]
net: ieee802154: return -EINVAL for unknown addr type

commit d2b6663db7651144fa04838e83829337b911a894 upstream.

This patch adds handling to return -EINVAL for an unknown addr type. The
current behaviour is to return 0 as successful but the size of an
unknown addr type is not defined and should return an error like -EINVAL.

Fixes: 891afbcca1ec ("net/ieee802154: fix uninit value bug in dgram_sendmsg")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoio_uring/af_unix: defer registered files gc to io_uring release
Pavel Begunkov [Sun, 16 Oct 2022 23:03:11 +0000 (00:03 +0100)]
io_uring/af_unix: defer registered files gc to io_uring release

[ upstream commit 2b0d241d566fa268fdb1511c2b0b1e1ef0447479 ]

Instead of putting io_uring's registered files in unix_gc() we want it
to be done by io_uring itself. The trick here is to consider io_uring
registered files for cycle detection but not actually putting them down.
Because io_uring can't register other ring instances, this will remove
all refs to the ring file triggering the ->release path and clean up
with io_ring_ctx_free().

Cc: stable@vger.kernel.org
Fixes: c90c9784c76b ("io_uring: add file set registration")
Reported-and-tested-by: David Bouman <dbouman03@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
[axboe: add kerneldoc comment to skb, fold in skb leak fix]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoperf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
Adrian Hunter [Wed, 12 Oct 2022 08:22:58 +0000 (11:22 +0300)]
perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc

commit 47286d93af9832fb92b9160794aa367153112f80 upstream.

uClibc segfaulted because NULL was passed as the format to fprintf().

That happened because one of the format strings was missing and
intel_pt_print_info() didn't check that before calling fprintf().

Add the missing format string, and check format is not NULL before calling
fprintf().

Fixes: 3ea2f9522652918a ("perf tools: Pass Intel PT information for decoding MTC and CYC")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221012082259.22394-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoclk: bcm2835: Make peripheral PLLC critical
Maxime Ripard [Mon, 26 Sep 2022 08:45:09 +0000 (10:45 +0200)]
clk: bcm2835: Make peripheral PLLC critical

[ Upstream commit ee487b9c7ae6aea8f2e37b91d640e6ce472d5123 ]

When testing for a series affecting the VEC, it was discovered that
turning off and on the VEC clock is crashing the system.

It turns out that, when disabling the VEC clock, it's the only child of
the PLLC-per clock which will also get disabled. The source of the crash
is PLLC-per being disabled.

It's likely that some other device might not take a clock reference that
it actually needs, but it's unclear which at this point. Let's make
PLLC-per critical so that we don't have that crash.

Reported-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20220926084509.12233-1-maxime@cerno.tech
Reviewed-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agousb: idmouse: fix an uninit-value in idmouse_open
Dongliang Mu [Thu, 22 Sep 2022 13:48:44 +0000 (21:48 +0800)]
usb: idmouse: fix an uninit-value in idmouse_open

[ Upstream commit f682e9fd27a656e52cdd62364fd71fbbc2d56c0d ]

In idmouse_create_image, if any ftip_command fails, it will
go to the reset label. However, this leads to the data in
bulk_in_buffer[HEADER..IMGSIZE] uninitialized. And the check
for valid image incurs an uninitialized dereference.

Fix this by moving the check before reset label since this
check only be valid if the data after bulk_in_buffer[HEADER]
has concrete data.

Note that this is found by KMSAN, so only kernel compilation
is tested.

Reported-by: syzbot+79832d33eb89fb3cd092@syzkaller.appspotmail.com
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Link: https://lore.kernel.org/r/20220922134847.1101921-1-dzm91@hust.edu.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonvmet-tcp: add bounds check on Transfer Tag
Varun Prakash [Tue, 20 Sep 2022 18:36:49 +0000 (00:06 +0530)]
nvmet-tcp: add bounds check on Transfer Tag

[ Upstream commit c685d197247be54b291fde68b2888e05388bfe24 ]

ttag is used as an index to get cmd in nvmet_tcp_handle_h2c_data_pdu(),
add a bounds check to avoid out-of-bounds access.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonvme: copy firmware_rev on each init
Keith Busch [Mon, 19 Sep 2022 19:45:08 +0000 (12:45 -0700)]
nvme: copy firmware_rev on each init

[ Upstream commit 3ae31dc031c88e53580f2954781ab5137d454226 ]

The firmware revision can change on after a reset so copy the most
recent info each time instead of just the first time, otherwise the
sysfs firmware_rev entry may contain stale data.

Reported-by: Jeff Lien <jeff.lien@wdc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Chao Leng <lengchao@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agostaging: rtl8723bs: fix a potential memory leak in rtw_init_cmd_priv()
Xiaoke Wang [Fri, 9 Sep 2022 11:27:21 +0000 (19:27 +0800)]
staging: rtl8723bs: fix a potential memory leak in rtw_init_cmd_priv()

[ Upstream commit c6ae3f73e44b4d973300fd9828071eda1dbc313f ]

In rtw_init_cmd_priv(), if `pcmdpriv->rsp_allocated_buf` is allocated
in failure, then `pcmdpriv->cmd_allocated_buf` will be not properly
released. Besides, considering there are only two error paths and the
first one can directly return, so we do not need implicitly jump to the
`exit` tag to execute the error handler.

So this patch added `kfree(pcmdpriv->cmd_allocated_buf);` on the error
path to release the resource and simplified the return logic of
rtw_init_cmd_priv(). As there is no proper device to test with, no runtime
testing was performed.

Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com>
Link: https://lore.kernel.org/r/tencent_2B7931B79BA38E22205C5A09EFDF11E48805@qq.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoRevert "usb: storage: Add quirk for Samsung Fit flash"
sunghwan jung [Tue, 13 Sep 2022 11:49:13 +0000 (20:49 +0900)]
Revert "usb: storage: Add quirk for Samsung Fit flash"

[ Upstream commit db02a0e0c59d48d7931bc3c51e983923094a4f65 ]

This reverts commit 22eff26a239272ab997b753e950202c20c2ab264,
which fix the timeout issue for "Samsung Fit Flash".

But the commit affects not only "Samsung Fit Flash" but also other usb
storages that use the same controller and causes severe performance
regression.

 # hdparm -t /dev/sda (without the quirk)
 Timing buffered disk reads: 622 MB in  3.01 seconds = 206.66 MB/sec

 # hdparm -t /dev/sda (with the quirk)
 Timing buffered disk reads: 220 MB in  3.00 seconds =  73.32 MB/sec

The commit author mentioned that "Issue was reproduced after device has
bad block", so this quirk should be applied when we have the timeout
issue with a device that has bad blocks.

We revert the commit so that we apply this quirk by adding kernel
paramters using a bootloader or other ways when we really need it,
without the performance regression with devices that don't have the
issue.

Signed-off-by: sunghwan jung <onenowy@gmail.com>
Link: https://lore.kernel.org/r/20220913114913.3073-1-onenowy@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agousb: musb: Fix musb_gadget.c rxstate overflow bug
Robin Guo [Tue, 6 Sep 2022 02:21:19 +0000 (10:21 +0800)]
usb: musb: Fix musb_gadget.c rxstate overflow bug

[ Upstream commit 223adfe78026ef7ed0482e89020496b75c200f42 ]

The usb function device call musb_gadget_queue() adds the passed
request to musb_ep::req_list,If the (request->length > musb_ep->packet_sz)
and (is_buffer_mapped(req) return false),the rxstate() will copy all data
in fifo to request->buf which may cause request->buf out of bounds.

Fix it by add the length check :
fifocnt = min_t(unsigned, request->length - request->actual, fifocnt);

Signed-off-by: Robin Guo <guoweibin@inspur.com>
Link: https://lore.kernel.org/r/20220906102119.1b071d07a8391ff115e6d1ef@inspur.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agousb: host: xhci: Fix potential memory leak in xhci_alloc_stream_info()
Jianglei Nie [Wed, 21 Sep 2022 12:34:45 +0000 (15:34 +0300)]
usb: host: xhci: Fix potential memory leak in xhci_alloc_stream_info()

[ Upstream commit be592a8435bebf4de02d1b6e602b6825eb6d134e ]

xhci_alloc_stream_info() allocates stream context array for stream_info
->stream_ctx_array with xhci_alloc_stream_ctx(). When some error occurs,
stream_info->stream_ctx_array is not released, which will lead to a
memory leak.

We can fix it by releasing the stream_info->stream_ctx_array with
xhci_free_stream_ctx() on the error path to avoid the potential memory
leak.

Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20220921123450.671459-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agomd/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d
Logan Gunthorpe [Wed, 21 Sep 2022 16:28:37 +0000 (10:28 -0600)]
md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d

[ Upstream commit 131927865b56ffc1e0297f6333dd4418c850ad0e ]

A complicated deadlock exists when using the journal and an elevated
group_thrtead_cnt. It was found with loop devices, but its not clear
whether it can be seen with real disks. The deadlock can occur simply
by writing data with an fio script.

When the deadlock occurs, multiple threads will hang in different ways:

 1) The group threads will hang in the blk-wbt code with bios waiting to
    be submitted to the block layer:

        io_schedule+0x70/0xb0
        rq_qos_wait+0x153/0x210
        wbt_wait+0x115/0x1b0
        io_schedule+0x70/0xb0
        rq_qos_wait+0x153/0x210
        wbt_wait+0x115/0x1b0
        __rq_qos_throttle+0x38/0x60
        blk_mq_submit_bio+0x589/0xcd0
        wbt_wait+0x115/0x1b0
        __rq_qos_throttle+0x38/0x60
        blk_mq_submit_bio+0x589/0xcd0
        __submit_bio+0xe6/0x100
        submit_bio_noacct_nocheck+0x42e/0x470
        submit_bio_noacct+0x4c2/0xbb0
        ops_run_io+0x46b/0x1a30
        handle_stripe+0xcd3/0x36b0
        handle_active_stripes.constprop.0+0x6f6/0xa60
        raid5_do_work+0x177/0x330

    Or:
        io_schedule+0x70/0xb0
        rq_qos_wait+0x153/0x210
        wbt_wait+0x115/0x1b0
        __rq_qos_throttle+0x38/0x60
        blk_mq_submit_bio+0x589/0xcd0
        __submit_bio+0xe6/0x100
        submit_bio_noacct_nocheck+0x42e/0x470
        submit_bio_noacct+0x4c2/0xbb0
        flush_deferred_bios+0x136/0x170
        raid5_do_work+0x262/0x330

 2) The r5l_reclaim thread will hang in the same way, submitting a
    bio to the block layer:

        io_schedule+0x70/0xb0
        rq_qos_wait+0x153/0x210
        wbt_wait+0x115/0x1b0
        __rq_qos_throttle+0x38/0x60
        blk_mq_submit_bio+0x589/0xcd0
        __submit_bio+0xe6/0x100
        submit_bio_noacct_nocheck+0x42e/0x470
        submit_bio_noacct+0x4c2/0xbb0
        submit_bio+0x3f/0xf0
        md_super_write+0x12f/0x1b0
        md_update_sb.part.0+0x7c6/0xff0
        md_update_sb+0x30/0x60
        r5l_do_reclaim+0x4f9/0x5e0
        r5l_reclaim_thread+0x69/0x30b

    However, before hanging, the MD_SB_CHANGE_PENDING flag will be
    set for sb_flags in r5l_write_super_and_discard_space(). This
    flag will never be cleared because the submit_bio() call never
    returns.

 3) Due to the MD_SB_CHANGE_PENDING flag being set, handle_stripe()
    will do no processing on any pending stripes and re-set
    STRIPE_HANDLE. This will cause the raid5d thread to enter an
    infinite loop, constantly trying to handle the same stripes
    stuck in the queue.

    The raid5d thread has a blk_plug that holds a number of bios
    that are also stuck waiting seeing the thread is in a loop
    that never schedules. These bios have been accounted for by
    blk-wbt thus preventing the other threads above from
    continuing when they try to submit bios. --Deadlock.

To fix this, add the same wait_event() that is used in raid5_do_work()
to raid5d() such that if MD_SB_CHANGE_PENDING is set, the thread will
schedule and wait until the flag is cleared. The schedule action will
flush the plug which will allow the r5l_reclaim thread to continue,
thus preventing the deadlock.

However, md_check_recovery() calls can also clear MD_SB_CHANGE_PENDING
from the same thread and can thus deadlock if the thread is put to
sleep. So avoid waiting if md_check_recovery() is being called in the
loop.

It's not clear when the deadlock was introduced, but the similar
wait_event() call in raid5_do_work() was added in 2017 by this
commit:

    d285b16c5567 ("md/raid5: simplfy delaying of writes while metadata
                   is updated.")

Link: https://lore.kernel.org/r/7f3b87b6-b52a-f737-51d7-a4eec5c44112@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoHID: roccat: Fix use-after-free in roccat_read()
Hyunwoo Kim [Sun, 4 Sep 2022 19:31:15 +0000 (12:31 -0700)]
HID: roccat: Fix use-after-free in roccat_read()

[ Upstream commit 01230658fe7a1af5271205b0b0a4900c3d75bda5 ]

roccat_report_event() is responsible for registering
roccat-related reports in struct roccat_device.

int roccat_report_event(int minor, u8 const *data)
{
struct roccat_device *device;
struct roccat_reader *reader;
struct roccat_report *report;
uint8_t *new_value;

device = devices[minor];

new_value = kmemdup(data, device->report_size, GFP_ATOMIC);
if (!new_value)
return -ENOMEM;

report = &device->cbuf[device->cbuf_end];

/* passing NULL is safe */
kfree(report->value);
...

The registered report is stored in the struct roccat_device member
"struct roccat_report cbuf[ROCCAT_CBUF_SIZE];".
If more reports are received than the "ROCCAT_CBUF_SIZE" value,
kfree() the saved report from cbuf[0] and allocates a new reprot.
Since there is no lock when this kfree() is performed,
kfree() can be performed even while reading the saved report.

static ssize_t roccat_read(struct file *file, char __user *buffer,
size_t count, loff_t *ppos)
{
struct roccat_reader *reader = file->private_data;
struct roccat_device *device = reader->device;
struct roccat_report *report;
ssize_t retval = 0, len;
DECLARE_WAITQUEUE(wait, current);

mutex_lock(&device->cbuf_lock);

...

report = &device->cbuf[reader->cbuf_start];
/*
 * If report is larger than requested amount of data, rest of report
 * is lost!
 */
len = device->report_size > count ? count : device->report_size;

if (copy_to_user(buffer, report->value, len)) {
retval = -EFAULT;
goto exit_unlock;
}
...

The roccat_read() function receives the device->cbuf report and
delivers it to the user through copy_to_user().
If the N+ROCCAT_CBUF_SIZE th report is received while copying of
the Nth report->value is in progress, the pointer that copy_to_user()
is working on is kfree()ed and UAF read may occur. (race condition)

Since the device node of this driver does not set separate permissions,
this is not a security vulnerability, but because it is used for
requesting screen display of profile or dpi settings,
a user using the roccat device can apply udev to this device node or
There is a possibility to use it by giving.

Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agobcache: fix set_at_max_writeback_rate() for multiple attached devices
Coly Li [Mon, 19 Sep 2022 16:16:47 +0000 (00:16 +0800)]
bcache: fix set_at_max_writeback_rate() for multiple attached devices

[ Upstream commit 5918167b0ab296afdc556fa1809870b27c8a0018 ]

Inside set_at_max_writeback_rate() the calculation in following if()
check is wrong,
if (atomic_inc_return(&c->idle_counter) <
    atomic_read(&c->attached_dev_nr) * 6)

Because each attached backing device has its own writeback thread
running and increasing c->idle_counter, the counter increates much
faster than expected. The correct calculation should be,
(counter / dev_nr) < dev_nr * 6
which equals to,
counter < dev_nr * dev_nr * 6

This patch fixes the above mistake with correct calculation, and helper
routine idle_counter_exceeded() is added to make code be more clear.

Reported-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
Signed-off-by: Coly Li <colyli@suse.de>
Acked-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
Link: https://lore.kernel.org/r/20220919161647.81238-6-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoata: libahci_platform: Sanity check the DT child nodes number
Baikal Electronics [Fri, 9 Sep 2022 19:36:06 +0000 (22:36 +0300)]
ata: libahci_platform: Sanity check the DT child nodes number

[ Upstream commit 2a2721432e45c3fe6a6a422ca1b4c979042d2565 ]

Having greater than AHCI_MAX_PORTS (32) ports detected isn't that critical
from the further AHCI-platform initialization point of view since
exceeding the ports upper limit will cause allocating more resources than
will be used afterwards. But detecting too many child DT-nodes doesn't
seem right since it's very unlikely to have it on an ordinary platform. In
accordance with the AHCI specification there can't be more than 32 ports
implemented at least due to having the CAP.NP field of 5 bits wide and the
PI register of dword size. Thus if such situation is found the DTB must
have been corrupted and the data read from it shouldn't be reliable. Let's
consider that as an erroneous situation and halt further resources
allocation.

Note it's logically more correct to have the nports set only after the
initialization value is checked for being sane. So while at it let's make
sure nports is assigned with a correct value.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agostaging: vt6655: fix potential memory leak
Nam Cao [Fri, 9 Sep 2022 14:13:39 +0000 (16:13 +0200)]
staging: vt6655: fix potential memory leak

[ Upstream commit e704dc8a370ce3f8b5b96f327f811f965785620e ]

In function device_init_td0_ring, memory is allocated for member
td_info of priv->apTD0Rings[i], with i increasing from 0. In case of
allocation failure, the memory is freed in reversed order, with i
decreasing to 0. However, the case i=0 is left out and thus memory is
leaked.

Modify the memory freeing loop to include the case i=0.

Tested-by: Philipp Hortmann <philipp.g.hortmann@gmail.com>
Signed-off-by: Nam Cao <namcaov@gmail.com>
Link: https://lore.kernel.org/r/20220909141338.19343-1-namcaov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agopower: supply: adp5061: fix out-of-bounds read in adp5061_get_chg_type()
Wei Yongjun [Sat, 27 Aug 2022 07:32:23 +0000 (07:32 +0000)]
power: supply: adp5061: fix out-of-bounds read in adp5061_get_chg_type()

[ Upstream commit 850a7bca9f8278c9b3913e56d8acbe1ab14af08a ]

ADP5061_CHG_STATUS_1_CHG_STATUS is masked with 0x07, which means a length
of 8, but adp5061_chg_type array size is 4, may end up reading 4 elements
beyond the end of the adp5061_chg_type[] array.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agonbd: Fix hung when signal interrupts nbd_start_device_ioctl()
Shigeru Yoshida [Wed, 7 Sep 2022 16:35:02 +0000 (01:35 +0900)]
nbd: Fix hung when signal interrupts nbd_start_device_ioctl()

[ Upstream commit 07595c646deca494ca8ffb9bc325ddb8e52fbac9 ]

syzbot reported hung task [1].  The following program is a simplified
version of the reproducer:

int main(void)
{
int sv[2], fd;

if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
return 1;
if ((fd = open("/dev/nbd0", 0)) < 0)
return 1;
if (ioctl(fd, NBD_SET_SIZE_BLOCKS, 0x81) < 0)
return 1;
if (ioctl(fd, NBD_SET_SOCK, sv[0]) < 0)
return 1;
if (ioctl(fd, NBD_DO_IT) < 0)
return 1;
return 0;
}

When signal interrupt nbd_start_device_ioctl() waiting the condition
atomic_read(&config->recv_threads) == 0, the task can hung because it
waits the completion of the inflight IOs.

This patch fixes the issue by clearing queue, not just shutdown, when
signal interrupt nbd_start_device_ioctl().

Link: https://syzkaller.appspot.com/bug?id=7d89a3ffacd2b83fdd39549bc4d8e0a89ef21239
Reported-by: syzbot+38e6c55d4969a14c1534@syzkaller.appspotmail.com
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20220907163502.577561-1-syoshida@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoscsi: 3w-9xxx: Avoid disabling device if failing to enable it
Letu Ren [Mon, 29 Aug 2022 11:01:15 +0000 (19:01 +0800)]
scsi: 3w-9xxx: Avoid disabling device if failing to enable it

[ Upstream commit ef222f4f63e9ce97d371ecd0f2fd2059a7b7f64c ]

The original code will "goto out_disable_device" and call
pci_disable_device() if pci_enable_device() fails. The kernel will generate
a warning message like "3w-9xxx 0000:00:05.0: disabling already-disabled
device".

We shouldn't disable a device that failed to be enabled. A simple return is
fine.

Link: https://lore.kernel.org/r/20220829110115.38789-1-fantasquex@gmail.com
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Letu Ren <fantasquex@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoclk: zynqmp: pll: rectify rate rounding in zynqmp_pll_round_rate
Quanyang Wang [Fri, 26 Aug 2022 14:20:30 +0000 (22:20 +0800)]
clk: zynqmp: pll: rectify rate rounding in zynqmp_pll_round_rate

[ Upstream commit d1eb71187c444ca81116edbf28a4f71889cd3d11 ]

The function zynqmp_pll_round_rate is used to find a most appropriate
PLL frequency which the hardware can generate according to the desired
frequency. For example, if the desired frequency is 297MHz, considering
the limited range from PS_PLL_VCO_MIN (1.5GHz) to PS_PLL_VCO_MAX (3.0GHz)
of PLL, zynqmp_pll_round_rate should return 1.872GHz (297MHz * 5).

There are two problems with the current code of zynqmp_pll_round_rate:

1) When the rate is below PS_PLL_VCO_MIN, it can't find a correct rate
when the parameter "rate" is an integer multiple of *prate, in other words,
if "f" is zero, zynqmp_pll_round_rate won't return a valid frequency which
is from PS_PLL_VCO_MIN to PS_PLL_VCO_MAX. For example, *prate is 33MHz
and the rate is 660MHz, zynqmp_pll_round_rate will not boost up rate and
just return 660MHz, and this will cause clk_calc_new_rates failure since
zynqmp_pll_round_rate returns an invalid rate out of its boundaries.

2) Even if the rate is higher than PS_PLL_VCO_MIN, there is still a risk
that zynqmp_pll_round_rate returns an invalid rate because the function
DIV_ROUND_CLOSEST makes some loss in the fractional part. If the parent
clock *prate is 33333333Hz and we want to set the PLL rate to 1.5GHz,
this function will return 1499999985Hz by using the formula below:
    value = *prate * DIV_ROUND_CLOSEST(rate, *prate)).
This value is also invalid since it's slightly smaller than PS_PLL_VCO_MIN.
because DIV_ROUND_CLOSEST makes some loss in the fractional part.

Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
Link: https://lore.kernel.org/r/20220826142030.213805-1-quanyang.wang@windriver.com
Reviewed-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agomedia: cx88: Fix a null-ptr-deref bug in buffer_prepare()
Zheyu Ma [Thu, 28 Jul 2022 02:23:38 +0000 (04:23 +0200)]
media: cx88: Fix a null-ptr-deref bug in buffer_prepare()

[ Upstream commit c9360be25d30cc6d9d4a9c66331ec95df483aa0b ]

When the driver calls cx88_risc_buffer() to prepare the buffer, the
function call may fail, resulting in a empty buffer and null-ptr-deref
later in buffer_queue().

The following log can reveal it:

[   41.822762] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
[   41.824488] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[   41.828027] RIP: 0010:buffer_queue+0xc2/0x500
[   41.836311] Call Trace:
[   41.836945]  __enqueue_in_driver+0x141/0x360
[   41.837262]  vb2_start_streaming+0x62/0x4a0
[   41.838216]  vb2_core_streamon+0x1da/0x2c0
[   41.838516]  __vb2_init_fileio+0x981/0xbc0
[   41.839141]  __vb2_perform_fileio+0xbf9/0x1120
[   41.840072]  vb2_fop_read+0x20e/0x400
[   41.840346]  v4l2_read+0x215/0x290
[   41.840603]  vfs_read+0x162/0x4c0

Fix this by checking the return value of cx88_risc_buffer()

[hverkuil: fix coding style issues]

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoclk: zynqmp: Fix stack-out-of-bounds in strncpy`
Ian Nam [Tue, 10 May 2022 07:01:54 +0000 (12:31 +0530)]
clk: zynqmp: Fix stack-out-of-bounds in strncpy`

[ Upstream commit 90b8ffbcc434be3309ab222571a9885562549422 ]

"BUG: KASAN: stack-out-of-bounds in strncpy+0x30/0x68"

Linux-ATF interface is using 16 bytes of SMC payload. In case clock name is
longer than 15 bytes, string terminated NULL character will not be received
by Linux. Add explicit NULL character at last byte to fix issues when clock
name is longer.

This fixes below bug reported by KASAN:

 ==================================================================
 BUG: KASAN: stack-out-of-bounds in strncpy+0x30/0x68
 Read of size 1 at addr ffff0008c89a7410 by task swapper/0/1

 CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.4.0-00396-g81ef9e7-dirty #3
 Hardware name: Xilinx Versal vck190 Eval board revA (QSPI) (DT)
 Call trace:
  dump_backtrace+0x0/0x1e8
  show_stack+0x14/0x20
  dump_stack+0xd4/0x108
  print_address_description.isra.0+0xbc/0x37c
  __kasan_report+0x144/0x198
  kasan_report+0xc/0x18
  __asan_load1+0x5c/0x68
  strncpy+0x30/0x68
  zynqmp_clock_probe+0x238/0x7b8
  platform_drv_probe+0x6c/0xc8
  really_probe+0x14c/0x418
  driver_probe_device+0x74/0x130
  __device_attach_driver+0xc4/0xe8
  bus_for_each_drv+0xec/0x150
  __device_attach+0x160/0x1d8
  device_initial_probe+0x10/0x18
  bus_probe_device+0xe0/0xf0
  device_add+0x528/0x950
  of_device_add+0x5c/0x80
  of_platform_device_create_pdata+0x120/0x168
  of_platform_bus_create+0x244/0x4e0
  of_platform_populate+0x50/0xe8
  zynqmp_firmware_probe+0x370/0x3a8
  platform_drv_probe+0x6c/0xc8
  really_probe+0x14c/0x418
  driver_probe_device+0x74/0x130
  device_driver_attach+0x94/0xa0
  __driver_attach+0x70/0x108
  bus_for_each_dev+0xe4/0x158
  driver_attach+0x30/0x40
  bus_add_driver+0x21c/0x2b8
  driver_register+0xbc/0x1d0
  __platform_driver_register+0x7c/0x88
  zynqmp_firmware_driver_init+0x1c/0x24
  do_one_initcall+0xa4/0x234
  kernel_init_freeable+0x1b0/0x24c
  kernel_init+0x10/0x110
  ret_from_fork+0x10/0x18

 The buggy address belongs to the page:
 page:ffff0008f9be1c88 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
 raw: 0008d00000000000 ffff0008f9be1c90 ffff0008f9be1c90 0000000000000000
 raw: 0000000000000000 0000000000000000 00000000ffffffff
 page dumped because: kasan: bad access detected

 addr ffff0008c89a7410 is located in stack of task swapper/0/1 at offset 112 in frame:
  zynqmp_clock_probe+0x0/0x7b8

 this frame has 3 objects:
  [32, 44) 'response'
  [64, 80) 'ret_payload'
  [96, 112) 'name'

 Memory state around the buggy address:
  ffff0008c89a7300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff0008c89a7380: 00 00 00 00 f1 f1 f1 f1 00 04 f2 f2 00 00 f2 f2
 >ffff0008c89a7400: 00 00 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
                          ^
  ffff0008c89a7480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff0008c89a7500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ==================================================================

Signed-off-by: Ian Nam <young.kwan.nam@xilinx.com>
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Link: https://lore.kernel.org/r/20220510070154.29528-3-shubhrajyoti.datta@xilinx.com
Acked-by: Michal Simek <michal.simek@amd.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agobtrfs: scrub: try to fix super block errors
Qu Wenruo [Tue, 2 Aug 2022 06:53:03 +0000 (14:53 +0800)]
btrfs: scrub: try to fix super block errors

[ Upstream commit 1f5639cc95074d986757d2b5506a14d6b39a4435 ]

[BUG]
The following script shows that, although scrub can detect super block
errors, it never tries to fix it:

mkfs.btrfs -f -d raid1 -m raid1 $dev1 $dev2
xfs_io -c "pwrite 67108864 4k" $dev2

mount $dev1 $mnt
btrfs scrub start -B $dev2
btrfs scrub start -Br $dev2
umount $mnt

The first scrub reports the super error correctly:

  scrub done for f3289218-abd3-41ac-a630-202f766c0859
  Scrub started:    Tue Aug  2 14:44:11 2022
  Status:           finished
  Duration:         0:00:00
  Total to scrub:   1.26GiB
  Rate:             0.00B/s
  Error summary:    super=1
    Corrected:      0
    Uncorrectable:  0
    Unverified:     0

But the second read-only scrub still reports the same super error:

  Scrub started:    Tue Aug  2 14:44:11 2022
  Status:           finished
  Duration:         0:00:00
  Total to scrub:   1.26GiB
  Rate:             0.00B/s
  Error summary:    super=1
    Corrected:      0
    Uncorrectable:  0
    Unverified:     0

[CAUSE]
The comments already shows that super block can be easily fixed by
committing a transaction:

/*
 * If we find an error in a super block, we just report it.
 * They will get written with the next transaction commit
 * anyway
 */

But the truth is, such assumption is not always true, and since scrub
should try to repair every error it found (except for read-only scrub),
we should really actively commit a transaction to fix this.

[FIX]
Just commit a transaction if we found any super block errors, after
everything else is done.

We cannot do this just after scrub_supers(), as
btrfs_commit_transaction() will try to pause and wait for the running
scrub, thus we can not call it with scrub_lock hold.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoARM: dts: imx6sx: add missing properties for sram
Alexander Stein [Fri, 26 Aug 2022 05:53:36 +0000 (07:53 +0200)]
ARM: dts: imx6sx: add missing properties for sram

[ Upstream commit ba0e14e551aa6f6177156832e9ae3736dbb18ed2 ]

All 3 properties are required by sram.yaml. Fixes the dtbs_check warning:
sram@900000: '#address-cells' is a required property
sram@900000: '#size-cells' is a required property
sram@900000: 'ranges' is a required property

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoARM: dts: imx6sll: add missing properties for sram
Alexander Stein [Fri, 26 Aug 2022 05:53:35 +0000 (07:53 +0200)]
ARM: dts: imx6sll: add missing properties for sram

[ Upstream commit 3956183dff92027b285dc9709fea570d5389ef06 ]

All 3 properties are required by sram.yaml. Fixes the dtbs_check warning:
sram@900000: '#address-cells' is a required property
sram@900000: '#size-cells' is a required property
sram@900000: 'ranges' is a required property

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
3 years agoARM: dts: imx6sl: add missing properties for sram
Alexander Stein [Fri, 26 Aug 2022 05:53:34 +0000 (07:53 +0200)]
ARM: dts: imx6sl: add missing properties for sram

[ Upstream commit d1429c330a31f7749bdee9aad403098ad39ee1e8 ]

All 3 properties are required by sram.yaml. Fixes the dtbs_check warning:
sram@900000: '#address-cells' is a required property
sram@900000: '#size-cells' is a required property
sram@900000: 'ranges' is a required property

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>