'of_node_put()' can potentially release the memory pointed to by
'iiospec.np' which would leave us with an invalid pointer (and we would
still pass it in 'of_xlate()'). Note that it is not guaranteed for the
of_node lifespan to be attached to the device (to which is attached)
lifespan so that there is (even though very unlikely) the possibility
for the node to be freed while the device is still around. Thus, as there
are indeed some of_xlate users which do access the node, a race is indeed
possible.
As such, we can only release the node after we are done with it.
Fixes: 263880f5d92b8 ("iio: Add OF support") Signed-off-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20220715122903.332535-2-nuno.sa@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
.read_raw()/.write_raw() could be called asynchronously from user space
or other in kernel drivers. Without locking on st->lock these could be
called asynchronously while there is a conversion in progress. Read will
be harmless but changing registers while conversion is in progress may
lead to inconsistent results. Thus, to avoid this lock st->lock.
Fixes: 49f576877041 ("iio:adc:at91_adc8xx: introduce new atmel adc driver") Fixes: 9ab9401f0dd3 ("iio: adc: at91-sama5d2_adc: add support for oversampling resolution") Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20220803102855.2191070-4-claudiu.beznea@microchip.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
EHCI Oxynos (drivers/usb/host/ehci-exynos.c) drives VBUS GPIO high when
trying to power up the bus, therefore the GPIO in DTS must be marked as
"active high". This will be important when EHCI driver is converted to
gpiod API that respects declared polarities.
Fixes: 292174060a8b ("ARM: dts: exynos: Enable AX88760 USB hub on Origen board") Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/20220927220504.3744878-1-dmitry.torokhov@gmail.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
On arm32, the configuration options to specify the kernel command line
type depend on ATAGS. However, the actual CMDLINE cofiguration option
does not depend on ATAGS, and the code that handles this is not specific
to ATAGS (see drivers/of/fdt.c:early_init_dt_scan_chosen()).
Hence users who desire to override the kernel command line on arm32 must
enable support for ATAGS, even on a pure-DT system. Other architectures
(arm64, loongarch, microblaze, nios2, powerpc, and riscv) do not impose
such a restriction.
Hence drop the dependency on ATAGS.
Fixes: 7139647d5dd02e02 ("ARM: 7506/1: allow for ATAGS to be configured out when DT support is selected") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
According to s5k6a3 driver code, the reset line for the chip appears to
be active low. This also matches the typical polarity of reset lines in
general. Let's fix it up as having correct polarity in DTS is important
when the driver will be switched over to gpiod API.
Both the Linkstation LS-CHLv2 and the LS-XHL have only one ethernet
port. This has always been wrong, i.e. the board code used to set up
both ports, but the driver will play nice and return -ENODEV if the
assiciated PHY is not found. Nevertheless, it is wrong. Remove it.
Fixes: df2e36ee7587 ("ARM: kirkwood: add gigabit ethernet and mvmdio device tree nodes") Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 2308f9b3f407 ("ARM: dts: kirkwood: consolidate common pinctrl
settings") unknowingly broke the serial output on this board. Before
this commit, the pinmux was still configured by the bootloader and the
kernel didn't reconfigured it again. This was an oversight by the
initial board support where the pinmux for the serial line was never
configured by the kernel. But with this commit, the serial line will be
reconfigured to the wrong pins. This is especially confusing, because
the output still works, but the input doesn't. Presumingly, the input is
reconfigured to MPP10, but the output is connected to both MPP11 and
MPP5.
Override the pinmux in the board device tree.
Fixes: 2308f9b3f407 ("ARM: dts: kirkwood: consolidate common pinctrl settings") Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
There is a bug in Turris Omnia's schematics, whereupon the MPP[26] pin,
which is routed to CN11 pin header, is documented as SPI CS1, but
MPP[26] pin does not support this function. Instead it controls chip
select 2 if in "spi0" mode.
Fix the name of the pin node in pinctrl node and fix the comment in SPI
node.
Fixes: 36b28c350a9a ("ARM: dts: add support for Turris Omnia") Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In qcom_smem_state_register() and qcom_smem_state_release(), we
should better use of_node_get() and of_node_put() for the reference
creation and destruction of 'device_node'.
Fixes: d77f707506be ("soc: qcom: Introduce common SMEM state machine code") Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20220721135217.1301039-2-windhl@126.com Signed-off-by: Sasha Levin <sashal@kernel.org>
There are two refcount leak bugs in qcom_smsm_probe():
(1) The 'local_node' is escaped out from for_each_child_of_node() as
the break of iteration, we should call of_node_put() for it in error
path or when it is not used anymore.
(2) The 'node' is escaped out from for_each_available_child_of_node()
as the 'goto', we should call of_node_put() for it in goto target.
We should add the of_node_put() when breaking out of
for_each_child_of_node() as it will automatically increase
and decrease the refcount.
Fixes: f0de37a5b382 ("memory: emif: add device tree support to emif driver") Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20220719085640.1210583-1-windhl@126.com Signed-off-by: Sasha Levin <sashal@kernel.org>
The break of for_each_available_child_of_node() needs a
corresponding of_node_put() when the reference 'child' is not
used anymore. Here we do not need to call of_node_put() in
fail path as '!match' means no break.
While the of_platform_device_create() will created a new
reference by 'child' but it has considered the refcounting.
Fixes: fed625b4794e ("memory: pl353: Add driver for arm pl353 static memory controller") Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20220716031324.447680-1-windhl@126.com Signed-off-by: Sasha Levin <sashal@kernel.org>
The HDMI driver skips the notification handling from the graphics
driver when the codec driver is being in the PM operation. This
behavior was introduced by the commit 73864a554941 ("ALSA: hda - Skip
ELD notification during PM process"). This skip may cause a problem,
as we may miss the ELD update when the connection/disconnection
happens right at the runtime-PM operation of the audio codec.
Although this workaround was valid at that time, it's no longer true;
the fix was required just because the ELD update procedure needed to
wake up the audio codec, which had lead to a runtime-resume during a
runtime-suspend. Meanwhile, the ELD update procedure doesn't need a
codec wake up any longer since the commit 1e459ffe9770 ("ALSA: hda -
Use component ops for i915 HDMI/DP audio jack handling"); i.e. there
is no much reason for skipping the notification.
Let's drop those checks for addressing the missing notification.
The pm_runtime_enable will increase power disable depth. Thus
a pairing decrement is needed on the error handling path to
keep it balanced according to context. We fix it by moving
pm_runtime_enable to the endding of wm5102_probe.
The pm_runtime_enable will increase power disable depth. Thus
a pairing decrement is needed on the error handling path to
keep it balanced according to context. We fix it by moving
pm_runtime_enable to the endding of wm5110_probe.
The pm_runtime_enable will increase power disable depth. Thus
a pairing decrement is needed on the error handling path to
keep it balanced according to context. We fix it by moving
pm_runtime_enable to the endding of wm8997_probe
Remove loops over hw_vbif. Instead always VBIF's idx as an index in the
array. This fixes an error in dpu_kms_hw_init(), where we fill
dpu_kms->hw_vbif[i], but check for an error pointer at
dpu_kms->hw_vbif[vbif_idx].
In eukrea_tlv320_probe(), we need to hold the reference returned
from of_find_compatible_node() which has increased the refcount
and then call of_node_put() with it when done.
Fixes: 1dda4a8dfa38 ("ASoC: eukrea-tlv320: Add DT support.") Co-authored-by: Kelin Wang <wangkelin2023@163.com> Signed-off-by: Liang He <windhl@126.com> Link: https://lore.kernel.org/r/20220914134354.3995587-1-windhl@126.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In dss_init_ports() and __dss_uninit_ports(), we should call
of_node_put() for the reference returned by of_graph_get_port_by_id()
in fail path or when it is not used anymore.
The recent fix for IDT codecs to keep the power up while the beep is
enabled can be better integrated into the beep helper code.
This patch cleans up the code with refactoring.
As rsnd_mod_power_on() can return negative numbers,
it should be better to check the return value and
deal with the exception.
Fixes: 0c4a9a034957 ("ASoC: rsnd: use mod base common method on SSI-parent") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/20220902013030.3691266-1-jiasheng@iscas.ac.cn Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The two bridges (stdp2690, stdp4028) do not probe at the same time, so
the driver does not call ge_b850v3_resgiter() when probing, causing the
driver to try to remove the object that has not been initialized.
Fix this by checking whether both the bridges are probed.
drm_mipi_dbi needs lots of DRM_KMS_HELPER support, so select
that Kconfig symbol like it is done is most other uses, and
the way that it was before MIPS_DBI was moved from tinydrm
to its core location.
Fixes these build errors:
ld: drivers/gpu/drm/drm_mipi_dbi.o: in function `mipi_dbi_buf_copy':
drivers/gpu/drm/drm_mipi_dbi.c:205: undefined reference to `drm_gem_fb_get_obj'
ld: drivers/gpu/drm/drm_mipi_dbi.c:211: undefined reference to `drm_gem_fb_begin_cpu_access'
ld: drivers/gpu/drm/drm_mipi_dbi.c:215: undefined reference to `drm_gem_fb_vmap'
ld: drivers/gpu/drm/drm_mipi_dbi.c:222: undefined reference to `drm_fb_swab'
ld: drivers/gpu/drm/drm_mipi_dbi.c:224: undefined reference to `drm_fb_memcpy'
ld: drivers/gpu/drm/drm_mipi_dbi.c:227: undefined reference to `drm_fb_xrgb8888_to_rgb565'
ld: drivers/gpu/drm/drm_mipi_dbi.c:235: undefined reference to `drm_gem_fb_vunmap'
ld: drivers/gpu/drm/drm_mipi_dbi.c:237: undefined reference to `drm_gem_fb_end_cpu_access'
ld: drivers/gpu/drm/drm_mipi_dbi.o: in function `mipi_dbi_dev_init_with_formats':
ld: drivers/gpu/drm/drm_mipi_dbi.o:/X64/../drivers/gpu/drm/drm_mipi_dbi.c:469: undefined reference to `drm_gem_fb_create_with_dirty'
Fixes: dace47bdff0d ("drm/tinydrm: Move mipi-dbi") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Dillon Min <dillon.minfei@gmail.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: dri-devel@lists.freedesktop.org Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220823004243.11596-1-rdunlap@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit c8132014743f ("msi-laptop: Port to new backlight interface
selection API") replaced this check:
if (!quirks->old_ec_model || acpi_video_backlight_support())
pr_info("Brightness ignored, ...");
else
do_register();
With:
if (quirks->old_ec_model ||
acpi_video_get_backlight_type() == acpi_backlight_vendor)
do_register();
But since the do_register() part was part of the else branch, the entire
condition should be inverted. So not only the 2 statements on either
side of the || should be inverted, but the || itself should be replaced
with a &&.
In practice this has likely not been an issue because the new-ec models
(old_ec_model==false) likely all support ACPI video backlight control,
making acpi_video_get_backlight_type() return acpi_backlight_video
turning the second part of the || also false when old_ec_model == false.
If chromeos_laptop_prepare_i2c_peripherals() fails after allocating memory
for 'cros_laptop->i2c_peripherals', this memory is freed at 'err_out' label
and nonzero value is returned. Then chromeos_laptop_destroy() is called,
resulting in double-free error.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Whenever the MIPI-DSI host is unregistered, the code of
mipi_dsi_host_unregister() loops over every device currently found on that
bus and will unregister it.
However, it doesn't detach it from the bus first, which leads to all kind
of resource leaks if the host wants to perform some clean up whenever a
device is detached.
Fixes: 1087c2f93cc4 ("drm: Add MIPI DSI bus support") Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220711173939.1132294-2-maxime@cerno.tech Signed-off-by: Sasha Levin <sashal@kernel.org>
The ADV7511_REG_CEC_CTRL = 0xE2 register is part of the main register
map - not the CEC register map. As such, we shouldn't apply an offset to
the register address. Doing so will cause us to address a bogus register
for chips with a CEC register map offset (e.g. ADV7533).
When mvpp2 is unloaded, the driver specific debugfs directory is not
removed, which technically leads to a memory leak. However, this
directory is only created when the first device is probed, so the
hardware is present. Removing the module is only something a developer
would to when e.g. testing out changes, so the module would be
reloaded. So this memory leak is minor.
The original attempt in commit 5f23e1396e1f ("net: mvpp2: debugfs: fix
memory leak when using debugfs_lookup()") that was labelled as a memory
leak fix was not, it fixed a refcount leak, but in doing so created a
problem when the module is reloaded - the directory already exists, but
mvpp2_root is NULL, so we lose all debugfs entries. This fix has been
reverted.
This is the alternative fix, where we remove the offending directory
whenever the driver is unloaded.
Fixes: a8fd6ab05737 ("net: mvpp2: add a debugfs interface for the Header Parser") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Marcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/r/E1ofOAB-00CzkG-UO@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Christophe Leroy reported a ~80ms latency spike
happening at first TCP connect() time.
This is because __inet_hash_connect() uses get_random_once()
to populate a perturbation table which became quite big
after commit 85bf0dc6b58d ("tcp: increase source port perturb table to 2^16")
get_random_once() uses DO_ONCE(), which block hard irqs for the duration
of the operation.
This patch adds DO_ONCE_SLOW() which uses a mutex instead of a spinlock
for operations where we prefer to stay in process context.
Then __inet_hash_connect() can use get_random_slow_once()
to populate its perturbation table.
Fixes: 85bf0dc6b58d ("tcp: increase source port perturb table to 2^16") Fixes: 57c2c1517858 ("tcp: change source port randomizarion at connect() time") Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/netdev/CANn89iLAEYBaoYajy0Y9UmGFff5GPxDUoG-ErVB2jDdRNQ5Tug@mail.gmail.com/T/#t Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
syzbot is hitting skb_assert_len() warning at raw_sendmsg() for ieee802154
socket. What commit aaf386125a6d0c1b ("net/af_packet: check len when
min_header_len equals to 0") does also applies to ieee802154 socket.
Link: https://syzkaller.appspot.com/bug?extid=5ea725c25d06fb9114c4 Reported-by: syzbot <syzbot+5ea725c25d06fb9114c4@syzkaller.appspotmail.com> Fixes: fe1f468a3ebcab50 ("bpf: Don't redirect packets with invalid pkt_len") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
bnx2x_tpa_stop() allocates a memory chunk from new_data with
bnx2x_frag_alloc(). The new_data should be freed when gets some error.
But when "pad + len > fp->rx_buf_size" is true, bnx2x_tpa_stop() returns
without releasing the new_data, which will lead to a memory leak.
We should free the new_data with bnx2x_frag_free() when "pad + len >
fp->rx_buf_size" is true.
Fixes: ab406322156501e0d5b80d50445ab59c64db717c ("bnx2x: fix possible panic under memory stress") Signed-off-by: Jianglei Nie <niejianglei2021@163.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
syzbot is reporting lockdep warning at rds_tcp_reset_callbacks() [1], for
commit e6f1c90ba51e12c4 ("RDS: TCP: Reduce code duplication in
rds_tcp_reset_callbacks()") added cancel_delayed_work_sync() into a section
protected by lock_sock() without realizing that rds_send_xmit() might call
lock_sock().
We don't need to protect cancel_delayed_work_sync() using lock_sock(), for
even if rds_{send,recv}_worker() re-queued this work while __flush_work()
from cancel_delayed_work_sync() was waiting for this work to complete,
retried rds_{send,recv}_worker() is no-op due to the absence of RDS_CONN_UP
bit.
Link: https://syzkaller.appspot.com/bug?extid=78c55c7bc6f66e53dce2 Reported-by: syzbot <syzbot+78c55c7bc6f66e53dce2@syzkaller.appspotmail.com> Co-developed-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: syzbot <syzbot+78c55c7bc6f66e53dce2@syzkaller.appspotmail.com> Fixes: e6f1c90ba51e12c4 ("RDS: TCP: Reduce code duplication in rds_tcp_reset_callbacks()") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
This commit fixes a bug in the tracking of max_packets_out and
is_cwnd_limited. This bug can cause the connection to fail to remember
that is_cwnd_limited is true, causing the connection to fail to grow
cwnd when it should, causing throughput to be lower than it should be.
The following event sequence is an example that triggers the bug:
(a) The connection is cwnd_limited, but packets_out is not at its
peak due to TSO deferral deciding not to send another skb yet.
In such cases the connection can advance max_packets_seq and set
tp->is_cwnd_limited to true and max_packets_out to a small
number.
(b) Then later in the round trip the connection is pacing-limited (not
cwnd-limited), and packets_out is larger. In such cases the
connection would raise max_packets_out to a bigger number but
(unexpectedly) flip tp->is_cwnd_limited from true to false.
This commit fixes that bug.
One straightforward fix would be to separately track (a) the next
window after max_packets_out reaches a maximum, and (b) the next
window after tp->is_cwnd_limited is set to true. But this would
require consuming an extra u32 sequence number.
Instead, to save space we track only the most important
information. Specifically, we track the strongest available signal of
the degree to which the cwnd is fully utilized:
(1) If the connection is cwnd-limited then we remember that fact for
the current window.
(2) If the connection not cwnd-limited then we track the maximum
number of outstanding packets in the current window.
In particular, note that the new logic cannot trigger the buggy
(a)/(b) sequence above because with the new logic a condition where
tp->packets_out > tp->max_packets_out can only trigger an update of
tp->is_cwnd_limited if tp->is_cwnd_limited is false.
This first showed up in a testing of a BBRv2 dev branch, but this
buggy behavior highlighted a general issue with the
tcp_cwnd_validate() logic that can cause cwnd to fail to increase at
the proper rate for any TCP congestion control, including Reno or
CUBIC.
Fixes: 52a1301f5152 ("tcp: make cwnd-limited checks measurement-based, and gentler") Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Kevin(Yudong) Yang <yyd@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When it returns an error from sctp_auth_asoc_init_active_key(), the
active_key is actually not updated. The old sh_key will be freeed
while it's still used as active key in asoc. Then an use-after-free
will be triggered when sending patckets, as found by syzbot:
This patch is to fix it by not replacing the sh_key when it returns
errors from sctp_auth_asoc_init_active_key() in sctp_auth_set_key().
For sctp_auth_set_active_key(), old active_key_id will be set back
to asoc->active_key_id when the same thing happens.
Fixes: 52728af1daee ("sctp: update active_key for asoc when old key is being replaced") Reported-by: syzbot+a236dd8e9622ed8954a3@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The l1oip_cleanup() traverses the l1oip_ilist and calls
release_card() to cleanup module and stack. However,
release_card() calls del_timer() to delete the timers
such as keep_tl and timeout_tl. If the timer handler is
running, the del_timer() will not stop it and result in
UAF bugs. One of the processes is shown below:
Fix by calling del_timer_sync() in release_card(), which
makes sure the timer handlers have finished before the
resources, such as l1oip and so on, have been deallocated.
What's more, the hc->workq and hc->socket_thread can kick
those timers right back in. We add a bool flag to show
if card is released. Then, check this flag in hc->workq
and hc->socket_thread.
Fixes: f5842134c3d2 ("Add layer1 over IP support") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The COUNT_VALUE in the PACKET_CNT register is 16-bit so the maximum
value is 65535. Asking the driver to transfer a larger size currently
leads to the DMA transfer timing out. Implement ->max_transfer_size()
and have the core split the transfer as needed.
Analogous to commit 07980f538cd5d ("netfilter: Fix rpfilter
dropping vrf packets by mistake") but for nftables fib expression:
Add special treatment of VRF devices so that typical reverse path
filtering via 'fib saddr . iif oif' expression works as expected.
The pm_runtime_enable will increase power disable depth. Thus
a pairing decrement is needed on the error handling path to
keep it balanced according to context.
Fixes:321689cf450aa ("spi/omap100k: Convert to runtime PM")
In preparation for reducing the use of ksize(), record the actual
allocation size for later memcpy(). This avoids copying extra
(uninitialized!) bytes into the patch buffer when the requested
allocation size isn't exactly the size of a kmalloc bucket.
Additionally, fix potential future issues where runtime bounds checking
will notice that the buffer was allocated to a smaller value than
returned by ksize().
Fixes: 4487451a80f5 ("x86, microcode, amd: Early microcode patch loading support for AMD") Suggested-by: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/lkml/CA+DvKQ+bp7Y7gmaVhacjv9uF6Ar-o4tet872h4Q8RPYPJjcJQA@mail.gmail.com/ Signed-off-by: Sasha Levin <sashal@kernel.org>
Should check of_iomap return value 'fep->fec.fecp' instead of 'fep->fcc.fccp'
Fixes: 04e0d0e98e18 ("fs_enet: Be an of_platform device when CONFIG_PPC_CPM_NEW_BINDING is set.") Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Found by comparing with the vendor driver. Currently this affects
only the RTL8192EU, which is the only gen2 chip with 2 TX paths
supported by this driver. It's unclear what kind of effect the
mistake had in practice, since I don't have any RTL8192EU devices
to test it.
Fixes: 2dc492d81bf7 ("rtl8xxxu: First stab at adding IQK calibration for 8723bu parts") Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/30a59f3a-cfa9-8379-7af0-78a8f4c77cfd@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
When trying to finish resolving a struct member, btf_struct_resolve
saves the member type id in a u16 temporary variable. This truncates
the 32 bit type id value if it exceeds UINT16_MAX.
As a result, structs that have members with type ids > UINT16_MAX and
which need resolution will fail with a message like this:
rtl8xxxu_queue_select() selects the wrong TX queues because it's
reading memory from the wrong address. It expects to find ieee80211_hdr
at skb->data, but that's not the case after skb_push(). Move the call
to rtl8xxxu_queue_select() before the call to skb_push().
There some bounds checking to ensure that "map_addr" is not out of
bounds before the start of the loop. But the checking needs to be
done as we iterate through the loop because "map_addr" gets larger as
we iterate.
Fixes: 630609394850 ("New driver: rtl8xxxu (mac80211)") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Jes Sorensen <Jes.Sorensen@gmail.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/Yv8eGLdBslLAk3Ct@kili Signed-off-by: Sasha Levin <sashal@kernel.org>
The current pseudo_lock.c code overwrites the value of the
MSR_MISC_FEATURE_CONTROL to 0 even if the original value is not 0.
Therefore, modify it to save and restore the original values.
When `data` points to a boolean value, casting it to `int *` is problematic
and could lead to a wrong value being passed to `jsonw_bool`. Change the
cast to `bool *` instead.
From 'IEEE Std 802.11-2020 section 11.8.8.4.1':
The mesh channel switch may be triggered by the need to avoid
interference to a detected radar signal, or to reassign mesh STA
channels to ensure the MBSS connectivity.
A 20/40 MHz MBSS may be changed to a 20 MHz MBSS and a 20 MHz
MBSS may be changed to a 20/40 MHz MBSS.
Since the standard allows the change of bandwidth during
the channel switch in mesh, remove the bandwidth check present in
ieee80211_set_csa_beacon.
Fixes: fa56afac5b63 ("{nl,cfg,mac}80211: enable the triggering of CSA frame in mesh") Signed-off-by: Hari Chandrakanthan <quic_haric@quicinc.com> Link: https://lore.kernel.org/r/1658903549-21218-1-git-send-email-quic_haric@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When peer delete failed in a disconnect operation, use-after-free
detected by KFENCE in below log. It is because for each vdev_id and
address, it has only one struct ath10k_peer, it is allocated in
ath10k_peer_map_event(). When connected to an AP, it has more than
one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the
array peer_map of struct ath10k will be set muti-elements to the
same ath10k_peer in ath10k_peer_map_event(). When peer delete failed
in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer
id in array peer_map of struct ath10k, and then use-after-free happened
for the 2nd peer id because they map to the same ath10k_peer.
And clean up all peers in array peer_map for the ath10k_peer, then
user-after-free disappeared
peer map event log:
[ 306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e
[ 306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33
[ 306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246
[ 306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198
[ 306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166
peer unmap event log:
[ 435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING)
[ 435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone)
[ 435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246
[ 435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198
[ 435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166
use-after-free log:
[21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING)
[21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110
[21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed
[21713.799968] ==================================================================
[21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core]
[21713.799991]
[21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-#69):
[21713.800010] ath10k_sta_state+0x265/0xb8a [ath10k_core]
[21713.800041] drv_sta_state+0x115/0x677 [mac80211]
[21713.800059] __sta_info_destroy_part2+0xb1/0x133 [mac80211]
[21713.800076] __sta_info_flush+0x11d/0x162 [mac80211]
[21713.800093] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211]
[21713.800110] ieee80211_mgd_deauth+0x26c/0x29b [mac80211]
[21713.800137] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211]
[21713.800153] nl80211_deauthenticate+0xf8/0x121 [cfg80211]
[21713.800161] genl_rcv_msg+0x38e/0x3be
[21713.800166] netlink_rcv_skb+0x89/0xf7
[21713.800171] genl_rcv+0x28/0x36
[21713.800176] netlink_unicast+0x179/0x24b
[21713.800181] netlink_sendmsg+0x3a0/0x40e
[21713.800187] sock_sendmsg+0x72/0x76
[21713.800192] ____sys_sendmsg+0x16d/0x1e3
[21713.800196] ___sys_sendmsg+0x95/0xd1
[21713.800200] __sys_sendmsg+0x85/0xbf
[21713.800205] do_syscall_64+0x43/0x55
[21713.800210] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[21713.800213]
[21713.800219] kfence-#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k
[21713.800219]
[21713.800224] allocated by task 13 on cpu 0 at 21705.501373s:
[21713.800241] ath10k_peer_map_event+0x7e/0x154 [ath10k_core]
[21713.800254] ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core]
[21713.800265] ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core]
[21713.800277] ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core]
[21713.800283] ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci]
[21713.800294] ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core]
[21713.800305] ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core]
[21713.800310] ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci]
[21713.800316] net_rx_action+0xdc/0x361
[21713.800320] __do_softirq+0x163/0x29a
[21713.800325] asm_call_irq_on_stack+0x12/0x20
[21713.800331] do_softirq_own_stack+0x3c/0x48
[21713.800337] __irq_exit_rcu+0x9b/0x9d
[21713.800342] common_interrupt+0xc9/0x14d
[21713.800346] asm_common_interrupt+0x1e/0x40
[21713.800351] ksoftirqd_should_run+0x5/0x16
[21713.800357] smpboot_thread_fn+0x148/0x211
[21713.800362] kthread+0x150/0x15f
[21713.800367] ret_from_fork+0x22/0x30
[21713.800370]
[21713.800374] freed by task 708 on cpu 1 at 21713.799953s:
[21713.800498] ath10k_sta_state+0x2c6/0xb8a [ath10k_core]
[21713.800515] drv_sta_state+0x115/0x677 [mac80211]
[21713.800532] __sta_info_destroy_part2+0xb1/0x133 [mac80211]
[21713.800548] __sta_info_flush+0x11d/0x162 [mac80211]
[21713.800565] ieee80211_set_disassoc+0x12d/0x2f4 [mac80211]
[21713.800581] ieee80211_mgd_deauth+0x26c/0x29b [mac80211]
[21713.800598] cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211]
[21713.800614] nl80211_deauthenticate+0xf8/0x121 [cfg80211]
[21713.800619] genl_rcv_msg+0x38e/0x3be
[21713.800623] netlink_rcv_skb+0x89/0xf7
[21713.800628] genl_rcv+0x28/0x36
[21713.800632] netlink_unicast+0x179/0x24b
[21713.800637] netlink_sendmsg+0x3a0/0x40e
[21713.800642] sock_sendmsg+0x72/0x76
[21713.800646] ____sys_sendmsg+0x16d/0x1e3
[21713.800651] ___sys_sendmsg+0x95/0xd1
[21713.800655] __sys_sendmsg+0x85/0xbf
[21713.800659] do_syscall_64+0x43/0x55
[21713.800663] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 0c22b9333a40 ("ath10k: Clean up peer when sta goes away.") Signed-off-by: Wen Gong <quic_wgong@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20220801141930.16794-1-quic_wgong@quicinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>
If this memdup_user() call fails, the memory allocated in a previous call
a few lines above should be freed. Otherwise it leaks.
Fixes: f50edfb1b698 ("nfsd: add support for upcall version 2") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
MT_MEMORY_RO is introduced by commit dfe4cb6c3e20 ("ARM: 9210/1:
Mark the FDT_FIXED sections as shareable"), which is a readonly
memory type for FDT area, but there are some different between
ARM_LPAE and non-ARM_LPAE, we need to setup PMD_SECT_AP2 and
L_PMD_SECT_RDONLY for MT_MEMORY_RO when ARM_LAPE enabled.
Fixes: dfe4cb6c3e20 ("ARM: 9210/1: Mark the FDT_FIXED sections as shareable") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
As done for other sections, define the extern as a character array,
which relaxes many of the compiler-time object size checks, which would
otherwise assume it's a single long. Solves the following build error:
arch/sh/kernel/machvec.c: error: array subscript 'struct sh_machine_vector[0]' is partly outside array bounds of 'long int[1]' [-Werror=array-bounds]: => 105:33
Since userfaultfd doesn't implement a write operation, it is more
appropriate to open it read-only.
When userfaultfds are opened read-write like it is now, and such fd is
passed from one process to another, SELinux will check both read and
write permissions for the target process, even though it can't actually
do any write operation on the fd later.
Inspired by the following bug report, which has hit the SELinux scenario
described above:
https://bugzilla.redhat.com/show_bug.cgi?id=1974559
Reported-by: Robert O'Callahan <roc@ocallahan.org> Fixes: 5df97d311d21 ("userfaultfd: add new syscall to provide memory externalization") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
It was found that some tracing functions in kernel/trace/trace.c acquire
an arch_spinlock_t with preemption and irqs enabled. An example is the
tracing_saved_cmdlines_size_read() function which intermittently causes
a "BUG: using smp_processor_id() in preemptible" warning when the LTP
read_all_proc test is run.
That can be problematic in case preemption happens after acquiring the
lock. Add the necessary preemption or interrupt disabling code in the
appropriate places before acquiring an arch_spinlock_t.
The convention here is to disable preemption for trace_cmdline_lock and
interupt for max_lock.
Link: https://lkml.kernel.org/r/20220922145622.1744826-1-longman@redhat.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: stable@vger.kernel.org Fixes: 91fd73cac46f ("tracing: Add conditional snapshot") Fixes: 8ce1567b63ba ("tracing: Introduce saved_cmdlines_size file") Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The latest version of grep claims that egrep is now obsolete so the build
now contains warnings that look like:
egrep: warning: egrep is obsolescent; using grep -E
fix this by using "grep -E" instead.
Cc: Paul Moore <paul@paul-moore.com> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Eric Paris <eparis@parisplace.org> Cc: selinux@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[PM: tweak to remove vdso reference, cleanup subj line] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
nouveau_bo_init() is backed by ttm_bo_init() and ferries its return code
back to the caller. On failures, ttm will call nouveau_bo_del_ttm() and
free the memory.Thus, when nouveau_bo_init() returns an error, the gem
object has already been released. Then the call to nouveau_bo_ref() will
use the freed "nvbo->bo" and lead to a use-after-free bug.
We should delete the call to nouveau_bo_ref() to avoid the use-after-free.
Signed-off-by: Jianglei Nie <niejianglei2021@163.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 26ea454bbfed ("drm/nouveau: Initialize GEM object before TTM object") Cc: Thierry Reding <treding@nvidia.com> Cc: <stable@vger.kernel.org> # v5.4+ Link: https://patchwork.freedesktop.org/patch/msgid/20220705132546.2247677-1-niejianglei2021@163.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Starting with GCC 12.1, the created .gcda format can't be read by gcov
tool. There are 2 significant changes to the .gcda file format that
need to be supported:
Deliberately truncate the exception error code when shoving it into the
VMCS (VM-Entry field for vmcs01 and vmcs02, VM-Exit field for vmcs12).
Intel CPUs are incapable of handling 32-bit error codes and will never
generate an error code with bits 31:16, but userspace can provide an
arbitrary error code via KVM_SET_VCPU_EVENTS. Failure to drop the bits
on exception injection results in failed VM-Entry, as VMX disallows
setting bits 31:16. Setting the bits on VM-Exit would at best confuse
L1, and at worse induce a nested VM-Entry failure, e.g. if L1 decided to
reinject the exception back into L2.
Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20220830231614.3580124-3-seanjc@google.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Drop pending exceptions and events queued for re-injection when leaving
nested guest mode, even if the "exit" is due to VM-Fail, SMI, or forced
by host userspace. Failure to purge events could result in an event
belonging to L2 being injected into L1.
This _should_ never happen for VM-Fail as all events should be blocked by
nested_run_pending, but it's possible if KVM, not the L1 hypervisor, is
the source of VM-Fail when running vmcs02.
SMI is a nop (barring unknown bugs) as recognition of SMI and thus entry
to SMM is blocked by pending exceptions and re-injected events.
Forced exit is definitely buggy, but has likely gone unnoticed because
userspace probably follows the forced exit with KVM_SET_VCPU_EVENTS (or
some other ioctl() that purges the queue).
Fixes: 71b7cd8e8004 ("kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20220830231614.3580124-2-seanjc@google.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The emulator checks the wrong variable while setting the CPU
interruptibility state, the target segment is embedded in the instruction
opcode, not the ModR/M register. Fix the condition.
Signed-off-by: Michal Luczaj <mhal@rbox.co> Fixes: 46f2db219cd3 ("KVM: emulate: POP SS triggers a MOV SS shadow too") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20220821215900.1419215-1-mhal@rbox.co Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The cedrus_hw_resume() crashes with NULL deference on driver probe if
runtime PM is disabled because it uses platform data that hasn't been
set up yet. Fix this by setting the platform data earlier during probe.
Cc: stable@vger.kernel.org Fixes: 940eb4656803 (media: platform: Add Cedrus VPU decoder driver) Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Reviewed-by: Samuel Holland <samuel@sholland.org> Acked-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The ring buffer is broken up into sub buffers (currently of page size).
Each sub buffer has a pointer to its "tail" (the last event written to the
sub buffer). When a new event is requested, the tail is locally
incremented to cover the size of the new event. This is done in a way that
there is no need for locking.
If the tail goes past the end of the sub buffer, the process of moving to
the next sub buffer takes place. After setting the current sub buffer to
the next one, the previous one that had the tail go passed the end of the
sub buffer needs to be reset back to the original tail location (before
the new event was requested) and the rest of the sub buffer needs to be
"padded".
The race happens when a reader takes control of the sub buffer. As readers
do a "swap" of sub buffers from the ring buffer to get exclusive access to
the sub buffer, it replaces the "head" sub buffer with an empty sub buffer
that goes back into the writable portion of the ring buffer. This swap can
happen as soon as the writer moves to the next sub buffer and before it
updates the last sub buffer with padding.
Because the sub buffer can be released to the reader while the writer is
still updating the padding, it is possible for the reader to see the event
that goes past the end of the sub buffer. This can cause obvious issues.
To fix this, add a few memory barriers so that the reader definitely sees
the updates to the sub buffer, and also waits until the writer has put
back the "tail" of the sub buffer back to the last event that was written
on it.
To be paranoid, it will only spin for 1 second, otherwise it will
warn and shutdown the ring buffer code. 1 second should be enough as
the writer does have preemption disabled. If the writer doesn't move
within 1 second (with preemption disabled) something is horribly
wrong. No interrupt should last 1 second!
The wake up waiters only checks the "wakeup_full" variable and not the
"full_waiters_pending". The full_waiters_pending is set when a waiter is
added to the wait queue. The wakeup_full is only set when an event is
triggered, and it clears the full_waiters_pending to avoid multiple calls
to irq_work_queue().
The irq_work callback really needs to check both wakeup_full as well as
full_waiters_pending such that this code can be used to wake up waiters
when a file is closed that represents the ring buffer and the waiters need
to be woken up.
Link: https://lkml.kernel.org/r/20220927231824.209460321@goodmis.org Cc: stable@vger.kernel.org Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: c5e03c1a41c6e ("tracing/ring-buffer: Move poll wake ups into ring buffer code") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The logic to know when the shortest waiters on the ring buffer should be
woken up or not has uses a less than instead of a greater than compare,
which causes the shortest_full to actually be the longest.
Link: https://lkml.kernel.org/r/20220927231823.718039222@goodmis.org Cc: stable@vger.kernel.org Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: 57a63d9503ffb ("ring-buffer: Add percentage of ring buffer full to wake up reader") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a page is partially read, and then the splice system call is run
against the ring buffer, it will always fail to read, no matter how much
is in the ring buffer. That's because the code path for a partial read of
the page does will fail if the "full" flag is set.
The splice system call wants full pages, so if the read of the ring buffer
is not yet full, it should return zero, and the splice will block. But if
a previous read was done, where the beginning has been consumed, it should
still be given to the splice caller if the rest of the page has been
written to.
This caused the splice command to never consume data in this scenario, and
let the ring buffer just fill up and lose events.
When executing following commands like what document said, but the log
"#### all functions enabled ####" was not shown as expect:
1. Set a 'mod' filter:
$ echo 'write*:mod:ext3' > /sys/kernel/tracing/set_ftrace_filter
2. Invert above filter:
$ echo '!write*:mod:ext3' >> /sys/kernel/tracing/set_ftrace_filter
3. Read the file:
$ cat /sys/kernel/tracing/set_ftrace_filter
By some debugging, I found that flag FTRACE_HASH_FL_MOD was not unset
after inversion like above step 2 and then result of ftrace_hash_empty()
is incorrect.
Link: https://lkml.kernel.org/r/20220926152008.2239274-1-zhengyejian1@huawei.com Cc: <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: 3ee854b94a4f ("ftrace: Have cached module filters be an active filter") Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The KLP transition code depends on the TIF_PATCH_PENDING and
the task->patch_state to stay in sync. On a normal (forward)
transition, TIF_PATCH_PENDING will be set on every task in
the system, while on a reverse transition (after a failed
forward one) first TIF_PATCH_PENDING will be cleared from
every task, followed by it being set on tasks that need to
be transitioned back to the original code.
However, the fork code copies over the TIF_PATCH_PENDING flag
from the parent to the child early on, in dup_task_struct and
setup_thread_stack. Much later, klp_copy_process will set
child->patch_state to match that of the parent.
However, the parent's patch_state may have been changed by KLP loading
or unloading since it was initially copied over into the child.
This results in the KLP code occasionally hitting this warning in
klp_complete_transition:
Set, or clear, the TIF_PATCH_PENDING flag in the child task
depending on whether or not it is needed at the time
klp_copy_process is called, at a point in copy_process where the
tasklist_lock is held exclusively, preventing races with the KLP
code.
The KLP code does have a few places where the state is changed
without the tasklist_lock held, but those should not cause
problems because klp_update_patch_state(current) cannot be
called while the current task is in the middle of fork,
klp_check_and_switch_task() which is called under the pi_lock,
which prevents rescheduling, and manipulation of the patch
state of idle tasks, which do not fork.
This should prevent this warning from triggering again in the
future, and close the race for both normal and reverse transitions.
Signed-off-by: Rik van Riel <riel@surriel.com> Reported-by: Breno Leitao <leitao@debian.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Fixes: 921d2632ba0f ("livepatch: change to a per-task consistency model") Cc: stable@kernel.org Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20220808150019.03d6a67b@imladris.surriel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In our product environment, we encounter some jbd hung waiting handles to
stop while several writters were doing memory reclaim for buffer head
allocation in delay alloc write path. Ext4 do buffer head allocation with
holding transaction handle which may be blocked too long if the reclaim
works not so smooth. According to our bcc trace, the reclaim time in
buffer head allocation can reach 258s and the jbd transaction commit also
take almost the same time meanwhile. Except for these extreme cases,
we often see several seconds delays for cgroup memory reclaim on our
servers. This is more likely to happen considering docker environment.
One thing to note, the allocation of buffer heads is as often as page
allocation or more often when blocksize less than page size. Just like
page cache allocation, we should also place the buffer head allocation
before startting the handle.
ext4_lazyinit_thread is not set freezable. Hence when the thread calls
try_to_freeze it doesn't freeze during suspend and continues to send
requests to the storage during suspend, resulting in suspend failures.
When inode is created and written to using direct IO, there is nothing
to clear the EXT4_STATE_MAY_INLINE_DATA flag. Thus when inode gets
truncated later to say 1 byte and written using normal write, we will
try to store the data as inline data. This confuses the code later
because the inode now has both normal block and inline data allocated
and the confusion manifests for example as:
If the beginning of the inode bitmap area is corrupted on disk, an inode
with the same inode number as the root inode can be allocated and fail
soon after. In this case, the subsequent call to nilfs_clear_inode() on
that bogus root inode will wrongly decrement the reference counter of
struct nilfs_root, and this will erroneously free struct nilfs_root,
causing kernel oopses.
This fixes the problem by changing nilfs_new_inode() to skip reserved
inode numbers while repairing the inode bitmap.
BUG: KASAN: use-after-free in recover_data+0x63ae/0x6ae0 [f2fs]
Read of size 4 at addr ffff8881464dcd80 by task mount/1013
CPU: 3 PID: 1013 Comm: mount Tainted: G W 6.0.0-rc4 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
Call Trace:
dump_stack_lvl+0x45/0x5e
print_report.cold+0xf3/0x68d
kasan_report+0xa8/0x130
recover_data+0x63ae/0x6ae0 [f2fs]
f2fs_recover_fsync_data+0x120d/0x1fc0 [f2fs]
f2fs_fill_super+0x4665/0x61e0 [f2fs]
mount_bdev+0x2cf/0x3b0
legacy_get_tree+0xed/0x1d0
vfs_get_tree+0x81/0x2b0
path_mount+0x47e/0x19d0
do_mount+0xce/0xf0
__x64_sys_mount+0x12c/0x1a0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The root cause is: in fuzzed image, SSA table is corrupted: ofs_in_node
is larger than ADDRS_PER_PAGE(), result in out-of-range access on 4k-size
page.
If we enable CONFIG_F2FS_CHECK_FS config, it will trigger a kernel panic
instead of warning.
The root cause is: in fuzzed image, SIT table is inconsistent with inode
mapping table, result in triggering such warning during SIT table update.
This patch introduces a new flag DATA_GENERIC_ENHANCE_UPDATE, w/ this
flag, data block recovery flow can check destination blkaddr's validation
in SIT table, and skip f2fs_replace_block() to avoid inconsistent status.
Cc: stable@vger.kernel.org Reported-by: Wenqing Liu <wenqingliu0120@gmail.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When enabling quotas, at btrfs_quota_enable(), after committing the
transaction, we change fs_info->quota_root to point to the quota root we
created and set BTRFS_FS_QUOTA_ENABLED at fs_info->flags. Then we try
to start the qgroup rescan worker, first by initializing it with a call
to qgroup_rescan_init() - however if that fails we end up freeing the
quota root but we leave fs_info->quota_root still pointing to it, this
can later result in a use-after-free somewhere else.
We have previously set the flags BTRFS_FS_QUOTA_ENABLED and
BTRFS_QGROUP_STATUS_FLAG_ON, so we can only fail with -EINPROGRESS at
btrfs_quota_enable(), which is possible if someone already called the
quota rescan ioctl, and therefore started the rescan worker.
So fix this by ignoring an -EINPROGRESS and asserting we can't get any
other error.
Reported-by: Ye Bin <yebin10@huawei.com> Link: https://lore.kernel.org/linux-btrfs/20220823015931.421355-1-yebin10@huawei.com/ CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
uImage boot wrapper should not use SPE instructions, like kernel itself.
Boot wrapper has already disabled Altivec and VSX instructions but not SPE.
Options -mno-spe and -mspe=no already set when compilation of kernel, but
not when compiling uImage wrapper yet. Fix it.
When pci_assign_resource() is unable to assign resources to a BAR, it uses
pci_revert_fw_address() to fall back to a firmware assignment (if any).
Previously pci_revert_fw_address() assumed all addresses could reach the
device, but this is not true if the device is below a bridge that only
forwards addresses within its windows.
This problem was observed on a Tyan Tomcat IV S1564D system where the BIOS
did not assign valid addresses to several bridges and USB devices:
pci 0000:00:11.0: PCI-to-PCIe bridge to [bus 01-ff]
pci 0000:00:11.0: bridge window [io 0xe000-0xefff]
pci 0000:01:00.0: PCIe Upstream Port to [bus 02-ff]
pci 0000:01:00.0: bridge window [io 0x0000-0x0fff] # unreachable
pci 0000:02:02.0: PCIe Downstream Port to [bus 05-ff]
pci 0000:02:02.0: bridge window [io 0x0000-0x0fff] # unreachable
pci 0000:05:00.0: PCIe-to-PCI bridge to [bus 06-ff]
pci 0000:05:00.0: bridge window [io 0x0000-0x0fff] # unreachable
pci 0000:06:08.0: USB UHCI 1.1
pci 0000:06:08.0: BAR 4: [io 0xfce0-0xfcff] # unreachable
pci 0000:06:08.1: USB UHCI 1.1
pci 0000:06:08.1: BAR 4: [io 0xfce0-0xfcff] # unreachable
pci 0000:06:08.0: can't claim BAR 4 [io 0xfce0-0xfcff]: no compatible bridge window
pci 0000:06:08.1: can't claim BAR 4 [io 0xfce0-0xfcff]: no compatible bridge window
During the first pass of assigning unassigned resources, there was not
enough I/O space available, so we couldn't assign the 06:08.0 BAR and
reverted to the firmware assignment (still unreachable). Reverting the
06:08.1 assignment failed because it conflicted with 06:08.0:
pci 0000:00:11.0: bridge window [io 0xe000-0xefff]
pci 0000:01:00.0: no space for bridge window [io size 0x2000]
pci 0000:02:02.0: no space for bridge window [io size 0x1000]
pci 0000:05:00.0: no space for bridge window [io size 0x1000]
pci 0000:06:08.0: BAR 4: no space for [io size 0x0020]
pci 0000:06:08.0: BAR 4: trying firmware assignment [io 0xfce0-0xfcff]
pci 0000:06:08.1: BAR 4: no space for [io size 0x0020]
pci 0000:06:08.1: BAR 4: trying firmware assignment [io 0xfce0-0xfcff]
pci 0000:06:08.1: BAR 4: [io 0xfce0-0xfcff] conflicts with 0000:06:08.0 [io 0xfce0-0xfcff]
A subsequent pass assigned valid bridge windows and a valid 06:08.1 BAR,
but left the 06:08.0 BAR alone, so the UHCI device was still unusable:
pci 0000:00:11.0: bridge window [io 0xe000-0xefff] released
pci 0000:00:11.0: bridge window [io 0x1000-0x2fff] # reassigned
pci 0000:01:00.0: bridge window [io 0x1000-0x2fff] # reassigned
pci 0000:02:02.0: bridge window [io 0x2000-0x2fff] # reassigned
pci 0000:05:00.0: bridge window [io 0x2000-0x2fff] # reassigned
pci 0000:06:08.0: BAR 4: assigned [io 0xfce0-0xfcff] # left alone
pci 0000:06:08.1: BAR 4: assigned [io 0x2000-0x201f]
...
uhci_hcd 0000:06:08.0: host system error, PCI problems?
uhci_hcd 0000:06:08.0: host controller process error, something bad happened!
uhci_hcd 0000:06:08.0: host controller halted, very bad!
uhci_hcd 0000:06:08.0: HCRESET not completed yet!
uhci_hcd 0000:06:08.0: HC died; cleaning up
If the address assigned by firmware is not reachable because it's not
within upstream bridge windows, fail instead of assigning the unusable
address from firmware.
When CONFIG_CPUMASK_OFFSTACK and CONFIG_DEBUG_PER_CPU_MAPS is selected,
cpu_max_bits_warn() generates a runtime warning similar as below while
we show /proc/cpuinfo. Fix this by using nr_cpu_ids (the runtime limit)
instead of NR_CPUS to iterate CPUs.
lld since llvm:6611d58f5bbc ("[ELF] Relax R_RISCV_ALIGN"), which will be
included in the 15.0.0 release, has implemented some RISC-V linker
relaxation. -mno-relax is no longer needed in
KBUILD_CFLAGS/KBUILD_AFLAGS to suppress R_RISCV_ALIGN which older lld
can not handle:
ld.lld: error: capability.c:(.fixup+0x0): relocation R_RISCV_ALIGN
requires unimplemented linker relaxation; recompile with -mno-relax
but the .o is already compiled with -mno-relax
Commit 38557cabadf3 ("riscv: mmap with PROT_WRITE but no PROT_READ is
invalid") made mmap() return EINVAL if PROT_WRITE was set wihtout
PROT_READ with the justification that a write-only PTE is considered a
reserved PTE permission bit pattern in the privileged spec. This check
is unnecessary since we let VM_WRITE imply VM_READ on RISC-V, and it is
inconsistent with other architectures that don't support write-only PTEs,
creating a potential software portability issue. Just remove the check
altogether and let PROT_WRITE imply PROT_READ as is the case on other
architectures.
Note that this also allows PROT_WRITE|PROT_EXEC mappings which were
disallowed prior to the aforementioned commit; PROT_READ is implied in
such mappings as well.
Fixes: 38557cabadf3 ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid") Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220915193702.2201018-3-abrestic@rivosinc.com/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Independend of the current graphics resolution, adjust the reported
graphics card memory size to the next 4MB boundary.
This fixes the fbtest program which expects a naturally aligned size.
Fix port I/O string accessors such as `insb', `outsb', etc. which use
the physical PCI port I/O address rather than the corresponding memory
mapping to get at the requested location, which in turn breaks at least
accesses made by our parport driver to a PCIe parallel port such as:
PCI parallel port detected: 1415:c118, I/O at 0x1000(0x1008), IRQ 20
parport0: PC-style at 0x1000 (0x1008), irq 20, using FIFO [PCSPP,TRISTATE,COMPAT,EPP,ECP]
For simplicity address the problem by adding PCI_IOBASE to the physical
address requested in the respective wrapper macros only, observing that
the raw accessors such as `__insb', `__outsb', etc. are not supposed to
be used other than by said macros. Remove the cast to `long' that is no
longer needed on `addr' now that it is used as an offset from PCI_IOBASE
and add parentheses around `addr' needed for predictable evaluation in
macro expansion. No need to make said adjustments in separate changes
given that current code is gravely broken and does not ever work.
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: fc7c284523a3f ("RISC-V: Atomic and Locking Code") Cc: stable@vger.kernel.org # v4.15+ Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/alpine.DEB.2.21.2209220223080.29493@angie.orcam.me.uk Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On recent kernels, the PM8058 L16 (or any other PM8058 LDO-regulator)
does not come up if they are supplied by an SMPS-regulator. This
is not very strange since the regulators are registered in a long
array and the L-regulators are registered before the S-regulators,
and if an L-regulator defers, it will never get around to registering
the S-regulator that it needs.
See arch/arm/boot/dts/qcom-apq8060-dragonboard.dts:
Fix this by moving the PM8058 S-regulators first in the array.
Do the same for the PM8901 S-regulators (though this is currently
not causing any problems with out device trees) so that the pattern
of registration order is the same on all PMnnnn chips.
Fixes: 43ba459e2059 ("regulator: qcom: Rework to single platform device") Cc: stable@vger.kernel.org Cc: Andy Gross <agross@kernel.org> Cc: Bjorn Andersson <andersson@kernel.org> Cc: Konrad Dybcio <konrad.dybcio@somainline.org> Cc: linux-arm-msm@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20220909112529.239143-1-linus.walleij@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Slimbus streams are first prepared and then enabled, so the cleanup path
should reverse it. The unprepare sets stream->num_ports to 0 and frees
the stream->ports. Calling disable after unprepare was not really
effective (channels was not deactivated) and could lead to further
issues due to making transfers on unprepared stream.
Fixes: 8b27f352f489 ("ASoC: wcd9335: add support to wcd9335 codec") Cc: <stable@vger.kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20220921145354.1683791-1-krzysztof.kozlowski@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>