With support for Ethernet PHY LEDs having been added, while
unregistering a MDIO bus and its child device liks PHYs there may be
"late" accesses to the MDIO bus. One typical use case is setting the PHY
LEDs brightness to OFF for instance.
We need to ensure that the MDIO bus controller remains entirely
functional since it runs off the main GENET adapter clock.
Meson NAND controller requires 8 bytes alignment for DMA addresses,
otherwise it "aligns" passed address by itself thus accessing invalid
location in the provided buffer. This patch makes unaligned buffers to
be reallocated to become valid.
tpm_amd_is_rng_defective is for dealing with an issue related to the
AMD firmware TPM, so on non-x86 architectures just have it inline and
return false.
Cc: stable@vger.kernel.org # v6.3+ Reported-by: Sachin Sant <sachinp@linux.ibm.com> Reported-by: Aneesh Kumar K. V <aneesh.kumar@linux.ibm.com> Closes: https://lore.kernel.org/lkml/99B81401-DB46-49B9-B321-CF832B50CAC3@linux.ibm.com/ Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs") Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Underlying I2C bus drivers not always support longer transfers and
imx-lpi2c for instance doesn't. The fix is symmetric to previous patch
which fixed the read direction.
Cc: stable@vger.kernel.org # v5.20+ Fixes: bbc23a07b072 ("tpm: Add tpm_tis_i2c backend for tpm_tis_core") Tested-by: Michael Haener <michael.haener@siemens.com> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For Pluton TPM devices, it was assumed that there was no ACPI memory
regions. This is not true for ASUS ROG Ally. ACPI advertises
0xfd500000-0xfd5fffff.
Since remapping is already done in `crb_map_pluton`, remapping again
in `crb_map_io` causes EBUSY error:
[ 3.510453] tpm_crb MSFT0101:00: can't request region for resource [mem 0xfd500000-0xfd5fffff]
[ 3.510463] tpm_crb: probe of MSFT0101:00 failed with error -16
Cc: stable@vger.kernel.org # v6.3+ Fixes: 4d2732882703 ("tpm_crb: Add support for CRB devices based on Pluton") Signed-off-by: Valentin David <valentin.david@gmail.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pinctrl-amd currently tries to program bit 19 of all GPIOs to select
either a 4kΩ or 8hΩ pull up, but this isn't what bit 19 does. Bit
19 is marked as reserved, even in the latest platforms documentation.
Drop this programming functionality.
Tested-by: Jan Visser <starquake@linuxeverywhere.org> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20230705133005.577-4-mario.limonciello@amd.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On ASUS TUF A16 it is reported that the ITE5570 ACPI device connected to
GPIO 7 is causing an interrupt storm. This issue doesn't happen on
Windows.
Comparing the GPIO register configuration between Windows and Linux
bit 20 has been configured as a pull up on Windows, but not on Linux.
Checking GPIO declaration from the firmware it is clear it *should* have
been a pull up on Linux as well.
On Linux amd_gpio_set_config() is currently only used for programming
the debounce. Actually the GPIO core calls it with all the arguments
that are supported by a GPIO, pinctrl-amd just responds `-ENOTSUPP`.
To solve this issue expand amd_gpio_set_config() to support the other
arguments amd_pinconf_set() supports, namely `PIN_CONFIG_BIAS_PULL_DOWN`,
`PIN_CONFIG_BIAS_PULL_UP`, and `PIN_CONFIG_DRIVE_STRENGTH`.
Reported-by: Nik P <npliashechnikov@gmail.com> Reported-by: Nathan Schulte <nmschulte@gmail.com> Reported-by: Friedrich Vock <friedrich.vock@gmx.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217336 Reported-by: dridri85@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217493 Link: https://lore.kernel.org/linux-input/20230530154058.17594-1-friedrich.vock@gmx.de/ Tested-by: Jan Visser <starquake@linuxeverywhere.org> Fixes: 2956b5d94a76 ("pinctrl / gpio: Introduce .set_config() callback for GPIO chips") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230705133005.577-3-mario.limonciello@amd.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
was well intentioned to mask a firmware issue on a surface laptop, but it
has a few problems:
1. It had a bug in the loop handling for iteration 63 that lead to other
problems with GPIO0 handling.
2. It disables interrupts that are used internally by the SOC but masked
by default.
3. It masked a real firmware problem in some chromebooks that should have
been caught during development but wasn't.
There has been a lot of other development around s2idle; particularly
around handling of the spurious wakeups. If there is still a problem on
the original reported surface laptop it should be avoided by adding a quirk
to gpiolib-acpi for that system instead.
Leverage gpiochip_line_is_irq to check whether a pin has an irq
associated with it. The previous check ("irq == 0") didn't make much
sense. The irq variable refers to the pinctrl irq, and has nothing do to
with an individual pin.
On some systems, during suspend/resume cycle, the firmware leaves
an interrupt enabled on a pin that is not used by the kernel.
Without this patch that caused an interrupt storm.
commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
had a mistake in loop iteration 63 that it would clear offset 0xFC instead
of 0x100. Offset 0xFC is actually `WAKE_INT_MASTER_REG`. This was
clearing bits 13 and 15 from the register which significantly changed the
expected handling for some platforms for GPIO0.
commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe")
had a mistake in loop iteration 63 that it would clear offset 0xFC instead
of 0x100. Offset 0xFC is actually `WAKE_INT_MASTER_REG`. This was
clearing bits 13 and 15 from the register which significantly changed the
expected handling for some platforms for GPIO0.
commit b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on resume")
actually fixed this bug, but lead to regressions on Lenovo Z13 and some
other systems. This is because there was no handling in the driver for bit
15 debounce behavior.
Quoting a public BKDG:
```
EnWinBlueBtn. Read-write. Reset: 0. 0=GPIO0 detect debounced power button;
Power button override is 4 seconds. 1=GPIO0 detect debounced power button
in S3/S5/S0i3, and detect "pressed less than 2 seconds" and "pressed 2~10
seconds" in S0; Power button override is 10 seconds
```
Cross referencing the same master register in Windows it's obvious that
Windows doesn't use debounce values in this configuration. So align the
Linux driver to do this as well. This fixes wake on lid when
WAKE_INT_MASTER_REG is properly programmed.
If the firmware has misconfigured a GPIO it may cause interrupt
status or wake status bits to be set and not asserted. Add these
to debug output to catch this case.
GPIO registers include Bit 27 for WakeCntrlZ used to enable wake in
Z state. Hence add Z-state wake control bits to debugfs output to
debug and analyze Z-states problems.
The wptr needs to be incremented at at least 64 dword intervals,
use 256 to align with windows. This should fix potential hangs
with unaligned updates.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e5df16d9428f5c6d2d0b1eff244d6c330ba9ef3a)
The path `drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c` doesn't exist in
6.1.y, only modify the file that does exist. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Generate a hotplug event after registering a client to allow the
client to configure its display. Remove the hotplug calls from the
existing clients for fbdev emulation. This change fixes a concurrency
bug between registering a client and receiving events from the DRM
core. The bug is present in the fbdev emulation of all drivers.
The fbdev emulation currently generates a hotplug event before
registering the client to the device. For each new output, the DRM
core sends an additional hotplug event to each registered client.
If the DRM core detects first output between sending the artificial
hotplug and registering the device, the output's hotplug event gets
lost. If this is the first output, the fbdev console display remains
dark. This has been observed with amdgpu and fbdev-generic.
Fix this by adding hotplug generation directly to the client's
register helper drm_client_register(). Registering the client and
receiving events are serialized by struct drm_device.clientlist_mutex.
So an output is either configured by the initial hotplug event, or
the client has already been registered.
The bug was originally added in commit 6e3f17ee73f7 ("drm/fb-helper:
generic: Call drm_client_add() after setup is done"), in which adding
a client and receiving a hotplug event switched order. It was hidden,
as most hardware and drivers have at least on static output configured.
Other drivers didn't use the internal DRM client or still had struct
drm_mode_config_funcs.output_poll_changed set. That callback handled
hotplug events as well. After not setting the callback in amdgpu in
commit 0e3172bac3f4 ("drm/amdgpu: Don't set struct
drm_driver.output_poll_changed"), amdgpu did not show a framebuffer
console if output events got lost. The bug got copy-pasted from
fbdev-generic into the other fbdev emulation.
Reported-by: Moritz Duge <MoritzDuge@kolahilft.de> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2649 Fixes: 6e3f17ee73f7 ("drm/fb-helper: generic: Call drm_client_add() after setup is done") Fixes: 8ab59da26bc0 ("drm/fb-helper: Move generic fbdev emulation into separate source file") Fixes: b79fe9abd58b ("drm/fbdev-dma: Implement fbdev emulation for GEM DMA helpers") Fixes: 63c381552f69 ("drm/armada: Implement fbdev emulation as in-kernel client") Fixes: 49953b70e7d3 ("drm/exynos: Implement fbdev emulation as in-kernel client") Fixes: 8f1aaccb04b7 ("drm/gma500: Implement client-based fbdev emulation") Fixes: 940b869c2f2f ("drm/msm: Implement fbdev emulation as in-kernel client") Fixes: 9e69bcd88e45 ("drm/omapdrm: Implement fbdev emulation as in-kernel client") Fixes: e317a69fe891 ("drm/radeon: Implement client-based fbdev emulation") Fixes: 71ec16f45ef8 ("drm/tegra: Implement fbdev emulation as in-kernel client") Fixes: 0e3172bac3f4 ("drm/amdgpu: Don't set struct drm_driver.output_poll_changed") Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Tested-by: Moritz Duge <MoritzDuge@kolahilft.de> Tested-by: Torsten Krah <krah.tm@gmail.com> Tested-by: Paul Schyska <pschyska@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Airlie <airlied@gmail.com> Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Inki Dae <inki.dae@samsung.com> Cc: Seung-Woo Kim <sw0312.kim@samsung.com> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Mikko Perttunen <mperttunen@nvidia.com> Cc: dri-devel@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-samsung-soc@vger.kernel.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Cc: linux-tegra@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.2+ Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # msm Link: https://patchwork.freedesktop.org/patch/msgid/20230710091029.27503-1-tzimmermann@suse.de
(cherry picked from commit 27655b9bb9f0d9c32b8de8bec649b676898c52d5)
[ Dropped changes to drivers/gpu/drm/armada/armada_fbdev.c as 174c3c38e3a2 drm/armada: Initialize fbdev DRM client
was introduced in 6.5-rc1.
Dropped changes to exynos, msm, omapdrm, radeon, tegra drivers
as missing code these commits introduced:
99286486d674 drm/exynos: Initialize fbdev DRM client 841ef552b141 drm/msm: Initialize fbdev DRM client 9e69bcd88e45 drm/omapdrm: Implement fbdev emulation as in-kernel client e317a69fe891 drm/radeon: Implement client-based fbdev emulation 9b926bcf2636 drm/radeon: Only build fbdev if DRM_FBDEV_EMULATION is set 25dda38e0b07 drm/tegra: Initialize fbdev DRM client 8f1aaccb04b7 drm/gma500: Implement client-based fbdev emulation b79fe9abd58b drm/fbdev-dma: Implement fbdev emulation for GEM DMA helpers
Use the helper ovl_i_path_realinode() to get realinode and then do
non-nullptr checking.
There are some changes from upstream commit:
1. Corrusponds to do_ovl_get_acl() in 6.1 is ovl_get_acl()
2. Context conflicts caused by 6c0a8bfb84af8f3 ("ovl: implement get acl
method") is handled.
Let helper ovl_i_path_real() return the realinode to prepare for
checking non-null realinode in RCU walking path.
[msz] Use d_inode_rcu() since we are depending on the consitency
between dentry and inode being non-NULL in an RCU setting.
There are some changes from upstream commit:
1. Context conflicts caused by 73db6a063c785bc ("ovl: port to
vfs{g,u}id_t and associated helpers") is handled.
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Fixes: ffa5723c6d25 ("ovl: store lower path in ovl_inode") Cc: <stable@vger.kernel.org> # v5.19 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Added new functions index_hdr_check and index_buf_check.
Now we check all stuff for correctness while reading from disk.
Also fixed bug with stale nfs data.
Reported-by: van fantasy <g1042620637@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Fixes: 82cae269cfa95 ("fs/ntfs3: Add initialization of super block") Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lion says:
-------
In the QFQ scheduler a similar issue to CVE-2023-31436
persists.
Consider the following code in net/sched/sch_qfq.c:
static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch,
struct sk_buff **to_free)
{
unsigned int len = qdisc_pkt_len(skb), gso_segs;
// ...
if (unlikely(cl->agg->lmax < len)) {
pr_debug("qfq: increasing maxpkt from %u to %u for class %u",
cl->agg->lmax, len, cl->common.classid);
err = qfq_change_agg(sch, cl, cl->agg->class_weight, len);
if (err) {
cl->qstats.drops++;
return qdisc_drop(skb, sch, to_free);
}
// ...
}
Similarly to CVE-2023-31436, "lmax" is increased without any bounds
checks according to the packet length "len". Usually this would not
impose a problem because packet sizes are naturally limited.
This is however not the actual packet length, rather the
"qdisc_pkt_len(skb)" which might apply size transformations according to
"struct qdisc_size_table" as created by "qdisc_get_stab()" in
net/sched/sch_api.c if the TCA_STAB option was set when modifying the qdisc.
A user may choose virtually any size using such a table.
As a result the same issue as in CVE-2023-31436 can occur, allowing heap
out-of-bounds read / writes in the kmalloc-8192 cache.
-------
We can create the issue with the following commands:
tc qdisc add dev $DEV root handle 1: stab mtu 2048 tsize 512 mpu 0 \
overhead 999999999 linklayer ethernet qfq
tc class add dev $DEV parent 1: classid 1:1 htb rate 6mbit burst 15k
tc filter add dev $DEV parent 1: matchall classid 1:1
ping -I $DEV 1.1.1.2
This is caused by incorrectly assuming that qdisc_pkt_len() returns a
length within the QFQ_MIN_LMAX < len < QFQ_MAX_LMAX.
Fixes: 462dbc9101ac ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost") Reported-by: Lion <nnamrec@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Two parameters can be transformed into netlink policies and
validated while parsing the netlink message.
Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 3e337087c3b5 ("net/sched: sch_qfq: account for stab overhead in qfq_enqueue") Signed-off-by: Sasha Levin <sashal@kernel.org>
If there is a failure during rtw89_fw_h2c_raw() rtw89_debug_priv_send_h2c
should return negative error code instead of a positive value count.
Fix this bug by returning correct error code.
Eric Dumazet says[1]:
-------
Speaking of psched_mtu(), I see that net/sched/sch_pie.c is using it
without holding RTNL, so dev->mtu can be changed underneath.
KCSAN could issue a warning.
-------
Annotate dev->mtu with READ_ONCE() so KCSAN don't issue a warning.
The simple_write_to_buffer() function is designed to handle partial
writes. It returns negatives on error, otherwise it returns the number
of bytes that were able to be copied. This code doesn't check the
return properly. We only know that the first byte is written, the rest
of the buffer might be uninitialized.
There is no need to use the simple_write_to_buffer() function.
Partial writes are prohibited by the "if (*ppos != 0)" check at the
start of the function. Just use memdup_user() and copy the whole
buffer.
Fixes: d3cbb907ae57 ("netdevsim: add ACL trap reporting cookie as a metadata") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/7c1f950b-3a7d-4252-82a6-876e53078ef7@moroto.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
lkp reports below sparse warning when building for RV32:
arch/riscv/mm/init.c:1204:48: sparse: warning: cast truncates bits from
constant value (100000000 becomes 0)
IMO, the reason we didn't see this truncates bug in real world is "0"
means MEMBLOCK_ALLOC_ACCESSIBLE in memblock and there's no RV32 HW
with more than 4GB memory.
Fix it anyway to make sparse happy.
Fixes: decf89f86ecd ("riscv: try to allocate crashkern region from 32bit addressible memory") Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202306080034.SLiCiOMn-lkp@intel.com/ Link: https://lore.kernel.org/r/20230709171036.1906-1-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The kernel does not currently validate that both the minimum and maximum
ports of a port range are specified. This can lead user space to think
that a filter matching on a port range was successfully added, when in
fact it was not. For example, with a patched (buggy) iproute2 that only
sends the minimum port, the following commands do not return an error:
# tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp src_port 100-200 action pass
# tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp dst_port 100-200 action pass
# tc filter show dev swp1 ingress
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
eth_type ipv4
ip_proto udp
not_in_hw
action order 1: gact action pass
random type none pass val 0
index 1 ref 1 bind 1
filter protocol ip pref 1 flower chain 0 handle 0x2
eth_type ipv4
ip_proto udp
not_in_hw
action order 1: gact action pass
random type none pass val 0
index 2 ref 1 bind 1
Fix by returning an error unless both ports are specified:
# tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp src_port 100-200 action pass
Error: Both min and max source ports must be specified.
We have an error talking to the kernel
# tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp dst_port 100-200 action pass
Error: Both min and max destination ports must be specified.
We have an error talking to the kernel
Fixes: 5c72299fba9d ("net: sched: cls_flower: Classify packets using port ranges") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the cpu_map_update_elem flow, when kthread_stop is called before
calling the threadfn of rcpu->kthread, since the KTHREAD_SHOULD_STOP bit
of kthread has been set by kthread_stop, the threadfn of rcpu->kthread
will never be executed, and rcpu->refcnt will never be 0, which will
lead to the allocated rcpu, rcpu->queue and rcpu->queue->queue cannot be
released.
Calling kthread_stop before executing kthread's threadfn will return
-EINTR. We can complete the release of memory resources in this state.
Fixes: 6710e1126934 ("bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP") Signed-off-by: Pu Lehui <pulehui@huawei.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230711115848.2701559-1-pulehui@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Quieten a gcc (11.3.0) build error or warning by checking the function
call status and returning -EBUSY if the function call failed.
This is similar to what several other wireless drivers do for the
SIOCGIWRATE ioctl call when there is a locking problem.
drivers/net/wireless/cisco/airo.c: error: 'status_rid.currentXmitRate' is used uninitialized [-Werror=uninitialized]
DAX can be used to share page cache between VMs, reducing guest memory
overhead. And chunk based data format is widely used for VM and
container image. So enable dax support for it, make erofs better used
for VM scenarios.
z_erofs_do_read_page() may loop infinitely due to the inappropriate
truncation in the below statement. Since the offset is 64 bits and min_t()
truncates the result to 32 bits. The solution is to replace unsigned int
with a 64-bit type, such as erofs_off_t.
cur = end - min_t(unsigned int, offset + end - map->m_la, end);
- For example:
- offset = 0x400160000
- end = 0x370
- map->m_la = 0x160370
- offset + end - map->m_la = 0x400000000
- offset + end - map->m_la = 0x00000000 (truncated as unsigned int)
- Expected result:
- cur = 0
- Actual result:
- cur = 0x370
z_erofs_pcluster_readmore() may take a long time to loop when the page
offset is large enough, which is unnecessary should be prevented.
For example, when the following case is encountered, it will loop 4691368
times, taking about 27 seconds:
- offset = 19217289215
- inode_size = 1442672
Commit a4d86249c773 ("drm/i915/gt: Provide a utility to create a scratch
buffer") mistakenly passed in uapi I915_CACHING_CACHED as argument to
i915_gem_object_set_cache_coherency(), which actually takes internal
enum i915_cache_level.
No functional issue since the value matches I915_CACHE_LLC (1 == 1), which
is the intended caching mode, but lets clean it up nevertheless.
In order to generate the prologue and epilogue, the BPF JIT needs to
know which registers that are clobbered. Therefore, the during
pre-final passes, the prologue is generated after the body of the
program body-prologue-epilogue. Then, in the final pass, a proper
prologue-body-epilogue JITted image is generated.
This scheme has worked most of the time. However, for some large
programs with many jumps, e.g. the test_kmod.sh BPF selftest with
hardening enabled (blinding constants), this has shown to be
incorrect. For the final pass, when the proper prologue-body-epilogue
is generated, the image has not converged. This will lead to that the
final image will have incorrect jump offsets. The following is an
excerpt from an incorrect image:
The image has shrunk, and the epilogue offset is incorrect in the
final pass.
Correct the problem by always generating proper prologue-body-epilogue
outputs, which means that the first pass will only generate the body
to track what registers that are touched.
Fixes: 2353ecc6f91f ("bpf, riscv: add BPF JIT for RV64G") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230710074131.19596-1-bjorn@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
The insertion of an empty frame was introduced with
commit db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit")
in order to ensure that the current cycle has at least one packet if
there is some packet to be scheduled for the next cycle.
However, the current implementation does not properly check if
a packet is already scheduled for the current cycle. Currently,
an empty packet is always inserted if and only if
txtime >= end_of_cycle && txtime > last_tx_cycle
but since last_tx_cycle is always either the end of the current
cycle (end_of_cycle) or the end of a previous cycle, the
second part (txtime > last_tx_cycle) is always true unless
txtime == last_tx_cycle.
What actually needs to be checked here is if the last_tx_cycle
was already written within the current cycle, so an empty frame
should only be inserted if and only if
txtime >= end_of_cycle && end_of_cycle > last_tx_cycle.
This patch does not only avoid an unnecessary insertion, but it
can actually be harmful to insert an empty packet if packets
are already scheduled in the current cycle, because it can lead
to a situation where the empty packet is actually processed
as the first packet in the upcoming cycle shifting the packet
with the first_flag even one cycle into the future, finally leading
to a TX hang.
Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit") Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
It is possible (verified on a running system) that frames are processed
by igc_tx_launchtime with a txtime before the start of the cycle
(baset_est).
However, the result of txtime - baset_est is written into a u32,
leading to a wrap around to a positive number. The following
launchtime > 0 check will only branch to executing launchtime = 0
if launchtime is already 0.
Fix it by using a s32 before checking launchtime > 0.
Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit") Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The enable_trace_eprobe() function enables all event probes, attached
to given trace probe. If an error occurs in enabling one of the event
probes, all others should be roll backed. There is a bug in that roll
back logic - instead of all event probes, only the failed one is
disabled.
Link: https://lore.kernel.org/all/20230703042853.1427493-1-tz.stoyanov@gmail.com/ Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: 7491e2c44278 ("tracing: Add a probe that attaches to trace events") Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The while-loop may break on one of the two conditions, either ID string
is empty or GUID matches. The second one, may never be reached if the
parsed string is not correct GUID. In such a case the loop will never
advance to check the next ID.
Break possible infinite loop by factoring out guid_parse_and_compare()
helper which may be moved to the generic header for everyone later on
and preventing from similar mistake in the future.
Interestingly that firstly it appeared when WMI was turned into a bus
driver, but later when duplicated GUIDs were checked, the while-loop
has been replaced by for-loop and hence no mistake made again.
Fixes: a48e23385fcf ("platform/x86: wmi: add context pointer field to struct wmi_device_id") Fixes: 844af950da94 ("platform/x86: wmi: Turn WMI into a bus driver") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230621151155.78279-1-andriy.shevchenko@linux.intel.com Tested-by: Armin Wolf <W_Armin@gmx.de> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add check for the return value of skb_copy in order to avoid NULL pointer
dereference.
Fixes: 2cd548566384 ("net: dsa: qca8k: add support for phy read/write with mgmt Ethernet") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Now in addrconf_mod_rs_timer(), reference idev depends on whether
rs_timer is not pending. Then modify rs_timer timeout.
There is a time gap in [1], during which if the pending rs_timer
becomes not pending. It will miss to hold idev, but the rs_timer
is activated. Thus rs_timer callback function addrconf_rs_timer()
will be executed and put idev later without holding idev. A refcount
underflow issue for idev can be caused by this.
if (!timer_pending(&idev->rs_timer))
in6_dev_hold(idev);
<--------------[1]
mod_timer(&idev->rs_timer, jiffies + when);
To fix the issue, hold idev if mod_timer() return 0.
Fixes: b7b1bfce0bb6 ("ipv6: split duplicate address detection and router solicitation timer") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
If device_register() returns error, the name allocated by
dev_set_name() need be freed. As comment of device_register()
says, it should use put_device() to give up the reference in
the error path. So fix this by calling put_device(), then the
name can be freed in kobject_cleanup(), and client_dev is freed
in ntb_transport_client_release().
Fixes: fce8a7bb5b4b ("PCI-Express Non-Transparent Bridge Support") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: Sasha Levin <sashal@kernel.org>
The reason is that intel_ntb_pci_driver_init() returns
pci_register_driver() directly without checking its return value, if
pci_register_driver() failed, it returns without destroy the newly created
debugfs, resulting the debugfs of ntb_hw_intel can never be created later.
The reason is that amd_ntb_pci_driver_init() returns pci_register_driver()
directly without checking its return value, if pci_register_driver()
failed, it returns without destroy the newly created debugfs, resulting
the debugfs of ntb_hw_amd can never be created later.
The reason is that idt_pci_driver_init() returns pci_register_driver()
directly without checking its return value, if pci_register_driver()
failed, it returns without destroy the newly created debugfs, resulting
the debugfs of ntb_hw_idt can never be created later.
Amit Klein reported that udp6_ehash_secret was initialized but never used.
Fixes: 1bbdceef1e53 ("inet: convert inet_ehash_secret and ipv6_hash_secret to net_get_random_once") Reported-by: Amit Klein <aksecurity@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Cc: David Ahern <dsahern@kernel.org> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
With some IPv6 Ext Hdr (RPL, SRv6, etc.), we can send a packet that
has the link-local address as src and dst IP and will be forwarded to
an external IP in the IPv6 Ext Hdr.
For example, the script below generates a packet whose src IP is the
link-local address and dst is updated to 11::.
# for f in $(find /proc/sys/net/ -name *seg6_enabled*); do echo 1 > $f; done
# python3
>>> from socket import *
>>> from scapy.all import *
>>>
>>> SRC_ADDR = DST_ADDR = "fe80::5054:ff:fe12:3456"
>>>
>>> pkt = IPv6(src=SRC_ADDR, dst=DST_ADDR)
>>> pkt /= IPv6ExtHdrSegmentRouting(type=4, addresses=["11::", "22::"], segleft=1)
>>>
>>> sk = socket(AF_INET6, SOCK_RAW, IPPROTO_RAW)
>>> sk.sendto(bytes(pkt), (DST_ADDR, 0))
For such a packet, we call ip6_route_input() to look up a route for the
next destination in these three functions depending on the header type.
If no route is found, ip6_null_entry is set to skb, and the following
dst_input(skb) calls ip6_pkt_drop().
Finally, in icmp6_dev(), we dereference skb_rt6_info(skb)->rt6i_idev->dev
as the input device is the loopback interface. Then, we have to check if
skb_rt6_info(skb)->rt6i_idev is NULL or not to avoid NULL pointer deref
for ip6_null_entry.
Fixes: 4832c30d5458 ("net: ipv6: put host and anycast routes on device with address") Reported-by: Wang Yufen <wangyufen@huawei.com> Closes: https://lore.kernel.org/netdev/c41403a9-c2f6-3b7e-0c96-e1901e605cd0@huawei.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the critical scenario, rx-gro-list GRO-ed packets are fed, via a
bridge, both to the local input path and to an egress device (tun).
The segmentation of such packets unsafely writes to the cloned skbs
with shared heads.
This change addresses the issue by uncloning as needed the
to-be-segmented skbs.
Reported-by: Ian Kumlien <ian.kumlien@gmail.com> Tested-by: Ian Kumlien <ian.kumlien@gmail.com> Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Turning IRQs off is done by accessing Ethernet controller registers.
That can't be done until device's clock is enabled. It results in a SoC
hang otherwise.
This bug remained unnoticed for years as most bootloaders keep all
Ethernet interfaces turned on. It seems to only affect a niche SoC
family BCM47189. It has two Ethernet controllers but CFE bootloader uses
only the first one.
Fixes: 34322615cbaa ("net: bgmac: Mask interrupts during probe") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Remove unnecessary early code development check and the WARN_ON
that it uses. The irq alloc and free paths have long been
cleaned up and this check shouldn't have stuck around so long.
Fixes: 77ceb68e29cc ("ionic: Add notifyq support") Signed-off-by: Nitya Sunkad <nitya.sunkad@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In legacy silicon, promiscuous mode is only modified
through CGX mbox messages. In CN10KB silicon, it is modified
from CGX mbox and NIX. This breaks legacy application
behaviour. Fix this by removing call from NIX.
Fixes: d6c9784baf59 ("octeontx2-af: Invoke exact match functions if supported") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Current duplex mode was unset in the driver, resulting in the default
parameter being set to 0, which corresponds to half duplex. It might
mislead users to have incorrect expectation about the driver's
transmission capabilities.
Set the default duplex configuration to full, as the driver runs in
full duplex mode at this point.
Fixes: 7e074d5a76ca ("gve: Enable Link Speed Reporting in the driver.") Signed-off-by: Junfeng Guo <junfeng.guo@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Message-ID: <20230706044128.2726747-1-junfeng.guo@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the event of a failure in tcf_change_indev(), fw_set_parms() will
immediately return an error after incrementing or decrementing
reference counter in tcf_bind_filter(). If attacker can control
reference counter to zero and make reference freed, leading to
use after free.
In order to prevent this, move the point of possible failure above the
point where the TC_FW_CLASSID is handled.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: M A Ramdhan <ramdhan@starlabs.sg> Signed-off-by: M A Ramdhan <ramdhan@starlabs.sg> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Message-ID: <20230705161530.52003-1-ramdhan@starlabs.sg> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If we boot with mvneta.txq_number=1, the txq_map is set incorrectly:
MVNETA_CPU_TXQ_ACCESS(1) refers to TX queue 1, but only TX queue 0 is
initialized. Fix this.
Fixes: 50bf8cb6fc9c ("net: mvneta: Configure XPS support") Signed-off-by: Klaus Kudielka <klaus.kudielka@gmail.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://lore.kernel.org/r/20230705053712.3914-1-klaus.kudielka@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The check_max_stack_depth pass happens after the verifier's symbolic
execution, and attempts to walk the call graph of the BPF program,
ensuring that the stack usage stays within bounds for all possible call
chains. There are two cases to consider: bpf_pseudo_func and
bpf_pseudo_call. In the former case, the callback pointer is loaded into
a register, and is assumed that it is passed to some helper later which
calls it (however there is no way to be sure), but the check remains
conservative and accounts the stack usage anyway. For this particular
case, asynchronous callbacks are skipped as they execute asynchronously
when their corresponding event fires.
The case of bpf_pseudo_call is simpler and we know that the call is
definitely made, hence the stack depth of the subprog is accounted for.
However, the current check still skips an asynchronous callback even if
a bpf_pseudo_call was made for it. This is erroneous, as it will miss
accounting for the stack usage of the asynchronous callback, which can
be used to breach the maximum stack depth limit.
Fix this by only skipping asynchronous callbacks when the instruction is
not a pseudo call to the subprog.
When RESET_CONTROLLER is not set, kconfig complains about missing
dependencies for RESET_TI_SYSCON, so add the missing dependency just as is
done above for SCSI_UFS_QCOM.
Silences this kconfig warning:
WARNING: unmet direct dependencies detected for RESET_TI_SYSCON
Depends on [n]: RESET_CONTROLLER [=n] && HAS_IOMEM [=y]
Selected by [m]:
- SCSI_UFS_MEDIATEK [=m] && SCSI_UFSHCD [=y] && SCSI_UFSHCD_PLATFORM [=y] && ARCH_MEDIATEK [=y]
Fixes: de48898d0cb6 ("scsi: ufs-mediatek: Create reset control device_link") Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: lore.kernel.org/r/202306020859.1wHg9AaT-lkp@intel.com Link: https://lore.kernel.org/r/20230701052348.28046-1-rdunlap@infradead.org Cc: Stanley Chu <stanley.chu@mediatek.com> Cc: Peter Wang <peter.wang@mediatek.com> Cc: Paul Gazzillo <paul@pgazz.com> Cc: Necip Fazil Yildiran <fazilyildiran@gmail.com> Cc: linux-scsi@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mediatek@lists.infradead.org Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This should be negative -EAGAIN instead of positive. The callers treat
non-zero error codes the same so it doesn't really impact runtime beyond
some trivial differences to debug output.
Fixes: 80676d054e5a ("scsi: qla2xxx: Fix session cleanup hang") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/49866d28-4cfe-47b0-842b-78f110e61aab@moroto.mountain Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When a device-mapper device is passing through the inline encryption
support of an underlying device, calls to blk_crypto_evict_key() take
the blk_crypto_profile::lock of the device-mapper device, then take the
blk_crypto_profile::lock of the underlying device (nested). This isn't
a real deadlock, but it causes a lockdep report because there is only
one lock class for all instances of this lock.
Lockdep subclasses don't really work here because the hierarchy of block
devices is dynamic and could have more than 2 levels.
Instead, register a dynamic lock class for each blk_crypto_profile, and
associate that with the lock.
This avoids false-positive lockdep reports like the following:
============================================
WARNING: possible recursive locking detected
6.4.0-rc5 #2 Not tainted
--------------------------------------------
fscryptctl/1421 is trying to acquire lock: ffffff80829ca418 (&profile->lock){++++}-{3:3}, at: __blk_crypto_evict_key+0x44/0x1c0
but task is already holding lock: ffffff8086b68ca8 (&profile->lock){++++}-{3:3}, at: __blk_crypto_evict_key+0xc8/0x1c0
other info that might help us debug this:
Possible unsafe locking scenario:
I225/6 hardware can be programmed to start PPS output once
the time in Target Time registers is reached. The time
programmed in these registers should always be into future.
Only then PPS output is triggered when SYSTIM register
reaches the programmed value. There are two modes in i225/6
hardware to program PPS, pulse and clock mode.
There were issues reported where PPS is not generated when
start time is in past.
Example 1, "echo 0 0 0 2 0 > /sys/class/ptp/ptp0/period"
In the current implementation, a value of '0' is programmed
into Target time registers and PPS output is in pulse mode.
Eventually an interrupt which is triggered upon SYSTIM
register reaching Target time is not fired. Thus no PPS
output is generated.
Example 2, "echo 0 0 0 1 0 > /sys/class/ptp/ptp0/period"
Above case, a value of '0' is programmed into Target time
registers and PPS output is in clock mode. Here, HW tries to
catch-up the current time by incrementing Target Time
register. This catch-up time seem to vary according to
programmed PPS period time as per the HW design. In my
experiments, the delay ranged between few tens of seconds to
few minutes. The PPS output is only generated after the
Target time register reaches current time.
In my experiments, I also observed PPS stopped working with
below test and could not recover until module is removed and
loaded again.
Currently the check for NOT_READY flag is performed before obtaining the
necessary lock. This opens a possibility for race condition when the flow
is concurrently removed from unready_flows list by the workqueue task,
which causes a double-removal from the list and a crash[0]. Fix the issue
by moving the flag check inside the section protected by
uplink_priv->unready_flows_lock mutex.
When kvzalloc_node or kvzalloc failed in mlx5e_ptp_open, the memory
pointed by "c" or "cparams" is not freed, which can lead to a memory
leak. Fix by freeing the array in the error path.
The memory pointed to by the fs->any pointer is not freed in the error
path of mlx5e_fs_tt_redirect_any_create, which can lead to a memory leak.
Fix by freeing the memory in the error path, thereby making the error path
identical to mlx5e_fs_tt_redirect_any_destroy().
In function accel_fs_tcp_create_groups(), when the ft->g memory is
successfully allocated but the 'in' memory fails to be allocated, the
memory pointed to by ft->g is released once. And in function
accel_fs_tcp_create_table, mlx5e_destroy_flow_table is called to release
the memory pointed to by ft->g again. This will cause double free problem.
Remove unnecessary delay during the TX ring configuration.
This will cause delay, especially during link down and
link up activity.
Furthermore, old SKUs like as I225 will call the reset_adapter
to reset the controller during TSN mode Gate Control List (GCL)
setting. This will add more time to the configuration of the
real-time use case.
It doesn't mentioned about this delay in the Software User Manual.
It might have been ported from legacy code I210 in the past.
Fixes: 13b5b7fd6a4a ("igc: Add support for Tx/Rx rings") Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Remove incorrect check in ice_validate_mqprio_opt() that limits
filter configuration when sum of max_rates of all TCs exceeds
the link speed. The max rate of each TC is unrelated to value
used by other TCs and is valid as long as it is less than link
speed.
Fixes: fbc7b27af0f9 ("ice: enable ndo_setup_tc support for mqprio_qdisc") Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add missing drm_display_mode DRM_MODE_FLAG_NVSYNC | DRM_MODE_FLAG_NHSYNC
flags. Those are used by various bridges in the pipeline to correctly
configure its sync signals polarity.
Fixes: d69de69f2be1 ("drm/panel: simple: Add Powertip PH800480T013 panel") Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230615201602.565948-1-marex@denx.de Signed-off-by: Sasha Levin <sashal@kernel.org>
Although the desired size of the SWIOTLB memory pool is increased in
swiotlb_adjust_nareas() to match the number of areas, the actual allocation
may be smaller, which may require reducing the number of areas.
For example, Xen uses swiotlb_init_late(), which in turn uses the page
allocator. On x86, page size is 4 KiB and MAX_ORDER is 10 (1024 pages),
resulting in a maximum memory pool size of 4 MiB. This corresponds to 2048
slots of 2 KiB each. The minimum area size is 128 (IO_TLB_SEGSIZE),
allowing at most 2048 / 128 = 16 areas.
If num_possible_cpus() is greater than the maximum number of areas, areas
are smaller than IO_TLB_SEGSIZE and contiguous groups of free slots will
span multiple areas. When allocating and freeing slots, only one area will
be properly locked, causing race conditions on the unlocked slots and
ultimately data corruption, kernel hangs and crashes.
Fixes: 20347fca71a3 ("swiotlb: split up the global swiotlb lock") Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
At the moment the AMD encrypted platform reserves 6% of RAM for SWIOTLB
or 1GB, whichever is less. However it is possible that there is no block
big enough in the low memory which make SWIOTLB allocation fail and
the kernel continues without DMA. In such case a VM hangs on DMA.
This moves alloc+remap to a helper and calls it from a loop where
the size is halved on each iteration.
This updates default_nslabs on successful allocation which looks like
an oversight as not doing so should have broken callers of
swiotlb_size_or_default().
Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Stable-dep-of: 8ac04063354a ("swiotlb: reduce the number of areas to match actual memory pool size") Signed-off-by: Sasha Levin <sashal@kernel.org>
The number of areas defaults to the number of possible CPUs. However, the
total number of slots may have to be increased after adjusting the number
of areas. Consequently, the number of areas must be determined before
allocating the memory pool. This is even explained with a comment in
swiotlb_init_remap(), but swiotlb_init_late() adjusts the number of areas
after slots are already allocated. The areas may end up being smaller than
IO_TLB_SEGSIZE, which breaks per-area locking.
While fixing swiotlb_init_late(), move all relevant comments before the
definition of swiotlb_adjust_nareas() and convert them to kernel-doc.
Fixes: 20347fca71a3 ("swiotlb: split up the global swiotlb lock") Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Memory for the "struct device" for any given device isn't supposed to
be released until the device's release() is called. This is important
because someone might be holding a kobject reference to the "struct
device" and might try to access one of its members even after any
other cleanup/uninitialization has happened.
Code analysis of ti-sn65dsi86 shows that this isn't quite right. When
the code was written, it was believed that we could rely on the fact
that the child devices would all be freed before the parent devices
and thus we didn't need to worry about a release() function. While I
still believe that the parent's "struct device" is guaranteed to
outlive the child's "struct device" (because the child holds a kobject
reference to the parent), the parent's "devm" allocated memory is a
different story. That appears to be freed much earlier.
Let's make this better for ti-sn65dsi86 by allocating each auxiliary
with kzalloc and then free that memory in the release().
ksmbd does not consider the case of that smb2 session setup is
in compound request. If this is the second payload of the compound,
OOB read issue occurs while processing the first payload in
the smb2_sess_setup().
Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-21355 Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch add the compound request handling to the some commands.
Existing clients do not send these commands as compound requests,
but ksmbd should consider that they may come.
Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dave Airlie reports that gcc-13.1.1 has started complaining about some
of the workqueue code in 32-bit arm builds:
kernel/workqueue.c: In function ‘get_work_pwq’:
kernel/workqueue.c:713:24: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
713 | return (void *)(data & WORK_STRUCT_WQ_DATA_MASK);
| ^
[ ... a couple of other cases ... ]
and while it's not immediately clear exactly why gcc started complaining
about it now, I suspect it's some C23-induced enum type handlign fixup in
gcc-13 is the cause.
Whatever the reason for starting to complain, the code and data types
are indeed disgusting enough that the complaint is warranted.
The wq code ends up creating various "helper constants" (like that
WORK_STRUCT_WQ_DATA_MASK) using an enum type, which is all kinds of
confused. The mask needs to be 'unsigned long', not some unspecified
enum type.
To make matters worse, the actual "mask and cast to a pointer" is
repeated a couple of times, and the cast isn't even always done to the
right pointer, but - as the error case above - to a 'void *' with then
the compiler finishing the job.
That's now how we roll in the kernel.
So create the masks using the proper types rather than some ambiguous
enumeration, and use a nice helper that actually does the type
conversion in one well-defined place.
Incidentally, this magically makes clang generate better code. That,
admittedly, is really just a sign of clang having been seriously
confused before, and cleaning up the typing unconfuses the compiler too.
I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().
The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0
This is repeatable with different filesystems, using raw block devices
and using different block devices.
Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.
After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).
There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.
Cc: stable@vger.kernel.org # 5.10+ Cc: Pavel Begunkov <asml.silence@gmail.com> Cc: io-uring@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Andres Freund <andres@anarazel.de> Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup] Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A recent change to start counting SuperH IRQ #s from 16 breaks support
for the Hitachi HD64461 companion chip.
Move the offchip IRQ base and HD64461 IRQ # by 16 in order to
accommodate for the new virq numbering rules.
Fixes: a8ac2961148e ("sh: Avoid using IRQ0 on SH3 and SH4") Signed-off-by: Artur Rojek <contact@artur-rojek.eu> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/20230710233132.69734-1-contact@artur-rojek.eu Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Take into account the virq offset when translating cascaded interrupts.
Fixes: a8ac2961148e8c72 ("sh: Avoid using IRQ0 on SH3 and SH4") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/7d0cb246c9f1cd24bb1f637ec5cb67e799a4c3b8.1688908227.git.geert+renesas@glider.be Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Take into account the virq offset when translating cascaded IRL
interrupts.
Fixes: a8ac2961148e8c72 ("sh: Avoid using IRQ0 on SH3 and SH4") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/4fcb0d08a2b372431c41e04312742dc9e41e1be4.1688908186.git.geert+renesas@glider.be Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When booting rts7751r2dplus_defconfig on QEMU, the system hangs due to
an interrupt storm on IRQ 20. IRQ 20 aka event 0x280 is a cascaded IRL
interrupt, which maps to IRQ_VOYAGER, the interrupt used by the Silicon
Motion SM501 multimedia companion chip. As rts7751r2d_irq_demux() does
not take into account the new virq offset, the interrupt is no longer
translated, leading to an unhandled interrupt.
Fix this by taking into account the virq offset when translating
cascaded IRL interrupts.
Fixes: a8ac2961148e8c72 ("sh: Avoid using IRQ0 on SH3 and SH4") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/r/fbfea3ad-d327-4ad5-ac9c-648c7ca3fe1f@roeck-us.net Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/2c99d5df41c40691f6c407b7b6a040d406bc81ac.1688901306.git.geert+renesas@glider.be Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Making 'blk' sector_t (i.e. 64 bit if LBD support is active) fails the
'blk>0' test in the partition block loop if a value of (signed int) -1 is
used to mark the end of the partition block list.
Explicitly cast 'blk' to signed int to allow use of -1 to terminate the
partition block linked list.
Fixes: b6f3f28f604b ("block: add overflow checks for Amiga partition support") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Link: https://lore.kernel.org/r/024ce4fa-cc6d-50a2-9aae-3701d0ebf668@xenosoft.de Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Martin Steigerwald <martin@lichtvoll.de> Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>