Heiko Schocher [Wed, 23 Nov 2022 07:16:36 +0000 (08:16 +0100)]
can: sja1000: fix size of OCR_MODE_MASK define
[ Upstream commit
d7e9bc660303a18f96f0634f058930ae998d574f ]
bitfield mode in ocr register has only 2 bits not 3, so correct
the OCR_MODE_MASK define.
Signed-off-by: Heiko Schocher <hs@denx.de>
Link: https://lore.kernel.org/all/20221123071636.2407823-1-hs@denx.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Ricardo Ribalda [Mon, 21 Nov 2022 23:38:55 +0000 (00:38 +0100)]
pinctrl: meditatek: Startup with the IRQs disabled
[ Upstream commit
60668f9fd3f572e5fe11d153816207ec02153bf0 ]
If the system is restarted via kexec(), the peripherals do not start
with a known state.
If the previous system had enabled an IRQs we will receive unexected
IRQs that can lock the system.
[ 28.109251] watchdog: BUG: soft lockup - CPU#0 stuck for 26s!
[swapper/0:0]
[ 28.109263] Modules linked in:
[ 28.109273] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
5.15.79-14458-g4b9edf7b1ac6 #1
9f2e76613148af94acccd64c609a552fb4b4354b
[ 28.109284] Hardware name: Google Elm (DT)
[ 28.109290] pstate:
40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[ 28.109298] pc : __do_softirq+0xa0/0x388
[ 28.109309] lr : __do_softirq+0x70/0x388
[ 28.109316] sp :
ffffffc008003ee0
[ 28.109321] x29:
ffffffc008003f00 x28:
000000000000000a x27:
0000000000000080
[ 28.109334] x26:
0000000000000001 x25:
ffffffefa7b350c0 x24:
ffffffefa7b47480
[ 28.109346] x23:
ffffffefa7b3d000 x22:
0000000000000000 x21:
ffffffefa7b0fa40
[ 28.109358] x20:
ffffffefa7b005b0 x19:
ffffffefa7b47480 x18:
0000000000065b6b
[ 28.109370] x17:
ffffffefa749c8b0 x16:
000000000000018c x15:
00000000000001b8
[ 28.109382] x14:
00000000000d3b6b x13:
0000000000000006 x12:
0000000000057e91
[ 28.109394] x11:
0000000000000000 x10:
0000000000000000 x9 :
ffffffefa7b47480
[ 28.109406] x8 :
00000000000000e0 x7 :
000000000f424000 x6 :
0000000000000000
[ 28.109418] x5 :
ffffffefa7dfaca0 x4 :
ffffffefa7dfadf0 x3 :
000000000000000f
[ 28.109429] x2 :
0000000000000000 x1 :
0000000000000100 x0 :
0000000001ac65c5
[ 28.109441] Call trace:
[ 28.109447] __do_softirq+0xa0/0x388
[ 28.109454] irq_exit+0xc0/0xe0
[ 28.109464] handle_domain_irq+0x68/0x90
[ 28.109473] gic_handle_irq+0xac/0xf0
[ 28.109480] call_on_irq_stack+0x28/0x50
[ 28.109488] do_interrupt_handler+0x44/0x58
[ 28.109496] el1_interrupt+0x30/0x58
[ 28.109506] el1h_64_irq_handler+0x18/0x24
[ 28.109512] el1h_64_irq+0x7c/0x80
[ 28.109519] arch_local_irq_enable+0xc/0x18
[ 28.109529] default_idle_call+0x40/0x140
[ 28.109539] do_idle+0x108/0x290
[ 28.109547] cpu_startup_entry+0x2c/0x30
[ 28.109554] rest_init+0xe8/0xf8
[ 28.109562] arch_call_rest_init+0x18/0x24
[ 28.109571] start_kernel+0x338/0x42c
[ 28.109578] __primary_switched+0xbc/0xc4
[ 28.109588] Kernel panic - not syncing: softlockup: hung tasks
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Link: https://lore.kernel.org/r/20221122-mtk-pinctrl-v1-1-bedf5655a3d2@chromium.org
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Mark Brown [Wed, 11 May 2022 13:41:37 +0000 (14:41 +0100)]
ASoC: ops: Check bounds for second channel in snd_soc_put_volsw_sx()
[ Upstream commit
ca218ac85398b68850cc04b8338f61f17019b1eb ]
The bounds checks in snd_soc_put_volsw_sx() are only being applied to the
first channel, meaning it is possible to write out of bounds values to the
second channel in stereo controls. Add appropriate checks.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220511134137.169575-2-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jialiang Wang [Wed, 10 Aug 2022 07:30:57 +0000 (15:30 +0800)]
nfp: fix use-after-free in area_cache_get()
commit
899b3df06375e56bd6327ff2d261d9128c09f6d1 upstream.
area_cache_get() is used to distribute cache->area and set cache->id,
and if cache->id is not 0 and cache->area->kref refcount is 0, it will
release the cache->area by nfp_cpp_area_release(). area_cache_get()
set cache->id before cpp->op->area_init() and nfp_cpp_area_acquire().
But if area_init() or nfp_cpp_area_acquire() fails, the cache->id is
is already set but the refcount is not increased as expected. At this
time, calling the nfp_cpp_area_release() will cause use-after-free.
To avoid the use-after-free, set cache->id after area_init() and
nfp_cpp_area_acquire() complete successfully.
Note: This vulnerability is triggerable by providing emulated device
equipped with specified configuration.
BUG: KASAN: use-after-free in nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
Write of size 4 at addr
ffff888005b7f4a0 by task swapper/0/1
Call Trace:
<TASK>
nfp6000_area_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:760)
area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:884)
Allocated by task 1:
nfp_cpp_area_alloc_with_name (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:303)
nfp_cpp_area_cache_add (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:802)
nfp6000_init (drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c:1230)
nfp_cpp_from_operations (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:1215)
nfp_pci_probe (drivers/net/ethernet/netronome/nfp/nfp_main.c:744)
Freed by task 1:
kfree (mm/slub.c:4562)
area_cache_get.constprop.8 (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:873)
nfp_cpp_read (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:924 drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c:973)
nfp_cpp_readl (drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cpplib.c:48)
Signed-off-by: Jialiang Wang <wangjialiang0806@163.com>
Reviewed-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220810073057.4032-1-wangjialiang0806@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ming Lei [Tue, 13 Dec 2022 07:16:03 +0000 (15:16 +0800)]
block: unhash blkdev part inode when the part is deleted
v5.11 changes the blkdev lookup mechanism completely since commit
0f711687d915 ("block: simplify bdev/disk lookup in blkdev_get"),
and small part of the change is to unhash part bdev inode when
deleting partition. Turns out this kind of change does fix one
nasty issue in case of BLOCK_EXT_MAJOR:
1) when one partition is deleted & closed, disk_put_part() is always
called before bdput(bdev), see blkdev_put(); so the part's devt can
be freed & re-used before the inode is dropped
2) then new partition with same devt can be created just before the
inode in 1) is dropped, then the old inode/bdev structurein 1) is
re-used for this new partition, this way causes use-after-free and
kernel panic.
It isn't possible to backport the whole big patchset of "merge struct
block_device and struct hd_struct v4" for addressing this issue.
https://lore.kernel.org/linux-block/
20201128161510.347752-1-hch@lst.de/
So fixes it by unhashing part bdev in delete_partition(), and this way
is actually aligned with v5.11+'s behavior.
Backported from the following 5.10.y commit:
5f2f77560591 ("block: unhash blkdev part inode when the part is deleted")
Reported-by: Shiwei Cui <cuishw@inspur.com>
Tested-by: Shiwei Cui <cuishw@inspur.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Baolin Wang [Thu, 1 Sep 2022 10:41:31 +0000 (18:41 +0800)]
mm/hugetlb: fix races when looking up a CONT-PTE/PMD size hugetlb page
commit
95341ba9e90928a9f5322d3d9d9a29c9e3367ecf upstream.
On some architectures (like ARM64), it can support CONT-PTE/PMD size
hugetlb, which means it can support not only PMD/PUD size hugetlb (2M and
1G), but also CONT-PTE/PMD size(64K and 32M) if a 4K page size specified.
So when looking up a CONT-PTE size hugetlb page by follow_page(), it will
use pte_offset_map_lock() to get the pte entry lock for the CONT-PTE size
hugetlb in follow_page_pte(). However this pte entry lock is incorrect
for the CONT-PTE size hugetlb, since we should use huge_pte_lock() to get
the correct lock, which is mm->page_table_lock.
That means the pte entry of the CONT-PTE size hugetlb under current pte
lock is unstable in follow_page_pte(), we can continue to migrate or
poison the pte entry of the CONT-PTE size hugetlb, which can cause some
potential race issues, even though they are under the 'pte lock'.
For example, suppose thread A is trying to look up a CONT-PTE size hugetlb
page by move_pages() syscall under the lock, however antoher thread B can
migrate the CONT-PTE hugetlb page at the same time, which will cause
thread A to get an incorrect page, if thread A also wants to do page
migration, then data inconsistency error occurs.
Moreover we have the same issue for CONT-PMD size hugetlb in
follow_huge_pmd().
To fix above issues, rename the follow_huge_pmd() as follow_huge_pmd_pte()
to handle PMD and PTE level size hugetlb, which uses huge_pte_lock() to
get the correct pte entry lock to make the pte entry stable.
Mike said:
Support for CONT_PMD/_PTE was added with
939050df9a5c ("arm64: hugetlb:
refactor find_num_contig()"). Patch series "Support for contiguous pte
hugepages", v4. However, I do not believe these code paths were
executed until migration support was added with
fb22f5e9b6f9 ("arm64/mm:
enable HugeTLB migration for contiguous bit HugeTLB pages") I would go
with
fb22f5e9b6f9 for the Fixes: targe.
Link: https://lkml.kernel.org/r/635f43bdd85ac2615a58405da82b4d33c6e5eb05.1662017562.git.baolin.wang@linux.alibaba.com
Fixes: fb22f5e9b6f9 ("arm64/mm: enable HugeTLB migration for contiguous bit HugeTLB pages")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[5.4: Fixup contextual diffs before pin_user_pages()]
Signed-off-by: Samuel Mendoza-Jonas <samjonas@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul E. McKenney [Wed, 21 Oct 2020 04:13:55 +0000 (21:13 -0700)]
x86/smpboot: Move rcu_cpu_starting() earlier
commit
1b2b292555b18727c2935735818abe804fca43cc upstream.
The call to rcu_cpu_starting() in mtrr_ap_init() is not early enough
in the CPU-hotplug onlining process, which results in lockdep splats
as follows:
=============================
WARNING: suspicious RCU usage
5.9.0+ #268 Not tainted
-----------------------------
kernel/kprobes.c:300 RCU-list traversed in non-reader section!!
other info that might help us debug this:
RCU used illegally from offline CPU!
rcu_scheduler_active = 1, debug_locks = 1
no locks held by swapper/1/0.
stack backtrace:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0+ #268
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1 04/01/2014
Call Trace:
dump_stack+0x77/0x97
__is_insn_slot_addr+0x15d/0x170
kernel_text_address+0xba/0xe0
? get_stack_info+0x22/0xa0
__kernel_text_address+0x9/0x30
show_trace_log_lvl+0x17d/0x380
? dump_stack+0x77/0x97
dump_stack+0x77/0x97
__lock_acquire+0xdf7/0x1bf0
lock_acquire+0x258/0x3d0
? vprintk_emit+0x6d/0x2c0
_raw_spin_lock+0x27/0x40
? vprintk_emit+0x6d/0x2c0
vprintk_emit+0x6d/0x2c0
printk+0x4d/0x69
start_secondary+0x1c/0x100
secondary_startup_64_no_verify+0xb8/0xbb
This is avoided by moving the call to rcu_cpu_starting up near
the beginning of the start_secondary() function. Note that the
raw_smp_processor_id() is required in order to avoid calling into lockdep
before RCU has declared the CPU to be watched for readers.
Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/
Reported-by: Qian Cai <cai@redhat.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lorenzo Colitti [Mon, 20 Apr 2020 18:34:08 +0000 (11:34 -0700)]
net: bpf: Allow TC programs to call BPF_FUNC_skb_change_head
commit
36bf5c22422a518f0e95318ad0e8f6e167947cab upstream.
This allows TC eBPF programs to modify and forward (redirect) packets
from interfaces without ethernet headers (for example cellular)
to interfaces with (for example ethernet/wifi).
The lack of this appears to simply be an oversight.
Tested:
in active use in Android R on 4.14+ devices for ipv6
cellular to wifi tethering offload.
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 14 Dec 2022 10:30:48 +0000 (11:30 +0100)]
Linux 5.4.227
Link: https://lore.kernel.org/r/20221212130917.599345531@linuxfoundation.org
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Frank Jungclaus [Wed, 30 Nov 2022 20:22:42 +0000 (21:22 +0100)]
can: esd_usb: Allow REC and TEC to return to zero
[ Upstream commit
918ee4911f7a41fb4505dff877c1d7f9f64eb43e ]
We don't get any further EVENT from an esd CAN USB device for changes
on REC or TEC while those counters converge to 0 (with ecc == 0). So
when handling the "Back to Error Active"-event force txerr = rxerr =
0, otherwise the berr-counters might stay on values like 95 forever.
Also, to make life easier during the ongoing development a
netdev_dbg() has been introduced to allow dumping error events send by
an esd CAN USB device.
Fixes: 37ca9589408a ("can: Add driver for esd CAN-USB/2 device")
Signed-off-by: Frank Jungclaus <frank.jungclaus@esd.eu>
Link: https://lore.kernel.org/all/20221130202242.3998219-2-frank.jungclaus@esd.eu
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Dan Carpenter [Wed, 7 Dec 2022 07:06:31 +0000 (10:06 +0300)]
net: mvneta: Fix an out of bounds check
[ Upstream commit
cdd97383e19d4afe29adc3376025a15ae3bab3a3 ]
In an earlier commit, I added a bounds check to prevent an out of bounds
read and a WARN(). On further discussion and consideration that check
was probably too aggressive. Instead of returning -EINVAL, a better fix
would be to just prevent the out of bounds read but continue the process.
Background: The value of "pp->rxq_def" is a number between 0-7 by default,
or even higher depending on the value of "rxq_number", which is a module
parameter. If the value is more than the number of available CPUs then
it will trigger the WARN() in cpu_max_bits_warn().
Fixes: e8b4fc13900b ("net: mvneta: Prevent out of bounds read in mvneta_config_rss()")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/Y5A7d1E5ccwHTYPf@kadam
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Eric Dumazet [Tue, 6 Dec 2022 10:13:51 +0000 (10:13 +0000)]
ipv6: avoid use-after-free in ip6_fragment()
[ Upstream commit
803e84867de59a1e5d126666d25eb4860cfd2ebe ]
Blamed commit claimed rcu_read_lock() was held by ip6_fragment() callers.
It seems to not be always true, at least for UDP stack.
syzbot reported:
BUG: KASAN: use-after-free in ip6_dst_idev include/net/ip6_fib.h:245 [inline]
BUG: KASAN: use-after-free in ip6_fragment+0x2724/0x2770 net/ipv6/ip6_output.c:951
Read of size 8 at addr
ffff88801d403e80 by task syz-executor.3/7618
CPU: 1 PID: 7618 Comm: syz-executor.3 Not tainted
6.1.0-rc6-syzkaller-00012-g4312098baf37 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:284 [inline]
print_report+0x15e/0x45d mm/kasan/report.c:395
kasan_report+0xbf/0x1f0 mm/kasan/report.c:495
ip6_dst_idev include/net/ip6_fib.h:245 [inline]
ip6_fragment+0x2724/0x2770 net/ipv6/ip6_output.c:951
__ip6_finish_output net/ipv6/ip6_output.c:193 [inline]
ip6_finish_output+0x9a3/0x1170 net/ipv6/ip6_output.c:206
NF_HOOK_COND include/linux/netfilter.h:291 [inline]
ip6_output+0x1f1/0x540 net/ipv6/ip6_output.c:227
dst_output include/net/dst.h:445 [inline]
ip6_local_out+0xb3/0x1a0 net/ipv6/output_core.c:161
ip6_send_skb+0xbb/0x340 net/ipv6/ip6_output.c:1966
udp_v6_send_skb+0x82a/0x18a0 net/ipv6/udp.c:1286
udp_v6_push_pending_frames+0x140/0x200 net/ipv6/udp.c:1313
udpv6_sendmsg+0x18da/0x2c80 net/ipv6/udp.c:1606
inet6_sendmsg+0x9d/0xe0 net/ipv6/af_inet6.c:665
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xd3/0x120 net/socket.c:734
sock_write_iter+0x295/0x3d0 net/socket.c:1108
call_write_iter include/linux/fs.h:2191 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x9ed/0xdd0 fs/read_write.c:584
ksys_write+0x1ec/0x250 fs/read_write.c:637
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fde3588c0d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:
00007fde365b6168 EFLAGS:
00000246 ORIG_RAX:
0000000000000001
RAX:
ffffffffffffffda RBX:
00007fde359ac050 RCX:
00007fde3588c0d9
RDX:
000000000000ffdc RSI:
00000000200000c0 RDI:
000000000000000a
RBP:
00007fde358e7ae9 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000246 R12:
0000000000000000
R13:
00007fde35acfb1f R14:
00007fde365b6300 R15:
0000000000022000
</TASK>
Allocated by task 7618:
kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
__kasan_slab_alloc+0x82/0x90 mm/kasan/common.c:325
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slab.h:737 [inline]
slab_alloc_node mm/slub.c:3398 [inline]
slab_alloc mm/slub.c:3406 [inline]
__kmem_cache_alloc_lru mm/slub.c:3413 [inline]
kmem_cache_alloc+0x2b4/0x3d0 mm/slub.c:3422
dst_alloc+0x14a/0x1f0 net/core/dst.c:92
ip6_dst_alloc+0x32/0xa0 net/ipv6/route.c:344
ip6_rt_pcpu_alloc net/ipv6/route.c:1369 [inline]
rt6_make_pcpu_route net/ipv6/route.c:1417 [inline]
ip6_pol_route+0x901/0x1190 net/ipv6/route.c:2254
pol_lookup_func include/net/ip6_fib.h:582 [inline]
fib6_rule_lookup+0x52e/0x6f0 net/ipv6/fib6_rules.c:121
ip6_route_output_flags_noref+0x2e6/0x380 net/ipv6/route.c:2625
ip6_route_output_flags+0x76/0x320 net/ipv6/route.c:2638
ip6_route_output include/net/ip6_route.h:98 [inline]
ip6_dst_lookup_tail+0x5ab/0x1620 net/ipv6/ip6_output.c:1092
ip6_dst_lookup_flow+0x90/0x1d0 net/ipv6/ip6_output.c:1222
ip6_sk_dst_lookup_flow+0x553/0x980 net/ipv6/ip6_output.c:1260
udpv6_sendmsg+0x151d/0x2c80 net/ipv6/udp.c:1554
inet6_sendmsg+0x9d/0xe0 net/ipv6/af_inet6.c:665
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xd3/0x120 net/socket.c:734
__sys_sendto+0x23a/0x340 net/socket.c:2117
__do_sys_sendto net/socket.c:2129 [inline]
__se_sys_sendto net/socket.c:2125 [inline]
__x64_sys_sendto+0xe1/0x1b0 net/socket.c:2125
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 7599:
kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
kasan_save_free_info+0x2e/0x40 mm/kasan/generic.c:511
____kasan_slab_free mm/kasan/common.c:236 [inline]
____kasan_slab_free+0x160/0x1c0 mm/kasan/common.c:200
kasan_slab_free include/linux/kasan.h:177 [inline]
slab_free_hook mm/slub.c:1724 [inline]
slab_free_freelist_hook+0x8b/0x1c0 mm/slub.c:1750
slab_free mm/slub.c:3661 [inline]
kmem_cache_free+0xee/0x5c0 mm/slub.c:3683
dst_destroy+0x2ea/0x400 net/core/dst.c:127
rcu_do_batch kernel/rcu/tree.c:2250 [inline]
rcu_core+0x81f/0x1980 kernel/rcu/tree.c:2510
__do_softirq+0x1fb/0xadc kernel/softirq.c:571
Last potentially related work creation:
kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
__kasan_record_aux_stack+0xbc/0xd0 mm/kasan/generic.c:481
call_rcu+0x9d/0x820 kernel/rcu/tree.c:2798
dst_release net/core/dst.c:177 [inline]
dst_release+0x7d/0xe0 net/core/dst.c:167
refdst_drop include/net/dst.h:256 [inline]
skb_dst_drop include/net/dst.h:268 [inline]
skb_release_head_state+0x250/0x2a0 net/core/skbuff.c:838
skb_release_all net/core/skbuff.c:852 [inline]
__kfree_skb net/core/skbuff.c:868 [inline]
kfree_skb_reason+0x151/0x4b0 net/core/skbuff.c:891
kfree_skb_list_reason+0x4b/0x70 net/core/skbuff.c:901
kfree_skb_list include/linux/skbuff.h:1227 [inline]
ip6_fragment+0x2026/0x2770 net/ipv6/ip6_output.c:949
__ip6_finish_output net/ipv6/ip6_output.c:193 [inline]
ip6_finish_output+0x9a3/0x1170 net/ipv6/ip6_output.c:206
NF_HOOK_COND include/linux/netfilter.h:291 [inline]
ip6_output+0x1f1/0x540 net/ipv6/ip6_output.c:227
dst_output include/net/dst.h:445 [inline]
ip6_local_out+0xb3/0x1a0 net/ipv6/output_core.c:161
ip6_send_skb+0xbb/0x340 net/ipv6/ip6_output.c:1966
udp_v6_send_skb+0x82a/0x18a0 net/ipv6/udp.c:1286
udp_v6_push_pending_frames+0x140/0x200 net/ipv6/udp.c:1313
udpv6_sendmsg+0x18da/0x2c80 net/ipv6/udp.c:1606
inet6_sendmsg+0x9d/0xe0 net/ipv6/af_inet6.c:665
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xd3/0x120 net/socket.c:734
sock_write_iter+0x295/0x3d0 net/socket.c:1108
call_write_iter include/linux/fs.h:2191 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x9ed/0xdd0 fs/read_write.c:584
ksys_write+0x1ec/0x250 fs/read_write.c:637
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Second to last potentially related work creation:
kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
__kasan_record_aux_stack+0xbc/0xd0 mm/kasan/generic.c:481
call_rcu+0x9d/0x820 kernel/rcu/tree.c:2798
dst_release net/core/dst.c:177 [inline]
dst_release+0x7d/0xe0 net/core/dst.c:167
refdst_drop include/net/dst.h:256 [inline]
skb_dst_drop include/net/dst.h:268 [inline]
__dev_queue_xmit+0x1b9d/0x3ba0 net/core/dev.c:4211
dev_queue_xmit include/linux/netdevice.h:3008 [inline]
neigh_resolve_output net/core/neighbour.c:1552 [inline]
neigh_resolve_output+0x51b/0x840 net/core/neighbour.c:1532
neigh_output include/net/neighbour.h:546 [inline]
ip6_finish_output2+0x56c/0x1530 net/ipv6/ip6_output.c:134
__ip6_finish_output net/ipv6/ip6_output.c:195 [inline]
ip6_finish_output+0x694/0x1170 net/ipv6/ip6_output.c:206
NF_HOOK_COND include/linux/netfilter.h:291 [inline]
ip6_output+0x1f1/0x540 net/ipv6/ip6_output.c:227
dst_output include/net/dst.h:445 [inline]
NF_HOOK include/linux/netfilter.h:302 [inline]
NF_HOOK include/linux/netfilter.h:296 [inline]
mld_sendpack+0xa09/0xe70 net/ipv6/mcast.c:1820
mld_send_cr net/ipv6/mcast.c:2121 [inline]
mld_ifc_work+0x720/0xdc0 net/ipv6/mcast.c:2653
process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
worker_thread+0x669/0x1090 kernel/workqueue.c:2436
kthread+0x2e8/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
The buggy address belongs to the object at
ffff88801d403dc0
which belongs to the cache ip6_dst_cache of size 240
The buggy address is located 192 bytes inside of
240-byte region [
ffff88801d403dc0,
ffff88801d403eb0)
The buggy address belongs to the physical page:
page:
ffffea00007500c0 refcount:1 mapcount:0 mapping:
0000000000000000 index:0x0 pfn:0x1d403
memcg:
ffff888022f49c81
flags: 0xfff00000000200(slab|node=0|zone=1|lastcpupid=0x7ff)
raw:
00fff00000000200 ffffea0001ef6580 dead000000000002 ffff88814addf640
raw:
0000000000000000 00000000800c000c 00000001ffffffff ffff888022f49c81
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_HARDWALL), pid 3719, tgid 3719 (kworker/0:6), ts
136223432244, free_ts
136222971441
prep_new_page mm/page_alloc.c:2539 [inline]
get_page_from_freelist+0x10b5/0x2d50 mm/page_alloc.c:4288
__alloc_pages+0x1cb/0x5b0 mm/page_alloc.c:5555
alloc_pages+0x1aa/0x270 mm/mempolicy.c:2285
alloc_slab_page mm/slub.c:1794 [inline]
allocate_slab+0x213/0x300 mm/slub.c:1939
new_slab mm/slub.c:1992 [inline]
___slab_alloc+0xa91/0x1400 mm/slub.c:3180
__slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3279
slab_alloc_node mm/slub.c:3364 [inline]
slab_alloc mm/slub.c:3406 [inline]
__kmem_cache_alloc_lru mm/slub.c:3413 [inline]
kmem_cache_alloc+0x31a/0x3d0 mm/slub.c:3422
dst_alloc+0x14a/0x1f0 net/core/dst.c:92
ip6_dst_alloc+0x32/0xa0 net/ipv6/route.c:344
icmp6_dst_alloc+0x71/0x680 net/ipv6/route.c:3261
mld_sendpack+0x5de/0xe70 net/ipv6/mcast.c:1809
mld_send_cr net/ipv6/mcast.c:2121 [inline]
mld_ifc_work+0x720/0xdc0 net/ipv6/mcast.c:2653
process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
worker_thread+0x669/0x1090 kernel/workqueue.c:2436
kthread+0x2e8/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
page last free stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1459 [inline]
free_pcp_prepare+0x65c/0xd90 mm/page_alloc.c:1509
free_unref_page_prepare mm/page_alloc.c:3387 [inline]
free_unref_page+0x1d/0x4d0 mm/page_alloc.c:3483
__unfreeze_partials+0x17c/0x1a0 mm/slub.c:2586
qlink_free mm/kasan/quarantine.c:168 [inline]
qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187
kasan_quarantine_reduce+0x184/0x210 mm/kasan/quarantine.c:294
__kasan_slab_alloc+0x66/0x90 mm/kasan/common.c:302
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slab.h:737 [inline]
slab_alloc_node mm/slub.c:3398 [inline]
kmem_cache_alloc_node+0x304/0x410 mm/slub.c:3443
__alloc_skb+0x214/0x300 net/core/skbuff.c:497
alloc_skb include/linux/skbuff.h:1267 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1191 [inline]
netlink_sendmsg+0x9a6/0xe10 net/netlink/af_netlink.c:1896
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xd3/0x120 net/socket.c:734
__sys_sendto+0x23a/0x340 net/socket.c:2117
__do_sys_sendto net/socket.c:2129 [inline]
__se_sys_sendto net/socket.c:2125 [inline]
__x64_sys_sendto+0xe1/0x1b0 net/socket.c:2125
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 55a6a4d284cd ("ipv6: remove unnecessary dst_hold() in ip6_fragment()")
Reported-by: syzbot+8c0ac31aa9681abb9e2d@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/r/20221206101351.2037285-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Yang Yingliang [Wed, 7 Dec 2022 01:53:10 +0000 (09:53 +0800)]
net: plip: don't call kfree_skb/dev_kfree_skb() under spin_lock_irq()
[ Upstream commit
7d8c19bfc8ff3f78e5337107ca9246327fcb6b45 ]
It is not allowed to call kfree_skb() or consume_skb() from
hardware interrupt context or with interrupts being disabled.
So replace kfree_skb/dev_kfree_skb() with dev_kfree_skb_irq()
and dev_consume_skb_irq() under spin_lock_irq().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20221207015310.2984909-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Juergen Gross [Wed, 7 Dec 2022 07:19:38 +0000 (08:19 +0100)]
xen/netback: fix build warning
[ Upstream commit
7dfa764e0223a324366a2a1fc056d4d9d4e95491 ]
Commit
ad7f402ae4f4 ("xen/netback: Ensure protocol headers don't fall in
the non-linear area") introduced a (valid) build warning. There have
even been reports of this problem breaking networking of Xen guests.
Fixes: ad7f402ae4f4 ("xen/netback: Ensure protocol headers don't fall in the non-linear area")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Tested-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Zhang Changzhong [Sun, 4 Dec 2022 06:09:08 +0000 (14:09 +0800)]
ethernet: aeroflex: fix potential skb leak in greth_init_rings()
[ Upstream commit
063a932b64db3317ec020c94466fe52923a15f60 ]
The greth_init_rings() function won't free the newly allocated skb when
dma_mapping_error() returns error, so add dev_kfree_skb() to fix it.
Compile tested only.
Fixes: 7c2a0febc262 ("net: Add Aeroflex Gaisler 10/100/1G Ethernet MAC driver")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/1670134149-29516-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Ido Schimmel [Sun, 4 Dec 2022 07:50:45 +0000 (09:50 +0200)]
ipv4: Fix incorrect route flushing when table ID 0 is used
[ Upstream commit
c0d999348e01df03e0a7f550351f3907fabbf611 ]
Cited commit added the table ID to the FIB info structure, but did not
properly initialize it when table ID 0 is used. This can lead to a route
in the default VRF with a preferred source address not being flushed
when the address is deleted.
Consider the following example:
# ip address add dev dummy1 192.0.2.1/28
# ip address add dev dummy1 192.0.2.17/28
# ip route add 198.51.100.0/24 via 192.0.2.2 src 192.0.2.17 metric 100
# ip route add table 0 198.51.100.0/24 via 192.0.2.2 src 192.0.2.17 metric 200
# ip route show 198.51.100.0/24
198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 100
198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 200
Both routes are installed in the default VRF, but they are using two
different FIB info structures. One with a metric of 100 and table ID of
254 (main) and one with a metric of 200 and table ID of 0. Therefore,
when the preferred source address is deleted from the default VRF,
the second route is not flushed:
# ip address del dev dummy1 192.0.2.17/28
# ip route show 198.51.100.0/24
198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 200
Fix by storing a table ID of 254 instead of 0 in the route configuration
structure.
Add a test case that fails before the fix:
# ./fib_tests.sh -t ipv4_del_addr
IPv4 delete address route tests
Regular FIB info
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Identical FIB info with different table ID
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Table ID 0
TEST: Route removed in default VRF when source address deleted [FAIL]
Tests passed: 8
Tests failed: 1
And passes after:
# ./fib_tests.sh -t ipv4_del_addr
IPv4 delete address route tests
Regular FIB info
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Identical FIB info with different table ID
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Table ID 0
TEST: Route removed in default VRF when source address deleted [ OK ]
Tests passed: 9
Tests failed: 0
Fixes: 0b000b29c23b ("net: Don't delete routes in different VRFs")
Reported-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Ido Schimmel [Sun, 4 Dec 2022 07:50:44 +0000 (09:50 +0200)]
ipv4: Fix incorrect route flushing when source address is deleted
[ Upstream commit
f96a3d74554df537b6db5c99c27c80e7afadc8d1 ]
Cited commit added the table ID to the FIB info structure, but did not
prevent structures with different table IDs from being consolidated.
This can lead to routes being flushed from a VRF when an address is
deleted from a different VRF.
Fix by taking the table ID into account when looking for a matching FIB
info. This is already done for FIB info structures backed by a nexthop
object in fib_find_info_nh().
Add test cases that fail before the fix:
# ./fib_tests.sh -t ipv4_del_addr
IPv4 delete address route tests
Regular FIB info
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Identical FIB info with different table ID
TEST: Route removed from VRF when source address deleted [FAIL]
TEST: Route in default VRF not removed [ OK ]
RTNETLINK answers: File exists
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [FAIL]
Tests passed: 6
Tests failed: 2
And pass after:
# ./fib_tests.sh -t ipv4_del_addr
IPv4 delete address route tests
Regular FIB info
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Identical FIB info with different table ID
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Tests passed: 8
Tests failed: 0
Fixes: 0b000b29c23b ("net: Don't delete routes in different VRFs")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
YueHaibing [Sat, 3 Dec 2022 09:46:35 +0000 (17:46 +0800)]
tipc: Fix potential OOB in tipc_link_proto_rcv()
[ Upstream commit
743117a997bbd4840e827295c07e59bcd7f7caa3 ]
Fix the potential risk of OOB if skb_linearize() fails in
tipc_link_proto_rcv().
Fixes: b26dc52cd3fe ("tipc: linearize arriving NAME_DISTR and LINK_PROTO buffers")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20221203094635.29024-1-yuehaibing@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Liu Jian [Sat, 3 Dec 2022 09:42:40 +0000 (17:42 +0800)]
net: hisilicon: Fix potential use-after-free in hix5hd2_rx()
[ Upstream commit
433c07a13f59856e4585e89e86b7d4cc59348fab ]
The skb is delivered to napi_gro_receive() which may free it, after
calling this, dereferencing skb may trigger use-after-free.
Fixes: d46253258421 ("net: hisilicon: add hix5hd2 mac driver")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Link: https://lore.kernel.org/r/20221203094240.1240211-2-liujian56@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Liu Jian [Sat, 3 Dec 2022 09:42:39 +0000 (17:42 +0800)]
net: hisilicon: Fix potential use-after-free in hisi_femac_rx()
[ Upstream commit
4640177049549de1a43e9bc49265f0cdfce08cfd ]
The skb is delivered to napi_gro_receive() which may free it, after
calling this, dereferencing skb may trigger use-after-free.
Fixes: bee04c3ed81a ("net: hisilicon: Add Fast Ethernet MAC driver")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Link: https://lore.kernel.org/r/20221203094240.1240211-1-liujian56@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Yongqiang Liu [Sat, 3 Dec 2022 09:41:25 +0000 (09:41 +0000)]
net: thunderx: Fix missing destroy_workqueue of nicvf_rx_mode_wq
[ Upstream commit
42330a32933fb42180c52022804dcf09f47a2f99 ]
The nicvf_probe() won't destroy workqueue when register_netdev()
failed. Add destroy_workqueue err handle case to fix this issue.
Fixes: 50ef047b1ed9 ("net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them.")
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://lore.kernel.org/r/20221203094125.602812-1-liuyongqiang13@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jisheng Zhang [Fri, 2 Dec 2022 16:17:39 +0000 (00:17 +0800)]
net: stmmac: fix "snps,axi-config" node property parsing
[ Upstream commit
61d4f140943c47c1386ed89f7260e00418dfad9d ]
In dt-binding snps,dwmac.yaml, some properties under "snps,axi-config"
node are named without "axi_" prefix, but the driver expects the
prefix. Since the dt-binding has been there for a long time, we'd
better make driver match the binding for compatibility.
Fixes: 99942d2abc37 ("stmmac: rework DMA bus setting and introduce new platform AXI structure")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Link: https://lore.kernel.org/r/20221202161739.2203-1-jszhang@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Pankaj Raghav [Thu, 1 Dec 2022 12:52:34 +0000 (13:52 +0100)]
nvme initialize core quirks before calling nvme_init_subsystem
[ Upstream commit
6f2d71524bcfdeb1fcbd22a4a92a5b7b161ab224 ]
A device might have a core quirk for NVME_QUIRK_IGNORE_DEV_SUBNQN
(such as Samsung X5) but it would still give a:
"missing or invalid SUBNQN field"
warning as core quirks are filled after calling nvme_init_subnqn. Fill
ctrl->quirks from struct core_quirks before calling nvme_init_subsystem
to fix this.
Tested on a Samsung X5.
Fixes: 5d6ab0cab767 ("nvme: track subsystems")
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Kees Cook [Fri, 2 Dec 2022 21:44:14 +0000 (13:44 -0800)]
NFC: nci: Bounds check struct nfc_target arrays
[ Upstream commit
e329e71013c9b5a4535b099208493c7826ee4a64 ]
While running under CONFIG_FORTIFY_SOURCE=y, syzkaller reported:
memcpy: detected field-spanning write (size 129) of single field "target->sensf_res" at net/nfc/nci/ntf.c:260 (size 18)
This appears to be a legitimate lack of bounds checking in
nci_add_new_protocol(). Add the missing checks.
Reported-by: syzbot+210e196cef4711b65139@syzkaller.appspotmail.com
Link: https://lore.kernel.org/lkml/0000000000001c590f05ee7b3ff4@google.com
Fixes: e96615ca595b ("NFC: Add NCI multiple targets support")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221202214410.never.693-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Przemyslaw Patynowski [Tue, 15 Nov 2022 08:49:25 +0000 (09:49 +0100)]
i40e: Disallow ip4 and ip6 l4_4_bytes
[ Upstream commit
d64aaf3f7869f915fd120763d75f11d6b116424d ]
Return -EOPNOTSUPP, when user requests l4_4_bytes for raw IP4 or
IP6 flow director filters. Flow director does not support filtering
on l4 bytes for PCTYPEs used by IP4 and IP6 filters.
Without this patch, user could create filters with l4_4_bytes fields,
which did not do any filtering on L4, but only on L3 fields.
Fixes: 482b43e50562 ("i40e: check current configured input set when adding ntuple filters")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sylwester Dziedziuch [Mon, 31 Oct 2022 12:00:28 +0000 (13:00 +0100)]
i40e: Fix for VF MAC address 0
[ Upstream commit
08501970472077ed5de346ad89943a37d1692e9b ]
After spawning max VFs on a PF, some VFs were not getting resources and
their MAC addresses were 0. This was caused by PF sleeping before flushing
HW registers which caused VIRTCHNL_VFR_VFACTIVE to not be set in time for
VF.
Fix by adding a sleep after hw flush.
Fixes: 88d48b20c639 ("i40e: reset all VFs in parallel when rebuilding PF")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Michal Jaron [Mon, 24 Oct 2022 08:19:42 +0000 (10:19 +0200)]
i40e: Fix not setting default xps_cpus after reset
[ Upstream commit
82e0572b23029b380464fa9fdc125db9c1506d0a ]
During tx rings configuration default XPS queue config is set and
__I40E_TX_XPS_INIT_DONE is locked. __I40E_TX_XPS_INIT_DONE state is
cleared and set again with default mapping only during queues build,
it means after first setup or reset with queues rebuild. (i.e.
ethtool -L <interface> combined <number>) After other resets (i.e.
ethtool -t <interface>) XPS_INIT_DONE is not cleared and those default
maps cannot be set again. It results in cleared xps_cpus mapping
until queues are not rebuild or mapping is not set by user.
Add clearing __I40E_TX_XPS_INIT_DONE state during reset to let
the driver set xps_cpus to defaults again after it was cleared.
Fixes: 111ff146641f ("i40e: allow XPS with QoS enabled")
Signed-off-by: Michal Jaron <michalx.jaron@intel.com>
Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Dan Carpenter [Fri, 2 Dec 2022 09:58:26 +0000 (12:58 +0300)]
net: mvneta: Prevent out of bounds read in mvneta_config_rss()
[ Upstream commit
e8b4fc13900b8e8be48debffd0dfd391772501f7 ]
The pp->indir[0] value comes from the user. It is passed to:
if (cpu_online(pp->rxq_def))
inside the mvneta_percpu_elect() function. It needs bounds checkeding
to ensure that it is not beyond the end of the cpu bitmap.
Fixes: 4e84c9e82f5d ("net: mvneta: Fix the CPU choice in mvneta_percpu_elect")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Lin Liu [Fri, 2 Dec 2022 08:52:48 +0000 (08:52 +0000)]
xen-netfront: Fix NULL sring after live migration
[ Upstream commit
d50b7914fae04d840ce36491d22133070b18cca9 ]
A NAPI is setup for each network sring to poll data to kernel
The sring with source host is destroyed before live migration and
new sring with target host is setup after live migration.
The NAPI for the old sring is not deleted until setup new sring
with target host after migration. With busy_poll/busy_read enabled,
the NAPI can be polled before got deleted when resume VM.
BUG: unable to handle kernel NULL pointer dereference at
0000000000000008
IP: xennet_poll+0xae/0xd20
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
Call Trace:
finish_task_switch+0x71/0x230
timerqueue_del+0x1d/0x40
hrtimer_try_to_cancel+0xb5/0x110
xennet_alloc_rx_buffers+0x2a0/0x2a0
napi_busy_loop+0xdb/0x270
sock_poll+0x87/0x90
do_sys_poll+0x26f/0x580
tracing_map_insert+0x1d4/0x2f0
event_hist_trigger+0x14a/0x260
finish_task_switch+0x71/0x230
__schedule+0x256/0x890
recalc_sigpending+0x1b/0x50
xen_sched_clock+0x15/0x20
__rb_reserve_next+0x12d/0x140
ring_buffer_lock_reserve+0x123/0x3d0
event_triggers_call+0x87/0xb0
trace_event_buffer_commit+0x1c4/0x210
xen_clocksource_get_cycles+0x15/0x20
ktime_get_ts64+0x51/0xf0
SyS_ppoll+0x160/0x1a0
SyS_ppoll+0x160/0x1a0
do_syscall_64+0x73/0x130
entry_SYSCALL_64_after_hwframe+0x41/0xa6
...
RIP: xennet_poll+0xae/0xd20 RSP:
ffffb4f041933900
CR2:
0000000000000008
---[ end trace
f8601785b354351c ]---
xen frontend should remove the NAPIs for the old srings before live
migration as the bond srings are destroyed
There is a tiny window between the srings are set to NULL and
the NAPIs are disabled, It is safe as the NAPI threads are still
frozen at that time
Signed-off-by: Lin Liu <lin.liu@citrix.com>
Fixes: 5914bf9131ee ([NET]: Do not check netif_running() and carrier state in ->poll())
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Valentina Goncharenko [Thu, 1 Dec 2022 17:34:08 +0000 (20:34 +0300)]
net: encx24j600: Fix invalid logic in reading of MISTAT register
[ Upstream commit
25f427ac7b8d89b0259f86c0c6407b329df742b2 ]
A loop for reading MISTAT register continues while regmap_read() fails
and (mistat & BUSY), but if regmap_read() fails a value of mistat is
undefined.
The patch proposes to check for BUSY flag only when regmap_read()
succeed. Compile test only.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 2b27d5f85455 ("net: Microchip encx24j600 driver")
Signed-off-by: Valentina Goncharenko <goncharenko.vp@ispras.ru>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Valentina Goncharenko [Thu, 1 Dec 2022 17:34:07 +0000 (20:34 +0300)]
net: encx24j600: Add parentheses to fix precedence
[ Upstream commit
167b3f2dcc62c271f3555b33df17e361bb1fa0ee ]
In functions regmap_encx24j600_phy_reg_read() and
regmap_encx24j600_phy_reg_write() in the conditions of the waiting
cycles for filling the variable 'ret' it is necessary to add parentheses
to prevent wrong assignment due to logical operations precedence.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 2b27d5f85455 ("net: Microchip encx24j600 driver")
Signed-off-by: Valentina Goncharenko <goncharenko.vp@ispras.ru>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Wei Yongjun [Wed, 30 Nov 2022 09:17:05 +0000 (09:17 +0000)]
mac802154: fix missing INIT_LIST_HEAD in ieee802154_if_add()
[ Upstream commit
b3d72d3135d2ef68296c1ee174436efd65386f04 ]
Kernel fault injection test reports null-ptr-deref as follows:
BUG: kernel NULL pointer dereference, address:
0000000000000008
RIP: 0010:cfg802154_netdev_notifier_call+0x120/0x310 include/linux/list.h:114
Call Trace:
<TASK>
raw_notifier_call_chain+0x6d/0xa0 kernel/notifier.c:87
call_netdevice_notifiers_info+0x6e/0xc0 net/core/dev.c:1944
unregister_netdevice_many_notify+0x60d/0xcb0 net/core/dev.c:1982
unregister_netdevice_queue+0x154/0x1a0 net/core/dev.c:10879
register_netdevice+0x9a8/0xb90 net/core/dev.c:10083
ieee802154_if_add+0x6ed/0x7e0 net/mac802154/iface.c:659
ieee802154_register_hw+0x29c/0x330 net/mac802154/main.c:229
mcr20a_probe+0xaaa/0xcb1 drivers/net/ieee802154/mcr20a.c:1316
ieee802154_if_add() allocates wpan_dev as netdev's private data, but not
init the list in struct wpan_dev. cfg802154_netdev_notifier_call() manage
the list when device register/unregister, and may lead to null-ptr-deref.
Use INIT_LIST_HEAD() on it to initialize it correctly.
Fixes: 96ce67f7ddc2 ("ieee802154: add wpan_dev_list")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20221130091705.1831140-1-weiyongjun@huaweicloud.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Zhengchao Shao [Thu, 1 Dec 2022 08:22:46 +0000 (16:22 +0800)]
selftests: rtnetlink: correct xfrm policy rule in kci_test_ipsec_offload
[ Upstream commit
85a0506c073332a3057f5a9635fa0d4db5a8e03b ]
When testing in kci_test_ipsec_offload, srcip is configured as $dstip,
it should add xfrm policy rule in instead of out.
The test result of this patch is as follows:
PASS: ipsec_offload
Fixes: 25d87e7abc0f ("selftests: rtnetlink: add ipsec offload API test")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20221201082246.14131-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Artem Chernyshev [Thu, 1 Dec 2022 14:00:30 +0000 (17:00 +0300)]
net: dsa: ksz: Check return value
[ Upstream commit
3d8fdcbf1f42e2bb9ae8b8c0b6f202278c788a22 ]
Return NULL if we got unexpected value from skb_trim_rcsum()
in ksz_common_rcv()
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 615c4ed0bcc1 ("net: dsa: ksz: Factor out common tag code")
Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221201140032.26746-1-artem.chernyshev@red-soft.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Chen Zhongjin [Tue, 29 Nov 2022 09:25:56 +0000 (17:25 +0800)]
Bluetooth: Fix not cleanup led when bt_init fails
[ Upstream commit
2f3957c7eb4e07df944169a3e50a4d6790e1c744 ]
bt_init() calls bt_leds_init() to register led, but if it fails later,
bt_leds_cleanup() is not called to unregister it.
This can cause panic if the argument "bluetooth-power" in text is freed
and then another led_trigger_register() tries to access it:
BUG: unable to handle page fault for address:
ffffffffc06d3bc0
RIP: 0010:strcmp+0xc/0x30
Call Trace:
<TASK>
led_trigger_register+0x10d/0x4f0
led_trigger_register_simple+0x7d/0x100
bt_init+0x39/0xf7 [bluetooth]
do_one_initcall+0xd0/0x4e0
Fixes: 0c2bc0de821e ("Bluetooth: Add combined LED trigger for controller power")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Wang ShaoBo [Wed, 9 Nov 2022 09:37:26 +0000 (17:37 +0800)]
Bluetooth: 6LoWPAN: add missing hci_dev_put() in get_l2cap_conn()
[ Upstream commit
747da1308bdd5021409974f9180f0d8ece53d142 ]
hci_get_route() takes reference, we should use hci_dev_put() to release
it when not need anymore.
Fixes: 8ec8e626838e ("Bluetooth: 6LoWPAN: Use connected oriented channel instead of fixed one")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Kuniyuki Iwashima [Sun, 27 Nov 2022 01:24:11 +0000 (10:24 +0900)]
af_unix: Get user_ns from in_skb in unix_diag_get_exact().
[ Upstream commit
b3abe42e94900bdd045c472f9c9be620ba5ce553 ]
Wei Chen reported a NULL deref in sk_user_ns() [0][1], and Paolo diagnosed
the root cause: in unix_diag_get_exact(), the newly allocated skb does not
have sk. [2]
We must get the user_ns from the NETLINK_CB(in_skb).sk and pass it to
sk_diag_fill().
[0]:
BUG: kernel NULL pointer dereference, address:
0000000000000270
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD
12bbce067 P4D
12bbce067 PUD
12bc40067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 0 PID: 27942 Comm: syz-executor.0 Not tainted 6.1.0-rc5-next-
20221118 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
RIP: 0010:sk_user_ns include/net/sock.h:920 [inline]
RIP: 0010:sk_diag_dump_uid net/unix/diag.c:119 [inline]
RIP: 0010:sk_diag_fill+0x77d/0x890 net/unix/diag.c:170
Code: 89 ef e8 66 d4 2d fd c7 44 24 40 00 00 00 00 49 8d 7c 24 18 e8
54 d7 2d fd 49 8b 5c 24 18 48 8d bb 70 02 00 00 e8 43 d7 2d fd <48> 8b
9b 70 02 00 00 48 8d 7b 10 e8 33 d7 2d fd 48 8b 5b 10 48 8d
RSP: 0018:
ffffc90000d67968 EFLAGS:
00010246
RAX:
ffff88812badaa48 RBX:
0000000000000000 RCX:
ffffffff840d481d
RDX:
0000000000000465 RSI:
0000000000000000 RDI:
0000000000000270
RBP:
ffffc90000d679a8 R08:
0000000000000277 R09:
0000000000000000
R10:
0001ffffffffffff R11:
0001c90000d679a8 R12:
ffff88812ac03800
R13:
ffff88812c87c400 R14:
ffff88812ae42210 R15:
ffff888103026940
FS:
00007f08b4e6f700(0000) GS:
ffff88813bc00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000270 CR3:
000000012c58b000 CR4:
00000000003506f0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
<TASK>
unix_diag_get_exact net/unix/diag.c:285 [inline]
unix_diag_handler_dump+0x3f9/0x500 net/unix/diag.c:317
__sock_diag_cmd net/core/sock_diag.c:235 [inline]
sock_diag_rcv_msg+0x237/0x250 net/core/sock_diag.c:266
netlink_rcv_skb+0x13e/0x250 net/netlink/af_netlink.c:2564
sock_diag_rcv+0x24/0x40 net/core/sock_diag.c:277
netlink_unicast_kernel net/netlink/af_netlink.c:1330 [inline]
netlink_unicast+0x5e9/0x6b0 net/netlink/af_netlink.c:1356
netlink_sendmsg+0x739/0x860 net/netlink/af_netlink.c:1932
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg net/socket.c:734 [inline]
____sys_sendmsg+0x38f/0x500 net/socket.c:2476
___sys_sendmsg net/socket.c:2530 [inline]
__sys_sendmsg+0x197/0x230 net/socket.c:2559
__do_sys_sendmsg net/socket.c:2568 [inline]
__se_sys_sendmsg net/socket.c:2566 [inline]
__x64_sys_sendmsg+0x42/0x50 net/socket.c:2566
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x4697f9
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:
00007f08b4e6ec48 EFLAGS:
00000246 ORIG_RAX:
000000000000002e
RAX:
ffffffffffffffda RBX:
000000000077bf80 RCX:
00000000004697f9
RDX:
0000000000000000 RSI:
00000000200001c0 RDI:
0000000000000003
RBP:
00000000004d29e9 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000246 R12:
000000000077bf80
R13:
0000000000000000 R14:
000000000077bf80 R15:
00007ffdb36bc6c0
</TASK>
Modules linked in:
CR2:
0000000000000270
[1]: https://lore.kernel.org/netdev/CAO4mrfdvyjFpokhNsiwZiP-wpdSD0AStcJwfKcKQdAALQ9_2Qw@mail.gmail.com/
[2]: https://lore.kernel.org/netdev/
e04315e7c90d9a75613f3993c2baf2d344eef7eb.camel@redhat.com/
Fixes: 8801bf4b0899 ("net: Add UNIX_DIAG_UID to Netlink UNIX socket diagnostics.")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Wei Chen <harperchen1110@gmail.com>
Diagnosed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Akihiko Odaki [Fri, 25 Nov 2022 13:30:31 +0000 (22:30 +0900)]
igb: Allocate MSI-X vector when testing
[ Upstream commit
28e96556baca7056d11d9fb3cdd0aba4483e00d8 ]
Without this change, the interrupt test fail with MSI-X environment:
$ sudo ethtool -t enp0s2 offline
[ 43.921783] igb 0000:00:02.0: offline testing starting
[ 44.855824] igb 0000:00:02.0 enp0s2: igb: enp0s2 NIC Link is Down
[ 44.961249] igb 0000:00:02.0 enp0s2: igb: enp0s2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 51.272202] igb 0000:00:02.0: testing shared interrupt
[ 56.996975] igb 0000:00:02.0 enp0s2: igb: enp0s2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
The test result is FAIL
The test extra info:
Register test (offline) 0
Eeprom test (offline) 0
Interrupt test (offline) 4
Loopback test (offline) 0
Link test (on/offline) 0
Here, "4" means an expected interrupt was not delivered.
To fix this, route IRQs correctly to the first MSI-X vector by setting
IVAR_MISC. Also, set bit 0 of EIMS so that the vector will not be
masked. The interrupt test now runs properly with this change:
$ sudo ethtool -t enp0s2 offline
[ 42.762985] igb 0000:00:02.0: offline testing starting
[ 50.141967] igb 0000:00:02.0: testing shared interrupt
[ 56.163957] igb 0000:00:02.0 enp0s2: igb: enp0s2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
The test result is PASS
The test extra info:
Register test (offline) 0
Eeprom test (offline) 0
Interrupt test (offline) 0
Loopback test (offline) 0
Link test (on/offline) 0
Fixes: 38ab6c37e85e ("igb: add single vector msi-x testing to interrupt test")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Akihiko Odaki [Fri, 28 Oct 2022 13:00:00 +0000 (22:00 +0900)]
e1000e: Fix TX dispatch condition
[ Upstream commit
eed913f6919e253f35d454b2f115f2a4db2b741a ]
e1000_xmit_frame is expected to stop the queue and dispatch frames to
hardware if there is not sufficient space for the next frame in the
buffer, but sometimes it failed to do so because the estimated maximum
size of frame was wrong. As the consequence, the later invocation of
e1000_xmit_frame failed with NETDEV_TX_BUSY, and the frame in the buffer
remained forever, resulting in a watchdog failure.
This change fixes the estimated size by making it match with the
condition for NETDEV_TX_BUSY. Apparently, the old estimation failed to
account for the following lines which determines the space requirement
for not causing NETDEV_TX_BUSY:
```
/* reserve a descriptor for the offload context */
if ((mss) || (skb->ip_summed == CHECKSUM_PARTIAL))
count++;
count++;
count += DIV_ROUND_UP(len, adapter->tx_fifo_limit);
```
This issue was found when running http-stress02 test included in Linux
Test Project
20220930 on QEMU with the following commandline:
```
qemu-system-x86_64 -M q35,accel=kvm -m 8G -smp 8
-drive if=virtio,format=raw,file=root.img,file.locking=on
-device e1000e,netdev=netdev
-netdev tap,script=ifup,downscript=no,id=netdev
```
Fixes: 3f32e81cc0ea ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Xiongfeng Wang [Tue, 22 Nov 2022 12:35:08 +0000 (20:35 +0800)]
gpio: amd8111: Fix PCI device reference count leak
[ Upstream commit
45fecdb9f658d9c82960c98240bc0770ade19aca ]
for_each_pci_dev() is implemented by pci_get_device(). The comment of
pci_get_device() says that it will increase the reference count for the
returned pci_dev and also decrease the reference count for the input
pci_dev @from if it is not NULL.
If we break for_each_pci_dev() loop with pdev not NULL, we need to call
pci_dev_put() to decrease the reference count. Add the missing
pci_dev_put() after the 'out' label. Since pci_dev_put() can handle NULL
input parameter, there is no problem for the 'Device not found' branch.
For the normal path, add pci_dev_put() in amd_gpio_exit().
Fixes: 0215c02983ce ("gpio: add a driver for GPIO pins found on AMD-8111 south bridge chips")
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Qiqi Zhang [Fri, 25 Nov 2022 10:45:58 +0000 (18:45 +0800)]
drm/bridge: ti-sn65dsi86: Fix output polarity setting bug
[ Upstream commit
8c115864501fc09932cdfec53d9ec1cde82b4a28 ]
According to the description in ti-sn65dsi86's datasheet:
CHA_HSYNC_POLARITY:
0 = Active High Pulse. Synchronization signal is high for the sync
pulse width. (default)
1 = Active Low Pulse. Synchronization signal is low for the sync
pulse width.
CHA_VSYNC_POLARITY:
0 = Active High Pulse. Synchronization signal is high for the sync
pulse width. (Default)
1 = Active Low Pulse. Synchronization signal is low for the sync
pulse width.
We should only set these bits when the polarity is negative.
Fixes: 75a27075bea1 ("drm/bridge: add support for sn65dsi86 bridge driver")
Signed-off-by: Qiqi Zhang <eddy.zhang@rock-chips.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20221125104558.84616-1-eddy.zhang@rock-chips.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hauke Mehrtens [Mon, 21 Nov 2022 00:22:01 +0000 (01:22 +0100)]
ca8210: Fix crash by zero initializing data
[ Upstream commit
1e24c54da257ab93cff5826be8a793b014a5dc9c ]
The struct cas_control embeds multiple generic SPI structures and we
have to make sure these structures are initialized to default values.
This driver does not set all attributes. When using kmalloc before some
attributes were not initialized and contained random data which caused
random crashes at bootup.
Fixes: 8c74c55c8213 ("ieee802154: Add CA8210 IEEE 802.15.4 device driver")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Link: https://lore.kernel.org/r/20221121002201.1339636-1-hauke@hauke-m.de
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Ziyang Xuan [Sun, 20 Nov 2022 07:50:46 +0000 (15:50 +0800)]
ieee802154: cc2520: Fix error return code in cc2520_hw_init()
[ Upstream commit
4d002d6a2a00ac1c433899bd7625c6400a74cfba ]
In cc2520_hw_init(), if oscillator start failed, the error code
should be returned.
Fixes: 65537e0c4468 ("ieee802154: cc2520: adds driver for TI CC2520 radio")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Link: https://lore.kernel.org/r/20221120075046.2213633-1-william.xuanziyang@huawei.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Oliver Hartkopp [Tue, 6 Dec 2022 20:12:59 +0000 (21:12 +0100)]
can: af_can: fix NULL pointer dereference in can_rcv_filter
commit
0acc442309a0a1b01bcdaa135e56e6398a49439c upstream.
Analogue to commit
16d6a19398a6 ("can: af_can: fix NULL pointer
dereference in can_rx_register()") we need to check for a missing
initialization of ml_priv in the receive path of CAN frames.
Since commit
82817a015495 ("net: introduce CAN specific pointer in the
struct net_device") the check for dev->type to be ARPHRD_CAN is not
sufficient anymore since bonding or tun netdevices claim to be CAN
devices but do not initialize ml_priv accordingly.
Fixes: 82817a015495 ("net: introduce CAN specific pointer in the struct net_device")
Reported-by: syzbot+2d7f58292cb5b29eb5ad@syzkaller.appspotmail.com
Reported-by: Wei Chen <harperchen1110@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/all/20221206201259.3028-1-socketcan@hartkopp.net
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ZhangPeng [Wed, 16 Nov 2022 07:14:28 +0000 (07:14 +0000)]
HID: core: fix shift-out-of-bounds in hid_report_raw_event
commit
ec61b41918587be530398b0d1c9a0d16619397e5 upstream.
Syzbot reported shift-out-of-bounds in hid_report_raw_event.
microsoft 0003:045E:07DA.0001: hid_field_extract() called with n (128) >
32! (swapper/0)
======================================================================
UBSAN: shift-out-of-bounds in drivers/hid/hid-core.c:1323:20
shift exponent 127 is too large for 32-bit type 'int'
CPU: 0 PID: 0 Comm: swapper/0 Not tainted
6.1.0-rc4-syzkaller-00159-g4bbf3422df78 #0
Hardware name: Google Compute Engine/Google Compute Engine, BIOS
Google 10/26/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
ubsan_epilogue lib/ubsan.c:151 [inline]
__ubsan_handle_shift_out_of_bounds+0x3a6/0x420 lib/ubsan.c:322
snto32 drivers/hid/hid-core.c:1323 [inline]
hid_input_fetch_field drivers/hid/hid-core.c:1572 [inline]
hid_process_report drivers/hid/hid-core.c:1665 [inline]
hid_report_raw_event+0xd56/0x18b0 drivers/hid/hid-core.c:1998
hid_input_report+0x408/0x4f0 drivers/hid/hid-core.c:2066
hid_irq_in+0x459/0x690 drivers/hid/usbhid/hid-core.c:284
__usb_hcd_giveback_urb+0x369/0x530 drivers/usb/core/hcd.c:1671
dummy_timer+0x86b/0x3110 drivers/usb/gadget/udc/dummy_hcd.c:1988
call_timer_fn+0xf5/0x210 kernel/time/timer.c:1474
expire_timers kernel/time/timer.c:1519 [inline]
__run_timers+0x76a/0x980 kernel/time/timer.c:1790
run_timer_softirq+0x63/0xf0 kernel/time/timer.c:1803
__do_softirq+0x277/0x75b kernel/softirq.c:571
__irq_exit_rcu+0xec/0x170 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
sysvec_apic_timer_interrupt+0x91/0xb0 arch/x86/kernel/apic/apic.c:1107
======================================================================
If the size of the integer (unsigned n) is bigger than 32 in snto32(),
shift exponent will be too large for 32-bit type 'int', resulting in a
shift-out-of-bounds bug.
Fix this by adding a check on the size of the integer (unsigned n) in
snto32(). To add support for n greater than 32 bits, set n to 32, if n
is greater than 32.
Reported-by: syzbot+8b1641d2f14732407e23@syzkaller.appspotmail.com
Fixes: 5770e309402e ("[PATCH] Generic HID layer - code split")
Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Anastasia Belova [Fri, 11 Nov 2022 12:55:11 +0000 (15:55 +0300)]
HID: hid-lg4ff: Add check for empty lbuf
commit
d180b6496143cd360c5d5f58ae4b9a8229c1f344 upstream.
If an empty buf is received, lbuf is also empty. So lbuf is
accessed by index -1.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: bf2fb4c11b50 ("HID: hid-lg4ff: Allow switching of Logitech gaming wheels between compatibility modes")
Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ankit Patel [Tue, 22 Nov 2022 07:35:20 +0000 (15:35 +0800)]
HID: usbhid: Add ALWAYS_POLL quirk for some mice
commit
f6d910a89a2391e5ce1f275d205023880a33d3f8 upstream.
Some additional USB mouse devices are needing ALWAYS_POLL quirk without
which they disconnect and reconnect every 60s.
Add below devices to the known quirk list.
CHERRY VID 0x046a, PID 0x000c
MICROSOFT VID 0x045e, PID 0x0783
PRIMAX VID 0x0461, PID 0x4e2a
Signed-off-by: Ankit Patel <anpatel@nvidia.com>
Signed-off-by: Haotien Hsu <haotienh@nvidia.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rob Clark [Wed, 30 Nov 2022 18:57:47 +0000 (10:57 -0800)]
drm/shmem-helper: Remove errant put in error path
commit
24013314be6ee4ee456114a671e9fa3461323de8 upstream.
drm_gem_shmem_mmap() doesn't own this reference, resulting in the GEM
object getting prematurely freed leading to a later use-after-free.
Link: https://syzkaller.appspot.com/bug?extid=c8ae65286134dd1b800d
Reported-by: syzbot+c8ae65286134dd1b800d@syzkaller.appspotmail.com
Fixes: 1f4e743ab37e ("drm: Add library for shmem backed GEM objects")
Cc: stable@vger.kernel.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221130185748.357410-2-robdclark@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas Huth [Wed, 23 Nov 2022 09:08:33 +0000 (10:08 +0100)]
KVM: s390: vsie: Fix the initialization of the epoch extension (epdx) field
commit
0dd4cdccdab3d74bd86b868768a7dca216bcce7e upstream.
We recently experienced some weird huge time jumps in nested guests when
rebooting them in certain cases. After adding some debug code to the epoch
handling in vsie.c (thanks to David Hildenbrand for the idea!), it was
obvious that the "epdx" field (the multi-epoch extension) did not get set
to 0xff in case the "epoch" field was negative.
Seems like the code misses to copy the value from the epdx field from
the guest to the shadow control block. By doing so, the weird time
jumps are gone in our scenarios.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2140899
Fixes: 81274e7131c3 ("KVM: s390: Multiple Epoch Facility support")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Cc: stable@vger.kernel.org # 4.19+
Link: https://lore.kernel.org/r/20221123090833.292938-1-thuth@redhat.com
Message-Id: <
20221123090833.292938-1-thuth@redhat.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John Starks [Wed, 7 Dec 2022 06:00:53 +0000 (22:00 -0800)]
mm/gup: fix gup_pud_range() for dax
commit
fcd0ccd836ffad73d98a66f6fea7b16f735ea920 upstream.
For dax pud, pud_huge() returns true on x86. So the function works as long
as hugetlb is configured. However, dax doesn't depend on hugetlb.
Commit
d7f964c3b3fb ("mm/gup: fix gup_pmd_range() for dax") fixed
devmap-backed huge PMDs, but missed devmap-backed huge PUDs. Fix this as
well.
This fixes the below kernel panic:
general protection fault, probably for non-canonical address 0x69e7c000cc478: 0000 [#1] SMP
< snip >
Call Trace:
<TASK>
get_user_pages_fast+0x1f/0x40
iov_iter_get_pages+0xc6/0x3b0
? mempool_alloc+0x5d/0x170
bio_iov_iter_get_pages+0x82/0x4e0
? bvec_alloc+0x91/0xc0
? bio_alloc_bioset+0x19a/0x2a0
blkdev_direct_IO+0x282/0x480
? __io_complete_rw_common+0xc0/0xc0
? filemap_range_has_page+0x82/0xc0
generic_file_direct_write+0x9d/0x1a0
? inode_update_time+0x24/0x30
__generic_file_write_iter+0xbd/0x1e0
blkdev_write_iter+0xb4/0x150
? io_import_iovec+0x8d/0x340
io_write+0xf9/0x300
io_issue_sqe+0x3c3/0x1d30
? sysvec_reschedule_ipi+0x6c/0x80
__io_queue_sqe+0x33/0x240
? fget+0x76/0xa0
io_submit_sqes+0xe6a/0x18d0
? __fget_light+0xd1/0x100
__x64_sys_io_uring_enter+0x199/0x880
? __context_tracking_enter+0x1f/0x70
? irqentry_exit_to_user_mode+0x24/0x30
? irqentry_exit+0x1d/0x30
? __context_tracking_exit+0xe/0x70
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7fc97c11a7be
< snip >
</TASK>
---[ end trace
48b2e0e67debcaeb ]---
RIP: 0010:internal_get_user_pages_fast+0x340/0x990
< snip >
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled
Link: https://lkml.kernel.org/r/1670392853-28252-1-git-send-email-ssengar@linux.microsoft.com
Fixes: d7f964c3b3fb ("mm/gup: fix gup_pmd_range() for dax")
Signed-off-by: John Starks <jostarks@microsoft.com>
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tejun Heo [Thu, 8 Dec 2022 02:53:15 +0000 (16:53 -1000)]
memcg: fix possible use-after-free in memcg_write_event_control()
commit
4a7ba45b1a435e7097ca0f79a847d0949d0eb088 upstream.
memcg_write_event_control() accesses the dentry->d_name of the specified
control fd to route the write call. As a cgroup interface file can't be
renamed, it's safe to access d_name as long as the specified file is a
regular cgroup file. Also, as these cgroup interface files can't be
removed before the directory, it's safe to access the parent too.
Prior to
5cc08e6f6b2d ("memcg: remove cgroup_event->cft"), there was a
call to __file_cft() which verified that the specified file is a regular
cgroupfs file before further accesses. The cftype pointer returned from
__file_cft() was no longer necessary and the commit inadvertently dropped
the file type check with it allowing any file to slip through. With the
invarients broken, the d_name and parent accesses can now race against
renames and removals of arbitrary files and cause use-after-free's.
Fix the bug by resurrecting the file type check in __file_cft(). Now that
cgroupfs is implemented through kernfs, checking the file operations needs
to go through a layer of indirection. Instead, let's check the superblock
and dentry type.
Link: https://lkml.kernel.org/r/Y5FRm/cfcKPGzWwl@slm.duckdns.org
Fixes: 5cc08e6f6b2d ("memcg: remove cgroup_event->cft")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org> [3.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hans Verkuil [Wed, 16 Nov 2022 15:07:22 +0000 (15:07 +0000)]
media: v4l2-dv-timings.c: fix too strict blanking sanity checks
commit
5eef2141776da02772c44ec406d6871a790761ee upstream.
Sanity checks were added to verify the v4l2_bt_timings blanking fields
in order to avoid integer overflows when userspace passes weird values.
But that assumed that userspace would correctly fill in the front porch,
backporch and sync values, but sometimes all you know is the total
blanking, which is then assigned to just one of these fields.
And that can fail with these checks.
So instead set a maximum for the total horizontal and vertical
blanking and check that each field remains below that.
That is still sufficient to avoid integer overflows, but it also
allows for more flexibility in how userspace fills in these fields.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: df0d3d3669a2 ("media: v4l2-dv-timings: add sanity checks for blanking values")
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rafał Miłecki [Wed, 7 Dec 2022 07:01:55 +0000 (08:01 +0100)]
Revert "net: dsa: b53: Fix valid setting for MDB entries"
This reverts commit
344e6a2c1094ecf2147b6445915c1e70f42bea13.
Upstream commit was a fix for an overlook of setting "ent.is_valid"
twice after
2a15aa936298 ("net: dsa: b53: Add support for MDB").
Since MDB support was not backported to stable kernels (it's not a bug
fix) there is nothing to fix there. Backporting this commit resulted in
"env.is_valid" not being set at all.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Juergen Gross [Tue, 6 Dec 2022 07:54:24 +0000 (08:54 +0100)]
xen/netback: don't call kfree_skb() with interrupts disabled
[ Upstream commit
74e7e1efdad45580cc3839f2a155174cf158f9b5 ]
It is not allowed to call kfree_skb() from hardware interrupt
context or with interrupts being disabled. So remove kfree_skb()
from the spin_lock_irqsave() section and use the already existing
"drop" label in xenvif_start_xmit() for dropping the SKB. At the
same time replace the dev_kfree_skb() call there with a call of
dev_kfree_skb_any(), as xenvif_start_xmit() can be called with
disabled interrupts.
This is XSA-424 / CVE-2022-42328 / CVE-2022-42329.
Fixes: a5bc687013cb ("xen/netback: don't queue unlimited number of packages")
Reported-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Juergen Gross [Wed, 8 Jun 2022 04:37:26 +0000 (06:37 +0200)]
xen/netback: do some code cleanup
[ Upstream commit
4cdae8589e7d26bb0ef4f5652d463c13afd24fd8 ]
Remove some unused macros and functions, make local functions static.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20220608043726.9380-1-jgross@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of:
74e7e1efdad4 ("xen/netback: don't call kfree_skb() with interrupts disabled")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Ross Lagerwall [Tue, 22 Nov 2022 09:16:59 +0000 (09:16 +0000)]
xen/netback: Ensure protocol headers don't fall in the non-linear area
[ Upstream commit
ad7f402ae4f466647c3a669b8a6f3e5d4271c84a ]
In some cases, the frontend may send a packet where the protocol headers
are spread across multiple slots. This would result in netback creating
an skb where the protocol headers spill over into the non-linear area.
Some drivers and NICs don't handle this properly resulting in an
interface reset or worse.
This issue was introduced by the removal of an unconditional skb pull in
the tx path to improve performance. Fix this without reintroducing the
pull by setting up grant copy ops for as many slots as needed to reach
the XEN_NETBACK_TX_COPY_LEN size. Adjust the rest of the code to handle
multiple copy operations per skb.
This is XSA-423 / CVE-2022-3643.
Fixes: 08ec38ce457e ("xen-netback: remove unconditional __pskb_pull_tail() in guest Tx path")
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jann Horn [Tue, 6 Dec 2022 17:16:10 +0000 (18:16 +0100)]
mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths
commit
f268f6cf875f3220afc77bdd0bf1bb136eb54db9 upstream.
Any codepath that zaps page table entries must invoke MMU notifiers to
ensure that secondary MMUs (like KVM) don't keep accessing pages which
aren't mapped anymore. Secondary MMUs don't hold their own references to
pages that are mirrored over, so failing to notify them can lead to page
use-after-free.
I'm marking this as addressing an issue introduced in commit
43fcdf499b7c
("khugepaged: add support of collapse for tmpfs/shmem pages"), but most of
the security impact of this only came in commit
33026f37c9ba ("khugepaged:
enable collapse pmd for pte-mapped THP"), which actually omitted flushes
for the removal of present PTEs, not just for the removal of empty page
tables.
Link: https://lkml.kernel.org/r/20221129154730.2274278-3-jannh@google.com
Link: https://lkml.kernel.org/r/20221128180252.1684965-3-jannh@google.com
Link: https://lkml.kernel.org/r/20221125213714.4115729-3-jannh@google.com
Fixes: 43fcdf499b7c ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[manual backport: this code was refactored from two copies into a common
helper between 5.15 and 6.0]
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jann Horn [Tue, 6 Dec 2022 17:16:09 +0000 (18:16 +0100)]
mm/khugepaged: fix GUP-fast interaction by sending IPI
commit
2ba99c5e08812494bc57f319fb562f527d9bacd8 upstream.
Since commit
65e21d68dc361 ("mm: gup: fix the fast GUP race against THP
collapse"), the lockless_pages_from_mm() fastpath rechecks the pmd_t to
ensure that the page table was not removed by khugepaged in between.
However, lockless_pages_from_mm() still requires that the page table is
not concurrently freed. Fix it by sending IPIs (if the architecture uses
semi-RCU-style page table freeing) before freeing/reusing page tables.
Link: https://lkml.kernel.org/r/20221129154730.2274278-2-jannh@google.com
Link: https://lkml.kernel.org/r/20221128180252.1684965-2-jannh@google.com
Link: https://lkml.kernel.org/r/20221125213714.4115729-2-jannh@google.com
Fixes: dee06e1310c9 ("thp: khugepaged")
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[manual backport: two of the three places in khugepaged that can free
ptes were refactored into a common helper between 5.15 and 6.0;
TLB flushing was refactored between 5.4 and 5.10]
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jann Horn [Tue, 6 Dec 2022 17:16:11 +0000 (18:16 +0100)]
mm/khugepaged: take the right locks for page table retraction
commit
8d3c106e19e8d251da31ff4cc7462e4565d65084 upstream.
pagetable walks on address ranges mapped by VMAs can be done under the
mmap lock, the lock of an anon_vma attached to the VMA, or the lock of the
VMA's address_space. Only one of these needs to be held, and it does not
need to be held in exclusive mode.
Under those circumstances, the rules for concurrent access to page table
entries are:
- Terminal page table entries (entries that don't point to another page
table) can be arbitrarily changed under the page table lock, with the
exception that they always need to be consistent for
hardware page table walks and lockless_pages_from_mm().
This includes that they can be changed into non-terminal entries.
- Non-terminal page table entries (which point to another page table)
can not be modified; readers are allowed to READ_ONCE() an entry, verify
that it is non-terminal, and then assume that its value will stay as-is.
Retracting a page table involves modifying a non-terminal entry, so
page-table-level locks are insufficient to protect against concurrent page
table traversal; it requires taking all the higher-level locks under which
it is possible to start a page walk in the relevant range in exclusive
mode.
The collapse_huge_page() path for anonymous THP already follows this rule,
but the shmem/file THP path was getting it wrong, making it possible for
concurrent rmap-based operations to cause corruption.
Link: https://lkml.kernel.org/r/20221129154730.2274278-1-jannh@google.com
Link: https://lkml.kernel.org/r/20221128180252.1684965-1-jannh@google.com
Link: https://lkml.kernel.org/r/20221125213714.4115729-1-jannh@google.com
Fixes: 33026f37c9ba ("khugepaged: enable collapse pmd for pte-mapped THP")
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[manual backport: this code was refactored from two copies into a common
helper between 5.15 and 6.0]
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Davide Tronchin [Mon, 21 Nov 2022 12:54:55 +0000 (13:54 +0100)]
net: usb: qmi_wwan: add u-blox 0x1342 composition
[ Upstream commit
a487069e11b6527373f7c6f435d8998051d0b5d9 ]
Add RmNet support for LARA-L6.
LARA-L6 module can be configured (by AT interface) in three different
USB modes:
* Default mode (Vendor ID: 0x1546 Product ID: 0x1341) with 4 serial
interfaces
* RmNet mode (Vendor ID: 0x1546 Product ID: 0x1342) with 4 serial
interfaces and 1 RmNet virtual network interface
* CDC-ECM mode (Vendor ID: 0x1546 Product ID: 0x1343) with 4 serial
interface and 1 CDC-ECM virtual network interface
In RmNet mode LARA-L6 exposes the following interfaces:
If 0: Diagnostic
If 1: AT parser
If 2: AT parser
If 3: AT parset/alternative functions
If 4: RMNET interface
Signed-off-by: Davide Tronchin <davide.tronchin.94@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Dominique Martinet [Fri, 18 Nov 2022 13:44:41 +0000 (22:44 +0900)]
9p/xen: check logical size for buffer size
[ Upstream commit
391c18cf776eb4569ecda1f7794f360fe0a45a26 ]
trans_xen did not check the data fits into the buffer before copying
from the xen ring, but we probably should.
Add a check that just skips the request and return an error to
userspace if it did not fit
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Link: https://lkml.kernel.org/r/20221118135542.63400-1-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Tetsuo Handa [Thu, 17 Nov 2022 15:27:58 +0000 (00:27 +0900)]
fbcon: Use kzalloc() in fbcon_prepare_logo()
[ Upstream commit
a6a00d7e8ffd78d1cdb7a43f1278f081038c638f ]
A kernel built with syzbot's config file reported that
scr_memcpyw(q, save, array3_size(logo_lines, new_cols, 2))
causes uninitialized "save" to be copied.
----------
[drm] Initialized vgem 1.0.0
20120112 for vgem on minor 0
[drm] Initialized vkms 1.0.0
20180514 for vkms on minor 1
Console: switching to colour frame buffer device 128x48
=====================================================
BUG: KMSAN: uninit-value in do_update_region+0x4b8/0xba0
do_update_region+0x4b8/0xba0
update_region+0x40d/0x840
fbcon_switch+0x3364/0x35e0
redraw_screen+0xae3/0x18a0
do_bind_con_driver+0x1cb3/0x1df0
do_take_over_console+0x11cb/0x13f0
fbcon_fb_registered+0xacc/0xfd0
register_framebuffer+0x1179/0x1320
__drm_fb_helper_initial_config_and_unlock+0x23ad/0x2b40
drm_fbdev_client_hotplug+0xbea/0xda0
drm_fbdev_generic_setup+0x65e/0x9d0
vkms_init+0x9f3/0xc76
(...snipped...)
Uninit was stored to memory at:
fbcon_prepare_logo+0x143b/0x1940
fbcon_init+0x2c1b/0x31c0
visual_init+0x3e7/0x820
do_bind_con_driver+0x14a4/0x1df0
do_take_over_console+0x11cb/0x13f0
fbcon_fb_registered+0xacc/0xfd0
register_framebuffer+0x1179/0x1320
__drm_fb_helper_initial_config_and_unlock+0x23ad/0x2b40
drm_fbdev_client_hotplug+0xbea/0xda0
drm_fbdev_generic_setup+0x65e/0x9d0
vkms_init+0x9f3/0xc76
(...snipped...)
Uninit was created at:
__kmem_cache_alloc_node+0xb69/0x1020
__kmalloc+0x379/0x680
fbcon_prepare_logo+0x704/0x1940
fbcon_init+0x2c1b/0x31c0
visual_init+0x3e7/0x820
do_bind_con_driver+0x14a4/0x1df0
do_take_over_console+0x11cb/0x13f0
fbcon_fb_registered+0xacc/0xfd0
register_framebuffer+0x1179/0x1320
__drm_fb_helper_initial_config_and_unlock+0x23ad/0x2b40
drm_fbdev_client_hotplug+0xbea/0xda0
drm_fbdev_generic_setup+0x65e/0x9d0
vkms_init+0x9f3/0xc76
(...snipped...)
CPU: 2 PID: 1 Comm: swapper/0 Not tainted
6.1.0-rc4-00356-g8f2975c2bb4c #924
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
----------
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/cad03d25-0ea0-32c4-8173-fd1895314bce@I-love.SAKURA.ne.jp
Signed-off-by: Sasha Levin <sashal@kernel.org>
Andreas Kemnade [Sun, 20 Nov 2022 22:12:08 +0000 (23:12 +0100)]
regulator: twl6030: fix get status of twl6032 regulators
[ Upstream commit
31a6297b89aabc81b274c093a308a7f5b55081a7 ]
Status is reported as always off in the 6032 case. Status
reporting now matches the logic in the setters. Once of
the differences to the 6030 is that there are no groups,
therefore the state needs to be read out in the lower bits.
Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
Link: https://lore.kernel.org/r/20221120221208.3093727-3-andreas@kemnade.info
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Srinivasa Rao Mandadapu [Tue, 22 Nov 2022 06:31:13 +0000 (12:01 +0530)]
ASoC: soc-pcm: Add NULL check in BE reparenting
[ Upstream commit
db8f91d424fe0ea6db337aca8bc05908bbce1498 ]
Add NULL check in dpcm_be_reparent API, to handle
kernel NULL pointer dereference error.
The issue occurred in fuzzing test.
Signed-off-by: Srinivasa Rao Mandadapu <quic_srivasam@quicinc.com>
Link: https://lore.kernel.org/r/1669098673-29703-1-git-send-email-quic_srivasam@quicinc.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Filipe Manana [Tue, 15 Nov 2022 16:29:44 +0000 (16:29 +0000)]
btrfs: send: avoid unaligned encoded writes when attempting to clone range
[ Upstream commit
a11452a3709e217492798cf3686ac2cc8eb3fb51 ]
When trying to see if we can clone a file range, there are cases where we
end up sending two write operations in case the inode from the source root
has an i_size that is not sector size aligned and the length from the
current offset to its i_size is less than the remaining length we are
trying to clone.
Issuing two write operations when we could instead issue a single write
operation is not incorrect. However it is not optimal, specially if the
extents are compressed and the flag BTRFS_SEND_FLAG_COMPRESSED was passed
to the send ioctl. In that case we can end up sending an encoded write
with an offset that is not sector size aligned, which makes the receiver
fallback to decompressing the data and writing it using regular buffered
IO (so re-compressing the data in case the fs is mounted with compression
enabled), because encoded writes fail with -EINVAL when an offset is not
sector size aligned.
The following example, which triggered a bug in the receiver code for the
fallback logic of decompressing + regular buffer IO and is fixed by the
patchset referred in a Link at the bottom of this changelog, is an example
where we have the non-optimal behaviour due to an unaligned encoded write:
$ cat test.sh
#!/bin/bash
DEV=/dev/sdj
MNT=/mnt/sdj
mkfs.btrfs -f $DEV > /dev/null
mount -o compress $DEV $MNT
# File foo has a size of 33K, not aligned to the sector size.
xfs_io -f -c "pwrite -S 0xab 0 33K" $MNT/foo
xfs_io -f -c "pwrite -S 0xcd 0 64K" $MNT/bar
# Now clone the first 32K of file bar into foo at offset 0.
xfs_io -c "reflink $MNT/bar 0 0 32K" $MNT/foo
# Snapshot the default subvolume and create a full send stream (v2).
btrfs subvolume snapshot -r $MNT $MNT/snap
btrfs send --compressed-data -f /tmp/test.send $MNT/snap
echo -e "\nFile bar in the original filesystem:"
od -A d -t x1 $MNT/snap/bar
umount $MNT
mkfs.btrfs -f $DEV > /dev/null
mount $DEV $MNT
echo -e "\nReceiving stream in a new filesystem..."
btrfs receive -f /tmp/test.send $MNT
echo -e "\nFile bar in the new filesystem:"
od -A d -t x1 $MNT/snap/bar
umount $MNT
Before this patch, the send stream included one regular write and one
encoded write for file 'bar', with the later being not sector size aligned
and causing the receiver to fallback to decompression + buffered writes.
The output of the btrfs receive command in verbose mode (-vvv):
(...)
mkfile o258-7-0
rename o258-7-0 -> bar
utimes
clone bar - source=foo source offset=0 offset=0 length=32768
write bar - offset=32768 length=1024
encoded_write bar - offset=33792, len=4096, unencoded_offset=33792, unencoded_file_len=31744, unencoded_len=65536, compression=1, encryption=0
encoded_write bar - falling back to decompress and write due to errno 22 ("Invalid argument")
(...)
This patch avoids the regular write followed by an unaligned encoded write
so that we end up sending a single encoded write that is aligned. So after
this patch the stream content is (output of btrfs receive -vvv):
(...)
mkfile o258-7-0
rename o258-7-0 -> bar
utimes
clone bar - source=foo source offset=0 offset=0 length=32768
encoded_write bar - offset=32768, len=4096, unencoded_offset=32768, unencoded_file_len=32768, unencoded_len=65536, compression=1, encryption=0
(...)
So we get more optimal behaviour and avoid the silent data loss bug in
versions of btrfs-progs affected by the bug referred by the Link tag
below (btrfs-progs v5.19, v5.19.1, v6.0 and v6.0.1).
Link: https://lore.kernel.org/linux-btrfs/cover.1668529099.git.fdmanana@suse.com/
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Kees Cook [Fri, 18 Nov 2022 23:23:50 +0000 (15:23 -0800)]
ALSA: seq: Fix function prototype mismatch in snd_seq_expand_var_event
[ Upstream commit
05530ef7cf7c7d700f6753f058999b1b5099a026 ]
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed.
seq_copy_in_user() and seq_copy_in_kernel() did not have prototypes
matching snd_seq_dump_func_t. Adjust this and remove the casts. There
are not resulting binary output differences.
This was found as a result of Clang's new -Wcast-function-type-strict
flag, which is more sensitive than the simpler -Wcast-function-type,
which only checks for type width mismatches.
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/202211041527.HD8TLSE1-lkp@intel.com
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: alsa-devel@alsa-project.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221118232346.never.380-kees@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Konrad Dybcio [Fri, 18 Nov 2022 13:10:35 +0000 (14:10 +0100)]
regulator: slg51000: Wait after asserting CS pin
[ Upstream commit
0b24dfa587c6cc7484cfb170da5c7dd73451f670 ]
Sony's downstream driver [1], among some other changes, adds a
seemingly random 10ms usleep_range, which turned out to be necessary
for the hardware to function properly on at least Sony Xperia 1 IV.
Without this, I2C transactions with the SLG51000 straight up fail.
Relax (10-10ms -> 10-11ms) and add the aforementioned sleep to make
sure the hardware has some time to wake up.
(nagara-2.0.0-mlc/vendor/semc/hardware/camera-kernel-module/)
[1] https://developer.sony.com/file/download/open-source-archive-for-64-0-m-4-29/
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20221118131035.54874-1-konrad.dybcio@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
GUO Zihua [Thu, 17 Nov 2022 09:11:59 +0000 (17:11 +0800)]
9p/fd: Use P9_HDRSZ for header size
[ Upstream commit
6854fadbeee10891ed74246bdc05031906b6c8cf ]
Cleanup hardcoded header sizes to use P9_HDRSZ instead of '7'
Link: https://lkml.kernel.org/r/20221117091159.31533-4-guozihua@huawei.com
Signed-off-by: GUO Zihua <guozihua@huawei.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
[Dominique: commit message adjusted to make sense after offset size
adjustment got removed]
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Johan Jonker [Sun, 30 Oct 2022 20:56:29 +0000 (21:56 +0100)]
ARM: dts: rockchip: disable arm_global_timer on rk3066 and rk3188
[ Upstream commit
da74858a475782a3f16470907814c8cc5950ad68 ]
The clock source and the sched_clock provided by the arm_global_timer
on Rockchip rk3066a/rk3188 are quite unstable because their rates
depend on the CPU frequency.
Recent changes to the arm_global_timer driver makes it impossible to use.
On the other side, the arm_global_timer has a higher rating than the
ROCKCHIP_TIMER, it will be selected by default by the time framework
while we want to use the stable Rockchip clock source.
Keep the arm_global_timer disabled in order to have the
DW_APB_TIMER (rk3066a) or ROCKCHIP_TIMER (rk3188) selected by default.
Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Link: https://lore.kernel.org/r/f275ca8d-fd0a-26e5-b978-b7f3df815e0a@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Giulio Benetti [Fri, 4 Nov 2022 20:46:18 +0000 (21:46 +0100)]
ARM: 9266/1: mm: fix no-MMU ZERO_PAGE() implementation
[ Upstream commit
340a982825f76f1cff0daa605970fe47321b5ee7 ]
Actually in no-MMU SoCs(i.e. i.MXRT) ZERO_PAGE(vaddr) expands to
```
virt_to_page(0)
```
that in order expands to:
```
pfn_to_page(virt_to_pfn(0))
```
and then virt_to_pfn(0) to:
```
((((unsigned long)(0) - PAGE_OFFSET) >> PAGE_SHIFT) +
PHYS_PFN_OFFSET)
```
where PAGE_OFFSET and PHYS_PFN_OFFSET are the DRAM offset(0x80000000) and
PAGE_SHIFT is 12. This way we obtain 16MB(0x01000000) summed to the base of
DRAM(0x80000000).
When ZERO_PAGE(0) is then used, for example in bio_add_page(), the page
gets an address that is out of DRAM bounds.
So instead of using fake virtual page 0 let's allocate a dedicated
zero_page during paging_init() and assign it to a global 'struct page *
empty_zero_page' the same way mmu.c does and it's the same approach used
in m68k with commit
06f3928b52d0 as discussed here[0]. Then let's move
ZERO_PAGE() definition to the top of pgtable.h to be in common between
mmu.c and nommu.c.
[0]: https://lore.kernel.org/linux-m68k/
2a462b23-5b8e-bbf4-ec7d-
778434a3b9d7@google.com/T/#m1266ceb63
ad140743174d6b3070364d3c9a5179b
Signed-off-by: Giulio Benetti <giulio.benetti@benettiengineering.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Tomislav Novak [Mon, 26 Sep 2022 15:09:12 +0000 (16:09 +0100)]
ARM: 9251/1: perf: Fix stacktraces for tracepoint events in THUMB2 kernels
[ Upstream commit
612695bccfdbd52004551308a55bae410e7cd22f ]
Store the frame address where arm_get_current_stackframe() looks for it
(ARM_r7 instead of ARM_fp if CONFIG_THUMB2_KERNEL=y). Otherwise frame->fp
gets set to 0, causing unwind_frame() to fail.
# bpftrace -e 't:sched:sched_switch { @[kstack] = count(); exit(); }'
Attaching 1 probe...
@[
__schedule+1059
]: 1
A typical first unwind instruction is 0x97 (SP = R7), so after executing
it SP ends up being 0 and -URC_FAILURE is returned.
unwind_frame(pc =
ac9da7d7 lr =
00000000 sp =
c69bdda0 fp =
00000000)
unwind_find_idx(
ac9da7d7)
unwind_exec_insn: insn =
00000097
unwind_exec_insn: fp =
00000000 sp =
00000000 lr =
00000000 pc =
00000000
With this patch:
# bpftrace -e 't:sched:sched_switch { @[kstack] = count(); exit(); }'
Attaching 1 probe...
@[
__schedule+1059
__schedule+1059
schedule+79
schedule_hrtimeout_range_clock+163
schedule_hrtimeout_range+17
ep_poll+471
SyS_epoll_wait+111
sys_epoll_pwait+231
__ret_fast_syscall+1
]: 1
Link: https://lore.kernel.org/r/20220920230728.2617421-1-tnovak@fb.com/
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Tomislav Novak <tnovak@fb.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Johan Jonker [Wed, 26 Oct 2022 23:31:37 +0000 (01:31 +0200)]
ARM: dts: rockchip: rk3188: fix lcdc1-rgb24 node name
[ Upstream commit
11871e20bcb23c00966e785a124fb72bc8340af4 ]
The lcdc1-rgb24 node name is out of line with the rest
of the rk3188 lcdc1 node, so fix it.
Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Link: https://lore.kernel.org/r/7b9c0a6f-626b-07e8-ae74-7e0f08b8d241@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Johan Jonker [Thu, 27 Oct 2022 08:58:22 +0000 (10:58 +0200)]
ARM: dts: rockchip: fix ir-receiver node names
[ Upstream commit
dd847fe34cdf1e89afed1af24986359f13082bfb ]
Fix ir-receiver node names on Rockchip boards,
so that they match with regex: '^ir(-receiver)?(@[a-f0-9]+)?$'
Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Link: https://lore.kernel.org/r/ea5af279-f44c-afea-023d-bb37f5a0d58d@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sebastian Reichel [Mon, 24 Oct 2022 16:55:46 +0000 (18:55 +0200)]
arm: dts: rockchip: fix node name for hym8563 rtc
[ Upstream commit
17b57beafccb4569accbfc8c11390744cf59c021 ]
Fix the node name for hym8563 in all arm rockchip devicetrees.
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Link: https://lore.kernel.org/r/20221024165549.74574-4-sebastian.reichel@collabora.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
FUKAUMI Naoki [Sat, 24 Sep 2022 11:28:12 +0000 (11:28 +0000)]
arm64: dts: rockchip: keep I2S1 disabled for GPIO function on ROCK Pi 4 series
[ Upstream commit
849c19d14940b87332d5d59c7fc581d73f2099fd ]
I2S1 pins are exposed on 40-pin header on Radxa ROCK Pi 4 series.
their default function is GPIO, so I2S1 need to be disabled.
Signed-off-by: FUKAUMI Naoki <naoki@radxa.com>
Link: https://lore.kernel.org/r/20220924112812.1219-1-naoki@radxa.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Greg Kroah-Hartman [Thu, 8 Dec 2022 10:23:06 +0000 (11:23 +0100)]
Linux 5.4.226
Link: https://lore.kernel.org/r/20221205190808.733996403@linuxfoundation.org
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Allen Pais <apais@linux.microsoft.com>
Link: https://lore.kernel.org/r/20221206124054.310184563@linuxfoundation.org
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Allen Pais <apais@linux.microsoft.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Hulk Robot <hulkrobot@huawei.com>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jann Horn [Mon, 5 Dec 2022 16:59:27 +0000 (17:59 +0100)]
ipc/sem: Fix dangling sem_array access in semtimedop race
[ Upstream commit
b52be557e24c47286738276121177a41f54e3b83 ]
When __do_semtimedop() goes to sleep because it has to wait for a
semaphore value becoming zero or becoming bigger than some threshold, it
links the on-stack sem_queue to the sem_array, then goes to sleep
without holding a reference on the sem_array.
When __do_semtimedop() comes back out of sleep, one of two things must
happen:
a) We prove that the on-stack sem_queue has been disconnected from the
(possibly freed) sem_array, making it safe to return from the stack
frame that the sem_queue exists in.
b) We stabilize our reference to the sem_array, lock the sem_array, and
detach the sem_queue from the sem_array ourselves.
sem_array has RCU lifetime, so for case (b), the reference can be
stabilized inside an RCU read-side critical section by locklessly
checking whether the sem_queue is still connected to the sem_array.
However, the current code does the lockless check on sem_queue before
starting an RCU read-side critical section, so the result of the
lockless check immediately becomes useless.
Fix it by doing rcu_read_lock() before the lockless check. Now RCU
ensures that if we observe the object being on our queue, the object
can't be freed until rcu_read_unlock().
This bug is only hittable on kernel builds with full preemption support
(either CONFIG_PREEMPT or PREEMPT_DYNAMIC with preempt=full).
Fixes: 17528e5d3466 ("ipc/sem: avoid idr tree lookup for interrupted semop")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Linus Torvalds [Thu, 1 Dec 2022 00:10:52 +0000 (16:10 -0800)]
v4l2: don't fall back to follow_pfn() if pin_user_pages_fast() fails
commit
6647e76ab623b2b3fb2efe03a86e9c9046c52c33 upstream.
The V4L2_MEMORY_USERPTR interface is long deprecated and shouldn't be
used (and is discouraged for any modern v4l drivers). And Seth Jenkins
points out that the fallback to VM_PFNMAP/VM_IO is fundamentally racy
and dangerous.
Note that it's not even a case that should trigger, since any normal
user pointer logic ends up just using the pin_user_pages_fast() call
that does the proper page reference counting. That's not the problem
case, only if you try to use special device mappings do you have any
issues.
Normally I'd just remove this during the merge window, but since Seth
pointed out the problem cases, we really want to know as soon as
possible if there are actually any users of this odd special case of a
legacy interface. Neither Hans nor Mauro seem to think that such
mis-uses of the old legacy interface should exist. As Mauro says:
"See, V4L2 has actually 4 streaming APIs:
- Kernel-allocated mmap (usually referred simply as just mmap);
- USERPTR mmap;
- read();
- dmabuf;
The USERPTR is one of the oldest way to use it, coming from V4L
version 1 times, and by far the least used one"
And Hans chimed in on the USERPTR interface:
"To be honest, I wouldn't mind if it goes away completely, but that's a
bit of a pipe dream right now"
but while removing this legacy interface entirely may be a pipe dream we
can at least try to remove the unlikely (and actively broken) case of
using special device mappings for USERPTR accesses.
This replaces it with a WARN_ONCE() that we can remove once we've
hopefully confirmed that no actual users exist.
NOTE! Longer term, this means that a 'struct frame_vector' only ever
contains proper page pointers, and all the games we have with converting
them to pages can go away (grep for 'frame_vector_to_pages()' and the
uses of 'vec->is_pfns'). But this is just the first step, to verify
that this code really is all dead, and do so as quickly as possible.
Reported-by: Seth Jenkins <sethjenkins@google.com>
Acked-by: Hans Verkuil <hverkuil@xs4all.nl>
Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Mon, 5 Dec 2022 20:09:06 +0000 (12:09 -0800)]
proc: proc_skip_spaces() shouldn't think it is working on C strings
commit
bce9332220bd677d83b19d21502776ad555a0e73 upstream.
proc_skip_spaces() seems to think it is working on C strings, and ends
up being just a wrapper around skip_spaces() with a really odd calling
convention.
Instead of basing it on skip_spaces(), it should have looked more like
proc_skip_char(), which really is the exact same function (except it
skips a particular character, rather than whitespace). So use that as
inspiration, odd coding and all.
Now the calling convention actually makes sense and works for the
intended purpose.
Reported-and-tested-by: Kyle Zeng <zengyhkyle@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Mon, 5 Dec 2022 19:33:40 +0000 (11:33 -0800)]
proc: avoid integer type confusion in get_proc_long
commit
e6cfaf34be9fcd1a8285a294e18986bfc41a409c upstream.
proc_get_long() is passed a size_t, but then assigns it to an 'int'
variable for the length. Let's not do that, even if our IO paths are
limited to MAX_RW_COUNT (exactly because of these kinds of type errors).
So do the proper test in the rigth type.
Reported-by: Kyle Zeng <zengyhkyle@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Adrian Hunter [Mon, 28 Nov 2022 13:32:56 +0000 (15:32 +0200)]
mmc: sdhci: Fix voltage switch delay
commit
c981cdfb9925f64a364f13c2b4f98f877308a408 upstream.
Commit
1bcb2ef5efa9 ("mmc: sdhci: update signal voltage switch code")
removed voltage switch delays from sdhci because mmc core had been
enhanced to support them. However that assumed that sdhci_set_ios()
did a single clock change, which it did not, and so the delays in mmc
core, which should have come after the first clock change, were not
effective.
Fix by avoiding re-configuring UHS and preset settings when the clock
is turning on and the settings have not changed. That then also avoids
the associated clock changes, so that then sdhci_set_ios() does a single
clock change when voltage switching, and the mmc core delays become
effective.
To do that has meant keeping track of driver strength (host->drv_type),
and cases of reinitialization (host->reinit_uhs).
Note also, the 'turning_on_clk' restriction should not be necessary
but is done to minimize the impact of the change on stable kernels.
Fixes: 1bcb2ef5efa9 ("mmc: sdhci: update signal voltage switch code")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20221128133259.38305-2-adrian.hunter@intel.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Masahiro Yamada [Thu, 12 Mar 2020 11:00:50 +0000 (20:00 +0900)]
mmc: sdhci: use FIELD_GET for preset value bit masks
commit
8473ebd56ddaca5dff15b963ae24c1df272d094b upstream.
Use the FIELD_GET macro to get access to the register fields.
Delete the shift macros.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Link: https://lore.kernel.org/r/20200312110050.21732-1-yamada.masahiro@socionext.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Dabros [Mon, 28 Nov 2022 19:56:51 +0000 (20:56 +0100)]
char: tpm: Protect tpm_pm_suspend with locks
commit
23393c6461422df5bf8084a086ada9a7e17dc2ba upstream.
Currently tpm transactions are executed unconditionally in
tpm_pm_suspend() function, which may lead to races with other tpm
accessors in the system.
Specifically, the hw_random tpm driver makes use of tpm_get_random(),
and this function is called in a loop from a kthread, which means it's
not frozen alongside userspace, and so can race with the work done
during system suspend:
tpm tpm0: tpm_transmit: tpm_recv: error -52
tpm tpm0: invalid TPM_STS.x 0xff, dumping stack for forensics
CPU: 0 PID: 1 Comm: init Not tainted 6.1.0-rc5+ #135
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-20220807_005459-localhost 04/01/2014
Call Trace:
tpm_tis_status.cold+0x19/0x20
tpm_transmit+0x13b/0x390
tpm_transmit_cmd+0x20/0x80
tpm1_pm_suspend+0xa6/0x110
tpm_pm_suspend+0x53/0x80
__pnp_bus_suspend+0x35/0xe0
__device_suspend+0x10f/0x350
Fix this by calling tpm_try_get_ops(), which itself is a wrapper around
tpm_chip_start(), but takes the appropriate mutex.
Signed-off-by: Jan Dabros <jsd@semihalf.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/all/c5ba47ef-393f-1fba-30bd-1230d1b4b592@suse.cz/
Cc: stable@vger.kernel.org
Fixes: be2d459bc6cb ("tpm: turn on TPM on suspend for TPM 1.x")
[Jason: reworked commit message, added metadata]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Conor Dooley [Tue, 22 Nov 2022 12:16:21 +0000 (12:16 +0000)]
Revert "clocksource/drivers/riscv: Events are stopped during CPU suspend"
[ Upstream commit
d9f15a9de44affe733e34f93bc184945ba277e6d ]
This reverts commit
6475a3b55a3b502e7d571173d4128e24955a874b.
On the subject of suspend, the RISC-V SBI spec states:
This does not cover whether any given events actually reach the hart or
not, just what the hart will do if it receives an event. On PolarFire
SoC, and potentially other SiFive based implementations, events from the
RISC-V timer do reach a hart during suspend. This is not the case for the
implementation on the Allwinner D1 - there timer events are not received
during suspend.
To fix this, the CLOCK_EVT_FEAT_C3STOP (mis)feature was enabled for the
timer driver - but this has broken both RCU stall detection and timers
generally on PolarFire SoC and potentially other SiFive based
implementations.
If an AXI read to the PCIe controller on PolarFire SoC times out, the
system will stall, however, with CLOCK_EVT_FEAT_C3STOP active, the system
just locks up without RCU stalling:
io scheduler mq-deadline registered
io scheduler kyber registered
microchip-pcie
2000000000.pcie: host bridge /soc/pcie@
2000000000 ranges:
microchip-pcie
2000000000.pcie: MEM 0x2008000000..0x2087ffffff -> 0x0008000000
microchip-pcie
2000000000.pcie: sec error in pcie2axi buffer
microchip-pcie
2000000000.pcie: ded error in pcie2axi buffer
microchip-pcie
2000000000.pcie: axi read request error
microchip-pcie
2000000000.pcie: axi read timeout
microchip-pcie
2000000000.pcie: sec error in pcie2axi buffer
microchip-pcie
2000000000.pcie: ded error in pcie2axi buffer
microchip-pcie
2000000000.pcie: sec error in pcie2axi buffer
microchip-pcie
2000000000.pcie: ded error in pcie2axi buffer
microchip-pcie
2000000000.pcie: sec error in pcie2axi buffer
microchip-pcie
2000000000.pcie: ded error in pcie2axi buffer
Freeing initrd memory: 7332K
Similarly issues were reported with clock_nanosleep() - with a test app
that sleeps each cpu for 6, 5, 4, 3 ms respectively, HZ=250 & the blamed
commit in place, the sleep times are rounded up to the next jiffy:
== CPU: 1 == == CPU: 2 == == CPU: 3 == == CPU: 4 ==
Mean: 7.974992 Mean: 7.976534 Mean: 7.962591 Mean: 3.952179
Std Dev: 0.154374 Std Dev: 0.156082 Std Dev: 0.171018 Std Dev: 0.076193
Hi: 9.472000 Hi: 10.495000 Hi: 8.864000 Hi: 4.736000
Lo: 6.087000 Lo: 6.380000 Lo: 4.872000 Lo: 3.403000
Samples: 521 Samples: 521 Samples: 521 Samples: 521
Fortunately, the D1 has a second timer, which is "currently used in
preference to the RISC-V/SBI timer driver" so a revert here does not
hurt operation of D1 in its current form.
Ultimately, a DeviceTree property (or node) will be added to encode the
behaviour of the timers, but until then revert the addition of
CLOCK_EVT_FEAT_C3STOP.
Fixes: 6475a3b55a3b ("clocksource/drivers/riscv: Events are stopped during CPU suspend")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/linux-riscv/YzYTNQRxLr7Q9JR0@spud/
Link: https://github.com/riscv-non-isa/riscv-sbi-doc/issues/98/
Link: https://lore.kernel.org/linux-riscv/bf6d3b1f-f703-4a25-833e-972a44a04114@sholland.org/
Link: https://lore.kernel.org/r/20221122121620.3522431-1-conor.dooley@microchip.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Michael Kelley [Sun, 4 Dec 2022 21:52:01 +0000 (13:52 -0800)]
x86/ioremap: Fix page aligned size calculation in __ioremap_caller()
[ Upstream commit
4dbd6a3e90e03130973688fd79e19425f720d999 ]
Current code re-calculates the size after aligning the starting and
ending physical addresses on a page boundary. But the re-calculation
also embeds the masking of high order bits that exceed the size of
the physical address space (via PHYSICAL_PAGE_MASK). If the masking
removes any high order bits, the size calculation results in a huge
value that is likely to immediately fail.
Fix this by re-calculating the page-aligned size first. Then mask any
high order bits using PHYSICAL_PAGE_MASK.
Fixes: cd4419f5a482 ("x86, ioremap: Fix incorrect physical address handling in PAE mode")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/1668624097-14884-2-git-send-email-mikelley@microsoft.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Luiz Augusto von Dentz [Mon, 31 Oct 2022 23:10:32 +0000 (16:10 -0700)]
Bluetooth: L2CAP: Fix accepting connection request for invalid SPSM
commit
dfb6df4bc13d2371a15749c98a36a6ff0caad82a upstream.
The Bluetooth spec states that the valid range for SPSM is from
0x0001-0x00ff so it is invalid to accept values outside of this range:
BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 3, Part A
page 1059:
Table 4.15: L2CAP_LE_CREDIT_BASED_CONNECTION_REQ SPSM ranges
CVE: CVE-2022-42896
CC: stable@vger.kernel.org
Reported-by: Tamás Koczka <poprdi@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Reviewed-by: Tedd Ho-Jeong An <tedd.an@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pawan Gupta [Thu, 1 Dec 2022 22:55:50 +0000 (14:55 -0800)]
x86/pm: Add enumeration check before spec MSRs save/restore setup
commit
50bcceb7724e471d9b591803889df45dcbb584bc upstream.
pm_save_spec_msr() keeps a list of all the MSRs which _might_ need
to be saved and restored at hibernate and resume. However, it has
zero awareness of CPU support for these MSRs. It mostly works by
unconditionally attempting to manipulate these MSRs and relying on
rdmsrl_safe() being able to handle a #GP on CPUs where the support is
unavailable.
However, it's possible for reads (RDMSR) to be supported for a given MSR
while writes (WRMSR) are not. In this case, msr_build_context() sees
a successful read (RDMSR) and marks the MSR as valid. Then, later, a
write (WRMSR) fails, producing a nasty (but harmless) error message.
This causes restore_processor_state() to try and restore it, but writing
this MSR is not allowed on the Intel Atom N2600 leading to:
unchecked MSR access error: WRMSR to 0x122 (tried to write 0x0000000000000002) \
at rIP: 0xffffffff8b07a574 (native_write_msr+0x4/0x20)
Call Trace:
<TASK>
restore_processor_state
x86_acpi_suspend_lowlevel
acpi_suspend_enter
suspend_devices_and_enter
pm_suspend.cold
state_store
kernfs_fop_write_iter
vfs_write
ksys_write
do_syscall_64
? do_syscall_64
? up_read
? lock_is_held_type
? asm_exc_page_fault
? lockdep_hardirqs_on
entry_SYSCALL_64_after_hwframe
To fix this, add the corresponding X86_FEATURE bit for each MSR. Avoid
trying to manipulate the MSR when the feature bit is clear. This
required adding a X86_FEATURE bit for MSRs that do not have one already,
but it's a small price to pay.
[ bp: Move struct msr_enumeration inside the only function that uses it. ]
[Pawan: Resolve build issue in backport]
Fixes: a754861978c5 ("x86/pm: Save the MSR validity status at context setup")
Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/c24db75d69df6e66c0465e13676ad3f2837a2ed8.1668539735.git.pawan.kumar.gupta@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pawan Gupta [Thu, 1 Dec 2022 22:55:43 +0000 (14:55 -0800)]
x86/tsx: Add a feature bit for TSX control MSR support
commit
aaa65d17eec372c6a9756833f3964ba05b05ea14 upstream.
Support for the TSX control MSR is enumerated in MSR_IA32_ARCH_CAPABILITIES.
This is different from how other CPU features are enumerated i.e. via
CPUID. Currently, a call to tsx_ctrl_is_supported() is required for
enumerating the feature. In the absence of a feature bit for TSX control,
any code that relies on checking feature bits directly will not work.
In preparation for adding a feature bit check in MSR save/restore
during suspend/resume, set a new feature bit X86_FEATURE_TSX_CTRL when
MSR_IA32_TSX_CTRL is present.
[ bp: Remove tsx_ctrl_is_supported()]
[Pawan: Resolved conflicts in backport; Removed parts of commit message
referring to removed function tsx_ctrl_is_supported()]
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/de619764e1d98afbb7a5fa58424f1278ede37b45.1668539735.git.pawan.kumar.gupta@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Keith Busch [Thu, 22 Sep 2022 15:13:47 +0000 (08:13 -0700)]
nvme: ensure subsystem reset is single threaded
commit
7a1ebcdac9d52bb7416bc410ed60d4aee92f9a5b upstream.
The subsystem reset writes to a register, so we have to ensure the
device state is capable of handling that otherwise the driver may access
unmapped registers. Use the state machine to ensure the subsystem reset
doesn't try to write registers on a device already undergoing this type
of reset.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214771
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Keith Busch [Thu, 22 Sep 2022 14:54:06 +0000 (07:54 -0700)]
nvme: restrict management ioctls to admin
commit
c0393ef282ed49a7c8955a6f12e9803773568b84 upstream.
The passthrough commands already have this restriction, but the other
operations do not. Require the same capabilities for all users as all of
these operations, which include resets and rescans, can be disruptive.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Soheil Hassas Yeganeh [Fri, 18 Dec 2020 22:01:44 +0000 (14:01 -0800)]
epoll: check for events when removing a timed out thread from the wait queue
commit
cf6146f9636f58ba1f21634b105bfdf9f99292b2 upstream.
Patch series "simplify ep_poll".
This patch series is a followup based on the suggestions and feedback by
Linus:
https://lkml.kernel.org/r/CAHk-=wizk=OxUyQPbO8MS41w2Pag1kniUV5WdD5qWL-gq1kjDA@mail.gmail.com
The first patch in the series is a fix for the epoll race in presence of
timeouts, so that it can be cleanly backported to all affected stable
kernels.
The rest of the patch series simplify the ep_poll() implementation. Some
of these simplifications result in minor performance enhancements as well.
We have kept these changes under self tests and internal benchmarks for a
few days, and there are minor (1-2%) performance enhancements as a result.
This patch (of 8):
After
0eeee653e103 ("fs/epoll: avoid barrier after an epoll_wait(2)
timeout"), we break out of the ep_poll loop upon timeout, without checking
whether there is any new events available. Prior to that patch-series we
always called ep_events_available() after exiting the loop.
This can cause races and missed wakeups. For example, consider the
following scenario reported by Guantao Liu:
Suppose we have an eventfd added using EPOLLET to an epollfd.
Thread 1: Sleeps for just below 5ms and then writes to an eventfd.
Thread 2: Calls epoll_wait with a timeout of 5 ms. If it sees an
event of the eventfd, it will write back on that fd.
Thread 3: Calls epoll_wait with a negative timeout.
Prior to
0eeee653e103, it is guaranteed that Thread 3 will wake up either
by Thread 1 or Thread 2. After
0eeee653e103, Thread 3 can be blocked
indefinitely if Thread 2 sees a timeout right before the write to the
eventfd by Thread 1. Thread 2 will be woken up from
schedule_hrtimeout_range and, with evail 0, it will not call
ep_send_events().
To fix this issue:
1) Simplify the timed_out case as suggested by Linus.
2) while holding the lock, recheck whether the thread was woken up
after its time out has reached.
Note that (2) is different from Linus' original suggestion: It do not set
"eavail = ep_events_available(ep)" to avoid unnecessary contention (when
there are too many timed-out threads and a small number of events), as
well as races mentioned in the discussion thread.
This is the first patch in the series so that the backport to stable
releases is straightforward.
Link: https://lkml.kernel.org/r/20201106231635.3528496-1-soheil.kdev@gmail.com
Link: https://lkml.kernel.org/r/CAHk-=wizk=OxUyQPbO8MS41w2Pag1kniUV5WdD5qWL-gq1kjDA@mail.gmail.com
Link: https://lkml.kernel.org/r/20201106231635.3528496-2-soheil.kdev@gmail.com
Fixes: 0eeee653e103 ("fs/epoll: avoid barrier after an epoll_wait(2) timeout")
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Tested-by: Guantao Liu <guantaol@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Guantao Liu <guantaol@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Khazhismel Kumykov <khazhy@google.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Rishabh Bhatnagar <risbhat@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Roman Penyaev [Thu, 14 May 2020 00:50:38 +0000 (17:50 -0700)]
epoll: call final ep_events_available() check under the lock
commit
5df0210df8d914e8e21858b612fa5e9aa9c15289 upstream.
There is a possible race when ep_scan_ready_list() leaves ->rdllist and
->obflist empty for a short period of time although some events are
pending. It is quite likely that ep_events_available() observes empty
lists and goes to sleep.
Since commit
3b8e522e50f8 ("fs/epoll: remove unnecessary wakeups of
nested epoll") we are conservative in wakeups (there is only one place
for wakeup and this is ep_poll_callback()), thus ep_events_available()
must always observe correct state of two lists.
The easiest and correct way is to do the final check under the lock.
This does not impact the performance, since lock is taken anyway for
adding a wait entry to the wait queue.
The discussion of the problem can be found here:
https://lore.kernel.org/linux-fsdevel/
a2f22c3c-c25a-4bda-8339-
a7bdaf17849e@akamai.com/
In this patch barrierless __set_current_state() is used. This is safe
since waitqueue_active() is called under the same lock on wakeup side.
Short-circuit for fatal signals (i.e. fatal_signal_pending() check) is
moved to the line just before actual events harvesting routine. This is
fully compliant to what is said in the comment of the patch where the
actual fatal_signal_pending() check was added:
d2d2d2423c40 ("fs, epoll:
short circuit fetching events if thread has been killed").
Fixes: 3b8e522e50f8 ("fs/epoll: remove unnecessary wakeups of nested epoll")
Reported-by: Jason Baron <jbaron@akamai.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jason Baron <jbaron@akamai.com>
Cc: Khazhismel Kumykov <khazhy@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200505145609.1865152-1-rpenyaev@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Rishabh Bhatnagar <risbhat@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Steven Rostedt (Google) [Fri, 21 Oct 2022 03:14:27 +0000 (23:14 -0400)]
tracing/ring-buffer: Have polling block on watermark
commit
42fb0a1e84ff525ebe560e2baf9451ab69127e2b upstream.
Currently the way polling works on the ring buffer is broken. It will
return immediately if there's any data in the ring buffer whereas a read
will block until the watermark (defined by the tracefs buffer_percent file)
is hit.
That is, a select() or poll() will return as if there's data available,
but then the following read will block. This is broken for the way
select()s and poll()s are supposed to work.
Have the polling on the ring buffer also block the same way reads and
splice does on the ring buffer.
Link: https://lkml.kernel.org/r/20221020231427.41be3f26@gandalf.local.home
Cc: Linux Trace Kernel <linux-trace-kernel@vger.kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Primiano Tucci <primiano@google.com>
Cc: stable@vger.kernel.org
Fixes: da7c181a22d3e ("ring-buffer: Do not wake up a splice waiter when page is not full")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ido Schimmel [Thu, 24 Nov 2022 21:09:32 +0000 (23:09 +0200)]
ipv4: Fix route deletion when nexthop info is not specified
[ Upstream commit
d5082d386eee7e8ec46fa8581932c81a4961dcef ]
When the kernel receives a route deletion request from user space it
tries to delete a route that matches the route attributes specified in
the request.
If only prefix information is specified in the request, the kernel
should delete the first matching FIB alias regardless of its associated
FIB info. However, an error is currently returned when the FIB info is
backed by a nexthop object:
# ip nexthop add id 1 via 192.0.2.2 dev dummy10
# ip route add 198.51.100.0/24 nhid 1
# ip route del 198.51.100.0/24
RTNETLINK answers: No such process
Fix by matching on such a FIB info when legacy nexthop attributes are
not specified in the request. An earlier check already covers the case
where a nexthop ID is specified in the request.
Add tests that cover these flows. Before the fix:
# ./fib_nexthops.sh -t ipv4_fcnal
...
TEST: Delete route when not specifying nexthop attributes [FAIL]
Tests passed: 11
Tests failed: 1
After the fix:
# ./fib_nexthops.sh -t ipv4_fcnal
...
TEST: Delete route when not specifying nexthop attributes [ OK ]
Tests passed: 12
Tests failed: 0
No regressions in other tests:
# ./fib_nexthops.sh
...
Tests passed: 228
Tests failed: 0
# ./fib_tests.sh
...
Tests passed: 186
Tests failed: 0
Cc: stable@vger.kernel.org
Reported-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Jonas Gorski <jonas.gorski@gmail.com>
Fixes: 003f1686de1a ("ipv4: Allow routes to use nexthop objects")
Fixes: a2f6d77a9f37 ("net: ipv4: fix route with nexthop object delete warning")
Fixes: f0725bb23c48 ("ipv4: Handle attempt to delete multipath route when fib_info contains an nh reference")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20221124210932.2470010-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
David Ahern [Thu, 6 Oct 2022 16:48:49 +0000 (10:48 -0600)]
ipv4: Handle attempt to delete multipath route when fib_info contains an nh reference
[ Upstream commit
f0725bb23c480e09183bb16266cb87f07686c8a5 ]
Gwangun Jung reported a slab-out-of-bounds access in fib_nh_match:
fib_nh_match+0xf98/0x1130 linux-6.0-rc7/net/ipv4/fib_semantics.c:961
fib_table_delete+0x5f3/0xa40 linux-6.0-rc7/net/ipv4/fib_trie.c:1753
inet_rtm_delroute+0x2b3/0x380 linux-6.0-rc7/net/ipv4/fib_frontend.c:874
Separate nexthop objects are mutually exclusive with the legacy
multipath spec. Fix fib_nh_match to return if the config for the
to be deleted route contains a multipath spec while the fib_info
is using a nexthop object.
Fixes: 003f1686de1a ("ipv4: Allow routes to use nexthop objects")
Fixes: a2f6d77a9f37 ("net: ipv4: fix route with nexthop object delete warning")
Reported-by: Gwangun Jung <exsociety@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of:
d5082d386eee ("ipv4: Fix route deletion when nexthop info is not specified")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nikolay Aleksandrov [Fri, 1 Apr 2022 15:54:27 +0000 (18:54 +0300)]
selftests: net: fix nexthop warning cleanup double ip typo
[ Upstream commit
0ccc341f8b76c8bc66ae8e142dd66d48d0cb3e04 ]
I made a stupid typo when adding the nexthop route warning selftest and
added both $IP and ip after it (double ip) on the cleanup path. The
error doesn't show up when running the test, but obviously it doesn't
cleanup properly after it.
Fixes: ebc172b37e74 ("selftests: net: add delete nexthop route warning test")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of:
d5082d386eee ("ipv4: Fix route deletion when nexthop info is not specified")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nikolay Aleksandrov [Fri, 1 Apr 2022 07:33:43 +0000 (10:33 +0300)]
selftests: net: add delete nexthop route warning test
[ Upstream commit
ebc172b37e7465ca97fbba10cef1d3d29149dd3f ]
Add a test which causes a WARNING on kernels which treat a
nexthop route like a normal route when comparing for deletion and a
device is specified. That is, a route is found but we hit a warning while
matching it. The warning is from fib_info_nh() in include/net/nexthop.h
because we run it on a fib_info with nexthop object. The call chain is:
inet_rtm_delroute -> fib_table_delete -> fib_nh_match (called with a
nexthop fib_info and also with fc_oif set thus calling fib_info_nh on
the fib_info and triggering the warning).
Repro steps:
$ ip nexthop add id 12 via 172.16.1.3 dev veth1
$ ip route add 172.16.101.1/32 nhid 12
$ ip route delete 172.16.101.1/32 dev veth1
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of:
d5082d386eee ("ipv4: Fix route deletion when nexthop info is not specified")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Lee Jones [Fri, 25 Nov 2022 12:07:50 +0000 (12:07 +0000)]
Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled
[ Upstream commit
152fe65f300e1819d59b80477d3e0999b4d5d7d2 ]
When enabled, KASAN enlarges function's stack-frames. Pushing quite a few
over the current threshold. This can mainly be seen on 32-bit
architectures where the present limit (when !GCC) is a lowly 1024-Bytes.
Link: https://lkml.kernel.org/r/20221125120750.3537134-3-lee@kernel.org
Signed-off-by: Lee Jones <lee@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@gmail.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Tom Rix <trix@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Helge Deller [Fri, 19 Nov 2021 21:31:03 +0000 (22:31 +0100)]
parisc: Increase FRAME_WARN to 2048 bytes on parisc
[ Upstream commit
c665e0cc89075d0545efb18e589a7dc40a6debed ]
PA-RISC uses a much bigger frame size for functions than other
architectures. So increase it to 2048 for 32- and 64-bit kernels.
This fixes e.g. a warning in lib/xxhash.c.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Stable-dep-of:
152fe65f300e ("Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Guenter Roeck [Fri, 24 Sep 2021 22:43:29 +0000 (15:43 -0700)]
xtensa: increase size of gcc stack frame check
[ Upstream commit
ee51df0f7c62fa0b430c3aa52cfdd5b7d6e50979 ]
xtensa frame size is larger than the frame size for almost all other
architectures. This results in more than 50 "the frame size of <n> is
larger than 1024 bytes" errors when trying to build xtensa:allmodconfig.
Increase frame size for xtensa to 1536 bytes to avoid compile errors due
to frame size limits.
Link: https://lkml.kernel.org/r/20210912025235.3514761-1-linux@roeck-us.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stable-dep-of:
152fe65f300e ("Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled")
Signed-off-by: Sasha Levin <sashal@kernel.org>