[ Upstream commit
0483b8034dd9f71a6f95a418e6a76b3fe807242d ]
When the buffer length of the recvmsg system call is 0, we got the
flollowing soft lockup problem:
watchdog: BUG: soft lockup - CPU#3 stuck for 27s! [a.out:6149]
CPU: 3 PID: 6149 Comm: a.out Kdump: loaded Not tainted 6.2.0+ #30
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:remove_wait_queue+0xb/0xc0
Code: 5e 41 5f c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 57 <41> 56 41 55 41 54 55 48 89 fd 53 48 89 f3 4c 8d 6b 18 4c 8d 73 20
RSP: 0018:
ffff88811b5978b8 EFLAGS:
00000246
RAX:
0000000000000000 RBX:
ffff88811a7d3780 RCX:
ffffffffb7a4d768
RDX:
dffffc0000000000 RSI:
ffff88811b597908 RDI:
ffff888115408040
RBP:
1ffff110236b2f1b R08:
0000000000000000 R09:
ffff88811a7d37e7
R10:
ffffed10234fa6fc R11:
0000000000000001 R12:
ffff88811179b800
R13:
0000000000000001 R14:
ffff88811a7d38a8 R15:
ffff88811a7d37e0
FS:
00007f6fb5398740(0000) GS:
ffff888237180000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000020000000 CR3:
000000010b6ba002 CR4:
0000000000370ee0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
<TASK>
tcp_msg_wait_data+0x279/0x2f0
tcp_bpf_recvmsg_parser+0x3c6/0x490
inet_recvmsg+0x280/0x290
sock_recvmsg+0xfc/0x120
____sys_recvmsg+0x160/0x3d0
___sys_recvmsg+0xf0/0x180
__sys_recvmsg+0xea/0x1a0
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
The logic in tcp_bpf_recvmsg_parser is as follows:
msg_bytes_ready:
copied = sk_msg_recvmsg(sk, psock, msg, len, flags);
if (!copied) {
wait data;
goto msg_bytes_ready;
}
In this case, "copied" always is 0, the infinite loop occurs.
According to the Linux system call man page, 0 should be returned in this
case. Therefore, in tcp_bpf_recvmsg_parser(), if the length is 0, directly
return. Also modify several other functions with the same problem.
Fixes: 698fcd555908 ("udp: Implement udp_bpf_recvmsg() for sockmap")
Fixes: efd2512dd28c ("af_unix: Implement unix_dgram_bpf_recvmsg()")
Fixes: 653d74294f4d ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self")
Fixes: 38506f4bbc9d ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230303080946.1146638-1-liujian56@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
if (unlikely(flags & MSG_ERRQUEUE))
return inet_recv_error(sk, msg, len, addr_len);
+ if (!len)
+ return 0;
+
psock = sk_psock_get(sk);
if (unlikely(!psock))
return tcp_recvmsg(sk, msg, len, flags, addr_len);
if (unlikely(flags & MSG_ERRQUEUE))
return inet_recv_error(sk, msg, len, addr_len);
+ if (!len)
+ return 0;
+
psock = sk_psock_get(sk);
if (unlikely(!psock))
return tcp_recvmsg(sk, msg, len, flags, addr_len);
if (unlikely(flags & MSG_ERRQUEUE))
return inet_recv_error(sk, msg, len, addr_len);
+ if (!len)
+ return 0;
+
psock = sk_psock_get(sk);
if (unlikely(!psock))
return sk_udp_recvmsg(sk, msg, len, flags, addr_len);
struct sk_psock *psock;
int copied;
+ if (!len)
+ return 0;
+
psock = sk_psock_get(sk);
if (unlikely(!psock))
return __unix_recvmsg(sk, msg, len, flags);