We check for the func with an OR condition, which means it always ends
up being false and we never match the task_work we want to cancel. In
the unexpected case that we do exit with that pending, we can trigger
a hang waiting for a worker to exit, but it was never created. syzbot
reports that as such:
INFO: task syz-executor687:8514 blocked for more than 143 seconds.
Not tainted 5.14.0-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor687 state:D stack:27296 pid: 8514 ppid: 8479 flags:0x00024004
Call Trace:
context_switch kernel/sched/core.c:4940 [inline]
__schedule+0x940/0x26f0 kernel/sched/core.c:6287
schedule+0xd3/0x270 kernel/sched/core.c:6366
schedule_timeout+0x1db/0x2a0 kernel/time/timer.c:1857
do_wait_for_common kernel/sched/completion.c:85 [inline]
__wait_for_common kernel/sched/completion.c:106 [inline]
wait_for_common kernel/sched/completion.c:117 [inline]
wait_for_completion+0x176/0x280 kernel/sched/completion.c:138
io_wq_exit_workers fs/io-wq.c:1162 [inline]
io_wq_put_and_exit+0x40c/0xc70 fs/io-wq.c:1197
io_uring_clean_tctx fs/io_uring.c:9607 [inline]
io_uring_cancel_generic+0x5fe/0x740 fs/io_uring.c:9687
io_uring_files_cancel include/linux/io_uring.h:16 [inline]
do_exit+0x265/0x2a30 kernel/exit.c:780
do_group_exit+0x125/0x310 kernel/exit.c:922
get_signal+0x47f/0x2160 kernel/signal.c:2868
arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:865
handle_signal_work kernel/entry/common.c:148 [inline]
exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:209
__syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:302
do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x445cd9
RSP: 002b:
00007fc657f4b308 EFLAGS:
00000246 ORIG_RAX:
00000000000000ca
RAX:
0000000000000001 RBX:
00000000004cb448 RCX:
0000000000445cd9
RDX:
00000000000f4240 RSI:
0000000000000081 RDI:
00000000004cb44c
RBP:
00000000004cb440 R08:
000000000000000e R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000246 R12:
000000000049b154
R13:
0000000000000003 R14:
00007fc657f4b400 R15:
0000000000022000
While in there, also decrement accr->nr_workers. This isn't strictly
needed as we're exiting, but let's make sure the accounting matches up.
Fixes: e917e6222792 ("io-wq: make worker creation resilient against signals")
Reported-by: syzbot+f62d3e0a4ea4f38f5326@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
{
struct io_worker *worker;
- if (cb->func != create_worker_cb || cb->func != create_worker_cont)
+ if (cb->func != create_worker_cb && cb->func != create_worker_cont)
return false;
worker = container_of(cb, struct io_worker, create_work);
return worker->wqe->wq == data;
while ((cb = task_work_cancel_match(wq->task, io_task_work_match, wq)) != NULL) {
struct io_worker *worker;
+ struct io_wqe_acct *acct;
worker = container_of(cb, struct io_worker, create_work);
- atomic_dec(&worker->wqe->acct[worker->create_index].nr_running);
+ acct = io_wqe_get_acct(worker);
+ atomic_dec(&acct->nr_running);
+ raw_spin_lock(&worker->wqe->lock);
+ acct->nr_workers--;
+ raw_spin_unlock(&worker->wqe->lock);
io_worker_ref_put(wq);
clear_bit_unlock(0, &worker->create_state);
io_worker_release(worker);