Changes to hrtimer mode (potentially made by __hrtimer_init_sleeper on
PREEMPT_RT) are not visible to hrtimer_start_range_ns, thus not
accounted for by hrtimer_start_expires call paths. In particular,
__wait_event_hrtimeout suffers from this problem as we have, for
example:
fs/aio.c::read_events
wait_event_interruptible_hrtimeout
__wait_event_hrtimeout
hrtimer_init_sleeper_on_stack <- this might "mode |= HRTIMER_MODE_HARD"
on RT if task runs at RT/DL priority
hrtimer_start_range_ns
WARN_ON_ONCE(!(mode & HRTIMER_MODE_HARD) ^ !timer->is_hard)
fires since the latter doesn't see the change of mode done by
init_sleeper
Fix it by making __wait_event_hrtimeout call hrtimer_sleeper_start_expires,
which is aware of the special RT/DL case, instead of hrtimer_start_range_ns.
Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://lore.kernel.org/r/20220627095051.42470-1-juri.lelli@redhat.com
\
hrtimer_init_sleeper_on_stack(&__t, CLOCK_MONOTONIC, \
HRTIMER_MODE_REL); \
- if ((timeout) != KTIME_MAX) \
- hrtimer_start_range_ns(&__t.timer, timeout, \
- current->timer_slack_ns, \
- HRTIMER_MODE_REL); \
+ if ((timeout) != KTIME_MAX) { \
+ hrtimer_set_expires_range_ns(&__t.timer, timeout, \
+ current->timer_slack_ns); \
+ hrtimer_sleeper_start_expires(&__t, HRTIMER_MODE_REL); \
+ } \
\
__ret = ___wait_event(wq_head, condition, state, 0, 0, \
if (!__t.task) { \