sched: core: Fix stale rq clock usage in migration path
While migrating a task, move_queued_task() updates the current CPU's
rq clock (which sets RQCF_UPDATED) with rq->lock held, then
momentarily releases rq->lock and reacquires it along with the new
CPU's rq->lock. If another CPU takes the current rq->lock in between,
it may call rq_pin_lock() (which clears RQCF_UPDATED) and release the
lock without updating the rq clock. The rq's clock_update_flags then
remain stale until rq_pin_lock() is called again.
If the migration then tries to report load to the cpufreq governor, it
accesses the stale rq clock, and assert_clock_updated() emits a warning
with the call stack below:
detach_entity_cfs_rq+0x71c/0x780
migrate_task_rq_fair+0x50/0xd0
set_task_cpu+0x150/0x238
move_queued_task+0x1b4/0x3e8
migration_cpu_stop+0x188/0x1f0
cpu_stopper_thread+0xac/0x150
smpboot_thread_fn+0x1c4/0x2e8
Also, as commit 2463f46361a02d ("sched: Fix assert_clock_updated
warning emitted during CPU isolation") mentions, this warning
can lead to a deadlock when the console is enabled.
To fix this, when reacquiring the CPU's rq->lock, force an update of
the rq clock if RQCF_UPDATED is not set.
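The flag handling described above can be sketched as a small userspace
model. This is only an illustration, not the patch itself: the RQCF_*
values are assumed to match the kernel's, and struct toy_rq and all
toy_* helpers are hypothetical stand-ins for the real rq helpers.

```c
#include <stdbool.h>

/* Flag bits assumed to mirror the kernel's RQCF_* values. */
#define RQCF_REQ_SKIP 0x01u
#define RQCF_ACT_SKIP 0x02u
#define RQCF_UPDATED  0x04u

/* Hypothetical, stripped-down stand-in for struct rq. */
struct toy_rq {
	unsigned int clock_update_flags;
	unsigned long long clock;
};

/* Models update_rq_clock(): refresh the clock and mark it updated. */
static void toy_update_rq_clock(struct toy_rq *rq, unsigned long long now)
{
	rq->clock = now;
	rq->clock_update_flags |= RQCF_UPDATED;
}

/* Models rq_pin_lock(): a new lock holder drops RQCF_UPDATED. */
static void toy_rq_pin_lock(struct toy_rq *rq)
{
	rq->clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP);
}

/* Models assert_clock_updated(): false means a warning would fire. */
static bool toy_clock_is_updated(const struct toy_rq *rq)
{
	return rq->clock_update_flags >= RQCF_ACT_SKIP;
}

/* Models the fix: after reacquiring the lock, update only if stale. */
static void toy_relock_and_refresh(struct toy_rq *rq, unsigned long long now)
{
	if (!(rq->clock_update_flags & RQCF_UPDATED))
		toy_update_rq_clock(rq, now);
}
```

In this model, calling toy_rq_pin_lock() after toy_update_rq_clock()
leaves the flags stale, and toy_relock_and_refresh() restores
RQCF_UPDATED before the clock is read again.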
Change-Id: Ibc7bae4fc489e7f182339e6195cb440af6d7676b
Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>
