sched: Fix assert_clock_updated warning emitted during CPU isolation
The assert_clock_updated() warning is observed with the following call stack:
set_next_entity+0xb10
pick_next_task_fair+0x47c
migrate_tasks+0x290
do_isolation_work_cpu_stop+0xa0
cpu_stopper_thread+0xa0
smpboot_thread_fn+0x1b4
kthread+0x120
ret_from_fork+0x10
migrate_tasks() is called during isolation of a CPU to migrate all
the unpinned kernel threads off to a different CPU. The CPU's rq->lock
is dropped before acquiring the task's pi_lock and while migrating the
task to a different CPU. Unlike CPU hotplug, isolation does not hold
all other CPUs in the stop_machine loop, so another CPU can take this
CPU's rq->lock in the meantime and clear the RQCF_UPDATED flag. Instead
of mucking around with these flags, force a clock update when
RQCF_UPDATED is not set upon reacquiring the rq->lock.
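The race described above can be replayed in a small user-space model.
This is only a sketch: the flag values and the pin/repin behaviour are
assumed to match the CONFIG_SCHED_DEBUG helpers in kernel/sched/sched.h,
and every model_* name is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values assumed to match kernel/sched/sched.h. */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

/* Reduced stand-ins for struct rq and struct rq_flags (hypothetical). */
struct model_rq { unsigned int clock_update_flags; };
struct model_rf { unsigned int clock_update_flags; };

/* Models update_rq_clock(): records that the clock is fresh. */
static void model_update_clock(struct model_rq *rq)
{
	rq->clock_update_flags |= RQCF_UPDATED;
}

/* Models rq_pin_lock(): a new lock holder starts without RQCF_UPDATED. */
static void model_pin(struct model_rq *rq, struct model_rf *rf)
{
	rq->clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP);
	rf->clock_update_flags = 0;
}

/* Models rq_repin_lock(): restores whatever was stashed in rf. */
static void model_repin(struct model_rq *rq, const struct model_rf *rf)
{
	rq->clock_update_flags |= rf->clock_update_flags;
}

/* True when assert_clock_updated() would warn. */
static bool model_would_warn(const struct model_rq *rq)
{
	return rq->clock_update_flags < RQCF_ACT_SKIP;
}

/*
 * Replays the isolation sequence. Returns true if the warning would
 * fire when the next task is picked.
 */
static bool model_isolation_race(bool apply_fix)
{
	struct model_rq rq = { 0 };
	struct model_rf orf, other;

	model_pin(&rq, &orf);         /* isolating CPU takes rq->lock */
	model_update_clock(&rq);      /* clock updated once up front */
	assert(!model_would_warn(&rq));

	/* rq->lock is dropped to take the task's pi_lock ... */
	model_pin(&rq, &other);       /* another CPU locks, clears RQCF_UPDATED */
	/* ... it unlocks, and the isolating CPU relocks with the original rf */
	model_repin(&rq, &orf);

	/* The fix: force a clock update if RQCF_UPDATED was lost. */
	if (apply_fix && !(rq.clock_update_flags & RQCF_UPDATED))
		model_update_clock(&rq);

	return model_would_warn(&rq);
}
```

In this model, model_isolation_race(false) reports that the warning
would fire and model_isolation_race(true) reports that it would not.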
This warning can also result in a deadlock when the console is enabled.
The assert_clock_updated() warning is emitted with rq->lock held, and
the console driver calls into the scheduler with its private lock
held, which results in a circular lock dependency.
The rq_flags are not initialized when they are passed to
migrate_tasks() from do_isolation_work_cpu_stop(). migrate_tasks()
hands the original rq_flags to rq_relock() after migrating each task.
Since the rq_flags are not initialized correctly, the
rq->clock_update_flags get corrupted in rq_relock(), which can result
in unpredictable behavior. Fix this by using the rq_lock() and
rq_unlock() wrappers to acquire and release the lock in
do_isolation_work_cpu_stop().
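The corruption can be illustrated with a similar user-space sketch. It
assumes that rq_relock() ORs the flags stashed in rq_flags back into
rq->clock_update_flags and that rq_lock() zeroes the stash when it pins
the lock. The model_* names and the stack_garbage parameter are
hypothetical; stack_garbage stands in for whatever an uninitialized
rq_flags happens to contain.

```c
#include <assert.h>

/* Flag values assumed to match kernel/sched/sched.h. */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

/* Reduced stand-ins for struct rq and struct rq_flags (hypothetical). */
struct model_rq { unsigned int clock_update_flags; };
struct model_rf { unsigned int clock_update_flags; };

/* Models the pin step of rq_lock(): it also initializes rf. */
static void model_rq_lock(struct model_rq *rq, struct model_rf *rf)
{
	rq->clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP);
	rf->clock_update_flags = 0;
}

/* Models the repin step of rq_relock(): it trusts rf as given. */
static void model_rq_relock(struct model_rq *rq, const struct model_rf *rf)
{
	rq->clock_update_flags |= rf->clock_update_flags;
}

/*
 * Returns rq->clock_update_flags after one relock, either with rf left
 * as stack_garbage or with rf set up through the rq_lock() model.
 */
static unsigned int model_flags_after_relock(unsigned int stack_garbage,
					     int use_rq_lock)
{
	struct model_rq rq = { 0 };
	struct model_rf rf = { stack_garbage };

	if (use_rq_lock)
		model_rq_lock(&rq, &rf);  /* rf is now well defined */
	assert(!use_rq_lock || rf.clock_update_flags == 0);

	model_rq_relock(&rq, &rf);
	return rq.clock_update_flags;
}
```

With a nonzero stack_garbage and use_rq_lock set to 0, the garbage
bits end up in the runqueue flags; with use_rq_lock set to 1, the
flags stay clean.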
Change-Id: I5fed50f3298f712bbfe21e16df6945aa6a99e72a
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>