Commit 2463f463 authored Oct 09, 2018 by Pavankumar Kondeti Committed by Gerrit - the friendly Code Review server Oct 10, 2018

sched: Fix assert_clock_updated warning emitted during CPU isolation



The assert_clock_updated() warning is observed with the following call stack:

set_next_entity+0xb10
pick_next_task_fair+0x47c
migrate_tasks+0x290
do_isolation_work_cpu_stop+0xa0
cpu_stopper_thread+0xa0
smpboot_thread_fn+0x1b4
kthread+0x120
ret_from_fork+0x10

migrate_tasks() is called during isolation of a CPU to migrate all
the unpinned kernel threads to a different CPU. The CPU's rq->lock
is dropped before acquiring the task's pi_lock and while migrating the
task to a different CPU. Unlike CPU hotplug, isolation does not hold all
other CPUs in the stop_machine loop, so another CPU can take this CPU's
rq->lock in that window and clear the RQCF_UPDATED flag. Instead of
mucking around with these flags, force a clock update when RQCF_UPDATED
is not set upon reacquiring the rq->lock.
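
The flag handling described above can be sketched as a small user-space
model. This is only an illustration, not the patch itself: struct model_rq
and every model_* function are hypothetical stand-ins, and the RQCF_* values
and the "flags < RQCF_ACT_SKIP" check are assumed to match the kernel's
definitions of this era.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's flag values. */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

/* Hypothetical, heavily simplified stand-in for struct rq. */
struct model_rq {
	unsigned int clock_update_flags;
	unsigned long clock;     /* pretend clock, bumped on every update */
	unsigned int warnings;   /* counts would-be assert_clock_updated() hits */
};

/* Models update_rq_clock(): refresh the clock and mark it updated. */
void model_update_rq_clock(struct model_rq *rq)
{
	rq->clock++;
	rq->clock_update_flags |= RQCF_UPDATED;
}

/* Models assert_clock_updated(): warn if no update (or skip) is recorded. */
void model_assert_clock_updated(struct model_rq *rq)
{
	if (rq->clock_update_flags < RQCF_ACT_SKIP)
		rq->warnings++;
}

/* Another CPU pins this rq's lock while it was dropped: RQCF_UPDATED is lost. */
void model_remote_lock_cycle(struct model_rq *rq)
{
	rq->clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP);
}

/* What happens right after the migration loop reacquires rq->lock. */
void model_after_relock(struct model_rq *rq, bool force_update)
{
	/* The fix: refresh the clock if RQCF_UPDATED was cleared meanwhile. */
	if (force_update && !(rq->clock_update_flags & RQCF_UPDATED))
		model_update_rq_clock(rq);

	/* Stands in for the pick_next_task_fair() -> set_next_entity() path. */
	model_assert_clock_updated(rq);
}
```

In this model, calling model_after_relock() with force_update set to false
after a model_remote_lock_cycle() counts a warning, while the same sequence
with force_update set to true does not.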

This warning can also result in a deadlock when the console is enabled.
The assert_clock_updated() warning is emitted with rq->lock held, and
the console driver calls into the scheduler with its private lock
held, which results in a circular lock dependency.

The rq_flags passed to migrate_tasks() from do_isolation_work_cpu_stop()
are not initialized. migrate_tasks() passes the same rq_flags to
rq_relock() after migrating each task, so the uninitialized contents
corrupt rq->clock_update_flags in rq_relock(). This can result in
unpredictable behavior. Fix this by using the rq_lock() and rq_unlock()
wrappers to take and release the lock in do_isolation_work_cpu_stop().
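
A rough user-space sketch of why the uninitialized rq_flags matter. The
sketch_* names are hypothetical, and the helper bodies are an assumption
about how the kernel's pin/unpin/repin helpers treat clock_update_flags when
scheduler debugging is enabled. The stale stack contents are modelled with
an explicit sentinel value rather than by reading uninitialized memory.

```c
#include <assert.h>

/* Assumed to mirror the kernel's flag values. */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

/* Hypothetical stand-ins for struct rq and struct rq_flags. */
struct sketch_rq { unsigned int clock_update_flags; };
struct sketch_rq_flags { unsigned int clock_update_flags; };

/* Pinning (done inside rq_lock()) is what initializes rf's flags. */
void sketch_rq_pin_lock(struct sketch_rq *rq, struct sketch_rq_flags *rf)
{
	rq->clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP);
	rf->clock_update_flags = 0;
}

/* Unpinning saves "clock was updated" into rf, and only in that case. */
void sketch_rq_unpin_lock(struct sketch_rq *rq, struct sketch_rq_flags *rf)
{
	if (rq->clock_update_flags > RQCF_ACT_SKIP)
		rf->clock_update_flags = RQCF_UPDATED;
}

/* Repinning (done inside rq_relock()) merges rf's flags back into rq. */
void sketch_rq_repin_lock(struct sketch_rq *rq, struct sketch_rq_flags *rf)
{
	rq->clock_update_flags |= rf->clock_update_flags;
}

/* The lock drop and relock that migrate_tasks() performs per task. */
void sketch_drop_and_relock(struct sketch_rq *rq, struct sketch_rq_flags *rf)
{
	sketch_rq_unpin_lock(rq, rf);
	sketch_rq_repin_lock(rq, rf);
}
```

If rf is never run through the pin step, whatever value it happens to hold
is OR-ed into the runqueue's flags on relock. Going through the pin step
first, as the rq_lock() wrapper does, leaves the flags well defined.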

Change-Id: I5fed50f3298f712bbfe21e16df6945aa6a99e72a
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>

parent ec453551
