sched: Remove thread group iteration from colocation
Iterating over a leader task's thread group in order to add its
threads to a colocation group involves a complex locking chain that
ends up causing a deadlock. The deadlock occurs as follows when the
same task p is referenced on three different CPUs:

-----                      -----                      -----
CPU 0                      CPU 1                      CPU 2
-----                      -----                      -----
                           add_task_to_group(p)
__schedule(prev = p)       write_lock(                ttwu(p)
                           related_thread_grp_lock)
                                                      lock(pi_lock)
idle_balance()                                        wait for
                                                      p->on_cpu
load_balance()             unable to acquire
                           p->pi_lock
send_notification()
wait for read_lock(
related_thread_grp_lock)
unable to set p->on_cpu

There are a couple of ways to resolve this deadlock in the kernel, but
none of them is trivial. For the sake of simplicity, move the
responsibility for thread group iteration back to userspace. This
applies to both adding the leader task to a colocation group and
removing it from one. The kernel continues to automatically add newly
forked children of the colocated leader to the colocation group.
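
For illustration only, the userspace side of that iteration could look
roughly like the sketch below. It walks /proc/<pid>/task and hands every
thread ID to a caller-supplied hook. The names for_each_thread_id(),
tid_fn, tid_stats and count_tid() are hypothetical and not part of this
change; the kernel interface a real hook would use to change a thread's
colocation group is not described here, so the example hook only counts
threads.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical hook type: called once per thread ID, non-zero stops the walk. */
typedef int (*tid_fn)(pid_t tid, void *arg);

/*
 * Walk /proc/<pid>/task and call fn() for every thread ID found there.
 * Returns 0 on success, -1 if the task directory cannot be opened, or
 * the first non-zero value returned by fn().
 */
static int for_each_thread_id(pid_t pid, tid_fn fn, void *arg)
{
    char path[64];
    struct dirent *de;
    DIR *dir;
    int ret = 0;

    snprintf(path, sizeof(path), "/proc/%ld/task", (long)pid);
    dir = opendir(path);
    if (!dir)
        return -1;

    while ((de = readdir(dir)) != NULL) {
        char *end;
        long tid = strtol(de->d_name, &end, 10);

        /* Skip "." and ".." and anything that is not a plain number. */
        if (end == de->d_name || *end != '\0')
            continue;

        ret = fn((pid_t)tid, arg);
        if (ret)
            break;
    }

    closedir(dir);
    return ret;
}

/* Example hook state: counts threads and notes whether the leader was seen. */
struct tid_stats {
    pid_t leader;
    int count;
    int saw_leader;
};

/* Example hook: a real tool would request the group change for tid here. */
static int count_tid(pid_t tid, void *arg)
{
    struct tid_stats *s = arg;

    s->count++;
    if (tid == s->leader)
        s->saw_leader = 1;
    return 0;
}

/* Usage example: count the threads of the calling process. */
static int count_own_threads(void)
{
    struct tid_stats s = { getpid(), 0, 0 };

    if (for_each_thread_id(getpid(), count_tid, &s) != 0)
        return -1;
    return s.count;
}
```

A thread can exit between readdir() and the hook call, so a real hook
would need to tolerate a thread ID that has already gone away.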
This still leaves an issue with the locking order of the pi_lock and
the related_thread_group_lock. To solve all deadlocks, we need to avoid
taking the pi_lock in reset_all_task_stats() and instead rely on the
more heavy-handed approach of taking all rq locks. The pi_lock was taken
to avoid a race between reset_all_task_stats() and sched_exit(). That
race can be avoided with rq locks as well.
Change-Id: I15323e3ef91401142d3841db59c18fd8fee753fd
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>