Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 2efe86b8 authored by Paul Jackson's avatar Paul Jackson Committed by Linus Torvalds
Browse files

[PATCH] cpuset exit NULL dereference fix



There is a race in the kernel cpuset code, between the code
to handle notify_on_release, and the code to remove a cpuset.
The notify_on_release code can end up trying to access a
cpuset that has been removed.  In the most common case, this
causes a NULL pointer dereference from the routine cpuset_path.
However all manner of bad things are possible, in theory at least.

The existing code decrements the cpuset use count, and if the
count goes to zero, processes the notify_on_release request,
if appropriate.  However, once the count goes to zero, unless we
are holding the global cpuset_sem semaphore, there is nothing to
stop another task from immediately removing the cpuset entirely,
and recycling its memory.

The obvious fix would be to always hold the cpuset_sem
semaphore while decrementing the use count and dealing with
notify_on_release.  However we don't want to force a global
semaphore into the mainline task exit path, as that might create
a scaling problem.

The actual fix is almost as easy - since this is only an issue
for cpusets using notify_on_release, which the top level big
cpusets don't normally need to use, only take the cpuset_sem
for cpusets using notify_on_release.

This code has been run for hours without a hiccup, while running
a cpuset create/destroy stress test that could crash the existing
kernel in seconds.  This patch applies to the current -linus
git kernel.

Signed-off-by: default avatarPaul Jackson <pj@sgi.com>
Acked-by: default avatarSimon Derr <simon.derr@bull.net>
Acked-by: default avatarDinakar Guniguntala <dino@in.ibm.com>
Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
parent 88c18346
Loading
Loading
Loading
Loading
+19 −5
Original line number Original line Diff line number Diff line
@@ -166,9 +166,8 @@ static struct super_block *cpuset_sb = NULL;
 * The hooks from fork and exit, cpuset_fork() and cpuset_exit(), don't
 * The hooks from fork and exit, cpuset_fork() and cpuset_exit(), don't
 * (usually) grab cpuset_sem.  These are the two most performance
 * (usually) grab cpuset_sem.  These are the two most performance
 * critical pieces of code here.  The exception occurs on exit(),
 * critical pieces of code here.  The exception occurs on exit(),
 * if the last task using a cpuset exits, and the cpuset was marked
 * when a task in a notify_on_release cpuset exits.  Then cpuset_sem
 * notify_on_release.  In that case, the cpuset_sem is taken, the
 * is taken, and if the cpuset count is zero, a usermode call made
 * path to the released cpuset calculated, and a usermode call made
 * to /sbin/cpuset_release_agent with the name of the cpuset (path
 * to /sbin/cpuset_release_agent with the name of the cpuset (path
 * relative to the root of cpuset file system) as the argument.
 * relative to the root of cpuset file system) as the argument.
 *
 *
@@ -1404,6 +1403,18 @@ void cpuset_fork(struct task_struct *tsk)
 *
 *
 * Description: Detach cpuset from @tsk and release it.
 * Description: Detach cpuset from @tsk and release it.
 *
 *
 * Note that cpusets marked notify_on_release force every task
 * in them to take the global cpuset_sem semaphore when exiting.
 * This could impact scaling on very large systems.  Be reluctant
 * to use notify_on_release cpusets where very high task exit
 * scaling is required on large systems.
 *
 * Don't even think about derefencing 'cs' after the cpuset use
 * count goes to zero, except inside a critical section guarded
 * by the cpuset_sem semaphore.  If you don't hold cpuset_sem,
 * then a zero cpuset use count is a license to any other task to
 * nuke the cpuset immediately.
 *
 **/
 **/


void cpuset_exit(struct task_struct *tsk)
void cpuset_exit(struct task_struct *tsk)
@@ -1415,10 +1426,13 @@ void cpuset_exit(struct task_struct *tsk)
	tsk->cpuset = NULL;
	tsk->cpuset = NULL;
	task_unlock(tsk);
	task_unlock(tsk);


	if (atomic_dec_and_test(&cs->count)) {
	if (notify_on_release(cs)) {
		down(&cpuset_sem);
		down(&cpuset_sem);
		if (atomic_dec_and_test(&cs->count))
			check_for_release(cs);
			check_for_release(cs);
		up(&cpuset_sem);
		up(&cpuset_sem);
	} else {
		atomic_dec(&cs->count);
	}
	}
}
}