
Commit acfec9a5 authored by Al Viro

livelock avoidance in sget()



Eric Sandeen has found a nasty livelock in sget() - take a mount(2) about
to fail.  The superblock is on ->fs_supers, ->s_umount is held exclusive,
->s_active is 1.  Along come two more processes, trying to mount the same
thing; sget() in each is picking that superblock, bumping ->s_active and
trying to grab ->s_umount.  ->s_active is 3 now.  Original mount(2)
finally gets to deactivate_locked_super() on failure; ->s_active is 2,
superblock is still on ->fs_supers because shutdown will *not* happen until
->s_active hits 0.  ->s_umount is dropped and now we have two processes
chasing each other:
s_active = 2, A acquired ->s_umount, B blocked
A sees that the damn thing is stillborn, does deactivate_locked_super()
s_active = 1, A drops ->s_umount, B gets it
A restarts the search and finds the same superblock.  And bumps its ->s_active.
s_active = 2, B holds ->s_umount, A blocked on trying to get it
... and we are in the earlier situation with A and B having switched places.
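
The hand-off above can be replayed in a small user-space model.  This is
only a sketch, not kernel code: struct model_sb, old_grab(), deactivate()
and old_scheme_min_active() are hypothetical names, and ->s_umount is
reduced to the order in which the calls are made.  It illustrates that
with the old ordering ->s_active bounces between 1 and 2 and never hits 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct super_block. */
struct model_sb {
	int s_active;	/* active references */
	bool on_list;	/* still on ->fs_supers */
};

/* Old ordering: take an active reference before looking at MS_BORN. */
bool old_grab(struct model_sb *sb)
{
	if (sb->s_active == 0)
		return false;
	sb->s_active++;
	return true;
}

/* Drop an active reference; shutdown happens only when it hits 0. */
void deactivate(struct model_sb *sb)
{
	assert(sb->s_active > 0);
	if (--sb->s_active == 0)
		sb->on_list = false;
}

/* Replay `rounds` hand-offs; return the lowest ->s_active observed. */
int old_scheme_min_active(int rounds)
{
	struct model_sb sb = { .s_active = 1, .on_list = true };
	int min_active;

	old_grab(&sb);		/* process A: s_active = 2 */
	old_grab(&sb);		/* process B: s_active = 3 */
	deactivate(&sb);	/* original mount(2) fails: s_active = 2 */
	min_active = sb.s_active;

	for (int i = 0; i < rounds; i++) {
		/* ->s_umount holder sees a stillborn superblock */
		deactivate(&sb);		/* s_active = 1 */
		if (sb.s_active < min_active)
			min_active = sb.s_active;
		if (!sb.on_list)
			break;			/* never taken in this model */
		/* holder restarts the search and finds the same one */
		old_grab(&sb);			/* s_active = 2 again */
	}
	return min_active;
}
```

A minimum of 1 for any positive number of rounds means shutdown never
ran, so the superblock stays on the list and the chase continues.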

The root cause, of course, is that ->s_active should not grow until we'd
got MS_BORN.  Then failing ->mount() will have deactivate_locked_super()
shut the damn thing down.  Fortunately, it's easy to do - the key point
is that grab_super() is called only for superblocks currently on ->fs_supers,
so it can bump ->s_count and grab ->s_umount first, then check MS_BORN and
bump ->s_active; we must never increment ->s_count for superblocks past
->kill_sb(), but grab_super() is never called for those.
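
The reordering can be sketched in user space with C11 atomics.  Everything
here is an assumption made for illustration: MODEL_BORN stands in for
MS_BORN, model_inc_not_zero() imitates atomic_inc_not_zero(), and sb_lock
and ->s_umount appear only as comments.  The point is that a superblock
that is not yet born never gains an active reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_BORN 0x1u		/* hypothetical stand-in for MS_BORN */

struct model_sb {
	atomic_int s_active;	/* active references */
	int s_count;		/* passive references */
	unsigned s_flags;
};

/* Increment *v unless it is zero; returns true if it was incremented. */
bool model_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* on failure, old is reloaded with the current value */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* New ordering: passive ref, lock, then MS_BORN check, then active ref. */
bool model_grab_super(struct model_sb *s)
{
	s->s_count++;		/* kernel: done under sb_lock */
	/* kernel: spin_unlock(&sb_lock); down_write(&s->s_umount); */
	if ((s->s_flags & MODEL_BORN) && model_inc_not_zero(&s->s_active)) {
		s->s_count--;	/* kernel: put_super(s) */
		return true;	/* kernel: ->s_umount stays held */
	}
	/* kernel: up_write(&s->s_umount); */
	s->s_count--;		/* kernel: put_super(s) */
	return false;
}

/* Try one grab on a fresh model superblock; return resulting s_active. */
int active_after_grab(unsigned flags, int active)
{
	struct model_sb sb = { .s_count = 1, .s_flags = flags };

	atomic_init(&sb.s_active, active);
	(void)model_grab_super(&sb);
	assert(sb.s_count == 1);	/* passive reference always dropped */
	return atomic_load(&sb.s_active);
}
```

With flags == 0 the count is left untouched, so the failing mount's own
deactivate_locked_super() can take it from 1 to 0 and trigger shutdown.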

The bug is pretty old; we would've caught it by now, if not for accidental
exclusion between sget() for block filesystems; the things like cgroup or
e.g. mtd-based filesystems don't have anything of that sort, so they get
bitten.  The right way to deal with that is obviously to fix sget()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
parent ba57ea64
+10 −15
@@ -336,19 +336,19 @@ EXPORT_SYMBOL(deactivate_super);
  *	and want to turn it into a full-blown active reference.  grab_super()
  *	is called with sb_lock held and drops it.  Returns 1 in case of
  *	success, 0 if we had failed (superblock contents was already dead or
- *	dying when grab_super() had been called).
+ *	dying when grab_super() had been called).  Note that this is only
+ *	called for superblocks not in rundown mode (== ones still on ->fs_supers
+ *	of their type), so increment of ->s_count is OK here.
  */
 static int grab_super(struct super_block *s) __releases(sb_lock)
 {
-	if (atomic_inc_not_zero(&s->s_active)) {
-		spin_unlock(&sb_lock);
-		return 1;
-	}
-	/* it's going away */
 	s->s_count++;
 	spin_unlock(&sb_lock);
-	/* wait for it to die */
 	down_write(&s->s_umount);
+	if ((s->s_flags & MS_BORN) && atomic_inc_not_zero(&s->s_active)) {
+		put_super(s);
+		return 1;
+	}
 	up_write(&s->s_umount);
 	put_super(s);
 	return 0;
@@ -463,11 +463,6 @@ struct super_block *sget(struct file_system_type *type,
 				destroy_super(s);
 				s = NULL;
 			}
-			down_write(&old->s_umount);
-			if (unlikely(!(old->s_flags & MS_BORN))) {
-				deactivate_locked_super(old);
-				goto retry;
-			}
 			return old;
 		}
 	}
@@ -660,10 +655,10 @@ struct super_block *get_active_super(struct block_device *bdev)
 		if (hlist_unhashed(&sb->s_instances))
 			continue;
 		if (sb->s_bdev == bdev) {
-			if (grab_super(sb)) /* drops sb_lock */
-				return sb;
-			else
+			if (!grab_super(sb))
 				goto restart;
+			up_write(&sb->s_umount);
+			return sb;
 		}
 	}
 	spin_unlock(&sb_lock);