
Commit ccfc7bf1 authored by Nate Dailey, committed by Shaohua Li

raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang



If raid1d is handling a mix of read and write errors, handle_read_error's
call to freeze_array can get stuck.

This can happen because, though the bio_end_io_list is initially drained,
writes can be added to it via handle_write_finished as the retry_list
is processed. These writes contribute to nr_pending but are not included
in nr_queued.

If a later entry on the retry_list triggers a call to handle_read_error,
freeze_array hangs waiting for nr_pending == nr_queued+extra. The writes
on the bio_end_io_list aren't included in nr_queued so the condition will
never be satisfied.

To prevent the hang, include bio_end_io_list writes in nr_queued.
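
The accounting problem can be sketched with a toy model of the two counters.
This is a userspace illustration, not kernel code: toy_conf, toy_scenario and
freeze_would_complete are hypothetical names, and treating "extra" as the one
request the caller is itself handling is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the two counters; not the kernel's struct r1conf. */
struct toy_conf {
	int nr_pending;	/* requests still in flight */
	int nr_queued;	/* requests parked for the raid1d thread */
};

/* The wait condition from the commit message: nr_pending == nr_queued+extra. */
static bool freeze_would_complete(const struct toy_conf *c, int extra)
{
	return c->nr_pending == c->nr_queued + extra;
}

/*
 * Build the state seen while one read error is being handled and
 * parked_writes failed writes sit on bio_end_io_list.
 * count_parked == false models the old accounting, true models the fix.
 */
static struct toy_conf toy_scenario(int parked_writes, bool count_parked)
{
	struct toy_conf c = { 0, 0 };

	assert(parked_writes >= 0);
	/* The read being handled and every parked write are all pending. */
	c.nr_pending = 1 + parked_writes;
	/* Only the fixed accounting counts the parked writes as queued. */
	if (count_parked)
		c.nr_queued = parked_writes;
	return c;
}
```

With one read in hand and two parked writes, freeze_would_complete(&c, 1) is
false under the old accounting (3 != 0 + 1) and true once the writes are
counted (3 == 2 + 1).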

There's probably a better way to handle decrementing nr_queued, but this
seemed like the safest way to avoid breaking surrounding code.

I'm happy to supply the script I used to repro this hang.

Fixes: 55ce74d4(md/raid1: ensure device failure recorded before write request returns.)
Cc: stable@vger.kernel.org (v4.3+)
Signed-off-by: Nate Dailey <nate.dailey@stratus.com>
Signed-off-by: Shaohua Li <shli@fb.com>
parent d85326cf
+5 −2
@@ -2274,6 +2274,7 @@ static void handle_write_finished(struct r1conf *conf, struct r1bio *r1_bio)
 	if (fail) {
 		spin_lock_irq(&conf->device_lock);
 		list_add(&r1_bio->retry_list, &conf->bio_end_io_list);
+		conf->nr_queued++;
 		spin_unlock_irq(&conf->device_lock);
 		md_wakeup_thread(conf->mddev->thread);
 	} else {
@@ -2391,8 +2392,10 @@ static void raid1d(struct md_thread *thread)
 		LIST_HEAD(tmp);
 		spin_lock_irqsave(&conf->device_lock, flags);
 		if (!test_bit(MD_CHANGE_PENDING, &mddev->flags)) {
-			list_add(&tmp, &conf->bio_end_io_list);
-			list_del_init(&conf->bio_end_io_list);
+			while (!list_empty(&conf->bio_end_io_list)) {
+				list_move(conf->bio_end_io_list.prev, &tmp);
+				conf->nr_queued--;
+			}
 		}
 		spin_unlock_irqrestore(&conf->device_lock, flags);
 		while (!list_empty(&tmp)) {
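
The raid1d change drains bio_end_io_list one entry at a time so that
nr_queued can be decremented per entry. A minimal userspace sketch of that
pattern follows. The toy_ helpers are hypothetical reimplementations written
for this example, not the kernel's list.h, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for a circular doubly linked list head/node. */
struct toy_list {
	struct toy_list *next, *prev;
};

static void toy_list_init(struct toy_list *head)
{
	head->next = head;
	head->prev = head;
}

static bool toy_list_empty(const struct toy_list *head)
{
	return head->next == head;
}

/* Insert entry immediately after head. */
static void toy_list_add(struct toy_list *entry, struct toy_list *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry from its current list, then insert it after head. */
static void toy_list_move(struct toy_list *entry, struct toy_list *head)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	toy_list_add(entry, head);
}

struct toy_conf {
	struct toy_list bio_end_io_list;
	int nr_queued;
};

/* Park an entry and count it, as the handle_write_finished hunk does. */
static void toy_park(struct toy_conf *c, struct toy_list *entry)
{
	toy_list_add(entry, &c->bio_end_io_list);
	c->nr_queued++;
}

/* Move entries from the tail onto tmp, uncounting each; returns how many. */
static int toy_drain(struct toy_conf *c, struct toy_list *tmp)
{
	int moved = 0;

	while (!toy_list_empty(&c->bio_end_io_list)) {
		assert(c->nr_queued > 0);
		toy_list_move(c->bio_end_io_list.prev, tmp);
		c->nr_queued--;
		moved++;
	}
	return moved;
}
```

Taking entries from the tail and inserting each at the front of tmp leaves
tmp in the same order the source list had, so in this sketch the only
difference from moving the whole list at once is the per-entry decrement.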