
Commit 108cef3a authored by NeilBrown

md/raid5: fetch_block must fetch all the blocks handle_stripe_dirtying wants.



It is critical that fetch_block() and handle_stripe_dirtying()
are consistent in their analysis of what needs to be loaded.
Otherwise raid5 can wait forever for a block that won't be loaded.

Currently when writing to a RAID5 that is resyncing, to a location
beyond the resync offset, handle_stripe_dirtying chooses a
reconstruct-write cycle, but fetch_block() assumes a
read-modify-write, and a lockup can happen.

So treat that case just like RAID6, just as we do in
handle_stripe_dirtying.  RAID6 always does reconstruct-write.
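The clause this patch adds to fetch_block() can be read as a small predicate. What follows is a simplified user-space sketch, not kernel code: struct stripe_model and both function names are hypothetical stand-ins for the sh, s and dev fields the real condition consults, and the flag tests are reduced to plain booleans.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the state the patch consults. */
struct stripe_model {
	int level;            /* RAID level: 5 or 6 */
	int raid_disks;       /* total devices in the array */
	int max_degraded;     /* 1 for RAID5, 2 for RAID6 */
	uint64_t sector;      /* models sh->sector */
	uint64_t recovery_cp; /* models mddev->recovery_cp */
	int failed;           /* models s->failed */
	int to_write;         /* models s->to_write */
	int non_overwrite;    /* models s->non_overwrite */
	bool dev_insync;      /* models R5_Insync on the candidate block */
	bool preread_active;  /* models STRIPE_PREREAD_ACTIVE on the stripe */
};

/* True when the write path always picks reconstruct-write:
 * RAID6, or a stripe at or beyond the resync checkpoint. */
static bool forced_rcw(const struct stripe_model *m)
{
	return m->level == 6 || m->sector >= m->recovery_cp;
}

/* Models the clause added to fetch_block(): in the forced
 * reconstruct-write case, should this block be fetched? */
static bool rcw_clause_wants_block(const struct stripe_model *m)
{
	return forced_rcw(m) && m->failed && m->to_write &&
	       (m->to_write - m->non_overwrite <
		m->raid_disks - m->max_degraded) &&
	       (!m->dev_insync || m->preread_active);
}
```

Under this model, a degraded RAID5 stripe beyond the resync checkpoint now satisfies the same clause that previously matched only RAID6, and the hard-coded "raid_disks - 2" becomes "raid_disks - max_degraded" so the data-disk count is right for both levels.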

This bug was introduced when the behaviour of handle_stripe_dirtying
was changed in 3.7, so the patch is suitable for any kernel since,
though it will need careful merging for some versions.

Cc: stable@vger.kernel.org (v3.7+)
Fixes: a7854487
Reported-by: Henry Cai <henryplusplus@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
parent 3a18ca06
+5 −2
@@ -2917,8 +2917,11 @@ static int fetch_block(struct stripe_head *sh, struct stripe_head_state *s,
 	     (sh->raid_conf->level <= 5 && s->failed && fdev[0]->towrite &&
 	      (!test_bit(R5_Insync, &dev->flags) || test_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) &&
 	      !test_bit(R5_OVERWRITE, &fdev[0]->flags)) ||
-	     (sh->raid_conf->level == 6 && s->failed && s->to_write &&
-	      s->to_write - s->non_overwrite < sh->raid_conf->raid_disks - 2 &&
+	     ((sh->raid_conf->level == 6 ||
+	       sh->sector >= sh->raid_conf->mddev->recovery_cp)
+	      && s->failed && s->to_write &&
+	      (s->to_write - s->non_overwrite <
+	       sh->raid_conf->raid_disks - sh->raid_conf->max_degraded) &&
 	      (!test_bit(R5_Insync, &dev->flags) || test_bit(STRIPE_PREREAD_ACTIVE, &sh->state))))) {
 		/* we would like to get this block, possibly by computing it,
 		 * otherwise read it if the backing disk is insync