Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 975b2979 authored by Philipp Reisner's avatar Philipp Reisner
Browse files

drbd: fix potential spinlock deadlock



drbd_try_clear_on_disk_bm() has a sanity check for the number of blocks
left to be resynced (rs_left) in the current resync extent.
If it detects a mismatch, it complains, and forces a disconnect using
drbd_force_state(mdev, NS(conn, C_DISCONNECTING));

Unfortunately, this may be called while holding the req_lock,
and drbd_force_state() want's to aquire that lock itself. Deadlock.

Don't force a disconnect, but fix up rs_left by recounting and
reassigning the number of dirty blocks in that extent.

Signed-off-by: default avatarPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: default avatarLars Ellenberg <lars.ellenberg@linbit.com>
parent 77fede51
Loading
Loading
Loading
Loading
+12 −8
Original line number Diff line number Diff line
@@ -573,16 +573,20 @@ static void drbd_try_clear_on_disk_bm(struct drbd_conf *mdev, sector_t sector,
			else
				ext->rs_failed += count;
			if (ext->rs_left < ext->rs_failed) {
				dev_err(DEV, "BAD! sector=%llus enr=%u rs_left=%d "
				    "rs_failed=%d count=%d\n",
				dev_warn(DEV, "BAD! sector=%llus enr=%u rs_left=%d "
				    "rs_failed=%d count=%d cstate=%s\n",
				     (unsigned long long)sector,
				     ext->lce.lc_number, ext->rs_left,
				     ext->rs_failed, count);
				dump_stack();

				lc_put(mdev->resync, &ext->lce);
				conn_request_state(mdev->tconn, NS(conn, C_DISCONNECTING), CS_HARD);
				return;
				     ext->rs_failed, count,
				     drbd_conn_str(mdev->state.conn));

				/* We don't expect to be able to clear more bits
				 * than have been set when we originally counted
				 * the set bits to cache that value in ext->rs_left.
				 * Whatever the reason (disconnect during resync,
				 * delayed local completion of an application write),
				 * try to fix it up by recounting here. */
				ext->rs_left = drbd_bm_e_weight(mdev, enr);
			}
		} else {
			/* Normally this element should be in the cache,