
Commit dbb7339c authored by Damien Le Moal, committed by Greg Kroah-Hartman

block: mq-deadline: Fix queue restart handling



[ Upstream commit cb8acabbe33b110157955a7425ee876fb81e6bbc ]

Commit 7211aef86f79 ("block: mq-deadline: Fix write completion
handling") added a call to blk_mq_sched_mark_restart_hctx() in
dd_dispatch_request() to make sure that write request dispatching does
not stall when all target zones are locked. This fix left a subtle race
when a write completion happens during a dispatch execution on another
CPU:

CPU 0: Dispatch                 CPU 1: write completion

dd_dispatch_request()
    lock(&dd->lock);
    ...
    lock(&dd->zone_lock);       dd_finish_request()
    rq = find request           lock(&dd->zone_lock);
    unlock(&dd->zone_lock);
                                zone write unlock
                                unlock(&dd->zone_lock);
                                ...
                                __blk_mq_free_request
                                    check restart flag (not set)
                                    -> queue not run
    ...
    if (!rq && have writes)
        blk_mq_sched_mark_restart_hctx()
    unlock(&dd->lock)

Since the dispatch context finishes after the write request completion
handling, the marking of the queue as needing a restart is not seen by
__blk_mq_free_request() and blk_mq_sched_restart() is not executed,
leading to a dispatch stall under 100% write workloads.
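
The interleaving above can be replayed deterministically in a small
user-space model. This is only a sketch: struct model and every function
name in it are hypothetical stand-ins for the restart mark, the zone write
lock and the write FIFO, not kernel APIs, and the two CPUs are simulated by
calling the steps in the order shown in the diagram.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model {
	bool restart_flag;   /* stands in for the hctx restart mark */
	bool zone_locked;    /* target zone of the pending write is locked */
	int  pending_writes; /* writes waiting in the scheduler FIFO */
	int  queue_runs;     /* times the completion path re-ran the queue */
};

/* Dispatch, step 1: a write can only be selected if its zone is unlocked. */
static bool dispatch_select(const struct model *m)
{
	return m->pending_writes > 0 && !m->zone_locked;
}

/* Dispatch, step 2 (pre-fix): mark restart when nothing could be selected. */
static void dispatch_mark_restart(struct model *m, bool found)
{
	if (!found && m->pending_writes > 0)
		m->restart_flag = true;
}

/* Completion: unlock the zone, then re-run the queue only if it is marked. */
static void complete_write(struct model *m)
{
	m->zone_locked = false;
	if (m->restart_flag) {
		m->restart_flag = false;
		m->queue_runs++;
	}
}

/* Order from the diagram: the completion lands between steps 1 and 2. */
static int replay_racy_interleaving(void)
{
	struct model m = { .zone_locked = true, .pending_writes = 1 };
	bool found = dispatch_select(&m);  /* CPU 0: zone locked, finds nothing */

	assert(!found);
	complete_write(&m);                /* CPU 1: flag not set, no re-run */
	dispatch_mark_restart(&m, found);  /* CPU 0: marks the queue too late */
	return m.queue_runs;
}

/* Same steps without the race: dispatch finishes before the completion. */
static int replay_serial_order(void)
{
	struct model m = { .zone_locked = true, .pending_writes = 1 };
	bool found = dispatch_select(&m);

	dispatch_mark_restart(&m, found);
	complete_write(&m);
	return m.queue_runs;
}
```

In this model replay_racy_interleaving() returns 0 queue runs while one
write is still pending and its zone is unlocked, which corresponds to the
stall; replay_serial_order() returns 1.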

Fix this by moving the call to blk_mq_sched_mark_restart_hctx() from
dd_dispatch_request() into dd_finish_request() under the zone lock to
ensure full mutual exclusion between write request dispatch selection
and zone unlock on write request completion.
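
The effect of the fix can be sketched with a similar user-space model.
Again, struct model and all function names are hypothetical stand-ins
rather than kernel code; the only property taken from the description above
is that the completion path now sets the restart mark itself, in the same
critical section that unlocks the zone, before the restart check made when
the request is freed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model {
	bool restart_flag;   /* stands in for the hctx restart mark */
	bool zone_locked;    /* target zone of the pending write is locked */
	int  pending_writes; /* writes waiting in the scheduler FIFO */
	int  queue_runs;     /* times the completion path re-ran the queue */
};

/* A write can only be selected for dispatch if its zone is unlocked. */
static bool dispatch_select(const struct model *m)
{
	return m->pending_writes > 0 && !m->zone_locked;
}

/* Post-fix completion: unlock and mark restart in one critical section. */
static void finish_request(struct model *m)
{
	/* the zone lock would be taken here */
	m->zone_locked = false;
	if (m->pending_writes > 0)
		m->restart_flag = true;
	/* the zone lock would be released here */
}

/* Stands in for the restart check made when the request is freed. */
static void free_request(struct model *m)
{
	if (!m->restart_flag)
		return;
	m->restart_flag = false;
	m->queue_runs++;
	if (dispatch_select(m))
		m->pending_writes--; /* the re-run dispatches the waiting write */
}

/* Same timing as the racy case: dispatch has already found nothing. */
static int replay_fixed_interleaving(void)
{
	struct model m = { .zone_locked = true, .pending_writes = 1 };
	bool found = dispatch_select(&m);  /* CPU 0: zone locked, finds nothing */

	assert(!found);
	finish_request(&m);                /* CPU 1: unlock and mark together */
	free_request(&m);                  /* CPU 1: flag is set, queue re-run */
	return m.pending_writes;
}
```

Because the mark no longer depends on when the dispatch context finishes,
replay_fixed_interleaving() returns 0 pending writes in this model.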

Fixes: 7211aef86f79 ("block: mq-deadline: Fix write completion handling")
Cc: stable@vger.kernel.org
Reported-by: Hans Holmberg <Hans.Holmberg@wdc.com>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
parent af10ffa6
+13 −10
@@ -376,13 +376,6 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd)
  * hardware queue, but we may return a request that is for a
  * different hardware queue. This is because mq-deadline has shared
  * state for all hardware queues, in terms of sorting, FIFOs, etc.
- *
- * For a zoned block device, __dd_dispatch_request() may return NULL
- * if all the queued write requests are directed at zones that are already
- * locked due to on-going write requests. In this case, make sure to mark
- * the queue as needing a restart to ensure that the queue is run again
- * and the pending writes dispatched once the target zones for the ongoing
- * write requests are unlocked in dd_finish_request().
  */
 static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 {
@@ -391,9 +384,6 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)

 	spin_lock(&dd->lock);
 	rq = __dd_dispatch_request(dd);
-	if (!rq && blk_queue_is_zoned(hctx->queue) &&
-	    !list_empty(&dd->fifo_list[WRITE]))
-		blk_mq_sched_mark_restart_hctx(hctx);
 	spin_unlock(&dd->lock);

 	return rq;
@@ -559,6 +549,13 @@ static void dd_prepare_request(struct request *rq, struct bio *bio)
  * spinlock so that the zone is never unlocked while deadline_fifo_request()
  * or deadline_next_request() are executing. This function is called for
  * all requests, whether or not these requests complete successfully.
+ *
+ * For a zoned block device, __dd_dispatch_request() may have stopped
+ * dispatching requests if all the queued requests are write requests directed
+ * at zones that are already locked due to on-going write requests. To ensure
+ * write request dispatch progress in this case, mark the queue as needing a
+ * restart to ensure that the queue is run again after completion of the
+ * request and zones being unlocked.
  */
 static void dd_finish_request(struct request *rq)
 {
@@ -570,6 +567,12 @@ static void dd_finish_request(struct request *rq)

 		spin_lock_irqsave(&dd->zone_lock, flags);
 		blk_req_zone_write_unlock(rq);
+		if (!list_empty(&dd->fifo_list[WRITE])) {
+			struct blk_mq_hw_ctx *hctx;
+
+			hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu);
+			blk_mq_sched_mark_restart_hctx(hctx);
+		}
 		spin_unlock_irqrestore(&dd->zone_lock, flags);
 	}
 }