Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 04412819 authored by Linus Torvalds's avatar Linus Torvalds
Browse files

Merge tag 'for-linus-20190726' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - Several io_uring fixes/improvements:
     - Blocking fix for O_DIRECT (me)
     - Latter page slowness for registered buffers (me)
     - Fix poll hang under certain conditions (me)
     - Defer sequence check fix for wrapped rings (Zhengyuan)
     - Mismatch in async inc/dec accounting (Zhengyuan)
     - Memory ordering issue that could cause stall (Zhengyuan)
      - Track sequential defer in bytes, not pages (Zhengyuan)

 - NVMe pull request from Christoph

 - Set of hang fixes for wbt (Josef)

 - Redundant error message kill for libahci (Ding)

 - Remove unused blk_mq_sched_started_request() and related ops (Marcos)

 - drbd dynamic alloc shash descriptor to reduce stack use (Arnd)

 - blkcg ->pd_stat() non-debug print (Tejun)

 - bcache memory leak fix (Wei)

 - Comment fix (Akinobu)

 - BFQ perf regression fix (Paolo)

* tag 'for-linus-20190726' of git://git.kernel.dk/linux-block: (24 commits)
  io_uring: ensure ->list is initialized for poll commands
  Revert "nvme-pci: don't create a read hctx mapping without read queues"
  nvme: fix multipath crash when ANA is deactivated
  nvme: fix memory leak caused by incorrect subsystem free
  nvme: ignore subnqn for ADATA SX6000LNP
  drbd: dynamically allocate shash descriptor
  block: blk-mq: Remove blk_mq_sched_started_request and started_request
  bcache: fix possible memory leak in bch_cached_dev_run()
  io_uring: track io length in async_list based on bytes
  io_uring: don't use iov_iter_advance() for fixed buffers
  block: properly handle IOCB_NOWAIT for async O_DIRECT IO
  blk-mq: allow REQ_NOWAIT to return an error inline
  io_uring: add a memory barrier before atomic_read
  rq-qos: use a mb for got_token
  rq-qos: set ourself TASK_UNINTERRUPTIBLE after we schedule
  rq-qos: don't reset has_sleepers on spurious wakeups
  rq-qos: fix missed wake-ups in rq_qos_throttle
  wait: add wq_has_single_sleeper helper
  block, bfq: check also in-flight I/O in dispatch plugging
  block: fix sysfs module parameters directory path in comment
  ...
parents 750c930b 9c0b2596
Loading
Loading
Loading
Loading
+43 −24
Original line number Diff line number Diff line
@@ -3354,38 +3354,57 @@ static void bfq_dispatch_remove(struct request_queue *q, struct request *rq)
 * there is no active group, then the primary expectation for
 * this device is probably a high throughput.
 *
 * We are now left only with explaining the additional
 * compound condition that is checked below for deciding
 * whether the scenario is asymmetric. To explain this
 * compound condition, we need to add that the function
 * We are now left only with explaining the two sub-conditions in the
 * additional compound condition that is checked below for deciding
 * whether the scenario is asymmetric. To explain the first
 * sub-condition, we need to add that the function
 * bfq_asymmetric_scenario checks the weights of only
 * non-weight-raised queues, for efficiency reasons (see
 * comments on bfq_weights_tree_add()). Then the fact that
 * bfqq is weight-raised is checked explicitly here. More
 * precisely, the compound condition below takes into account
 * also the fact that, even if bfqq is being weight-raised,
 * the scenario is still symmetric if all queues with requests
 * waiting for completion happen to be
 * weight-raised. Actually, we should be even more precise
 * here, and differentiate between interactive weight raising
 * and soft real-time weight raising.
 * non-weight-raised queues, for efficiency reasons (see comments on
 * bfq_weights_tree_add()). Then the fact that bfqq is weight-raised
 * is checked explicitly here. More precisely, the compound condition
 * below takes into account also the fact that, even if bfqq is being
 * weight-raised, the scenario is still symmetric if all queues with
 * requests waiting for completion happen to be
 * weight-raised. Actually, we should be even more precise here, and
 * differentiate between interactive weight raising and soft real-time
 * weight raising.
 *
 * The second sub-condition checked in the compound condition is
 * whether there is a fair amount of already in-flight I/O not
 * belonging to bfqq. If so, I/O dispatching is to be plugged, for the
 * following reason. The drive may decide to serve in-flight
 * non-bfqq's I/O requests before bfqq's ones, thereby delaying the
 * arrival of new I/O requests for bfqq (recall that bfqq is sync). If
 * I/O-dispatching is not plugged, then, while bfqq remains empty, a
 * basically uncontrolled amount of I/O from other queues may be
 * dispatched too, possibly causing the service of bfqq's I/O to be
 * delayed even longer in the drive. This problem gets more and more
 * serious as the speed and the queue depth of the drive grow,
 * because, as these two quantities grow, the probability to find no
 * queue busy but many requests in flight grows too. By contrast,
 * plugging I/O dispatching minimizes the delay induced by already
 * in-flight I/O, and enables bfqq to recover the bandwidth it may
 * lose because of this delay.
 *
 * As a side note, it is worth considering that the above
 * device-idling countermeasures may however fail in the
 * following unlucky scenario: if idling is (correctly)
 * disabled in a time period during which all symmetry
 * sub-conditions hold, and hence the device is allowed to
 * enqueue many requests, but at some later point in time some
 * sub-condition stops to hold, then it may become impossible
 * to let requests be served in the desired order until all
 * the requests already queued in the device have been served.
 * device-idling countermeasures may however fail in the following
 * unlucky scenario: if I/O-dispatch plugging is (correctly) disabled
 * in a time period during which all symmetry sub-conditions hold, and
 * therefore the device is allowed to enqueue many requests, but at
 * some later point in time some sub-condition stops to hold, then it
 * may become impossible to make requests be served in the desired
 * order until all the requests already queued in the device have been
 * served. The last sub-condition commented above somewhat mitigates
 * this problem for weight-raised queues.
 */
static bool idling_needed_for_service_guarantees(struct bfq_data *bfqd,
						 struct bfq_queue *bfqq)
{
	return (bfqq->wr_coeff > 1 &&
		bfqd->wr_busy_queues <
		bfq_tot_busy_queues(bfqd)) ||
		(bfqd->wr_busy_queues <
		 bfq_tot_busy_queues(bfqd) ||
		 bfqd->rq_in_driver >=
		 bfqq->dispatched + 4)) ||
		bfq_asymmetric_scenario(bfqd, bfqq);
}

+3 −6
Original line number Diff line number Diff line
@@ -54,7 +54,7 @@ static struct blkcg_policy *blkcg_policy[BLKCG_MAX_POLS];

static LIST_HEAD(all_blkcgs);		/* protected by blkcg_pol_mutex */

static bool blkcg_debug_stats = false;
bool blkcg_debug_stats = false;
static struct workqueue_struct *blkcg_punt_bio_wq;

static bool blkcg_policy_enabled(struct request_queue *q,
@@ -944,10 +944,7 @@ static int blkcg_print_stat(struct seq_file *sf, void *v)
					 dbytes, dios);
		}

		if (!blkcg_debug_stats)
			goto next;

		if (atomic_read(&blkg->use_delay)) {
		if (blkcg_debug_stats && atomic_read(&blkg->use_delay)) {
			has_stats = true;
			off += scnprintf(buf+off, size-off,
					 " use_delay=%d delay_nsec=%llu",
@@ -967,7 +964,7 @@ static int blkcg_print_stat(struct seq_file *sf, void *v)
				has_stats = true;
			off += written;
		}
next:

		if (has_stats) {
			if (off < size - 1) {
				off += scnprintf(buf+off, size-off, "\n");
+3 −0
Original line number Diff line number Diff line
@@ -917,6 +917,9 @@ static size_t iolatency_pd_stat(struct blkg_policy_data *pd, char *buf,
	unsigned long long avg_lat;
	unsigned long long cur_win;

	if (!blkcg_debug_stats)
		return 0;

	if (iolat->ssd)
		return iolatency_ssd_stat(iolat, buf, size);

+0 −9
Original line number Diff line number Diff line
@@ -61,15 +61,6 @@ static inline void blk_mq_sched_completed_request(struct request *rq, u64 now)
		e->type->ops.completed_request(rq, now);
}

static inline void blk_mq_sched_started_request(struct request *rq)
{
	struct request_queue *q = rq->q;
	struct elevator_queue *e = q->elevator;

	if (e && e->type->ops.started_request)
		e->type->ops.started_request(rq);
}

static inline void blk_mq_sched_requeue_request(struct request *rq)
{
	struct request_queue *q = rq->q;
+6 −4
Original line number Diff line number Diff line
@@ -669,8 +669,6 @@ void blk_mq_start_request(struct request *rq)
{
	struct request_queue *q = rq->q;

	blk_mq_sched_started_request(rq);

	trace_block_rq_issue(q, rq);

	if (test_bit(QUEUE_FLAG_STATS, &q->queue_flags)) {
@@ -1960,9 +1958,13 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
	rq = blk_mq_get_request(q, bio, &data);
	if (unlikely(!rq)) {
		rq_qos_cleanup(q, bio);
		if (bio->bi_opf & REQ_NOWAIT)

		cookie = BLK_QC_T_NONE;
		if (bio->bi_opf & REQ_NOWAIT_INLINE)
			cookie = BLK_QC_T_EAGAIN;
		else if (bio->bi_opf & REQ_NOWAIT)
			bio_wouldblock_error(bio);
		return BLK_QC_T_NONE;
		return cookie;
	}

	trace_block_getrq(q, bio, bio->bi_opf);
Loading