
Commit b0a1ea51 authored by Linus Torvalds

Merge branch 'for-4.3/blkcg' of git://git.kernel.dk/linux-block

Pull blk-cg updates from Jens Axboe:
 "A bit later in the cycle, but this has been in the block tree for a
  while.  This is basically four patchsets from Tejun, that improve our
  buffered cgroup writeback.  It was dependent on the other cgroup
  changes, but they went in earlier in this cycle.

  Series 1 is a set of 5 patches with cgroup writeback updates:

   - bdi_writeback iteration fix which could lead to some wb's being
     skipped or repeated during e.g. sync under memory pressure.

   - Simplification of wb work wait mechanism.

   - Writeback tracepoints updated to report cgroup.

  Series 2 is a set of updates for the CFQ cgroup writeback handling:

     cfq has always charged all async IOs to the root cgroup.  It didn't
     have much choice as writeback didn't know about cgroups and there
     was no way to tell who to blame for a given writeback IO.
     writeback finally grew support for cgroups and now tags each
     writeback IO with the appropriate cgroup to charge it against.

     This patchset updates cfq so that it follows the blkcg each bio is
     tagged with.  Async cfq_queues are now shared across cfq_group,
     which is per-cgroup, instead of per-request_queue cfq_data.  This
     makes all IOs follow the weight based IO resource distribution
     implemented by cfq.

     - Switched from GFP_ATOMIC to GFP_NOWAIT as suggested by Jeff.

     - Other misc review points addressed, acks added and rebased.

  Series 3 is the blkcg policy cleanup patches:

     This patchset contains assorted cleanups for blkcg_policy methods
     and blk[c]g_policy_data handling.

     - alloc/free added for blkg_policy_data.  exit dropped.

     - alloc/free added for blkcg_policy_data.

     - blk-throttle's async percpu allocation is replaced with direct
       allocation.

     - all methods now take blk[c]g_policy_data instead of blkcg_gq or
       blkcg.

  And finally, series 4 is a set of patches cleaning up the blkcg stats
  handling:

    blkcg's stats have always been somewhat of a mess.  This patchset
    tries to improve the situation a bit.

     - The following patches added to consolidate blkcg entry point and
       blkg creation.  This in itself is an improvement and helps
       collecting common stats on bio issue.

     - per-blkg stats now accounted on bio issue rather than request
       completion so that bio based and request based drivers can behave
       the same way.  The issue was spotted by Vivek.

     - cfq-iosched implements custom recursive stats and blk-throttle
       implements custom per-cpu stats.  This patchset makes blkcg core
       support both by default.

     - cfq-iosched and blk-throttle keep track of the same stats
       multiple times.  Unify them"

* 'for-4.3/blkcg' of git://git.kernel.dk/linux-block: (45 commits)
  blkcg: use CGROUP_WEIGHT_* scale for io.weight on the unified hierarchy
  blkcg: s/CFQ_WEIGHT_*/CFQ_WEIGHT_LEGACY_*/
  blkcg: implement interface for the unified hierarchy
  blkcg: misc preparations for unified hierarchy interface
  blkcg: separate out tg_conf_updated() from tg_set_conf()
  blkcg: move body parsing from blkg_conf_prep() to its callers
  blkcg: mark existing cftypes as legacy
  blkcg: rename subsystem name from blkio to io
  blkcg: refine error codes returned during blkcg configuration
  blkcg: remove unnecessary NULL checks from __cfqg_set_weight_device()
  blkcg: reduce stack usage of blkg_rwstat_recursive_sum()
  blkcg: remove cfqg_stats->sectors
  blkcg: move io_service_bytes and io_serviced stats into blkcg_gq
  blkcg: make blkg_[rw]stat_recursive_sum() to be able to index into blkcg_gq
  blkcg: make blkcg_[rw]stat per-cpu
  blkcg: add blkg_[rw]stat->aux_cnt and replace cfq_group->dead_stats with it
  blkcg: consolidate blkg creation in blkcg_bio_issue_check()
  blk-throttle: improve queue bypass handling
  blkcg: move root blkg lookup optimization from throtl_lookup_tg() to __blkg_lookup()
  blkcg: inline [__]blkg_lookup()
  ...
parents 33e247c7 69d7fde5
+6 −18
@@ -201,7 +201,7 @@ Proportional weight policy files
 	  specifies the number of bytes.
 
 - blkio.io_serviced
-	- Number of IOs completed to/from the disk by the group. These
+	- Number of IOs (bio) issued to the disk by the group. These
 	  are further divided by the type of operation - read or write, sync
 	  or async. First two fields specify the major and minor number of the
 	  device, third field specifies the operation type and the fourth field
@@ -327,18 +327,11 @@ Note: If both BW and IOPS rules are specified for a device, then IO is
       subjected to both the constraints.
 
 - blkio.throttle.io_serviced
-	- Number of IOs (bio) completed to/from the disk by the group (as
-	  seen by throttling policy). These are further divided by the type
-	  of operation - read or write, sync or async. First two fields specify
-	  the major and minor number of the device, third field specifies the
-	  operation type and the fourth field specifies the number of IOs.
-
-	  blkio.io_serviced does accounting as seen by CFQ and counts are in
-	  number of requests (struct request). On the other hand,
-	  blkio.throttle.io_serviced counts number of IO in terms of number
-	  of bios as seen by throttling policy.  These bios can later be
-	  merged by elevator and total number of requests completed can be
-	  lesser.
+	- Number of IOs (bio) issued to the disk by the group. These
+	  are further divided by the type of operation - read or write, sync
+	  or async. First two fields specify the major and minor number of the
+	  device, third field specifies the operation type and the fourth field
+	  specifies the number of IOs.
 
 - blkio.throttle.io_service_bytes
 	- Number of bytes transferred to/from the disk by the group. These
@@ -347,11 +340,6 @@ Note: If both BW and IOPS rules are specified for a device, then IO is
 	  device, third field specifies the operation type and the fourth field
 	  specifies the number of bytes.
 
-	  These numbers should roughly be same as blkio.io_service_bytes as
-	  updated by CFQ. The difference between two is that
-	  blkio.io_service_bytes will not be updated if CFQ is not operating
-	  on request queue.
-
 Common files among various policies
 -----------------------------------
 - blkio.reset_stats
+57 −4
@@ -27,7 +27,7 @@ CONTENTS
     5-3-1. Format
     5-3-2. Control Knobs
   5-4. Per-Controller Changes
-    5-4-1. blkio
+    5-4-1. io
     5-4-2. cpuset
     5-4-3. memory
 6. Planned Changes
@@ -203,7 +203,7 @@ other issues. The mapping from nice level to weight isn't obvious or
 universal, and there are various other knobs which simply aren't
 available for tasks.
 
-The blkio controller implicitly creates a hidden leaf node for each
+The io controller implicitly creates a hidden leaf node for each
 cgroup to host the tasks.  The hidden leaf has its own copies of all
 the knobs with "leaf_" prefixed.  While this allows equivalent control
 over internal tasks, it's with serious drawbacks.  It always adds an
@@ -438,9 +438,62 @@ may be specified in any order and not all pairs have to be specified.
 
 5-4. Per-Controller Changes
 
-5-4-1. blkio
+5-4-1. io
 
-- blk-throttle becomes properly hierarchical.
+- blkio is renamed to io.  The interface is overhauled anyway.  The
+  new name is more in line with the other two major controllers, cpu
+  and memory, and better suited given that it may be used for cgroup
+  writeback without involving block layer.
+
+- Everything including stat is always hierarchical making separate
+  recursive stat files pointless and, as no internal node can have
+  tasks, leaf weights are meaningless.  The operation model is
+  simplified and the interface is overhauled accordingly.
+
+  io.stat
+
+	The stat file.  The reported stats are from the point where
+	bio's are issued to request_queue.  The stats are counted
+	independent of which policies are enabled.  Each line in the
+	file follows the following format.  More fields may later be
+	added at the end.
+
+	  $MAJ:$MIN rbytes=$RBYTES wbytes=$WBYTES rios=$RIOS wios=$WIOS
+
+  io.weight
+
+	The weight setting, currently only available and effective if
+	cfq-iosched is in use for the target device.  The weight is
+	between 1 and 10000 and defaults to 100.  The first line
+	always contains the default weight in the following format to
+	use when per-device setting is missing.
+
+	  default $WEIGHT
+
+	Subsequent lines list per-device weights of the following
+	format.
+
+	  $MAJ:$MIN $WEIGHT
+
+	Writing "$WEIGHT" or "default $WEIGHT" changes the default
+	setting.  Writing "$MAJ:$MIN $WEIGHT" sets per-device weight
+	while "$MAJ:$MIN default" clears it.
+
+	This file is available only on non-root cgroups.
+
+  io.max
+
+	The maximum bandwidth and/or iops setting, only available if
+	blk-throttle is enabled.  The file is of the following format.
+
+	  $MAJ:$MIN rbps=$RBPS wbps=$WBPS riops=$RIOPS wiops=$WIOPS
+
+	${R|W}BPS are read/write bytes per second and ${R|W}IOPS are
+	read/write IOs per second.  "max" indicates no limit.  Writing
+	to the file follows the same format but the individual
+	settings may be omitted or specified in any order.
+
+	This file is available only on non-root cgroups.
 
 
 5-4-2. cpuset
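
Reader's note on the hunk above: the io.stat text describes each line as a device number followed by key=value pairs, and says more fields may be appended later. A userspace reader would therefore probably want to look fields up by key instead of by position. The following is a minimal sketch of that idea, not code from this commit; the helper names io_stat_dev() and io_stat_get() are hypothetical, and the sample line in the usage note is made up.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the "$MAJ:$MIN" prefix of an io.stat line.
 * Returns 0 on success, -1 if the line does not start with a device number.
 */
int io_stat_dev(const char *line, unsigned int *major, unsigned int *minor)
{
	return sscanf(line, "%u:%u", major, minor) == 2 ? 0 : -1;
}

/*
 * Look up one "key=value" pair (for example "rbytes") in an io.stat line.
 * Keys are matched by name, not by position, so fields appended at the
 * end of the line later on do not break the lookup.
 * Returns 0 and stores the value in *out on success, -1 if the key is
 * absent or the line is malformed.
 */
int io_stat_get(const char *line, const char *key, unsigned long long *out)
{
	char name[32];
	unsigned long long val;
	unsigned int major, minor;
	int off = 0, n = 0;

	/* Skip the leading device number and remember where it ends. */
	if (sscanf(line, "%u:%u%n", &major, &minor, &off) != 2)
		return -1;

	/* Walk the remaining whitespace-separated key=value pairs. */
	while (sscanf(line + off, " %31[^= \n]=%llu%n", name, &val, &n) == 2) {
		if (strcmp(name, key) == 0) {
			*out = val;
			return 0;
		}
		off += n;
	}
	return -1;
}
```

For a made-up line such as "8:16 rbytes=4096 wbytes=512 rios=3", io_stat_get(line, "rios", &v) stores 3 in v, and a lookup of a key that is not present returns -1. Values such as "max" in io.max are not numeric and are deliberately outside the scope of this sketch.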
+1 −1
@@ -1990,7 +1990,7 @@ int bio_associate_current(struct bio *bio)
 
 	get_io_context_active(ioc);
 	bio->bi_ioc = ioc;
-	bio->bi_css = task_get_css(current, blkio_cgrp_id);
+	bio->bi_css = task_get_css(current, io_cgrp_id);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(bio_associate_current);
+335 −189

File changed.

Preview size limit exceeded, changes collapsed.

+2 −2
@@ -1888,8 +1888,8 @@ generic_make_request_checks(struct bio *bio)
 	 */
 	create_io_context(GFP_ATOMIC, q->node);
 
-	if (blk_throtl_bio(q, bio))
-		return false;	/* throttled, will be resubmitted later */
+	if (!blkcg_bio_issue_check(q, bio))
+		return false;
 
 	trace_block_bio_queue(q, bio);
 	return true;
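
Reader's note on the hunk above: the call site flips its test. The old blk_throtl_bio() returned true when the bio was throttled, while the new blkcg_bio_issue_check() returns false when the caller should stop. The body of blkcg_bio_issue_check() is in the collapsed file and is not shown on this page, so the following is only a userspace sketch of the pattern, combined with the merge message's statement that per-blkg stats are now accounted on bio issue. Every mock_* name, the byte-budget throttling rule, and the choice to skip accounting for throttled bios are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a bio and for per-blkg counters. */
struct mock_bio {
	size_t bytes;
	bool is_write;
};

struct mock_blkg_stat {
	unsigned long long bytes[2];	/* [0] = read, [1] = write */
	unsigned long long ios[2];
	unsigned long long budget;	/* bytes still allowed before throttling */
};

/*
 * Returns true when the bio is held back, the same polarity as the old
 * "if (blk_throtl_bio(q, bio))" test. The budget rule is invented.
 */
static bool mock_throttle(struct mock_blkg_stat *st, const struct mock_bio *bio)
{
	if (bio->bytes > st->budget)
		return true;
	st->budget -= bio->bytes;
	return false;
}

/*
 * Returns true when the caller may go on issuing the bio, the same
 * polarity as the new "if (!blkcg_bio_issue_check(q, bio))" test.
 * Counters are bumped here, at issue time, not at request completion.
 */
bool mock_bio_issue_check(struct mock_blkg_stat *st, const struct mock_bio *bio)
{
	bool throttled = mock_throttle(st, bio);

	if (!throttled) {
		st->bytes[bio->is_write] += bio->bytes;
		st->ios[bio->is_write] += 1;
	}
	return !throttled;
}
```

Accounting in one place at issue time is what would let bio based and request based drivers report the same numbers, which is the motivation the merge message gives for the change.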