
Commit 73ba2fb3 authored by Linus Torvalds

Merge tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:
 "First pull request for this merge window, there will also be a
  followup request with some stragglers.

  This pull request contains:

   - Fix for a thundering herd issue in the wbt block code (Anchal
     Agarwal)

   - A few NVMe pull requests:
      * Improved tracepoints (Keith)
      * Larger inline data support for RDMA (Steve Wise)
      * RDMA setup/teardown fixes (Sagi)
      * Effects log support for NVMe target (Chaitanya Kulkarni)
      * Buffered IO support for NVMe target (Chaitanya Kulkarni)
      * TP4004 (ANA) support (Christoph)
      * Various NVMe fixes

   - Block io-latency controller support. Much needed support for
     properly containing block devices. (Josef)

   - Series improving how we handle sense information on the stack
     (Kees)

   - Lightnvm fixes and updates/improvements (Mathias/Javier et al)

   - Zoned device support for null_blk (Matias)

   - AIX partition fixes (Mauricio Faria de Oliveira)

   - DIF checksum code made generic (Max Gurtovoy)

   - Add support for discard in iostats (Michael Callahan / Tejun)

   - Set of updates for BFQ (Paolo)

   - Removal of async write support for bsg (Christoph)

   - Bio page dirtying and clone fixups (Christoph)

   - Set of bcache fix/changes (via Coly)

   - Series improving blk-mq queue setup/teardown speed (Ming)

   - Series improving merging performance on blk-mq (Ming)

   - Lots of other fixes and cleanups from a slew of folks"

* tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-block: (190 commits)
  blkcg: Make blkg_root_lookup() work for queues in bypass mode
  bcache: fix error setting writeback_rate through sysfs interface
  null_blk: add lock drop/acquire annotation
  Blk-throttle: reduce tail io latency when iops limit is enforced
  block: paride: pd: mark expected switch fall-throughs
  block: Ensure that a request queue is dissociated from the cgroup controller
  block: Introduce blk_exit_queue()
  blkcg: Introduce blkg_root_lookup()
  block: Remove two superfluous #include directives
  blk-mq: count the hctx as active before allocating tag
  block: bvec_nr_vecs() returns value for wrong slab
  bcache: trivial - remove tailing backslash in macro BTREE_FLAG
  bcache: make the pr_err statement used for ENOENT only in sysfs_attatch section
  bcache: set max writeback rate when I/O request is idle
  bcache: add code comments for bset.c
  bcache: fix mistaken comments in request.c
  bcache: fix mistaken code comments in bcache.h
  bcache: add a comment in super.c
  bcache: avoid unncessary cache prefetch bch_btree_node_get()
  bcache: display rate debug parameters to 0 when writeback is not running
  ...
parents 958f338e b86d865c
+10 −0
@@ -5,6 +5,7 @@ Description:
 		The /proc/diskstats file displays the I/O statistics
 		of block devices. Each line contains the following 14
 		fields:
+
 		 1 - major number
 		 2 - minor number
 		 3 - device name
@@ -19,4 +20,13 @@ Description:
 		12 - I/Os currently in progress
 		13 - time spent doing I/Os (ms)
 		14 - weighted time spent doing I/Os (ms)
+
+		Kernel 4.18+ appends four more fields for discard
+		tracking, putting the total at 18:
+
+		15 - discards completed successfully
+		16 - discards merged
+		17 - sectors discarded
+		18 - time spent discarding
+
 		For more details refer to Documentation/iostats.txt
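As a quick illustration (editor's sketch, not part of the commit), the 14-field and 18-field formats described above can be told apart by token count; the field names below follow the documentation, while the helper function itself is hypothetical:

```python
# Sketch: parse a /proc/diskstats line, handling both the classic
# 14-field format and the 18-field format with discard counters
# that kernel 4.18+ appends.

FIELDS_14 = [
    "major", "minor", "device",
    "reads_completed", "reads_merged", "sectors_read", "read_ms",
    "writes_completed", "writes_merged", "sectors_written", "write_ms",
    "io_in_progress", "io_ms", "weighted_io_ms",
]
DISCARD_FIELDS = [
    "discards_completed", "discards_merged", "sectors_discarded", "discard_ms",
]

def parse_diskstats_line(line):
    parts = line.split()
    # 4.18+ lines carry four extra discard fields after the original 14.
    names = FIELDS_14 + (DISCARD_FIELDS if len(parts) >= 18 else [])
    stats = dict(zip(names, parts))
    # Everything except the device name is an integer counter.
    return {k: (v if k == "device" else int(v)) for k, v in stats.items()}

# The 4.18+ example line quoted in Documentation/iostats.txt below:
sample = ("3 0 hda 446216 784926 9550688 4382310 424847 312726 "
          "5922052 19310380 0 3376340 23705160 0 0 0 0")
print(parse_diskstats_line(sample)["sectors_read"])  # 9550688
```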
+88 −4
@@ -51,6 +51,9 @@ v1 is available under Documentation/cgroup-v1/.
     5-3. IO
       5-3-1. IO Interface Files
       5-3-2. Writeback
+      5-3-3. IO Latency
+        5-3-3-1. How IO Latency Throttling Works
+        5-3-3-2. IO Latency Interface Files
     5-4. PID
       5-4-1. PID Interface Files
     5-5. Device
@@ -1314,17 +1317,19 @@ IO Interface Files
 	Lines are keyed by $MAJ:$MIN device numbers and not ordered.
 	The following nested keys are defined.
 
-	  ======	===================
+	  ======	=====================
 	  rbytes	Bytes read
 	  wbytes	Bytes written
 	  rios		Number of read IOs
 	  wios		Number of write IOs
-	  ======	===================
+	  dbytes	Bytes discarded
+	  dios		Number of discard IOs
+	  ======	=====================
 
 	An example read output follows:
 
-	  8:16 rbytes=1459200 wbytes=314773504 rios=192 wios=353
-	  8:0 rbytes=90430464 wbytes=299008000 rios=8950 wios=1252
+	  8:16 rbytes=1459200 wbytes=314773504 rios=192 wios=353 dbytes=0 dios=0
+	  8:0 rbytes=90430464 wbytes=299008000 rios=8950 wios=1252 dbytes=50331648 dios=3021
 
   io.weight
 	A read-write flat-keyed file which exists on non-root cgroups.
@@ -1446,6 +1451,85 @@ writeback as follows.
 	vm.dirty[_background]_ratio.
 
 
+IO Latency
+~~~~~~~~~~
+
+This is a cgroup v2 controller for IO workload protection.  You provide a group
+with a latency target, and if the average latency exceeds that target the
+controller will throttle any peers that have a lower latency target than the
+protected workload.
+
+The limits are only applied at the peer level in the hierarchy.  This means that
+in the diagram below, only groups A, B, and C will influence each other, and
+groups D and F will influence each other.  Group G will influence nobody.
+
+			[root]
+		/	   |		\
+		A	   B		C
+	       /  \        |
+	      D    F	   G
+
+
+So the ideal way to configure this is to set io.latency in groups A, B, and C.
+Generally you do not want to set a value lower than the latency your device
+supports.  Experiment to find the value that works best for your workload.
+Start at higher than the expected latency for your device and watch the
+avg_lat value in io.stat for your workload group to get an idea of the
+latency you see during normal operation.  Use the avg_lat value as a basis for
+your real setting, setting at 10-15% higher than the value in io.stat.
+
+How IO Latency Throttling Works
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+io.latency is work conserving; so as long as everybody is meeting their latency
+target the controller doesn't do anything.  Once a group starts missing its
+target it begins throttling any peer group that has a higher target than itself.
+This throttling takes 2 forms:
+
+- Queue depth throttling.  This is the number of outstanding IO's a group is
+  allowed to have.  We will clamp down relatively quickly, starting at no limit
+  and going all the way down to 1 IO at a time.
+
+- Artificial delay induction.  There are certain types of IO that cannot be
+  throttled without possibly adversely affecting higher priority groups.  This
+  includes swapping and metadata IO.  These types of IO are allowed to occur
+  normally, however they are "charged" to the originating group.  If the
+  originating group is being throttled you will see the use_delay and delay
+  fields in io.stat increase.  The delay value is how many microseconds that are
+  being added to any process that runs in this group.  Because this number can
+  grow quite large if there is a lot of swapping or metadata IO occurring we
+  limit the individual delay events to 1 second at a time.
+
+Once the victimized group starts meeting its latency target again it will start
+unthrottling any peer groups that were throttled previously.  If the victimized
+group simply stops doing IO the global counter will unthrottle appropriately.
+
+IO Latency Interface Files
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+  io.latency
+	This takes a similar format as the other controllers.
+
+		"MAJOR:MINOR target=<target time in microseconds>"
+
+  io.stat
+	If the controller is enabled you will see extra stats in io.stat in
+	addition to the normal ones.
+
+	  depth
+		This is the current queue depth for the group.
+
+	  avg_lat
+		This is an exponential moving average with a decay rate of 1/exp
+		bound by the sampling interval.  The decay rate interval can be
+		calculated by multiplying the win value in io.stat by the
+		corresponding number of samples based on the win value.
+
+	  win
+		The sampling window size in milliseconds.  This is the minimum
+		duration of time between evaluation events.  Windows only elapse
+		with IO activity.  Idle periods extend the most recent window.
+
 PID
 ---
 
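Following the tuning guidance in the added documentation (watch avg_lat in io.stat, then set the target 10-15% higher), one could pick a target like this; the io.stat line and the helper names are illustrative editor's assumptions, and the depth/avg_lat/win keys only appear when the controller is enabled:

```python
# Sketch: derive an io.latency target from an observed io.stat avg_lat,
# per the "set 10-15% above the value in io.stat" guidance above.
# The sample line below is hypothetical.

def parse_io_stat_line(line):
    """Split one io.stat line into (device, {key: int})."""
    device, *pairs = line.split()
    return device, {k: int(v) for k, v in (p.split("=") for p in pairs)}

def suggested_target(avg_lat_us, headroom=0.15):
    """Target = observed average latency plus 10-15% headroom."""
    return int(avg_lat_us * (1 + headroom))

dev, stats = parse_io_stat_line(
    "8:16 rbytes=1459200 wbytes=314773504 rios=192 wios=353 "
    "dbytes=0 dios=0 depth=8 avg_lat=2000 win=100")
target = suggested_target(stats["avg_lat"])
# One would then write f"{dev} target={target}" to the group's
# io.latency file ("MAJOR:MINOR target=<target time in microseconds>").
print(dev, target)  # 8:16 2300
```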
+7 −0
@@ -85,3 +85,10 @@ shared_tags=[0/1]: Default: 0
  0: Tag set is not shared.
  1: Tag set shared between devices for blk-mq. Only makes sense with
     nr_devices > 1, otherwise there's no tag set to share.
+
+zoned=[0/1]: Default: 0
+  0: Block device is exposed as a random-access block device.
+  1: Block device is exposed as a host-managed zoned block device.
+
+zone_size=[MB]: Default: 256
+  Per zone size when exposed as a zoned block device. Must be a power of two.
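The power-of-two constraint on zone_size above can be checked with the usual bit trick; this is an editor's sketch with illustrative helper names, not part of null_blk itself:

```python
# Sketch: validate a null_blk zone_size (must be a power of two, per the
# parameter description above) and compute how many zones a device of a
# given size would expose.

def is_power_of_two(n):
    # A power of two has exactly one bit set, so n & (n - 1) clears it.
    return n > 0 and (n & (n - 1)) == 0

def zone_count(device_size_mb, zone_size_mb=256):
    if not is_power_of_two(zone_size_mb):
        raise ValueError("zone_size must be a power of two")
    # With zoned=1, null_blk exposes host-managed zones of zone_size MB each.
    return device_size_mb // zone_size_mb

print(zone_count(1024))  # 4 zones of the default 256 MB
```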
+16 −12
@@ -31,28 +31,32 @@ write ticks milliseconds total wait time for write requests
 in_flight       requests      number of I/Os currently in flight
 io_ticks        milliseconds  total time this block device has been active
 time_in_queue   milliseconds  total wait time for all requests
+discard I/Os    requests      number of discard I/Os processed
+discard merges  requests      number of discard I/Os merged with in-queue I/O
+discard sectors sectors       number of sectors discarded
+discard ticks   milliseconds  total wait time for discard requests
 
-read I/Os, write I/Os
-=====================
+read I/Os, write I/Os, discard I/Os
+===================================
 
 These values increment when an I/O request completes.
 
-read merges, write merges
-=========================
+read merges, write merges, discard merges
+=========================================
 
 These values increment when an I/O request is merged with an
 already-queued I/O request.
 
-read sectors, write sectors
-===========================
+read sectors, write sectors, discard sectors
+============================================
 
-These values count the number of sectors read from or written to this
-block device.  The "sectors" in question are the standard UNIX 512-byte
-sectors, not any device- or filesystem-specific block size.  The
-counters are incremented when the I/O completes.
+These values count the number of sectors read from, written to, or
+discarded from this block device.  The "sectors" in question are the
+standard UNIX 512-byte sectors, not any device- or filesystem-specific
+block size.  The counters are incremented when the I/O completes.
 
-read ticks, write ticks
-=======================
+read ticks, write ticks, discard ticks
+======================================
 
 These values count the number of milliseconds that I/O requests have
 waited on this block device.  If there are multiple I/O requests waiting,
+15 −0
@@ -31,6 +31,9 @@ Here are examples of these different formats::
 
      3    0   hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
      3    1   hda1 35486 38030 38030 38030
 
+   4.18+ diskstats:
+      3    0   hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160 0 0 0 0
+
 On 2.4 you might execute ``grep 'hda ' /proc/partitions``. On 2.6+, you have
 a choice of ``cat /sys/block/hda/stat`` or ``grep 'hda ' /proc/diskstats``.
 
@@ -101,6 +104,18 @@ Field 11 -- weighted # of milliseconds spent doing I/Os
     last update of this field.  This can provide an easy measure of both
     I/O completion time and the backlog that may be accumulating.
 
+Field 12 -- # of discards completed
+    This is the total number of discards completed successfully.
+
+Field 13 -- # of discards merged
+    See the description of field 2
+
+Field 14 -- # of sectors discarded
+    This is the total number of sectors discarded successfully.
+
+Field 15 -- # of milliseconds spent discarding
+    This is the total number of milliseconds spent by all discards (as
+    measured from __make_request() to end_that_request_last()).
+
 To avoid introducing performance bottlenecks, no locks are held while
 modifying these counters.  This implies that minor inaccuracies may be
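Since the sector counters above use the standard 512-byte unit and increment on completion, discard throughput between two samples can be estimated as follows; this is an editor's sketch and the sample numbers are made up:

```python
# Sketch: estimate discard throughput from two samples of the
# "# of sectors discarded" counter (Field 14 above).  Sectors are the
# standard UNIX 512-byte unit, regardless of device block size.
SECTOR_BYTES = 512

def discard_bytes_per_sec(sectors_before, sectors_after, interval_s):
    """Bytes discarded per second between two counter samples."""
    return (sectors_after - sectors_before) * SECTOR_BYTES / interval_s

# Hypothetical samples taken 10 seconds apart:
rate = discard_bytes_per_sec(50_000, 151_040, 10.0)
print(rate)  # (151040 - 50000) * 512 / 10 = 5173248.0 bytes/s
```

Because no locks are held while the kernel updates these counters, small sampling inaccuracies are expected, as the text below notes.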