
Commit 48e70bc1 authored by Jens Axboe

Document and move the various READ/WRITE types



It's a somewhat twisty maze of hints and behavioural modifiers, try
and clear it up a bit with some documentation.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
parent f600abe2
+59 −0
@@ -87,6 +87,60 @@ struct inodes_stat_t {
 */
#define FMODE_NOCMTIME		((__force fmode_t)2048)

/*
 * The below are the various read and write types that we support. Some of
 * them include behavioral modifiers that send information down to the
 * block layer and IO scheduler. Terminology:
 *
 *	The block layer uses device plugging to defer IO a little bit, in
 *	the hope that we will see more IO very shortly. This increases
 *	coalescing of adjacent IO and thus reduces the number of IOs we
 *	have to send to the device. It also allows for better queuing,
 *	if the IO isn't mergeable. If the caller is going to be waiting
 *	for the IO, then he must ensure that the device is unplugged so
 *	that the IO is dispatched to the driver.
 *
 *	All IO is handled async in Linux. This is fine for background
 *	writes, but for reads or writes that someone waits for completion
 *	on, we want to notify the block layer and IO scheduler so that they
 *	know about it. That allows them to make better scheduling
 *	decisions. So when the below references 'sync' and 'async', it
 *	is referencing this priority hint.
 *
 * With that in mind, the available types are:
 *
 * READ			A normal read operation. Device will be plugged.
 * READ_SYNC		A synchronous read. Device is not plugged, caller can
 *			immediately wait on this read without caring about
 *			unplugging.
 * READA		Used for read-ahead operations. Lower priority, and the
 *			 block layer could (in theory) choose to ignore this
 *			request if it runs into resource problems.
 * WRITE		A normal async write. Device will be plugged.
 * SWRITE		Like WRITE, but a special case for ll_rw_block() that
 *			tells it to lock the buffer first. Normally a buffer
 *			must be locked before doing IO.
 * WRITE_SYNC_PLUG	Synchronous write. Identical to WRITE, but passes down
 *			the hint that someone will be waiting on this IO
 *			shortly. The device must still be unplugged explicitly,
 *			WRITE_SYNC_PLUG does not do this as we could be
 *			submitting more writes before we actually wait on any
 *			of them.
 * WRITE_SYNC		Like WRITE_SYNC_PLUG, but also unplugs the device
 *			immediately after submission. The write equivalent
 *			of READ_SYNC.
 * WRITE_ODIRECT	Special case write for O_DIRECT only.
 * SWRITE_SYNC
 * SWRITE_SYNC_PLUG	Like WRITE_SYNC/WRITE_SYNC_PLUG, but locks the buffer.
 *			See SWRITE.
 * WRITE_BARRIER	Like WRITE, but tells the block layer that all
 *			previously submitted writes must be safely on storage
 *			before this one is started. Also guarantees that when
 *			this write is complete, it itself is also safely on
 *			storage. Prevents reordering of writes on both sides
 *			of this IO.
 *
 */
#define RW_MASK		1
#define RWA_MASK	2
#define READ 0
@@ -102,6 +156,11 @@ struct inodes_stat_t {
			(SWRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))
#define SWRITE_SYNC	(SWRITE_SYNC_PLUG | (1 << BIO_RW_UNPLUG))
#define WRITE_BARRIER	(WRITE | (1 << BIO_RW_BARRIER))

/*
 * These aren't really reads or writes, they pass down information about
 * parts of device that are now unused by the file system.
 */
#define DISCARD_NOBARRIER (1 << BIO_RW_DISCARD)
#define DISCARD_BARRIER ((1 << BIO_RW_DISCARD) | (1 << BIO_RW_BARRIER))
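
The types documented above are a direction value (READ or WRITE) with hint bits OR-ed on top. The following is a minimal userspace sketch of how such a value could be decoded. It is not kernel code. The bit positions in the enum are hypothetical, since the real BIO_RW_* values are not part of this diff, and the WRITE and SWRITE values are likewise assumed. The rw_* helpers are illustrative names only and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical bit positions, for illustration only. The real values come
 * from the kernel's BIO_RW_* flag enum, which is not shown in this diff.
 * They start at 2 so they do not collide with RW_MASK and RWA_MASK.
 */
enum {
	BIO_RW_BARRIER = 2,
	BIO_RW_SYNCIO  = 3,
	BIO_RW_UNPLUG  = 4,
	BIO_RW_DISCARD = 5,
	BIO_RW_NOIDLE  = 6,
};

/* Visible in the diff. */
#define RW_MASK		1
#define RWA_MASK	2
#define READ		0

/* Assumed values: these definitions are not visible in the hunks above. */
#define WRITE		1
#define SWRITE		3

/* Copied from the diff. */
#define SWRITE_SYNC_PLUG	\
			(SWRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))
#define SWRITE_SYNC	(SWRITE_SYNC_PLUG | (1 << BIO_RW_UNPLUG))
#define WRITE_BARRIER	(WRITE | (1 << BIO_RW_BARRIER))
#define DISCARD_NOBARRIER (1 << BIO_RW_DISCARD)
#define DISCARD_BARRIER ((1 << BIO_RW_DISCARD) | (1 << BIO_RW_BARRIER))

/* Illustrative helpers (hypothetical names): test one property of a type. */
static inline bool rw_is_write(unsigned long rw)
{
	return (rw & RW_MASK) != 0;
}

static inline bool rw_is_sync(unsigned long rw)
{
	return (rw & (1UL << BIO_RW_SYNCIO)) != 0;
}

static inline bool rw_unplugs(unsigned long rw)
{
	return (rw & (1UL << BIO_RW_UNPLUG)) != 0;
}

static inline bool rw_is_barrier(unsigned long rw)
{
	return (rw & (1UL << BIO_RW_BARRIER)) != 0;
}

/* The _PLUG variant carries the sync hint but leaves unplugging to the caller. */
static inline void check_sync_plug_relationship(void)
{
	assert(rw_is_sync(SWRITE_SYNC_PLUG) && !rw_unplugs(SWRITE_SYNC_PLUG));
	assert(rw_is_sync(SWRITE_SYNC) && rw_unplugs(SWRITE_SYNC));
}
```

Under these assumptions, SWRITE_SYNC differs from SWRITE_SYNC_PLUG only in the unplug bit, which matches the comment's description of the _PLUG variants as "sync hint, but the caller unplugs later".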