
Commit f568849e authored by Linus Torvalds

Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block

Pull core block IO changes from Jens Axboe:
 "The major piece in here is the immutable bio_vec series from Kent, the
  rest is fairly minor.  It was supposed to go in last round, but
  various issues pushed it to this release instead.  The pull request
  contains:

   - Various smaller blk-mq fixes from different folks.  Nothing major
     here, just minor fixes and cleanups.

   - Fix for a memory leak in the error path in the block ioctl code
     from Christian Engelmayer.

   - Header export fix from CaiZhiyong.

   - Finally the immutable biovec changes from Kent Overstreet.  This
     enables some nice future work on making arbitrarily sized bios
     possible, and splitting more efficient.  Related fixes to immutable
     bio_vecs:

        - dm-cache immutable fixup from Mike Snitzer.
        - btrfs immutable fixup from Muthu Kumar.

  - bio-integrity fix from Nic Bellinger, which is also going to stable"

* 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits)
  xtensa: fixup simdisk driver to work with immutable bio_vecs
  block/blk-mq-cpu.c: use hotcpu_notifier()
  blk-mq: for_each_* macro correctness
  block: Fix memory leak in rw_copy_check_uvector() handling
  bio-integrity: Fix bio_integrity_verify segment start bug
  block: remove unrelated header files and export symbol
  blk-mq: uses page->list incorrectly
  blk-mq: use __smp_call_function_single directly
  btrfs: fix missing increment of bi_remaining
  Revert "block: Warn and free bio if bi_end_io is not set"
  block: Warn and free bio if bi_end_io is not set
  blk-mq: fix initializing request's start time
  block: blk-mq: don't export blk_mq_free_queue()
  block: blk-mq: make blk_sync_queue support mq
  block: blk-mq: support draining mq queue
  dm cache: increment bi_remaining when bi_end_io is restored
  block: fixup for generic bio chaining
  block: Really silence spurious compiler warnings
  block: Silence spurious compiler warnings
  block: Kill bio_pair_split()
  ...
parents d9894c22 675675ad
+3 −4
@@ -447,14 +447,13 @@ struct bio_vec {
  * main unit of I/O for the block layer and lower layers (ie drivers)
  */
 struct bio {
-       sector_t            bi_sector;
        struct bio          *bi_next;    /* request queue link */
        struct block_device *bi_bdev;	/* target device */
        unsigned long       bi_flags;    /* status, command, etc */
        unsigned long       bi_rw;       /* low bits: r/w, high: priority */
 
        unsigned int	bi_vcnt;     /* how may bio_vec's */
-       unsigned int	bi_idx;		/* current index into bio_vec array */
+       struct bvec_iter	bi_iter;	/* current index into bio_vec array */
 
        unsigned int	bi_size;     /* total size in bytes */
        unsigned short 	bi_phys_segments; /* segments after physaddr coalesce*/
@@ -480,7 +479,7 @@ With this multipage bio design:
 - Code that traverses the req list can find all the segments of a bio
   by using rq_for_each_segment.  This handles the fact that a request
   has multiple bios, each of which can have multiple segments.
-- Drivers which can't process a large bio in one shot can use the bi_idx
+- Drivers which can't process a large bio in one shot can use the bi_iter
   field to keep track of the next bio_vec entry to process.
   (e.g a 1MB bio_vec needs to be handled in max 128kB chunks for IDE)
   [TBD: Should preferably also have a bi_voffset and bi_vlen to avoid modifying
@@ -589,7 +588,7 @@ driver should not modify these values. The block layer sets up the
 nr_sectors and current_nr_sectors fields (based on the corresponding
 hard_xxx values and the number of bytes transferred) and updates it on
 every transfer that invokes end_that_request_first. It does the same for the
-buffer, bio, bio->bi_idx fields too.
+buffer, bio, bio->bi_iter fields too.
 
 The buffer field is just a virtual address mapping of the current segment
 of the i/o buffer in cases where the buffer resides in low-memory. For high
+111 −0

Immutable biovecs and biovec iterators:
=======================================

Kent Overstreet <kmo@daterainc.com>

As of 3.13, biovecs should never be modified after a bio has been submitted.
Instead, we have a new struct bvec_iter which represents a range of a biovec -
the iterator will be modified as the bio is completed, not the biovec.

More specifically, old code that needed to partially complete a bio would
update bi_sector and bi_size, and advance bi_idx to the next biovec. If it
ended up partway through a biovec, it would increment bv_offset and decrement
bv_len by the number of bytes completed in that biovec.

In the new scheme of things, everything that must be mutated in order to
partially complete a bio is segregated into struct bvec_iter: bi_sector,
bi_size and bi_idx have been moved there; and instead of modifying bv_offset
and bv_len, struct bvec_iter has bi_bvec_done, which represents the number of
bytes completed in the current bvec.

There are a bunch of new helper macros for hiding the gory details - in
particular, presenting the illusion of partially completed biovecs so that
normal code doesn't have to deal with bi_bvec_done.

 * Driver code should no longer refer to biovecs directly; we now have
   bio_iovec() and bio_iter_iovec() macros that return literal struct biovecs,
   constructed from the raw biovecs but taking into account bi_bvec_done and
   bi_size.

   bio_for_each_segment() has been updated to take a bvec_iter argument
   instead of an integer (that corresponded to bi_idx); for a lot of code the
   conversion just required changing the types of the arguments to
   bio_for_each_segment().

 * Advancing a bvec_iter is done with bio_advance_iter(); bio_advance() is a
   wrapper around bio_advance_iter() that operates on bio->bi_iter, and also
   advances the bio integrity's iter if present.

   There is a lower level advance function - bvec_iter_advance() - which takes
   a pointer to a biovec, not a bio; this is used by the bio integrity code.

What does all this get us?
==========================

Having a real iterator, and making biovecs immutable, has a number of
advantages:

 * Before, iterating over bios was very awkward when you weren't processing
   exactly one bvec at a time - for example, bio_copy_data() in fs/bio.c,
   which copies the contents of one bio into another. Because the biovecs
   wouldn't necessarily be the same size, the old code was quite convoluted -
   it had to walk two different bios at the same time, keeping both bi_idx and
   an offset into the current biovec for each.

   The new code is much more straightforward - have a look. This sort of
   pattern comes up in a lot of places; a lot of drivers were essentially open
   coding bvec iterators before, and having common implementation considerably
   simplifies a lot of code.

 * Before, any code that might need to use the biovec after the bio had been
   completed (perhaps to copy the data somewhere else, or perhaps to resubmit
   it somewhere else if there was an error) had to save the entire bvec array
   - again, this was being done in a fair number of places.

 * Biovecs can be shared between multiple bios - a bvec iter can represent an
   arbitrary range of an existing biovec, both starting and ending midway
   through biovecs. This is what enables efficient splitting of arbitrary
   bios. Note that this means we _only_ use bi_size to determine when we've
   reached the end of a bio, not bi_vcnt - and the bio_iovec() macro takes
   bi_size into account when constructing biovecs.

 * Splitting bios is now much simpler. The old bio_split() didn't even work on
   bios with more than a single bvec! Now, we can efficiently split arbitrary
   size bios - because the new bio can share the old bio's biovec.

   Care must be taken to ensure the biovec isn't freed while the split bio is
   still using it, in case the original bio completes first, though. Using
   bio_chain() when splitting bios helps with this.

 * Submitting partially completed bios is now perfectly fine - this comes up
   occasionally in stacking block drivers and various code (e.g. md and
   bcache) had some ugly workarounds for this.

   It used to be the case that submitting a partially completed bio would work
   fine to _most_ devices, but since accessing the raw bvec array was the
   norm, not all drivers would respect bi_idx and those would break. Now,
   since all drivers _must_ go through the bvec iterator - and have been
   audited to make sure they are - submitting partially completed bios is
   perfectly fine.

Other implications:
===================

 * Almost all usage of bi_idx is now incorrect and has been removed; instead,
   where previously you would have used bi_idx you'd now use a bvec_iter,
   probably passing it to one of the helper macros.

   I.e. instead of using bio_iovec_idx() (or bio->bi_iovec[bio->bi_idx]), you
   now use bio_iter_iovec(), which takes a bvec_iter and returns a
   literal struct bio_vec - constructed on the fly from the raw biovec but
   taking into account bi_bvec_done (and bi_size).

 * bi_vcnt can't be trusted or relied upon by driver code - i.e. anything that
   doesn't actually own the bio. The reason is twofold: firstly, it's not
   actually needed for iterating over the bio anymore - we only use bi_size.
   Secondly, when cloning a bio and reusing (a portion of) the original bio's
   biovec, in order to calculate bi_vcnt for the new bio we'd have to iterate
   over all the biovecs in the new bio - which is silly as it's not needed.

   So, don't use bi_vcnt anymore.
+7 −6
@@ -62,17 +62,18 @@ struct nfhd_device {
 static void nfhd_make_request(struct request_queue *queue, struct bio *bio)
 {
 	struct nfhd_device *dev = queue->queuedata;
-	struct bio_vec *bvec;
-	int i, dir, len, shift;
-	sector_t sec = bio->bi_sector;
+	struct bio_vec bvec;
+	struct bvec_iter iter;
+	int dir, len, shift;
+	sector_t sec = bio->bi_iter.bi_sector;
 
 	dir = bio_data_dir(bio);
 	shift = dev->bshift;
-	bio_for_each_segment(bvec, bio, i) {
-		len = bvec->bv_len;
+	bio_for_each_segment(bvec, bio, iter) {
+		len = bvec.bv_len;
 		len >>= 9;
 		nfhd_read_write(dev->id, 0, dir, sec >> shift, len >> shift,
-				bvec_to_phys(bvec));
+				bvec_to_phys(&bvec));
 		sec += len;
 	}
 	bio_endio(bio, 0);
+11 −10
@@ -109,27 +109,28 @@ axon_ram_make_request(struct request_queue *queue, struct bio *bio)
 	struct axon_ram_bank *bank = bio->bi_bdev->bd_disk->private_data;
 	unsigned long phys_mem, phys_end;
 	void *user_mem;
-	struct bio_vec *vec;
+	struct bio_vec vec;
 	unsigned int transfered;
-	unsigned short idx;
+	struct bvec_iter iter;
 
-	phys_mem = bank->io_addr + (bio->bi_sector << AXON_RAM_SECTOR_SHIFT);
+	phys_mem = bank->io_addr + (bio->bi_iter.bi_sector <<
+				    AXON_RAM_SECTOR_SHIFT);
 	phys_end = bank->io_addr + bank->size;
 	transfered = 0;
-	bio_for_each_segment(vec, bio, idx) {
-		if (unlikely(phys_mem + vec->bv_len > phys_end)) {
+	bio_for_each_segment(vec, bio, iter) {
+		if (unlikely(phys_mem + vec.bv_len > phys_end)) {
 			bio_io_error(bio);
 			return;
 		}
 
-		user_mem = page_address(vec->bv_page) + vec->bv_offset;
+		user_mem = page_address(vec.bv_page) + vec.bv_offset;
 		if (bio_data_dir(bio) == READ)
-			memcpy(user_mem, (void *) phys_mem, vec->bv_len);
+			memcpy(user_mem, (void *) phys_mem, vec.bv_len);
 		else
-			memcpy((void *) phys_mem, user_mem, vec->bv_len);
+			memcpy((void *) phys_mem, user_mem, vec.bv_len);
 
-		phys_mem += vec->bv_len;
-		transfered += vec->bv_len;
+		phys_mem += vec.bv_len;
+		transfered += vec.bv_len;
 	}
 	bio_endio(bio, 0);
 }
+7 −7
@@ -103,18 +103,18 @@ static void simdisk_transfer(struct simdisk *dev, unsigned long sector,
 
 static int simdisk_xfer_bio(struct simdisk *dev, struct bio *bio)
 {
-	int i;
-	struct bio_vec *bvec;
-	sector_t sector = bio->bi_sector;
+	struct bio_vec bvec;
+	struct bvec_iter iter;
+	sector_t sector = bio->bi_iter.bi_sector;
 
-	bio_for_each_segment(bvec, bio, i) {
-		char *buffer = __bio_kmap_atomic(bio, i);
-		unsigned len = bvec->bv_len >> SECTOR_SHIFT;
+	bio_for_each_segment(bvec, bio, iter) {
+		char *buffer = __bio_kmap_atomic(bio, iter);
+		unsigned len = bvec.bv_len >> SECTOR_SHIFT;
 
 		simdisk_transfer(dev, sector, len, buffer,
 				bio_data_dir(bio) == WRITE);
 		sector += len;
-		__bio_kunmap_atomic(bio);
+		__bio_kunmap_atomic(buffer);
 	}
 	return 0;
 }