Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 2ef176f1 authored by Linus Torvalds's avatar Linus Torvalds
Browse files
Pull device mapper fixes from Mike Snitzer:

 - dm-cache memory allocation failure fix
 - fix DM's Kconfig identation
 - dm-snapshot metadata corruption fix for bug introduced in 3.14-rc1
 - important refcount < 0 fix for the DM persistent data library's space
   map metadata interface which fixes corruption reported by a few
   dm-thinp users

and last but not least:

 - more extensive fixes than ideal for dm-thinp's data resize capability
   (which has had growing pain much like we've seen from -ENOSPC
   handling of filesystems that mature).

   The end result is dm-thinp now handles metadata operation failure and
   no data space error conditions much better than before.

* tag 'dm-3.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm space map metadata: fix refcount decrement below 0 which caused corruption
  dm thin: fix Documentation for held metadata root feature
  dm thin: fix noflush suspend IO queueing
  dm thin: fix deadlock in __requeue_bio_list
  dm thin: fix out of data space handling
  dm thin: ensure user takes action to validate data and metadata consistency
  dm thin: synchronize the pool mode during suspend
  dm snapshot: fix metadata corruption
  dm: fix Kconfig indentation
  dm cache mq: fix memory allocation failure for large cache devices
parents b053940d cebc2de4
Loading
Loading
Loading
Loading
+5 −6
Original line number Diff line number Diff line
@@ -124,12 +124,11 @@ the default being 204800 sectors (or 100MB).
Updating on-disk metadata
-------------------------

On-disk metadata is committed every time a REQ_SYNC or REQ_FUA bio is
written.  If no such requests are made then commits will occur every
second.  This means the cache behaves like a physical disk that has a
write cache (the same is true of the thin-provisioning target).  If
power is lost you may lose some recent writes.  The metadata should
always be consistent in spite of any crash.
On-disk metadata is committed every time a FLUSH or FUA bio is written.
If no such requests are made then commits will occur every second.  This
means the cache behaves like a physical disk that has a volatile write
cache.  If power is lost you may lose some recent writes.  The metadata
should always be consistent in spite of any crash.

The 'dirty' state for a cache block changes far too frequently for us
to keep updating it on the fly.  So we treat it as a hint.  In normal
+31 −3
Original line number Diff line number Diff line
@@ -116,6 +116,35 @@ Resuming a device with a new table itself triggers an event so the
userspace daemon can use this to detect a situation where a new table
already exceeds the threshold.

A low water mark for the metadata device is maintained in the kernel and
will trigger a dm event if free space on the metadata device drops below
it.

Updating on-disk metadata
-------------------------

On-disk metadata is committed every time a FLUSH or FUA bio is written.
If no such requests are made then commits will occur every second.  This
means the thin-provisioning target behaves like a physical disk that has
a volatile write cache.  If power is lost you may lose some recent
writes.  The metadata should always be consistent in spite of any crash.

If data space is exhausted the pool will either error or queue IO
according to the configuration (see: error_if_no_space).  If metadata
space is exhausted or a metadata operation fails: the pool will error IO
until the pool is taken offline and repair is performed to 1) fix any
potential inconsistencies and 2) clear the flag that imposes repair.
Once the pool's metadata device is repaired it may be resized, which
will allow the pool to return to normal operation.  Note that if a pool
is flagged as needing repair, the pool's data and metadata devices
cannot be resized until repair is performed.  It should also be noted
that when the pool's metadata space is exhausted the current metadata
transaction is aborted.  Given that the pool will cache IO whose
completion may have already been acknowledged to upper IO layers
(e.g. filesystem) it is strongly suggested that consistency checks
(e.g. fsck) be performed on those layers when repair of the pool is
required.

Thin provisioning
-----------------

@@ -258,10 +287,9 @@ ii) Status
	should register for the event and then check the target's status.

    held metadata root:
	The location, in sectors, of the metadata root that has been
	The location, in blocks, of the metadata root that has been
	'held' for userspace read access.  '-' indicates there is no
	held root.  This feature is not yet implemented so '-' is
	always returned.
	held root.

    discard_passdown|no_discard_passdown
	Whether or not discards are actually being passed down to the
+0 −10
Original line number Diff line number Diff line
@@ -254,16 +254,6 @@ config DM_THIN_PROVISIONING
       ---help---
         Provides thin provisioning and snapshots that share a data store.

config DM_DEBUG_BLOCK_STACK_TRACING
	boolean "Keep stack trace of persistent data block lock holders"
	depends on STACKTRACE_SUPPORT && DM_PERSISTENT_DATA
	select STACKTRACE
	---help---
	  Enable this for messages that may help debug problems with the
	  block manager locking used by thin provisioning and caching.

	  If unsure, say N.

config DM_CACHE
       tristate "Cache target (EXPERIMENTAL)"
       depends on BLK_DEV_DM
+2 −2
Original line number Diff line number Diff line
@@ -872,7 +872,7 @@ static void mq_destroy(struct dm_cache_policy *p)
{
	struct mq_policy *mq = to_mq_policy(p);

	kfree(mq->table);
	vfree(mq->table);
	epool_exit(&mq->cache_pool);
	epool_exit(&mq->pre_cache_pool);
	kfree(mq);
@@ -1245,7 +1245,7 @@ static struct dm_cache_policy *mq_create(dm_cblock_t cache_size,

	mq->nr_buckets = next_power(from_cblock(cache_size) / 2, 16);
	mq->hash_bits = ffs(mq->nr_buckets) - 1;
	mq->table = kzalloc(sizeof(*mq->table) * mq->nr_buckets, GFP_KERNEL);
	mq->table = vzalloc(sizeof(*mq->table) * mq->nr_buckets);
	if (!mq->table)
		goto bad_alloc_table;

+3 −0
Original line number Diff line number Diff line
@@ -546,6 +546,9 @@ static int read_exceptions(struct pstore *ps,
		r = insert_exceptions(ps, area, callback, callback_context,
				      &full);

		if (!full)
			memcpy(ps->area, area, ps->store->chunk_size << SECTOR_SHIFT);

		dm_bufio_release(bp);

		dm_bufio_forget(client, chunk);
Loading