
Commit a8460882 authored by Elena Reshetova, committed by Bernhard Thoben

refcount_t: Add ACQUIRE ordering on success for dec(sub)_and_test() variants



This adds an smp_acquire__after_ctrl_dep() barrier on successful
decrease of refcounter value from 1 to 0 for refcount_dec(sub)_and_test
variants and therefore gives stronger memory ordering guarantees than
prior versions of these functions.

Co-developed-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: dvyukov@google.com
Cc: keescook@chromium.org
Cc: stern@rowland.harvard.edu
Link: https://lkml.kernel.org/r/1548847131-27854-2-git-send-email-elena.reshetova@intel.com


Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 47b8f3ab9c49daa824af848f9e02889662d8638f)
parent e830f2cf
+168 −0
===================================
refcount_t API compared to atomic_t
===================================

.. contents:: :local:

Introduction
============

The goal of the refcount_t API is to provide a minimal API for implementing
an object's reference counters. While a generic architecture-independent
implementation from lib/refcount.c uses atomic operations underneath,
there are a number of differences between some of the ``refcount_*()`` and
``atomic_*()`` functions with regard to the memory ordering guarantees.
This document outlines the differences and provides respective examples
in order to help maintainers validate their code against the change in
these memory ordering guarantees.

The terms used throughout this document try to follow the formal LKMM defined in
tools/memory-model/Documentation/explanation.txt.

memory-barriers.txt and atomic_t.txt provide more background to the
memory ordering in general and for atomic operations specifically.

Relevant types of memory ordering
=================================

.. note:: The following section only covers some of the memory
   ordering types that are relevant for the atomics and reference
   counters and are used throughout this document. For a much broader
   picture please consult the memory-barriers.txt document.

In the absence of any memory ordering guarantees (i.e. fully unordered),
atomics & refcounters provide only atomicity and a
program order (po) relation (on the same CPU). This guarantees that
each ``atomic_*()`` and ``refcount_*()`` operation is atomic and that
instructions are executed in program order on a single CPU.
This is implemented using :c:func:`READ_ONCE`/:c:func:`WRITE_ONCE` and
compare-and-swap primitives.

A strong (full) memory ordering guarantees that all prior loads and
stores (all po-earlier instructions) on the same CPU are completed
before any po-later instruction is executed on the same CPU.
It also guarantees that all po-earlier stores on the same CPU
and all propagated stores from other CPUs must propagate to all
other CPUs before any po-later instruction is executed on the original
CPU (A-cumulative property). This is implemented using :c:func:`smp_mb`.

A RELEASE memory ordering guarantees that all prior loads and
stores (all po-earlier instructions) on the same CPU are completed
before the operation. It also guarantees that all po-earlier
stores on the same CPU and all propagated stores from other CPUs
must propagate to all other CPUs before the release operation
(A-cumulative property). This is implemented using
:c:func:`smp_store_release`.
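As an illustration, a user-space C11 sketch of the release/acquire publication pattern (hypothetical names; the kernel uses :c:func:`smp_store_release` rather than C11 atomics):

```c
#include <stdatomic.h>

/* Hypothetical C11 sketch of release ordering: the po-earlier data store
 * is guaranteed to be visible to any thread that observes the flag with
 * acquire semantics, mirroring what smp_store_release() provides. */
static int payload;
static atomic_int ready;

static void publish(int value)
{
    payload = value;                    /* po-earlier store */
    atomic_store_explicit(&ready, 1, memory_order_release);
}

static int consume(void)
{
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;                               /* spin until published */
    return payload;                     /* guaranteed to see the value */
}
```

The single-threaded calls below merely exercise the API; the ordering guarantee matters when publish() and consume() run on different CPUs.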

An ACQUIRE memory ordering guarantees that all later loads and
stores (all po-later instructions) on the same CPU are
completed after the acquire operation. It also guarantees that all
po-later stores on the same CPU must propagate to all other CPUs
after the acquire operation executes. This is implemented using
:c:func:`smp_acquire__after_ctrl_dep`.

A control dependency (on success) for refcounters guarantees that
if a reference for an object was successfully obtained (the reference
counter increment or addition happened and the function returned true),
then further stores are ordered against this operation.
Control dependencies on stores are not implemented using any explicit
barriers, but rely on the CPU not speculating on stores. This is only
a single-CPU relation and provides no guarantees for other CPUs.
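A minimal user-space sketch of the idea (hypothetical names, not kernel code): a store placed inside the branch that is taken only on success is control-dependent on the test, so this CPU cannot commit it before the outcome of the test is known.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical sketch of a control dependency on stores: the store to
 * 'data' sits in the branch taken only when the test succeeds, so the
 * CPU cannot commit it before the load's result is known. */
static atomic_int live;
static int data;

static bool store_if_live(int value)
{
    if (atomic_load_explicit(&live, memory_order_relaxed)) {
        data = value;   /* ordered after the load by the control dependency */
        return true;
    }
    return false;
}
```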


Comparison of functions
=======================

case 1) - non-"Read/Modify/Write" (RMW) ops
-------------------------------------------

Function changes:

 * :c:func:`atomic_set` --> :c:func:`refcount_set`
 * :c:func:`atomic_read` --> :c:func:`refcount_read`

Memory ordering guarantee changes:

 * none (both fully unordered)
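For illustration, a user-space C11 sketch of such fully unordered operations (hypothetical names; the kernel implementation uses :c:func:`READ_ONCE`/:c:func:`WRITE_ONCE`, not C11 atomics):

```c
#include <stdatomic.h>

/* Hypothetical analogue of refcount_set()/refcount_read(): fully
 * unordered, providing only atomicity and program order. */
typedef struct { atomic_uint refs; } myref_t;

static void myref_set(myref_t *r, unsigned int n)
{
    atomic_store_explicit(&r->refs, n, memory_order_relaxed);
}

static unsigned int myref_read(myref_t *r)
{
    return atomic_load_explicit(&r->refs, memory_order_relaxed);
}
```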


case 2) - increment-based ops that return no value
--------------------------------------------------

Function changes:

 * :c:func:`atomic_inc` --> :c:func:`refcount_inc`
 * :c:func:`atomic_add` --> :c:func:`refcount_add`

Memory ordering guarantee changes:

 * none (both fully unordered)

case 3) - decrement-based RMW ops that return no value
------------------------------------------------------

Function changes:

 * :c:func:`atomic_dec` --> :c:func:`refcount_dec`

Memory ordering guarantee changes:

 * fully unordered --> RELEASE ordering
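A user-space C11 sketch of the RELEASE-ordered decrement (hypothetical names, not the kernel code): all earlier accesses to the object are completed before the count visibly drops.

```c
#include <stdatomic.h>

/* Hypothetical analogue of refcount_dec(): the release ordering ensures
 * this CPU's earlier loads and stores to the object complete before the
 * reference is given up. */
static atomic_uint refs;

static void myref_dec(void)
{
    atomic_fetch_sub_explicit(&refs, 1, memory_order_release);
}
```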


case 4) - increment-based RMW ops that return a value
-----------------------------------------------------

Function changes:

 * :c:func:`atomic_inc_not_zero` --> :c:func:`refcount_inc_not_zero`
 * no atomic counterpart --> :c:func:`refcount_add_not_zero`

Memory ordering guarantee changes:

 * fully ordered --> control dependency on success for stores

.. note:: We really assume here that the necessary ordering is provided as a
   result of obtaining a pointer to the object!
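The inc-not-zero pattern can be sketched in user-space C11 as follows (hypothetical names; this is a compare-and-swap loop, not the kernel's implementation). On success, the caller's further stores are ordered by the control dependency on the returned value:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical analogue of refcount_inc_not_zero(): take a reference
 * only if the counter is non-zero, refusing to resurrect a dead object. */
static atomic_uint refs;

static bool myref_inc_not_zero(void)
{
    unsigned int old = atomic_load_explicit(&refs, memory_order_relaxed);

    do {
        if (old == 0)
            return false;   /* object already freed or being freed */
    } while (!atomic_compare_exchange_weak_explicit(&refs, &old, old + 1,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));
    return true;
}
```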


case 5) - generic dec/sub decrement-based RMW ops that return a value
---------------------------------------------------------------------

Function changes:

 * :c:func:`atomic_dec_and_test` --> :c:func:`refcount_dec_and_test`
 * :c:func:`atomic_sub_and_test` --> :c:func:`refcount_sub_and_test`

Memory ordering guarantee changes:

 * fully ordered --> RELEASE ordering + ACQUIRE ordering on success
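A minimal user-space analogue of the new ordering (hypothetical names; the kernel achieves the acquire part with :c:func:`smp_acquire__after_ctrl_dep` rather than a C11 fence): the decrement has release semantics, and the successful 1 -> 0 transition additionally gains acquire semantics so the free path is ordered against every CPU that dropped a reference.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical analogue of refcount_dec_and_test() after this change:
 * RELEASE on the decrement, plus an acquire fence on the 1 -> 0
 * transition, so the caller's free() is ordered after all prior
 * accesses to the object. */
static atomic_uint refs;

static bool myref_dec_and_test(void)
{
    if (atomic_fetch_sub_explicit(&refs, 1, memory_order_release) == 1) {
        atomic_thread_fence(memory_order_acquire);
        return true;    /* caller may now free the object */
    }
    return false;
}
```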


case 6) - other decrement-based RMW ops that return a value
-----------------------------------------------------------

Function changes:

 * no atomic counterpart --> :c:func:`refcount_dec_if_one`
 * ``atomic_add_unless(&var, -1, 1)`` --> ``refcount_dec_not_one(&var)``

Memory ordering guarantee changes:

 * fully ordered --> RELEASE ordering + control dependency

.. note:: :c:func:`atomic_add_unless` only provides full order on success.
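A user-space C11 sketch of the dec-not-one semantics (hypothetical names, not the kernel code): decrement unless the counter is exactly 1, with release ordering on the successful CAS and a control dependency on the returned value.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical analogue of refcount_dec_not_one(): never performs the
 * final 1 -> 0 decrement, leaving that to a dec_and_test()-style path. */
static atomic_uint refs;

static bool myref_dec_not_one(void)
{
    unsigned int old = atomic_load_explicit(&refs, memory_order_relaxed);

    do {
        if (old == 1)
            return false;   /* last reference: caller must handle teardown */
    } while (!atomic_compare_exchange_weak_explicit(&refs, &old, old - 1,
                                                    memory_order_release,
                                                    memory_order_relaxed));
    return true;
}
```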


case 7) - lock-based RMW
------------------------

Function changes:

 * :c:func:`atomic_dec_and_lock` --> :c:func:`refcount_dec_and_lock`
 * :c:func:`atomic_dec_and_mutex_lock` --> :c:func:`refcount_dec_and_mutex_lock`

Memory ordering guarantee changes:

 * fully ordered --> RELEASE ordering + control dependency + hold
   :c:func:`spin_lock` on success
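The lock-based variant can be sketched in user-space C11 as follows (hypothetical names; a test-and-set spinlock stands in for the kernel's :c:func:`spin_lock`): the fast path drops a non-final reference locklessly, and only the final 1 -> 0 transition takes the lock, returning true with the lock still held.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical analogue of refcount_dec_and_lock(); the atomic_flag
 * spinlock below is a user-space stand-in for spin_lock(). */
static atomic_uint refs;
static atomic_flag obj_lock = ATOMIC_FLAG_INIT;

static void obj_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&obj_lock, memory_order_acquire))
        ;   /* spin */
}

static void obj_lock_release(void)
{
    atomic_flag_clear_explicit(&obj_lock, memory_order_release);
}

static bool myref_dec_and_lock(void)
{
    unsigned int old = atomic_load_explicit(&refs, memory_order_relaxed);

    /* fast path: not the last reference, decrement without the lock */
    while (old != 1) {
        if (atomic_compare_exchange_weak_explicit(&refs, &old, old - 1,
                                                  memory_order_release,
                                                  memory_order_relaxed))
            return false;
    }

    obj_lock_acquire();
    if (atomic_fetch_sub_explicit(&refs, 1, memory_order_release) != 1) {
        obj_lock_release();             /* raced with a new reference */
        return false;
    }
    atomic_thread_fence(memory_order_acquire);
    return true;                        /* count hit 0; lock is held */
}
```

On success the caller owns the lock and may tear the object down before releasing it.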
+26 −0
@@ -67,14 +67,40 @@ static __always_inline void refcount_dec(refcount_t *r)
static __always_inline __must_check
bool refcount_sub_and_test(unsigned int i, refcount_t *r)
{
	bool ret = GEN_BINARY_SUFFIXED_RMWcc(LOCK_PREFIX "subl",
					 REFCOUNT_CHECK_LT_ZERO,
					 r->refs.counter, e, "er", i, "cx");

	if (ret) {
		smp_acquire__after_ctrl_dep();
		return true;
	}

	return false;
}

static __always_inline __must_check bool refcount_dec_and_test(refcount_t *r)
{
	bool ret = GEN_UNARY_SUFFIXED_RMWcc(LOCK_PREFIX "decl",
					 REFCOUNT_CHECK_LT_ZERO,
					 r->refs.counter, e, "cx");

	if (ret) {
		smp_acquire__after_ctrl_dep();
		return true;
	}

	return false;
}

static __always_inline __must_check
+13 −5
@@ -32,6 +32,9 @@
 * Note that the allocator is responsible for ordering things between free()
 * and alloc().
 *
 * The decrements dec_and_test() and sub_and_test() also provide acquire
 * ordering on success.
 *
 */

#include <linux/mutex.h>
@@ -163,8 +166,8 @@ EXPORT_SYMBOL(refcount_inc_checked);
 * at UINT_MAX.
 *
 * Provides release memory ordering, such that prior loads and stores are done
 * before, and provides an acquire ordering on success such that free()
 * must come after.
 *
 * Use of this function is not recommended for the normal reference counting
 * use case in which references are taken and released one at a time.  In these
@@ -189,7 +192,12 @@ bool refcount_sub_and_test_checked(unsigned int i, refcount_t *r)

	} while (!atomic_try_cmpxchg_release(&r->refs, &val, new));

	if (!new) {
		smp_acquire__after_ctrl_dep();
		return true;
	}
	return false;

}
EXPORT_SYMBOL(refcount_sub_and_test_checked);

@@ -201,8 +209,8 @@ EXPORT_SYMBOL(refcount_sub_and_test_checked);
 * decrement when saturated at UINT_MAX.
 *
 * Provides release memory ordering, such that prior loads and stores are done
 * before, and provides an acquire ordering on success such that free()
 * must come after.
 *
 * Return: true if the resulting refcount is 0, false otherwise
 */