
Commit b854e4de authored by Linus Torvalds

Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:
 "Main RCU changes this cycle were:

   - Full-system idle detection.  This is for use by Frederic
     Weisbecker's adaptive-ticks mechanism.  Its purpose is to allow the
     timekeeping CPU to shut off its tick when all other CPUs are idle.

   - Miscellaneous fixes.

   - Improved rcutorture test coverage.

   - Updated RCU documentation"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
  nohz_full: Force RCU's grace-period kthreads onto timekeeping CPU
  nohz_full: Add full-system-idle state machine
  jiffies: Avoid undefined behavior from signed overflow
  rcu: Simplify _rcu_barrier() processing
  rcu: Make rcutorture emit online failures if verbose
  rcu: Remove unused variable from rcu_torture_writer()
  rcu: Sort rcutorture module parameters
  rcu: Increase rcutorture test coverage
  rcu: Add duplicate-callback tests to rcutorture
  doc: Fix memory-barrier control-dependency example
  rcu: Update RTFP documentation
  nohz_full: Add full-system-idle arguments to API
  nohz_full: Add full-system idle states and variables
  nohz_full: Add per-CPU idle-state tracking
  nohz_full: Add rcu_dyntick data for scalable detection of all-idle state
  nohz_full: Add Kconfig parameter for scalable detection of all-idle state
  nohz_full: Add testing information to documentation
  rcu: Eliminate unused APIs intended for adaptive ticks
  rcu: Select IRQ_WORK from TREE_PREEMPT_RCU
  rculist: list_first_or_null_rcu() should use list_entry_rcu()
  ...
parents 458c3f60 7d992feb
+553 −305

File changed.

Preview size limit exceeded, changes collapsed.

+8 −4
@@ -70,10 +70,14 @@ in realtime kernels in order to avoid excessive scheduling latencies.

rcu_barrier()

-We instead need the rcu_barrier() primitive. This primitive is similar
-to synchronize_rcu(), but instead of waiting solely for a grace
-period to elapse, it also waits for all outstanding RCU callbacks to
-complete. Pseudo-code using rcu_barrier() is as follows:
+We instead need the rcu_barrier() primitive.  Rather than waiting for
+a grace period to elapse, rcu_barrier() waits for all outstanding RCU
+callbacks to complete.  Please note that rcu_barrier() does -not- imply
+synchronize_rcu(), in particular, if there are no RCU callbacks queued
+anywhere, rcu_barrier() is within its rights to return immediately,
+without waiting for a grace period to elapse.
+
+Pseudo-code using rcu_barrier() is as follows:

   1. Prevent any new RCU callbacks from being posted.
   2. Execute rcu_barrier().
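
The two steps shown above can be modelled in plain userspace C.  The sketch below is only a single-threaded toy: every toy_* name is hypothetical, none of it is kernel code, and grace periods are not modelled at all.  It illustrates just the point made in the new text, namely that the barrier drains whatever callbacks are queued and returns immediately when none are.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are illustrative, not kernel APIs. */
struct toy_cb {
	struct toy_cb *next;
	void (*func)(struct toy_cb *cb);
};

static struct toy_cb *toy_head;			/* queued callbacks */
static struct toy_cb **toy_tail = &toy_head;
static bool toy_accepting = true;		/* cleared in step 1 */
static int toy_invoked;				/* callbacks run so far */

/* Stand-in for call_rcu(): queue a callback unless posting is shut off. */
static bool toy_call_rcu(struct toy_cb *cb, void (*func)(struct toy_cb *cb))
{
	assert(cb != NULL && func != NULL);
	if (!toy_accepting)
		return false;
	cb->func = func;
	cb->next = NULL;
	*toy_tail = cb;
	toy_tail = &cb->next;
	return true;
}

/*
 * Stand-in for rcu_barrier(): invoke every queued callback and return
 * how many ran.  With an empty queue it returns 0 at once, mirroring
 * the "within its rights to return immediately" wording.
 */
static int toy_rcu_barrier(void)
{
	int ran = 0;

	while (toy_head != NULL) {
		struct toy_cb *cb = toy_head;

		toy_head = cb->next;
		if (toy_head == NULL)
			toy_tail = &toy_head;
		cb->func(cb);
		ran++;
	}
	return ran;
}

/* Example callback: just counts invocations. */
static void toy_count_cb(struct toy_cb *cb)
{
	(void)cb;
	toy_invoked++;
}

/* Steps 1 and 2 of the pseudo-code, in order. */
static int toy_module_exit(void)
{
	toy_accepting = false;		/* 1. Prevent new callbacks. */
	return toy_rcu_barrier();	/* 2. Wait for outstanding ones. */
}
```

Queueing two callbacks with toy_call_rcu() and then calling toy_module_exit() runs both; a later toy_call_rcu() is refused, and a second toy_rcu_barrier() has nothing to wait for.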
+10 −0
@@ -42,6 +42,16 @@ fqs_holdoff Holdoff time (in microseconds) between consecutive calls
fqs_stutter	Wait time (in seconds) between consecutive bursts
		of calls to force_quiescent_state().

+gp_normal	Make the fake writers use normal synchronous grace-period
+		primitives.
+
+gp_exp		Make the fake writers use expedited synchronous grace-period
+		primitives.  If both gp_normal and gp_exp are set, or
+		if neither gp_normal nor gp_exp are set, then randomly
+		choose the primitive so that about 50% are normal and
+		50% expedited.  By default, neither are set, which
+		gives best overall test coverage.
+
irqreader	Says to invoke RCU readers from irq level.  This is currently
		done via timers.  Defaults to "1" for variants of RCU that
		permit this.  (Or, more accurately, variants of RCU that do
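
The selection rule described for gp_normal and gp_exp in this hunk can be written out as a small decision function.  This is a sketch of the documented behavior only, not the rcutorture source: choose_gp() and enum gp_kind are hypothetical names, and rand() stands in for whatever random source the test module really uses.

```c
#include <stdbool.h>
#include <stdlib.h>

enum gp_kind { GP_NORMAL, GP_EXPEDITED };

/*
 * Hypothetical helper following the documented rule: exactly one flag
 * set forces that primitive; both or neither set means a coin flip,
 * so about half of the grace periods are normal and half expedited.
 */
static enum gp_kind choose_gp(bool gp_normal, bool gp_exp)
{
	if (gp_normal && !gp_exp)
		return GP_NORMAL;
	if (gp_exp && !gp_normal)
		return GP_EXPEDITED;
	/* Both or neither: pick at random, roughly 50/50. */
	return rand() > RAND_MAX / 2 ? GP_EXPEDITED : GP_NORMAL;
}
```

With the default of neither flag set, repeated calls exercise both kinds of primitive, which is why the documentation says the default gives the best overall coverage.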
+6 −4
@@ -531,9 +531,10 @@ dependency barrier to make it work correctly. Consider the following bit of
code:

	q = &a;
	if (p)
		q = &b;
	if (p) {
		<data dependency barrier>
		q = &b;
	}
	x = *q;

This will not have the desired effect because there is no actual data
@@ -542,9 +543,10 @@ attempting to predict the outcome in advance. In such a case what's actually
required is:

	q = &a;
	if (p)
		q = &b;
	if (p) {
		<read barrier>
		q = &b;
	}
	x = *q;


+34 −10
@@ -24,8 +24,8 @@ There are three main ways of managing scheduling-clock interrupts
	workloads, you will normally -not- want this option.

These three cases are described in the following three sections, followed
-by a third section on RCU-specific considerations and a fourth and final
-section listing known issues.
+by a third section on RCU-specific considerations, a fourth section
+discussing testing, and a fifth and final section listing known issues.


NEVER OMIT SCHEDULING-CLOCK TICKS
@@ -121,14 +121,15 @@ boot parameter specifies the adaptive-ticks CPUs. For example,
"nohz_full=1,6-8" says that CPUs 1, 6, 7, and 8 are to be adaptive-ticks
CPUs.  Note that you are prohibited from marking all of the CPUs as
adaptive-tick CPUs:  At least one non-adaptive-tick CPU must remain
-online to handle timekeeping tasks in order to ensure that system calls
-like gettimeofday() returns accurate values on adaptive-tick CPUs.
-(This is not an issue for CONFIG_NO_HZ_IDLE=y because there are no
-running user processes to observe slight drifts in clock rate.)
-Therefore, the boot CPU is prohibited from entering adaptive-ticks
-mode.  Specifying a "nohz_full=" mask that includes the boot CPU will
-result in a boot-time error message, and the boot CPU will be removed
-from the mask.
+online to handle timekeeping tasks in order to ensure that system
+calls like gettimeofday() returns accurate values on adaptive-tick CPUs.
+(This is not an issue for CONFIG_NO_HZ_IDLE=y because there are no running
+user processes to observe slight drifts in clock rate.)  Therefore, the
+boot CPU is prohibited from entering adaptive-ticks mode.  Specifying a
+"nohz_full=" mask that includes the boot CPU will result in a boot-time
+error message, and the boot CPU will be removed from the mask.  Note that
+this means that your system must have at least two CPUs in order for
+CONFIG_NO_HZ_FULL=y to do anything for you.
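
To make the "nohz_full=1,6-8" example and the boot-CPU rule concrete, here is a userspace sketch that expands such a list into a bitmask and then clears the boot CPU's bit.  It is not the kernel's parser: parse_cpu_list() and drop_boot_cpu() are hypothetical helpers, the mask is limited to CPUs 0 through 63, and the caller supplies the boot CPU number (assumed to be 0 in the usage note).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a list such as "1,6-8" into a 64-bit
 * mask.  Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int parse_cpu_list(const char *s, uint64_t *mask)
{
	*mask = 0;
	while (*s != '\0') {
		char *end;
		unsigned long lo, hi;

		if (*s < '0' || *s > '9')
			return -1;
		lo = strtoul(s, &end, 10);
		hi = lo;
		if (*end == '-') {		/* range such as 6-8 */
			s = end + 1;
			if (*s < '0' || *s > '9')
				return -1;
			hi = strtoul(s, &end, 10);
		}
		if (hi < lo || hi > 63)
			return -1;
		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			*mask |= UINT64_C(1) << cpu;
		if (*end == ',') {
			end++;
			if (*end == '\0')	/* trailing comma */
				return -1;
		} else if (*end != '\0') {
			return -1;
		}
		s = end;
	}
	return 0;
}

/*
 * Mimic the documented rule: the boot CPU may not be an adaptive-ticks
 * CPU, so clear its bit.  Returns 1 if the bit had been set (the case
 * in which the text says an error message is printed), else 0.
 */
static int drop_boot_cpu(uint64_t *mask, unsigned int boot_cpu)
{
	uint64_t bit;
	int was_set;

	assert(boot_cpu < 64);
	bit = UINT64_C(1) << boot_cpu;
	was_set = (*mask & bit) != 0;
	*mask &= ~bit;
	return was_set;
}
```

Parsing "1,6-8" sets bits 1, 6, 7, and 8, and drop_boot_cpu() with boot CPU 0 leaves that mask unchanged.  Parsing "0-3" and then dropping CPU 0 leaves CPUs 1 through 3, matching the statement that the boot CPU is removed from the mask.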

Alternatively, the CONFIG_NO_HZ_FULL_ALL=y Kconfig parameter specifies
that all CPUs other than the boot CPU are adaptive-ticks CPUs.  This
@@ -232,6 +233,29 @@ scheduler will decide where to run them, which might or might not be
where you want them to run.


+TESTING
+
+So you enable all the OS-jitter features described in this document,
+but do not see any change in your workload's behavior.  Is this because
+your workload isn't affected that much by OS jitter, or is it because
+something else is in the way?  This section helps answer this question
+by providing a simple OS-jitter test suite, which is available on branch
+master of the following git archive:
+
+git://git.kernel.org/pub/scm/linux/kernel/git/frederic/dynticks-testing.git
+
+Clone this archive and follow the instructions in the README file.
+This test procedure will produce a trace that will allow you to evaluate
+whether or not you have succeeded in removing OS jitter from your system.
+If this trace shows that you have removed OS jitter as much as is
+possible, then you can conclude that your workload is not all that
+sensitive to OS jitter.
+
+Note: this test requires that your system have at least two CPUs.
+We do not currently have a good way to remove OS jitter from single-CPU
+systems.
+
+
KNOWN ISSUES

o	Dyntick-idle slows transitions to and from idle slightly.