Commit 620e7753 authored by Linus Torvalds

Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU changes from Ingo Molnar:

 0. 'idle RCU':

     Adds RCU APIs that allow non-idle tasks to enter RCU idle mode and
     provides x86 code to make use of them, allowing RCU to treat
     user-mode execution as an extended quiescent state when the new
     RCU_USER_QS kernel configuration parameter is specified.  (Work is
     in progress to port this to a few other architectures, but is not
     part of this series.)

 1.  A fix for a latent bug that has been in RCU ever since the addition
     of CPU stall warnings.  This bug results in false-positive stall
     warnings, but thus far only on embedded systems with severely
     cut-down userspace configurations.

 2.  Further reductions in latency spikes for huge systems, along with
     additional boot-time adaptation to the actual hardware.

     This is a large change, as it moves RCU grace-period initialization
     and cleanup, along with quiescent-state forcing, from softirq to a
     kthread.  However, it appears to be in quite good shape (famous
     last words).

 3.  Updates to documentation and rcutorture, the latter category
     including keeping statistics on CPU-hotplug latencies and fixing
     some initialization-time races.

 4.  CPU-hotplug fixes and improvements.

 5.  Idle-loop fixes that were omitted from an earlier submission.

 6.  Miscellaneous fixes and improvements.

In certain RCU configurations new kernel threads will show up (rcu_bh,
rcu_sched), showing RCU processing overhead.

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (90 commits)
  rcu: Apply micro-optimization and int/bool fixes to RCU's idle handling
  rcu: Userspace RCU extended QS selftest
  x86: Exit RCU extended QS on notify resume
  x86: Use the new schedule_user API on userspace preemption
  rcu: Exit RCU extended QS on user preemption
  rcu: Exit RCU extended QS on kernel preemption after irq/exception
  x86: Exception hooks for userspace RCU extended QS
  x86: Unspaghettize do_general_protection()
  x86: Syscall hooks for userspace RCU extended QS
  rcu: Switch task's syscall hooks on context switch
  rcu: Ignore userspace extended quiescent state by default
  rcu: Allow rcu_user_enter()/exit() to nest
  rcu: Settle config for userspace extended quiescent state
  rcu: Make RCU_FAST_NO_HZ handle adaptive ticks
  rcu: New rcu_user_enter_after_irq() and rcu_user_exit_after_irq() APIs
  rcu: New rcu_user_enter() and rcu_user_exit() APIs
  ia64: Add missing RCU idle APIs on idle loop
  xtensa: Add missing RCU idle APIs on idle loop
  score: Add missing RCU idle APIs on idle loop
  parisc: Add missing RCU idle APIs on idle loop
  ...
parents 6977b4c7 fa34da70
+6 −0
@@ -310,6 +310,12 @@ over a rather long period of time, but improvements are always welcome!
 	code under the influence of preempt_disable(), you instead
 	need to use synchronize_irq() or synchronize_sched().
 
+	This same limitation also applies to synchronize_rcu_bh()
+	and synchronize_srcu(), as well as to the asynchronous and
+	expedited forms of the three primitives, namely call_rcu(),
+	call_rcu_bh(), call_srcu(), synchronize_rcu_expedited(),
+	synchronize_rcu_bh_expedited(), and synchronize_srcu_expedited().
+
 12.	Any lock acquired by an RCU callback must be acquired elsewhere
 	with softirq disabled, e.g., via spin_lock_irqsave(),
 	spin_lock_bh(), etc.  Failing to disable irq on a given
+8 −8
@@ -99,7 +99,7 @@ In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is
 printed:
 
 	INFO: rcu_preempt detected stall on CPU
-	0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer=-1
+	0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer not pending
 	   (t=65000 jiffies)
 
 The "(64628 ticks this GP)" indicates that this CPU has taken more
@@ -116,13 +116,13 @@ number between the two "/"s is the value of the nesting, which will
 be a small positive number if in the idle loop and a very large positive
 number (as shown above) otherwise.
 
-For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the
-CPU is not in the process of trying to force itself into dyntick-idle
-state, the "." indicates that the CPU has not given up forcing RCU
-into dyntick-idle mode (it would be "H" otherwise), and the "timer=-1"
-indicates that the CPU has not recented forced RCU into dyntick-idle
-mode (it would otherwise indicate the number of microseconds remaining
-in this forced state).
+For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the CPU is
+not in the process of trying to force itself into dyntick-idle state, the
+"." indicates that the CPU has not given up forcing RCU into dyntick-idle
+mode (it would be "H" otherwise), and the "timer not pending" indicates
+that the CPU has not recently forced RCU into dyntick-idle mode (it
+would otherwise indicate the number of microseconds remaining in this
+forced state).
 
 
 Multiple Warnings From One Stall
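
The per-CPU stall-warning line described in the hunks above can be pulled apart mechanically. The following is only a minimal sketch, not part of the patch: the function name `parse_stall_line` and the dictionary keys are hypothetical, the middle "idle=" field is assumed to be hexadecimal (as the sample value suggests), and a pending timer is assumed to be printed as "timer=<number>", following the shape of the removed "timer=-1" line.

```python
import re

# Matches the per-CPU line of a CONFIG_RCU_FAST_NO_HZ stall warning.
# Accepts both the post-patch "timer not pending" form and a
# "timer=<number>" form (assumed from the pre-patch sample line).
STALL_LINE = re.compile(
    r"^\s*(?P<cpu>\d+): \((?P<ticks>\d+) ticks this GP\) "
    r"idle=(?P<idle>[0-9a-f]+/[0-9a-f]+/[0-9a-f]+) "
    r"drain=(?P<drain>\d+) (?P<flag>[.H]) "
    r"timer(?: not pending|=(?P<timer>-?\d+))\s*$"
)


def parse_stall_line(line):
    """Return a dict of fields from one stall line, or None if no match."""
    m = STALL_LINE.match(line)
    if m is None:
        return None
    idle = m.group("idle").split("/")
    timer = m.group("timer")
    return {
        "cpu": int(m.group("cpu")),
        "ticks_this_gp": int(m.group("ticks")),
        # The number between the two "/"s is the nesting value
        # (parsed as hex here, which is an assumption).
        "nesting": int(idle[1], 16),
        "drain": int(m.group("drain")),
        # "H" means the CPU gave up forcing RCU into dyntick-idle mode.
        "gave_up_forcing": m.group("flag") == "H",
        # None stands for "timer not pending".
        "timer": None if timer is None else int(timer),
    }


sample = ("\t0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 "
          "drain=0 . timer not pending")
fields = parse_stall_line(sample)
```

With the sample line from the hunk, `fields["timer"]` is `None` and `fields["gave_up_forcing"]` is `False`, matching the "timer not pending" and "." readings given in the documentation text.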
+16 −27
@@ -333,23 +333,23 @@ o Each element of the form "1/1 0:127 ^0" represents one struct
 The output of "cat rcu/rcu_pending" looks as follows:
 
 rcu_sched:
-  0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741
-  1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792
-  2 np=237496 qsp=49664 rpq=23 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629
-  3 np=236249 qsp=48766 rpq=98 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723
-  4 np=221310 qsp=46850 rpq=7 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110
-  5 np=237332 qsp=48449 rpq=9 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456
-  6 np=219995 qsp=46718 rpq=12 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834
-  7 np=249893 qsp=49390 rpq=42 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888
+  0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nn=146741
+  1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nn=155792
+  2 np=237496 qsp=49664 rpq=23 cbr=0 cng=2762 gpc=45478 gps=1762 nn=136629
+  3 np=236249 qsp=48766 rpq=98 cbr=0 cng=286 gpc=48049 gps=1218 nn=137723
+  4 np=221310 qsp=46850 rpq=7 cbr=0 cng=26 gpc=43161 gps=4634 nn=123110
+  5 np=237332 qsp=48449 rpq=9 cbr=0 cng=54 gpc=47920 gps=3252 nn=137456
+  6 np=219995 qsp=46718 rpq=12 cbr=0 cng=50 gpc=42098 gps=6093 nn=120834
+  7 np=249893 qsp=49390 rpq=42 cbr=0 cng=72 gpc=38400 gps=17102 nn=144888
 rcu_bh:
-  0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314
-  1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180
-  2 np=136629 qsp=18680 rpq=1 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936
-  3 np=137723 qsp=2843 rpq=0 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863
-  4 np=123110 qsp=12433 rpq=0 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671
-  5 np=137456 qsp=4210 rpq=1 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235
-  6 np=120834 qsp=9902 rpq=2 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921
-  7 np=144888 qsp=26336 rpq=0 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542
+  0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nn=145314
+  1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nn=143180
+  2 np=136629 qsp=18680 rpq=1 cbr=0 cng=0 gpc=7 gps=6 nn=117936
+  3 np=137723 qsp=2843 rpq=0 cbr=0 cng=0 gpc=10 gps=7 nn=134863
+  4 np=123110 qsp=12433 rpq=0 cbr=0 cng=0 gpc=4 gps=2 nn=110671
+  5 np=137456 qsp=4210 rpq=1 cbr=0 cng=0 gpc=6 gps=5 nn=133235
+  6 np=120834 qsp=9902 rpq=2 cbr=0 cng=0 gpc=6 gps=3 nn=110921
+  7 np=144888 qsp=26336 rpq=0 cbr=0 cng=0 gpc=8 gps=2 nn=118542
 
 As always, this is once again split into "rcu_sched" and "rcu_bh"
 portions, with CONFIG_TREE_PREEMPT_RCU kernels having an additional
@@ -377,17 +377,6 @@ o "gpc" is the number of times that an old grace period had
 o	"gps" is the number of times that a new grace period had started,
 	but this CPU was not yet aware of it.
 
-o	"nf" is the number of times that this CPU suspected that the
-	current grace period had run for too long, and thus needed to
-	be forced.
-
-	Please note that "forcing" consists of sending resched IPIs
-	to holdout CPUs.  If that CPU really still is in an old RCU
-	read-side critical section, then we really do have to wait for it.
-	The assumption behing "forcing" is that the CPU is not still in
-	an old RCU read-side critical section, but has not yet responded
-	for some other reason.
-
 o	"nn" is the number of times that this CPU needed nothing.  Alert
 	readers will note that the rcu "nn" number for a given CPU very
 	closely matches the rcu_bh "np" number for that same CPU.  This
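
The "rcu/rcu_pending" layout shown in the hunks above (a flavor header followed by one line of name=value counters per CPU) lends itself to a small parser. This is a rough sketch under the assumption that every line looks like the post-patch sample; the function name `parse_rcu_pending` and the nested-dictionary result are hypothetical and not part of the patch.

```python
def parse_rcu_pending(text):
    """Return {flavor: {cpu: {counter: value}}} for rcu_pending-style text."""
    result = {}
    flavor = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            # Flavor header such as "rcu_sched:" or "rcu_bh:".
            flavor = line[:-1]
            result[flavor] = {}
            continue
        if flavor is None:
            continue  # ignore anything before the first flavor header
        cpu, *pairs = line.split()
        counters = {}
        for pair in pairs:
            name, value = pair.split("=", 1)
            counters[name] = int(value)
        result[flavor][int(cpu)] = counters
    return result


# First two CPUs of each flavor, copied from the post-patch sample output.
SAMPLE = """\
rcu_sched:
  0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nn=146741
  1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nn=155792
rcu_bh:
  0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nn=145314
  1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nn=143180
"""

stats = parse_rcu_pending(SAMPLE)
```

In this sample, `stats["rcu_sched"][0]["nn"]` and `stats["rcu_bh"][0]["np"]` are both 146741, which is the close match between the two counters that the "nn" description points out. The "nf" counter no longer appears because the patch removes it from the output.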
+7 −2
@@ -873,7 +873,7 @@ d. Do you need to treat NMI handlers, hardirq handlers,
 	and code segments with preemption disabled (whether
 	via preempt_disable(), local_irq_save(), local_bh_disable(),
 	or some other mechanism) as if they were explicit RCU readers?
-	If so, you need RCU-sched.
+	If so, RCU-sched is the only choice that will work for you.
 
 e.	Do you need RCU grace periods to complete even in the face
 	of softirq monopolization of one or more of the CPUs?  For
@@ -884,7 +884,12 @@ f. Is your workload too update-intensive for normal use of
 	RCU, but inappropriate for other synchronization mechanisms?
 	If so, consider SLAB_DESTROY_BY_RCU.  But please be careful!
 
-g.	Otherwise, use RCU.
+g.	Do you need read-side critical sections that are respected
+	even though they are in the middle of the idle loop, during
+	user-mode execution, or on an offlined CPU?  If so, SRCU is the
+	only choice that will work for you.
+
+h.	Otherwise, use RCU.
 
 Of course, this all assumes that you have determined that RCU is in fact
 the right tool for your job.
+11 −0
@@ -2385,6 +2385,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	rcutree.rcu_cpu_stall_timeout= [KNL,BOOT]
 			Set timeout for RCU CPU stall warning messages.
 
+	rcutree.jiffies_till_first_fqs= [KNL,BOOT]
+			Set delay from grace-period initialization to
+			first attempt to force quiescent states.
+			Units are jiffies, minimum value is zero,
+			and maximum value is HZ.
+
+	rcutree.jiffies_till_next_fqs= [KNL,BOOT]
+			Set delay between subsequent attempts to force
+			quiescent states.  Units are jiffies, minimum
+			value is one, and maximum value is HZ.
+
 	rcutorture.fqs_duration= [KNL,BOOT]
 			Set duration of force_quiescent_state bursts.
 
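
The documented ranges for the two new boot parameters can be written down as a small range check. This is only an illustration of the limits stated above (zero to HZ for the first delay, one to HZ for subsequent delays). The helper name `clamp_fqs_delays` is hypothetical, and clamping out-of-range values rather than rejecting them is an assumption, since the documentation text gives only the minimum and maximum.

```python
def clamp_fqs_delays(first_fqs, next_fqs, hz):
    """Clamp both delays (in jiffies) to the documented ranges.

    first_fqs: requested rcutree.jiffies_till_first_fqs, range 0..HZ
    next_fqs:  requested rcutree.jiffies_till_next_fqs,  range 1..HZ
    hz:        the kernel's HZ value (jiffies per second)
    """
    first = max(0, min(first_fqs, hz))
    nxt = max(1, min(next_fqs, hz))
    return first, nxt


# With HZ=1000, a request of (0, 0) keeps the first delay at zero but
# raises the subsequent delay to its minimum of one jiffy.
delays = clamp_fqs_delays(0, 0, 1000)
```

Because the units are jiffies, the same numeric setting corresponds to a different wall-clock delay on kernels built with different HZ values.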