Commit a9b86fab authored by Ingo Molnar

Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu

Pull v3.7 RCU commits from Paul E. McKenney:

"
0.	A fix for a latent bug that has been in RCU ever since the
        addition of CPU stall warnings.  This bug results in
        false-positive stall warnings, but thus far only on embedded
        systems with severely cut-down userspace configurations.
        This fix is located on an rcu/urgent branch, with the rest
        of the commits based on top of it.  This commit CCs stable.
        Given that the merge window is coming quite soon and given
        the small number of affected users, I do -not- recommend
        pushing it to 3.6, but the separate branch makes it easy to
        find if someone needs it.

1.	Further reductions in latency spikes for huge systems, along
        with additional boot-time adaptation to the actual hardware.
        This is a large change, as it moves RCU grace-period
        initialization and cleanup, along with quiescent-state forcing,
        from softirq to a kthread.  However, it appears to be in
        quite good shape (famous last words).  Posted to LKML at
        https://lkml.org/lkml/2012/9/20/427.

2.	Updates to documentation and rcutorture, the latter category
        including keeping statistics on CPU-hotplug latencies and
        fixing some initialization-time races.  Posted to LKML at
        https://lkml.org/lkml/2012/8/30/193.

3.	Miscellaneous fixes and improvements, posted to LKML at
        https://lkml.org/lkml/2012/8/30/199.

4.	CPU-hotplug fixes and improvements, posted to LKML at
        https://lkml.org/lkml/2012/8/30/292 for first three and at
        https://lkml.org/lkml/2012/8/3/416.

5.	Idle-loop fixes that were omitted on an earlier submission,
        posted to LKML at https://lkml.org/lkml/2012/8/30/251.
"

Signed-off-by: Ingo Molnar <mingo@kernel.org>
parents 9b20aa63 593d1006
+6 −0
@@ -310,6 +310,12 @@ over a rather long period of time, but improvements are always welcome!
	code under the influence of preempt_disable(), you instead
	need to use synchronize_irq() or synchronize_sched().

+	This same limitation also applies to synchronize_rcu_bh()
+	and synchronize_srcu(), as well as to the asynchronous and
+	expedited forms of the three primitives, namely call_rcu(),
+	call_rcu_bh(), call_srcu(), synchronize_rcu_expedited(),
+	synchronize_rcu_bh_expedited(), and synchronize_srcu_expedited().
+
12.	Any lock acquired by an RCU callback must be acquired elsewhere
	with softirq disabled, e.g., via spin_lock_irqsave(),
	spin_lock_bh(), etc.  Failing to disable irq on a given
+8 −8
@@ -99,7 +99,7 @@ In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is
printed:

	INFO: rcu_preempt detected stall on CPU
-	0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer=-1
+	0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer not pending
	   (t=65000 jiffies)

The "(64628 ticks this GP)" indicates that this CPU has taken more
@@ -116,13 +116,13 @@ number between the two "/"s is the value of the nesting, which will
be a small positive number if in the idle loop and a very large positive
number (as shown above) otherwise.

-For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the
-CPU is not in the process of trying to force itself into dyntick-idle
-state, the "." indicates that the CPU has not given up forcing RCU
-into dyntick-idle mode (it would be "H" otherwise), and the "timer=-1"
-indicates that the CPU has not recented forced RCU into dyntick-idle
-mode (it would otherwise indicate the number of microseconds remaining
-in this forced state).
+For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the CPU is
+not in the process of trying to force itself into dyntick-idle state, the
+"." indicates that the CPU has not given up forcing RCU into dyntick-idle
+mode (it would be "H" otherwise), and the "timer not pending" indicates
+that the CPU has not recently forced RCU into dyntick-idle mode (it
+would otherwise indicate the number of microseconds remaining in this
+forced state).
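
As a rough illustration of the field layout described in this hunk, the following Python sketch picks apart one post-patch stall line. The function name parse_stall_line, the dictionary keys, and the handling of a "timer=<number>" form for a pending timer are assumptions made for this example; the text here only shows the "timer not pending" form. The nesting value is read as hexadecimal, as the sample suggests, and the first and third "idle=" fields are kept as raw strings because this excerpt describes only the middle one.

```python
import re

# Hypothetical pattern for the CONFIG_RCU_FAST_NO_HZ per-CPU stall line.
# The "timer=<number>" alternative is an assumption about the pending case.
STALL_LINE = re.compile(
    r"^\s*(?P<cpu>\d+): \((?P<ticks>\d+) ticks this GP\) "
    r"idle=(?P<idle_a>[0-9a-f]+)/(?P<nesting>[0-9a-f]+)/(?P<idle_c>[0-9a-f]+) "
    r"drain=(?P<drain>\d+) (?P<flag>[.H]) "
    r"timer(?: not pending|=(?P<timer_us>-?\d+))\s*$"
)


def parse_stall_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = STALL_LINE.match(line)
    if m is None:
        return None
    timer = m.group("timer_us")
    return {
        "cpu": int(m.group("cpu")),
        "ticks": int(m.group("ticks")),
        # Raw first and third idle= fields, not interpreted here.
        "idle_raw": (m.group("idle_a"), m.group("idle_c")),
        # Assumed hexadecimal: small in the idle loop, very large otherwise.
        "nesting": int(m.group("nesting"), 16),
        "drain": int(m.group("drain")),
        # "H" means the CPU gave up forcing RCU into dyntick-idle mode.
        "gave_up": m.group("flag") == "H",
        # None stands for "timer not pending".
        "timer_us": None if timer is None else int(timer),
    }


sample = "\t0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer not pending"
info = parse_stall_line(sample)
```

For the sample line, info["timer_us"] is None and info["gave_up"] is False, matching the "timer not pending" and "." readings given in the text.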


Multiple Warnings From One Stall
+16 −27
@@ -333,23 +333,23 @@ o Each element of the form "1/1 0:127 ^0" represents one struct
The output of "cat rcu/rcu_pending" looks as follows:

rcu_sched:
-  0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741
-  1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792
-  2 np=237496 qsp=49664 rpq=23 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629
-  3 np=236249 qsp=48766 rpq=98 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723
-  4 np=221310 qsp=46850 rpq=7 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110
-  5 np=237332 qsp=48449 rpq=9 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456
-  6 np=219995 qsp=46718 rpq=12 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834
-  7 np=249893 qsp=49390 rpq=42 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888
+  0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nn=146741
+  1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nn=155792
+  2 np=237496 qsp=49664 rpq=23 cbr=0 cng=2762 gpc=45478 gps=1762 nn=136629
+  3 np=236249 qsp=48766 rpq=98 cbr=0 cng=286 gpc=48049 gps=1218 nn=137723
+  4 np=221310 qsp=46850 rpq=7 cbr=0 cng=26 gpc=43161 gps=4634 nn=123110
+  5 np=237332 qsp=48449 rpq=9 cbr=0 cng=54 gpc=47920 gps=3252 nn=137456
+  6 np=219995 qsp=46718 rpq=12 cbr=0 cng=50 gpc=42098 gps=6093 nn=120834
+  7 np=249893 qsp=49390 rpq=42 cbr=0 cng=72 gpc=38400 gps=17102 nn=144888
rcu_bh:
-  0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314
-  1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180
-  2 np=136629 qsp=18680 rpq=1 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936
-  3 np=137723 qsp=2843 rpq=0 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863
-  4 np=123110 qsp=12433 rpq=0 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671
-  5 np=137456 qsp=4210 rpq=1 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235
-  6 np=120834 qsp=9902 rpq=2 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921
-  7 np=144888 qsp=26336 rpq=0 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542
+  0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nn=145314
+  1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nn=143180
+  2 np=136629 qsp=18680 rpq=1 cbr=0 cng=0 gpc=7 gps=6 nn=117936
+  3 np=137723 qsp=2843 rpq=0 cbr=0 cng=0 gpc=10 gps=7 nn=134863
+  4 np=123110 qsp=12433 rpq=0 cbr=0 cng=0 gpc=4 gps=2 nn=110671
+  5 np=137456 qsp=4210 rpq=1 cbr=0 cng=0 gpc=6 gps=5 nn=133235
+  6 np=120834 qsp=9902 rpq=2 cbr=0 cng=0 gpc=6 gps=3 nn=110921
+  7 np=144888 qsp=26336 rpq=0 cbr=0 cng=0 gpc=8 gps=2 nn=118542
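
To make the post-patch layout of this output concrete (one header line per RCU flavor, then one row of name=value counters per CPU, with "nf" no longer present), here is a minimal Python sketch that turns such text into nested dictionaries. The function name parse_rcu_pending and the returned structure are assumptions for illustration only, and the sketch assumes every row starts with a plain integer CPU number, as in the sample.

```python
def parse_rcu_pending(text):
    """Parse rcu_pending-style text into {flavor: {cpu: {counter: value}}}."""
    flavors = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            # Flavor header such as "rcu_sched:" or "rcu_bh:".
            current = flavors.setdefault(line[:-1], {})
            continue
        if current is None:
            raise ValueError("counter row seen before any flavor header")
        cpu, *pairs = line.split()
        # Each remaining token is assumed to look like "name=integer".
        current[int(cpu)] = {
            name: int(value)
            for name, value in (pair.split("=", 1) for pair in pairs)
        }
    return flavors


# Two CPUs per flavor, copied from the post-patch sample output.
SAMPLE = """\
rcu_sched:
  0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nn=146741
  1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nn=155792
rcu_bh:
  0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nn=145314
  1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nn=143180
"""

stats = parse_rcu_pending(SAMPLE)
```

With the data in this form it is easy to check relationships between counters, for example comparing a CPU's rcu_sched "nn" value against its rcu_bh "np" value.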

As always, this is once again split into "rcu_sched" and "rcu_bh"
portions, with CONFIG_TREE_PREEMPT_RCU kernels having an additional
@@ -377,17 +377,6 @@ o "gpc" is the number of times that an old grace period had
o	"gps" is the number of times that a new grace period had started,
	but this CPU was not yet aware of it.

-o	"nf" is the number of times that this CPU suspected that the
-	current grace period had run for too long, and thus needed to
-	be forced.
-
-	Please note that "forcing" consists of sending resched IPIs
-	to holdout CPUs.  If that CPU really still is in an old RCU
-	read-side critical section, then we really do have to wait for it.
-	The assumption behing "forcing" is that the CPU is not still in
-	an old RCU read-side critical section, but has not yet responded
-	for some other reason.
-
o	"nn" is the number of times that this CPU needed nothing.  Alert
	readers will note that the rcu "nn" number for a given CPU very
	closely matches the rcu_bh "np" number for that same CPU.  This
+7 −2
@@ -873,7 +873,7 @@ d. Do you need to treat NMI handlers, hardirq handlers,
	and code segments with preemption disabled (whether
	via preempt_disable(), local_irq_save(), local_bh_disable(),
	or some other mechanism) as if they were explicit RCU readers?
-	If so, you need RCU-sched.
+	If so, RCU-sched is the only choice that will work for you.

e.	Do you need RCU grace periods to complete even in the face
	of softirq monopolization of one or more of the CPUs?  For
@@ -884,7 +884,12 @@ f. Is your workload too update-intensive for normal use of
	RCU, but inappropriate for other synchronization mechanisms?
	If so, consider SLAB_DESTROY_BY_RCU.  But please be careful!

-g.	Otherwise, use RCU.
+g.	Do you need read-side critical sections that are respected
+	even though they are in the middle of the idle loop, during
+	user-mode execution, or on an offlined CPU?  If so, SRCU is the
+	only choice that will work for you.
+
+h.	Otherwise, use RCU.

Of course, this all assumes that you have determined that RCU is in fact
the right tool for your job.
+11 −0
@@ -2385,6 +2385,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
	rcutree.rcu_cpu_stall_timeout= [KNL,BOOT]
			Set timeout for RCU CPU stall warning messages.

+	rcutree.jiffies_till_first_fqs= [KNL,BOOT]
+			Set delay from grace-period initialization to
+			first attempt to force quiescent states.
+			Units are jiffies, minimum value is zero,
+			and maximum value is HZ.
+
+	rcutree.jiffies_till_next_fqs= [KNL,BOOT]
+			Set delay between subsequent attempts to force
+			quiescent states.  Units are jiffies, minimum
+			value is one, and maximum value is HZ.
+
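The documented ranges for the two new parameters can be sketched in Python as below. This is only an illustration of the bounds stated above: the helper names are hypothetical, HZ = 1000 is an assumed build-time value, and the sketch assumes that out-of-range values are clamped into range rather than rejected, which the text does not state.

```python
HZ = 1000  # Assumed tick rate; the real value depends on the kernel build.


def clamp_first_fqs(jiffies, hz=HZ):
    """Bound jiffies_till_first_fqs to the documented range [0, HZ]."""
    return max(0, min(jiffies, hz))


def clamp_next_fqs(jiffies, hz=HZ):
    """Bound jiffies_till_next_fqs to the documented range [1, HZ]."""
    return max(1, min(jiffies, hz))


# Example requests, including out-of-range ones (clamping is an assumption).
first_examples = [clamp_first_fqs(v) for v in (-5, 0, 3, 5000)]
next_examples = [clamp_next_fqs(v) for v in (-5, 0, 3, 5000)]
```

Note the one asymmetry between the two ranges: a zero delay is allowed before the first attempt to force quiescent states, while the delay between subsequent attempts must be at least one jiffy.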
	rcutorture.fqs_duration= [KNL,BOOT]
			Set duration of force_quiescent_state bursts.
