Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 5928a2b6 authored by Linus Torvalds's avatar Linus Torvalds
Browse files

Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU changes for v3.4 from Ingo Molnar.  The major features of this
series are:

 - making RCU more aggressive about entering dyntick-idle mode in order
   to improve energy efficiency

 - converting a few more call_rcu()s to kfree_rcu()s

 - applying a number of rcutree fixes and cleanups to rcutiny

 - removing CONFIG_SMP #ifdefs from treercu

 - allowing RCU CPU stall times to be set via sysfs

 - adding CPU-stall capability to rcutorture

 - adding more RCU-abuse diagnostics

 - updating documentation

 - fixing yet more issues located by the still-ongoing top-to-bottom
   inspection of RCU, this time with a special focus on the CPU-hotplug
   code path.

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
  rcu: Stop spurious warnings from synchronize_sched_expedited
  rcu: Hold off RCU_FAST_NO_HZ after timer posted
  rcu: Eliminate softirq-mediated RCU_FAST_NO_HZ idle-entry loop
  rcu: Add RCU_NONIDLE() for idle-loop RCU read-side critical sections
  rcu: Allow nesting of rcu_idle_enter() and rcu_idle_exit()
  rcu: Remove redundant check for rcu_head misalignment
  PTR_ERR should be called before its argument is cleared.
  rcu: Convert WARN_ON_ONCE() in rcu_lock_acquire() to lockdep
  rcu: Trace only after NULL-pointer check
  rcu: Call out dangers of expedited RCU primitives
  rcu: Rework detection of use of RCU by offline CPUs
  lockdep: Add CPU-idle/offline warning to lockdep-RCU splat
  rcu: No interrupt disabling for rcu_prepare_for_idle()
  rcu: Move synchronize_sched_expedited() to rcutree.c
  rcu: Check for illegal use of RCU from offlined CPUs
  rcu: Update stall-warning documentation
  rcu: Add CPU-stall capability to rcutorture
  rcu: Make documentation give more realistic rcutorture duration
  rcutorture: Permit holding off CPU-hotplug operations during boot
  rcu: Print scheduling-clock information on RCU CPU stall-warning messages
  ...
parents 5ed59af8 bdd4431c
Loading
Loading
Loading
Loading
+1745 −157

File changed.

Preview size limit exceeded, changes collapsed.

+14 −0
Original line number Diff line number Diff line
@@ -180,6 +180,20 @@ over a rather long period of time, but improvements are always welcome!
	operations that would not normally be undertaken while a real-time
	workload is running.

	In particular, if you find yourself invoking one of the expedited
	primitives repeatedly in a loop, please do everyone a favor:
	Restructure your code so that it batches the updates, allowing
	a single non-expedited primitive to cover the entire batch.
	This will very likely be faster than the loop containing the
	expedited primitive, and will be much much easier on the rest
	of the system, especially to real-time workloads running on
	the rest of the system.

	In addition, it is illegal to call the expedited forms from
	a CPU-hotplug notifier, or while holding a lock that is acquired
	by a CPU-hotplug notifier.  Failing to observe this restriction
	will result in deadlock.

7.	If the updater uses call_rcu() or synchronize_rcu(), then the
	corresponding readers must use rcu_read_lock() and
	rcu_read_unlock().  If the updater uses call_rcu_bh() or
+80 −7
Original line number Diff line number Diff line
@@ -12,14 +12,38 @@ CONFIG_RCU_CPU_STALL_TIMEOUT
	This kernel configuration parameter defines the period of time
	that RCU will wait from the beginning of a grace period until it
	issues an RCU CPU stall warning.  This time period is normally
	ten seconds.
	sixty seconds.

RCU_SECONDS_TILL_STALL_RECHECK
	This configuration parameter may be changed at runtime via the
	/sys/module/rcutree/parameters/rcu_cpu_stall_timeout, however
	this parameter is checked only at the beginning of a cycle.
	So if you are 30 seconds into a 70-second stall, setting this
	sysfs parameter to (say) five will shorten the timeout for the
	-next- stall, or the following warning for the current stall
	(assuming the stall lasts long enough).  It will not affect the
	timing of the next warning for the current stall.

	This macro defines the period of time that RCU will wait after
	issuing a stall warning until it issues another stall warning
	for the same stall.  This time period is normally set to three
	times the check interval plus thirty seconds.
	Stall-warning messages may be enabled and disabled completely via
	/sys/module/rcutree/parameters/rcu_cpu_stall_suppress.

CONFIG_RCU_CPU_STALL_VERBOSE

	This kernel configuration parameter causes the stall warning to
	also dump the stacks of any tasks that are blocking the current
	RCU-preempt grace period.

RCU_CPU_STALL_INFO

	This kernel configuration parameter causes the stall warning to
	print out additional per-CPU diagnostic information, including
	information on scheduling-clock ticks and RCU's idle-CPU tracking.

RCU_STALL_DELAY_DELTA

	Although the lockdep facility is extremely useful, it does add
	some overhead.  Therefore, under CONFIG_PROVE_RCU, the
	RCU_STALL_DELAY_DELTA macro allows five extra seconds before
	giving an RCU CPU stall warning message.

RCU_STALL_RAT_DELAY

@@ -64,6 +88,54 @@ INFO: rcu_bh_state detected stalls on CPUs/tasks: { } (detected by 4, 2502 jiffi

This is rare, but does happen from time to time in real life.

If the CONFIG_RCU_CPU_STALL_INFO kernel configuration parameter is set,
more information is printed with the stall-warning message, for example:

	INFO: rcu_preempt detected stall on CPU
	0: (63959 ticks this GP) idle=241/3fffffffffffffff/0
	   (t=65000 jiffies)

In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is
printed:

	INFO: rcu_preempt detected stall on CPU
	0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer=-1
	   (t=65000 jiffies)

The "(64628 ticks this GP)" indicates that this CPU has taken more
than 64,000 scheduling-clock interrupts during the current stalled
grace period.  If the CPU was not yet aware of the current grace
period (for example, if it was offline), then this part of the message
indicates how many grace periods behind the CPU is.

The "idle=" portion of the message prints the dyntick-idle state.
The hex number before the first "/" is the low-order 12 bits of the
dynticks counter, which will have an even-numbered value if the CPU is
in dyntick-idle mode and an odd-numbered value otherwise.  The hex
number between the two "/"s is the value of the nesting, which will
be a small positive number if in the idle loop and a very large positive
number (as shown above) otherwise.

For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the
CPU is not in the process of trying to force itself into dyntick-idle
state, the "." indicates that the CPU has not given up forcing RCU
into dyntick-idle mode (it would be "H" otherwise), and the "timer=-1"
indicates that the CPU has not recented forced RCU into dyntick-idle
mode (it would otherwise indicate the number of microseconds remaining
in this forced state).


Multiple Warnings From One Stall

If a stall lasts long enough, multiple stall-warning messages will be
printed for it.  The second and subsequent messages are printed at
longer intervals, so that the time between (say) the first and second
message will be about three times the interval between the beginning
of the stall and the first message.


What Causes RCU CPU Stall Warnings?

So your kernel printed an RCU CPU stall warning.  The next question is
"What caused it?"  The following problems can result in RCU CPU stall
warnings:
@@ -128,4 +200,5 @@ is occurring, which will usually be in the function nearest the top of
that portion of the stack which remains the same from trace to trace.
If you can reliably trigger the stall, ftrace can be quite helpful.

RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE.
RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE
and with RCU's event tracing.
+30 −3
Original line number Diff line number Diff line
@@ -69,6 +69,13 @@ onoff_interval
		CPU-hotplug operations regardless of what value is
		specified for onoff_interval.

onoff_holdoff	The number of seconds to wait until starting CPU-hotplug
		operations.  This would normally only be used when
		rcutorture was built into the kernel and started
		automatically at boot time, in which case it is useful
		in order to avoid confusing boot-time code with CPUs
		coming and going.

shuffle_interval
		The number of seconds to keep the test threads affinitied
		to a particular subset of the CPUs, defaults to 3 seconds.
@@ -79,6 +86,24 @@ shutdown_secs The number of seconds to run the test before terminating
		zero, which disables test termination and system shutdown.
		This capability is useful for automated testing.

stall_cpu	The number of seconds that a CPU should be stalled while
		within both an rcu_read_lock() and a preempt_disable().
		This stall happens only once per rcutorture run.
		If you need multiple stalls, use modprobe and rmmod to
		repeatedly run rcutorture.  The default for stall_cpu
		is zero, which prevents rcutorture from stalling a CPU.

		Note that attempts to rmmod rcutorture while the stall
		is ongoing will hang, so be careful what value you
		choose for this module parameter!  In addition, too-large
		values for stall_cpu might well induce failures and
		warnings in other parts of the kernel.  You have been
		warned!

stall_cpu_holdoff
		The number of seconds to wait after rcutorture starts
		before stalling a CPU.  Defaults to 10 seconds.

stat_interval	The number of seconds between output of torture
		statistics (via printk()).  Regardless of the interval,
		statistics are printed when the module is unloaded.
@@ -271,11 +296,13 @@ The following script may be used to torture RCU:
	#!/bin/sh

	modprobe rcutorture
	sleep 100
	sleep 3600
	rmmod rcutorture
	dmesg | grep torture:

The output can be manually inspected for the error flag of "!!!".
One could of course create a more elaborate script that automatically
checked for such errors.  The "rmmod" command forces a "SUCCESS" or
"FAILURE" indication to be printk()ed.
checked for such errors.  The "rmmod" command forces a "SUCCESS",
"FAILURE", or "RCU_HOTPLUG" indication to be printk()ed.  The first
two are self-explanatory, while the last indicates that while there
were no RCU failures, CPU-hotplug problems were detected.
+16 −20
Original line number Diff line number Diff line
@@ -33,23 +33,23 @@ rcu/rcuboost:
The output of "cat rcu/rcudata" looks as follows:

rcu_sched:
  0 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=545/1/0 df=50 of=0 ri=0 ql=163 qs=NRW. kt=0/W/0 ktl=ebc3 b=10 ci=153737 co=0 ca=0
  1 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=967/1/0 df=58 of=0 ri=0 ql=634 qs=NRW. kt=0/W/1 ktl=58c b=10 ci=191037 co=0 ca=0
  2 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=1081/1/0 df=175 of=0 ri=0 ql=74 qs=N.W. kt=0/W/2 ktl=da94 b=10 ci=75991 co=0 ca=0
  3 c=20942 g=20943 pq=1 pgp=20942 qp=1 dt=1846/0/0 df=404 of=0 ri=0 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=72261 co=0 ca=0
  4 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=369/1/0 df=83 of=0 ri=0 ql=48 qs=N.W. kt=0/W/4 ktl=e0e7 b=10 ci=128365 co=0 ca=0
  5 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=381/1/0 df=64 of=0 ri=0 ql=169 qs=NRW. kt=0/W/5 ktl=fb2f b=10 ci=164360 co=0 ca=0
  6 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=1037/1/0 df=183 of=0 ri=0 ql=62 qs=N.W. kt=0/W/6 ktl=d2ad b=10 ci=65663 co=0 ca=0
  7 c=20897 g=20897 pq=1 pgp=20896 qp=0 dt=1572/0/0 df=382 of=0 ri=0 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=75006 co=0 ca=0
  0 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=545/1/0 df=50 of=0 ql=163 qs=NRW. kt=0/W/0 ktl=ebc3 b=10 ci=153737 co=0 ca=0
  1 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=967/1/0 df=58 of=0 ql=634 qs=NRW. kt=0/W/1 ktl=58c b=10 ci=191037 co=0 ca=0
  2 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=1081/1/0 df=175 of=0 ql=74 qs=N.W. kt=0/W/2 ktl=da94 b=10 ci=75991 co=0 ca=0
  3 c=20942 g=20943 pq=1 pgp=20942 qp=1 dt=1846/0/0 df=404 of=0 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=72261 co=0 ca=0
  4 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=369/1/0 df=83 of=0 ql=48 qs=N.W. kt=0/W/4 ktl=e0e7 b=10 ci=128365 co=0 ca=0
  5 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=381/1/0 df=64 of=0 ql=169 qs=NRW. kt=0/W/5 ktl=fb2f b=10 ci=164360 co=0 ca=0
  6 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=1037/1/0 df=183 of=0 ql=62 qs=N.W. kt=0/W/6 ktl=d2ad b=10 ci=65663 co=0 ca=0
  7 c=20897 g=20897 pq=1 pgp=20896 qp=0 dt=1572/0/0 df=382 of=0 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=75006 co=0 ca=0
rcu_bh:
  0 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=545/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/0 ktl=ebc3 b=10 ci=0 co=0 ca=0
  1 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=967/1/0 df=3 of=0 ri=1 ql=0 qs=.... kt=0/W/1 ktl=58c b=10 ci=151 co=0 ca=0
  2 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1081/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/2 ktl=da94 b=10 ci=0 co=0 ca=0
  3 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1846/0/0 df=8 of=0 ri=1 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=0 co=0 ca=0
  4 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=369/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/4 ktl=e0e7 b=10 ci=0 co=0 ca=0
  5 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=381/1/0 df=4 of=0 ri=1 ql=0 qs=.... kt=0/W/5 ktl=fb2f b=10 ci=0 co=0 ca=0
  6 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1037/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/6 ktl=d2ad b=10 ci=0 co=0 ca=0
  7 c=1474 g=1474 pq=1 pgp=1473 qp=0 dt=1572/0/0 df=8 of=0 ri=1 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=0 co=0 ca=0
  0 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=545/1/0 df=6 of=0 ql=0 qs=.... kt=0/W/0 ktl=ebc3 b=10 ci=0 co=0 ca=0
  1 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=967/1/0 df=3 of=0 ql=0 qs=.... kt=0/W/1 ktl=58c b=10 ci=151 co=0 ca=0
  2 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1081/1/0 df=6 of=0 ql=0 qs=.... kt=0/W/2 ktl=da94 b=10 ci=0 co=0 ca=0
  3 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1846/0/0 df=8 of=0 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=0 co=0 ca=0
  4 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=369/1/0 df=6 of=0 ql=0 qs=.... kt=0/W/4 ktl=e0e7 b=10 ci=0 co=0 ca=0
  5 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=381/1/0 df=4 of=0 ql=0 qs=.... kt=0/W/5 ktl=fb2f b=10 ci=0 co=0 ca=0
  6 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1037/1/0 df=6 of=0 ql=0 qs=.... kt=0/W/6 ktl=d2ad b=10 ci=0 co=0 ca=0
  7 c=1474 g=1474 pq=1 pgp=1473 qp=0 dt=1572/0/0 df=8 of=0 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=0 co=0 ca=0

The first section lists the rcu_data structures for rcu_sched, the second
for rcu_bh.  Note that CONFIG_TREE_PREEMPT_RCU kernels will have an
@@ -119,10 +119,6 @@ o "of" is the number of times that some other CPU has forced a
	CPU is offline when it is really alive and kicking) is a fatal
	error, so it makes sense to err conservatively.

o	"ri" is the number of times that RCU has seen fit to send a
	reschedule IPI to this CPU in order to get it to report a
	quiescent state.

o	"ql" is the number of RCU callbacks currently residing on
	this CPU.  This is the total number of callbacks, regardless
	of what state they are in (new, waiting for grace period to
Loading