Commit 8ff4fbfd authored by Paul E. McKenney

Merge branches 'fixes.2015.07.22a' and 'initexp.2015.08.04a' into HEAD

fixes.2015.07.22a: Miscellaneous fixes.
initexp.2015.08.04a: Initialization and expedited updates.
	(Single branch due to conflicts.)
parents 9a54f98e af859bea
+19 −10
@@ -26,12 +26,6 @@ CONFIG_RCU_CPU_STALL_TIMEOUT
 	Stall-warning messages may be enabled and disabled completely via
 	/sys/module/rcupdate/parameters/rcu_cpu_stall_suppress.

-CONFIG_RCU_CPU_STALL_INFO
-
-	This kernel configuration parameter causes the stall warning to
-	print out additional per-CPU diagnostic information, including
-	information on scheduling-clock ticks and RCU's idle-CPU tracking.
-
 RCU_STALL_DELAY_DELTA

 	Although the lockdep facility is extremely useful, it does add
@@ -101,15 +95,13 @@ interact. Please note that it is not possible to entirely eliminate this
 sort of false positive without resorting to things like stop_machine(),
 which is overkill for this sort of problem.

-If the CONFIG_RCU_CPU_STALL_INFO kernel configuration parameter is set,
-more information is printed with the stall-warning message, for example:
+Recent kernels will print a long form of the stall-warning message:

 	INFO: rcu_preempt detected stall on CPU
 	0: (63959 ticks this GP) idle=241/3fffffffffffffff/0 softirq=82/543
 	   (t=65000 jiffies)

-In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is
-printed:
+In kernels with CONFIG_RCU_FAST_NO_HZ, more information is printed:

 	INFO: rcu_preempt detected stall on CPU
 	0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 softirq=82/543 last_accelerate: a345/d342 nonlazy_posted: 25 .D
@@ -171,6 +163,23 @@ message will be about three times the interval between the beginning
 of the stall and the first message.


+Stall Warnings for Expedited Grace Periods
+
+If an expedited grace period detects a stall, it will place a message
+like the following in dmesg:
+
+	INFO: rcu_sched detected expedited stalls on CPUs: { 1 2 6 } 26009 jiffies s: 1043
+
+This indicates that CPUs 1, 2, and 6 have failed to respond to a
+reschedule IPI, that the expedited grace period has been going on for
+26,009 jiffies, and that the expedited grace-period sequence counter is
+1043.  The fact that this last value is odd indicates that an expedited
+grace period is in flight.
+
+It is entirely possible to see stall warnings from normal and from
+expedited grace periods at about the same time from the same run.
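
As a reading aid for the expedited stall message described above, the
following user-space C sketch extracts the CPU list, the jiffies count,
and the sequence counter from one such line, then applies the rule that
an odd sequence value means a grace period is in flight.  It is not
kernel code and is not part of this commit: struct exp_stall,
parse_exp_stall(), exp_gp_in_flight(), and the 64-CPU limit are
hypothetical names and values chosen for illustration, and the parser
assumes exactly the layout of the sample line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STALLED_CPUS 64	/* Illustrative limit, not a kernel constant. */

struct exp_stall {
	int cpus[MAX_STALLED_CPUS];	/* CPUs listed between the braces. */
	int ncpus;			/* Number of valid entries in cpus[]. */
	unsigned long jiffies;		/* Age of the expedited grace period. */
	unsigned long seq;		/* Expedited grace-period sequence counter. */
};

/* Parse one "detected expedited stalls" line.  Returns true on success. */
static bool parse_exp_stall(const char *line, struct exp_stall *es)
{
	const char *p;
	char *end;

	assert(line && es);
	p = strchr(line, '{');
	if (!p)
		return false;
	p++;

	/* Collect the space-separated CPU numbers up to the closing brace. */
	es->ncpus = 0;
	for (;;) {
		long cpu = strtol(p, &end, 10);

		if (end == p)
			break;		/* No more numbers. */
		if (es->ncpus >= MAX_STALLED_CPUS)
			return false;
		es->cpus[es->ncpus++] = (int)cpu;
		p = end;
	}
	while (*p == ' ')
		p++;
	if (*p != '}')
		return false;

	/* The remainder looks like " 26009 jiffies s: 1043". */
	return sscanf(p + 1, " %lu jiffies s: %lu",
		      &es->jiffies, &es->seq) == 2;
}

/* An odd sequence value means an expedited grace period is in flight. */
static bool exp_gp_in_flight(const struct exp_stall *es)
{
	return (es->seq & 1) != 0;
}
```

Fed the sample line above, parse_exp_stall() yields CPUs 1, 2, and 6,
a jiffies value of 26009, and a sequence value of 1043, for which
exp_gp_in_flight() returns true.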
+
+
 What Causes RCU CPU Stall Warnings?

 So your kernel printed an RCU CPU stall warning.  The next question is
+10 −26
@@ -237,42 +237,26 @@ o "ktl" is the low-order 16 bits (in hexadecimal) of the count of

 The output of "cat rcu/rcu_preempt/rcuexp" looks as follows:

-s=21872 d=21872 w=0 tf=0 wd1=0 wd2=0 n=0 sc=21872 dt=21872 dl=0 dx=21872
+s=21872 wd0=0 wd1=0 wd2=0 wd3=5 n=0 enq=0 sc=21872

 These fields are as follows:

-o	"s" is the starting sequence number.
+o	"s" is the sequence number, with an odd number indicating that
+	an expedited grace period is in progress.

-o	"d" is the ending sequence number.  When the starting and ending
-	numbers differ, there is an expedited grace period in progress.
-
-o	"w" is the number of times that the sequence numbers have been
-	in danger of wrapping.
-
-o	"tf" is the number of times that contention has resulted in a
-	failure to begin an expedited grace period.
-
-o	"wd1" and "wd2" are the number of times that an attempt to
-	start an expedited grace period found that someone else had
-	completed an expedited grace period that satisfies the
+o	"wd0", "wd1", "wd2", and "wd3" are the number of times that an
+	attempt to start an expedited grace period found that someone
+	else had completed an expedited grace period that satisfies the
 	attempted request.  "Our work is done."

-o	"n" is number of times that contention was so great that
-	the request was demoted from an expedited grace period to
-	a normal grace period.
+o	"n" is number of times that a concurrent CPU-hotplug operation
+	forced a fallback to a normal grace period.

+o	"enq" is the number of quiescent states still outstanding.
+
 o	"sc" is the number of times that the attempt to start a
 	new expedited grace period succeeded.
-
-o	"dt" is the number of times that we attempted to update
-	the "d" counter.
-
-o	"dl" is the number of times that we failed to update the "d"
-	counter.
-
-o	"dx" is the number of times that we succeeded in updating
-	the "d" counter.
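
As a reading aid for the new rcuexp format, the following user-space C
sketch looks up one field by name in such a line and uses "s" to decide
whether an expedited grace period is in progress.  It is not part of
this commit: rcuexp_field() and rcuexp_gp_in_progress() are hypothetical
helpers, and the parser assumes space-separated "name=value" tokens with
decimal values, as in the sample line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find the token "name=value" in a line of space-separated tokens and
 * store the decimal value in *val.  Returns false if name is absent.
 */
static bool rcuexp_field(const char *line, const char *name,
			 unsigned long *val)
{
	size_t len = strlen(name);
	const char *p = line;

	assert(line && name && val);
	while (*p) {
		while (*p == ' ')
			p++;
		/* Require '=' right after the name so "s" does not match "sc". */
		if (strncmp(p, name, len) == 0 && p[len] == '=') {
			const char *num = p + len + 1;
			char *end;

			*val = strtoul(num, &end, 10);
			return end != num;
		}
		while (*p && *p != ' ')
			p++;	/* Skip the rest of this token. */
	}
	return false;
}

/* An odd "s" value means an expedited grace period is in progress. */
static bool rcuexp_gp_in_progress(const char *line)
{
	unsigned long s;

	return rcuexp_field(line, "s", &s) && (s & 1) != 0;
}
```

For the sample line, "s" is 21872, which is even, so no expedited grace
period is in progress, and "wd3" is 5.  A lookup of one of the removed
fields, such as "dx", fails on the new format.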


 The output of "cat rcu/rcu_preempt/rcugp" looks as follows:

+0 −1
@@ -661,7 +661,6 @@ TRACE_EVENT(rcu_torture_read,
  * Tracepoint for _rcu_barrier() execution.  The string "s" describes
  * the _rcu_barrier phase:
  *	"Begin": _rcu_barrier() started.
- *	"Check": _rcu_barrier() checking for piggybacking.
  *	"EarlyExit": _rcu_barrier() piggybacked, thus early exit.
  *	"Inc1": _rcu_barrier() piggyback check counter incremented.
  *	"OfflineNoCB": _rcu_barrier() found callback on never-online CPU
+327 −262

File changed.

Preview size limit exceeded, changes collapsed.

+51 −33
@@ -27,6 +27,7 @@
 #include <linux/threads.h>
 #include <linux/cpumask.h>
 #include <linux/seqlock.h>
+#include <linux/stop_machine.h>

 /*
  * Define shape of hierarchy based on NR_CPUS, CONFIG_RCU_FANOUT, and
@@ -36,8 +37,6 @@
  * Of course, your mileage may vary.
  */

-#define MAX_RCU_LVLS 4
-
 #ifdef CONFIG_RCU_FANOUT
 #define RCU_FANOUT CONFIG_RCU_FANOUT
 #else /* #ifdef CONFIG_RCU_FANOUT */
@@ -66,38 +65,53 @@
 #if NR_CPUS <= RCU_FANOUT_1
 #  define RCU_NUM_LVLS	      1
 #  define NUM_RCU_LVL_0	      1
-#  define NUM_RCU_LVL_1	      (NR_CPUS)
-#  define NUM_RCU_LVL_2	      0
-#  define NUM_RCU_LVL_3	      0
-#  define NUM_RCU_LVL_4	      0
+#  define NUM_RCU_NODES	      NUM_RCU_LVL_0
+#  define NUM_RCU_LVL_INIT    { NUM_RCU_LVL_0 }
+#  define RCU_NODE_NAME_INIT  { "rcu_node_0" }
+#  define RCU_FQS_NAME_INIT   { "rcu_node_fqs_0" }
+#  define RCU_EXP_NAME_INIT   { "rcu_node_exp_0" }
+#  define RCU_EXP_SCHED_NAME_INIT \
+			      { "rcu_node_exp_sched_0" }
 #elif NR_CPUS <= RCU_FANOUT_2
 #  define RCU_NUM_LVLS	      2
 #  define NUM_RCU_LVL_0	      1
 #  define NUM_RCU_LVL_1	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
-#  define NUM_RCU_LVL_2	      (NR_CPUS)
-#  define NUM_RCU_LVL_3	      0
-#  define NUM_RCU_LVL_4	      0
+#  define NUM_RCU_NODES	      (NUM_RCU_LVL_0 + NUM_RCU_LVL_1)
+#  define NUM_RCU_LVL_INIT    { NUM_RCU_LVL_0, NUM_RCU_LVL_1 }
+#  define RCU_NODE_NAME_INIT  { "rcu_node_0", "rcu_node_1" }
+#  define RCU_FQS_NAME_INIT   { "rcu_node_fqs_0", "rcu_node_fqs_1" }
+#  define RCU_EXP_NAME_INIT   { "rcu_node_exp_0", "rcu_node_exp_1" }
+#  define RCU_EXP_SCHED_NAME_INIT \
+			      { "rcu_node_exp_sched_0", "rcu_node_exp_sched_1" }
 #elif NR_CPUS <= RCU_FANOUT_3
 #  define RCU_NUM_LVLS	      3
 #  define NUM_RCU_LVL_0	      1
 #  define NUM_RCU_LVL_1	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2)
 #  define NUM_RCU_LVL_2	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
-#  define NUM_RCU_LVL_3	      (NR_CPUS)
-#  define NUM_RCU_LVL_4	      0
+#  define NUM_RCU_NODES	      (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2)
+#  define NUM_RCU_LVL_INIT    { NUM_RCU_LVL_0, NUM_RCU_LVL_1, NUM_RCU_LVL_2 }
+#  define RCU_NODE_NAME_INIT  { "rcu_node_0", "rcu_node_1", "rcu_node_2" }
+#  define RCU_FQS_NAME_INIT   { "rcu_node_fqs_0", "rcu_node_fqs_1", "rcu_node_fqs_2" }
+#  define RCU_EXP_NAME_INIT   { "rcu_node_exp_0", "rcu_node_exp_1", "rcu_node_exp_2" }
+#  define RCU_EXP_SCHED_NAME_INIT \
+			      { "rcu_node_exp_sched_0", "rcu_node_exp_sched_1", "rcu_node_exp_sched_2" }
 #elif NR_CPUS <= RCU_FANOUT_4
 #  define RCU_NUM_LVLS	      4
 #  define NUM_RCU_LVL_0	      1
 #  define NUM_RCU_LVL_1	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_3)
 #  define NUM_RCU_LVL_2	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_2)
 #  define NUM_RCU_LVL_3	      DIV_ROUND_UP(NR_CPUS, RCU_FANOUT_1)
-#  define NUM_RCU_LVL_4	      (NR_CPUS)
+#  define NUM_RCU_NODES	      (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2 + NUM_RCU_LVL_3)
+#  define NUM_RCU_LVL_INIT    { NUM_RCU_LVL_0, NUM_RCU_LVL_1, NUM_RCU_LVL_2, NUM_RCU_LVL_3 }
+#  define RCU_NODE_NAME_INIT  { "rcu_node_0", "rcu_node_1", "rcu_node_2", "rcu_node_3" }
+#  define RCU_FQS_NAME_INIT   { "rcu_node_fqs_0", "rcu_node_fqs_1", "rcu_node_fqs_2", "rcu_node_fqs_3" }
+#  define RCU_EXP_NAME_INIT   { "rcu_node_exp_0", "rcu_node_exp_1", "rcu_node_exp_2", "rcu_node_exp_3" }
+#  define RCU_EXP_SCHED_NAME_INIT \
+			      { "rcu_node_exp_sched_0", "rcu_node_exp_sched_1", "rcu_node_exp_sched_2", "rcu_node_exp_sched_3" }
 #else
 # error "CONFIG_RCU_FANOUT insufficient for NR_CPUS"
 #endif /* #if (NR_CPUS) <= RCU_FANOUT_1 */

-#define RCU_SUM (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2 + NUM_RCU_LVL_3 + NUM_RCU_LVL_4)
-#define NUM_RCU_NODES (RCU_SUM - NR_CPUS)
-
 extern int rcu_num_lvls;
 extern int rcu_num_nodes;
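
To make the level arithmetic in this hunk concrete, the following
user-space C sketch does at run time what the #if chain does in the
preprocessor.  It is an illustration only, not kernel code: struct
rcu_shape and rcu_shape_init() are hypothetical names, and the sketch
assumes that RCU_FANOUT_1 equals the leaf fanout and that each further
RCU_FANOUT_n multiplies the previous one by RCU_FANOUT (those
definitions lie outside the lines shown here).  It fills in the level
counts using the old layout, in which the CPUs occupy one extra
"level", and derives the node count as the old RCU_SUM - NR_CPUS,
which equals the per-branch NUM_RCU_NODES sums introduced here.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define MAX_LVLS 4	/* Deepest hierarchy the #if chain supports. */

struct rcu_shape {
	int num_lvls;				/* Mirrors RCU_NUM_LVLS. */
	unsigned long lvl[MAX_LVLS + 1];	/* Mirrors NUM_RCU_LVL_0..4. */
	unsigned long num_nodes;		/* Mirrors NUM_RCU_NODES. */
};

/* Returns 0 on success, -1 if the fanout cannot cover nr_cpus. */
static int rcu_shape_init(struct rcu_shape *sh, unsigned long nr_cpus,
			  unsigned long fanout_leaf, unsigned long fanout)
{
	unsigned long span[MAX_LVLS];	/* span[i]: CPUs covered by i+1 levels. */
	unsigned long sum = 0;
	int i, lvls;

	assert(sh && nr_cpus > 0 && fanout_leaf > 0 && fanout > 0);

	/* Assumed equivalents of RCU_FANOUT_1 through RCU_FANOUT_4. */
	span[0] = fanout_leaf;
	for (i = 1; i < MAX_LVLS; i++)
		span[i] = span[i - 1] * fanout;

	/* Pick the shallowest hierarchy that covers all CPUs. */
	for (lvls = 1; lvls <= MAX_LVLS; lvls++)
		if (nr_cpus <= span[lvls - 1])
			break;
	if (lvls > MAX_LVLS)
		return -1;	/* "CONFIG_RCU_FANOUT insufficient for NR_CPUS" */

	for (i = 0; i <= MAX_LVLS; i++)
		sh->lvl[i] = 0;
	sh->lvl[0] = 1;					/* Root node. */
	for (i = 1; i < lvls; i++)
		sh->lvl[i] = DIV_ROUND_UP(nr_cpus, span[lvls - 1 - i]);
	sh->lvl[lvls] = nr_cpus;			/* Old layout: CPUs as a level. */

	for (i = 0; i <= MAX_LVLS; i++)
		sum += sh->lvl[i];
	sh->num_lvls = lvls;
	sh->num_nodes = sum - nr_cpus;			/* RCU_SUM - NR_CPUS. */
	return 0;
}
```

With nr_cpus = 4096, fanout_leaf = 16, and fanout = 64, this gives
three levels holding 1, 4, and 256 rcu_node structures, or 261 in
total, which matches NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2 in
the three-level branch.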

@@ -236,6 +250,8 @@ struct rcu_node {
 	int need_future_gp[2];
 				/* Counts of upcoming no-CB GP requests. */
 	raw_spinlock_t fqslock ____cacheline_internodealigned_in_smp;
+
+	struct mutex exp_funnel_mutex ____cacheline_internodealigned_in_smp;
 } ____cacheline_internodealigned_in_smp;

 /*
@@ -287,12 +303,13 @@ struct rcu_data {
 	bool		gpwrap;		/* Possible gpnum/completed wrap. */
 	struct rcu_node *mynode;	/* This CPU's leaf of hierarchy */
 	unsigned long grpmask;		/* Mask to apply to leaf qsmask. */
-#ifdef CONFIG_RCU_CPU_STALL_INFO
 	unsigned long	ticks_this_gp;	/* The number of scheduling-clock */
 					/*  ticks this CPU has handled */
 					/*  during and after the last grace */
 					/* period it is aware of. */
-#endif /* #ifdef CONFIG_RCU_CPU_STALL_INFO */
+	struct cpu_stop_work exp_stop_work;
+					/* Expedited grace-period control */
+					/*  for CPU stopping. */

 	/* 2) batch handling */
 	/*
@@ -355,11 +372,13 @@ struct rcu_data {
 	unsigned long n_rp_nocb_defer_wakeup;
 	unsigned long n_rp_need_nothing;

-	/* 6) _rcu_barrier() and OOM callbacks. */
+	/* 6) _rcu_barrier(), OOM callbacks, and expediting. */
 	struct rcu_head barrier_head;
 #ifdef CONFIG_RCU_FAST_NO_HZ
 	struct rcu_head oom_head;
 #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */
+	struct mutex exp_funnel_mutex;
+	bool exp_done;			/* Expedited QS for this CPU? */

 	/* 7) Callback offloading. */
 #ifdef CONFIG_RCU_NOCB_CPU
@@ -387,9 +406,7 @@ struct rcu_data {
 #endif /* #ifdef CONFIG_RCU_NOCB_CPU */

 	/* 8) RCU CPU stall data. */
-#ifdef CONFIG_RCU_CPU_STALL_INFO
 	unsigned int softirq_snap;	/* Snapshot of softirq activity. */
-#endif /* #ifdef CONFIG_RCU_CPU_STALL_INFO */

 	int cpu;
 	struct rcu_state *rsp;
@@ -442,9 +459,9 @@ do { \
  */
 struct rcu_state {
 	struct rcu_node node[NUM_RCU_NODES];	/* Hierarchy. */
-	struct rcu_node *level[RCU_NUM_LVLS];	/* Hierarchy levels. */
-	u32 levelcnt[MAX_RCU_LVLS + 1];		/* # nodes in each level. */
-	u8 levelspread[RCU_NUM_LVLS];		/* kids/node in each level. */
+	struct rcu_node *level[RCU_NUM_LVLS + 1];
+						/* Hierarchy levels (+1 to */
+						/*  shut bogus gcc warning) */
 	u8 flavor_mask;				/* bit in flavor mask. */
 	struct rcu_data __percpu *rda;		/* pointer of percu rcu_data. */
 	void (*call)(struct rcu_head *head,	/* call_rcu() flavor. */
@@ -479,21 +496,18 @@ struct rcu_state {
 	struct mutex barrier_mutex;		/* Guards barrier fields. */
 	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
 	struct completion barrier_completion;	/* Wake at barrier end. */
-	unsigned long n_barrier_done;		/* ++ at start and end of */
+	unsigned long barrier_sequence;		/* ++ at start and end of */
 						/*  _rcu_barrier(). */
 	/* End of fields guarded by barrier_mutex. */

-	atomic_long_t expedited_start;		/* Starting ticket. */
-	atomic_long_t expedited_done;		/* Done ticket. */
-	atomic_long_t expedited_wrap;		/* # near-wrap incidents. */
-	atomic_long_t expedited_tryfail;	/* # acquisition failures. */
+	unsigned long expedited_sequence;	/* Take a ticket. */
+	atomic_long_t expedited_workdone0;	/* # done by others #0. */
 	atomic_long_t expedited_workdone1;	/* # done by others #1. */
 	atomic_long_t expedited_workdone2;	/* # done by others #2. */
+	atomic_long_t expedited_workdone3;	/* # done by others #3. */
 	atomic_long_t expedited_normal;		/* # fallbacks to normal. */
-	atomic_long_t expedited_stoppedcpus;	/* # successful stop_cpus. */
-	atomic_long_t expedited_done_tries;	/* # tries to update _done. */
-	atomic_long_t expedited_done_lost;	/* # times beaten to _done. */
-	atomic_long_t expedited_done_exit;	/* # times exited _done loop. */
+	atomic_t expedited_need_qs;		/* # CPUs left to check in. */
+	wait_queue_head_t expedited_wq;		/* Wait for check-ins. */

 	unsigned long jiffies_force_qs;		/* Time at which to invoke */
 						/*  force_quiescent_state(). */
@@ -527,7 +541,11 @@ struct rcu_state {
 /* Values for rcu_state structure's gp_flags field. */
 #define RCU_GP_WAIT_INIT 0	/* Initial state. */
 #define RCU_GP_WAIT_GPS  1	/* Wait for grace-period start. */
-#define RCU_GP_WAIT_FQS  2	/* Wait for force-quiescent-state time. */
+#define RCU_GP_DONE_GPS  2	/* Wait done for grace-period start. */
+#define RCU_GP_WAIT_FQS  3	/* Wait for force-quiescent-state time. */
+#define RCU_GP_DOING_FQS 4	/* Wait done for force-quiescent-state time. */
+#define RCU_GP_CLEANUP   5	/* Grace-period cleanup started. */
+#define RCU_GP_CLEANED   6	/* Grace-period cleanup complete. */

 extern struct list_head rcu_struct_flavors;
