
Commit 4385428a authored by Ingo Molnar

Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent
parents 047a3772 2d75af2f
+128 −16
CONFIG_RCU_TRACE debugfs Files and Formats


The rcutree implementation of RCU provides debugfs trace output that
summarizes counters and state.  This information is useful for debugging
RCU itself, and can sometimes also help to debug abuses of RCU.
The following sections describe the debugfs files and formats.
The rcutree and rcutiny implementations of RCU provide debugfs trace
output that summarizes counters and state.  This information is useful for
debugging RCU itself, and can sometimes also help to debug abuses of RCU.
The following sections describe the debugfs files and formats, first
for rcutree and next for rcutiny.


Hierarchical RCU debugfs Files and Formats
CONFIG_TREE_RCU and CONFIG_TREE_PREEMPT_RCU debugfs Files and Formats

This implementation of RCU provides three debugfs files under the
These implementations of RCU provide five debugfs files under the
top-level directory RCU: rcu/rcudata (which displays fields in struct
rcu_data), rcu/rcugp (which displays grace-period counters), and
rcu/rcuhier (which displays the struct rcu_node hierarchy).
rcu_data), rcu/rcudata.csv (which is a .csv spreadsheet version of
rcu/rcudata), rcu/rcugp (which displays grace-period counters),
rcu/rcuhier (which displays the struct rcu_node hierarchy), and
rcu/rcu_pending (which displays counts of the reasons that the
rcu_pending() function decided that there was core RCU work to do).

The output of "cat rcu/rcudata" looks as follows:

@@ -130,7 +134,8 @@ o "ci" is the number of RCU callbacks that have been invoked for
	been registered in absence of CPU-hotplug activity.

o	"co" is the number of RCU callbacks that have been orphaned due to
	this CPU going offline.
	this CPU going offline.  These orphaned callbacks have been moved
	to an arbitrarily chosen online CPU.

o	"ca" is the number of RCU callbacks that have been adopted due to
	other CPUs going offline.  Note that ci+co-ca+ql is the number of
@@ -168,12 +173,12 @@ o "gpnum" is the number of grace periods that have started. It is

The output of "cat rcu/rcuhier" looks as follows, with very long lines:

c=6902 g=6903 s=2 jfq=3 j=72c7 nfqs=13142/nfqsng=0(13142) fqlh=6 oqlen=0
c=6902 g=6903 s=2 jfq=3 j=72c7 nfqs=13142/nfqsng=0(13142) fqlh=6
1/1 .>. 0:127 ^0    
3/3 .>. 0:35 ^0    0/0 .>. 36:71 ^1    0/0 .>. 72:107 ^2    0/0 .>. 108:127 ^3    
3/3f .>. 0:5 ^0    2/3 .>. 6:11 ^1    0/0 .>. 12:17 ^2    0/0 .>. 18:23 ^3    0/0 .>. 24:29 ^4    0/0 .>. 30:35 ^5    0/0 .>. 36:41 ^0    0/0 .>. 42:47 ^1    0/0 .>. 48:53 ^2    0/0 .>. 54:59 ^3    0/0 .>. 60:65 ^4    0/0 .>. 66:71 ^5    0/0 .>. 72:77 ^0    0/0 .>. 78:83 ^1    0/0 .>. 84:89 ^2    0/0 .>. 90:95 ^3    0/0 .>. 96:101 ^4    0/0 .>. 102:107 ^5    0/0 .>. 108:113 ^0    0/0 .>. 114:119 ^1    0/0 .>. 120:125 ^2    0/0 .>. 126:127 ^3    
rcu_bh:
c=-226 g=-226 s=1 jfq=-5701 j=72c7 nfqs=88/nfqsng=0(88) fqlh=0 oqlen=0
c=-226 g=-226 s=1 jfq=-5701 j=72c7 nfqs=88/nfqsng=0(88) fqlh=0
0/1 .>. 0:127 ^0    
0/3 .>. 0:35 ^0    0/0 .>. 36:71 ^1    0/0 .>. 72:107 ^2    0/0 .>. 108:127 ^3    
0/3f .>. 0:5 ^0    0/3 .>. 6:11 ^1    0/0 .>. 12:17 ^2    0/0 .>. 18:23 ^3    0/0 .>. 24:29 ^4    0/0 .>. 30:35 ^5    0/0 .>. 36:41 ^0    0/0 .>. 42:47 ^1    0/0 .>. 48:53 ^2    0/0 .>. 54:59 ^3    0/0 .>. 60:65 ^4    0/0 .>. 66:71 ^5    0/0 .>. 72:77 ^0    0/0 .>. 78:83 ^1    0/0 .>. 84:89 ^2    0/0 .>. 90:95 ^3    0/0 .>. 96:101 ^4    0/0 .>. 102:107 ^5    0/0 .>. 108:113 ^0    0/0 .>. 114:119 ^1    0/0 .>. 120:125 ^2    0/0 .>. 126:127 ^3
@@ -212,11 +217,6 @@ o "fqlh" is the number of calls to force_quiescent_state() that
	exited immediately (without even being counted in nfqs above)
	due to contention on ->fqslock.

o	"oqlen" is the number of callbacks on the "orphan" callback
	list.  RCU callbacks are placed on this list by CPUs going
	offline, and are "adopted" either by the CPU helping the outgoing
	CPU or by the next rcu_barrier*() call, whichever comes first.

o	Each element of the form "1/1 0:127 ^0" represents one struct
	rcu_node.  Each line represents one level of the hierarchy, from
	root to leaves.  It is best to think of the rcu_data structures
@@ -326,3 +326,115 @@ o "nn" is the number of times that this CPU needed nothing. Alert
	readers will note that the rcu "nn" number for a given CPU very
	closely matches the rcu_bh "np" number for that same CPU.  This
	is due to short-circuit evaluation in rcu_pending().


CONFIG_TINY_RCU and CONFIG_TINY_PREEMPT_RCU debugfs Files and Formats

These implementations of RCU provide a single debugfs file under the
top-level directory RCU, namely rcu/rcudata, which displays fields in
rcu_bh_ctrlblk, rcu_sched_ctrlblk and, for CONFIG_TINY_PREEMPT_RCU,
rcu_preempt_ctrlblk.

The output of "cat rcu/rcudata" is as follows:

rcu_preempt: qlen=24 gp=1097669 g197/p197/c197 tasks=...
             ttb=. btg=no ntb=184 neb=0 nnb=183 j=01f7 bt=0274
             normal balk: nt=1097669 gt=0 bt=371 b=0 ny=25073378 nos=0
             exp balk: bt=0 nos=0
rcu_sched: qlen: 0
rcu_bh: qlen: 0

This is split into rcu_preempt, rcu_sched, and rcu_bh sections, with the
rcu_preempt section appearing only in CONFIG_TINY_PREEMPT_RCU builds.
The last three lines of the rcu_preempt section appear only in
CONFIG_RCU_BOOST kernel builds.  The fields are as follows:

o	"qlen" is the number of RCU callbacks currently waiting either
	for an RCU grace period or waiting to be invoked.  This is the
	only field present for rcu_sched and rcu_bh, due to the
	short-circuiting of grace periods in those two cases.

o	"gp" is the number of grace periods that have completed.

o	"g197/p197/c197" displays the grace-period state, with the
	"g" number being the number of grace periods that have started
	(mod 256), the "p" number being the number of grace periods
	that the CPU has responded to (also mod 256), and the "c"
	number being the number of grace periods that have completed
	(once again mod 256).

	Why have both "gp" and "g"?  Because the data flowing into
	"gp" is only present in a CONFIG_RCU_TRACE kernel.

o	"tasks" is a set of bits.  The first bit is "T" if there are
	currently tasks that have recently blocked within an RCU
	read-side critical section, the second bit is "N" if any of the
	aforementioned tasks are blocking the current RCU grace period,
	and the third bit is "E" if any of the aforementioned tasks are
	blocking the current expedited grace period.  Each bit is "."
	if the corresponding condition does not hold.

o	"ttb" is a single bit.  It is "B" if any of the blocked tasks
	need to be priority boosted and "." otherwise.

o	"btg" indicates whether boosting has been carried out during
	the current grace period, with "exp" indicating that boosting
	is in progress for an expedited grace period, "no" indicating
	that boosting has not yet started for a normal grace period,
	"begun" indicating that boosting has begun for a normal grace
	period, and "done" indicating that boosting has completed for
	a normal grace period.

o	"ntb" is the total number of tasks subjected to RCU priority
	boosting since boot.

o	"neb" is the number of expedited grace periods that have had
	to resort to RCU priority boosting since boot.

o	"nnb" is the number of normal grace periods that have had
	to resort to RCU priority boosting since boot.

o	"j" is the low-order 12 bits of the jiffies counter in hexadecimal.

o	"bt" is the low-order 12 bits of the value that the jiffies counter
	will have at the next time that boosting is scheduled to begin.

o	In the line beginning with "normal balk", the fields are as follows:

	o	"nt" is the number of times that the system balked from
		boosting because there were no blocked tasks to boost.
		Note that the system will balk from boosting even if the
		grace period is overdue when the currently running task
		is looping within an RCU read-side critical section.
		There is no point in boosting in this case, because
		boosting a running task won't make it run any faster.

	o	"gt" is the number of times that the system balked
		from boosting because, although there were blocked tasks,
		none of them were preventing the current grace period
		from completing.

	o	"bt" is the number of times that the system balked
		from boosting because boosting was already in progress.

	o	"b" is the number of times that the system balked from
		boosting because boosting had already completed for
		the grace period in question.

	o	"ny" is the number of times that the system balked from
		boosting because it was not yet time to start boosting
		the grace period in question.

	o	"nos" is the number of times that the system balked from
		boosting for inexplicable ("not otherwise specified")
		reasons.  This can actually happen due to races involving
		increments of the jiffies counter.

o	In the line beginning with "exp balk", the fields are as follows:

	o	"bt" is the number of times that the system balked from
		boosting because there were no blocked tasks to boost.

	o	"nos" is the number of times that the system balked from
		boosting for inexplicable ("not otherwise specified")
		reasons.
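
The rcutiny field descriptions above can be exercised with a small
user-space parser.  What follows is only a sketch, not part of the kernel
or of this commit: the function names parse_tiny_rcudata() and
wrap_delta() are hypothetical, the "normal_balk." and "exp_balk." key
prefixes are an invention to keep the two "bt" fields apart, and the code
assumes that "bt" on the second line is printed in hexadecimal like "j".
The input is the sample output shown earlier, not live debugfs data.

```python
import re

# Sample "cat rcu/rcudata" output for CONFIG_TINY_PREEMPT_RCU, copied from
# the documentation text above.
SAMPLE = """\
rcu_preempt: qlen=24 gp=1097669 g197/p197/c197 tasks=...
             ttb=. btg=no ntb=184 neb=0 nnb=183 j=01f7 bt=0274
             normal balk: nt=1097669 gt=0 bt=371 b=0 ny=25073378 nos=0
             exp balk: bt=0 nos=0
rcu_sched: qlen: 0
rcu_bh: qlen: 0
"""


def parse_tiny_rcudata(text):
    """Return {section: {field: raw string value}} for rcutiny rcudata text.

    Fields on the "normal balk" and "exp balk" lines get a hypothetical
    "normal_balk." / "exp_balk." prefix so they do not collide with the
    boost-time "bt" field on the second line.
    """
    sections = {}
    current = None
    for line in text.splitlines():
        prefix = ""
        m = re.match(r"^(rcu_\w+):\s*(.*)$", line)
        if m:
            # A new section such as "rcu_preempt:" starts in column zero.
            current = sections.setdefault(m.group(1), {})
            body = m.group(2)
        elif current is not None:
            # Indented continuation line of the current section.
            body = line.strip()
        else:
            continue
        m = re.match(r"^(normal|exp) balk:\s*(.*)$", body)
        if m:
            prefix = m.group(1) + "_balk."
            body = m.group(2)
        # rcu_sched and rcu_bh print "qlen: 0" rather than "qlen=0".
        body = body.replace(": ", "=")
        for tok in body.split():
            gpc = re.fullmatch(r"g(\d+)/p(\d+)/c(\d+)", tok)
            if gpc:
                current["g"], current["p"], current["c"] = gpc.groups()
            elif "=" in tok:
                key, value = tok.split("=", 1)
                current[prefix + key] = value
    return sections


def wrap_delta(later, earlier, modulus):
    """Distance from earlier to later for counters that wrap at modulus."""
    return (later - earlier) % modulus


if __name__ == "__main__":
    preempt = parse_tiny_rcudata(SAMPLE)["rcu_preempt"]
    # "j" and "bt" carry only the low-order 12 bits, so compare mod 4096.
    until_boost = wrap_delta(int(preempt["bt"], 16),
                             int(preempt["j"], 16), 1 << 12)
    # "g" and "c" are reported mod 256.
    in_flight = wrap_delta(int(preempt["g"]), int(preempt["c"]), 256)
    print("jiffies until boosting:", until_boost)
    print("grace periods started but not completed:", in_flight)
```

The modular subtraction matters because "g", "p", and "c" wrap at 256 and
"j" and "bt" wrap at 4096, so a plain difference would go negative across
a wrap.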
+26 −0
@@ -62,6 +62,10 @@ aic7*reg_print.c*
aic7*seq.h*
aicasm
aicdb.h*
altivec1.c
altivec2.c
altivec4.c
altivec8.c
asm-offsets.h
asm_offsets.h
autoconf.h*
@@ -76,6 +80,7 @@ btfixupprep
build
bvmlinux
bzImage*
capflags.c
classlist.h*
comp*.log
compile.h*
@@ -94,6 +99,7 @@ devlist.h*
docproc
elf2ecoff
elfconfig.h*
evergreen_reg_safe.h
fixdep
flask.h
fore200e_mkfirm
@@ -108,9 +114,16 @@ genksyms
*_gray256.c
ihex2fw
ikconfig.h*
inat-tables.c
initramfs_data.cpio
initramfs_data.cpio.gz
initramfs_list
int16.c
int1.c
int2.c
int32.c
int4.c
int8.c
kallsyms
kconfig
keywords.c
@@ -140,6 +153,7 @@ mkprep
mktables
mktree
modpost
modules.builtin
modules.order
modversions.h*
ncscope.*
@@ -153,14 +167,23 @@ pca200e.bin
pca200e_ecd.bin2
piggy.gz
piggyback
piggy.S
pnmtologo
ppc_defs.h*
pss_boot.h
qconf
r100_reg_safe.h
r200_reg_safe.h
r300_reg_safe.h
r420_reg_safe.h
r600_reg_safe.h
raid6altivec*.c
raid6int*.c
raid6tables.c
relocs
rn50_reg_safe.h
rs600_reg_safe.h
rv515_reg_safe.h
series
setup
setup.bin
@@ -169,6 +192,7 @@ sImage
sm_tbl*
split-include
syscalltab.h
tables.c
tags
tftpboot.img
timeconst.h
@@ -190,6 +214,7 @@ vmlinux
vmlinux-*
vmlinux.aout
vmlinux.lds
voffset.h
vsyscall.lds
vsyscall_32.lds
wanxlfw.inc
@@ -200,3 +225,4 @@ wakeup.elf
wakeup.lds
zImage*
zconf.hash.c
zoffset.h
+2 −25
@@ -537,7 +537,7 @@
       Notes: Further information in
       http://www.oreilly.com/catalog/linuxdrive2/

     * Title: "Linux Device Drivers, 3nd Edition"
     * Title: "Linux Device Drivers, 3rd Edition"
       Authors: Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman
       Publisher: O'Reilly & Associates.
       Date: 2005.
@@ -592,14 +592,6 @@
       Pages: 600.
       ISBN: 0-13-101908-2

     * Title:  "The  Design  and Implementation of the 4.4 BSD UNIX
       Operating System"
       Author: Marshall Kirk McKusick, Keith Bostic, Michael J. Karels,
       John S. Quarterman.
       Publisher: Addison-Wesley.
       Date: 1996.
       ISBN: 0-201-54979-4

     * Title: "Programming for the real world - POSIX.4"
       Author: Bill O. Gallmeister.
       Publisher: O'Reilly & Associates, Inc..
@@ -610,28 +602,13 @@
       POSIX. Good reference.

     * Title:  "UNIX  Systems  for  Modern Architectures: Symmetric
       Multiprocesssing and Caching for Kernel Programmers"
       Multiprocessing and Caching for Kernel Programmers"
       Author: Curt Schimmel.
       Publisher: Addison Wesley.
       Date: June, 1994.
       Pages: 432.
       ISBN: 0-201-63338-8

     * Title:  "The  Design  and Implementation of the 4.3 BSD UNIX
       Operating System"
       Author: Samuel J. Leffler, Marshall Kirk McKusick, Michael J.
       Karels, John S. Quarterman.
       Publisher: Addison-Wesley.
       Date: 1989 (reprinted with corrections on October, 1990).
       ISBN: 0-201-06196-1

     * Title: "The Design of the UNIX Operating System"
       Author: Maurice J. Bach.
       Publisher: Prentice Hall.
       Date: 1986.
       Pages: 471.
       ISBN: 0-13-201757-1

     MISCELLANEOUS:

     * Name: linux/Documentation
+7 −4
@@ -1614,6 +1614,8 @@ and is between 256 and 4096 characters. It is defined in the file
	noapic		[SMP,APIC] Tells the kernel to not make use of any
			IOAPICs that may be present in the system.

	noautogroup	Disable scheduler automatic task group creation.

	nobats		[PPC] Do not use BATs for mapping kernel lowmem
			on "Classic" PPC cores.

@@ -2459,12 +2461,13 @@ and is between 256 and 4096 characters. It is defined in the file
			to facilitate early boot debugging.
			See also Documentation/trace/events.txt

	tsc=		Disable clocksource-must-verify flag for TSC.
	tsc=		Disable clocksource stability checks for TSC.
			Format: <string>
			[x86] reliable: mark tsc clocksource as reliable, this
			disables clocksource verification at runtime.
			Used to enable high-resolution timer mode on older
			hardware, and in virtualized environment.
			disables clocksource verification at runtime, as well
			as the stability checks done at bootup.	Used to enable
			high-resolution timer mode on older hardware, and in
			virtualized environment.
			[x86] noirqtime: Do not use TSC to do irq accounting.
			Used to run time disable IRQ_TIME_ACCOUNTING on any
			platforms where RDTSC is slow and this accounting
+1 −0
@@ -600,6 +600,7 @@ Protocol: 2.07+
  0x00000001	lguest
  0x00000002	Xen
  0x00000003	Moorestown MID
  0x00000004	CE4100 TV Platform

Field name:	hardware_subarch_data
Type:		write (subarch-dependent)