
Commit a6b0786f authored by Ingo Molnar

Merge branches 'tracing/ftrace', 'tracing/fastboot', 'tracing/nmisafe' and 'tracing/urgent' into tracing/core
+73 −98
@@ -8,7 +8,7 @@ Copyright 2008 Red Hat Inc.
Reviewers:   Elias Oltmanns, Randy Dunlap, Andrew Morton,
	     John Kacur, and David Teigland.

-Written for: 2.6.27-rc1
+Written for: 2.6.28-rc2

Introduction
------------
@@ -94,23 +94,20 @@ of ftrace. Here is a list of some of the key files:
		only be recorded if the latency is greater than
		the value in this file. (in microseconds)

-  trace_entries : This sets or displays the number of trace
-		entries each CPU buffer can hold. The tracer buffers
-		are the same size for each CPU. The displayed number
-		is the size of the CPU buffer and not total size. The
+  trace_entries: This sets or displays the number of bytes each CPU
+		buffer can hold. The tracer buffers are the same size
+		for each CPU. The displayed number is the size of the
+		CPU buffer and not total size of all buffers. The
		trace buffers are allocated in pages (blocks of memory
		that the kernel uses for allocation, usually 4 KB in size).
-		Since each entry is smaller than a page, if the last
-		allocated page has room for more entries than were
-		requested, the rest of the page is used to allocate
-		entries.
+		If the last page allocated has room for more bytes
+		than requested, the rest of the page will be used,
+		making the actual allocation bigger than requested.
+		(Note, the size may not be a multiple of the page size due
+		to buffer management overhead.)

		This can only be updated when the current_tracer
		is set to "none".

		NOTE: It is planned on changing the allocated buffers
		      from being the number of possible CPUS to
		      the number of online CPUS.
		is set to "nop".

  tracing_cpumask: This is a mask that lets the user only trace
		on specified CPUS. The format is a hex string
@@ -130,14 +127,11 @@ of ftrace. Here is a list of some of the key files:
		be traced. If a function exists in both set_ftrace_filter
		and set_ftrace_notrace,	the function will _not_ be traced.

-  available_filter_functions : When a function is encountered the first
-		time by the dynamic tracer, it is recorded and
-		later the call is converted into a nop. This file
-		lists the functions that have been recorded
-		by the dynamic tracer and these functions can
-		be used to set the ftrace filter by the above
-		"set_ftrace_filter" file. (See the section "dynamic ftrace"
-		below for more details).
+  available_filter_functions: This lists the functions that ftrace
+		has processed and can trace. These are the function
+		names that you can pass to "set_ftrace_filter" or
+		"set_ftrace_notrace". (See the section "dynamic ftrace"
+		below for more details.)
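
A hedged sketch of driving these filter files from C rather than the
shell, assuming dynamic ftrace is configured in as the surrounding
text describes; the file paths are the ones used throughout this
document, and "path_put" is simply the function that appears in the
trace output example in this document:

	/* Sketch: restrict the function tracer to path_put() only. */
	#include <fcntl.h>
	#include <string.h>
	#include <unistd.h>

	static int write_tracing_file(const char *path, const char *val)
	{
		int fd = open(path, O_WRONLY);

		if (fd < 0)
			return -1;
		write(fd, val, strlen(val));
		close(fd);
		return 0;
	}

	int main(void)
	{
		write_tracing_file("/debug/tracing/set_ftrace_filter",
				   "path_put");
		write_tracing_file("/debug/tracing/current_tracer",
				   "function");
		write_tracing_file("/debug/tracing/tracing_enabled", "1");
		return 0;
	}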


The Tracers
@@ -145,7 +139,7 @@ The Tracers

Here is the list of current tracers that may be configured.

-  ftrace - function tracer that uses mcount to trace all functions.
+  function - function tracer that uses mcount to trace all functions.

  sched_switch - traces the context switches between tasks.

@@ -166,8 +160,8 @@ Here is the list of current tracers that may be configured.
		the highest priority task to get scheduled after
		it has been woken up.

-  none - This is not a tracer. To remove all tracers from tracing
-		simply echo "none" into current_tracer.
+  nop - This is not a tracer. To remove all tracers from tracing
+		simply echo "nop" into current_tracer.


Examples of using the tracer
@@ -182,7 +176,7 @@ Output format:
Here is an example of the output format of the file "trace"

                             --------
-# tracer: ftrace
+# tracer: function
#
#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
#              | |      |          |         |
@@ -192,7 +186,7 @@ Here is an example of the output format of the file "trace"
                             --------

A header is printed with the tracer name that is represented by the trace.
-In this case the tracer is "ftrace". Then a header showing the format. Task
+In this case the tracer is "function". Then a header showing the format. Task
name "bash", the task PID "4251", the CPU that it was running on
"01", the timestamp in <secs>.<usecs> format, the function name that was
traced "path_put" and the parent function that called this function
@@ -1003,22 +997,20 @@ is the stack for the hard interrupt. This hides the fact that NEED_RESCHED
has been set. We do not see the 'N' until we switch back to the task's
assigned stack.

-ftrace
-------
+function
+--------

-ftrace is not only the name of the tracing infrastructure, but it
-is also a name of one of the tracers. The tracer is the function
-tracer. Enabling the function tracer can be done from the
-debug file system. Make sure the ftrace_enabled is set otherwise
-this tracer is a nop.
+This tracer is the function tracer. Enabling the function tracer
+can be done from the debug file system. Make sure the ftrace_enabled is
+set; otherwise this tracer is a nop.

 # sysctl kernel.ftrace_enabled=1
- # echo ftrace > /debug/tracing/current_tracer
+ # echo function > /debug/tracing/current_tracer
 # echo 1 > /debug/tracing/tracing_enabled
 # usleep 1
 # echo 0 > /debug/tracing/tracing_enabled
 # cat /debug/tracing/trace
-# tracer: ftrace
+# tracer: function
#
#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
#              | |      |          |         |
@@ -1040,10 +1032,10 @@ this tracer is a nop.
[...]


-Note: ftrace uses ring buffers to store the above entries. The newest data
-may overwrite the oldest data. Sometimes using echo to stop the trace
-is not sufficient because the tracing could have overwritten the data
-that you wanted to record. For this reason, it is sometimes better to
+Note: the function tracer uses ring buffers to store the above entries.
+The newest data may overwrite the oldest data. Sometimes using echo to
+stop the trace is not sufficient because the tracing could have overwritten
+the data that you wanted to record. For this reason, it is sometimes better to
disable tracing directly from a program. This allows you to stop the
tracing at the point that you hit the part that you are interested in.
To disable the tracing directly from a C program, something like following
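
(The snippet itself falls outside this hunk. A minimal, hedged sketch
of such a program, using only the tracing_enabled file already shown
in this document, might look like:)

	/* Sketch: stop tracing from inside the program at the exact
	 * moment the interesting condition is hit. */
	#include <fcntl.h>
	#include <unistd.h>

	int main(void)
	{
		int trace_fd = open("/debug/tracing/tracing_enabled",
				    O_WRONLY);

		if (trace_fd < 0)
			return 1;

		/* ... run the workload being investigated ... */

		write(trace_fd, "0", 1);	/* '0' disables tracing */
		close(trace_fd);
		return 0;
	}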
@@ -1077,18 +1069,31 @@ every kernel function, produced by the -pg switch in gcc), starts
off pointing to a simple return. (Enabling FTRACE will include the
-pg switch in the compiling of the kernel.)

-When dynamic ftrace is initialized, it calls kstop_machine to make
-the machine act like a uniprocessor so that it can freely modify code
-without worrying about other processors executing that same code.  At
-initialization, the mcount calls are changed to call a "record_ip"
-function.  After this, the first time a kernel function is called,
-it has the calling address saved in a hash table.
-
-Later on the ftraced kernel thread is awoken and will again call
-kstop_machine if new functions have been recorded. The ftraced thread
-will change all calls to mcount to "nop".  Just calling mcount
-and having mcount return has shown a 10% overhead. By converting
-it to a nop, there is no measurable overhead to the system.
+At compile time every C file object is run through the
+recordmcount.pl script (located in the scripts directory). This
+script will process the C object using objdump to find all the
+locations in the .text section that call mcount. (Note, only
+the .text section is processed, since processing other sections
+like .init.text may cause races due to those sections being freed).
+
+A new section called "__mcount_loc" is created that holds references
+to all the mcount call sites in the .text section. This section is
+compiled back into the original object. The final linker will add
+all these references into a single table.
+
+On boot up, before SMP is initialized, the dynamic ftrace code
+scans this table and updates all the locations into nops. It also
+records the locations, which are added to the available_filter_functions
+list.  Modules are processed as they are loaded and before they are
+executed.  When a module is unloaded, it also removes its functions from
+the ftrace function list. This is automatic in the module unload
+code, and the module author does not need to worry about it.
+
+When tracing is enabled, kstop_machine is called to prevent races
+with the CPUS executing code being modified (which can cause the
+CPU to do undesirable things), and the nops are patched back
+to calls. But this time, they do not call mcount (which is just
+a function stub). They now call into the ftrace infrastructure.
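
The boot-time pass over that table can be pictured with a short,
hedged C sketch. The __start_mcount_loc/__stop_mcount_loc symbols are
the usual linker-generated bounds of such a section, while
patch_call_site_to_nop() is a hypothetical helper standing in for the
arch-specific code patching:

	/* Sketch only: walk the __mcount_loc table built by the linker
	 * and turn every recorded mcount call site into a nop. */
	extern unsigned long __start_mcount_loc[];
	extern unsigned long __stop_mcount_loc[];

	void patch_call_site_to_nop(unsigned long ip);	/* hypothetical */

	static void ftrace_convert_nops_sketch(void)
	{
		unsigned long *p;

		/* Each entry is also recorded, which is what populates
		 * the available_filter_functions list. */
		for (p = __start_mcount_loc; p < __stop_mcount_loc; p++)
			patch_call_site_to_nop(*p);
	}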

One special side-effect to the recording of the functions being
traced is that we can now selectively choose which functions we
@@ -1251,36 +1256,6 @@ Produces:

We can see that there's no more lock or preempt tracing.

-ftraced
--------
-
-As mentioned above, when dynamic ftrace is configured in, a kernel
-thread wakes up once a second and checks to see if there are mcount
-calls that need to be converted into nops. If there are not any, then
-it simply goes back to sleep. But if there are some, it will call
-kstop_machine to convert the calls to nops.
-
-There may be a case in which you do not want this added latency.
-Perhaps you are doing some audio recording and this activity might
-cause skips in the playback. There is an interface to disable
-and enable the "ftraced" kernel thread.
-
- # echo 0 > /debug/tracing/ftraced_enabled
-
-This will disable the calling of kstop_machine to update the
-mcount calls to nops. Remember that there is a large overhead
-to calling mcount. Without this kernel thread, that overhead will
-exist.
-
-If there are recorded calls to mcount, any write to the ftraced_enabled
-file will cause the kstop_machine to run. This means that a
-user can manually perform the updates when they want to by simply
-echoing a '0' into the ftraced_enabled file.
-
-The updates are also done at the beginning of enabling a tracer
-that uses ftrace function recording.


trace_pipe
----------

@@ -1289,14 +1264,14 @@ on the tracing is different. Every read from trace_pipe is consumed.
This means that subsequent reads will be different. The trace
is live.

- # echo ftrace > /debug/tracing/current_tracer
+ # echo function > /debug/tracing/current_tracer
 # cat /debug/tracing/trace_pipe > /tmp/trace.out &
[1] 4153
 # echo 1 > /debug/tracing/tracing_enabled
 # usleep 1
 # echo 0 > /debug/tracing/tracing_enabled
 # cat /debug/tracing/trace
-# tracer: ftrace
+# tracer: function
#
#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
#              | |      |          |         |
@@ -1317,7 +1292,7 @@ is live.

Note, reading the trace_pipe file will block until more input is added.
By changing the tracer, trace_pipe will issue an EOF. We needed
-to set the ftrace tracer _before_ cating the trace_pipe file.
+to set the function tracer _before_ we "cat" the trace_pipe file.


trace_entries
@@ -1334,10 +1309,10 @@ number of entries.
65620

Note, to modify this, you must have tracing completely disabled. To do that,
echo "none" into the current_tracer. If the current_tracer is not set
to "none", an EINVAL error will be returned.
echo "nop" into the current_tracer. If the current_tracer is not set
to "nop", an EINVAL error will be returned.

 # echo none > /debug/tracing/current_tracer
 # echo nop > /debug/tracing/current_tracer
 # echo 100000 > /debug/tracing/trace_entries
 # cat /debug/tracing/trace_entries
100045
+82 −0
The io_mapping functions in linux/io-mapping.h provide an abstraction for
efficiently mapping small regions of an I/O device to the CPU. The initial
usage is to support the large graphics aperture on 32-bit processors where
ioremap_wc cannot be used to statically map the entire aperture to the CPU
as it would consume too much of the kernel address space.

A mapping object is created during driver initialization using

	struct io_mapping *io_mapping_create_wc(unsigned long base,
						unsigned long size)

		'base' is the bus address of the region to be made
		mappable, while 'size' indicates how large a mapping region to
		enable. Both are in bytes.

		This _wc variant provides a mapping which may only be used
		with io_mapping_map_atomic_wc or io_mapping_map_wc.

With this mapping object, individual pages can be mapped either atomically
or not, depending on the necessary scheduling environment. Of course, atomic
maps are more efficient:

	void *io_mapping_map_atomic_wc(struct io_mapping *mapping,
				       unsigned long offset)

		'offset' is the offset within the defined mapping region.
		Accessing addresses beyond the region specified in the
		creation function yields undefined results. Using an offset
		which is not page aligned yields an undefined result. The
		return value points to a single page in CPU address space.

		This _wc variant returns a write-combining map to the
		page and may only be used with mappings created by
		io_mapping_create_wc

		Note that the task may not sleep while holding this page
		mapped.

	void io_mapping_unmap_atomic(void *vaddr)

		'vaddr' must be the value returned by the last
		io_mapping_map_atomic_wc call. This unmaps the specified
		page and allows the task to sleep once again.

If you need to sleep while holding a mapping, you can use the non-atomic
variants, although they may be significantly slower.

	void *io_mapping_map_wc(struct io_mapping *mapping,
				unsigned long offset)

		This works like io_mapping_map_atomic_wc except it allows
		the task to sleep while holding the page mapped.

	void io_mapping_unmap(void *vaddr)

		This works like io_mapping_unmap_atomic, except it is used
		for pages mapped with io_mapping_map_wc.

At driver close time, the io_mapping object must be freed:

	void io_mapping_free(struct io_mapping *mapping)
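
Taken together, a driver might use the interface roughly as follows.
This is a hedged sketch, not code from the kernel: the aperture_base,
aperture_size and page_offset values are hypothetical, assumed to have
been discovered from the device beforehand.

	/* Sketch: full lifecycle of an io_mapping object in a driver. */
	#include <linux/errno.h>
	#include <linux/io.h>
	#include <linux/io-mapping.h>

	static struct io_mapping *map;

	static int example_init(unsigned long aperture_base,
				unsigned long aperture_size)
	{
		/* Driver initialization: make the aperture mappable. */
		map = io_mapping_create_wc(aperture_base, aperture_size);
		return map ? 0 : -ENOMEM;
	}

	static void example_write(unsigned long page_offset, u32 value)
	{
		/* Hot path: map one page, write through the
		 * write-combining mapping, unmap. No sleeping is
		 * allowed between map and unmap. */
		void *vaddr = io_mapping_map_atomic_wc(map, page_offset);

		writel(value, vaddr);
		io_mapping_unmap_atomic(vaddr);
	}

	static void example_fini(void)
	{
		/* Driver close: release the mapping object. */
		io_mapping_free(map);
	}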

Current Implementation:

The initial implementation of these functions uses existing mapping
mechanisms and so provides only an abstraction layer and no new
functionality.

On 64-bit processors, io_mapping_create_wc calls ioremap_wc for the whole
range, creating a permanent kernel-visible mapping to the resource. The
map_atomic and map functions add the requested offset to the base of the
virtual address returned by ioremap_wc.
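
The 64-bit strategy is simple enough to sketch; this illustrates the
paragraph above and is not the kernel's actual source:

	/* Sketch: one permanent mapping from ioremap_wc; mapping a
	 * page is then nothing more than pointer arithmetic. */
	struct io_mapping_sketch {
		void *vaddr;	/* ioremap_wc(base, size) result */
	};

	static void *map_atomic_wc_sketch(struct io_mapping_sketch *m,
					  unsigned long offset)
	{
		return m->vaddr + offset;	/* no new mapping needed */
	}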

On 32-bit processors with HIGHMEM defined, io_mapping_map_atomic_wc uses
kmap_atomic_pfn to map the specified page in an atomic fashion;
kmap_atomic_pfn isn't really supposed to be used with device pages, but it
provides an efficient mapping for this usage.

On 32-bit processors without HIGHMEM defined, io_mapping_map_atomic_wc and
io_mapping_map_wc both use ioremap_wc, a terribly inefficient function which
performs an IPI to inform all processors about the new mapping. This results
in a significant performance penalty.
+0 −5
#ifndef _ASM_ARM_FTRACE
#define _ASM_ARM_FTRACE

-#ifndef __ASSEMBLY__
-static inline void ftrace_nmi_enter(void) { }
-static inline void ftrace_nmi_exit(void) { }
-#endif
-
#ifdef CONFIG_FUNCTION_TRACER
#define MCOUNT_ADDR		((long)(mcount))
#define MCOUNT_INSN_SIZE	4 /* sizeof mcount call */
+0 −5
#ifndef _ASM_POWERPC_FTRACE
#define _ASM_POWERPC_FTRACE

-#ifndef __ASSEMBLY__
-static inline void ftrace_nmi_enter(void) { }
-static inline void ftrace_nmi_exit(void) { }
-#endif
-
#ifdef CONFIG_FUNCTION_TRACER
#define MCOUNT_ADDR		((long)(_mcount))
#define MCOUNT_INSN_SIZE	4 /* sizeof mcount call */
+0 −5
#ifndef __ASM_SH_FTRACE_H
#define __ASM_SH_FTRACE_H

-#ifndef __ASSEMBLY__
-static inline void ftrace_nmi_enter(void) { }
-static inline void ftrace_nmi_exit(void) { }
-#endif
-
#ifndef __ASSEMBLY__
extern void mcount(void);
#endif