Commit 3ce5aceb authored by Ingo Molnar

Merge tag 'perf-core-for-mingo-5.3-20190611' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf record:

  Alexey Budankov:

  - Allow mixing --user-regs with --call-graph=dwarf, making sure that
    the minimal set of registers for DWARF unwinding is present in the
    set of user registers requested to be present in each sample, while
    warning the user that this may make callchains unreliable if more
    than the minimal set of registers is needed to unwind.

  yuzhoujian:

  - Add support to collect callchains from kernel or user space only,
    IOW allow setting the perf_event_attr.exclude_callchain_{kernel,user}
    bits from the command line.

perf trace:

  Arnaldo Carvalho de Melo:

  - Remove x86_64 specific syscall numbers from the augmented_raw_syscalls
    BPF in-kernel collector of augmented raw_syscalls:sys_{enter,exit}
    payloads; instead use the syscall numbers obtained either by the
    arch specific syscalltbl generators or from audit-libs.

  - Allow 'perf trace' to ask for the number of bytes to collect for
    string arguments, for now ask for PATH_MAX, i.e. the whole
    pathnames, which ends up being just a way to specify which syscall
    args are pathnames and thus should be read using bpf_probe_read_str().

  - Skip unknown syscalls when expanding strace like syscall groups.
    This helps the 'string' group of syscalls work on arm64, where
    some of the syscalls present in x86_64 that deal with strings,
    for instance 'access', are deprecated and thus should not be
    requested for tracing.

  Leo Yan:

  - Exit when failing to build eBPF program.

perf config:

  Arnaldo Carvalho de Melo:

  - Bail out when a handler returns failure for a key-value pair. This
    helps with cases where processing a key-value pair is not just a
    matter of setting some tool specific knob, involving, for instance
    building a BPF program to then attach to the list of events 'perf
    trace' will use, e.g. augmented_raw_syscalls.c.

perf.data:

  Kan Liang:

  - Read and store die ID information available in new Intel processors
    in CPUID.1F in the CPU topology written in the perf.data header.

perf stat:

  Kan Liang:

  - Support per-die aggregation.

Documentation:

  Arnaldo Carvalho de Melo:

  - Update perf.data documentation about the CPU_TOPOLOGY, MEM_TOPOLOGY,
    CLOCKID and DIR_FORMAT headers.

  Song Liu:

  - Add description of headers HEADER_BPF_PROG_INFO and HEADER_BPF_BTF.

  Leo Yan:

  - Update default value for llvm.clang-bpf-cmd-template in 'man perf-config'.

JVMTI:

  Jiri Olsa:

  - Address gcc string overflow warning for strncpy()

core:

  - Remove superfluous nthreads system_wide setup in perf_evsel__alloc_fd().

Intel PT:

  Adrian Hunter:

  - Add support for samples to contain IPC ratio, collecting cycles
    information from CYC packets and showing the IPC info periodically.
    Because Intel PT does not update the cycle count on every branch or
    instruction, the incremental values will often be zero.  When there
    are values, they will be the number of instructions and number of
    cycles since the last update, and thus represent the average IPC
    since the last IPC value.

    E.g.:

    # perf record --cpu 1 -m200000 -a -e intel_pt/cyc/u sleep 0.0001
    rounding mmap pages size to 1024M (262144 pages)
    [ perf record: Woken up 0 times to write data ]
    [ perf record: Captured and wrote 2.208 MB perf.data ]
    # perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid
    #
    <SNIP + add line numbering to make sense of IPC counts e.g.: (18/3)>
    1   cc1 63501.650479626: 7f5219ac27bf _int_free+0x3f   jnz 0x7f5219ac2af0       IPC: 0.81 (36/44)
    2   cc1 63501.650479626: 7f5219ac27c5 _int_free+0x45   cmp $0x1f, %rbp
    3   cc1 63501.650479626: 7f5219ac27c9 _int_free+0x49   jbe 0x7f5219ac2b00
    4   cc1 63501.650479626: 7f5219ac27cf _int_free+0x4f   test $0x8, %al
    5   cc1 63501.650479626: 7f5219ac27d1 _int_free+0x51   jnz 0x7f5219ac2b00
    6   cc1 63501.650479626: 7f5219ac27d7 _int_free+0x57   movq  0x13c58a(%rip), %rcx
    7   cc1 63501.650479626: 7f5219ac27de _int_free+0x5e   mov %rdi, %r12
    8   cc1 63501.650479626: 7f5219ac27e1 _int_free+0x61   movq  %fs:(%rcx), %rax
    9   cc1 63501.650479626: 7f5219ac27e5 _int_free+0x65   test %rax, %rax
   10   cc1 63501.650479626: 7f5219ac27e8 _int_free+0x68   jz 0x7f5219ac2821
   11   cc1 63501.650479626: 7f5219ac27ea _int_free+0x6a   leaq  -0x11(%rbp), %rdi
   12   cc1 63501.650479626: 7f5219ac27ee _int_free+0x6e   mov %rdi, %rsi
   13   cc1 63501.650479626: 7f5219ac27f1 _int_free+0x71   shr $0x4, %rsi
   14   cc1 63501.650479626: 7f5219ac27f5 _int_free+0x75   cmpq  %rsi, 0x13caf4(%rip)
   15   cc1 63501.650479626: 7f5219ac27fc _int_free+0x7c   jbe 0x7f5219ac2821
   16   cc1 63501.650479626: 7f5219ac2821 _int_free+0xa1   cmpq  0x13f138(%rip), %rbp
   17   cc1 63501.650479626: 7f5219ac2828 _int_free+0xa8   jnbe 0x7f5219ac28d8
   18   cc1 63501.650479626: 7f5219ac28d8 _int_free+0x158  testb  $0x2, 0x8(%rbx)
   19   cc1 63501.650479628: 7f5219ac28dc _int_free+0x15c  jnz 0x7f5219ac2ab0       IPC: 6.00 (18/3)
    <SNIP>

  - Allow using time ranges with Intel PT, i.e. these features, already
    present but not optimally usable with Intel PT, should now be:

        Select the second 10% time slice:

        $ perf script --time 10%/2

        Select from 0% to 10% time slice:

        $ perf script --time 0%-10%

        Select the first and second 10% time slices:

        $ perf script --time 10%/1,10%/2

        Select from 0% to 10% and 30% to 40% slices:

        $ perf script --time 0%-10%,30%-40%
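
The slice syntax above amounts to simple arithmetic over the first and last timestamps of the trace. The following is a minimal Python sketch of that arithmetic, not perf's actual parser; the function names percent_slice and percent_slices are hypothetical, and the trace start and end are assumed to be known in seconds.

```python
def percent_slice(spec, start, end):
    """Return (lo, hi) for one spec such as '10%/2' or '0%-10%'."""
    span = end - start
    if "/" in spec:
        # 'P%/N' selects the N-th slice of width P percent (N counts from 1).
        pct, index = spec.split("/")
        width = span * float(pct.rstrip("%")) / 100.0
        lo = start + (int(index) - 1) * width
        return lo, lo + width
    # 'A%-B%' selects from A percent to B percent of the trace.
    first, last = spec.split("-")
    lo = start + span * float(first.rstrip("%")) / 100.0
    hi = start + span * float(last.rstrip("%")) / 100.0
    return lo, hi


def percent_slices(arg, start, end):
    """Split a comma-separated --time argument into a list of windows."""
    return [percent_slice(spec, start, end) for spec in arg.split(",")]
```

For a trace assumed to run from 0 to 100 seconds, percent_slices("10%/1,10%/2", 0.0, 100.0) yields the windows 0 to 10 and 10 to 20 seconds.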

cs-etm (ARM):

  Mathieu Poirier:

  - Add support for CPU-wide trace scenarios.

s390:

  Thomas Richter:

  - Fix missing kvm module load for s390.

  - Fix OOM error in TUI mode on s390

  - Support s390 diag event display when doing analysis on !s390
    architectures.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
parents d0e1a507 04c41bcb
+41 −0
Database Export
===============

perf tool's python scripting engine:

	tools/perf/util/scripting-engines/trace-event-python.c

supports scripts:

	tools/perf/scripts/python/export-to-sqlite.py
	tools/perf/scripts/python/export-to-postgresql.py

which export data to a SQLite3 or PostgreSQL database.

The export process provides records with unique sequential ids which allows the
data to be imported directly to a database and provides the relationships
between tables.

Over time it is possible to continue to expand the export while maintaining
backward and forward compatibility, by following some simple rules:

1. Because of the nature of SQL, existing tables and columns can continue to be
used so long as the names and meanings (and to some extent data types) remain
the same.

2. New tables and columns can be added, without affecting existing SQL queries,
so long as the new names are unique.

3. Scripts that use a database (e.g. exported-sql-viewer.py) can maintain
backward compatibility by testing for the presence of new tables and columns
before using them. e.g. function IsSelectable() in exported-sql-viewer.py

4. The export scripts themselves maintain forward compatibility (i.e. an existing
script will continue to work with new versions of perf) by accepting a variable
number of arguments (e.g. def call_return_table(*x)) i.e. perf can pass more
arguments which old scripts will ignore.

5. The scripting engine tests for the existence of script handler functions
before calling them.  The scripting engine can also test for the support of new
or optional features by checking for the existence and value of script global
variables.
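
Rules 3 and 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not code taken from exported-sql-viewer.py or the export scripts: the helpers has_table and has_column are hypothetical, the 'samples' table layout and the 'insn_count' column are used only as examples, and the argument positions unpacked in call_return_table are made up for the demonstration.

```python
import sqlite3


def has_table(conn, table):
    """True if the database contains a table with this name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row is not None


def has_column(conn, table, column):
    """True if the table exists and has a column with this name."""
    if not has_table(conn, table):
        return False
    # PRAGMA cannot take bound parameters, so only pass trusted table names.
    rows = conn.execute('PRAGMA table_info("%s")' % table).fetchall()
    return any(row[1] == column for row in rows)


def call_return_table(*x):
    """Rule 4 style handler: extra trailing arguments are simply ignored."""
    call_id, parent_id = x[0:2]  # hypothetical argument positions
    return call_id, parent_id


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, ip INTEGER)")

# Rule 3: choose a query that matches the columns actually present.
if has_column(conn, "samples", "insn_count"):
    columns = "id, ip, insn_count"
else:
    columns = "id, ip"
```

Because the handler takes *x, a newer perf that passes additional values does not break it, and the viewer-side check keeps queries valid against databases produced by older exports.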
+30 −0
@@ -103,6 +103,36 @@ The flags are "bcrosyiABEx" which stand for branch, call, return, conditional,
system, asynchronous, interrupt, transaction abort, trace begin, trace end, and
in transaction, respectively.

Another interesting field that is not printed by default is 'ipc' which can be
displayed as follows:

	perf script --itrace=be -F+ipc

There are two ways that instructions-per-cycle (IPC) can be calculated depending
on the recording.

If the 'cyc' config term (see config terms section below) was used, then IPC is
calculated using the cycle count from CYC packets, otherwise MTC packets are
used - refer to the 'mtc' config term.  When MTC is used, however, the values
are less accurate because the timing is less accurate.

Because Intel PT does not update the cycle count on every branch or instruction,
the values will often be zero.  When there are values, they will be the number
of instructions and number of cycles since the last update, and thus represent
the average IPC since the last IPC for that event type.  Note IPC for "branches"
events is calculated separately from IPC for "instructions" events.

Also note that the IPC instruction count may or may not include the current
instruction.  If the cycle count is associated with an asynchronous branch
(e.g. page fault or interrupt), then the instruction count does not include the
current instruction, otherwise it does.  That is consistent with whether or not
that instruction has retired when the cycle count is updated.

Another note, in the case of "branches" events, non-taken branches are not
presently sampled, so IPC values for them do not appear e.g. a CYC packet with a
TNT packet that starts with a non-taken branch.  To see every possible IPC
value, "instructions" events can be used e.g. --itrace=i0ns
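
To post-process these values outside perf, the IPC field can be picked out of the 'perf script' text output. The sketch below assumes the field is printed in the form shown in the commit message example, i.e. "IPC: 6.00 (18/3)" meaning 18 instructions over 3 cycles; the name parse_ipc is hypothetical and the regular expression is an assumption about the output format, not something perf guarantees.

```python
import re

# Matches a trailing "IPC: <ratio> (<instructions>/<cycles>)" field.
IPC_FIELD = re.compile(r"IPC:\s*([0-9.]+)\s*\((\d+)/(\d+)\)")


def parse_ipc(line):
    """Return (instructions, cycles, instructions / cycles) or None."""
    match = IPC_FIELD.search(line)
    if match is None:
        return None  # most lines carry no IPC update
    insns, cycles = int(match.group(2)), int(match.group(3))
    if cycles == 0:
        return None  # avoid dividing by zero on unexpected input
    return insns, cycles, insns / cycles
```

The ratio is recomputed from the two counts rather than read from the printed value, so it does not depend on how perf rounds the displayed number.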

While it is possible to create scripts to analyze the data, an alternative
approach is available to export the data to a sqlite or postgresql database.
Refer to script export-to-sqlite.py or export-to-postgresql.py for more details,
+6 −3
@@ -564,9 +564,12 @@ llvm.*::
	llvm.clang-bpf-cmd-template::
		Cmdline template. Below lines show its default value. Environment
		variable is used to pass options.
-		"$CLANG_EXEC -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS \
-		-Wno-unused-value -Wno-pointer-sign -working-directory \
-		$WORKING_DIR  -c $CLANG_SOURCE -target bpf -O2 -o -"
+		"$CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS "\
+		"-DLINUX_VERSION_CODE=$LINUX_VERSION_CODE "	\
+		"$CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS " \
+		"-Wno-unused-value -Wno-pointer-sign "		\
+		"-working-directory $WORKING_DIR "		\
+		"-c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE"

	llvm.clang-opt::
		Options passed to clang.
+8 −6
@@ -142,12 +142,14 @@ OPTIONS
	  perf diff --time 0%-10%,30%-40%

	It also supports analyzing samples within a given time window
-	<start>,<stop>. Times have the format seconds.microseconds. If 'start'
-	is not given (i.e., time string is ',x.y') then analysis starts at
-	the beginning of the file. If stop time is not given (i.e, time
-	string is 'x.y,') then analysis goes to the end of the file. Time string is
-	'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps for different
-	perf.data files.
+	<start>,<stop>. Times have the format seconds.nanoseconds. If 'start'
+	is not given (i.e. time string is ',x.y') then analysis starts at
+	the beginning of the file. If stop time is not given (i.e. time
+	string is 'x.y,') then analysis goes to the end of the file.
+	Multiple ranges can be separated by spaces, which requires the argument
+	to be quoted e.g. --time "1234.567,1234.789 1235,"
+	Time string is'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps
+	for different perf.data files.

	For example, we get the timestamp information from 'perf script'.

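The <start>,<stop> window syntax described in this hunk, including open-ended and space-separated ranges, can be sketched in Python as below. This is an illustration only, not perf's parser: the name parse_time_windows is hypothetical, the ':' separator for multiple perf.data files is not handled, and floats are used for brevity where a real tool would keep integer nanoseconds.

```python
def parse_time_windows(arg):
    """Parse space-separated '<start>,<stop>' ranges into (start, stop) pairs.

    A missing start or stop is returned as None, meaning "from the beginning
    of the file" or "to the end of the file" respectively.
    """
    windows = []
    for item in arg.split():
        start, sep, stop = item.partition(",")
        if not sep:
            raise ValueError("expected <start>,<stop>: %r" % item)
        windows.append((
            float(start) if start else None,
            float(stop) if stop else None,
        ))
    return windows
```

With the quoted argument from the text, "1234.567,1234.789 1235,", the sketch returns one closed window and one window that is open at the end.
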
+11 −0
@@ -490,6 +490,17 @@ Configure all used events to run in kernel space.
--all-user::
Configure all used events to run in user space.

--kernel-callchains::
Collect callchains only from kernel space. I.e. this option sets
perf_event_attr.exclude_callchain_user to 1.

--user-callchains::
Collect callchains only from user space. I.e. this option sets
perf_event_attr.exclude_callchain_kernel to 1.

Don't use both --kernel-callchains and --user-callchains at the same time or no
callchains will be collected.
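
The interaction between the two options can be modelled in a few lines of Python. This is a toy model of the two perf_event_attr bits named above, not an interface to perf; the function names are hypothetical.

```python
def callchain_exclude_bits(kernel_callchains=False, user_callchains=False):
    """Map the two command line options to the attr bits they set."""
    return {
        # --kernel-callchains keeps kernel frames by excluding user ones.
        "exclude_callchain_user": int(kernel_callchains),
        # --user-callchains keeps user frames by excluding kernel ones.
        "exclude_callchain_kernel": int(user_callchains),
    }


def collects_any_callchain(bits):
    """False when both sides are excluded, i.e. nothing is left to collect."""
    return not (bits["exclude_callchain_user"] and bits["exclude_callchain_kernel"])
```

Passing both options sets both exclude bits, which is why the combination leaves no callchains at all.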

--timestamp-filename
Append timestamp to output file name.
