Commit e0972916 authored by Linus Torvalds

Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
 "Features:

   - Add "uretprobes" - an optimization to uprobes, like kretprobes are
     an optimization to kprobes.  "perf probe -x file sym%return" now
     works like kretprobes.  By Oleg Nesterov.

   - Introduce per core aggregation in 'perf stat', from Stephane
     Eranian.

   - Add memory profiling via PEBS, from Stephane Eranian.

   - Event group view for 'annotate' in --stdio, --tui and --gtk, from
     Namhyung Kim.

   - Add support for AMD NB and L2I "uncore" counters, by Jacob Shin.

   - Add Ivy Bridge-EP uncore support, by Zheng Yan

   - IBM zEnterprise EC12 oprofile support patchlet from Robert Richter.

   - Add perf test entries for checking breakpoint overflow signal
     handler issues, from Jiri Olsa.

   - Add perf test entry for checking number of EXIT events, from
     Namhyung Kim.

   - Add perf test entries for checking --cpu in record and stat, from
     Jiri Olsa.

   - Introduce perf stat --repeat forever, from Frederik Deweerdt.

   - Add --no-demangle to report/top, from Namhyung Kim.

   - PowerPC fixes plus a couple of cleanups/optimizations in uprobes
     and trace_uprobes, by Oleg Nesterov.

  Various fixes and refactorings:

   - Fix dependency of the python binding wrt libtraceevent, from
     Naohiro Aota.

   - Simplify some perf_evlist methods to allow 'stat' to share code
     with 'record' and 'trace', by Arnaldo Carvalho de Melo.

   - Remove dead code related to libtraceevent integration, from
     Namhyung Kim.

   - Revert "perf sched: Handle PERF_RECORD_EXIT events" to get 'perf
     sched lat' back working, by Arnaldo Carvalho de Melo

   - We don't use Newt anymore, just plain libslang, by Arnaldo Carvalho
     de Melo.

   - Kill a bunch of die() calls, from Namhyung Kim.

   - Fix build on non-glibc systems due to libio.h absence, from Cody P
     Schafer.

   - Remove some perf_session and tracing dead code, from David Ahern.

   - Honor parallel jobs, fix from Borislav Petkov

   - Introduce tools/lib/lk library, initially just removing duplication
     among tools/perf and tools/vm, from Borislav Petkov.

  ... and many more that I did not list, see the shortlog and git log
  for more details."

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (136 commits)
  perf/x86/intel/P4: Robistify P4 PMU types
  perf/x86/amd: Fix AMD NB and L2I "uncore" support
  perf/x86/amd: Remove old-style NB counter support from perf_event_amd.c
  perf/x86: Check all MSRs before passing hw check
  perf/x86/amd: Add support for AMD NB and L2I "uncore" counters
  perf/x86/intel: Add Ivy Bridge-EP uncore support
  perf/x86/intel: Fix SNB-EP CBO and PCU uncore PMU filter management
  perf/x86: Avoid kfree() in CPU_{STARTING,DYING}
  uprobes/perf: Avoid perf_trace_buf_prepare/submit if ->perf_events is empty
  uprobes/tracing: Don't pass addr=ip to perf_trace_buf_submit()
  uprobes/tracing: Change create_trace_uprobe() to support uretprobes
  uprobes/tracing: Make seq_printf() code uretprobe-friendly
  uprobes/tracing: Make register_uprobe_event() paths uretprobe-friendly
  uprobes/tracing: Make uprobe_{trace,perf}_print() uretprobe-friendly
  uprobes/tracing: Introduce is_ret_probe() and uretprobe_dispatcher()
  uprobes/tracing: Introduce uprobe_{trace,perf}_print() helpers
  uprobes/tracing: Generalize struct uprobe_trace_entry_head
  uprobes/tracing: Kill the pointless local_save_flags/preempt_count calls
  uprobes/tracing: Kill the pointless seq_print_ip_sym() call
  uprobes/tracing: Kill the pointless task_pt_regs() calls
  ...
parents 1f889ec6 5ac2b5c2
+67 −47
            Uprobe-tracer: Uprobe-based Event Tracing
            =========================================

           Documentation written by Srikar Dronamraju


Overview
--------
Uprobe based trace events are similar to kprobe based trace events.
@@ -13,16 +15,18 @@ current_tracer. Instead of that, add probe points via
/sys/kernel/debug/tracing/events/uprobes/<EVENT>/enabled.

However unlike kprobe-event tracer, the uprobe event interface expects the
user to calculate the offset of the probepoint in the object
user to calculate the offset of the probepoint in the object.

Synopsis of uprobe_tracer
-------------------------
  p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS]	: Set a probe

 GRP		: Group name. If omitted, use "uprobes" for it.
 EVENT		: Event name. If omitted, the event name is generated
		  based on SYMBOL+offs.
 PATH		: path to an executable or a library.
  p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a uprobe
  r[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a return uprobe (uretprobe)
  -:[GRP/]EVENT                                  : Clear uprobe or uretprobe event

  GRP           : Group name. If omitted, "uprobes" is the default value.
  EVENT         : Event name. If omitted, the event name is generated based
                  on SYMBOL+offs.
  PATH          : Path to an executable or a library.
  SYMBOL[+offs] : Symbol+offset where the probe is inserted.

  FETCHARGS     : Arguments. Each probe can have up to 128 args.
@@ -37,20 +41,29 @@ the third is the number of probe miss-hits.

Usage examples
--------------
To add a probe as a new event, write a new definition to uprobe_events
as below.
 * Add a probe as a new uprobe event, write a new definition to uprobe_events
as below: (sets a uprobe at an offset of 0x4245c0 in the executable /bin/bash)

    echo 'p: /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events

 This sets a uprobe at an offset of 0x4245c0 in the executable /bin/bash
 * Add a probe as a new uretprobe event:

  echo > /sys/kernel/debug/tracing/uprobe_events
    echo 'r: /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events

 * Unset registered event:

    echo '-:bash_0x4245c0' >> /sys/kernel/debug/tracing/uprobe_events

 * Print out the events that are registered:

 This clears all probe points.
    cat /sys/kernel/debug/tracing/uprobe_events

The following example shows how to dump the instruction pointer and %ax
a register at the probed text address.  Here we are trying to probe
function zfree in /bin/zsh
 * Clear all events:

    echo > /sys/kernel/debug/tracing/uprobe_events

Following example shows how to dump the instruction pointer and %ax register
at the probed text address. Probe zfree function in /bin/zsh:

    # cd /sys/kernel/debug/tracing/
    # cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
@@ -59,21 +72,26 @@ function zfree in /bin/zsh
    0000000000446420 g    DF .text  0000000000000012  Base        zfree

  0x46420 is the offset of zfree in object /bin/zsh that is loaded at
0x00400000. Hence the command to probe would be :
  0x00400000. Hence the command to uprobe would be:

    # echo 'p:zfree_entry /bin/zsh:0x46420 %ip %ax' > uprobe_events

    # echo 'p /bin/zsh:0x46420 %ip %ax' > uprobe_events
  And the same for the uretprobe would be:

Please note: User has to explicitly calculate the offset of the probepoint
    # echo 'r:zfree_exit /bin/zsh:0x46420 %ip %ax' >> uprobe_events

Please note: User has to explicitly calculate the offset of the probe-point
in the object. We can see the events that are registered by looking at the
uprobe_events file.

    # cat uprobe_events
    p:uprobes/p_zsh_0x46420 /bin/zsh:0x00046420 arg1=%ip arg2=%ax
    p:uprobes/zfree_entry /bin/zsh:0x00046420 arg1=%ip arg2=%ax
    r:uprobes/zfree_exit /bin/zsh:0x00046420 arg1=%ip arg2=%ax

The format of events can be seen by viewing the file events/uprobes/p_zsh_0x46420/format
Format of events can be seen by viewing the file events/uprobes/zfree_entry/format

    # cat events/uprobes/p_zsh_0x46420/format
    name: p_zsh_0x46420
    # cat events/uprobes/zfree_entry/format
    name: zfree_entry
    ID: 922
    format:
         field:unsigned short common_type;         offset:0;  size:2; signed:0;
@@ -94,6 +112,7 @@ events, you need to enable it by:
    # echo 1 > events/uprobes/enable

Lets disable the event after sleeping for some time.

    # sleep 20
    # echo 0 > events/uprobes/enable

@@ -104,10 +123,11 @@ And you can see the traced information via /sys/kernel/debug/tracing/trace.
    #
    #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
    #              | |       |          |         |
                 zsh-24842 [006] 258544.995456: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
                 zsh-24842 [007] 258545.000270: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
                 zsh-24842 [002] 258545.043929: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
                 zsh-24842 [004] 258547.046129: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79

Each line shows us probes were triggered for a pid 24842 with ip being
0x446421 and contents of ax register being 79.
                 zsh-24842 [006] 258544.995456: zfree_entry: (0x446420) arg1=446420 arg2=79
                 zsh-24842 [007] 258545.000270: zfree_exit:  (0x446540 <- 0x446420) arg1=446540 arg2=0
                 zsh-24842 [002] 258545.043929: zfree_entry: (0x446420) arg1=446420 arg2=79
                 zsh-24842 [004] 258547.046129: zfree_exit:  (0x446540 <- 0x446420) arg1=446540 arg2=0

Output shows us uprobe was triggered for a pid 24842 with ip being 0x446420
and contents of ax register being 79. And uretprobe was triggered with ip at
0x446540 with counterpart function entry at 0x446420.
+2 −2
@@ -1332,11 +1332,11 @@ kernelversion:
 # Clear a bunch of variables before executing the submake
 tools/: FORCE
 	$(Q)mkdir -p $(objtree)/tools
-	$(Q)$(MAKE) LDFLAGS= MAKEFLAGS= O=$(objtree) subdir=tools -C $(src)/tools/
+	$(Q)$(MAKE) LDFLAGS= MAKEFLAGS="$(filter --j% -j,$(MAKEFLAGS))" O=$(objtree) subdir=tools -C $(src)/tools/
 
 tools/%: FORCE
 	$(Q)mkdir -p $(objtree)/tools
-	$(Q)$(MAKE) LDFLAGS= MAKEFLAGS= O=$(objtree) subdir=tools -C $(src)/tools/ $*
+	$(Q)$(MAKE) LDFLAGS= MAKEFLAGS="$(filter --j% -j,$(MAKEFLAGS))" O=$(objtree) subdir=tools -C $(src)/tools/ $*
 
 # Single targets
 # ---------------------------------------------------------------------------
+1 −0
@@ -51,4 +51,5 @@ extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
 extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
 extern int  arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
 extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
+extern unsigned long arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs);
 #endif	/* _ASM_UPROBES_H */
+23 −6
@@ -30,6 +30,16 @@

 #define UPROBE_TRAP_NR	UINT_MAX
 
+/**
+ * is_trap_insn - check if the instruction is a trap variant
+ * @insn: instruction to be checked.
+ * Returns true if @insn is a trap variant.
+ */
+bool is_trap_insn(uprobe_opcode_t *insn)
+{
+	return (is_trap(*insn));
+}
+
 /**
  * arch_uprobe_analyze_insn
  * @mm: the probed address space.
@@ -43,12 +53,6 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe,
 	if (addr & 0x03)
 		return -EINVAL;
 
-	/*
-	 * We currently don't support a uprobe on an already
-	 * existing breakpoint instruction underneath
-	 */
-	if (is_trap(auprobe->ainsn))
-		return -ENOTSUPP;
 	return 0;
 }

@@ -188,3 +192,16 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)

 	return false;
 }
+
+unsigned long
+arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs)
+{
+	unsigned long orig_ret_vaddr;
+
+	orig_ret_vaddr = regs->link;
+
+	/* Replace the return addr with trampoline addr */
+	regs->link = trampoline_vaddr;
+
+	return orig_ret_vaddr;
+}
+1 −0
@@ -440,6 +440,7 @@ static int oprofile_hwsampler_init(struct oprofile_operations *ops)
 		switch (id.machine) {
 		case 0x2097: case 0x2098: ops->cpu_type = "s390/z10"; break;
 		case 0x2817: case 0x2818: ops->cpu_type = "s390/z196"; break;
+		case 0x2827:              ops->cpu_type = "s390/zEC12"; break;
 		default: return -ENODEV;
 		}
 	}