Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 174e7194 authored by Linus Torvalds's avatar Linus Torvalds
Browse files

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull more perf updates from Thomas Gleixner:
 "A rather large set of perf updates:

  Kernel:

   - Fix various initialization issues

   - Prevent creating [ku]probes for not CAP_SYS_ADMIN users

  Tooling:

   - Show only failing syscalls with 'perf trace --failure' (Arnaldo
     Carvalho de Melo)

            e.g: See what 'openat' syscalls are failing:

        # perf trace --failure -e openat
         762.323 ( 0.007 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video2) = -1 ENOENT No such file or directory
         <SNIP N /dev/videoN open attempts... sigh, where is that improvised camera lid?!? >
         790.228 ( 0.008 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video63) = -1 ENOENT No such file or directory
        ^C#

   - Show information about the event (freq, nr_samples, total
     period/nr_events) in the annotate --tui and --stdio2 'perf
     annotate' output, similar to the first line in the 'perf report
     --tui', but just for the samples for a the annotated symbol
     (Arnaldo Carvalho de Melo)

   - Introduce 'perf version --build-options' to show what features were
     linked, aliased as well as a shorter 'perf -vv' (Jin Yao)

   - Add a "dso_size" sort order (Kim Phillips)

   - Remove redundant ')' in the tracepoint output in 'perf trace'
     (Changbin Du)

   - Synchronize x86's cpufeatures.h, no effect on toolss (Arnaldo
     Carvalho de Melo)

   - Show group details on the title line in the annotate browser and
     'perf annotate --stdio2' output, so that the per-event columns can
     have headers (Arnaldo Carvalho de Melo)

   - Fixup vertical line separating metrics from instructions and
     cleaning unused lines at the bottom, both in the annotate TUI
     browser (Arnaldo Carvalho de Melo)

   - Remove duplicated 'samples' in lost samples warning in
     'perf report' (Arnaldo Carvalho de Melo)

   - Synchronize i915_drm.h, silencing the perf build process,
     automagically adding support for the new DRM_I915_QUERY ioctl
     (Arnaldo Carvalho de Melo)

   - Make auxtrace_queues__add_buffer() allocate struct buffer, from a
     patchkit already applied (Adrian Hunter)

   - Fix the --stdio2/TUI annotate output to include group details, be
     it for a recorded '{a,b,f}' explicit event group or when forcing
     group display using 'perf report --group' for a set of events not
     recorded as a group (Arnaldo Carvalho de Melo)

   - Fix display artifacts in the ui browser (base class for the
     annotate and main report/top TUI browser) related to the extra
     title lines work (Arnaldo Carvalho de Melo)

   - perf auxtrace refactorings, leftovers from a previously partially
     processed patchset (Adrian Hunter)

   - Fix the builtin clang build (Sandipan Das, Arnaldo Carvalho de
     Melo)

   - Synchronize i915_drm.h, silencing a perf build warning and in the
     process automagically adding support for a new ioctl command
     (Arnaldo Carvalho de Melo)

   - Fix a strncpy issue in uprobe tracing"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  perf/core: Need CAP_SYS_ADMIN to create k/uprobe with perf_event_open()
  tracing/uprobe_event: Fix strncpy corner case
  perf/core: Fix perf_uprobe_init()
  perf/core: Fix perf_kprobe_init()
  perf/core: Fix use-after-free in uprobe_perf_close()
  perf tests clang: Fix function name for clang IR test
  perf clang: Add support for recent clang versions
  perf tools: Fix perf builds with clang support
  perf tools: No need to include namespaces.h in util.h
  perf hists browser: Remove leftover from row returned from refresh
  perf hists browser: Show extra_title_lines in the 'D' debug hotkey
  perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering
  tools headers uapi: Synchronize i915_drm.h
  perf report: Remove duplicated 'samples' in lost samples warning
  perf ui browser: Fixup cleaning unused lines at the bottom
  perf annotate browser: Fixup vertical line separating metrics from instructions
  perf annotate: Show group details on the title line
  perf auxtrace: Make auxtrace_queues__add_buffer() allocate struct buffer
  perf/x86/intel: Move regs->flags EXACT bit init
  perf trace: Remove redundant ')'
  ...
parents 19ca90de 32e6e967
Loading
Loading
Loading
Loading
+24 −10
Original line number Diff line number Diff line
@@ -1153,7 +1153,6 @@ static void setup_pebs_sample_data(struct perf_event *event,
	if (pebs == NULL)
		return;

	regs->flags &= ~PERF_EFLAGS_EXACT;
	sample_type = event->attr.sample_type;
	dsrc = sample_type & PERF_SAMPLE_DATA_SRC;

@@ -1197,7 +1196,13 @@ static void setup_pebs_sample_data(struct perf_event *event,
	 * and PMI.
	 */
	*regs = *iregs;
	regs->flags = pebs->flags;

	/*
	 * Initialize regs_>flags from PEBS,
	 * Clear exact bit (which uses x86 EFLAGS Reserved bit 3),
	 * i.e., do not rely on it being zero:
	 */
	regs->flags = pebs->flags & ~PERF_EFLAGS_EXACT;

	if (sample_type & PERF_SAMPLE_REGS_INTR) {
		regs->ax = pebs->ax;
@@ -1217,10 +1222,6 @@ static void setup_pebs_sample_data(struct perf_event *event,
			regs->sp = pebs->sp;
		}

		/*
		 * Preserve PERF_EFLAGS_VM from set_linear_ip().
		 */
		regs->flags = pebs->flags | (regs->flags & PERF_EFLAGS_VM);
#ifndef CONFIG_X86_32
		regs->r8 = pebs->r8;
		regs->r9 = pebs->r9;
@@ -1234,20 +1235,33 @@ static void setup_pebs_sample_data(struct perf_event *event,
	}

	if (event->attr.precise_ip > 1) {
		/* Haswell and later have the eventing IP, so use it: */
		/*
		 * Haswell and later processors have an 'eventing IP'
		 * (real IP) which fixes the off-by-1 skid in hardware.
		 * Use it when precise_ip >= 2 :
		 */
		if (x86_pmu.intel_cap.pebs_format >= 2) {
			set_linear_ip(regs, pebs->real_ip);
			regs->flags |= PERF_EFLAGS_EXACT;
		} else {
			/* Otherwise use PEBS off-by-1 IP: */
			/* Otherwise, use PEBS off-by-1 IP: */
			set_linear_ip(regs, pebs->ip);

			/* ... and try to fix it up using the LBR entries: */
			/*
			 * With precise_ip >= 2, try to fix up the off-by-1 IP
			 * using the LBR. If successful, the fixup function
			 * corrects regs->ip and calls set_linear_ip() on regs:
			 */
			if (intel_pmu_pebs_fixup_ip(regs))
				regs->flags |= PERF_EFLAGS_EXACT;
		}
	} else
	} else {
		/*
		 * When precise_ip == 1, return the PEBS off-by-1 IP,
		 * no fixup attempted:
		 */
		set_linear_ip(regs, pebs->ip);
	}


	if ((sample_type & (PERF_SAMPLE_ADDR | PERF_SAMPLE_PHYS_ADDR)) &&
+14 −0
Original line number Diff line number Diff line
@@ -4447,6 +4447,9 @@ static void _free_event(struct perf_event *event)
	if (event->ctx)
		put_ctx(event->ctx);

	if (event->hw.target)
		put_task_struct(event->hw.target);

	exclusive_event_destroy(event);
	module_put(event->pmu->module);

@@ -8397,6 +8400,10 @@ static int perf_kprobe_event_init(struct perf_event *event)

	if (event->attr.type != perf_kprobe.type)
		return -ENOENT;

	if (!capable(CAP_SYS_ADMIN))
		return -EACCES;

	/*
	 * no branch sampling for probe events
	 */
@@ -8434,6 +8441,10 @@ static int perf_uprobe_event_init(struct perf_event *event)

	if (event->attr.type != perf_uprobe.type)
		return -ENOENT;

	if (!capable(CAP_SYS_ADMIN))
		return -EACCES;

	/*
	 * no branch sampling for probe events
	 */
@@ -9955,6 +9966,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
		 * and we cannot use the ctx information because we need the
		 * pmu before we get a ctx.
		 */
		get_task_struct(task);
		event->hw.target = task;
	}

@@ -10070,6 +10082,8 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
		perf_detach_cgroup(event);
	if (event->ns)
		put_pid_ns(event->ns);
	if (event->hw.target)
		put_task_struct(event->hw.target);
	kfree(event);

	return ERR_PTR(err);
+4 −0
Original line number Diff line number Diff line
@@ -252,6 +252,8 @@ int perf_kprobe_init(struct perf_event *p_event, bool is_retprobe)
		ret = strncpy_from_user(
			func, u64_to_user_ptr(p_event->attr.kprobe_func),
			KSYM_NAME_LEN);
		if (ret == KSYM_NAME_LEN)
			ret = -E2BIG;
		if (ret < 0)
			goto out;

@@ -300,6 +302,8 @@ int perf_uprobe_init(struct perf_event *p_event, bool is_retprobe)
		return -ENOMEM;
	ret = strncpy_from_user(
		path, u64_to_user_ptr(p_event->attr.uprobe_path), PATH_MAX);
	if (ret == PATH_MAX)
		return -E2BIG;
	if (ret < 0)
		goto out;
	if (path[0] == '\0') {
+2 −0
Original line number Diff line number Diff line
@@ -151,6 +151,8 @@ static void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
		return;

	ret = strncpy_from_user(dst, src, maxlen);
	if (ret == maxlen)
		dst[--ret] = '\0';

	if (ret < 0) {	/* Failed to fetch string */
		((u8 *)get_rloc_data(dest))[0] = '\0';
+2 −0
Original line number Diff line number Diff line
@@ -316,6 +316,7 @@
#define X86_FEATURE_VPCLMULQDQ		(16*32+10) /* Carry-Less Multiplication Double Quadword */
#define X86_FEATURE_AVX512_VNNI		(16*32+11) /* Vector Neural Network Instructions */
#define X86_FEATURE_AVX512_BITALG	(16*32+12) /* Support for VPOPCNT[B,W] and VPSHUF-BITQMB instructions */
#define X86_FEATURE_TME			(16*32+13) /* Intel Total Memory Encryption */
#define X86_FEATURE_AVX512_VPOPCNTDQ	(16*32+14) /* POPCNT for vectors of DW/QW */
#define X86_FEATURE_LA57		(16*32+16) /* 5-level page tables */
#define X86_FEATURE_RDPID		(16*32+22) /* RDPID instruction */
@@ -328,6 +329,7 @@
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
#define X86_FEATURE_AVX512_4VNNIW	(18*32+ 2) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS	(18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
#define X86_FEATURE_PCONFIG		(18*32+18) /* Intel PCONFIG */
#define X86_FEATURE_SPEC_CTRL		(18*32+26) /* "" Speculation Control (IBRS + IBPB) */
#define X86_FEATURE_INTEL_STIBP		(18*32+27) /* "" Single Thread Indirect Branch Predictors */
#define X86_FEATURE_ARCH_CAPABILITIES	(18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */
Loading