Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 38f0b33e authored by Linus Torvalds's avatar Linus Torvalds
Browse files

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
 "A larger set of updates for perf.

  Kernel:

   - Handle the SBOX uncore monitoring correctly on Broadwell CPUs which
     do not have SBOX.

   - Store context switch out type in PERF_RECORD_SWITCH[_CPU_WIDE]. The
     percentage of preempting and non-preempting context switches help
     understanding the nature of workloads (CPU or IO bound) that are
     running on a machine. This adds the kernel facility and userspace
     changes needed to show this information in 'perf script' and 'perf
     report -D' (Alexey Budankov)

   - Remove a WARN_ON() in the trace/kprobes code which is pointless
     because the return error code is already telling the caller what's
     wrong.

   - Revert a fugly workaround for clang BPF targets.

   - Fix sample_max_stack maximum check and do not proceed when an error
     has been detect, return them to avoid misidentifying errors (Jiri
     Olsa)

   - Add SPDX idenitifiers and get rid of GPL boilderplate.

  Tools:

   - Synchronize kernel ABI headers, v4.17-rc1 (Ingo Molnar)

   - Support MAP_FIXED_NOREPLACE, noticed when updating the
     tools/include/ copies (Arnaldo Carvalho de Melo)

   - Add '\n' at the end of parse-options error messages (Ravi Bangoria)

   - Add s390 support for detailed/verbose PMU event description (Thomas
     Richter)

   - perf annotate fixes and improvements:

      * Allow showing offsets in more than just jump targets, use the
        new 'O' hotkey in the TUI, config ~/.perfconfig
        annotate.offset_level for it and for --stdio2 (Arnaldo Carvalho
        de Melo)

      * Use the resolved variable names from objdump disassembled lines
        to make them more compact, just like was already done for some
        instructions, like "mov", this eventually will be done more
        generally, but lets now add some more to the existing mechanism
        (Arnaldo Carvalho de Melo)

   - perf record fixes:

      * Change warning for missing topology sysfs entry to debug, as not
        all architectures have those files, s390 being one of those
        (Thomas Richter)

      * Remove old error messages about things that unlikely to be the
        root cause in modern systems (Andi Kleen)

   - perf sched fixes:

      * Fix -g/--call-graph documentation (Takuya Yamamoto)

   - perf stat:

      * Enable 1ms interval for printing event counters values in
        (Alexey Budankov)

   - perf test fixes:

      * Run dwarf unwind on arm32 (Kim Phillips)

      * Remove unused ptrace.h include from LLVM test, sidesteping older
        clang's lack of support for some asm constructs (Arnaldo
        Carvalho de Melo)

      * Fixup BPF test using epoll_pwait syscall function probe, to cope
        with the syscall routines renames performed in this development
        cycle (Arnaldo Carvalho de Melo)

   - perf version fixes:

      * Do not print info about HAVE_LIBAUDIT_SUPPORT in 'perf version
        --build-options' when HAVE_SYSCALL_TABLE_SUPPORT is true, as
        libaudit won't be used in that case, print info about
        syscall_table support instead (Jin Yao)

   - Build system fixes:

      * Use HAVE_..._SUPPORT used consistently (Jin Yao)

      * Restore READ_ONCE() C++ compatibility in tools/include (Mark
        Rutland)

      * Give hints about package names needed to build jvmti (Arnaldo
        Carvalho de Melo)"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
  perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs
  perf/x86/intel/uncore: Revert "Remove SBOX support for Broadwell server"
  coresight: Move to SPDX identifier
  perf test BPF: Fixup BPF test using epoll_pwait syscall function probe
  perf tests mmap: Show which tracepoint is failing
  perf tools: Add '\n' at the end of parse-options error messages
  perf record: Remove suggestion to enable APIC
  perf record: Remove misleading error suggestion
  perf hists browser: Clarify top/report browser help
  perf mem: Allow all record/report options
  perf trace: Support MAP_FIXED_NOREPLACE
  perf: Remove superfluous allocation error check
  perf: Fix sample_max_stack maximum check
  perf: Return proper values for user stack errors
  perf list: Add s390 support for detailed/verbose PMU event description
  perf script: Extend misc field decoding with switch out event type
  perf report: Extend raw dump (-D) out with switch out event type
  perf/core: Store context switch out type in PERF_RECORD_SWITCH[_CPU_WIDE]
  tools/headers: Synchronize kernel ABI headers, v4.17-rc1
  trace_kprobe: Remove warning message "Could not insert probe at..."
  ...
parents 18de45a9 c042f7e9
Loading
Loading
Loading
Loading
+37 −0
Original line number Diff line number Diff line
@@ -3028,10 +3028,27 @@ static struct intel_uncore_type bdx_uncore_cbox = {
	.format_group		= &hswep_uncore_cbox_format_group,
};

static struct intel_uncore_type bdx_uncore_sbox = {
	.name			= "sbox",
	.num_counters		= 4,
	.num_boxes		= 4,
	.perf_ctr_bits		= 48,
	.event_ctl		= HSWEP_S0_MSR_PMON_CTL0,
	.perf_ctr		= HSWEP_S0_MSR_PMON_CTR0,
	.event_mask		= HSWEP_S_MSR_PMON_RAW_EVENT_MASK,
	.box_ctl		= HSWEP_S0_MSR_PMON_BOX_CTL,
	.msr_offset		= HSWEP_SBOX_MSR_OFFSET,
	.ops			= &hswep_uncore_sbox_msr_ops,
	.format_group		= &hswep_uncore_sbox_format_group,
};

#define BDX_MSR_UNCORE_SBOX	3

static struct intel_uncore_type *bdx_msr_uncores[] = {
	&bdx_uncore_ubox,
	&bdx_uncore_cbox,
	&hswep_uncore_pcu,
	&bdx_uncore_sbox,
	NULL,
};

@@ -3043,10 +3060,25 @@ static struct event_constraint bdx_uncore_pcu_constraints[] = {

void bdx_uncore_cpu_init(void)
{
	int pkg = topology_phys_to_logical_pkg(0);

	if (bdx_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
		bdx_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
	uncore_msr_uncores = bdx_msr_uncores;

	/* BDX-DE doesn't have SBOX */
	if (boot_cpu_data.x86_model == 86) {
		uncore_msr_uncores[BDX_MSR_UNCORE_SBOX] = NULL;
	/* Detect systems with no SBOXes */
	} else if (uncore_extra_pci_dev[pkg].dev[HSWEP_PCI_PCU_3]) {
		struct pci_dev *pdev;
		u32 capid4;

		pdev = uncore_extra_pci_dev[pkg].dev[HSWEP_PCI_PCU_3];
		pci_read_config_dword(pdev, 0x94, &capid4);
		if (((capid4 >> 6) & 0x3) == 0)
			bdx_msr_uncores[BDX_MSR_UNCORE_SBOX] = NULL;
	}
	hswep_uncore_pcu.constraints = bdx_uncore_pcu_constraints;
}

@@ -3264,6 +3296,11 @@ static const struct pci_device_id bdx_uncore_pci_ids[] = {
		PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x6f46),
		.driver_data = UNCORE_PCI_DEV_DATA(UNCORE_EXTRA_PCI_DEV, 2),
	},
	{ /* PCU.3 (for Capability registers) */
		PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x6fc0),
		.driver_data = UNCORE_PCI_DEV_DATA(UNCORE_EXTRA_PCI_DEV,
						   HSWEP_PCI_PCU_3),
	},
	{ /* end: all zeroes */ }
};

+0 −2
Original line number Diff line number Diff line
@@ -136,7 +136,6 @@
#endif

#ifndef __ASSEMBLY__
#ifndef __BPF__
/*
 * This output constraint should be used for any inline asm which has a "call"
 * instruction.  Otherwise the asm may be inserted before the frame pointer
@@ -146,6 +145,5 @@
register unsigned long current_stack_pointer asm(_ASM_SP);
#define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)
#endif
#endif

#endif /* _ASM_X86_ASM_H */
+1 −12
Original line number Diff line number Diff line
/* SPDX-License-Identifier: GPL-2.0 */
/*
 * Copyright(C) 2015 Linaro Limited. All rights reserved.
 * Author: Mathieu Poirier <mathieu.poirier@linaro.org>
 *
 * This program is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 as published by
 * the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
 * more details.
 *
 * You should have received a copy of the GNU General Public License along with
 * this program.  If not, see <http://www.gnu.org/licenses/>.
 */

#ifndef _LINUX_CORESIGHT_PMU_H
+15 −3
Original line number Diff line number Diff line
@@ -650,11 +650,23 @@ struct perf_event_mmap_page {
#define PERF_RECORD_MISC_COMM_EXEC		(1 << 13)
#define PERF_RECORD_MISC_SWITCH_OUT		(1 << 13)
/*
 * These PERF_RECORD_MISC_* flags below are safely reused
 * for the following events:
 *
 *   PERF_RECORD_MISC_EXACT_IP           - PERF_RECORD_SAMPLE of precise events
 *   PERF_RECORD_MISC_SWITCH_OUT_PREEMPT - PERF_RECORD_SWITCH* events
 *
 *
 * PERF_RECORD_MISC_EXACT_IP:
 *   Indicates that the content of PERF_SAMPLE_IP points to
 *   the actual instruction that triggered the event. See also
 *   perf_event_attr::precise_ip.
 *
 * PERF_RECORD_MISC_SWITCH_OUT_PREEMPT:
 *   Indicates that thread was preempted in TASK_RUNNING state.
 */
#define PERF_RECORD_MISC_EXACT_IP		(1 << 14)
#define PERF_RECORD_MISC_SWITCH_OUT_PREEMPT	(1 << 14)
/*
 * Reserve the last bit to indicate some extended misc field
 */
+11 −14
Original line number Diff line number Diff line
@@ -119,10 +119,6 @@ int get_callchain_buffers(int event_max_stack)
		goto exit;
	}

	if (count > 1) {
		/* If the allocation failed, give up */
		if (!callchain_cpus_entries)
			err = -ENOMEM;
	/*
	 * If requesting per event more than the global cap,
	 * return a different error to help userspace figure
@@ -130,11 +126,12 @@ int get_callchain_buffers(int event_max_stack)
	 *
	 * And also do it here so that we have &callchain_mutex held.
	 */
		if (event_max_stack > sysctl_perf_event_max_stack)
	if (event_max_stack > sysctl_perf_event_max_stack) {
		err = -EOVERFLOW;
		goto exit;
	}

	if (count == 1)
		err = alloc_callchain_buffers();
exit:
	if (err)
Loading