Commit f0fabf9c authored by Ravi Bangoria, committed by Arnaldo Carvalho de Melo

perf mem/c2c: Fix perf_mem_events to support powerpc

PowerPC hardware does not have a builtin latency filter (--ldlat) for
the "mem-load" event and perf_mem_events by default includes
"/ldlat=30/" which is causing a failure on PowerPC. Refactor the code to
support "perf mem/c2c" on PowerPC.
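
As a rough illustration of the refactor described above, the sketch below contrasts the two ways of naming the default events. It is a minimal standalone approximation, not the code in this patch: `enum mem_event`, `loads_ldlat`, `generic_mem_event_name()` and `powerpc_mem_event_name()` are hypothetical names chosen for the example. In the patch itself both variants are called `perf_mem_events__name()`, and the generic one is marked `__weak` so that the PowerPC definition replaces it at link time.

```c
#include <stdio.h>

/* Hypothetical stand-in for the perf enum; only LOAD and STORE are modelled. */
enum mem_event { MEM_EVENT_LOAD, MEM_EVENT_STORE };

/* Assumed default threshold, matching the "ldlat=30" shown in the docs. */
static unsigned int loads_ldlat = 30;

/*
 * Generic (x86-style) variant: the load event name embeds the ldlat filter.
 * In the patch this role is played by the __weak perf_mem_events__name().
 */
const char *generic_mem_event_name(enum mem_event ev)
{
	static char loads_name[100];

	if (ev == MEM_EVENT_LOAD) {
		snprintf(loads_name, sizeof(loads_name),
			 "cpu/mem-loads,ldlat=%u/P", loads_ldlat);
		return loads_name;
	}
	return "cpu/mem-stores/P";
}

/* PowerPC-style variant: neither event name carries an ldlat parameter. */
const char *powerpc_mem_event_name(enum mem_event ev)
{
	if (ev == MEM_EVENT_LOAD)
		return "cpu/mem-loads/";
	return "cpu/mem-stores/";
}
```

Keeping the choice behind a single function name means callers such as 'perf mem record' and 'perf c2c record' need no architecture checks of their own; the linker picks the strong definition when an architecture provides one.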

This patch depends on kernel side changes done by Madhavan:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-December/182596.html



Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Dick Fowles <fowles@inreach.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20190129132412.771-1-ravi.bangoria@linux.ibm.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
parent 489338a7
+12 −4
@@ -19,8 +19,11 @@ C2C stands for Cache To Cache.
 The perf c2c tool provides means for Shared Data C2C/HITM analysis. It allows
 you to track down the cacheline contentions.
 
-The tool is based on x86's load latency and precise store facility events
-provided by Intel CPUs. These events provide:
+On x86, the tool is based on load latency and precise store facility events
+provided by Intel CPUs. On PowerPC, the tool uses random instruction sampling
+with thresholding feature.
+
+These events provide:
   - memory address of the access
   - type of the access (load and store details)
   - latency (in cycles) of the load access
@@ -46,7 +49,7 @@ RECORD OPTIONS

 -l::
 --ldlat::
-	Configure mem-loads latency.
+	Configure mem-loads latency. (x86 only)
 
 -k::
 --all-kernel::
@@ -119,11 +122,16 @@ Following perf record options are configured by default:
   -W,-d,--phys-data,--sample-cpu
 
 Unless specified otherwise with '-e' option, following events are monitored by
-default:
+default on x86:
 
   cpu/mem-loads,ldlat=30/P
   cpu/mem-stores/P
 
+and following on PowerPC:
+
+  cpu/mem-loads/
+  cpu/mem-stores/
+
 User can pass any 'perf record' option behind '--' mark, like (to enable
 callchains and system wide monitoring):
 
+1 −1
@@ -82,7 +82,7 @@ RECORD OPTIONS
 	Be more verbose (show counter open errors, etc)
 
 --ldlat <n>::
-	Specify desired latency for loads event.
+	Specify desired latency for loads event. (x86 only)
 
 In addition, for report all perf report options are valid, and for record
 all perf record options.
+1 −0
@@ -2,6 +2,7 @@ libperf-y += header.o
 libperf-y += sym-handling.o
 libperf-y += kvm-stat.o
 libperf-y += perf_regs.o
+libperf-y += mem-events.o
 
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
 libperf-$(CONFIG_DWARF) += skip-callchain-idx.o
+11 −0
+// SPDX-License-Identifier: GPL-2.0
+#include "mem-events.h"
+
+/* PowerPC does not support 'ldlat' parameter. */
+char *perf_mem_events__name(int i)
+{
+	if (i == PERF_MEM_EVENTS__LOAD)
+		return (char *) "cpu/mem-loads/";
+
+	return (char *) "cpu/mem-stores/";
+}
+1 −1
@@ -28,7 +28,7 @@ struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
 static char mem_loads_name[100];
 static bool mem_loads_name__init;
 
-char *perf_mem_events__name(int i)
+char * __weak perf_mem_events__name(int i)
 {
 	if (i == PERF_MEM_EVENTS__LOAD) {
 		if (!mem_loads_name__init) {