Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 8c2accc8 authored by Ingo Molnar's avatar Ingo Molnar
Browse files

Merge tag 'perf-core-for-mingo' of...

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

 into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - Allows BPF scriptlets specify arguments to be fetched using
    DWARF info, using a prologue generated at compile/build time (He Kuang, Wang Nan)

  - Allow attaching BPF scriptlets to module symbols (Wang Nan)

  - Allow attaching BPF scriptlets to userspace code using uprobe (Wang Nan)

  - BPF programs now can specify 'perf probe' tunables via its section name,
    separating key=val values using semicolons (Wang Nan)

Testing some of these new BPF features:

  Use case: get callchains when receiving SSL packets, filter then in the
            kernel, at arbitrary place.

  # cat ssl.bpf.c
  #define SEC(NAME) __attribute__((section(NAME), used))

  struct pt_regs;

  SEC("func=__inet_lookup_established hnum")
  int func(struct pt_regs *ctx, int err, unsigned short port)
  {
          return err == 0 && port == 443;
  }

  char _license[] SEC("license") = "GPL";
  int  _version   SEC("version") = LINUX_VERSION_CODE;
  #
  # perf record -a -g -e ssl.bpf.c
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ]
  # perf script | head -30
  swapper     0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
	 8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
	 896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
	 8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
	 855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
	 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
	 8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux)
	 856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux)
	 2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux)
	 2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux)
	 96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux)
	 969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux)
	 2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux)
	 95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux)
	1163ffa start_kernel ([kernel.vmlinux].init.text)
	11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text)
	1163623 x86_64_start_kernel ([kernel.vmlinux].init.text)

  qemu-system-x86  9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
	 8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
	 896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
	 8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
	 855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
	 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
	 856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux)
	 8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux)
	   430a br_handle_frame_finish ([bridge])
	   48bc br_handle_frame ([bridge])
	 855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
	 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
  #

    Use 'perf probe' various options to list functions, see what variables can
    be collected at any given point, experiment first collecting without a filter,
    then filter, use it together with 'perf trace', 'perf top', with or without
    callchains, if it explodes, please tell us!

  - Introduce a new callchain mode: "folded", that will list per line
    representations of all callchains for a give histogram entry, facilitating
    'perf report' output processing by other tools, such as Brendan Gregg's
    flamegraph tools (Namhyung Kim)

  E.g:

 # perf report | grep -v ^# | head
    18.37%     0.00%  swapper  [kernel.kallsyms]   [k] cpu_startup_entry
                    |
                    ---cpu_startup_entry
                       |
                       |--12.07%--start_secondary
                       |
                        --6.30%--rest_init
                                  start_kernel
                                  x86_64_start_reservations
                                  x86_64_start_kernel
  #

 Becomes, in "folded" mode:

 # perf report -g folded | grep -v ^# | head -5
     18.37%     0.00%  swapper [kernel.kallsyms]   [k] cpu_startup_entry
   12.07% cpu_startup_entry;start_secondary
    6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
     16.90%     0.00%  swapper [kernel.kallsyms]   [k] call_cpuidle
   11.23% call_cpuidle;cpu_startup_entry;start_secondary
    5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
     16.90%     0.00%  swapper [kernel.kallsyms]   [k] cpuidle_enter
   11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
    5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
     15.12%     0.00%  swapper [kernel.kallsyms]   [k] cpuidle_enter_state
  #

   The user can also select one of "count", "period" or "percent" as the first column.

Infrastructure changes:

  - Fix multiple leaks found with Valgrind and a refcount
    debugger (Masami Hiramatsu)

  - Add further 'perf test' entries for BPF and LLVM (Wang Nan)

  - Improve 'perf test' to suport subtests, so that the series of tests
    performed in the LLVM and BPF main tests appear in the default 'perf test'
    output (Wang Nan)

  - Move memdup() from tools/perf to tools/lib/string.c (Arnaldo Carvalho de Melo)

  - Adopt strtobool() from the kernel into tools/lib/ (Wang Nan)

  - Fix selftests_install tools/ Makefile rule (Kevin Hilman)

Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
parents 90eec103 2c6caff2
Loading
Loading
Loading
Loading
+1 −1
Original line number Diff line number Diff line
@@ -96,7 +96,7 @@ cgroup_install firewire_install hv_install lguest_install perf_install usb_insta
	$(call descend,$(@:_install=),install)

selftests_install:
	$(call descend,testing/$(@:_clean=),install)
	$(call descend,testing/$(@:_install=),install)

turbostat_install x86_energy_perf_policy_install:
	$(call descend,power/x86/$(@:_install=),install)
+11 −0
Original line number Diff line number Diff line
#ifndef _TOOLS_LINUX_STRING_H_
#define _TOOLS_LINUX_STRING_H_


#include <linux/types.h>	/* for size_t */

void *memdup(const void *src, size_t len);

int strtobool(const char *s, bool *res);

#endif /* _LINUX_STRING_H_ */
+137 −9
Original line number Diff line number Diff line
@@ -152,7 +152,11 @@ struct bpf_program {
	} *reloc_desc;
	int nr_reloc;

	int fd;
	struct {
		int nr;
		int *fds;
	} instances;
	bpf_program_prep_t preprocessor;

	struct bpf_object *obj;
	void *priv;
@@ -206,10 +210,25 @@ struct bpf_object {

static void bpf_program__unload(struct bpf_program *prog)
{
	int i;

	if (!prog)
		return;

	zclose(prog->fd);
	/*
	 * If the object is opened but the program was never loaded,
	 * it is possible that prog->instances.nr == -1.
	 */
	if (prog->instances.nr > 0) {
		for (i = 0; i < prog->instances.nr; i++)
			zclose(prog->instances.fds[i]);
	} else if (prog->instances.nr != -1) {
		pr_warning("Internal error: instances.nr is %d\n",
			   prog->instances.nr);
	}

	prog->instances.nr = -1;
	zfree(&prog->instances.fds);
}

static void bpf_program__exit(struct bpf_program *prog)
@@ -260,7 +279,8 @@ bpf_program__init(void *data, size_t size, char *name, int idx,
	memcpy(prog->insns, data,
	       prog->insns_cnt * sizeof(struct bpf_insn));
	prog->idx = idx;
	prog->fd = -1;
	prog->instances.fds = NULL;
	prog->instances.nr = -1;

	return 0;
errout:
@@ -860,13 +880,73 @@ static int
bpf_program__load(struct bpf_program *prog,
		  char *license, u32 kern_version)
{
	int err, fd;
	int err = 0, fd, i;

	if (prog->instances.nr < 0 || !prog->instances.fds) {
		if (prog->preprocessor) {
			pr_warning("Internal error: can't load program '%s'\n",
				   prog->section_name);
			return -LIBBPF_ERRNO__INTERNAL;
		}

		prog->instances.fds = malloc(sizeof(int));
		if (!prog->instances.fds) {
			pr_warning("Not enough memory for BPF fds\n");
			return -ENOMEM;
		}
		prog->instances.nr = 1;
		prog->instances.fds[0] = -1;
	}

	if (!prog->preprocessor) {
		if (prog->instances.nr != 1) {
			pr_warning("Program '%s' is inconsistent: nr(%d) != 1\n",
				   prog->section_name, prog->instances.nr);
		}
		err = load_program(prog->insns, prog->insns_cnt,
				   license, kern_version, &fd);
		if (!err)
		prog->fd = fd;
			prog->instances.fds[0] = fd;
		goto out;
	}

	for (i = 0; i < prog->instances.nr; i++) {
		struct bpf_prog_prep_result result;
		bpf_program_prep_t preprocessor = prog->preprocessor;

		bzero(&result, sizeof(result));
		err = preprocessor(prog, i, prog->insns,
				   prog->insns_cnt, &result);
		if (err) {
			pr_warning("Preprocessing the %dth instance of program '%s' failed\n",
				   i, prog->section_name);
			goto out;
		}

		if (!result.new_insn_ptr || !result.new_insn_cnt) {
			pr_debug("Skip loading the %dth instance of program '%s'\n",
				 i, prog->section_name);
			prog->instances.fds[i] = -1;
			if (result.pfd)
				*result.pfd = -1;
			continue;
		}

		err = load_program(result.new_insn_ptr,
				   result.new_insn_cnt,
				   license, kern_version, &fd);

		if (err) {
			pr_warning("Loading the %dth instance of program '%s' failed\n",
					i, prog->section_name);
			goto out;
		}

		if (result.pfd)
			*result.pfd = fd;
		prog->instances.fds[i] = fd;
	}
out:
	if (err)
		pr_warning("failed to load program '%s'\n",
			   prog->section_name);
@@ -1121,5 +1201,53 @@ const char *bpf_program__title(struct bpf_program *prog, bool needs_copy)

int bpf_program__fd(struct bpf_program *prog)
{
	return prog->fd;
	return bpf_program__nth_fd(prog, 0);
}

int bpf_program__set_prep(struct bpf_program *prog, int nr_instances,
			  bpf_program_prep_t prep)
{
	int *instances_fds;

	if (nr_instances <= 0 || !prep)
		return -EINVAL;

	if (prog->instances.nr > 0 || prog->instances.fds) {
		pr_warning("Can't set pre-processor after loading\n");
		return -EINVAL;
	}

	instances_fds = malloc(sizeof(int) * nr_instances);
	if (!instances_fds) {
		pr_warning("alloc memory failed for fds\n");
		return -ENOMEM;
	}

	/* fill all fd with -1 */
	memset(instances_fds, -1, sizeof(int) * nr_instances);

	prog->instances.nr = nr_instances;
	prog->instances.fds = instances_fds;
	prog->preprocessor = prep;
	return 0;
}

int bpf_program__nth_fd(struct bpf_program *prog, int n)
{
	int fd;

	if (n >= prog->instances.nr || n < 0) {
		pr_warning("Can't get the %dth fd from program %s: only %d instances\n",
			   n, prog->section_name, prog->instances.nr);
		return -EINVAL;
	}

	fd = prog->instances.fds[n];
	if (fd < 0) {
		pr_warning("%dth instance of program '%s' is invalid\n",
			   n, prog->section_name);
		return -ENOENT;
	}

	return fd;
}
+64 −0
Original line number Diff line number Diff line
@@ -88,6 +88,70 @@ const char *bpf_program__title(struct bpf_program *prog, bool needs_copy);

int bpf_program__fd(struct bpf_program *prog);

struct bpf_insn;

/*
 * Libbpf allows callers to adjust BPF programs before being loaded
 * into kernel. One program in an object file can be transform into
 * multiple variants to be attached to different code.
 *
 * bpf_program_prep_t, bpf_program__set_prep and bpf_program__nth_fd
 * are APIs for this propose.
 *
 * - bpf_program_prep_t:
 *   It defines 'preprocessor', which is a caller defined function
 *   passed to libbpf through bpf_program__set_prep(), and will be
 *   called before program is loaded. The processor should adjust
 *   the program one time for each instances according to the number
 *   passed to it.
 *
 * - bpf_program__set_prep:
 *   Attachs a preprocessor to a BPF program. The number of instances
 *   whould be created is also passed through this function.
 *
 * - bpf_program__nth_fd:
 *   After the program is loaded, get resuling fds from bpf program for
 *   each instances.
 *
 * If bpf_program__set_prep() is not used, the program whould be loaded
 * without adjustment during bpf_object__load(). The program has only
 * one instance. In this case bpf_program__fd(prog) is equal to
 * bpf_program__nth_fd(prog, 0).
 */

struct bpf_prog_prep_result {
	/*
	 * If not NULL, load new instruction array.
	 * If set to NULL, don't load this instance.
	 */
	struct bpf_insn *new_insn_ptr;
	int new_insn_cnt;

	/* If not NULL, result fd is set to it */
	int *pfd;
};

/*
 * Parameters of bpf_program_prep_t:
 *  - prog:	The bpf_program being loaded.
 *  - n:	Index of instance being generated.
 *  - insns:	BPF instructions array.
 *  - insns_cnt:Number of instructions in insns.
 *  - res:	Output parameter, result of transformation.
 *
 * Return value:
 *  - Zero: pre-processing success.
 *  - Non-zero: pre-processing, stop loading.
 */
typedef int (*bpf_program_prep_t)(struct bpf_program *prog, int n,
				  struct bpf_insn *insns, int insns_cnt,
				  struct bpf_prog_prep_result *res);

int bpf_program__set_prep(struct bpf_program *prog, int nr_instance,
			  bpf_program_prep_t prep);

int bpf_program__nth_fd(struct bpf_program *prog, int n);

/*
 * We don't need __attribute__((packed)) now since it is
 * unnecessary for 'bpf_map_def' because they are all aligned.

tools/lib/string.c

0 → 100644
+62 −0
Original line number Diff line number Diff line
/*
 *  linux/tools/lib/string.c
 *
 *  Copied from linux/lib/string.c, where it is:
 *
 *  Copyright (C) 1991, 1992  Linus Torvalds
 *
 *  More specifically, the first copied function was strtobool, which
 *  was introduced by:
 *
 *  d0f1fed29e6e ("Add a strtobool function matching semantics of existing in kernel equivalents")
 *  Author: Jonathan Cameron <jic23@cam.ac.uk>
 */

#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <linux/string.h>

/**
 * memdup - duplicate region of memory
 *
 * @src: memory region to duplicate
 * @len: memory region length
 */
void *memdup(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);

	return p;
}

/**
 * strtobool - convert common user inputs into boolean values
 * @s: input string
 * @res: result
 *
 * This routine returns 0 iff the first character is one of 'Yy1Nn0'.
 * Otherwise it will return -EINVAL.  Value pointed to by res is
 * updated upon finding a match.
 */
int strtobool(const char *s, bool *res)
{
	switch (s[0]) {
	case 'y':
	case 'Y':
	case '1':
		*res = true;
		break;
	case 'n':
	case 'N':
	case '0':
		*res = false;
		break;
	default:
		return -EINVAL;
	}
	return 0;
}
Loading