Documentation/perf/qcom_l2_counters.txt 0 → 100644 +63 −0

Qualcomm Technologies, Inc. L2 cache counters
=============================================

This driver supports the L2 cache cluster counters found in
Qualcomm Technologies, Inc. processors.

There are multiple physical L2 cache clusters, each with its own counters.
Each cluster has one or more CPUs associated with it.

There is one logical L2 PMU exposed, which aggregates the results from
the physical PMUs (counters).

The driver provides a description of its available events and configuration
options in sysfs, see /sys/devices/l2cache_counters.

The "format" directory describes the format of the events.
The format is of the form 0xXXX, where:

  1 bit (lsb) for the group (the group is either a txn or a tenure counter).
  4 bits for the serial number of the counter, ranging from 0 to 8.
  5 bits for the bit position of the counter enable bit in a register.

The driver provides a "cpumask" sysfs attribute which contains a mask
consisting of one CPU per cluster which will be used to handle all the PMU
events on that cluster.

Examples for use with perf:

  perf stat -e l2cache_counters/ddr_read/,l2cache_counters/ddr_write/ -a sleep 1

  perf stat -e l2cache_counters/cycles/ -C 2 sleep 1

Limitation: The driver does not support sampling, therefore "perf record" will
not work. Per-task perf sessions are not supported.

For transaction counters, no configuration needs to be set before monitoring.

For the tenure counter use case, the threshold values of the low and mid range
occurrence counters must be set in sysfs for the selected cluster (these
occurrence counters exist for each cluster):

  echo 1 > /sys/bus/event_source/devices/l2cache_counters/which_cluster_tenure
  echo X > /sys/bus/event_source/devices/l2cache_counters/low_tenure_threshold
  echo Y > /sys/bus/event_source/devices/l2cache_counters/mid_tenure_threshold

Here, X < Y.

e.g.:

  perf stat -e l2cache_counters/low_range_occur/ -e l2cache_counters/mid_range_occur/ -e l2cache_counters/high_range_occur/ -C 4 sleep 10

  Performance counter stats for 'CPU(s) 4':

       7   l2cache_counters/low_range_occur/
       5   l2cache_counters/mid_range_occur/
       7   l2cache_counters/high_range_occur/

  10.204140400 seconds time elapsed
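The tenure setup described in the documentation above could be wrapped in a small shell helper that rejects thresholds violating X < Y before anything is written. This is only a sketch under stated assumptions: the function name `set_tenure_thresholds` and the `L2C_SYSFS` override variable are hypothetical and not part of the patch, the thresholds are assumed to be decimal integers, and the value written to `which_cluster_tenure` is assumed to select the cluster, as the `echo 1` example suggests.

```shell
#!/bin/sh
# Sketch only: L2C_SYSFS and set_tenure_thresholds are hypothetical names,
# not part of the driver. The attribute names come from the documentation.
L2C_SYSFS="${L2C_SYSFS:-/sys/bus/event_source/devices/l2cache_counters}"

# set_tenure_thresholds CLUSTER LOW MID
# Writes the cluster selection and both thresholds, refusing LOW >= MID.
# Assumes LOW and MID are decimal integers; anything else is rejected too.
set_tenure_thresholds() {
    cluster=$1
    low=$2
    mid=$3

    # The documentation requires X < Y (low threshold below mid threshold).
    if ! [ "$low" -lt "$mid" ] 2>/dev/null; then
        echo "low threshold ($low) must be less than mid threshold ($mid)" >&2
        return 1
    fi

    # Stop at the first failed write (for example, when not running as root).
    echo "$cluster" > "$L2C_SYSFS/which_cluster_tenure" &&
    echo "$low" > "$L2C_SYSFS/low_tenure_threshold" &&
    echo "$mid" > "$L2C_SYSFS/mid_tenure_threshold"
}
```

A call such as `set_tenure_thresholds 1 100 1000` would then precede the `perf stat` invocation shown above; the values 100 and 1000 are placeholders, not recommended thresholds.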
drivers/perf/Kconfig +9 −0

@@ -77,6 +77,15 @@ config QCOM_L2_PMU
	  Adds the L2 cache PMU into the perf events subsystem for
	  monitoring L2 cache events.

config QCOM_L2_COUNTERS
	bool "Qualcomm Technologies L2-cache counters (PMU)"
	depends on ARCH_QCOM && ARM64
	help
	  Provides support for the L2 cache counters in Qualcomm
	  Technologies processors. Adds the L2 cache counters support
	  into the perf events subsystem for monitoring L2 cache events.

config QCOM_L3_PMU
	bool "Qualcomm Technologies L3-cache PMU"
	depends on ARCH_QCOM && ARM64 && ACPI
drivers/perf/Makefile +1 −0

@@ -6,6 +6,7 @@ obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
obj-$(CONFIG_HISI_PMU) += hisilicon/
obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o
obj-$(CONFIG_QCOM_L2_COUNTERS) += qcom_l2_counters.o
obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
obj-$(CONFIG_QCOM_LLCC_PMU) += qcom_llcc_pmu.o
obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o