Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 16642a2e authored by Linus Torvalds's avatar Linus Torvalds
Browse files
Pull power management updates from Rafael J Wysocki:

 - Improved system suspend/resume and runtime PM handling for the SH
   TMU, CMT and MTU2 clock event devices (also used by ARM/shmobile).

 - Generic PM domains framework extensions related to cpuidle support
   and domain objects lookup using names.

 - ARM/shmobile power management updates including improved support for
   the SH7372's A4S power domain containing the CPU core.

 - cpufreq changes related to AMD CPUs support from Matthew Garrett,
   Andre Przywara and Borislav Petkov.

 - cpu0 cpufreq driver from Shawn Guo.

 - cpufreq governor fixes related to the relaxing of limit from Michal
   Pecio.

 - OMAP cpufreq updates from Axel Lin and Richard Zhao.

 - cpuidle ladder governor fixes related to the disabling of states from
   Carsten Emde and me.

 - Runtime PM core updates related to the interactions with the system
   suspend core from Alan Stern and Kevin Hilman.

 - Wakeup sources modification allowing more helper functions to be
   called from interrupt context from John Stultz and additional
   diagnostic code from Todd Poynor.

 - System suspend error code path fix from Feng Hong.

Fixed up conflicts in cpufreq/powernow-k8 that stemmed from the
workqueue fixes conflicting fairly badly with the removal of support for
hardware P-state chips.  The changes were independent but somewhat
intertwined.

* tag 'pm-for-3.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (76 commits)
  Revert "PM QoS: Use spinlock in the per-device PM QoS constraints code"
  PM / Runtime: let rpm_resume() succeed if RPM_ACTIVE, even when disabled, v2
  cpuidle: rename function name "__cpuidle_register_driver", v2
  cpufreq: OMAP: Check IS_ERR() instead of NULL for omap_device_get_by_hwmod_name
  cpuidle: remove some empty lines
  PM: Prevent runtime suspend during system resume
  PM QoS: Use spinlock in the per-device PM QoS constraints code
  PM / Sleep: use resume event when call dpm_resume_early
  cpuidle / ACPI : move cpuidle_device field out of the acpi_processor_power structure
  ACPI / processor: remove pointless variable initialization
  ACPI / processor: remove unused function parameter
  cpufreq: OMAP: remove loops_per_jiffy recalculate for smp
  sections: fix section conflicts in drivers/cpufreq
  cpufreq: conservative: update frequency when limits are relaxed
  cpufreq / ondemand: update frequency when limits are relaxed
  properly __init-annotate pm_sysrq_init()
  cpufreq: Add a generic cpufreq-cpu0 driver
  PM / OPP: Initialize OPP table from device tree
  ARM: add cpufreq transiton notifier to adjust loops_per_jiffy for smp
  cpufreq: Remove support for hardware P-state chips from powernow-k8
  ...
parents 51562cba b9142167
Loading
Loading
Loading
Loading
+11 −0
Original line number Diff line number Diff line
@@ -176,3 +176,14 @@ Description: Disable L3 cache indices
		All AMD processors with L3 caches provide this functionality.
		For details, see BKDGs at
		http://developer.amd.com/documentation/guides/Pages/default.aspx


What:		/sys/devices/system/cpu/cpufreq/boost
Date:		August 2012
Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description:	Processor frequency boosting control

		This switch controls the boost setting for the whole system.
		Boosting allows the CPU and the firmware to run at a frequency
		beyound it's nominal limit.
		More details can be found in Documentation/cpu-freq/boost.txt
+93 −0
Original line number Diff line number Diff line
Processor boosting control

	- information for users -

Quick guide for the impatient:
--------------------
/sys/devices/system/cpu/cpufreq/boost
controls the boost setting for the whole system. You can read and write
that file with either "0" (boosting disabled) or "1" (boosting allowed).
Reading or writing 1 does not mean that the system is boosting at this
very moment, but only that the CPU _may_ raise the frequency at it's
discretion.
--------------------

Introduction
-------------
Some CPUs support a functionality to raise the operating frequency of
some cores in a multi-core package if certain conditions apply, mostly
if the whole chip is not fully utilized and below it's intended thermal
budget. This is done without operating system control by a combination
of hardware and firmware.
On Intel CPUs this is called "Turbo Boost", AMD calls it "Turbo-Core",
in technical documentation "Core performance boost". In Linux we use
the term "boost" for convenience.

Rationale for disable switch
----------------------------

Though the idea is to just give better performance without any user
intervention, sometimes the need arises to disable this functionality.
Most systems offer a switch in the (BIOS) firmware to disable the
functionality at all, but a more fine-grained and dynamic control would
be desirable:
1. While running benchmarks, reproducible results are important. Since
   the boosting functionality depends on the load of the whole package,
   single thread performance can vary. By explicitly disabling the boost
   functionality at least for the benchmark's run-time the system will run
   at a fixed frequency and results are reproducible again.
2. To examine the impact of the boosting functionality it is helpful
   to do tests with and without boosting.
3. Boosting means overclocking the processor, though under controlled
   conditions. By raising the frequency and the voltage the processor
   will consume more power than without the boosting, which may be
   undesirable for instance for mobile users. Disabling boosting may
   save power here, though this depends on the workload.


User controlled switch
----------------------

To allow the user to toggle the boosting functionality, the acpi-cpufreq
driver exports a sysfs knob to disable it. There is a file:
/sys/devices/system/cpu/cpufreq/boost
which can either read "0" (boosting disabled) or "1" (boosting enabled).
Reading the file is always supported, even if the processor does not
support boosting. In this case the file will be read-only and always
reads as "0". Explicitly changing the permissions and writing to that
file anyway will return EINVAL.

On supported CPUs one can write either a "0" or a "1" into this file.
This will either disable the boost functionality on all cores in the
whole system (0) or will allow the hardware to boost at will (1).

Writing a "1" does not explicitly boost the system, but just allows the
CPU (and the firmware) to boost at their discretion. Some implementations
take external factors like the chip's temperature into account, so
boosting once does not necessarily mean that it will occur every time
even using the exact same software setup.


AMD legacy cpb switch
---------------------
The AMD powernow-k8 driver used to support a very similar switch to
disable or enable the "Core Performance Boost" feature of some AMD CPUs.
This switch was instantiated in each CPU's cpufreq directory
(/sys/devices/system/cpu[0-9]*/cpufreq) and was called "cpb".
Though the per CPU existence hints at a more fine grained control, the
actual implementation only supported a system-global switch semantics,
which was simply reflected into each CPU's file. Writing a 0 or 1 into it
would pull the other CPUs to the same state.
For compatibility reasons this file and its behavior is still supported
on AMD CPUs, though it is now protected by a config switch
(X86_ACPI_CPUFREQ_CPB). On Intel CPUs this file will never be created,
even with the config option set.
This functionality is considered legacy and will be removed in some future
kernel version.

More fine grained boosting control
----------------------------------

Technically it is possible to switch the boosting functionality at least
on a per package basis, for some CPUs even per core. Currently the driver
does not support it, but this may be implemented in the future.
+9 −1
Original line number Diff line number Diff line
@@ -76,9 +76,17 @@ total 0


* desc : Small description about the idle state (string)
* disable : Option to disable this idle state (bool)
* disable : Option to disable this idle state (bool) -> see note below
* latency : Latency to exit out of this idle state (in microseconds)
* name : Name of the idle state (string)
* power : Power consumed while in this idle state (in milliwatts)
* time : Total time spent in this idle state (in microseconds)
* usage : Number of times this state was entered (count)

Note:
The behavior and the effect of the disable variable depends on the
implementation of a particular governor. In the ladder governor, for
example, it is not coherent, i.e. if one is disabling a light state,
then all deeper states are disabled as well, but the disable variable
does not reflect it. Likewise, if one enables a deep state but a lighter
state still is disabled, then this has no effect.
+55 −0
Original line number Diff line number Diff line
Generic CPU0 cpufreq driver

It is a generic cpufreq driver for CPU0 frequency management.  It
supports both uniprocessor (UP) and symmetric multiprocessor (SMP)
systems which share clock and voltage across all CPUs.

Both required and optional properties listed below must be defined
under node /cpus/cpu@0.

Required properties:
- operating-points: Refer to Documentation/devicetree/bindings/power/opp.txt
  for details

Optional properties:
- clock-latency: Specify the possible maximum transition latency for clock,
  in unit of nanoseconds.
- voltage-tolerance: Specify the CPU voltage tolerance in percentage.

Examples:

cpus {
	#address-cells = <1>;
	#size-cells = <0>;

	cpu@0 {
		compatible = "arm,cortex-a9";
		reg = <0>;
		next-level-cache = <&L2>;
		operating-points = <
			/* kHz    uV */
			792000  1100000
			396000  950000
			198000  850000
		>;
		transition-latency = <61036>; /* two CLK32 periods */
	};

	cpu@1 {
		compatible = "arm,cortex-a9";
		reg = <1>;
		next-level-cache = <&L2>;
	};

	cpu@2 {
		compatible = "arm,cortex-a9";
		reg = <2>;
		next-level-cache = <&L2>;
	};

	cpu@3 {
		compatible = "arm,cortex-a9";
		reg = <3>;
		next-level-cache = <&L2>;
	};
};
+25 −0
Original line number Diff line number Diff line
* Generic OPP Interface

SoCs have a standard set of tuples consisting of frequency and
voltage pairs that the device will support per voltage domain. These
are called Operating Performance Points or OPPs.

Properties:
- operating-points: An array of 2-tuples items, and each item consists
  of frequency and voltage like <freq-kHz vol-uV>.
	freq: clock frequency in kHz
	vol: voltage in microvolt

Examples:

cpu@0 {
	compatible = "arm,cortex-a9";
	reg = <0>;
	next-level-cache = <&L2>;
	operating-points = <
		/* kHz    uV */
		792000  1100000
		396000  950000
		198000  850000
	>;
};
Loading