
Commit cde0ba66 authored by Ivaylo Georgiev

Merge android-4.19.11 (a87fb6b9) into msm-4.19



* refs/heads/tmp-a87fb6b9:
  Linux 4.19.11
  x86/build: Fix compiler support check for CONFIG_RETPOLINE
  dm zoned: Fix target BIO completion handling
  drm/amdgpu: update SMC firmware image for polaris10 variants
  drm/amdgpu: update smu firmware images for VI variants (v2)
  drm/amdgpu: add some additional vega10 pci ids
  drm/amdkfd: add new vega10 pci ids
  drm/amdgpu/powerplay: Apply avfs cks-off voltages on VI
  drm/i915/execlists: Apply a full mb before execution for Braswell
  drm/i915/gvt: Fix tiled memory decoding bug on BDW
  Revert "drm/rockchip: Allow driver to be shutdown on reboot/kexec"
  drm/nouveau/kms/nv50-: also flush fb writes when rewinding push buffer
  drm/nouveau/kms: Fix memory leak in nv50_mstm_del()
  powerpc: Look for "stdout-path" when setting up legacy consoles
  powerpc/msi: Fix NULL pointer access in teardown code
  media: vb2: don't call __vb2_queue_cancel if vb2_start_streaming failed
  tracing: Fix memory leak of instance function hash filters
  tracing: Fix memory leak in set_trigger_filter()
  tracing: Fix memory leak in create_filter()
  dm: call blk_queue_split() to impose device limits on bios
  dm cache metadata: verify cache has blocks in blocks_are_clean_separate_dirty()
  dm thin: send event about thin-pool state change _after_ making it
  ARM: dts: bcm2837: Fix polarity of wifi reset GPIOs
  ARM: mmp/mmp2: fix cpu_is_mmp2() on mmp2-dt
  fuse: continue to send FUSE_RELEASEDIR when FUSE_OPEN returns ENOSYS
  mmc: sdhci: fix the timeout check window for clock and reset
  mmc: sdhci-omap: Fix DCRC error handling during tuning
  mmc: core: use mrq->sbc when sending CMD23 for RPMB
  MMC: OMAP: fix broken MMC on OMAP15XX/OMAP5910/OMAP310
  ovl: fix missing override creds in link of a metacopy upper
  ovl: fix decode of dir file handle with multi lower layers
  block/bio: Do not zero user pages
  arm64: dma-mapping: Fix FORCE_CONTIGUOUS buffer clearing
  userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered
  fs/iomap.c: get/put the page in iomap_page_create/release()
  scripts/spdxcheck.py: always open files in binary mode
  aio: fix spectre gadget in lookup_ioctx
  pinctrl: sunxi: a83t: Fix IRQ offset typo for PH11
  drm/msm: fix address space warning
  ARM: dts: qcom-apq8064-arrow-sd-600eval fix graph_endpoint warning
  i2c: aspeed: fix build warning
  slimbus: ngd: mark PM functions as __maybe_unused
  staging: olpc_dcon: add a missing dependency
  scsi: raid_attrs: fix unused variable warning
  sched/pelt: Fix warning and clean up IRQ PELT config
  FROMGIT: dm verity: log the hash algorithm implementation
  FROMGIT: dm crypt: log the encryption algorithm implementation
  ANDROID: sched: Clean-up SchedTune documentation
  ANDROID: sched/events: Fix out of bound memory access

Conflicts:
	drivers/gpu/drm/msm/disp/dpu1/dpu_dbg.c
	drivers/slimbus/qcom-ngd-ctrl.c
	init/Kconfig

Change-Id: I52e910e1330e8cac1103c32e0cd0fc43ddeac744
Signed-off-by: Ivaylo Georgiev <irgeorgiev@codeaurora.org>
parents 80ac3620 a87fb6b9
+36 −61
@@ -30,11 +30,9 @@ Table of Contents
1. Motivation
=============

Sched-DVFS [3] was a new event-driven cpufreq governor which allows the
Schedutil [3] is a utilization-driven cpufreq governor which allows the
scheduler to select the optimal DVFS operating point (OPP) for running a task
allocated to a CPU. Later, the cpufreq maintainers introduced a similar
governor, schedutil. The introduction of schedutil also enables running
workloads at the most energy efficient OPPs.
allocated to a CPU.

However, sometimes it may be desired to intentionally boost the performance of
a workload even if that could imply a reasonable increase in energy
@@ -44,16 +42,16 @@ by it's CPU bandwidth demand.

This last requirement is especially important if we consider that one of the
main goals of the utilization-driven governor component is to replace all
currently available CPUFreq policies. Since sched-DVFS and schedutil are event
based, as opposed to the sampling driven governors we currently have, they are
already more responsive at selecting the optimal OPP to run tasks allocated to
a CPU. However, just tracking the actual task load demand may not be enough
from a performance standpoint.  For example, it is not possible to get
behaviors similar to those provided by the "performance" and "interactive"
CPUFreq governors.
currently available CPUFreq policies. Since schedutil is event-based, as
opposed to the sampling driven governors we currently have, they are already
more responsive at selecting the optimal OPP to run tasks allocated to a CPU.
However, just tracking the actual task utilization may not be enough from a
performance standpoint.  For example, it is not possible to get behaviors
similar to those provided by the "performance" and "interactive" CPUFreq
governors.

This document describes an implementation of a tunable, stacked on top of the
utilization-driven governors which extends their functionality to support task
utilization-driven governor which extends its functionality to support task
performance boosting.

By "performance boosting" we mean the reduction of the time required to
@@ -63,17 +61,6 @@ example, if we consider a simple periodic task which executes the same workload
for 5[s] every 20[s] while running at a certain OPP, a boosted execution of
that task must complete each of its activations in less than 5[s].

A previous attempt [5] to introduce such a boosting feature has not been
successful mainly because of the complexity of the proposed solution. Previous
versions of the approach described in this document exposed a single simple
interface to user-space.  This single tunable knob allowed the tuning of
system wide scheduler behaviours ranging from energy efficiency at one end
through to incremental performance boosting at the other end.  This first
tunable affects all tasks. However, that is not useful for Android products
so in this version only a more advanced extension of the concept is provided
which uses CGroups to boost the performance of only selected tasks while using
the energy efficient default for all others.

The rest of this document introduces in more details the proposed solution
which has been named SchedTune.

@@ -97,25 +84,22 @@ More details are given in section 5.
2.1 Boosting
============

The boost value is expressed as an integer in the range [-100..0..100].
The boost value is expressed as an integer in the range [0..100].

A value of 0 (default) configures the CFS scheduler for maximum energy
efficiency. This means that sched-DVFS runs the tasks at the minimum OPP
efficiency. This means that schedutil runs the tasks at the minimum OPP
required to satisfy their workload demand.

A value of 100 configures scheduler for maximum performance, which translates
to the selection of the maximum OPP on that CPU.

A value of -100 configures scheduler for minimum performance, which translates
to the selection of the minimum OPP on that CPU.

The range between -100, 0 and 100 can be set to satisfy other scenarios suitably.
For example to satisfy interactive response or depending on other system events
The range between 0 and 100 can be set to satisfy other scenarios suitably. For
example to satisfy interactive response or depending on other system events
(battery level etc).

The overall design of the SchedTune module is built on top of "Per-Entity Load
Tracking" (PELT) signals and sched-DVFS by introducing a bias on the Operating
Performance Point (OPP) selection.
Tracking" (PELT) signals and schedutil by introducing a bias on the OPP
selection.

Each time a task is allocated on a CPU, cpufreq is given the opportunity to tune
the operating frequency of that CPU to better match the workload demand. The
@@ -141,9 +125,6 @@ can be placed according to the energy-aware wakeup strategy.
A value of 1 signals to the CFS scheduler that tasks in this group should be
placed to minimise wakeup latency.

The value is combined with the boost value - task placement will not be
boost aware however CPU OPP selection is still boost aware.

Android platforms typically use this flag for application tasks which the
user is currently interacting with.

@@ -169,21 +150,16 @@ to a signal to get its inflated value:
  margin         := boosting_strategy(sched_cfs_boost, signal)
  boosted_signal := signal + margin

Different boosting strategies were identified and analyzed before selecting the
one found to be most effective.

Signal Proportional Compensation (SPC)
--------------------------------------

In this boosting strategy the sched_cfs_boost value is used to compute a
margin which is proportional to the complement of the original signal.
The boosting strategy currently implemented in SchedTune is called 'Signal
Proportional Compensation' (SPC). With SPC, the sched_cfs_boost value is used to
compute a margin which is proportional to the complement of the original signal.
When a signal has a maximum possible value, its complement is defined as
the delta from the actual value and its possible maximum.

Since the tunable implementation uses signals which have SCHED_LOAD_SCALE as
Since the tunable implementation uses signals which have SCHED_CAPACITY_SCALE as
the maximum possible value, the margin becomes:

	margin := sched_cfs_boost * (SCHED_LOAD_SCALE - signal)
	margin := sched_cfs_boost * (SCHED_CAPACITY_SCALE - signal)

Using this boosting strategy:
- a 100% sched_cfs_boost means that the signal is scaled to the maximum value
@@ -209,7 +185,7 @@ following figure where:


   ^
   |  SCHED_LOAD_SCALE
   |  SCHED_CAPACITY_SCALE
   +-----------------------------------------------------------------+
   |pppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppppp
   |
@@ -250,7 +226,7 @@ one, depending on the value of sched_cfs_boost. This is a clean an non invasive
modification of the existing existing code paths.

The signal representing a CPU's utilization is boosted according to the
previously described SPC boosting strategy. To sched-DVFS, this allows a CPU
previously described SPC boosting strategy. To schedutil, this allows a CPU
(ie CFS run-queue) to appear more used then it actually is.

Thus, with the sched_cfs_boost enabled we have the following main functions to
@@ -262,10 +238,9 @@ get the current utilization of a CPU:
The new boosted_cpu_util() is similar to the first but returns a boosted
utilization signal which is a function of the sched_cfs_boost value.

This function is used in the CFS scheduler code paths where sched-DVFS needs to
decide the OPP to run a CPU at.
For example, this allows selecting the highest OPP for a CPU which has
the boost value set to 100%.
This function is used in the CFS scheduler code paths where schedutil needs to
decide the OPP to run a CPU at. For example, this allows selecting the highest
OPP for a CPU which has the boost value set to 100%.
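
As an illustration of the SPC strategy and of how a boosted utilization could
be derived from it, here is a minimal C sketch. It assumes sched_cfs_boost is a
percentage in [0..100], so the product is divided by 100. The helper names
spc_margin() and boosted_util() are hypothetical and are not the kernel's own
functions, and SCHED_CAPACITY_SCALE is redefined locally (with the kernel's
value of 1024) so that the sketch stands alone.

```c
/* Redefined locally for this sketch; the kernel defines it as 1 << 10. */
#define SCHED_CAPACITY_SCALE 1024L

/*
 * Hypothetical helper: Signal Proportional Compensation margin.
 * boost_pct is assumed to be a percentage in [0..100]; out-of-range
 * inputs are clamped so the result never exceeds the signal's headroom.
 */
static long spc_margin(int boost_pct, long signal)
{
	if (boost_pct < 0)
		boost_pct = 0;
	if (boost_pct > 100)
		boost_pct = 100;
	if (signal > SCHED_CAPACITY_SCALE)
		signal = SCHED_CAPACITY_SCALE;

	/* Margin is proportional to the complement of the signal. */
	return (SCHED_CAPACITY_SCALE - signal) * boost_pct / 100;
}

/* Hypothetical helper: utilization as it would appear after boosting. */
static long boosted_util(int boost_pct, long util)
{
	return util + spc_margin(boost_pct, util);
}
```

With these assumptions, a boost of 0 leaves the utilization unchanged, while a
boost of 100 inflates any utilization to SCHED_CAPACITY_SCALE, which is what
lets the governor pick the highest OPP. The integer division is a
simplification chosen for readability.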


5. Per task group boosting
@@ -305,16 +280,16 @@ main characteristics:

     This number is defined at compile time and by default configured to 16.
     This is a design decision motivated by two main reasons:
     a) In a real system we do not expect utilization scenarios with more then few
	boost groups. For example, a reasonable collection of groups could be
        just "background", "interactive" and "performance".
     a) In a real system we do not expect utilization scenarios with more than
        a few boost groups. For example, a reasonable collection of groups could
        be just "background", "interactive" and "performance".
     b) It simplifies the implementation considerably, especially for the code
	which has to compute the per CPU boosting once there are multiple
        RUNNABLE tasks with different boost values.

Such a simple design should allow servicing the main utilization scenarios identified
so far. It provides a simple interface which can be used to manage the
power-performance of all tasks or only selected tasks.
Such a simple design should allow servicing the main utilization scenarios
identified so far. It provides a simple interface which can be used to manage
the power-performance of all tasks or only selected tasks.
Moreover, this interface can be easily integrated by user-space run-times (e.g.
Android, ChromeOS) to implement a QoS solution for task boosting based on tasks
classification, which has been a long standing requirement.
@@ -397,9 +372,9 @@ How are multiple groups of tasks with different boost values managed?
---------------------------------------------------------------------

The current SchedTune implementation keeps track of the boosted RUNNABLE tasks
on a CPU. The CPU utilization seen by the scheduler-driven cpufreq governors
(and used to select an appropriate OPP) is boosted with a value which is the
maximum of the boost values of the currently RUNNABLE tasks in its RQ.
on a CPU. The CPU utilization seen by schedutil (and used to select an
appropriate OPP) is boosted with a value which is the maximum of the boost
values of the currently RUNNABLE tasks in its RQ.

This allows cpufreq to boost a CPU only while there are boosted tasks ready
to run and switch back to the energy efficient mode as soon as the last boosted
@@ -410,4 +385,4 @@ task is dequeued.
=============
[1] http://lwn.net/Articles/552889
[2] http://lkml.org/lkml/2012/5/18/91
[3] http://lkml.org/lkml/2015/6/26/620
[3] https://lkml.org/lkml/2016/3/29/1041
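
The SchedTune documentation diff above states that the utilization seen by
schedutil is boosted with the maximum boost value among the RUNNABLE tasks on a
CPU, and that the number of boost groups is fixed at compile time (16 by
default). A minimal C sketch of that aggregation follows. The structure layout
and the names boost_group, cpu_boost_groups and cpu_max_boost() are assumptions
made for illustration, not a copy of the kernel's data structures.

```c
/* Assumed compile-time limit, matching the documented default of 16. */
#define BOOSTGROUPS_COUNT 16

/* Hypothetical per-CPU bookkeeping for one boost group. */
struct boost_group {
	int boost;          /* boost value of the group, in [0..100] */
	unsigned int tasks; /* RUNNABLE tasks of this group on the CPU */
};

struct cpu_boost_groups {
	struct boost_group group[BOOSTGROUPS_COUNT];
};

/*
 * Return the boost to apply to a CPU: the maximum boost among groups
 * that currently have at least one RUNNABLE task on it. Groups without
 * RUNNABLE tasks are ignored, so the CPU drops back to a boost of 0
 * as soon as the last boosted task is dequeued.
 */
static int cpu_max_boost(const struct cpu_boost_groups *bg)
{
	int max = 0;

	for (int i = 0; i < BOOSTGROUPS_COUNT; i++) {
		if (bg->group[i].tasks == 0)
			continue;
		if (bg->group[i].boost > max)
			max = bg->group[i].boost;
	}
	return max;
}
```

Keeping the group count small and fixed makes this scan a bounded loop, which
is in line with the simplicity argument given in the documentation.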
+1 −1
# SPDX-License-Identifier: GPL-2.0
VERSION = 4
PATCHLEVEL = 19
SUBLEVEL = 10
SUBLEVEL = 11
EXTRAVERSION =
NAME = "People's Front"

+1 −1
@@ -31,7 +31,7 @@

	wifi_pwrseq: wifi-pwrseq {
		compatible = "mmc-pwrseq-simple";
		reset-gpios = <&expgpio 1 GPIO_ACTIVE_HIGH>;
		reset-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
	};
};

+1 −1
@@ -26,7 +26,7 @@

	wifi_pwrseq: wifi-pwrseq {
		compatible = "mmc-pwrseq-simple";
		reset-gpios = <&expgpio 1 GPIO_ACTIVE_HIGH>;
		reset-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
	};
};

+5 −0
@@ -387,6 +387,11 @@
			hpd-gpio = <&tlmm_pinmux 72 GPIO_ACTIVE_HIGH>;

			ports {
				port@0 {
					endpoint {
						remote-endpoint = <&mdp_dtv_out>;
					};
				};
				port@1 {
					endpoint {
						remote-endpoint = <&hdmi_con>;