
Commit d6a3b247 authored by Mauro Carvalho Chehab, committed by Jonathan Corbet

docs: scheduler: convert docs to ReST and rename to *.rst



In order to prepare to add them to the Kernel API book,
convert the files to ReST format.

The conversion is actually:
  - add blank lines and indentation in order to identify paragraphs;
  - fix table markup;
  - add some list markup;
  - mark literal blocks;
  - adjust title markup.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
parent d2238840
+1 −1
@@ -11,4 +11,4 @@ Description:
		example would be, if User A has shares = 1024 and user
		B has shares = 2048, User B will get twice the CPU
		bandwidth user A will. For more details refer
-		Documentation/scheduler/sched-design-CFS.txt
+		Documentation/scheduler/sched-design-CFS.rst
+20 −18
================================================
Completions - "wait for completion" barrier APIs
================================================

@@ -46,7 +47,7 @@ it has to wait for it.

To use completions you need to #include <linux/completion.h> and
create a static or dynamic variable of type 'struct completion',
-which has only two fields:
+which has only two fields::

	struct completion {
		unsigned int done;
@@ -57,7 +58,7 @@ This provides the ->wait waitqueue to place tasks on for waiting (if any), and
the ->done completion flag for indicating whether it's completed or not.

Completions should be named to refer to the event that is being synchronized on.
-A good example is:
+A good example is::

	wait_for_completion(&early_console_added);

@@ -81,7 +82,7 @@ have taken place, even if these wait functions return prematurely due to a timeo
or a signal triggering.

Initializing of dynamically allocated completion objects is done via a call to
-init_completion():
+init_completion()::

	init_completion(&dynamic_object->done);

@@ -100,7 +101,8 @@ but be aware of other races.

For static declaration and initialization, macros are available.

-For static (or global) declarations in file scope you can use DECLARE_COMPLETION():
+For static (or global) declarations in file scope you can use
+DECLARE_COMPLETION()::

	static DECLARE_COMPLETION(setup_done);
	DECLARE_COMPLETION(setup_done);
@@ -111,7 +113,7 @@ initialized to 'not done' and doesn't require an init_completion() call.
When a completion is declared as a local variable within a function,
then the initialization should always use DECLARE_COMPLETION_ONSTACK()
explicitly, not just to make lockdep happy, but also to make it clear
-that limited scope had been considered and is intentional:
+that limited scope had been considered and is intentional::

	DECLARE_COMPLETION_ONSTACK(setup_done)

@@ -140,11 +142,11 @@ Waiting for completions:
------------------------

For a thread to wait for some concurrent activity to finish, it
-calls wait_for_completion() on the initialized completion structure:
+calls wait_for_completion() on the initialized completion structure::

	void wait_for_completion(struct completion *done)

-A typical usage scenario is:
+A typical usage scenario is::

	CPU#1					CPU#2

@@ -192,17 +194,17 @@ A common problem that occurs is to have unclean assignment of return types,
so take care to assign return-values to variables of the proper type.

Checking for the specific meaning of return values also has been found
-to be quite inaccurate, e.g. constructs like:
+to be quite inaccurate, e.g. constructs like::

	if (!wait_for_completion_interruptible_timeout(...))

... would execute the same code path for successful completion and for the
-interrupted case - which is probably not what you want.
+interrupted case - which is probably not what you want::

	int wait_for_completion_interruptible(struct completion *done)

This function marks the task TASK_INTERRUPTIBLE while it is waiting.
-If a signal was received while waiting it will return -ERESTARTSYS; 0 otherwise.
+If a signal was received while waiting it will return -ERESTARTSYS; 0 otherwise::

	unsigned long wait_for_completion_timeout(struct completion *done, unsigned long timeout)

@@ -214,7 +216,7 @@ Timeouts are preferably calculated with msecs_to_jiffies() or usecs_to_jiffies()
to make the code largely HZ-invariant.

If the returned timeout value is deliberately ignored a comment should probably explain
-why (e.g. see drivers/mfd/wm8350-core.c wm8350_read_auxadc()).
+why (e.g. see drivers/mfd/wm8350-core.c wm8350_read_auxadc())::

	long wait_for_completion_interruptible_timeout(struct completion *done, unsigned long timeout)

@@ -225,14 +227,14 @@ jiffies if completion occurred.

Further variants include _killable which uses TASK_KILLABLE as the
designated tasks state and will return -ERESTARTSYS if it is interrupted,
-or 0 if completion was achieved.  There is a _timeout variant as well:
+or 0 if completion was achieved.  There is a _timeout variant as well::

	long wait_for_completion_killable(struct completion *done)
	long wait_for_completion_killable_timeout(struct completion *done, unsigned long timeout)

The _io variants wait_for_completion_io() behave the same as the non-_io
variants, except for accounting waiting time as 'waiting on IO', which has
-an impact on how the task is accounted in scheduling/IO stats:
+an impact on how the task is accounted in scheduling/IO stats::

	void wait_for_completion_io(struct completion *done)
	unsigned long wait_for_completion_io_timeout(struct completion *done, unsigned long timeout)
@@ -243,11 +245,11 @@ Signaling completions:

A thread that wants to signal that the conditions for continuation have been
achieved calls complete() to signal exactly one of the waiters that it can
-continue:
+continue::

	void complete(struct completion *done)

-... or calls complete_all() to signal all current and future waiters:
+... or calls complete_all() to signal all current and future waiters::

	void complete_all(struct completion *done)

@@ -276,14 +278,14 @@ try_wait_for_completion()/completion_done():

The try_wait_for_completion() function will not put the thread on the wait
queue but rather returns false if it would need to enqueue (block) the thread,
-else it consumes one posted completion and returns true.
+else it consumes one posted completion and returns true::

	bool try_wait_for_completion(struct completion *done)

Finally, to check the state of a completion without changing it in any way,
call completion_done(), which returns false if there are no posted
completions that were not yet consumed by waiters (implying that there are
-waiters) and true otherwise;
+waiters) and true otherwise::

	bool completion_done(struct completion *done)

+29 −0
:orphan:

===============
Linux Scheduler
===============

.. toctree::
    :maxdepth: 1


    completion
    sched-arch
    sched-bwc
    sched-deadline
    sched-design-CFS
    sched-domains
    sched-energy
    sched-nice-design
    sched-rt-group
    sched-stats

    text_files

.. only::  subproject and html

   Indices
   =======

   * :ref:`genindex`
+10 −8
=================================================================
CPU Scheduler implementation hints for architecture specific code
=================================================================

	Nick Piggin, 2005

@@ -35,9 +37,10 @@ Your cpu_idle routines need to obey the following rules:
4. The only time interrupts need to be disabled when checking
   need_resched is if we are about to sleep the processor until
   the next interrupt (this doesn't provide any protection of
-   need_resched, it prevents losing an interrupt).
-	4a. Common problem with this type of sleep appears to be:
+   need_resched, it prevents losing an interrupt):
+
+	4a. Common problem with this type of sleep appears to be::
+
	        local_irq_disable();
	        if (!need_resched()) {
	                local_irq_enable();
@@ -51,7 +54,7 @@ Your cpu_idle routines need to obey the following rules:
   although it may be reasonable to do some background work or enter
   a low CPU priority.

-	5a. If TIF_POLLING_NRFLAG is set, and we do decide to enter
+      5a. If TIF_POLLING_NRFLAG is set, and we do decide to enter
	an interrupt sleep, it needs to be cleared then a memory
	barrier issued (followed by a test of need_resched with
	interrupts disabled, as explained in 3).
@@ -71,4 +74,3 @@ sh64 - Is sleeping racy vs interrupts? (See #4a)

sparc - IRQs on at this point(?), change local_irq_save to _disable.
      - TODO: needs secondary CPUs to disable preempt (See #1)
+18 −12
=====================
CFS Bandwidth Control
=====================

[ This document only discusses CPU bandwidth control for SCHED_NORMAL.
-  The SCHED_RT case is covered in Documentation/scheduler/sched-rt-group.txt ]
+  The SCHED_RT case is covered in Documentation/scheduler/sched-rt-group.rst ]

CFS bandwidth control is a CONFIG_FAIR_GROUP_SCHED extension which allows the
specification of the maximum CPU bandwidth available to a group or hierarchy.
@@ -27,7 +28,8 @@ cpu.cfs_quota_us: the total available run-time within a period (in microseconds)
cpu.cfs_period_us: the length of a period (in microseconds)
cpu.stat: exports throttling statistics [explained further below]

-The default values are:
+The default values are::

	cpu.cfs_period_us=100ms
	cpu.cfs_quota=-1

@@ -55,7 +57,8 @@ For efficiency run-time is transferred between the global pool and CPU local
on large systems.  The amount transferred each time such an update is required
is described as the "slice".

-This is tunable via procfs:
+This is tunable via procfs::

	/proc/sys/kernel/sched_cfs_bandwidth_slice_us (default=5ms)

Larger slice values will reduce transfer overheads, while smaller values allow
@@ -66,6 +69,7 @@ Statistics
A group's bandwidth statistics are exported via 3 fields in cpu.stat.

cpu.stat:

- nr_periods: Number of enforcement intervals that have elapsed.
- nr_throttled: Number of times the group has been throttled/limited.
- throttled_time: The total time duration (in nanoseconds) for which entities
@@ -78,12 +82,15 @@ Hierarchical considerations
The interface enforces that an individual entity's bandwidth is always
attainable, that is: max(c_i) <= C. However, over-subscription in the
aggregate case is explicitly allowed to enable work-conserving semantics
-within a hierarchy.
+within a hierarchy:

  e.g. \Sum (c_i) may exceed C

[ Where C is the parent's bandwidth, and c_i its children ]


There are two ways in which a group may become throttled:

	a. it fully consumes its own quota within a period
	b. a parent's quota is fully consumed within its period

@@ -92,7 +99,7 @@ be allowed to until the parent's runtime is refreshed.

Examples
--------
-1. Limit a group to 1 CPU worth of runtime.
+1. Limit a group to 1 CPU worth of runtime::

	If period is 250ms and quota is also 250ms, the group will get
	1 CPU worth of runtime every 250ms.
@@ -100,10 +107,10 @@ Examples
	# echo 250000 > cpu.cfs_quota_us /* quota = 250ms */
	# echo 250000 > cpu.cfs_period_us /* period = 250ms */

-2. Limit a group to 2 CPUs worth of runtime on a multi-CPU machine.
+2. Limit a group to 2 CPUs worth of runtime on a multi-CPU machine

   With 500ms period and 1000ms quota, the group can get 2 CPUs worth of
-	runtime every 500ms.
+   runtime every 500ms::

	# echo 1000000 > cpu.cfs_quota_us /* quota = 1000ms */
	# echo 500000 > cpu.cfs_period_us /* period = 500ms */
@@ -112,11 +119,10 @@ Examples

3. Limit a group to 20% of 1 CPU.

-	With 50ms period, 10ms quota will be equivalent to 20% of 1 CPU.
+   With 50ms period, 10ms quota will be equivalent to 20% of 1 CPU::

	# echo 10000 > cpu.cfs_quota_us /* quota = 10ms */
	# echo 50000 > cpu.cfs_period_us /* period = 50ms */

   By using a small period here we are ensuring a consistent latency
   response at the expense of burst capacity.