Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit ccc9d651 authored by Claudio Scordino's avatar Claudio Scordino Committed by Ingo Molnar
Browse files

sched/deadline: Add documentation about GRUB reclaiming



This patch adds the documentation about the GRUB reclaiming algorithm,
adding a few details discussed in list.

Signed-off-by: default avatarClaudio Scordino <claudio@evidence.eu.com>
Signed-off-by: default avatarLuca Abeni <luca.abeni@santannapisa.it>
Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
Link: http://lkml.kernel.org/r/1495138417-6203-11-git-send-email-luca.abeni@santannapisa.it


Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
parent daec5798
Loading
Loading
Loading
Loading
+168 −0
Original line number Diff line number Diff line
@@ -7,6 +7,8 @@ CONTENTS
 0. WARNING
 1. Overview
 2. Scheduling algorithm
   2.1 Main algorithm
   2.2 Bandwidth reclaiming
 3. Scheduling Real-Time Tasks
   3.1 Definitions
   3.2 Schedulability Analysis for Uniprocessor Systems
@@ -44,6 +46,9 @@ CONTENTS
2. Scheduling algorithm
==================

2.1 Main algorithm
------------------

 SCHED_DEADLINE uses three parameters, named "runtime", "period", and
 "deadline", to schedule tasks. A SCHED_DEADLINE task should receive
 "runtime" microseconds of execution time every "period" microseconds, and
@@ -113,6 +118,160 @@ CONTENTS
         remaining runtime = remaining runtime + runtime


2.2 Bandwidth reclaiming
------------------------

 Bandwidth reclaiming for deadline tasks is based on the GRUB (Greedy
 Reclamation of Unused Bandwidth) algorithm [15, 16, 17] and it is enabled
 when flag SCHED_FLAG_RECLAIM is set.

 The following diagram illustrates the state names for tasks handled by GRUB:

                             ------------
                 (d)        |   Active   |
              ------------->|            |
              |             | Contending |
              |              ------------
              |                A      |
          ----------           |      |
         |          |          |      |
         | Inactive |          |(b)   | (a)
         |          |          |      |
          ----------           |      |
              A                |      V
              |              ------------
              |             |   Active   |
              --------------|     Non    |
                 (c)        | Contending |
                             ------------

 A task can be in one of the following states:

  - ActiveContending: if it is ready for execution (or executing);

  - ActiveNonContending: if it just blocked and has not yet surpassed the 0-lag
    time;

  - Inactive: if it is blocked and has surpassed the 0-lag time.

 State transitions:

  (a) When a task blocks, it does not become immediately inactive since its
      bandwidth cannot be immediately reclaimed without breaking the
      real-time guarantees. It therefore enters a transitional state called
      ActiveNonContending. The scheduler arms the "inactive timer" to fire at
      the 0-lag time, when the task's bandwidth can be reclaimed without
      breaking the real-time guarantees.

      The 0-lag time for a task entering the ActiveNonContending state is
      computed as

                        (runtime * dl_period)
             deadline - ---------------------
                             dl_runtime

      where runtime is the remaining runtime, while dl_runtime and dl_period
      are the reservation parameters.

  (b) If the task wakes up before the inactive timer fires, the task re-enters
      the ActiveContending state and the "inactive timer" is canceled.
      In addition, if the task wakes up on a different runqueue, then
      the task's utilization must be removed from the previous runqueue's active
      utilization and must be added to the new runqueue's active utilization.
      In order to avoid races between a task waking up on a runqueue while the
       "inactive timer" is running on a different CPU, the "dl_non_contending"
      flag is used to indicate that a task is not on a runqueue but is active
      (so, the flag is set when the task blocks and is cleared when the
      "inactive timer" fires or when the task  wakes up).

  (c) When the "inactive timer" fires, the task enters the Inactive state and
      its utilization is removed from the runqueue's active utilization.

  (d) When an inactive task wakes up, it enters the ActiveContending state and
      its utilization is added to the active utilization of the runqueue where
      it has been enqueued.

 For each runqueue, the algorithm GRUB keeps track of two different bandwidths:

  - Active bandwidth (running_bw): this is the sum of the bandwidths of all
    tasks in active state (i.e., ActiveContending or ActiveNonContending);

  - Total bandwidth (this_bw): this is the sum of all tasks "belonging" to the
    runqueue, including the tasks in Inactive state.


 The algorithm reclaims the bandwidth of the tasks in Inactive state.
 It does so by decrementing the runtime of the executing task Ti at a pace equal
 to

           dq = -max{ Ui, (1 - Uinact) } dt

 where Uinact is the inactive utilization, computed as (this_bq - running_bw),
 and Ui is the bandwidth of task Ti.


 Let's now see a trivial example of two deadline tasks with runtime equal
 to 4 and period equal to 8 (i.e., bandwidth equal to 0.5):

     A            Task T1
     |
     |                               |
     |                               |
     |--------                       |----
     |       |                       V
     |---|---|---|---|---|---|---|---|--------->t
     0   1   2   3   4   5   6   7   8


     A            Task T2
     |
     |                               |
     |                               |
     |       ------------------------|
     |       |                       V
     |---|---|---|---|---|---|---|---|--------->t
     0   1   2   3   4   5   6   7   8


     A            running_bw
     |
   1 -----------------               ------
     |               |               |
  0.5-               -----------------
     |                               |
     |---|---|---|---|---|---|---|---|--------->t
     0   1   2   3   4   5   6   7   8


  - Time t = 0:

    Both tasks are ready for execution and therefore in ActiveContending state.
    Suppose Task T1 is the first task to start execution.
    Since there are no inactive tasks, its runtime is decreased as dq = -1 dt.

  - Time t = 2:

    Suppose that task T1 blocks
    Task T1 therefore enters the ActiveNonContending state. Since its remaining
    runtime is equal to 2, its 0-lag time is equal to t = 4.
    Task T2 start execution, with runtime still decreased as dq = -1 dt since
    there are no inactive tasks.

  - Time t = 4:

    This is the 0-lag time for Task T1. Since it didn't woken up in the
    meantime, it enters the Inactive state. Its bandwidth is removed from
    running_bw.
    Task T2 continues its execution. However, its runtime is now decreased as
    dq = - 0.5 dt because Uinact = 0.5.
    Task T2 therefore reclaims the bandwidth unused by Task T1.

  - Time t = 8:

    Task T1 wakes up. It enters the ActiveContending state again, and the
    running_bw is incremented.


3. Scheduling Real-Time Tasks
=============================

@@ -330,6 +489,15 @@ CONTENTS
  14 - J. Erickson, U. Devi and S. Baruah. Improved tardiness bounds for
       Global EDF. Proceedings of the 22nd Euromicro Conference on
       Real-Time Systems, 2010.
  15 - G. Lipari, S. Baruah, Greedy reclamation of unused bandwidth in
       constant-bandwidth servers, 12th IEEE Euromicro Conference on Real-Time
       Systems, 2000.
  16 - L. Abeni, J. Lelli, C. Scordino, L. Palopoli, Greedy CPU reclaiming for
       SCHED DEADLINE. In Proceedings of the Real-Time Linux Workshop (RTLWS),
       Dusseldorf, Germany, 2014.
  17 - L. Abeni, G. Lipari, A. Parri, Y. Sun, Multicore CPU reclaiming: parallel
       or sequential?. In Proceedings of the 31st Annual ACM Symposium on Applied
       Computing, 2016.


4. Bandwidth management