
Commit c488a473 authored by Paul Mundt

Merge branch 'common/mmcif' into rmobile-latest

parents 6d2ae89c bba95878
+74 −0
@@ -385,6 +385,10 @@ mapped_file - # of bytes of mapped file (includes tmpfs/shmem)
pgpgin		- # of pages paged in (equivalent to # of charging events).
pgpgout		- # of pages paged out (equivalent to # of uncharging events).
swap		- # of bytes of swap usage
dirty		- # of bytes that are waiting to get written back to the disk.
writeback	- # of bytes that are actively being written back to the disk.
nfs_unstable	- # of bytes sent to the NFS server, but not yet committed to
		the actual storage.
inactive_anon	- # of bytes of anonymous memory and swap cache memory on
		LRU list.
active_anon	- # of bytes of anonymous and swap cache memory on active
@@ -406,6 +410,9 @@ total_mapped_file - sum of all children's "cache"
total_pgpgin		- sum of all children's "pgpgin"
total_pgpgout		- sum of all children's "pgpgout"
total_swap		- sum of all children's "swap"
total_dirty		- sum of all children's "dirty"
total_writeback		- sum of all children's "writeback"
total_nfs_unstable	- sum of all children's "nfs_unstable"
total_inactive_anon	- sum of all children's "inactive_anon"
total_active_anon	- sum of all children's "active_anon"
total_inactive_file	- sum of all children's "inactive_file"
@@ -453,6 +460,73 @@ memory under it will be reclaimed.
You can reset failcnt by writing 0 to failcnt file.
# echo 0 > .../memory.failcnt

5.5 dirty memory

Control the maximum amount of dirty pages a cgroup can have at any given time.

Limiting dirty memory is like fixing the max amount of dirty (hard to reclaim)
page cache used by a cgroup.  So, in case of multiple cgroup writers, they will
not be able to consume more than their designated share of dirty pages and will
be forced to perform write-out if they cross that limit.

The interface is equivalent to the procfs interface: /proc/sys/vm/dirty_*.  It
is possible to configure a limit to trigger either a direct writeback or a
background writeback performed by per-bdi flusher threads.  The root cgroup
memory.dirty_* control files are read-only and match the contents of
the /proc/sys/vm/dirty_* files.

Per-cgroup dirty limits can be set using the following files in the cgroupfs:

- memory.dirty_ratio: the amount of dirty memory (expressed as a percentage of
  cgroup memory) at which a process generating dirty pages will itself start
  writing out dirty data.

- memory.dirty_limit_in_bytes: the amount of dirty memory (expressed in bytes)
  in the cgroup at which a process generating dirty pages will itself start
  writing out dirty data.  A suffix (k, K, m, M, g, or G) can be used to
  indicate that the value is in kilo-, mega-, or gigabytes.

  Note: memory.dirty_limit_in_bytes is the counterpart of memory.dirty_ratio.
  Only one of them may be specified at a time.  When one is written it is
  immediately taken into account to evaluate the dirty memory limits and the
  other appears as 0 when read.

- memory.dirty_background_ratio: the amount of dirty memory of the cgroup
  (expressed as a percentage of cgroup memory) at which background writeback
  kernel threads will start writing out dirty data.

- memory.dirty_background_limit_in_bytes: the amount of dirty memory (expressed
  in bytes) in the cgroup at which background writeback kernel threads will
  start writing out dirty data.  A suffix (k, K, m, M, g, or G) can be used to
  indicate that the value is in kilo-, mega-, or gigabytes.

  Note: memory.dirty_background_limit_in_bytes is the counterpart of
  memory.dirty_background_ratio.  Only one of them may be specified at a time.
  When one is written it is immediately taken into account to evaluate the dirty
  memory limits and the other appears as 0 when read.
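
The ratio and byte forms described above can be modelled as in the following
minimal Python sketch.  It is not kernel code: the names parse_bytes and
DirtyLimit are hypothetical, the suffixes are assumed to be binary multiples
(1K = 1024 bytes), and the caller decides what counts as "cgroup memory" for
the percentage.

```python
# Illustrative model only; names are hypothetical and not kernel interfaces.
SUFFIXES = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}


def parse_bytes(text):
    """Parse a byte count with an optional k/K/m/M/g/G suffix.

    Assumption: suffixes are binary multiples (1K = 1024 bytes).
    """
    text = text.strip()
    if text and text[-1].lower() in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1].lower()]
    return int(text)


class DirtyLimit:
    """Models one ratio/limit_in_bytes pair; writing one zeroes the other."""

    def __init__(self):
        self.ratio = 0
        self.limit_in_bytes = 0

    def write_ratio(self, percent):
        if not 0 <= percent <= 100:
            raise ValueError("ratio must be a percentage")
        self.ratio, self.limit_in_bytes = percent, 0

    def write_limit_in_bytes(self, text):
        self.ratio, self.limit_in_bytes = 0, parse_bytes(text)

    def threshold(self, cgroup_memory_bytes):
        """Dirty bytes at which writeback would start under this model."""
        if self.limit_in_bytes:
            return self.limit_in_bytes
        return cgroup_memory_bytes * self.ratio // 100
```

The same class can stand in for either the foreground pair (dirty_ratio and
dirty_limit_in_bytes) or the background pair, since both follow the same
"only one at a time" rule.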

A cgroup may contain more dirty memory than its dirty limit.  This is possible
because of the principle that the first cgroup to touch a page is charged for
it.  Subsequent page counting events (dirty, writeback, nfs_unstable) are also
counted to the originally charged cgroup.

Example: If a page is allocated by a cgroup A task, then the page is charged to
cgroup A.  If the page is later dirtied by a task in cgroup B, then the cgroup A
dirty count will be incremented.  If cgroup A is over its dirty limit but cgroup
B is not, then dirtying a cgroup A page from a cgroup B task may push cgroup A
further over its dirty limit without throttling the dirtying cgroup B task.
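
The first-touch rule in this example can be illustrated with a small Python
model.  This is only a sketch: the Cgroup and Page classes and the dirty_page
helper are hypothetical, limits are counted in pages for simplicity, and
throttling is assumed to depend only on the dirtying task's own cgroup.

```python
# Illustrative model only; class and function names are hypothetical.
class Cgroup:
    def __init__(self, name, dirty_limit_pages):
        self.name = name
        self.dirty_limit_pages = dirty_limit_pages
        self.dirty_pages = 0

    def over_dirty_limit(self):
        return self.dirty_pages > self.dirty_limit_pages


class Page:
    def __init__(self, owner):
        # The first cgroup to touch the page is charged for it.
        self.owner = owner
        self.dirty = False


def dirty_page(page, dirtier):
    """Dirty `page` from a task in `dirtier`; return True if it is throttled.

    The dirty count goes to the page's owner, while throttling in this
    model looks only at the dirtying task's own cgroup.
    """
    if not page.dirty:
        page.dirty = True
        page.owner.dirty_pages += 1
    return dirtier.over_dirty_limit()
```

Creating three pages owned by a cgroup A with a limit of one page and
dirtying them all from cgroup B leaves A over its limit while B is never
throttled.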

When use_hierarchy=0, each cgroup has dirty memory usage and limits.
System-wide dirty limits are also consulted.  Dirty memory consumption is
checked against both system-wide and per-cgroup dirty limits.

The current implementation does not enforce per-cgroup dirty limits when
use_hierarchy=1.  System-wide dirty limits are used for processes in such
cgroups.  Attempts to read memory.dirty_* files return the system-wide
values.  Writes to the memory.dirty_* files return an error.  An enhanced
implementation is needed to check the chain of parents to ensure that no
dirty limit is exceeded.
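
The two use_hierarchy cases above amount to the following check, shown as a
hedged Python sketch.  The function name and parameters are hypothetical,
and treating "over the limit" as a strict greater-than comparison is an
assumption.

```python
def dirty_limit_exceeded(system_dirty, system_limit,
                         cgroup_dirty, cgroup_limit, use_hierarchy):
    """Return True if a writer should be treated as over a dirty limit."""
    # System-wide limits are consulted in both cases.
    if system_dirty > system_limit:
        return True
    if use_hierarchy:
        # Per-cgroup limits are not enforced; only system-wide limits apply.
        return False
    return cgroup_dirty > cgroup_limit
```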

6. Hierarchy support

The memory controller supports a deep hierarchy and hierarchical accounting.
+6 −1
@@ -8,7 +8,7 @@ Parameters: <cipher> <key> <iv_offset> <device path> <offset>

<cipher>
    Encryption cipher and an optional IV generation mode.
    (In format cipher-chainmode-ivopts:ivmode).
    (In format cipher[:keycount]-chainmode-ivopts:ivmode).
    Examples:
       des
       aes-cbc-essiv:sha256
@@ -20,6 +20,11 @@ Parameters: <cipher> <key> <iv_offset> <device path> <offset>
    Key used for encryption. It is encoded as a hexadecimal number.
    You can only use key sizes that are valid for the selected cipher.

<keycount>
    Multi-key compatibility mode. You can define <keycount> keys and
    then sectors are encrypted according to their offsets (sector 0 uses key0;
    sector 1 uses key1 etc.).  <keycount> must be a power of two.
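
    The key selection described above can be sketched in Python as follows.
    This is an illustration, not the dm-crypt implementation: both function
    names are hypothetical, and the wrap-around after <keycount> sectors is
    an assumption inferred from the power-of-two requirement.

```python
def parse_cipher_field(spec):
    """Split the leading `cipher[:keycount]` field off a cipher spec.

    Returns (cipher_name, keycount); keycount defaults to 1 when omitted.
    """
    first = spec.split("-", 1)[0]
    name, sep, count = first.partition(":")
    keycount = int(count) if sep else 1
    if keycount < 1 or keycount & (keycount - 1):
        raise ValueError("keycount must be a power of two")
    return name, keycount


def key_index(sector, keycount):
    """Index of the key used for `sector`.

    Assumption: the index wraps around after keycount sectors, which a
    power-of-two keycount reduces to a bit mask.
    """
    return sector & (keycount - 1)
```

    With a keycount of 4, sectors 0, 1, 2 and 3 map to key0 through key3,
    and under the wrap-around assumption sector 4 maps back to key0.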

<iv_offset>
    The IV offset is a sector count that is added to the sector number
    before creating the IV.
+70 −0
Device-mapper RAID (dm-raid) is a bridge from DM to MD.  It
provides a way to use device-mapper interfaces to access the MD RAID
drivers.

As with all device-mapper targets, the nominal public interfaces are the
constructor (CTR) tables and the status outputs (both STATUSTYPE_INFO
and STATUSTYPE_TABLE).  The CTR table looks like the following:

1: <s> <l> raid \
2:      <raid_type> <#raid_params> <raid_params> \
3:      <#raid_devs> <meta_dev1> <dev1> .. <meta_devN> <devN>

Line 1 contains the standard first three arguments to any device-mapper
target - the start, length, and target type fields.  The target type in
this case is "raid".

Line 2 contains the arguments that define the particular raid
type/personality/level, the required arguments for that raid type, and
any optional arguments.  Possible raid types include: raid4, raid5_la,
raid5_ls, raid5_rs, raid6_zr, raid6_nr, and raid6_nc.  (raid1 is
planned for the future.)  The list of required and optional parameters
is the same for all the current raid types.  The required parameters are
positional, while the optional parameters are given as key/value pairs.
The possible parameters are as follows:
 <chunk_size>           Chunk size in sectors.
 [[no]sync]             Force/Prevent RAID initialization
 [rebuild <idx>]        Rebuild the drive indicated by the index
 [daemon_sleep <ms>]    Time between bitmap daemon work to clear bits
 [min_recovery_rate <kB/sec/disk>]      Throttle RAID initialization
 [max_recovery_rate <kB/sec/disk>]      Throttle RAID initialization
 [max_write_behind <sectors>]           See '--write-behind=' (man mdadm)
 [stripe_cache <sectors>]               Stripe cache size for higher RAIDs

Line 3 contains the list of devices that compose the array in
metadata/data device pairs.  If no separate metadata device is used, a '-'
is given for the metadata device position.  If a drive has failed or is
missing at creation time, a '-' can be given for both the metadata and
data drives for a given position.

NB. Currently all metadata devices must be specified as '-'.

Examples:
# RAID4 - 4 data drives, 1 parity
# No metadata devices specified to hold superblock/bitmap info
# Chunk size of 1MiB
# (Lines separated for easy reading)
0 1960893648 raid \
        raid4 1 2048 \
        5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81

# RAID4 - 4 data drives, 1 parity (no metadata devices)
# Chunk size of 1MiB, force RAID initialization,
#       min recovery rate at 20 kiB/sec/disk
0 1960893648 raid \
        raid4 4 2048 min_recovery_rate 20 sync \
        5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81
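
The two example tables above can be generated with a short Python helper,
which also makes the <#raid_params> and <#raid_devs> counts explicit.  This
is a sketch only: raid_table and chunk_bytes are hypothetical helpers, not
part of any device-mapper tool, and the output is a single line without the
backslash continuations.

```python
SECTOR_SIZE = 512  # bytes per device-mapper sector


def raid_table(start, length, raid_type, chunk_sectors, optional, devices):
    """Build a single-line dm-raid constructor table.

    optional: flat list of optional arguments,
              e.g. ["min_recovery_rate", 20, "sync"]
    devices:  list of (meta_dev, data_dev) pairs; "-" marks an absent device
    """
    # <#raid_params> counts the chunk size plus every optional word.
    params = [str(chunk_sectors)] + [str(arg) for arg in optional]
    fields = [str(start), str(length), "raid", raid_type, str(len(params))]
    fields += params
    fields.append(str(len(devices)))
    for meta_dev, data_dev in devices:
        fields += [meta_dev, data_dev]
    return " ".join(fields)


def chunk_bytes(chunk_sectors):
    """Chunk size in bytes for a chunk size given in sectors."""
    return chunk_sectors * SECTOR_SIZE
```

A chunk size of 2048 sectors corresponds to 2048 * 512 = 1048576 bytes,
which is the 1MiB quoted in the examples.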

Performing a 'dmsetup table' should display the CTR table used to
construct the mapping (with possible reordering of optional
parameters).

Performing a 'dmsetup status' will yield information on the state and
health of the array.  The output is as follows:
1: <s> <l> raid \
2:      <raid_type> <#devices> <1 health char for each dev> <resync_ratio>

Line 1 is standard DM output.  Line 2 is best shown by example:
        0 1960893648 raid raid4 5 AAAAA 2/490221568
Here we can see the RAID type is raid4, there are 5 devices - all of
which are 'A'live, and the array is 2/490221568 complete with recovery.
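
A status line in this format can be split into its fields with a few lines
of Python.  The sketch below is illustrative: parse_raid_status and the
dictionary keys are hypothetical names, and any health character other than
'A' is simply counted as not alive because only 'A' is described here.

```python
def parse_raid_status(line):
    """Parse one dm-raid 'dmsetup status' line into a dictionary."""
    fields = line.split()
    if len(fields) != 7 or fields[2] != "raid":
        raise ValueError("not a dm-raid status line")
    start, length = int(fields[0]), int(fields[1])
    raid_type, num_devices, health = fields[3], int(fields[4]), fields[5]
    if len(health) != num_devices:
        raise ValueError("expected one health character per device")
    done, total = (int(part) for part in fields[6].split("/"))
    return {
        "start": start,
        "length": length,
        "raid_type": raid_type,
        "devices": num_devices,
        "alive": health.count("A"),
        "resync_done": done,
        "resync_total": total,
    }
```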
+7 −0
@@ -375,6 +375,7 @@ Anonymous: 0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:              374 kB

The first of these lines shows the same information as is displayed for the
mapping in /proc/PID/maps.  The remaining lines show the size of the mapping
@@ -670,6 +671,8 @@ varies by architecture and compile options. The following is from a

> cat /proc/meminfo

The "Locked" field indicates whether the mapping is locked in memory.


MemTotal:     16344972 kB
MemFree:      13634064 kB
@@ -1320,6 +1323,10 @@ scaled linearly with /proc/<pid>/oom_score_adj.
Writing to /proc/<pid>/oom_score_adj or /proc/<pid>/oom_adj will change the
other with its scaled value.

The value of /proc/<pid>/oom_score_adj may be reduced no lower than the last
value set by a CAP_SYS_RESOURCE process. To reduce the value any lower
requires CAP_SYS_RESOURCE.
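
The rule above can be modelled as a floor that only a privileged writer may
move.  The Python sketch below is illustrative and not the kernel
implementation: the class OomScoreAdj and its attributes are hypothetical,
and starting the floor at the initial value is an assumption.

```python
class OomScoreAdj:
    """Models /proc/<pid>/oom_score_adj for one process."""

    def __init__(self, value=0):
        self.value = value
        # Last value set by a CAP_SYS_RESOURCE writer (assumed to start
        # at the initial value).
        self.floor = value

    def write(self, new_value, has_cap_sys_resource):
        if has_cap_sys_resource:
            # A privileged write always succeeds and resets the floor.
            self.floor = new_value
        elif new_value < self.floor:
            raise PermissionError(
                "lowering below the floor requires CAP_SYS_RESOURCE")
        self.value = new_value
```

An unprivileged writer may raise the value freely and lower it again, but
never below the value most recently set with CAP_SYS_RESOURCE.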

NOTICE: /proc/<pid>/oom_adj is deprecated and will be removed, please see
Documentation/feature-removal-schedule.txt.

+1 −1
@@ -135,7 +135,7 @@ setting up a platform_device using the GPIO, is mark its direction:
	int gpio_direction_input(unsigned gpio);
	int gpio_direction_output(unsigned gpio, int value);

The return value is zero for success, else a negative errno.  It must
The return value is zero for success, else a negative errno.  It should
be checked, since the get/set calls don't have error returns and since
misconfiguration is possible.  You should normally issue these calls from
a task context.  However, for spinlock-safe GPIOs it's OK to use them