Documentation/cgroups/memory.txt  (+74 −0)

@@ -385,6 +385,10 @@
mapped_file	- # of bytes of mapped file (includes tmpfs/shmem)
pgpgin		- # of pages paged in (equivalent to # of charging events).
pgpgout		- # of pages paged out (equivalent to # of uncharging events).
swap		- # of bytes of swap usage
dirty		- # of bytes that are waiting to get written back to the disk.
writeback	- # of bytes that are actively being written back to the disk.
nfs_unstable	- # of bytes sent to the NFS server, but not yet committed to
		  the actual storage.
inactive_anon	- # of bytes of anonymous memory and swap cache memory on LRU list.
active_anon	- # of bytes of anonymous and swap cache memory on active LRU list.

@@ -406,6 +410,9 @@
total_mapped_file	- sum of all children's "mapped_file"
total_pgpgin		- sum of all children's "pgpgin"
total_pgpgout		- sum of all children's "pgpgout"
total_swap		- sum of all children's "swap"
total_dirty		- sum of all children's "dirty"
total_writeback		- sum of all children's "writeback"
total_nfs_unstable	- sum of all children's "nfs_unstable"
total_inactive_anon	- sum of all children's "inactive_anon"
total_active_anon	- sum of all children's "active_anon"
total_inactive_file	- sum of all children's "inactive_file"

@@ -453,6 +460,73 @@
memory under it will be reclaimed.

You can reset failcnt by writing 0 to failcnt file.
# echo 0 > .../memory.failcnt

5.5 dirty memory

Control the maximum amount of dirty pages a cgroup can have at any given time.

Limiting dirty memory is like fixing the max amount of dirty (hard to reclaim)
page cache used by a cgroup.  So, in case of multiple cgroup writers, they
will not be able to consume more than their designated share of dirty pages
and will be forced to perform write-out if they cross that limit.

The interface is equivalent to the procfs interface: /proc/sys/vm/dirty_*.  It
is possible to configure a limit to trigger either direct writeback or
background writeback performed by per-bdi flusher threads.  The root cgroup
memory.dirty_* control files are read-only and match the contents of the
/proc/sys/vm/dirty_* files.

Per-cgroup dirty limits can be set using the following files in the cgroupfs:

- memory.dirty_ratio: the amount of dirty memory (expressed as a percentage of
  cgroup memory) at which a process generating dirty pages will itself start
  writing out dirty data.

- memory.dirty_limit_in_bytes: the amount of dirty memory (expressed in bytes)
  in the cgroup at which a process generating dirty pages will itself start
  writing out dirty data.  A suffix (k, K, m, M, g, or G) can be used to
  indicate that the value is in kilobytes, megabytes or gigabytes.

  Note: memory.dirty_limit_in_bytes is the counterpart of memory.dirty_ratio.
  Only one of them may be specified at a time.  When one is written it is
  immediately taken into account to evaluate the dirty memory limits and the
  other appears as 0 when read.

- memory.dirty_background_ratio: the amount of dirty memory of the cgroup
  (expressed as a percentage of cgroup memory) at which background writeback
  kernel threads will start writing out dirty data.

- memory.dirty_background_limit_in_bytes: the amount of dirty memory
  (expressed in bytes) in the cgroup at which background writeback kernel
  threads will start writing out dirty data.  A suffix (k, K, m, M, g, or G)
  can be used to indicate that the value is in kilobytes, megabytes or
  gigabytes.

  Note: memory.dirty_background_limit_in_bytes is the counterpart of
  memory.dirty_background_ratio.  Only one of them may be specified at a
  time.  When one is written it is immediately taken into account to evaluate
  the dirty memory limits and the other appears as 0 when read.
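For illustration only (not part of the patch text above), a minimal
configuration sketch.  It assumes the memory controller is mounted at
/cgroups/memory and that a child group named grp0 already exists; both the
path and the values are hypothetical:

	# echo 10 > /cgroups/memory/grp0/memory.dirty_background_ratio
	# echo 100M > /cgroups/memory/grp0/memory.dirty_limit_in_bytes
	# cat /cgroups/memory/grp0/memory.dirty_ratio
	0

The last read returns 0 because memory.dirty_limit_in_bytes was written and it
and memory.dirty_ratio are mutually exclusive, as described in the note above.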
A cgroup may contain more dirty memory than its dirty limit.  This is possible
because of the principle that the first cgroup to touch a page is charged for
it.  Subsequent page accounting events (dirty, writeback, nfs_unstable) are
also counted against the originally charged cgroup.

Example: if a page is allocated by a task in cgroup A, the page is charged to
cgroup A.  If the page is later dirtied by a task in cgroup B, then the
cgroup A dirty count is incremented.  If cgroup A is over its dirty limit but
cgroup B is not, then dirtying a cgroup A page from a cgroup B task may push
cgroup A over its dirty limit without throttling the dirtying cgroup B task.

When use_hierarchy=0, each cgroup has its own dirty memory usage and limits.
System-wide dirty limits are also consulted: dirty memory consumption is
checked against both the system-wide and the per-cgroup dirty limits.

The current implementation does not enforce per-cgroup dirty limits when
use_hierarchy=1.  System-wide dirty limits are used for processes in such
cgroups.  Attempts to read memory.dirty_* files return the system-wide
values, and writes to the memory.dirty_* files return an error.  An enhanced
implementation is needed to check the chain of parents to ensure that no
dirty limit is exceeded.

6. Hierarchy support

The memory controller supports a deep hierarchy and hierarchical accounting.
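Before moving on to the next file, a hypothetical session illustrating the
use_hierarchy=1 restriction described in section 5.5 above.  The mount point,
group name, and the system-wide dirty_ratio value of 20 are all assumptions,
and the exact error message depends on the shell:

	# echo 1 > /cgroups/memory/memory.use_hierarchy
	# cat /cgroups/memory/grp0/memory.dirty_ratio
	20		(the system-wide /proc/sys/vm/dirty_ratio value)
	# echo 10 > /cgroups/memory/grp0/memory.dirty_ratio
	(the write fails with an error)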
Documentation/device-mapper/dm-crypt.txt  (+6 −1)

@@ -8,7 +8,7 @@ Parameters: <cipher> <key> <iv_offset> <device path> <offset>

<cipher>
    Encryption cipher and an optional IV generation mode.
-   (In format cipher-chainmode-ivopts:ivmode).
+   (In format cipher[:keycount]-chainmode-ivopts:ivmode).
    Examples:
       des
       aes-cbc-essiv:sha256

@@ -20,6 +20,11 @@ Parameters: <cipher> <key> <iv_offset> <device path> <offset>

    Key used for encryption. It is encoded as a hexadecimal number.
    You can only use key sizes that are valid for the selected cipher.

<keycount>
    Multi-key compatibility mode. You can define <keycount> keys and
    then sectors are encrypted according to their offsets (sector 0 uses
    key0; sector 1 uses key1 etc.).  <keycount> must be a power of two.

<iv_offset>
    The IV offset is a sector count that is added to the sector number
    before creating the IV.
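To make the new syntax concrete, here is a sketch of a crypt target table
line using two keys.  It is not taken from the patch: the 2097152-sector
(1 GiB) /dev/sdb is hypothetical, and <key> stands for the hexadecimal key
material (with keycount > 1 it is assumed to carry the material for all keys
back to back):

	0 2097152 crypt aes:2-cbc-essiv:sha256 <key> 0 /dev/sdb 0

With a keycount of 2, sectors at even offsets would use key0 and sectors at
odd offsets key1, per the <keycount> description above.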
Documentation/device-mapper/dm-raid.txt  (new file, +70 −0)

Device-mapper RAID (dm-raid) is a bridge from DM to MD.  It provides a way to
use device-mapper interfaces to access the MD RAID drivers.

As with all device-mapper targets, the nominal public interfaces are the
constructor (CTR) tables and the status outputs (both STATUSTYPE_INFO and
STATUSTYPE_TABLE).  The CTR table looks like the following:

1: <s> <l> raid \
2:	<raid_type> <#raid_params> <raid_params> \
3:	<#raid_devs> <meta_dev1> <dev1> .. <meta_devN> <devN>

Line 1 contains the standard first three arguments to any device-mapper
target - the start, length, and target type fields.  The target type in
this case is "raid".

Line 2 contains the arguments that define the particular raid
type/personality/level, the required arguments for that raid type, and any
optional arguments.  Possible raid types include: raid4, raid5_la, raid5_ls,
raid5_rs, raid6_zr, raid6_nr, and raid6_nc.  (raid1 is planned for the
future.)  The list of required and optional parameters is the same for all
the current raid types.  The required parameters are positional, while the
optional parameters are given as key/value pairs.  The possible parameters
are as follows:

	<chunk_size>				Chunk size in sectors.
	[[no]sync]				Force/Prevent RAID initialization
	[rebuild <idx>]				Rebuild the drive indicated by the index
	[daemon_sleep <ms>]			Time between bitmap daemon work to clear bits
	[min_recovery_rate <kB/sec/disk>]	Throttle RAID initialization
	[max_recovery_rate <kB/sec/disk>]	Throttle RAID initialization
	[max_write_behind <sectors>]		See '--write-behind=' (man mdadm)
	[stripe_cache <sectors>]		Stripe cache size for higher RAIDs

Line 3 contains the list of devices that compose the array in metadata/data
device pairs.  If the metadata is stored separately, a '-' is given for the
metadata device position.  If a drive has failed or is missing at creation
time, a '-' can be given for both the metadata and data drives for a given
position.

NB. Currently all metadata devices must be specified as '-'.

Examples:
# RAID4 - 4 data drives, 1 parity
# No metadata devices specified to hold superblock/bitmap info
# Chunk size of 1MiB
# (Lines separated for easy reading)
0 1960893648 raid \
	raid4 1 2048 \
	5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81

# RAID4 - 4 data drives, 1 parity (no metadata devices)
# Chunk size of 1MiB, force RAID initialization,
#	min recovery rate at 20 kiB/sec/disk
0 1960893648 raid \
	raid4 4 2048 min_recovery_rate 20 sync \
	5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81

Performing a 'dmsetup table' should display the CTR table used to construct
the mapping (with possible reordering of optional parameters).

Performing a 'dmsetup status' will yield information on the state and health
of the array.  The output is as follows:

1: <s> <l> raid \
2:	<raid_type> <#devices> <1 health char for each dev> <resync_ratio>

Line 1 is standard DM output.  Line 2 is best shown by example:

	0 1960893648 raid raid4 5 AAAAA 2/490221568

Here we can see that the RAID type is raid4, that there are 5 devices, all of
which are 'A'live, and that recovery of the array is 2/490221568 complete.
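As a usage sketch (not part of the file above), such a table could be loaded
with dmsetup; the mapping name "my_raid4" is arbitrary, and the status line
shown is simply the document's own example output:

	# dmsetup create my_raid4 --table \
		"0 1960893648 raid raid4 1 2048 5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81"
	# dmsetup status my_raid4
	0 1960893648 raid raid4 5 AAAAA 2/490221568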
Documentation/filesystems/proc.txt  (+7 −0)

@@ -375,6 +375,7 @@
Anonymous:             0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:              374 kB

The first of these lines shows the same information as is displayed for the
mapping in /proc/PID/maps.  The remaining lines show the size of the mapping

@@ -670,6 +671,8 @@ varies by architecture and compile options.  The following is from a

> cat /proc/meminfo

The "Locked" field indicates whether the mapping is locked in memory or not.

MemTotal:     16344972 kB
MemFree:      13634064 kB

@@ -1320,6 +1323,10 @@ scaled linearly with /proc/<pid>/oom_score_adj.

Writing to /proc/<pid>/oom_score_adj or /proc/<pid>/oom_adj will change the
other with its scaled value.

The value of /proc/<pid>/oom_score_adj may be reduced no lower than the last
value set by a CAP_SYS_RESOURCE process.  To reduce the value any lower
requires CAP_SYS_RESOURCE.

NOTICE: /proc/<pid>/oom_adj is deprecated and will be removed, please see
Documentation/feature-removal-schedule.txt.
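A brief usage sketch for the OOM score interface (PID 1234 is illustrative):
writing the minimum oom_score_adj value disables OOM killing for the task,
and the deprecated oom_adj file should then read back as the corresponding
scaled value, -17 (the legacy OOM_DISABLE value):

	# echo -1000 > /proc/1234/oom_score_adj
	# cat /proc/1234/oom_adj
	-17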
Documentation/gpio.txt  (+1 −1)

@@ -135,7 +135,7 @@ setting up a platform_device using the GPIO, is to mark its direction:

	int gpio_direction_input(unsigned gpio);
	int gpio_direction_output(unsigned gpio, int value);

-The return value is zero for success, else a negative errno.  It must
+The return value is zero for success, else a negative errno.  It should
be checked, since the get/set calls don't have error returns and since
misconfiguration is possible.  You should normally issue these calls from
a task context.  However, for spinlock-safe GPIOs it's OK to use them