Documentation/RCU/RTFP.txt  +125 −24

@@ -31,6 +31,14 @@
 has lapsed, so this approach may be used in non-GPL software, if
 desired.  (In contrast, implementation of RCU is permitted only in
 software licensed under either GPL or LGPL.  Sorry!!!)

+In 1987, Rashid et al. described lazy TLB-flush [RichardRashid87a].
+At first glance, this has nothing to do with RCU, but nevertheless this
+paper helped inspire the update-side batching used in the later RCU
+implementation in DYNIX/ptx.  In 1988, Barbara Liskov published a
+description of Argus that noted that use of out-of-date values can be
+tolerated in some situations.  Thus, this paper provides some early
+theoretical justification for use of stale data.

 In 1990, Pugh [Pugh90] noted that explicitly tracking which threads
 were reading a given data structure permitted deferred free to operate
 in the presence of non-terminating threads.  However, this explicit

@@ -41,11 +49,11 @@
 providing a fine-grained locking design, however, it would be
 interesting to see how much of the performance advantage reported in
 1990 remains today.

-At about this same time, Adams [Adams91] described ``chaotic
+At about this same time, Andrews [Andrews91textbook] described ``chaotic
 relaxation'', where the normal barriers between successive iterations
 of convergent numerical algorithms are relaxed, so that iteration $n$
 might use data from iteration $n-1$ or even $n-2$.  This introduces
 error, which typically slows convergence and thus increases the number
 of iterations required.  However, this increase is sometimes more than
 made up for by a reduction in the number of expensive barrier
 operations, which are otherwise required to synchronize the threads at
 the end

@@ -55,7 +63,8 @@
 is thus inapplicable to most data structures in operating-system
 kernels.

 In 1992, Henry (now Alexia) Massalin completed a dissertation advising
 parallel programmers to defer processing when feasible to simplify
-synchronization.  RCU makes extremely heavy use of this advice.
+synchronization [HMassalinPhD].  RCU makes extremely heavy use of this
+advice.

 In 1993, Jacobson [Jacobson93] verbally described what is perhaps the
 simplest deferred-free technique: simply waiting a fixed amount of time

@@ -90,27 +99,29 @@
 mechanism, which is quite similar to RCU [Gamsa99].  These operating
 systems made pervasive use of RCU in place of "existence locks", which
 greatly simplifies locking hierarchies and helps avoid deadlocks.

-2001 saw the first RCU presentation involving Linux [McKenney01a]
+The year 2000 saw an email exchange that would likely have led to yet
+another independent invention of something like RCU
+[RustyRussell2000a,RustyRussell2000b].  Instead, 2001 saw the first
+RCU presentation involving Linux [McKenney01a]
 at OLS.  The resulting abundance of RCU patches was presented the
 following year [McKenney02a], and use of RCU in dcache was first
 described that same year [Linder02a].

 Also in 2002, Michael [Michael02b,Michael02a] presented "hazard-pointer"
 techniques that defer the destruction of data structures to simplify
 non-blocking synchronization (wait-free synchronization, lock-free
 synchronization, and obstruction-free synchronization are all examples
-of non-blocking synchronization).  In particular, this technique
-eliminates locking, reduces contention, reduces memory latency for
+of non-blocking synchronization).  The corresponding journal article
+appeared in 2004 [MagedMichael04a].  This technique eliminates
+locking, reduces contention, reduces memory latency for
 readers, and parallelizes pipeline stalls and memory latency for
 writers.  However, these techniques still impose significant read-side
 overhead in the form of memory barriers.  Researchers at Sun worked
 along similar lines in the same timeframe [HerlihyLM02].  These
 techniques can be thought of as inside-out reference counts, where the
 count is represented by the number of hazard pointers referencing a
 given data structure rather than the more conventional counter field
 within the data structure itself.  The key advantage of inside-out
 reference counts is that they can be stored in immortal variables,
 thus allowing races between access and deletion to be avoided.

 By the same token, RCU can be thought of as a "bulk reference count",
 where some form of reference counter covers all reference by a given
 CPU

@@ -123,8 +134,10 @@
 can be thought of in other terms as well.

 In 2003, the K42 group described how RCU could be used to create
 hot-pluggable implementations of operating-system functions
 [Appavoo03a].
-Later that year saw a paper describing an RCU implementation of
-System V IPC [Arcangeli03], and an introduction to RCU in Linux Journal
+Later that year saw a paper describing an RCU implementation of
+System V IPC [Arcangeli03] (following up on a suggestion by Hugh
+Dickins [Dickins02a] and an implementation by Mingming Cao
+[MingmingCao2002IPCRCU]), and an introduction to RCU in Linux Journal
 [McKenney03a].

 2004 has seen a Linux-Journal article on use of RCU in dcache

@@ -383,6 +396,21 @@
 for Programming Languages and Operating Systems}"
 }
 }

+@phdthesis{HMassalinPhD
+,author="H. Massalin"
+,title="Synthesis: An Efficient Implementation of Fundamental
+Operating System Services"
+,school="Columbia University"
+,address="New York, NY"
+,year="1992"
+,annotation={
+	Mondo optimizing compiler.  Wait-free stuff.  Good advice:
+	defer work to avoid synchronization.  See page 90 (PDF page
+	106), Section 5.4, fourth bullet point.
+}
+}

 @unpublished{Jacobson93
 ,author="Van Jacobson"
 ,title="Avoid Read-Side Locking Via Delayed Free"

@@ -671,6 +699,20 @@
 Orran Krieger and Rusty Russell and Dipankar Sarma and Maneesh Soni"
 [Viewed October 18, 2004]"
 }

+@conference{Michael02b
+,author="Maged M. Michael"
+,title="High Performance Dynamic Lock-Free Hash Tables and List-Based
+Sets"
+,Year="2002"
+,Month="August"
+,booktitle="{Proceedings of the 14\textsuperscript{th} Annual ACM
+Symposium on Parallel Algorithms and Architecture}"
+,pages="73-82"
+,annotation={
+	Like the title says...
+}
+}

 @Conference{Linder02a
 ,Author="Hanna Linder and Dipankar Sarma and Maneesh Soni"
 ,Title="Scalability of the Directory Entry Cache"

@@ -727,6 +769,24 @@
 Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
 }
 }

+@conference{Michael02a
+,author="Maged M. Michael"
+,title="Safe Memory Reclamation for Dynamic Lock-Free Objects Using
+Atomic Reads and Writes"
+,Year="2002"
+,Month="August"
+,booktitle="{Proceedings of the 21\textsuperscript{st} Annual ACM
+Symposium on Principles of Distributed Computing}"
+,pages="21-30"
+,annotation={
+	Each thread keeps an array of pointers to items that it is
+	currently referencing.  Sort of an inside-out garbage collection
+	mechanism, but one that requires the accessing code to
+	explicitly state its needs.  Also requires read-side memory
+	barriers on most architectures.
+}
+}

 @unpublished{Dickins02a
 ,author="Hugh Dickins"
 ,title="Use RCU for System-V IPC"

@@ -735,6 +795,17 @@
 Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
 ,note="private communication"
 }

+@InProceedings{HerlihyLM02
+,author={Maurice Herlihy and Victor Luchangco and Mark Moir}
+,title="The Repeat Offender Problem: A Mechanism for Supporting
+Dynamic-Sized, Lock-Free Data Structures"
+,booktitle={Proceedings of 16\textsuperscript{th} International
+Symposium on Distributed Computing}
+,year=2002
+,month="October"
+,pages="339-353"
+}

 @unpublished{Sarma02b
 ,Author="Dipankar Sarma"
 ,Title="Some dcache\_rcu benchmark numbers"

@@ -749,6 +820,19 @@
 Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
 }
 }

+@unpublished{MingmingCao2002IPCRCU
+,Author="Mingming Cao"
+,Title="[PATCH]updated ipc lock patch"
+,month="October"
+,year="2002"
+,note="Available:
+\url{https://lkml.org/lkml/2002/10/24/262}
+[Viewed February 15, 2014]"
+,annotation={
+	Mingming Cao's patch to introduce RCU to SysV IPC.
+}
+}

 @unpublished{LinusTorvalds2003a
 ,Author="Linus Torvalds"
 ,Title="Re: {[PATCH]} small fixes in brlock.h"

@@ -982,6 +1066,23 @@
 Realtime Applications"
 }
 }

+@article{MagedMichael04a
+,author="Maged M. Michael"
+,title="Hazard Pointers: Safe Memory Reclamation for Lock-Free
+Objects"
+,Year="2004"
+,Month="June"
+,journal="IEEE Transactions on Parallel and Distributed Systems"
+,volume="15"
+,number="6"
+,pages="491-504"
+,url="Available:
+\url{http://www.research.ibm.com/people/m/michael/ieeetpds-2004.pdf}
+[Viewed March 1, 2005]"
+,annotation={
+	New canonical hazard-pointer citation.
+}
+}

 @phdthesis{PaulEdwardMcKenneyPhD
 ,author="Paul E. McKenney"
 ,title="Exploiting Deferred Destruction:


Documentation/RCU/checklist.txt  +13 −5

@@ -256,10 +256,10 @@ over a rather long period of time, but improvements are always welcome!
		variations on this theme.

	b.	Limiting update rate.  For example, if updates occur only
-		once per hour, then no explicit rate limiting is required,
-		unless your system is already badly broken.  The dcache
-		subsystem takes this approach -- updates are guarded by a
-		global lock, limiting their rate.
+		once per hour, then no explicit rate limiting is required,
+		unless your system is already badly broken.  Older versions
+		of the dcache subsystem take this approach, guarding
+		updates with a global lock, limiting their rate.

	c.	Trusted update -- if updates can only be done manually by
		superuser or some other trusted user, then it might not

@@ -268,7 +268,8 @@
		the machine.

	d.	Use call_rcu_bh() rather than call_rcu(), in order to take
-		advantage of call_rcu_bh()'s faster grace periods.
+		advantage of call_rcu_bh()'s faster grace periods.  (This
+		is only a partial solution, though.)

	e.	Periodically invoke synchronize_rcu(), permitting a limited
		number of updates per grace period.

@@ -276,6 +277,13 @@
	The same cautions apply to call_rcu_bh(), call_rcu_sched(),
	call_srcu(), and kfree_rcu().
+	Note that although these primitives do take action to avoid
+	memory exhaustion when any given CPU has too many callbacks, a
+	determined user could still exhaust memory.  This is especially
+	the case if a system with a large number of CPUs has been
+	configured to offload all of its RCU callbacks onto a single
+	CPU, or if the system has relatively little free memory.

 9.	All RCU list-traversal primitives, which include
	rcu_dereference(), list_for_each_entry_rcu(), and
	list_for_each_safe_rcu(), must be either within an RCU read-side


Documentation/arm64/memory.txt  +10 −6

@@ -35,11 +35,13 @@
 ffffffbc00000000	ffffffbdffffffff	   8GB		vmemmap
 ffffffbe00000000	ffffffbffbbfffff	  ~8GB		[guard, future vmmemap]
-ffffffbffbc00000	ffffffbffbdfffff	   2MB		earlyprintk device
+ffffffbffa000000	ffffffbffaffffff	  16MB		PCI I/O space
+ffffffbffb000000	ffffffbffbbfffff	  12MB		[guard]
-ffffffbffbe00000	ffffffbffbe0ffff	  64KB		PCI I/O space
+ffffffbffbc00000	ffffffbffbdfffff	   2MB		earlyprintk device
-ffffffbffbe10000	ffffffbcffffffff	  ~2MB		[guard]
+ffffffbffbe00000	ffffffbffbffffff	   2MB		[guard]
 ffffffbffc000000	ffffffbfffffffff	  64MB		modules

@@ -60,11 +62,13 @@
 fffffdfc00000000	fffffdfdffffffff	   8GB		vmemmap
 fffffdfe00000000	fffffdfffbbfffff	  ~8GB		[guard, future vmmemap]
-fffffdfffbc00000	fffffdfffbdfffff	   2MB		earlyprintk device
+fffffdfffa000000	fffffdfffaffffff	  16MB		PCI I/O space
+fffffdfffb000000	fffffdfffbbfffff	  12MB		[guard]
-fffffdfffbe00000	fffffdfffbe0ffff	  64KB		PCI I/O space
+fffffdfffbc00000	fffffdfffbdfffff	   2MB		earlyprintk device
-fffffdfffbe10000	fffffdfffbffffff	  ~2MB		[guard]
+fffffdfffbe00000	fffffdfffbffffff	   2MB		[guard]
 fffffdfffc000000	fffffdffffffffff	  64MB		modules


Documentation/device-mapper/cache.txt  +5 −6

@@ -124,12 +124,11 @@
 the default being 204800 sectors (or 100MB).

 Updating on-disk metadata
 -------------------------

-On-disk metadata is committed every time a REQ_SYNC or REQ_FUA bio is
-written.  If no such requests are made then commits will occur every
-second.  This means the cache behaves like a physical disk that has a
-write cache (the same is true of the thin-provisioning target).  If
-power is lost you may lose some recent writes.  The metadata should
-always be consistent in spite of any crash.
+On-disk metadata is committed every time a FLUSH or FUA bio is
+written.  If no such requests are made then commits will occur every
+second.  This means the cache behaves like a physical disk that has a
+volatile write cache.  If power is lost you may lose some recent
+writes.  The metadata should always be consistent in spite of any
+crash.

 The 'dirty' state for a cache block changes far too frequently for us
 to keep updating it on the fly.  So we treat it as a hint.  In normal


Documentation/device-mapper/thin-provisioning.txt  +31 −3

@@ -116,6 +116,35 @@
 Resuming a device with a new table itself triggers an event so the
 userspace daemon can use this to detect a situation where a new table
 already exceeds the threshold.

+A low water mark for the metadata device is maintained in the kernel
+and will trigger a dm event if free space on the metadata device drops
+below it.
+
+Updating on-disk metadata
+-------------------------
+
+On-disk metadata is committed every time a FLUSH or FUA bio is
+written.  If no such requests are made then commits will occur every
+second.  This means the thin-provisioning target behaves like a
+physical disk that has a volatile write cache.  If power is lost you
+may lose some recent writes.  The metadata should always be consistent
+in spite of any crash.
+
+If data space is exhausted the pool will either error or queue IO
+according to the configuration (see: error_if_no_space).  If metadata
+space is exhausted or a metadata operation fails: the pool will error
+IO until the pool is taken offline and repair is performed to
+1) fix any potential inconsistencies and 2) clear the flag that
+imposes repair.  Once the pool's metadata device is repaired it may be
+resized, which will allow the pool to return to normal operation.
+Note that if a pool is flagged as needing repair, the pool's data and
+metadata devices cannot be resized until repair is performed.  It
+should also be noted that when the pool's metadata space is exhausted
+the current metadata transaction is aborted.  Given that the pool will
+cache IO whose completion may have already been acknowledged to upper
+IO layers (e.g. filesystem) it is strongly suggested that consistency
+checks (e.g. fsck) be performed on those layers when repair of the
+pool is required.

 Thin provisioning
 -----------------

@@ -258,10 +287,9 @@ ii) Status

 Should register for the event and then check the target's status.

	held metadata root:
-		The location, in sectors, of the metadata root that has been
+		The location, in blocks, of the metadata root that has been
		'held' for userspace read access.  '-' indicates there is
-		no held root.  This feature is not yet implemented so '-' is
-		always returned.
+		no held root.

 discard_passdown|no_discard_passdown
	Whether or not discards are actually being passed down to the