diff options
Diffstat (limited to 'Documentation/memory-barriers.txt')
-rw-r--r-- | Documentation/memory-barriers.txt | 1913 |
1 files changed, 1913 insertions, 0 deletions
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt new file mode 100644 index 00000000000..f8550310a6d --- /dev/null +++ b/Documentation/memory-barriers.txt @@ -0,0 +1,1913 @@ + ============================ + LINUX KERNEL MEMORY BARRIERS + ============================ + +By: David Howells <dhowells@redhat.com> + +Contents: + + (*) Abstract memory access model. + + - Device operations. + - Guarantees. + + (*) What are memory barriers? + + - Varieties of memory barrier. + - What may not be assumed about memory barriers? + - Data dependency barriers. + - Control dependencies. + - SMP barrier pairing. + - Examples of memory barrier sequences. + + (*) Explicit kernel barriers. + + - Compiler barrier. + - The CPU memory barriers. + - MMIO write barrier. + + (*) Implicit kernel memory barriers. + + - Locking functions. + - Interrupt disabling functions. + - Miscellaneous functions. + + (*) Inter-CPU locking barrier effects. + + - Locks vs memory accesses. + - Locks vs I/O accesses. + + (*) Where are memory barriers needed? + + - Interprocessor interaction. + - Atomic operations. + - Accessing devices. + - Interrupts. + + (*) Kernel I/O barrier effects. + + (*) Assumed minimum execution ordering model. + + (*) The effects of the cpu cache. + + - Cache coherency. + - Cache coherency vs DMA. + - Cache coherency vs MMIO. + + (*) The things CPUs get up to. + + - And then there's the Alpha. + + (*) References. + + +============================ +ABSTRACT MEMORY ACCESS MODEL +============================ + +Consider the following abstract model of the system: + + : : + : : + : : + +-------+ : +--------+ : +-------+ + | | : | | : | | + | | : | | : | | + | CPU 1 |<----->| Memory |<----->| CPU 2 | + | | : | | : | | + | | : | | : | | + +-------+ : +--------+ : +-------+ + ^ : ^ : ^ + | : | : | + | : | : | + | : v : | + | : +--------+ : | + | : | | : | + | : | | : | + +---------->| Device |<----------+ + : | | : + : | | : + : +--------+ : + : : + +Each CPU executes a program that generates memory access operations. In the +abstract CPU, memory operation ordering is very relaxed, and a CPU may actually +perform the memory operations in any order it likes, provided program causality +appears to be maintained. Similarly, the compiler may also arrange the +instructions it emits in any order it likes, provided it doesn't affect the +apparent operation of the program. + +So in the above diagram, the effects of the memory operations performed by a +CPU are perceived by the rest of the system as the operations cross the +interface between the CPU and rest of the system (the dotted lines). + + +For example, consider the following sequence of events: + + CPU 1 CPU 2 + =============== =============== + { A == 1; B == 2 } + A = 3; x = A; + B = 4; y = B; + +The set of accesses as seen by the memory system in the middle can be arranged +in 24 different combinations: + + STORE A=3, STORE B=4, x=LOAD A->3, y=LOAD B->4 + STORE A=3, STORE B=4, y=LOAD B->4, x=LOAD A->3 + STORE A=3, x=LOAD A->3, STORE B=4, y=LOAD B->4 + STORE A=3, x=LOAD A->3, y=LOAD B->2, STORE B=4 + STORE A=3, y=LOAD B->2, STORE B=4, x=LOAD A->3 + STORE A=3, y=LOAD B->2, x=LOAD A->3, STORE B=4 + STORE B=4, STORE A=3, x=LOAD A->3, y=LOAD B->4 + STORE B=4, ... + ... + +and can thus result in four different combinations of values: + + x == 1, y == 2 + x == 1, y == 4 + x == 3, y == 2 + x == 3, y == 4 + + +Furthermore, the stores committed by a CPU to the memory system may not be +perceived by the loads made by another CPU in the same order as the stores were +committed. + + +As a further example, consider this sequence of events: + + CPU 1 CPU 2 + =============== =============== + { A == 1, B == 2, C = 3, P == &A, Q == &C } + B = 4; Q = P; + P = &B D = *Q; + +There is an obvious data dependency here, as the value loaded into D depends on +the address retrieved from P by CPU 2. At the end of the sequence, any of the +following results are possible: + + (Q == &A) and (D == 1) + (Q == &B) and (D == 2) + (Q == &B) and (D == 4) + +Note that CPU 2 will never try and load C into D because the CPU will load P +into Q before issuing the load of *Q. + + +DEVICE OPERATIONS +----------------- + +Some devices present their control interfaces as collections of memory +locations, but the order in which the control registers are accessed is very +important. For instance, imagine an ethernet card with a set of internal +registers that are accessed through an address port register (A) and a data +port register (D). To read internal register 5, the following code might then +be used: + + *A = 5; + x = *D; + +but this might show up as either of the following two sequences: + + STORE *A = 5, x = LOAD *D + x = LOAD *D, STORE *A = 5 + +the second of which will almost certainly result in a malfunction, since it set +the address _after_ attempting to read the register. + + +GUARANTEES +---------- + +There are some minimal guarantees that may be expected of a CPU: + + (*) On any given CPU, dependent memory accesses will be issued in order, with + respect to itself. This means that for: + + Q = P; D = *Q; + + the CPU will issue the following memory operations: + + Q = LOAD P, D = LOAD *Q + + and always in that order. + + (*) Overlapping loads and stores within a particular CPU will appear to be + ordered within that CPU. This means that for: + + a = *X; *X = b; + + the CPU will only issue the following sequence of memory operations: + + a = LOAD *X, STORE *X = b + + And for: + + *X = c; d = *X; + + the CPU will only issue: + + STORE *X = c, d = LOAD *X + + (Loads and stores overlap if they are targetted at overlapping pieces of + memory). + +And there are a number of things that _must_ or _must_not_ be assumed: + + (*) It _must_not_ be assumed that independent loads and stores will be issued + in the order given. This means that for: + + X = *A; Y = *B; *D = Z; + + we may get any of the following sequences: + + X = LOAD *A, Y = LOAD *B, STORE *D = Z + X = LOAD *A, STORE *D = Z, Y = LOAD *B + Y = LOAD *B, X = LOAD *A, STORE *D = Z + Y = LOAD *B, STORE *D = Z, X = LOAD *A + STORE *D = Z, X = LOAD *A, Y = LOAD *B + STORE *D = Z, Y = LOAD *B, X = LOAD *A + + (*) It _must_ be assumed that overlapping memory accesses may be merged or + discarded. This means that for: + + X = *A; Y = *(A + 4); + + we may get any one of the following sequences: + + X = LOAD *A; Y = LOAD *(A + 4); + Y = LOAD *(A + 4); X = LOAD *A; + {X, Y} = LOAD {*A, *(A + 4) }; + + And for: + + *A = X; Y = *A; + + we may get either of: + + STORE *A = X; Y = LOAD *A; + STORE *A = Y; + + +========================= +WHAT ARE MEMORY BARRIERS? +========================= + +As can be seen above, independent memory operations are effectively performed +in random order, but this can be a problem for CPU-CPU interaction and for I/O. +What is required is some way of intervening to instruct the compiler and the +CPU to restrict the order. + +Memory barriers are such interventions. They impose a perceived partial +ordering between the memory operations specified on either side of the barrier. +They request that the sequence of memory events generated appears to other +parts of the system as if the barrier is effective on that CPU. + + +VARIETIES OF MEMORY BARRIER +--------------------------- + +Memory barriers come in four basic varieties: + + (1) Write (or store) memory barriers. + + A write memory barrier gives a guarantee that all the STORE operations + specified before the barrier will appear to happen before all the STORE + operations specified after the barrier with respect to the other + components of the system. + + A write barrier is a partial ordering on stores only; it is not required + to have any effect on loads. + + A CPU can be viewed as as commiting a sequence of store operations to the + memory system as time progresses. All stores before a write barrier will + occur in the sequence _before_ all the stores after the write barrier. + + [!] Note that write barriers should normally be paired with read or data + dependency barriers; see the "SMP barrier pairing" subsection. + + + (2) Data dependency barriers. + + A data dependency barrier is a weaker form of read barrier. In the case + where two loads are performed such that the second depends on the result + of the first (eg: the first load retrieves the address to which the second + load will be directed), a data dependency barrier would be required to + make sure that the target of the second load is updated before the address + obtained by the first load is accessed. + + A data dependency barrier is a partial ordering on interdependent loads + only; it is not required to have any effect on stores, independent loads + or overlapping loads. + + As mentioned in (1), the other CPUs in the system can be viewed as + committing sequences of stores to the memory system that the CPU being + considered can then perceive. A data dependency barrier issued by the CPU + under consideration guarantees that for any load preceding it, if that + load touches one of a sequence of stores from another CPU, then by the + time the barrier completes, the effects of all the stores prior to that + touched by the load will be perceptible to any loads issued after the data + dependency barrier. + + See the "Examples of memory barrier sequences" subsection for diagrams + showing the ordering constraints. + + [!] Note that the first load really has to have a _data_ dependency and + not a control dependency. If the address for the second load is dependent + on the first load, but the dependency is through a conditional rather than + actually loading the address itself, then it's a _control_ dependency and + a full read barrier or better is required. See the "Control dependencies" + subsection for more information. + + [!] Note that data dependency barriers should normally be paired with + write barriers; see the "SMP barrier pairing" subsection. + + + (3) Read (or load) memory barriers. + + A read barrier is a data dependency barrier plus a guarantee that all the + LOAD operations specified before the barrier will appear to happen before + all the LOAD operations specified after the barrier with respect to the + other components of the system. + + A read barrier is a partial ordering on loads only; it is not required to + have any effect on stores. + + Read memory barriers imply data dependency barriers, and so can substitute + for them. + + [!] Note that read barriers should normally be paired with write barriers; + see the "SMP barrier pairing" subsection. + + + (4) General memory barriers. + + A general memory barrier is a combination of both a read memory barrier + and a write memory barrier. It is a partial ordering over both loads and + stores. + + General memory barriers imply both read and write memory barriers, and so + can substitute for either. + + +And a couple of implicit varieties: + + (5) LOCK operations. + + This acts as a one-way permeable barrier. It guarantees that all memory + operations after the LOCK operation will appear to happen after the LOCK + operation with respect to the other components of the system. + + Memory operations that occur before a LOCK operation may appear to happen + after it completes. + + A LOCK operation should almost always be paired with an UNLOCK operation. + + + (6) UNLOCK operations. + + This also acts as a one-way permeable barrier. It guarantees that all + memory operations before the UNLOCK operation will appear to happen before + the UNLOCK operation with respect to the other components of the system. + + Memory operations that occur after an UNLOCK operation may appear to + happen before it completes. + + LOCK and UNLOCK operations are guaranteed to appear with respect to each + other strictly in the order specified. + + The use of LOCK and UNLOCK operations generally precludes the need for + other sorts of memory barrier (but note the exceptions mentioned in the + subsection "MMIO write barrier"). + + +Memory barriers are only required where there's a possibility of interaction +between two CPUs or between a CPU and a device. If it can be guaranteed that +there won't be any such interaction in any particular piece of code, then +memory barriers are unnecessary in that piece of code. + + +Note that these are the _minimum_ guarantees. Different architectures may give +more substantial guarantees, but they may _not_ be relied upon outside of arch +specific code. + + +WHAT MAY NOT BE ASSUMED ABOUT MEMORY BARRIERS? +---------------------------------------------- + +There are certain things that the Linux kernel memory barriers do not guarantee: + + (*) There is no guarantee that any of the memory accesses specified before a + memory barrier will be _complete_ by the completion of a memory barrier + instruction; the barrier can be considered to draw a line in that CPU's + access queue that accesses of the appropriate type may not cross. + + (*) There is no guarantee that issuing a memory barrier on one CPU will have + any direct effect on another CPU or any other hardware in the system. The + indirect effect will be the order in which the second CPU sees the effects + of the first CPU's accesses occur, but see the next point: + + (*) There is no guarantee that the a CPU will see the correct order of effects + from a second CPU's accesses, even _if_ the second CPU uses a memory + barrier, unless the first CPU _also_ uses a matching memory barrier (see + the subsection on "SMP Barrier Pairing"). + + (*) There is no guarantee that some intervening piece of off-the-CPU + hardware[*] will not reorder the memory accesses. CPU cache coherency + mechanisms should propagate the indirect effects of a memory barrier + between CPUs, but might not do so in order. + + [*] For information on bus mastering DMA and coherency please read: + + Documentation/pci.txt + Documentation/DMA-mapping.txt + Documentation/DMA-API.txt + + +DATA DEPENDENCY BARRIERS +------------------------ + +The usage requirements of data dependency barriers are a little subtle, and +it's not always obvious that they're needed. To illustrate, consider the +following sequence of events: + + CPU 1 CPU 2 + =============== =============== + { A == 1, B == 2, C = 3, P == &A, Q == &C } + B = 4; + <write barrier> + P = &B + Q = P; + D = *Q; + +There's a clear data dependency here, and it would seem that by the end of the +sequence, Q must be either &A or &B, and that: + + (Q == &A) implies (D == 1) + (Q == &B) implies (D == 4) + +But! CPU 2's perception of P may be updated _before_ its perception of B, thus +leading to the following situation: + + (Q == &B) and (D == 2) ???? + +Whilst this may seem like a failure of coherency or causality maintenance, it +isn't, and this behaviour can be observed on certain real CPUs (such as the DEC +Alpha). + +To deal with this, a data dependency barrier must be inserted between the +address load and the data load: + + CPU 1 CPU 2 + =============== =============== + { A == 1, B == 2, C = 3, P == &A, Q == &C } + B = 4; + <write barrier> + P = &B + Q = P; + <data dependency barrier> + D = *Q; + +This enforces the occurrence of one of the two implications, and prevents the +third possibility from arising. + +[!] Note that this extremely counterintuitive situation arises most easily on +machines with split caches, so that, for example, one cache bank processes +even-numbered cache lines and the other bank processes odd-numbered cache +lines. The pointer P might be stored in an odd-numbered cache line, and the +variable B might be stored in an even-numbered cache line. Then, if the +even-numbered bank of the reading CPU's cache is extremely busy while the +odd-numbered bank is idle, one can see the new value of the pointer P (&B), +but the old value of the variable B (1). + + +Another example of where data dependency barriers might by required is where a +number is read from memory and then used to calculate the index for an array +access: + + CPU 1 CPU 2 + =============== =============== + { M[0] == 1, M[1] == 2, M[3] = 3, P == 0, Q == 3 } + M[1] = 4; + <write barrier> + P = 1 + Q = P; + <data dependency barrier> + D = M[Q]; + + +The data dependency barrier is very important to the RCU system, for example. +See rcu_dereference() in include/linux/rcupdate.h. This permits the current +target of an RCU'd pointer to be replaced with a new modified target, without +the replacement target appearing to be incompletely initialised. + +See also the subsection on "Cache Coherency" for a more thorough example. + + +CONTROL DEPENDENCIES +-------------------- + +A control dependency requires a full read memory barrier, not simply a data +dependency barrier to make it work correctly. Consider the following bit of +code: + + q = &a; + if (p) + q = &b; + <data dependency barrier> + x = *q; + +This will not have the desired effect because there is no actual data +dependency, but rather a control dependency that the CPU may short-circuit by +attempting to predict the outcome in advance. In such a case what's actually +required is: + + q = &a; + if (p) + q = &b; + <read barrier> + x = *q; + + +SMP BARRIER PAIRING +------------------- + +When dealing with CPU-CPU interactions, certain types of memory barrier should +always be paired. A lack of appropriate pairing is almost certainly an error. + +A write barrier should always be paired with a data dependency barrier or read +barrier, though a general barrier would also be viable. Similarly a read +barrier or a data dependency barrier should always be paired with at least an +write barrier, though, again, a general barrier is viable: + + CPU 1 CPU 2 + =============== =============== + a = 1; + <write barrier> + b = 2; x = a; + <read barrier> + y = b; + +Or: + + CPU 1 CPU 2 + =============== =============================== + a = 1; + <write barrier> + b = &a; x = b; + <data dependency barrier> + y = *x; + +Basically, the read barrier always has to be there, even though it can be of +the "weaker" type. + + +EXAMPLES OF MEMORY BARRIER SEQUENCES +------------------------------------ + +Firstly, write barriers act as a partial orderings on store operations. +Consider the following sequence of events: + + CPU 1 + ======================= + STORE A = 1 + STORE B = 2 + STORE C = 3 + <write barrier> + STORE D = 4 + STORE E = 5 + +This sequence of events is committed to the memory coherence system in an order +that the rest of the system might perceive as the unordered set of { STORE A, +STORE B, STORE C } all occuring before the unordered set of { STORE D, STORE E +}: + + +-------+ : : + | | +------+ + | |------>| C=3 | } /\ + | | : +------+ }----- \ -----> Events perceptible + | | : | A=1 | } \/ to rest of system + | | : +------+ } + | CPU 1 | : | B=2 | } + | | +------+ } + | | wwwwwwwwwwwwwwww } <--- At this point the write barrier + | | +------+ } requires all stores prior to the + | | : | E=5 | } barrier to be committed before + | | : +------+ } further stores may be take place. + | |------>| D=4 | } + | | +------+ + +-------+ : : + | + | Sequence in which stores committed to memory system + | by CPU 1 + V + + +Secondly, data dependency barriers act as a partial orderings on data-dependent +loads. Consider the following sequence of events: + + CPU 1 CPU 2 + ======================= ======================= + STORE A = 1 + STORE B = 2 + <write barrier> + STORE C = &B LOAD X + STORE D = 4 LOAD C (gets &B) + LOAD *C (reads B) + +Without intervention, CPU 2 may perceive the events on CPU 1 in some +effectively random order, despite the write barrier issued by CPU 1: + + +-------+ : : : : + | | +------+ +-------+ | Sequence of update + | |------>| B=2 |----- --->| Y->8 | | of perception on + | | : +------+ \ +-------+ | CPU 2 + | CPU 1 | : | A=1 | \ --->| C->&Y | V + | | +------+ | +-------+ + | | wwwwwwwwwwwwwwww | : : + | | +------+ | : : + | | : | C=&B |--- | : : +-------+ + | | : +------+ \ | +-------+ | | + | |------>| D=4 | ----------->| C->&B |------>| | + | | +------+ | +-------+ | | + +-------+ : : | : : | | + | : : | | + | : : | CPU 2 | + | +-------+ | | + Apparently incorrect ---> | | B->7 |------>| | + perception of B (!) | +-------+ | | + | : : | | + | +-------+ | | + The load of X holds ---> \ | X->9 |------>| | + up the maintenance \ +-------+ | | + of coherence of B ----->| B->2 | +-------+ + +-------+ + : : + + +In the above example, CPU 2 perceives that B is 7, despite the load of *C +(which would be B) coming after the the LOAD of C. + +If, however, a data dependency barrier were to be placed between the load of C +and the load of *C (ie: B) on CPU 2, then the following will occur: + + +-------+ : : : : + | | +------+ +-------+ + | |------>| B=2 |----- --->| Y->8 | + | | : +------+ \ +-------+ + | CPU 1 | : | A=1 | \ --->| C->&Y | + | | +------+ | +-------+ + | | wwwwwwwwwwwwwwww | : : + | | +------+ | : : + | | : | C=&B |--- | : : +-------+ + | | : +------+ \ | +-------+ | | + | |------>| D=4 | ----------->| C->&B |------>| | + | | +------+ | +-------+ | | + +-------+ : : | : : | | + | : : | | + | : : | CPU 2 | + | +-------+ | | + \ | X->9 |------>| | + \ +-------+ | | + ----->| B->2 | | | + +-------+ | | + Makes sure all effects ---> ddddddddddddddddd | | + prior to the store of C +-------+ | | + are perceptible to | B->2 |------>| | + successive loads +-------+ | | + : : +-------+ + + +And thirdly, a read barrier acts as a partial order on loads. Consider the +following sequence of events: + + CPU 1 CPU 2 + ======================= ======================= + STORE A=1 + STORE B=2 + STORE C=3 + <write barrier> + STORE D=4 + STORE E=5 + LOAD A + LOAD B + LOAD C + LOAD D + LOAD E + +Without intervention, CPU 2 may then choose to perceive the events on CPU 1 in +some effectively random order, despite the write barrier issued by CPU 1: + + +-------+ : : + | | +------+ + | |------>| C=3 | } + | | : +------+ } + | | : | A=1 | } + | | : +------+ } + | CPU 1 | : | B=2 | }--- + | | +------+ } \ + | | wwwwwwwwwwwww} \ + | | +------+ } \ : : +-------+ + | | : | E=5 | } \ +-------+ | | + | | : +------+ } \ { | C->3 |------>| | + | |------>| D=4 | } \ { +-------+ : | | + | | +------+ \ { | E->5 | : | | + +-------+ : : \ { +-------+ : | | + Transfer -->{ | A->1 | : | CPU 2 | + from CPU 1 { +-------+ : | | + to CPU 2 { | D->4 | : | | + { +-------+ : | | + { | B->2 |------>| | + +-------+ | | + : : +-------+ + + +If, however, a read barrier were to be placed between the load of C and the +load of D on CPU 2, then the partial ordering imposed by CPU 1 will be +perceived correctly by CPU 2. + + +-------+ : : + | | +------+ + | |------>| C=3 | } + | | : +------+ } + | | : | A=1 | }--- + | | : +------+ } \ + | CPU 1 | : | B=2 | } \ + | | +------+ \ + | | wwwwwwwwwwwwwwww \ + | | +------+ \ : : +-------+ + | | : | E=5 | } \ +-------+ | | + | | : +------+ }--- \ { | C->3 |------>| | + | |------>| D=4 | } \ \ { +-------+ : | | + | | +------+ \ -->{ | B->2 | : | | + +-------+ : : \ { +-------+ : | | + \ { | A->1 | : | CPU 2 | + \ +-------+ | | + At this point the read ----> \ rrrrrrrrrrrrrrrrr | | + barrier causes all effects \ +-------+ | | + prior to the storage of C \ { | E->5 | : | | + to be perceptible to CPU 2 -->{ +-------+ : | | + { | D->4 |------>| | + +-------+ | | + : : +-------+ + + +======================== +EXPLICIT KERNEL BARRIERS +======================== + +The Linux kernel has a variety of different barriers that act at different +levels: + + (*) Compiler barrier. + + (*) CPU memory barriers. + + (*) MMIO write barrier. + + +COMPILER BARRIER +---------------- + +The Linux kernel has an explicit compiler barrier function that prevents the +compiler from moving the memory accesses either side of it to the other side: + + barrier(); + +This a general barrier - lesser varieties of compiler barrier do not exist. + +The compiler barrier has no direct effect on the CPU, which may then reorder +things however it wishes. + + +CPU MEMORY BARRIERS +------------------- + +The Linux kernel has eight basic CPU memory barriers: + + TYPE MANDATORY SMP CONDITIONAL + =============== ======================= =========================== + GENERAL mb() smp_mb() + WRITE wmb() smp_wmb() + READ rmb() smp_rmb() + DATA DEPENDENCY read_barrier_depends() smp_read_barrier_depends() + + +All CPU memory barriers unconditionally imply compiler barriers. + +SMP memory barriers are reduced to compiler barriers on uniprocessor compiled +systems because it is assumed that a CPU will be appear to be self-consistent, +and will order overlapping accesses correctly with respect to itself. + +[!] Note that SMP memory barriers _must_ be used to control the ordering of +references to shared memory on SMP systems, though the use of locking instead +is sufficient. + +Mandatory barriers should not be used to control SMP effects, since mandatory +barriers unnecessarily impose overhead on UP systems. They may, however, be +used to control MMIO effects on accesses through relaxed memory I/O windows. +These are required even on non-SMP systems as they affect the order in which +memory operations appear to a device by prohibiting both the compiler and the +CPU from reordering them. + + +There are some more advanced barrier functions: + + (*) set_mb(var, value) + (*) set_wmb(var, value) + + These assign the value to the variable and then insert at least a write + barrier after it, depending on the function. They aren't guaranteed to + insert anything more than a compiler barrier in a UP compilation. + + + (*) smp_mb__before_atomic_dec(); + (*) smp_mb__after_atomic_dec(); + (*) smp_mb__before_atomic_inc(); + (*) smp_mb__after_atomic_inc(); + + These are for use with atomic add, subtract, increment and decrement + functions, especially when used for reference counting. These functions + do not imply memory barriers. + + As an example, consider a piece of code that marks an object as being dead + and then decrements the object's reference count: + + obj->dead = 1; + smp_mb__before_atomic_dec(); + atomic_dec(&obj->ref_count); + + This makes sure that the death mark on the object is perceived to be set + *before* the reference counter is decremented. + + See Documentation/atomic_ops.txt for more information. See the "Atomic + operations" subsection for information on where to use these. + + + (*) smp_mb__before_clear_bit(void); + (*) smp_mb__after_clear_bit(void); + + These are for use similar to the atomic inc/dec barriers. These are + typically used for bitwise unlocking operations, so care must be taken as + there are no implicit memory barriers here either. + + Consider implementing an unlock operation of some nature by clearing a + locking bit. The clear_bit() would then need to be barriered like this: + + smp_mb__before_clear_bit(); + clear_bit( ... ); + + This prevents memory operations before the clear leaking to after it. See + the subsection on "Locking Functions" with reference to UNLOCK operation + implications. + + See Documentation/atomic_ops.txt for more information. See the "Atomic + operations" subsection for information on where to use these. + + +MMIO WRITE BARRIER +------------------ + +The Linux kernel also has a special barrier for use with memory-mapped I/O +writes: + + mmiowb(); + +This is a variation on the mandatory write barrier that causes writes to weakly +ordered I/O regions to be partially ordered. Its effects may go beyond the +CPU->Hardware interface and actually affect the hardware at some level. + +See the subsection "Locks vs I/O accesses" for more information. + + +=============================== +IMPLICIT KERNEL MEMORY BARRIERS +=============================== + +Some of the other functions in the linux kernel imply memory barriers, amongst +which are locking, scheduling and memory allocation functions. + +This specification is a _minimum_ guarantee; any particular architecture may +provide more substantial guarantees, but these may not be relied upon outside +of arch specific code. + + +LOCKING FUNCTIONS +----------------- + +The Linux kernel has a number of locking constructs: + + (*) spin locks + (*) R/W spin locks + (*) mutexes + (*) semaphores + (*) R/W semaphores + (*) RCU + +In all cases there are variants on "LOCK" operations and "UNLOCK" operations +for each construct. These operations all imply certain barriers: + + (1) LOCK operation implication: + + Memory operations issued after the LOCK will be completed after the LOCK + operation has completed. + + Memory operations issued before the LOCK may be completed after the LOCK + operation has completed. + + (2) UNLOCK operation implication: + + Memory operations issued before the UNLOCK will be completed before the + UNLOCK operation has completed. + + Memory operations issued after the UNLOCK may be completed before the + UNLOCK operation has completed. + + (3) LOCK vs LOCK implication: + + All LOCK operations issued before another LOCK operation will be completed + before that LOCK operation. + + (4) LOCK vs UNLOCK implication: + + All LOCK operations issued before an UNLOCK operation will be completed + before the UNLOCK operation. + + All UNLOCK operations issued before a LOCK operation will be completed + before the LOCK operation. + + (5) Failed conditional LOCK implication: + + Certain variants of the LOCK operation may fail, either due to being + unable to get the lock immediately, or due to receiving an unblocked + signal whilst asleep waiting for the lock to become available. Failed + locks do not imply any sort of barrier. + +Therefore, from (1), (2) and (4) an UNLOCK followed by an unconditional LOCK is +equivalent to a full barrier, but a LOCK followed by an UNLOCK is not. + +[!] Note: one of the consequence of LOCKs and UNLOCKs being only one-way + barriers is that the effects instructions outside of a critical section may + seep into the inside of the critical section. + +Locks and semaphores may not provide any guarantee of ordering on UP compiled +systems, and so cannot be counted on in such a situation to actually achieve +anything at all - especially with respect to I/O accesses - unless combined +with interrupt disabling operations. + +See also the section on "Inter-CPU locking barrier effects". + + +As an example, consider the following: + + *A = a; + *B = b; + LOCK + *C = c; + *D = d; + UNLOCK + *E = e; + *F = f; + +The following sequence of events is acceptable: + + LOCK, {*F,*A}, *E, {*C,*D}, *B, UNLOCK + + [+] Note that {*F,*A} indicates a combined access. + +But none of the following are: + + {*F,*A}, *B, LOCK, *C, *D, UNLOCK, *E + *A, *B, *C, LOCK, *D, UNLOCK, *E, *F + *A, *B, LOCK, *C, UNLOCK, *D, *E, *F + *B, LOCK, *C, *D, UNLOCK, {*F,*A}, *E + + + +INTERRUPT DISABLING FUNCTIONS +----------------------------- + +Functions that disable interrupts (LOCK equivalent) and enable interrupts +(UNLOCK equivalent) will act as compiler barriers only. So if memory or I/O +barriers are required in such a situation, they must be provided from some +other means. + + +MISCELLANEOUS FUNCTIONS +----------------------- + +Other functions that imply barriers: + + (*) schedule() and similar imply full memory barriers. + + (*) Memory allocation and release functions imply full memory barriers. + + +================================= +INTER-CPU LOCKING BARRIER EFFECTS +================================= + +On SMP systems locking primitives give a more substantial form of barrier: one +that does affect memory access ordering on other CPUs, within the context of +conflict on any particular lock. + + +LOCKS VS MEMORY ACCESSES +------------------------ + +Consider the following: the system has a pair of spinlocks (N) and (Q), and +three CPUs; then should the following sequence of events occur: + + CPU 1 CPU 2 + =============================== =============================== + *A = a; *E = e; + LOCK M LOCK Q + *B = b; *F = f; + *C = c; *G = g; + UNLOCK M UNLOCK Q + *D = d; *H = h; + +Then there is no guarantee as to what order CPU #3 will see the accesses to *A +through *H occur in, other than the constraints imposed by the separate locks +on the separate CPUs. It might, for example, see: + + *E, LOCK M, LOCK Q, *G, *C, *F, *A, *B, UNLOCK Q, *D, *H, UNLOCK M + +But it won't see any of: + + *B, *C or *D preceding LOCK M |