author | Eli Bendersky <eliben@chromium.org> | 2013-07-18 18:00:27 -0700 |
---|---|---|
committer | Eli Bendersky <eliben@chromium.org> | 2013-07-18 18:00:27 -0700 |
commit | 4412ea4b8e019d00dc7574fe1723eea0473a8ec1 (patch) | |
tree | 2badd5ce0727bfad02f10d0d82c8bcfa65677676 /docs | |
parent | 4a9f2a703db400ccf760f34101bcdd57642f96e4 (diff) | |
parent | 5b548094edef39376e17445aea28ad2b37d701c4 (diff) |
Merge remote-tracking branch 'origin/master'
Diffstat (limited to 'docs')
-rw-r--r-- | docs/PNaClLangRef.rst | 187 |
1 files changed, 174 insertions, 13 deletions
diff --git a/docs/PNaClLangRef.rst b/docs/PNaClLangRef.rst
index 4f322a3eff..eef6023e2b 100644
--- a/docs/PNaClLangRef.rst
+++ b/docs/PNaClLangRef.rst
@@ -113,21 +113,139 @@ Volatile Memory Accesses
 
 `LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_
 
-TODO: are we going to promote volatile to atomic?
+PNaCl bitcode does not support volatile memory accesses.
+
+.. note::
+
+  The C11/C++11 standards mandate that ``volatile`` accesses execute
+  in program order (but are not fences, so other memory operations can
+  reorder around them), are not necessarily atomic, and can’t be
+  elided. They can be separated into smaller width accesses.
+
+  The PNaCl toolchain applies regular LLVM optimizations along these
+  guidelines, and it further prevents any load/store (even
+  non-``volatile`` and non-atomic ones) from moving above or below a
+  volatile operations: they act as compiler barriers before
+  optimizations occur. The PNaCl toolchain freezes ``volatile``
+  accesses after optimizations into atomic accesses with sequentially
+  consistent memory ordering. This eases the support of legacy
+  (i.e. non-C11/C++11) code, and combined with builtin fences these
+  programs can do meaningful cross-thread communication without
+  changing code. It also reflects the original code's intent and
+  guarantees better portability.
+
+  Relaxed ordering could be used instead, but for the first release it
+  is more conservative to apply sequential consistency. Future
+  releases may change what happens at compile-time, but
+  already-released pexes will continue using sequential consistency.
+
+  The PNaCl toolchain also requires that ``volatile`` accesses be at
+  least naturally aligned, and tries to guarantee this alignment.
 
 Memory Model for Concurrent Operations
 --------------------------------------
 
 `LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_
 
-TODO.
+The memory model offered by PNaCl relies on the same coding guidelines
+as the C11/C++11 one: concurrent accesses must always occur through
+atomic primitives (offered by `atomic intrinsics`_), and these accesses
+must always occur with the same size for the same memory
+location. Visibility of stores is provided on a happens-before basis
+that relates memory locations to each other as the C11/C++11 standards
+do.
+
+.. note::
+
+  As in C11/C++11 some atomic accesses may be implemented with locks
+  on certain platforms. The ``ATOMIC_*_LOCK_FREE`` macros will always
+  be ``1``, signifying that all types are sometimes lock-free. The
+  ``is_lock_free`` methods will return the current platform's
+  implementation at runtime.
+
+  The PNaCl toolchain supports concurrent memory accesses through
+  legacy GCC-style ``__sync_*`` builtins, as well as through C11/C++11
+  atomic primitives. ``volatile`` memory accesses can also be used,
+  though these are discouraged, and aren't present in bitcode.
+
+  PNaCl supports concurrency and parallelism with some restrictions:
+
+  * Threading is explicitly supported.
+  * Inter-process communication through shared memory is limited to
+    operations which are lock-free on the current platform
+    (``is_lock_free`` methods). This may change at a later date.
+  * Direct interaction with device memory isn't supported.
+  * Signal handling isn't supported, PNaCl therefore promotes all
+    primitives to cross-thread (instead of single-thread). This may
+    change at a later date. Note that using atomic operations which
+    aren't lock-free may lead to deadlocks when handling asynchronous
+    signals.
+  * ``volatile`` and atomic operations are address-free (operations on
+    the same memory location via two different addresses work
+    atomically), as intended by the C11/C++11 standards. This is
+    critical for inter-process communication as well as synchronous
+    "external modifications" such as mapping underlying memory at
+    multiple locations.
+
+  Setting up the above mechanisms requires assistance from the
+  embedding sandbox's runtime (e.g. NaCl's Pepper APIs), but using
+  them once setup can be done through regular C/C++ code.
+
+  The PNaCl toolchain currently optimizes for memory ordering as LLVM
+  normally does, but at pexe creation time it promotes all
+  ``volatile`` accesses as well as all atomic accesses to be
+  sequentially consistent. Other memory orderings will be supported in
+  a future release, but pexes generated with the current toolchain
+  will continue functioning with sequential consistency. Using
+  sequential consistency provides a total ordering for all
+  sequentially-consistent operations on all addresses.
+
+  This means that ``volatile`` and atomic memory accesses can only be
+  re-ordered in some limited way before the pexe is created, and will
+  act as fences for all memory accesses (even non-atomic and
+  non-``volatile``) after pexe creation. Non-atomic and
+  non-``volatile`` memory accesses may be reordered (unless a fence
+  intervenes), separated, elided or fused according to C and C++'s
+  memory model before the pexe is created as well as after its
+  creation.
 
 Atomic Memory Ordering Constraints
 ----------------------------------
 
 `LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_
 
-TODO.
+PNaCl bitcode currently supports sequential consistency only, through
+its `atomic intrinsics`_.
+
+.. note::
+
+  Atomics follow the same ordering constraints as in regular LLVM, but
+  all accesses are promoted to sequential consistency (the strongest
+  memory ordering) at pexe creation time. As more C11/C++11 code
+  allows us to understand performance and portability needs we intend
+  to support the full gamut of C11/C++11 memory orderings:
+
+  - Relaxed: no operation orders memory.
+  - Consume: a load operation performs a consume operation on the
+    affected memory location (currently unsupported by LLVM).
+  - Acquire: a load operation performs an acquire operation on the
+    affected memory location.
+  - Release: a store operation performs a release operation on the
+    affected memory location.
+  - Acquire-release: load and store operations perform acquire and
+    release operations on the affected memory.
+  - Sequentially consistent: same as acquire-release, but providing
+    a global total ordering for all affected locations.
+
+  As in C11/C++11:
+
+  - Atomic accesses must at least be naturally aligned.
+  - Some accesses may not actually be atomic on certain platforms,
+    requiring an implementation that uses a global lock.
+  - An atomic memory location must always be accessed with atomic
+    primitives, and these primitives must always be of the same bit
+    size for that location.
+  - Not all memory orderings are valid for all atomic operations.
 
 Fast-Math Flags
 ---------------
@@ -277,14 +395,6 @@ Only the LLVM instructions listed here are supported by PNaCl bitcode.
   The pointer argument of these instructions must be a *normalized* pointer
   (see :ref:`pointer types <pointertypes>`).
 
-* ``fence``
-* ``cmpxchg``, ``atomicrmw``
-
-  The pointer argument of these instructions must be a *normalized* pointer
-  (see :ref:`pointer types <pointertypes>`).
-
-  TODO(jfb): this may change
-
 * ``trunc``
 * ``zext``
 * ``sext``
@@ -323,8 +433,6 @@ Intrinsic Functions
 
 The only intrinsics supported by PNaCl bitcode are the following.
 
-TODO(jfb): atomics
-
 * ``llvm.memcpy``
 * ``llvm.memmove``
 * ``llvm.memset``
@@ -365,3 +473,56 @@ TODO(jfb): atomics
 
 TODO: describe
 
+.. _atomic intrinsics:
+
+* ``llvm.nacl.atomic.store``
+* ``llvm.nacl.atomic.load``
+* ``llvm.nacl.atomic.rmw``
+* ``llvm.nacl.atomic.cmpxchg``
+* ``llvm.nacl.atomic.fence``
+
+  .. code-block:: llvm
+
+    declare iN @llvm.nacl.atomic.load.<size>(
+            iN* <source>, i32 <memory_order>)
+    declare void @llvm.nacl.atomic.store.<size>(
+            iN <operand>, iN* <destination>, i32 <memory_order>)
+    declare iN @llvm.nacl.atomic.rmw.<size>(
+            i32 <computation>, iN* <object>, iN <operand>, i32 <memory_order>)
+    declare iN @llvm.nacl.atomic.cmpxchg.<size>(
+            iN* <object>, iN <expected>, iN <desired>,
+            i32 <memory_order_success>, i32 <memory_order_failure>)
+    declare void @llvm.nacl.atomic.fence(i32 <memory_order>)
+
+  Each of these intrinsics is overloaded on the ``iN`` argument, which
+  is reflected through ``<size>`` in the overload's name. Integral types
+  of 8, 16, 32 and 64-bit width are supported for these arguments.
+
+  The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following
+  read-modify-write operations, from the general and arithmetic sections
+  of the C11/C++11 standards:
+
+   - ``add``
+   - ``sub``
+   - ``or``
+   - ``and``
+   - ``xor``
+   - ``exchange``
+
+  For all of these read-modify-write operations, the returned value is
+  that at ``object`` before the computation. The ``computation``
+  argument must be a compile-time constant.
+
+  All atomic intrinsics also support C11/C++11 memory orderings, which
+  must be compile-time constants. Those are detailed in `Atomic Memory
+  Ordering Constraints`_.
+
+  Integer values for these computations and memory orderings are defined
+  in ``"llvm/IR/NaClAtomicIntrinsics.h"``.
+
+  .. note::
+
+    These intrinsics allow PNaCl to support C11/C++11 style atomic
+    operations as well as some legacy GCC-style ``__sync_*`` builtins
+    while remaining stable as the LLVM codebase changes. The user
+    isn't expected to use these intrinsics directly.
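
The patch states that the new intrinsics back C11/C++11 style atomics and that users are not expected to call them directly. As a rough illustration of the source-level side, here is a minimal C++11 sketch using only the standard ``<atomic>`` header. The names ``AtomicDemoResult`` and ``run_atomic_demo`` are hypothetical, and the comments pairing each call with an intrinsic are an assumption about how a toolchain might lower it, not something this commit shows. Sequential consistency is requested explicitly because the patch says it is the only ordering present in bitcode.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical record of what each operation observed (illustration only).
struct AtomicDemoResult {
  int32_t loaded;           // value seen by the plain atomic load
  int32_t before_add;       // value returned by fetch_add (the prior value)
  int32_t before_exchange;  // value returned by exchange (the prior value)
  bool cas_succeeded;       // whether compare-exchange swapped the value
  int32_t final_value;      // value left in the object at the end
};

// Hypothetical helper: runs one of each kind of atomic operation on a
// 32-bit object, always with sequentially consistent ordering.
AtomicDemoResult run_atomic_demo() {
  std::atomic<int32_t> object(0);
  AtomicDemoResult r;

  // Assumed counterpart: llvm.nacl.atomic.store
  object.store(5, std::memory_order_seq_cst);

  // Assumed counterpart: llvm.nacl.atomic.load
  r.loaded = object.load(std::memory_order_seq_cst);

  // Assumed counterpart: llvm.nacl.atomic.rmw with the "add" computation.
  // Like the rmw intrinsic, fetch_add returns the value held before the add.
  r.before_add = object.fetch_add(3, std::memory_order_seq_cst);

  // Assumed counterpart: llvm.nacl.atomic.rmw with "exchange".
  r.before_exchange = object.exchange(20, std::memory_order_seq_cst);

  // Assumed counterpart: llvm.nacl.atomic.cmpxchg, with separate success
  // and failure orderings.
  int32_t expected = 20;
  r.cas_succeeded = object.compare_exchange_strong(
      expected, 42, std::memory_order_seq_cst, std::memory_order_seq_cst);

  // Assumed counterpart: llvm.nacl.atomic.fence
  std::atomic_thread_fence(std::memory_order_seq_cst);

  r.final_value = object.load(std::memory_order_seq_cst);
  return r;
}
```

Calling ``run_atomic_demo()`` from a single thread exercises each operation once; the read-modify-write calls hand back the value held before the modification, matching the "returned value is that at ``object`` before the computation" rule quoted in the patch.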