author | Eli Bendersky <eliben@chromium.org> | 2013-07-29 13:26:35 -0700 |
---|---|---|
committer | Eli Bendersky <eliben@chromium.org> | 2013-07-29 13:26:35 -0700 |
commit | 8347d6d0703d610e7ce229ff5c0e06501b0922a3 (patch) | |
tree | 69d85888b7c781edb669a8e0ac4072b202e5abef /docs | |
parent | 55076f91ef19e89d9aedeca77eef6003794cd8ae (diff) |
Beginnings of a "PNaCl Developer's Guide".
The first piece of contents is the atomic/memory model "notes" - currently
crudely ripped out of PNaClLangRef.rst and replaced with links.
BUG=None
R=jfb@chromium.org
Review URL: https://codereview.chromium.org/21089005
Diffstat (limited to 'docs')
-rw-r--r-- | docs/PNaClDeveloperGuide.rst | 132 | ||||
-rw-r--r-- | docs/PNaClLangRef.rst | 129 |
2 files changed, 139 insertions, 122 deletions
diff --git a/docs/PNaClDeveloperGuide.rst b/docs/PNaClDeveloperGuide.rst
new file mode 100644
index 0000000000..7bd45dd6fd
--- /dev/null
+++ b/docs/PNaClDeveloperGuide.rst
@@ -0,0 +1,132 @@
+=======================
+PNaCl Developer's Guide
+=======================
+
+.. contents::
+   :local:
+   :depth: 3
+
+Introduction
+============
+
+TODO
+
+Memory Model and Atomics
+========================
+
+Volatile Memory Accesses
+------------------------
+
+The C11/C++11 standards mandate that ``volatile`` accesses execute in program
+order (but are not fences, so other memory operations can reorder around them),
+are not necessarily atomic, and can’t be elided. They can be separated into
+smaller width accesses.
+
+The PNaCl toolchain applies regular LLVM optimizations along these guidelines,
+and it further prevents any load/store (even non-``volatile`` and non-atomic
+ones) from moving above or below a volatile operations: they act as compiler
+barriers before optimizations occur. The PNaCl toolchain freezes ``volatile``
+accesses after optimizations into atomic accesses with sequentially consistent
+memory ordering. This eases the support of legacy (i.e. non-C11/C++11) code, and
+combined with builtin fences these programs can do meaningful cross-thread
+communication without changing code. It also reflects the original code's intent
+and guarantees better portability.
+
+Relaxed ordering could be used instead, but for the first release it is more
+conservative to apply sequential consistency. Future releases may change what
+happens at compile-time, but already-released pexes will continue using
+sequential consistency.
+
+The PNaCl toolchain also requires that ``volatile`` accesses be at least
+naturally aligned, and tries to guarantee this alignment.
+
+Memory Model for Concurrent Operations
+--------------------------------------
+
+The memory model offered by PNaCl relies on the same coding guidelines as the
+C11/C++11 one: concurrent accesses must always occur through atomic primitives
+(offered by `atomic intrinsics <PNaClLangRef.html#atomicintrinsics>`_), and
+these accesses must always occur with the same size for the same memory
+location. Visibility of stores is provided on a happens-before basis that
+relates memory locations to each other as the C11/C++11 standards do.
+
+As in C11/C++11 some atomic accesses may be implemented with locks on certain
+platforms. The ``ATOMIC_*_LOCK_FREE`` macros will always be ``1``, signifying
+that all types are sometimes lock-free. The ``is_lock_free`` methods will return
+the current platform's implementation at runtime.
+
+The PNaCl toolchain supports concurrent memory accesses through legacy GCC-style
+``__sync_*`` builtins, as well as through C11/C++11 atomic primitives.
+``volatile`` memory accesses can also be used, though these are discouraged, and
+aren't present in bitcode.
+
+PNaCl supports concurrency and parallelism with some restrictions:
+
+* Threading is explicitly supported.
+
+* Inter-process communication through shared memory is limited to operations
+  which are lock-free on the current platform (``is_lock_free`` methods). This
+  may change at a later date.
+
+* Direct interaction with device memory isn't supported.
+
+* Signal handling isn't supported, PNaCl therefore promotes all primitives to
+  cross-thread (instead of single-thread). This may change at a later date. Note
+  that using atomic operations which aren't lock-free may lead to deadlocks when
+  handling asynchronous signals.
+
+* ``volatile`` and atomic operations are address-free (operations on the same
+  memory location via two different addresses work atomically), as intended by
+  the C11/C++11 standards. This is critical for inter-process communication as
+  well as synchronous "external modifications" such as mapping underlying memory
+  at multiple locations.
+
+Setting up the above mechanisms requires assistance from the embedding sandbox's
+runtime (e.g. NaCl's Pepper APIs), but using them once setup can be done through
+regular C/C++ code.
+
+The PNaCl toolchain currently optimizes for memory ordering as LLVM normally
+does, but at pexe creation time it promotes all ``volatile`` accesses as well as
+all atomic accesses to be sequentially consistent. Other memory orderings will
+be supported in a future release, but pexes generated with the current toolchain
+will continue functioning with sequential consistency. Using sequential
+consistency provides a total ordering for all sequentially-consistent operations
+on all addresses.
+
+This means that ``volatile`` and atomic memory accesses can only be re-ordered
+in some limited way before the pexe is created, and will act as fences for all
+memory accesses (even non-atomic and non-``volatile``) after pexe creation.
+Non-atomic and non-``volatile`` memory accesses may be reordered (unless a fence
+intervenes), separated, elided or fused according to C and C++'s memory model
+before the pexe is created as well as after its creation.
+
+Atomic Memory Ordering Constraints
+----------------------------------
+
+Atomics follow the same ordering constraints as in regular LLVM, but
+all accesses are promoted to sequential consistency (the strongest
+memory ordering) at pexe creation time. As more C11/C++11 code
+allows us to understand performance and portability needs we intend
+to support the full gamut of C11/C++11 memory orderings:
+
+- Relaxed: no operation orders memory.
+- Consume: a load operation performs a consume operation on the affected memory
+  location (currently unsupported by LLVM).
+- Acquire: a load operation performs an acquire operation on the affected memory
+  location.
+- Release: a store operation performs a release operation on the affected memory
+  location.
+- Acquire-release: load and store operations perform acquire and release
+  operations on the affected memory.
+- Sequentially consistent: same as acquire-release, but providing a global total
+  ordering for all affected locations.
+
+As in C11/C++11:
+
+- Atomic accesses must at least be naturally aligned.
+- Some accesses may not actually be atomic on certain platforms, requiring an
+  implementation that uses a global lock.
+- An atomic memory location must always be accessed with atomic primitives, and
+  these primitives must always be of the same bit size for that location.
+- Not all memory orderings are valid for all atomic operations.
+
diff --git a/docs/PNaClLangRef.rst b/docs/PNaClLangRef.rst
index d24f174455..8ddfae0739 100644
--- a/docs/PNaClLangRef.rst
+++ b/docs/PNaClLangRef.rst
@@ -72,7 +72,7 @@ Restrictions on global variables:
 
 * PNaCl bitcode does not support TLS models.
 * Restrictions on :ref:`linkage types <linkagetypes>`.
-* ``externally_initialized``.
+* The ``externally_initialized`` attribute.
 
 Every global variable must have an initializer. Each initializer must be
 either a *SimpleElement* or a *CompoundElement*, defined as follows.
@@ -144,139 +144,24 @@ Volatile Memory Accesses
 `LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_
 
 PNaCl bitcode does not support volatile memory accesses. The ``volatile``
-attribute on loads and stores is not supported.
-
-.. note::
-
-  The C11/C++11 standards mandate that ``volatile`` accesses execute
-  in program order (but are not fences, so other memory operations can
-  reorder around them), are not necessarily atomic, and can’t be
-  elided. They can be separated into smaller width accesses.
-
-  The PNaCl toolchain applies regular LLVM optimizations along these
-  guidelines, and it further prevents any load/store (even
-  non-``volatile`` and non-atomic ones) from moving above or below a
-  volatile operations: they act as compiler barriers before
-  optimizations occur. The PNaCl toolchain freezes ``volatile``
-  accesses after optimizations into atomic accesses with sequentially
-  consistent memory ordering. This eases the support of legacy
-  (i.e. non-C11/C++11) code, and combined with builtin fences these
-  programs can do meaningful cross-thread communication without
-  changing code. It also reflects the original code's intent and
-  guarantees better portability.
-
-  Relaxed ordering could be used instead, but for the first release it
-  is more conservative to apply sequential consistency. Future
-  releases may change what happens at compile-time, but
-  already-released pexes will continue using sequential consistency.
-
-  The PNaCl toolchain also requires that ``volatile`` accesses be at
-  least naturally aligned, and tries to guarantee this alignment.
+attribute on loads and stores is not supported. See the
+`PNaCl Developer's Guide <PNaClDeveloperGuide.html>`_ for more details.
 
 Memory Model for Concurrent Operations
 --------------------------------------
 
 `LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_
 
-The memory model offered by PNaCl relies on the same coding guidelines
-as the C11/C++11 one: concurrent accesses must always occur through
-atomic primitives (offered by `atomic intrinsics`_), and these accesses
-must always occur with the same size for the same memory
-location. Visibility of stores is provided on a happens-before basis
-that relates memory locations to each other as the C11/C++11 standards
-do.
-
-.. note::
-
-  As in C11/C++11 some atomic accesses may be implemented with locks
-  on certain platforms. The ``ATOMIC_*_LOCK_FREE`` macros will always
-  be ``1``, signifying that all types are sometimes lock-free. The
-  ``is_lock_free`` methods will return the current platform's
-  implementation at runtime.
-
-  The PNaCl toolchain supports concurrent memory accesses through
-  legacy GCC-style ``__sync_*`` builtins, as well as through C11/C++11
-  atomic primitives. ``volatile`` memory accesses can also be used,
-  though these are discouraged, and aren't present in bitcode.
-
-  PNaCl supports concurrency and parallelism with some restrictions:
-
-  * Threading is explicitly supported.
-  * Inter-process communication through shared memory is limited to
-    operations which are lock-free on the current platform
-    (``is_lock_free`` methods). This may change at a later date.
-  * Direct interaction with device memory isn't supported.
-  * Signal handling isn't supported, PNaCl therefore promotes all
-    primitives to cross-thread (instead of single-thread). This may
-    change at a later date. Note that using atomic operations which
-    aren't lock-free may lead to deadlocks when handling asynchronous
-    signals.
-  * ``volatile`` and atomic operations are address-free (operations on
-    the same memory location via two different addresses work
-    atomically), as intended by the C11/C++11 standards. This is
-    critical for inter-process communication as well as synchronous
-    "external modifications" such as mapping underlying memory at
-    multiple locations.
-
-  Setting up the above mechanisms requires assistance from the
-  embedding sandbox's runtime (e.g. NaCl's Pepper APIs), but using
-  them once setup can be done through regular C/C++ code.
-
-  The PNaCl toolchain currently optimizes for memory ordering as LLVM
-  normally does, but at pexe creation time it promotes all
-  ``volatile`` accesses as well as all atomic accesses to be
-  sequentially consistent. Other memory orderings will be supported in
-  a future release, but pexes generated with the current toolchain
-  will continue functioning with sequential consistency. Using
-  sequential consistency provides a total ordering for all
-  sequentially-consistent operations on all addresses.
-
-  This means that ``volatile`` and atomic memory accesses can only be
-  re-ordered in some limited way before the pexe is created, and will
-  act as fences for all memory accesses (even non-atomic and
-  non-``volatile``) after pexe creation. Non-atomic and
-  non-``volatile`` memory accesses may be reordered (unless a fence
-  intervenes), separated, elided or fused according to C and C++'s
-  memory model before the pexe is created as well as after its
-  creation.
+See the `PNaCl Developer's Guide <PNaClDeveloperGuide.html>`_ for more details.
 
 Atomic Memory Ordering Constraints
 ----------------------------------
 
 `LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_
 
-PNaCl bitcode currently supports sequential consistency only, through
-its `atomic intrinsics`_.
-
-.. note::
-
-  Atomics follow the same ordering constraints as in regular LLVM, but
-  all accesses are promoted to sequential consistency (the strongest
-  memory ordering) at pexe creation time. As more C11/C++11 code
-  allows us to understand performance and portability needs we intend
-  to support the full gamut of C11/C++11 memory orderings:
-
-  - Relaxed: no operation orders memory.
-  - Consume: a load operation performs a consume operation on the
-    affected memory location (currently unsupported by LLVM).
-  - Acquire: a load operation performs an acquire operation on the
-    affected memory location.
-  - Release: a store operation performs a release operation on the
-    affected memory location.
-  - Acquire-release: load and store operations perform acquire and
-    release operations on the affected memory.
-  - Sequentially consistent: same as acquire-release, but providing
-    a global total ordering for all affected locations.
-
-  As in C11/C++11:
-
-  - Atomic accesses must at least be naturally aligned.
-  - Some accesses may not actually be atomic on certain platforms,
-    requiring an implementation that uses a global lock.
-  - An atomic memory location must always be accessed with atomic
-    primitives, and these primitives must always be of the same bit
-    size for that location.
-  - Not all memory orderings are valid for all atomic operations.
+PNaCl bitcode currently supports sequential consistency only, through its
+`atomic intrinsics`_. See the
+`PNaCl Developer's Guide <PNaClDeveloperGuide.html>`_ for more details.
 
 Fast-Math Flags
 ---------------
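
For readers of the guide text in this change, the two supported styles of atomic access it names (legacy GCC-style ``__sync_*`` builtins and C11/C++11 atomic primitives) might look roughly like the C++11 sketch below. This is an illustration only and is not part of the commit; the names ``legacy_counter``, ``cxx11_counter``, ``bump_legacy``, ``bump_cxx11`` and ``counter_is_lock_free`` are hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler for the ``__sync_*`` builtin.

```cpp
#include <atomic>

// Hypothetical counters, for illustration only. Each location is always
// accessed through one atomic mechanism and at one size, as the guide asks.
static int legacy_counter = 0;             // touched only via __sync_* builtins
static std::atomic<int> cxx11_counter{0};  // touched only via std::atomic

// Legacy GCC-style builtin: atomic read-modify-write acting as a full barrier.
// Returns the value after the increment.
int bump_legacy() {
  return __sync_add_and_fetch(&legacy_counter, 1);
}

// C++11 equivalent. memory_order_seq_cst is already the default; it is spelled
// out because the guide says all orderings are promoted to it at pexe creation.
int bump_cxx11() {
  return cxx11_counter.fetch_add(1, std::memory_order_seq_cst) + 1;
}

// Runtime query of whether this platform implements the atomic without locks.
bool counter_is_lock_free() {
  return cxx11_counter.is_lock_free();
}
```

Both functions can be called from several threads without further synchronization; a weaker ordering argument in ``bump_cxx11`` would still compile, but per the text above it would be treated as sequentially consistent once the pexe is created.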