author Eli Bendersky <eliben@chromium.org> 2013-07-18 18:00:27 -0700
committer Eli Bendersky <eliben@chromium.org> 2013-07-18 18:00:27 -0700
commit 4412ea4b8e019d00dc7574fe1723eea0473a8ec1 (patch)
tree 2badd5ce0727bfad02f10d0d82c8bcfa65677676 /docs
parent 4a9f2a703db400ccf760f34101bcdd57642f96e4 (diff)
parent 5b548094edef39376e17445aea28ad2b37d701c4 (diff)
Merge remote-tracking branch 'origin/master'
Diffstat (limited to 'docs')
-rw-r--r-- docs/PNaClLangRef.rst 187
1 files changed, 174 insertions, 13 deletions
diff --git a/docs/PNaClLangRef.rst b/docs/PNaClLangRef.rst
index 4f322a3eff..eef6023e2b 100644
--- a/docs/PNaClLangRef.rst
+++ b/docs/PNaClLangRef.rst
@@ -113,21 +113,139 @@ Volatile Memory Accesses
`LLVM LangRef: Volatile Memory Accesses <LangRef.html#volatile>`_
-TODO: are we going to promote volatile to atomic?
+PNaCl bitcode does not support volatile memory accesses.
+
+.. note::
+
+ The C11/C++11 standards mandate that ``volatile`` accesses execute
+ in program order (but are not fences, so other memory operations can
+ reorder around them), are not necessarily atomic, and can’t be
+ elided. They can be separated into smaller width accesses.
+
+ The PNaCl toolchain applies regular LLVM optimizations along these
+ guidelines, and it further prevents any load/store (even
+ non-``volatile`` and non-atomic ones) from moving above or below a
+ ``volatile`` operation: these operations act as compiler barriers before
+ optimizations occur. The PNaCl toolchain freezes ``volatile``
+ accesses after optimizations into atomic accesses with sequentially
+ consistent memory ordering. This eases the support of legacy
+ (i.e. non-C11/C++11) code, and combined with builtin fences these
+ programs can do meaningful cross-thread communication without
+ changing code. It also reflects the original code's intent and
+ guarantees better portability.
+
+ Relaxed ordering could be used instead, but for the first release it
+ is more conservative to apply sequential consistency. Future
+ releases may change what happens at compile-time, but
+ already-released pexes will continue using sequential consistency.
+
+ The PNaCl toolchain also requires that ``volatile`` accesses be at
+ least naturally aligned, and tries to guarantee this alignment.
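
The legacy pattern described in the note above (a ``volatile`` flag combined with a builtin fence) can be sketched in C++ as follows. This is an illustrative example, not taken from the PNaCl sources: the names ``payload``, ``flag``, ``legacy_publish`` and ``legacy_poll`` are hypothetical, and ``__sync_synchronize`` is the GCC-style full-barrier builtin. In standard C++ a plain ``volatile`` flag gives no cross-thread guarantee, so the sketch assumes the toolchain behavior described above, where ``volatile`` accesses end up as sequentially consistent atomics.

```cpp
#include <cassert>

// Hypothetical names, for illustration only.
static int payload = 0;        // ordinary data, written before the flag
static volatile int flag = 0;  // legacy "ready" flag

// Writer: store the data, issue a full fence, then raise the flag.
void legacy_publish(int value) {
  payload = value;
  __sync_synchronize();  // GCC-style full memory barrier
  flag = 1;
}

// Reader: returns true and copies the data once the flag is observed set.
bool legacy_poll(int *out) {
  assert(out != nullptr);
  if (flag == 0) return false;
  __sync_synchronize();  // keep the data read after the flag read
  *out = payload;
  return true;
}
```

New code would normally express the same handshake with C11/C++11 atomics rather than ``volatile``.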
Memory Model for Concurrent Operations
--------------------------------------
`LLVM LangRef: Memory Model for Concurrent Operations <LangRef.html#memmodel>`_
-TODO.
+The memory model offered by PNaCl relies on the same coding guidelines
+as the C11/C++11 one: concurrent accesses must always occur through
+atomic primitives (offered by `atomic intrinsics`_), and these accesses
+must always occur with the same size for the same memory
+location. Visibility of stores is provided on a happens-before basis
+that relates memory locations to each other as the C11/C++11 standards
+do.
+
+.. note::
+
+ As in C11/C++11 some atomic accesses may be implemented with locks
+ on certain platforms. The ``ATOMIC_*_LOCK_FREE`` macros will always
+ be ``1``, signifying that all types are sometimes lock-free. The
+ ``is_lock_free`` methods report at runtime whether the current
+ platform's implementation is lock-free.
+
+ The PNaCl toolchain supports concurrent memory accesses through
+ legacy GCC-style ``__sync_*`` builtins, as well as through C11/C++11
+ atomic primitives. ``volatile`` memory accesses can also be used,
+ though these are discouraged, and aren't present in bitcode.
+
+ PNaCl supports concurrency and parallelism with some restrictions:
+
+ * Threading is explicitly supported.
+ * Inter-process communication through shared memory is limited to
+ operations which are lock-free on the current platform
+ (``is_lock_free`` methods). This may change at a later date.
+ * Direct interaction with device memory isn't supported.
+ * Signal handling isn't supported; PNaCl therefore promotes all
+ primitives to cross-thread (instead of single-thread). This may
+ change at a later date. Note that using atomic operations which
+ aren't lock-free may lead to deadlocks when handling asynchronous
+ signals.
+ * ``volatile`` and atomic operations are address-free (operations on
+ the same memory location via two different addresses work
+ atomically), as intended by the C11/C++11 standards. This is
+ critical for inter-process communication as well as synchronous
+ "external modifications" such as mapping underlying memory at
+ multiple locations.
+
+ Setting up the above mechanisms requires assistance from the
+ embedding sandbox's runtime (e.g. NaCl's Pepper APIs), but once set
+ up they can be used through regular C/C++ code.
+
+ The PNaCl toolchain currently optimizes for memory ordering as LLVM
+ normally does, but at pexe creation time it promotes all
+ ``volatile`` accesses as well as all atomic accesses to be
+ sequentially consistent. Other memory orderings will be supported in
+ a future release, but pexes generated with the current toolchain
+ will continue functioning with sequential consistency. Using
+ sequential consistency provides a total ordering for all
+ sequentially-consistent operations on all addresses.
+
+ This means that ``volatile`` and atomic memory accesses can only be
+ reordered in limited ways before the pexe is created, and will
+ act as fences for all memory accesses (even non-atomic and
+ non-``volatile``) after pexe creation. Non-atomic and
+ non-``volatile`` memory accesses may be reordered (unless a fence
+ intervenes), separated, elided or fused according to C and C++'s
+ memory model before the pexe is created as well as after its
+ creation.
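
The two source-level interfaces mentioned in the note above can be compared with a small C++ sketch. It is only an illustration under stated assumptions: the counters and function names are hypothetical, and the example is single-threaded so that it shows the interfaces rather than a real race. Both calls return the value held before the increment, and the C++11 call uses the default sequentially consistent ordering.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical counters, for illustration only.
static std::atomic<int32_t> modern_counter(0);
static int32_t legacy_counter = 0;

// C++11 style: with no explicit ordering argument, fetch_add is
// sequentially consistent. Returns the value before the increment.
int32_t bump_modern() { return modern_counter.fetch_add(1); }

// Legacy GCC style: __sync_fetch_and_add acts as a full barrier.
// Returns the value before the increment.
int32_t bump_legacy() { return __sync_fetch_and_add(&legacy_counter, 1); }

// Runtime query; the answer depends on the platform running the code.
bool counter_is_lock_free() { return modern_counter.is_lock_free(); }
```

The result of ``counter_is_lock_free`` is deliberately not assumed here, since the text above states that it is only known at runtime.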
Atomic Memory Ordering Constraints
----------------------------------
`LLVM LangRef: Atomic Memory Ordering Constraints <LangRef.html#ordering>`_
-TODO.
+PNaCl bitcode currently supports sequential consistency only, through
+its `atomic intrinsics`_.
+
+.. note::
+
+ Atomics follow the same ordering constraints as in regular LLVM, but
+ all accesses are promoted to sequential consistency (the strongest
+ memory ordering) at pexe creation time. As more C11/C++11 code
+ allows us to understand performance and portability needs, we intend
+ to support the full gamut of C11/C++11 memory orderings:
+
+ - Relaxed: no operation orders memory.
+ - Consume: a load operation performs a consume operation on the
+ affected memory location (currently unsupported by LLVM).
+ - Acquire: a load operation performs an acquire operation on the
+ affected memory location.
+ - Release: a store operation performs a release operation on the
+ affected memory location.
+ - Acquire-release: load and store operations perform acquire and
+ release operations on the affected memory location.
+ - Sequentially consistent: same as acquire-release, but providing
+ a global total ordering for all affected locations.
+
+ As in C11/C++11:
+
+ - Atomic accesses must at least be naturally aligned.
+ - Some accesses may not actually be atomic on certain platforms,
+ requiring an implementation that uses a global lock.
+ - An atomic memory location must always be accessed with atomic
+ primitives, and these primitives must always be of the same bit
+ size for that location.
+ - Not all memory orderings are valid for all atomic operations.
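
As a rough illustration of the orderings listed above, the following C++11 sketch pairs a release store with an acquire load. The names ``payload``, ``ready``, ``publish`` and ``try_read`` are hypothetical. Per the text above, the current toolchain would promote both accesses to sequential consistency at pexe creation time, so the weaker orderings written here only express intent. The last point in the list shows up in C++11 as well: a load may not use release or acquire-release ordering, for example.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical names, for illustration only.
static int payload = 0;                 // non-atomic data
static std::atomic<bool> ready(false);  // atomic flag guarding the data

// Release store: writes before it become visible to a thread whose
// acquire load observes ready == true.
void publish(int value) {
  payload = value;
  ready.store(true, std::memory_order_release);
}

// Acquire load: if it observes true, the payload read that follows is
// ordered after the writer's payload store.
bool try_read(int *out) {
  assert(out != nullptr);
  if (!ready.load(std::memory_order_acquire)) return false;
  *out = payload;
  return true;
}
```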
Fast-Math Flags
---------------
@@ -277,14 +395,6 @@ Only the LLVM instructions listed here are supported by PNaCl bitcode.
The pointer argument of these instructions must be a *normalized* pointer
(see :ref:`pointer types <pointertypes>`).
-* ``fence``
-* ``cmpxchg``, ``atomicrmw``
-
- The pointer argument of these instructions must be a *normalized* pointer
- (see :ref:`pointer types <pointertypes>`).
-
- TODO(jfb): this may change
-
* ``trunc``
* ``zext``
* ``sext``
@@ -323,8 +433,6 @@ Intrinsic Functions
The only intrinsics supported by PNaCl bitcode are the following.
-TODO(jfb): atomics
-
* ``llvm.memcpy``
* ``llvm.memmove``
* ``llvm.memset``
@@ -365,3 +473,56 @@ TODO(jfb): atomics
TODO: describe
+.. _atomic intrinsics:
+
+* ``llvm.nacl.atomic.store``
+* ``llvm.nacl.atomic.load``
+* ``llvm.nacl.atomic.rmw``
+* ``llvm.nacl.atomic.cmpxchg``
+* ``llvm.nacl.atomic.fence``
+
+ .. code-block:: llvm
+
+ declare iN @llvm.nacl.atomic.load.<size>(
+ iN* <source>, i32 <memory_order>)
+ declare void @llvm.nacl.atomic.store.<size>(
+ iN <operand>, iN* <destination>, i32 <memory_order>)
+ declare iN @llvm.nacl.atomic.rmw.<size>(
+ i32 <computation>, iN* <object>, iN <operand>, i32 <memory_order>)
+ declare iN @llvm.nacl.atomic.cmpxchg.<size>(
+ iN* <object>, iN <expected>, iN <desired>,
+ i32 <memory_order_success>, i32 <memory_order_failure>)
+ declare void @llvm.nacl.atomic.fence(i32 <memory_order>)
+
+ Each of these intrinsics except ``@llvm.nacl.atomic.fence`` is
+ overloaded on the ``iN`` argument, which is reflected through ``<size>``
+ in the overload's name. Integer types 8, 16, 32 and 64 bits wide are
+ supported for these arguments.
+
+ The ``@llvm.nacl.atomic.rmw`` intrinsic implements the following
+ read-modify-write operations, from the general and arithmetic sections
+ of the C11/C++11 standards:
+
+ - ``add``
+ - ``sub``
+ - ``or``
+ - ``and``
+ - ``xor``
+ - ``exchange``
+
+ For all of these read-modify-write operations, the returned value is
+ the value stored at ``object`` before the computation. The ``computation``
+ argument must be a compile-time constant.
+
+ All atomic intrinsics also support C11/C++11 memory orderings, which
+ must be compile-time constants. Those are detailed in `Atomic Memory
+ Ordering Constraints`_.
+
+ Integer values for these computations and memory orderings are defined
+ in ``"llvm/IR/NaClAtomicIntrinsics.h"``.
+
+ .. note::
+
+ These intrinsics allow PNaCl to support C11/C++11 style atomic
+ operations as well as some legacy GCC-style ``__sync_*`` builtins
+ while remaining stable as the LLVM codebase changes. The user
+ isn't expected to use these intrinsics directly.
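
Since users are expected to write C11/C++11 atomics rather than call the intrinsics directly, the read-modify-write and compare-exchange semantics described above can be illustrated at the C++ level. This is a sketch only: the wrapper names below are hypothetical, and how the toolchain maps each call onto an intrinsic is an assumption, not something this document specifies. What the sketch does show is that each read-modify-write call returns the value held before the computation.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical wrappers, named after the operations listed above.
// Each returns the value stored in `object` before the computation.
uint32_t rmw_add(std::atomic<uint32_t> &object, uint32_t operand) {
  return object.fetch_add(operand);
}

uint32_t rmw_xor(std::atomic<uint32_t> &object, uint32_t operand) {
  return object.fetch_xor(operand);
}

uint32_t rmw_exchange(std::atomic<uint32_t> &object, uint32_t operand) {
  return object.exchange(operand);
}

// Compare-exchange: stores `desired` only if `object` currently holds
// `expected`; returns whether the store happened.
bool cmpxchg(std::atomic<uint32_t> &object, uint32_t expected,
             uint32_t desired) {
  return object.compare_exchange_strong(expected, desired);
}
```

All calls above use the default sequentially consistent ordering, which matches the only ordering the bitcode currently supports.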