-rw-r--r-- docs/AddressSanitizer.html | 171
-rw-r--r-- docs/AddressSanitizer.rst | 158
-rw-r--r-- docs/AnalyzerRegions.html | 260
-rw-r--r-- docs/AnalyzerRegions.rst | 259
-rw-r--r-- docs/ClangPlugins.html | 170
-rw-r--r-- docs/ClangPlugins.rst | 149
-rw-r--r-- docs/ClangTools.html | 110
-rw-r--r-- docs/ClangTools.rst | 91
-rw-r--r-- docs/HowToSetupToolingForLLVM.html | 212
-rw-r--r-- docs/HowToSetupToolingForLLVM.rst | 211
-rw-r--r-- docs/IntroductionToTheClangAST.html | 139
-rw-r--r-- docs/IntroductionToTheClangAST.rst | 135
-rw-r--r-- docs/JSONCompilationDatabase.html | 89
-rw-r--r-- docs/JSONCompilationDatabase.rst | 85
-rw-r--r-- docs/LibASTMatchersTutorial.html | 533
-rw-r--r-- docs/LibASTMatchersTutorial.rst | 532
-rw-r--r-- docs/PTHInternals.html | 179
-rw-r--r-- docs/PTHInternals.rst | 164
-rw-r--r-- docs/RAVFrontendAction.html | 224
-rw-r--r-- docs/RAVFrontendAction.rst | 216
-rw-r--r-- docs/UsersManual.html | 1338
-rw-r--r-- docs/UsersManual.rst | 1238
-rw-r--r-- docs/index.rst | 11
23 files changed, 3249 insertions, 3425 deletions
diff --git a/docs/AddressSanitizer.html b/docs/AddressSanitizer.html
deleted file mode 100644
index 397eafc2d5..0000000000
--- a/docs/AddressSanitizer.html
+++ /dev/null
@@ -1,171 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
- "http://www.w3.org/TR/html4/strict.dtd">
-<!-- Material used from: HTML 4.01 specs: http://www.w3.org/TR/html401/ -->
-<html>
-<head>
- <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
- <title>AddressSanitizer, a fast memory error detector</title>
- <link type="text/css" rel="stylesheet" href="../menu.css">
- <link type="text/css" rel="stylesheet" href="../content.css">
- <style type="text/css">
- td {
- vertical-align: top;
- }
- </style>
-</head>
-<body>
-
-<!--#include virtual="../menu.html.incl"-->
-
-<div id="content">
-
-<h1>AddressSanitizer</h1>
-<ul>
- <li> <a href="#intro">Introduction</a>
- <li> <a href="#howtobuild">How to Build</a>
- <li> <a href="#usage">Usage</a>
- <ul><li> <a href="#has_feature">__has_feature(address_sanitizer)</a></ul>
- <ul><li> <a href="#no_address_safety_analysis">
- __attribute__((no_address_safety_analysis))</a></ul>
- <li> <a href="#platforms">Supported Platforms</a>
- <li> <a href="#limitations">Limitations</a>
- <li> <a href="#status">Current Status</a>
- <li> <a href="#moreinfo">More Information</a>
-</ul>
-
-<h2 id="intro">Introduction</h2>
-AddressSanitizer is a fast memory error detector.
-It consists of a compiler instrumentation module and a run-time library.
-The tool can detect the following types of bugs:
-<ul> <li> Out-of-bounds accesses to heap, stack and globals
- <li> Use-after-free
- <li> Use-after-return (to some extent)
- <li> Double-free, invalid free
-</ul>
-Typical slowdown introduced by AddressSanitizer is <b>2x</b>.
-
-<h2 id="howtobuild">How to build</h2>
-Follow the <a href="../get_started.html">clang build instructions</a>.
-CMake build is supported.<BR>
-
-<h2 id="usage">Usage</h2>
-Simply compile and link your program with <tt>-fsanitize=address</tt> flag. <BR>
-The AddressSanitizer run-time library should be linked to the final executable,
-so make sure to use <tt>clang</tt> (not <tt>ld</tt>) for the final link step.<BR>
-When linking shared libraries, the AddressSanitizer run-time is not linked,
-so <tt>-Wl,-z,defs</tt> may cause link errors (don't use it with AddressSanitizer). <BR>
-
-To get a reasonable performance add <tt>-O1</tt> or higher. <BR>
-To get nicer stack traces in error messages add
-<tt>-fno-omit-frame-pointer</tt>. <BR>
-To get perfect stack traces you may need to disable inlining (just use <tt>-O1</tt>) and tail call
-elimination (<tt>-fno-optimize-sibling-calls</tt>).
-
-<pre>
-% cat example_UseAfterFree.cc
-int main(int argc, char **argv) {
- int *array = new int[100];
- delete [] array;
- return array[argc]; // BOOM
-}
-</pre>
-
-<pre>
-# Compile and link
-% clang -O1 -g -fsanitize=address -fno-omit-frame-pointer example_UseAfterFree.cc
-</pre>
-OR
-<pre>
-# Compile
-% clang -O1 -g -fsanitize=address -fno-omit-frame-pointer -c example_UseAfterFree.cc
-# Link
-% clang -g -fsanitize=address example_UseAfterFree.o
-</pre>
-
-If a bug is detected, the program will print an error message to stderr and exit with a
-non-zero exit code.
-Currently, AddressSanitizer does not symbolize its output, so you may need to use a
-separate script to symbolize the result offline (this will be fixed in future).
-<pre>
-% ./a.out 2> log
-% projects/compiler-rt/lib/asan/scripts/asan_symbolize.py / < log | c++filt
-==9442== ERROR: AddressSanitizer heap-use-after-free on address 0x7f7ddab8c084 at pc 0x403c8c bp 0x7fff87fb82d0 sp 0x7fff87fb82c8
-READ of size 4 at 0x7f7ddab8c084 thread T0
- #0 0x403c8c in main example_UseAfterFree.cc:4
- #1 0x7f7ddabcac4d in __libc_start_main ??:0
-0x7f7ddab8c084 is located 4 bytes inside of 400-byte region [0x7f7ddab8c080,0x7f7ddab8c210)
-freed by thread T0 here:
- #0 0x404704 in operator delete[](void*) ??:0
- #1 0x403c53 in main example_UseAfterFree.cc:4
- #2 0x7f7ddabcac4d in __libc_start_main ??:0
-previously allocated by thread T0 here:
- #0 0x404544 in operator new[](unsigned long) ??:0
- #1 0x403c43 in main example_UseAfterFree.cc:2
- #2 0x7f7ddabcac4d in __libc_start_main ??:0
-==9442== ABORTING
-</pre>
-
-AddressSanitizer exits on the first detected error. This is by design.
-One reason: it makes the generated code smaller and faster (both by ~5%).
-Another reason: this makes fixing bugs unavoidable. With Valgrind, it is often
-the case that users treat Valgrind warnings as false positives
-(which they are not) and don't fix them.
-
-
-<h3 id="has_feature">__has_feature(address_sanitizer)</h3>
-In some cases one may need to execute different code depending on whether
-AddressSanitizer is enabled.
-<a href="LanguageExtensions.html#__has_feature_extension">__has_feature</a>
-can be used for this purpose.
-<pre>
-#if defined(__has_feature)
-# if __has_feature(address_sanitizer)
- code that builds only under AddressSanitizer
-# endif
-#endif
-</pre>
-
-<h3 id="no_address_safety_analysis">__attribute__((no_address_safety_analysis))</h3>
-Some code should not be instrumented by AddressSanitizer.
-One may use the function attribute
-<a href="LanguageExtensions.html#address_sanitizer">
- <tt>no_address_safety_analysis</tt></a>
-to disable instrumentation of a particular function.
-This attribute may not be supported by other compilers, so we suggest to
-use it together with <tt>__has_feature(address_sanitizer)</tt>.
-Note: currently, this attribute will be lost if the function is inlined.
-
-<h2 id="platforms">Supported Platforms</h2>
-AddressSanitizer is supported on
-<ul><li>Linux i386/x86_64 (tested on Ubuntu 10.04 and 12.04).
-<li>MacOS 10.6, 10.7 and 10.8 (i386/x86_64).
-</ul>
-Support for Linux ARM (and Android ARM) is in progress
-(it may work, but is not guaranteed too).
-
-
-<h2 id="limitations">Limitations</h2>
-<ul>
-<li> AddressSanitizer uses more real memory than a native run.
-Exact overhead depends on the allocations sizes. The smaller the
-allocations you make the bigger the overhead is.
-<li> AddressSanitizer uses more stack memory. We have seen up to 3x increase.
-<li> On 64-bit platforms AddressSanitizer maps (but not reserves)
-16+ Terabytes of virtual address space.
-This means that tools like <tt>ulimit</tt> may not work as usually expected.
-<li> Static linking is not supported.
-</ul>
-
-
-<h2 id="status">Current Status</h2>
-AddressSanitizer is fully functional on supported platforms starting from LLVM 3.1.
-The test suite is integrated into CMake build and can be run with
-<tt>make check-asan</tt> command.
-
-<h2 id="moreinfo">More Information</h2>
-<a href="http://code.google.com/p/address-sanitizer/">http://code.google.com/p/address-sanitizer</a>.
-
-
-</div>
-</body>
-</html>
diff --git a/docs/AddressSanitizer.rst b/docs/AddressSanitizer.rst
new file mode 100644
index 0000000000..0ee108bd9e
--- /dev/null
+++ b/docs/AddressSanitizer.rst
@@ -0,0 +1,158 @@
+================
+AddressSanitizer
+================
+
+.. contents::
+ :local:
+
+Introduction
+============
+
+AddressSanitizer is a fast memory error detector. It consists of a
+compiler instrumentation module and a run-time library. The tool can
+detect the following types of bugs:
+
+- Out-of-bounds accesses to heap, stack and globals
+- Use-after-free
+- Use-after-return (to some extent)
+- Double-free, invalid free
+
+The typical slowdown introduced by AddressSanitizer is **2x**.
+
+How to build
+============
+
+Follow the `clang build instructions <../get_started.html>`_. CMake
+build is supported.
+
+Usage
+=====
+
+Simply compile and link your program with the ``-fsanitize=address`` flag.
+The AddressSanitizer run-time library should be linked to the final
+executable, so make sure to use ``clang`` (not ``ld``) for the final
+link step.
+When linking shared libraries, the AddressSanitizer run-time is not
+linked, so ``-Wl,-z,defs`` may cause link errors (don't use it with
+AddressSanitizer).
+To get reasonable performance, add ``-O1`` or higher.
+To get nicer stack traces in error messages add
+``-fno-omit-frame-pointer``.
+To get perfect stack traces you may need to disable inlining (just use
+``-O1``) and tail call elimination (``-fno-optimize-sibling-calls``).
+
+::
+
+ % cat example_UseAfterFree.cc
+ int main(int argc, char **argv) {
+ int *array = new int[100];
+ delete [] array;
+ return array[argc]; // BOOM
+ }
+
+::
+
+ # Compile and link
+ % clang -O1 -g -fsanitize=address -fno-omit-frame-pointer example_UseAfterFree.cc
+
+OR
+
+::
+
+ # Compile
+ % clang -O1 -g -fsanitize=address -fno-omit-frame-pointer -c example_UseAfterFree.cc
+ # Link
+ % clang -g -fsanitize=address example_UseAfterFree.o
+
+If a bug is detected, the program will print an error message to stderr
+and exit with a non-zero exit code. Currently, AddressSanitizer does not
+symbolize its output, so you may need to use a separate script to
+symbolize the result offline (this will be fixed in the future).
+
+::
+
+ % ./a.out 2> log
+ % projects/compiler-rt/lib/asan/scripts/asan_symbolize.py / < log | c++filt
+ ==9442== ERROR: AddressSanitizer heap-use-after-free on address 0x7f7ddab8c084 at pc 0x403c8c bp 0x7fff87fb82d0 sp 0x7fff87fb82c8
+ READ of size 4 at 0x7f7ddab8c084 thread T0
+ #0 0x403c8c in main example_UseAfterFree.cc:4
+ #1 0x7f7ddabcac4d in __libc_start_main ??:0
+ 0x7f7ddab8c084 is located 4 bytes inside of 400-byte region [0x7f7ddab8c080,0x7f7ddab8c210)
+ freed by thread T0 here:
+ #0 0x404704 in operator delete[](void*) ??:0
+ #1 0x403c53 in main example_UseAfterFree.cc:4
+ #2 0x7f7ddabcac4d in __libc_start_main ??:0
+ previously allocated by thread T0 here:
+ #0 0x404544 in operator new[](unsigned long) ??:0
+ #1 0x403c43 in main example_UseAfterFree.cc:2
+ #2 0x7f7ddabcac4d in __libc_start_main ??:0
+ ==9442== ABORTING
+
+AddressSanitizer exits on the first detected error. This is by design.
+One reason: it makes the generated code smaller and faster (both by
+~5%). Another reason: this makes fixing bugs unavoidable. With Valgrind,
+it is often the case that users treat Valgrind warnings as false
+positives (which they are not) and don't fix them.
+
+``__has_feature(address_sanitizer)``
+------------------------------------
+
+In some cases one may need to execute different code depending on
+whether AddressSanitizer is enabled.
+`\_\_has\_feature <LanguageExtensions.html#__has_feature_extension>`_
+can be used for this purpose.
+
+::
+
+ #if defined(__has_feature)
+ # if __has_feature(address_sanitizer)
+ code that builds only under AddressSanitizer
+ # endif
+ #endif
+
+``__attribute__((no_address_safety_analysis))``
+-----------------------------------------------
+
+Some code should not be instrumented by AddressSanitizer. One may use
+the function attribute
+`no_address_safety_analysis <LanguageExtensions.html#address_sanitizer>`_
+to disable instrumentation of a particular function. This attribute may
+not be supported by other compilers, so we suggest using it together
+with ``__has_feature(address_sanitizer)``. Note: currently, this
+attribute will be lost if the function is inlined.
+
+Supported Platforms
+===================
+
+AddressSanitizer is supported on
+
+- Linux i386/x86\_64 (tested on Ubuntu 10.04 and 12.04).
+- MacOS 10.6, 10.7 and 10.8 (i386/x86\_64).
+
+Support for Linux ARM (and Android ARM) is in progress (it may work, but
+is not guaranteed to).
+
+Limitations
+===========
+
+- AddressSanitizer uses more real memory than a native run. Exact
+ overhead depends on the allocation sizes. The smaller the
+ allocations you make, the bigger the overhead is.
+- AddressSanitizer uses more stack memory. We have seen up to 3x
+ increase.
+- On 64-bit platforms AddressSanitizer maps (but does not reserve) 16+
+ Terabytes of virtual address space. This means that tools like
+ ``ulimit`` may not work as usually expected.
+- Static linking is not supported.
+
+Current Status
+==============
+
+AddressSanitizer is fully functional on supported platforms starting
+from LLVM 3.1. The test suite is integrated into the CMake build and can be
+run with the ``make check-asan`` command.
+
+More Information
+================
+
+`http://code.google.com/p/address-sanitizer <http://code.google.com/p/address-sanitizer/>`_.
diff --git a/docs/AnalyzerRegions.html b/docs/AnalyzerRegions.html
deleted file mode 100644
index f9d3337920..0000000000
--- a/docs/AnalyzerRegions.html
+++ /dev/null
@@ -1,260 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
- "http://www.w3.org/TR/html4/strict.dtd">
-<html>
-<head>
-<title>Static Analyzer Design Document: Memory Regions</title>
-</head>
-<body>
-
-<h1>Static Analyzer Design Document: Memory Regions</h1>
-
-<h3>Authors</h3>
-
-<p>Ted Kremenek, <tt>kremenek at apple</tt><br>
-Zhongxing Xu, <tt>xuzhongzhing at gmail</tt></p>
-
-<h2 id="intro">Introduction</h2>
-
-<p>The path-sensitive analysis engine in libAnalysis employs an extensible API
-for abstractly modeling the memory of an analyzed program. This API employs the
-concept of "memory regions" to abstractly model chunks of program memory such as
-program variables and dynamically allocated memory such as those returned from
-'malloc' and 'alloca'. Regions are hierarchical, with subregions modeling
-subtyping relationships, field and array offsets into larger chunks of memory,
-and so on.</p>
-
-<p>The region API consists of two components:</p>
-
-<ul> <li>A taxonomy and representation of regions themselves within the analyzer
-engine. The primary definitions and interfaces are described in <tt><a
-href="http://clang.llvm.org/doxygen/MemRegion_8h-source.html">MemRegion.h</a></tt>.
-At the root of the region hierarchy is the class <tt>MemRegion</tt> with
-specific subclasses refining the region concept for variables, heap allocated
-memory, and so forth.</li> <li>The modeling of binding of values to regions. For
-example, modeling the value stored to a local variable <tt>x</tt> consists of
-recording the binding between the region for <tt>x</tt> (which represents the
-raw memory associated with <tt>x</tt>) and the value stored to <tt>x</tt>. This
-binding relationship is captured with the notion of &quot;symbolic
-stores.&quot;</li> </ul>
-
-<p>Symbolic stores, which can be thought of as representing the relation
-<tt>regions -> values</tt>, are implemented by subclasses of the
-<tt>StoreManager</tt> class (<tt><a
-href="http://clang.llvm.org/doxygen/Store_8h-source.html">Store.h</a></tt>). A
-particular StoreManager implementation has complete flexibility concerning the
-following:
-
-<ul>
-<li><em>How</em> to model the binding between regions and values</li>
-<li><em>What</em> bindings are recorded
-</ul>
-
-<p>Together, both points allow different StoreManagers to tradeoff between
-different levels of analysis precision and scalability concerning the reasoning
-of program memory. Meanwhile, the core path-sensitive engine makes no
-assumptions about either points, and queries a StoreManager about the bindings
-to a memory region through a generic interface that all StoreManagers share. If
-a particular StoreManager cannot reason about the potential bindings of a given
-memory region (e.g., '<tt>BasicStoreManager</tt>' does not reason about fields
-of structures) then the StoreManager can simply return 'unknown' (represented by
-'<tt>UnknownVal</tt>') for a particular region-binding. This separation of
-concerns not only isolates the core analysis engine from the details of
-reasoning about program memory but also facilities the option of a client of the
-path-sensitive engine to easily swap in different StoreManager implementations
-that internally reason about program memory in very different ways.</p>
-
-<p>The rest of this document is divided into two parts. We first discuss region
-taxonomy and the semantics of regions. We then discuss the StoreManager
-interface, and details of how the currently available StoreManager classes
-implement region bindings.</p>
-
-<h2 id="regions">Memory Regions and Region Taxonomy</h2>
-
-<h3>Pointers</h3>
-
-<p>Before talking about the memory regions, we would talk about the pointers
-since memory regions are essentially used to represent pointer values.</p>
-
-<p>The pointer is a type of values. Pointer values have two semantic aspects.
-One is its physical value, which is an address or location. The other is the
-type of the memory object residing in the address.</p>
-
-<p>Memory regions are designed to abstract these two properties of the pointer.
-The physical value of a pointer is represented by MemRegion pointers. The rvalue
-type of the region corresponds to the type of the pointee object.</p>
-
-<p>One complication is that we could have different view regions on the same
-memory chunk. They represent the same memory location, but have different
-abstract location, i.e., MemRegion pointers. Thus we need to canonicalize the
-abstract locations to get a unique abstract location for one physical
-location.</p>
-
-<p>Furthermore, these different view regions may or may not represent memory
-objects of different types. Some different types are semantically the same,
-for example, 'struct s' and 'my_type' are the same type.</p>
-
-<pre>
-struct s;
-typedef struct s my_type;
-</pre>
-
-<p>But <tt>char</tt> and <tt>int</tt> are not the same type in the code below:</p>
-
-<pre>
-void *p;
-int *q = (int*) p;
-char *r = (char*) p;
-</pre>
-
-<p>Thus we need to canonicalize the MemRegion which is used in binding and
-retrieving.</p>
-
-<h3>Regions</h3>
-<p>Region is the entity used to model pointer values. A Region has the following
-properties:</p>
-
-<ul>
-<li>Kind</li>
-
-<li>ObjectType: the type of the object residing on the region.</li>
-
-<li>LocationType: the type of the pointer value that the region corresponds to.
- Usually this is the pointer to the ObjectType. But sometimes we want to cache
- this type explicitly, for example, for a CodeTextRegion.</li>
-
-<li>StartLocation</li>
-
-<li>EndLocation</li>
-</ul>
-
-<h3>Symbolic Regions</h3>
-
-<p>A symbolic region is a map of the concept of symbolic values into the domain
-of regions. It is the way that we represent symbolic pointers. Whenever a
-symbolic pointer value is needed, a symbolic region is created to represent
-it.</p>
-
-<p>A symbolic region has no type. It wraps a SymbolData. But sometimes we have
-type information associated with a symbolic region. For this case, a
-TypedViewRegion is created to layer the type information on top of the symbolic
-region. The reason we do not carry type information with the symbolic region is
-that the symbolic regions can have no type. To be consistent, we don't let them
-to carry type information.</p>
-
-<p>Like a symbolic pointer, a symbolic region may be NULL, has unknown extent,
-and represents a generic chunk of memory.</p>
-
-<p><em><b>NOTE</b>: We plan not to use loc::SymbolVal in RegionStore and remove it
- gradually.</em></p>
-
-<p>Symbolic regions get their rvalue types through the following ways:</p>
-
-<ul>
-<li>Through the parameter or global variable that points to it, e.g.:
-<pre>
-void f(struct s* p) {
- ...
-}
-</pre>
-
-<p>The symbolic region pointed to by <tt>p</tt> has type <tt>struct
-s</tt>.</p></li>
-
-<li>Through explicit or implicit casts, e.g.:
-<pre>
-void f(void* p) {
- struct s* q = (struct s*) p;
- ...
-}
-</pre>
-</li>
-</ul>
-
-<p>We attach the type information to the symbolic region lazily. For the first
-case above, we create the <tt>TypedViewRegion</tt> only when the pointer is
-actually used to access the pointee memory object, that is when the element or
-field region is created. For the cast case, the <tt>TypedViewRegion</tt> is
-created when visiting the <tt>CastExpr</tt>.</p>
-
-<p>The reason for doing lazy typing is that symbolic regions are sometimes only
-used to do location comparison.</p>
-
-<h3>Pointer Casts</h3>
-
-<p>Pointer casts allow people to impose different 'views' onto a chunk of
-memory.</p>
-
-<p>Usually we have two kinds of casts. One kind of casts cast down with in the
-type hierarchy. It imposes more specific views onto more generic memory regions.
-The other kind of casts cast up with in the type hierarchy. It strips away more
-specific views on top of the more generic memory regions.</p>
-
-<p>We simulate the down casts by layering another <tt>TypedViewRegion</tt> on
-top of the original region. We simulate the up casts by striping away the top
-<tt>TypedViewRegion</tt>. Down casts is usually simple. For up casts, if the
-there is no <tt>TypedViewRegion</tt> to be stripped, we return the original
-region. If the underlying region is of the different type than the cast-to type,
-we flag an error state.</p>
-
-<p>For toll-free bridging casts, we return the original region.</p>
-
-<p>We can set up a partial order for pointer types, with the most general type
-<tt>void*</tt> at the top. The partial order forms a tree with <tt>void*</tt> as
-its root node.</p>
-
-<p>Every <tt>MemRegion</tt> has a root position in the type tree. For example,
-the pointee region of <tt>void *p</tt> has its root position at the root node of
-the tree. <tt>VarRegion</tt> of <tt>int x</tt> has its root position at the 'int
-type' node.</p>
-
-<p><tt>TypedViewRegion</tt> is used to move the region down or up in the tree.
-Moving down in the tree adds a <tt>TypedViewRegion</tt>. Moving up in the tree
-removes a <Tt>TypedViewRegion</tt>.</p>
-
-<p>Do we want to allow moving up beyond the root position? This happens
-when:</p> <pre> int x; void *p = &amp;x; </pre>
-
-<p>The region of <tt>x</tt> has its root position at 'int*' node. the cast to
-void* moves that region up to the 'void*' node. I propose to not allow such
-casts, and assign the region of <tt>x</tt> for <tt>p</tt>.</p>
-
-<p>Another non-ideal case is that people might cast to a non-generic pointer
-from another non-generic pointer instead of first casting it back to the generic
-pointer. Direct handling of this case would result in multiple layers of
-TypedViewRegions. This enforces an incorrect semantic view to the region,
-because we can only have one typed view on a region at a time. To avoid this
-inconsistency, before casting the region, we strip the TypedViewRegion, then do
-the cast. In summary, we only allow one layer of TypedViewRegion.</p>
-
-<h3>Region Bindings</h3>
-
-<p>The following region kinds are boundable: VarRegion, CompoundLiteralRegion,
-StringRegion, ElementRegion, FieldRegion, and ObjCIvarRegion.</p>
-
-<p>When binding regions, we perform canonicalization on element regions and field
-regions. This is because we can have different views on the same region, some
-of which are essentially the same view with different sugar type names.</p>
-
-<p>To canonicalize a region, we get the canonical types for all TypedViewRegions
-along the way up to the root region, and make new TypedViewRegions with those
-canonical types.</p>
-
-<p>For Objective-C and C++, perhaps another canonicalization rule should be
-added: for FieldRegion, the least derived class that has the field is used as
-the type of the super region of the FieldRegion.</p>
-
-<p>All bindings and retrievings are done on the canonicalized regions.</p>
-
-<p>Canonicalization is transparent outside the region store manager, and more
-specifically, unaware outside the Bind() and Retrieve() method. We don't need to
-consider region canonicalization when doing pointer cast.</p>
-
-<h3>Constraint Manager</h3>
-
-<p>The constraint manager reasons about the abstract location of memory objects.
-We can have different views on a region, but none of these views changes the
-location of that object. Thus we should get the same abstract location for those
-regions.</p>
-
-</body>
-</html>
diff --git a/docs/AnalyzerRegions.rst b/docs/AnalyzerRegions.rst
new file mode 100644
index 0000000000..80b3882bc9
--- /dev/null
+++ b/docs/AnalyzerRegions.rst
@@ -0,0 +1,259 @@
+===============================================
+Static Analyzer Design Document: Memory Regions
+===============================================
+
+Authors: Ted Kremenek, ``kremenek at apple``,
+Zhongxing Xu, ``xuzhongzhing at gmail``
+
+Introduction
+============
+
+The path-sensitive analysis engine in libAnalysis employs an extensible
+API for abstractly modeling the memory of an analyzed program. This API
+employs the concept of "memory regions" to abstractly model chunks of
+program memory such as program variables and dynamically allocated
+memory such as those returned from 'malloc' and 'alloca'. Regions are
+hierarchical, with subregions modeling subtyping relationships, field
+and array offsets into larger chunks of memory, and so on.
+
+The region API consists of two components:
+
+- A taxonomy and representation of regions themselves within the
+ analyzer engine. The primary definitions and interfaces are described
+ in ``MemRegion.h``. At the root of the region hierarchy is the class
+ ``MemRegion`` with specific subclasses refining the region concept
+ for variables, heap allocated memory, and so forth.
+- The modeling of binding of values to regions. For example, modeling
+ the value stored to a local variable ``x`` consists of recording the
+ binding between the region for ``x`` (which represents the raw memory
+ associated with ``x``) and the value stored to ``x``. This binding
+ relationship is captured with the notion of "symbolic stores."
+
+Symbolic stores, which can be thought of as representing the relation
+``regions -> values``, are implemented by subclasses of the
+``StoreManager`` class (``Store.h``). A particular StoreManager
+implementation has complete flexibility concerning the following:
+
+- *How* to model the binding between regions and values
+- *What* bindings are recorded
+
+Together, both points allow different StoreManagers to trade off between
+different levels of analysis precision and scalability concerning the
+reasoning of program memory. Meanwhile, the core path-sensitive engine
+makes no assumptions about either point, and queries a StoreManager
+about the bindings to a memory region through a generic interface that
+all StoreManagers share. If a particular StoreManager cannot reason
+about the potential bindings of a given memory region (e.g.,
+'``BasicStoreManager``' does not reason about fields of structures) then
+the StoreManager can simply return 'unknown' (represented by
+'``UnknownVal``') for a particular region-binding. This separation of
+concerns not only isolates the core analysis engine from the details of
+reasoning about program memory but also facilitates the option of a
+client of the path-sensitive engine to easily swap in different
+StoreManager implementations that internally reason about program memory
+in very different ways.
+
+The rest of this document is divided into two parts. We first discuss
+region taxonomy and the semantics of regions. We then discuss the
+StoreManager interface, and details of how the currently available
+StoreManager classes implement region bindings.
+
+Memory Regions and Region Taxonomy
+==================================
+
+Pointers
+--------
+
+Before talking about memory regions, we first discuss pointers,
+since memory regions are essentially used to represent pointer
+values.
+
+A pointer is a type of value. A pointer value has two semantic
+aspects. One is its physical value, which is an address or location. The
+other is the type of the memory object residing at that address.
+
+Memory regions are designed to abstract these two properties of the
+pointer. The physical value of a pointer is represented by MemRegion
+pointers. The rvalue type of the region corresponds to the type of the
+pointee object.
+
+One complication is that we could have different view regions on the
+same memory chunk. They represent the same memory location, but have
+different abstract locations, i.e., MemRegion pointers. Thus we need to
+canonicalize the abstract locations to get a unique abstract location
+for one physical location.
+
+Furthermore, these different view regions may or may not represent
+memory objects of different types. Some different types are semantically
+the same, for example, 'struct s' and 'my\_type' are the same type.
+
+::
+
+ struct s;
+ typedef struct s my_type;
+
+But ``char`` and ``int`` are not the same type in the code below:
+
+::
+
+ void *p;
+ int *q = (int*) p;
+ char *r = (char*) p;
+
+Thus we need to canonicalize the MemRegion which is used in binding and
+retrieving.
+
+Regions
+-------
+
+Region is the entity used to model pointer values. A Region has the
+following properties:
+
+- Kind
+- ObjectType: the type of the object residing on the region.
+- LocationType: the type of the pointer value that the region
+ corresponds to. Usually this is the pointer to the ObjectType. But
+ sometimes we want to cache this type explicitly, for example, for a
+ CodeTextRegion.
+- StartLocation
+- EndLocation
+
+Symbolic Regions
+----------------
+
+A symbolic region is a map of the concept of symbolic values into the
+domain of regions. It is the way that we represent symbolic pointers.
+Whenever a symbolic pointer value is needed, a symbolic region is
+created to represent it.
+
+A symbolic region has no type. It wraps a SymbolData. But sometimes we
+have type information associated with a symbolic region. For this case,
+a TypedViewRegion is created to layer the type information on top of the
+symbolic region. The reason we do not carry type information with the
+symbolic region is that the symbolic regions can have no type. To be
+consistent, we don't let them carry type information.
+
+Like a symbolic pointer, a symbolic region may be NULL, have unknown
+extent, and represent a generic chunk of memory.
+
+.. note::
+ We plan to stop using loc::SymbolVal in RegionStore and to remove it
+ gradually.
+
+Symbolic regions get their rvalue types through the following ways:
+
+- Through the parameter or global variable that points to it, e.g.:
+
+ ::
+
+ void f(struct s* p) {
+ ...
+ }
+
+ The symbolic region pointed to by ``p`` has type ``struct s``.
+
+- Through explicit or implicit casts, e.g.:
+
+ ::
+
+ void f(void* p) {
+ struct s* q = (struct s*) p;
+ ...
+ }
+
+We attach the type information to the symbolic region lazily. For the
+first case above, we create the ``TypedViewRegion`` only when the
+pointer is actually used to access the pointee memory object, that is
+when the element or field region is created. For the cast case, the
+``TypedViewRegion`` is created when visiting the ``CastExpr``.
+
+The reason for doing lazy typing is that symbolic regions are sometimes
+only used to do location comparison.
+
+Pointer Casts
+-------------
+
+Pointer casts allow people to impose different 'views' onto a chunk of
+m