15 files changed, 572 insertions, 114 deletions
diff --git a/docs/CMake.rst b/docs/CMake.rst
index 6eab04b970..fb081d7b98 100644
--- a/docs/CMake.rst
+++ b/docs/CMake.rst
@@ -204,7 +204,7 @@ LLVM-specific variables
   tests.
 
 **LLVM_APPEND_VC_REV**:BOOL
-  Append version control revision info (svn revision number or git revision id)
+  Append version control revision info (svn revision number or Git revision id)
   to LLVM version string (stored in the PACKAGE_VERSION macro). For this to work
   cmake must be invoked before the build. Defaults to OFF.
 
diff --git a/docs/CodeGenerator.rst b/docs/CodeGenerator.rst
index b5d4180974..75415ab9cc 100644
--- a/docs/CodeGenerator.rst
+++ b/docs/CodeGenerator.rst
@@ -1038,6 +1038,24 @@ for your target.  It has the following strengths:
   are used to manipulate the input immediate (in this case, take the high or low
   16-bits of the immediate).
 
+* When using the 'Pat' class to map a pattern to an instruction that has one
+  or more complex operands (like e.g. `X86 addressing mode`_), the pattern may
+  either specify the operand as a whole using a ``ComplexPattern``, or else it
+  may specify the components of the complex operand separately.  The latter is
+  done e.g. for pre-increment instructions by the PowerPC back end:
+
+  ::
+
+    def STWU  : DForm_1<37, (outs ptr_rc:$ea_res), (ins GPRC:$rS, memri:$dst),
+                    "stwu $rS, $dst", LdStStoreUpd, []>,
+                    RegConstraint<"$dst.reg = $ea_res">, NoEncode<"$ea_res">;
+
+    def : Pat<(pre_store GPRC:$rS, ptr_rc:$ptrreg, iaddroff:$ptroff),
+              (STWU GPRC:$rS, iaddroff:$ptroff, ptr_rc:$ptrreg)>;
+
+  Here, the pair of ``ptroff`` and ``ptrreg`` operands is matched onto the
+  complex operand ``dst`` of class ``memri`` in the ``STWU`` instruction.
+
 * While the system does automate a lot, it still allows you to write custom C++
   code to match special cases if there is something that is hard to
   express.
diff --git a/docs/CommandGuide/llvm-link.rst b/docs/CommandGuide/llvm-link.rst
index e4f2228841..3bcfa68c25 100644
--- a/docs/CommandGuide/llvm-link.rst
+++ b/docs/CommandGuide/llvm-link.rst
@@ -1,5 +1,5 @@
-llvm-link - LLVM linker
-=======================
+llvm-link - LLVM bitcode linker
+===============================
 
 SYNOPSIS
 --------
@@ -13,23 +13,9 @@ DESCRIPTION
 into a single LLVM bitcode file.  It writes the output file to standard output,
 unless the :option:`-o` option is used to specify a filename.
 
-:program:`llvm-link` attempts to load the input files from the current
-directory.  If that fails, it looks for each file in each of the directories
-specified by the :option:`-L` options on the command line.  The library search
-paths are global; each one is searched for every input file if necessary.  The
-directories are searched in the order they were specified on the command line.
-
 OPTIONS
 -------
 
-.. option:: -L directory
-
- Add the specified ``directory`` to the library search path.  When looking for
- libraries, :program:`llvm-link` will look in path name for libraries.  This
- option can be specified multiple times; :program:`llvm-link` will search
- inside these directories in the order in which they were specified on the
- command line.
-
 .. option:: -f
 
  Enable binary output on terminals.  Normally, :program:`llvm-link` will refuse
@@ -48,8 +34,8 @@ OPTIONS
 
 .. option:: -d
 
- If specified, :program:`llvm-link` prints a human-readable version of the output
- bitcode file to standard error.
+ If specified, :program:`llvm-link` prints a human-readable version of the
+ output bitcode file to standard error.
 
 .. option:: -help
 
@@ -67,8 +53,4 @@ EXIT STATUS
 If :program:`llvm-link` succeeds, it will exit with 0.  Otherwise, if an error
 occurs, it will exit with a non-zero value.
 
-SEE ALSO
---------
-
-gccld
 
diff --git a/docs/CompilerWriterInfo.rst b/docs/CompilerWriterInfo.rst
index 87add670af..681777c12d 100644
--- a/docs/CompilerWriterInfo.rst
+++ b/docs/CompilerWriterInfo.rst
@@ -107,6 +107,12 @@ OS X
 * `Mach-O Runtime Architecture <http://developer.apple.com/documentation/Darwin/RuntimeArchitecture-date.html>`_
 * `Notes on Mach-O ABI <http://www.unsanity.org/archives/000044.php>`_
 
+NVPTX
+=====
+
+* `CUDA Documentation <http://docs.nvidia.com/cuda/index.html>`_ includes the PTX
+  ISA and Driver API documentation
+
 Miscellaneous Resources
 =======================
 
diff --git a/docs/GettingStarted.rst b/docs/GettingStarted.rst
index 539c75e2d7..0bbbafc6e6 100644
--- a/docs/GettingStarted.rst
+++ b/docs/GettingStarted.rst
@@ -521,13 +521,13 @@ By placing it in the ``llvm/projects``, it will be automatically configured by
 the LLVM configure script as well as automatically updated when you run ``svn
 update``.
 
-GIT mirror
+Git Mirror
 ----------
 
-GIT mirrors are available for a number of LLVM subprojects. These mirrors sync
+Git mirrors are available for a number of LLVM subprojects. These mirrors sync
 automatically with each Subversion commit and contain all necessary git-svn
 marks (so, you can recreate git-svn metadata locally). Note that right now
-mirrors reflect only ``trunk`` for each project. You can do the read-only GIT
+mirrors reflect only ``trunk`` for each project. You can do the read-only Git
 clone of LLVM via:
 
 .. code-block:: console
@@ -538,10 +538,23 @@ If you want to check out clang too, run:
 
 .. code-block:: console
 
-  % git clone http://llvm.org/git/llvm.git
   % cd llvm/tools
   % git clone http://llvm.org/git/clang.git
 
+If you want to check out compiler-rt too, run:
+
+.. code-block:: console
+
+  % cd llvm/projects
+  % git clone http://llvm.org/git/compiler-rt.git
+
+If you want to check out the Test Suite Source Code (optional), run:
+
+.. code-block:: console
+
+  % cd llvm/projects
+  % git clone http://llvm.org/git/test-suite.git
+
 Since the upstream repository is in Subversion, you should use ``git
 pull --rebase`` instead of ``git pull`` to avoid generating a non-linear history
 in your clone.  To configure ``git pull`` to pass ``--rebase`` by default on the
@@ -626,8 +639,10 @@ To set up clone from which you can submit code using ``git-svn``, run:
   % git config svn-remote.svn.fetch :refs/remotes/origin/master
   % git svn rebase -l
 
+Likewise for compiler-rt and test-suite.
+
 To update this clone without generating git-svn tags that conflict with the
-upstream git repo, run:
+upstream Git repo, run:
 
 .. code-block:: console
 
@@ -638,12 +653,14 @@ upstream git repo, run:
      git checkout master &&
      git svn rebase -l)
 
+Likewise for compiler-rt and test-suite.
+
 This leaves your working directories on their master branches, so you'll need to
 ``checkout`` each working branch individually and ``rebase`` it on top of its
 parent branch.
 
 For those who wish to be able to update an llvm repo in a simpler fashion,
-consider placing the following git script in your path under the name
+consider placing the following Git script in your path under the name
 ``git-svnup``:
 
 .. code-block:: bash
@@ -991,7 +1008,7 @@ Optional Configuration Items
 ----------------------------
 
 If you're running on a Linux system that supports the `binfmt_misc
-<http://www.tat.physik.uni-tuebingen.de/~rguenth/linux/binfmt_misc.html>`_
+<http://en.wikipedia.org/wiki/binfmt_misc>`_
 module, and you have root access on the system, you can set your system up to
 execute LLVM bitcode files directly. To do this, use commands like this (the
 first command may not be required if you are already using the module):
diff --git a/docs/HowToSetUpLLVMStyleRTTI.rst b/docs/HowToSetUpLLVMStyleRTTI.rst
index b906b25621..e0f865a141 100644
--- a/docs/HowToSetUpLLVMStyleRTTI.rst
+++ b/docs/HowToSetUpLLVMStyleRTTI.rst
@@ -295,6 +295,78 @@ ordering right::
        | OtherSpecialSquare
      | Circle
 
+A Bug to be Aware Of
+--------------------
+
+The example just given opens the door to bugs where the ``classof``\s are
+not updated to match the ``Kind`` enum when adding (or removing) classes to
+(from) the hierarchy.
+
+Continuing the example above, suppose we add a ``SomewhatSpecialSquare`` as
+a subclass of ``Square``, and update the ``ShapeKind`` enum like so:
+
+.. code-block:: c++
+
+    enum ShapeKind {
+      SK_Square,
+      SK_SpecialSquare,
+      SK_OtherSpecialSquare,
+   +  SK_SomewhatSpecialSquare,
+      SK_Circle
+    }
+
+Now, suppose that we forget to update ``Square::classof()``, so it still
+looks like:
+
+.. code-block:: c++
+
+   static bool classof(const Shape *S) {
+     // BUG: Returns false when S->getKind() == SK_SomewhatSpecialSquare,
+     // even though SomewhatSpecialSquare "is a" Square.
+     return S->getKind() >= SK_Square &&
+            S->getKind() <= SK_OtherSpecialSquare;
+   }
+
+As the comment indicates, this code contains a bug. A straightforward and
+non-clever way to avoid this is to introduce an explicit ``SK_LastSquare``
+entry in the enum when adding the first subclass(es). For example, we could
+rewrite the example at the beginning of `Concrete Bases and Deeper
+Hierarchies`_ as:
+
+.. code-block:: c++
+
+    enum ShapeKind {
+      SK_Square,
+   +  SK_SpecialSquare,
+   +  SK_OtherSpecialSquare,
+   +  SK_LastSquare,
+      SK_Circle
+    }
+   ...
+   // Square::classof()
+   -  static bool classof(const Shape *S) {
+   -    return S->getKind() == SK_Square;
+   -  }
+   +  static bool classof(const Shape *S) {
+   +    return S->getKind() >= SK_Square &&
+   +           S->getKind() <= SK_LastSquare;
+   +  }
+
+Then, adding new subclasses is easy:
+
+.. code-block:: c++
+
+    enum ShapeKind {
+      SK_Square,
+      SK_SpecialSquare,
+      SK_OtherSpecialSquare,
+   +  SK_SomewhatSpecialSquare,
+      SK_LastSquare,
+      SK_Circle
+    }
+
+Notice that ``Square::classof`` does not need to be changed.
+
 .. _classof-contract:
 
 The Contract of ``classof``
diff --git a/docs/LangRef.rst b/docs/LangRef.rst
index 03004f66df..659f02afb9 100644
--- a/docs/LangRef.rst
+++ b/docs/LangRef.rst
@@ -8342,6 +8342,46 @@ strings. This can be useful for special purpose optimizations that want
 to look for these annotations. These have no other defined use; they are
 ignored by code generation and optimization.
 
+'``llvm.ptr.annotation.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
+pointer to an integer of any width. *NOTE* you must specify an address space for
+the pointer. The identifier for the default address space is the integer
+'``0``'.
+
+::
+
+      declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
+      declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32  <int>)
+      declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32  <int>)
+      declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32  <int>)
+      declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32  <int>)
+
+Overview:
+"""""""""
+
+The '``llvm.ptr.annotation``' intrinsic.
+
+Arguments:
+""""""""""
+
+The first argument is a pointer to an integer value of arbitrary bitwidth
+(result of some expression), the second is a pointer to a global string, the
+third is a pointer to a global string which is the source file name, and the
+last argument is the line number. It returns the value of the first argument.
+
+Semantics:
+""""""""""
+
+This intrinsic allows annotation of a pointer to an integer with arbitrary
+strings. This can be useful for special purpose optimizations that want to look
+for these annotations. These have no other defined use; they are ignored by code
+generation and optimization.
+
 '``llvm.annotation.*``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/docs/NVPTXUsage.rst b/docs/NVPTXUsage.rst
new file mode 100644
index 0000000000..5451619686
--- /dev/null
+++ b/docs/NVPTXUsage.rst
@@ -0,0 +1,276 @@
+=============================
+User Guide for NVPTX Back-end
+=============================
+
+.. contents::
+   :local:
+   :depth: 3
+
+
+Introduction
+============
+
+To support GPU programming, the NVPTX back-end supports a subset of LLVM IR
+along with a defined set of conventions used to represent GPU programming
+concepts. This document provides an overview of the general usage of the back-
+end, including a description of the conventions used and the set of accepted
+LLVM IR.
+
+.. note:: 
+   
+   This document assumes a basic familiarity with CUDA and the PTX
+   assembly language. Information about the CUDA Driver API and the PTX assembly
+   language can be found in the `CUDA documentation
+   <http://docs.nvidia.com/cuda/index.html>`_.
+
+
+
+Conventions
+===========
+
+Marking Functions as Kernels
+----------------------------
+
+In PTX, there are two types of functions: *device functions*, which are only
+callable by device code, and *kernel functions*, which are callable by host
+code. By default, the back-end will emit device functions. Metadata is used to
+declare a function as a kernel function. This metadata is attached to the
+``nvvm.annotations`` named metadata object, and has the following format:
+
+.. code-block:: llvm
+
+   !0 = metadata !{<function-ref>, metadata !"kernel", i32 1}
+
+The first parameter is a reference to the kernel function. The following
+example shows a kernel function calling a device function in LLVM IR. The
+function ``@my_kernel`` is callable from host code, but ``@my_fmad`` is not.
+
+.. code-block:: llvm
+
+    define float @my_fmad(float %x, float %y, float %z) {
+      %mul = fmul float %x, %y
+      %add = fadd float %mul, %z
+      ret float %add
+    }
+
+    define void @my_kernel(float* %ptr) {
+      %val = load float* %ptr
+      %ret = call float @my_fmad(float %val, float %val, float %val)
+      store float %ret, float* %ptr
+      ret void
+    }
+
+    !nvvm.annotations = !{!1}
+    !1 = metadata !{void (float*)* @my_kernel, metadata !"kernel", i32 1}
+
+When compiled, the PTX kernel functions are callable by host-side code.
+
+
+Address Spaces
+--------------
+
+The NVPTX back-end uses the following address space mapping:
+
+   ============= ======================
+   Address Space Memory Space
+   ============= ======================
+   0             Generic
+   1             Global
+   2             Internal Use
+   3             Shared
+   4             Constant
+   5             Local
+   ============= ======================
+
+Every global variable and pointer type is assigned to one of these address
+spaces, with 0 being the default address space. Intrinsics are provided which
+can be used to convert pointers between the generic and non-generic address
+spaces.
+
+As an example, the following IR will define an array ``@g`` that resides in
+global device memory.
+
+.. code-block:: llvm
+
+    @g = internal addrspace(1) global [4 x i32] [ i32 0, i32 1, i32 2, i32 3 ]
+
+LLVM IR functions can read and write to this array, and host-side code can
+copy data to it by name with the CUDA Driver API.
+
+Note that since address space 0 is the generic space, it is illegal to have
+global variables in address space 0.  Address space 0 is the default address
+space in LLVM, so the ``addrspace(N)`` annotation is *required* for global
+variables.
+
+
+NVPTX Intrinsics
+================
+
+Address Space Conversion
+------------------------
+
+'``llvm.nvvm.ptr.*.to.gen``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+These are overloaded intrinsics.  You can use these on any pointer types.
+
+.. code-block:: llvm
+
+    declare i8* @llvm.nvvm.ptr.global.to.gen.p0i8.p1i8(i8 addrspace(1)*)
+    declare i8* @llvm.nvvm.ptr.shared.to.gen.p0i8.p3i8(i8 addrspace(3)*)
+    declare i8* @llvm.nvvm.ptr.constant.to.gen.p0i8.p4i8(i8 addrspace(4)*)
+    declare i8* @llvm.nvvm.ptr.local.to.gen.p0i8.p5i8(i8 addrspace(5)*)
+
+Overview:
+"""""""""
+
+The '``llvm.nvvm.ptr.*.to.gen``' intrinsics convert a pointer in a non-generic
+address space to a generic address space pointer.
+
+Semantics:
+""""""""""
+
+These intrinsics modify the pointer value to be a valid generic address space
+pointer.
+
+
+'``llvm.nvvm.ptr.gen.to.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+These are overloaded intrinsics.  You can use these on any pointer types.
+
+.. code-block:: llvm
+
+    declare i8* @llvm.nvvm.ptr.gen.to.global.p1i8.p0i8(i8 addrspace(1)*)
+    declare i8* @llvm.nvvm.ptr.gen.to.shared.p3i8.p0i8(i8 addrspace(3)*)
+    declare i8* @llvm.nvvm.ptr.gen.to.constant.p4i8.p0i8(i8 addrspace(4)*)
+    declare i8* @llvm.nvvm.ptr.gen.to.local.p5i8.p0i8(i8 addrspace(5)*)
+
+Overview:
+"""""""""
+
+The '``llvm.nvvm.ptr.gen.to.*``' intrinsics convert a pointer in the generic
+address space to a pointer in the target address space.  Note that these
+intrinsics are only useful if the address space of the target address space of
+the pointer is known.  It is not legal to use address space conversion
+intrinsics to convert a pointer from one non-generic address space to another
+non-generic address space.
+
+Semantics:
+""""""""""
+
+These intrinsics modify the pointer value to be a valid pointer in the target
+non-generic address space.
+
+
+Reading PTX Special Registers
+-----------------------------
+
+'``llvm.nvvm.read.ptx.sreg.*``'
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+    declare i32 @llvm.nvvm.read.ptx.sreg.tid.x()
+    declare i32 @llvm.nvvm.read.ptx.sreg.tid.y()
+    declare i32 @llvm.nvvm.read.ptx.sreg.tid.z()
+    declare i32 @llvm.nvvm.read.ptx.sreg.ntid.x()
+    declare i32 @llvm.nvvm.read.ptx.sreg.ntid.y()
+    declare i32 @llvm.nvvm.read.ptx.sreg.ntid.z()
+    declare i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
+    declare i32 @llvm.nvvm.read.ptx.sreg.ctaid.y()
+    declare i32 @llvm.nvvm.read.ptx.sreg.ctaid.z()
+    declare i32 @llvm.nvvm.read.ptx.sreg.nctaid.x()
+    declare i32 @llvm.nvvm.read.ptx.sreg.nctaid.y()
+    declare i32 @llvm.nvvm.read.ptx.sreg.nctaid.z()
+    declare i32 @llvm.nvvm.read.ptx.sreg.warpsize()
+
+Overview:
+"""""""""
+
+The '``@llvm.nvvm.read.ptx.sreg.*``' intrinsics provide access to the PTX
+special registers, in particular the kernel launch bounds.  These registers
+map in the following way to CUDA builtins:
+
+   ============ =====================================
+   CUDA Builtin PTX Special Register Intrinsic
+   ============ =====================================
+   ``threadId`` ``@llvm.nvvm.read.ptx.sreg.tid.*``
+   ``blockIdx`` ``@llvm.nvvm.read.ptx.sreg.ctaid.*``
+   ``blockDim`` ``@llvm.nvvm.read.ptx.sreg.ntid.*``
+   ``gridDim``  ``@llvm.nvvm.read.ptx.sreg.nctaid.*``
+   ============ =====================================
+
+
+Barriers
+--------
+
+'``llvm.nvvm.barrier0``'
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+  declare void @llvm.nvvm.barrier0()
+
+Overview:
+"""""""""
+
+The '``@llvm.nvvm.barrier0()``' intrinsic emits a PTX ``bar.sync 0``
+instruction, equivalent to the ``__syncthreads()`` call in CUDA.
+
+
+Other Intrinsics
+----------------
+
+For the full set of NVPTX intrinsics, please see the
+``include/llvm/IR/IntrinsicsNVVM.td`` file in the LLVM source tree.
+
+
+Executing PTX
+=============
+
+The most common way to execute PTX assembly on a GPU device is to use the CUDA
+Driver API. This API is a low-level interface to the GPU driver and allows for
+JIT compilation of PTX code to native GPU machine code.
+
+Initializing the Driver API:
+
+.. code-block:: c++
+
+    CUdevice device;
+    CUcontext context;
+
+    // Initialize the driver API
+    cuInit(0);
+    // Get a handle to the first compute device
+    cuDeviceGet(&device, 0);
+    // Create a compute device context
+    cuCtxCreate(&context, 0, device);
+
+JIT compiling a PTX string to a device binary:
+
+.. code-block:: c++
+
+    CUmodule module;
+    CUfunction funcion;
+
+    // JIT compile a null-terminated PTX string
+    cuModuleLoadData(&module, (void*)PTXString);
+
+    // Get a handle to the "myfunction" kernel function
+    cuModuleGetFunction(&function, module, "myfunction");
+
+For full examples of executing PTX assembly, please see the `CUDA Samples
+<https://developer.nvidia.com/cuda-downloads>`_ distribution.
diff --git a/docs/ProgrammersManual.rst b/docs/ProgrammersManual.rst
index 4fc4597933..7864165617 100644
--- a/docs/ProgrammersManual.rst
+++ b/docs/ProgrammersManual.rst
@@ -626,6 +626,33 @@ SmallVectors are most useful when on the stack.
 SmallVector also provides a nice portable and efficient replacement for
 ``alloca``.
 
+.. note::
+
+   Prefer to use ``SmallVectorImpl<T>`` as a parameter type.
+
+   In APIs that don't care about the "small size" (most?), prefer to use
+   the ``SmallVectorImpl<T>`` class, which is basically just the "vector
+   header" (and methods) without the elements allocated after it. Note that
+   ``SmallVector<T, N>`` inherits from ``SmallVectorImpl<T>`` so the
+   conversion is implicit and costs nothing. E.g.
+
+   .. code-block:: c++
+
+      // BAD: Clients cannot pass e.g. SmallVector<Foo, 4>.
+      hardcodedSmallSize(SmallVector<Foo, 2> &Out);
+      // GOOD: Clients can pass any SmallVector<Foo, N>.
+      allowsAnySmallSize(SmallVectorImpl<Foo> &Out);
+
+      void someFunc() {
+        SmallVector<Foo, 8> Vec;
+        hardcodedSmallSize(Vec); // Error.
+        allowsAnySmallSize(Vec); // Works.
+      }
+
+   Even though it has "``Impl``" in the name, this is so widely used that
+   it really isn't "private to the implementation" anymore. A name like
+   ``SmallVectorHeader`` would be more appropriate.
+
 .. _dss_vector:
 
 <vector>
@@ -989,7 +1016,9 @@ coupled with a good choice of :ref:`sequential container <ds_sequential>`.
 This combination provides the several nice properties: the result data is
 contiguous in memory (good for cache locality), has few allocations, is easy to
 address (iterators in the final vector are just indices or pointers), and can be
-efficiently queried with a standard binary or radix search.
+efficiently queried with a standard binary search (e.g.
+``std::lower_bound``; if you want the whole range of elements comparing
+equal, use ``std::equal_range``).
 
 .. _dss_smallset:
 
diff --git a/docs/ReleaseNotes.rst b/docs/ReleaseNotes.rst
index 9383c5b3fa..3ca5560254 100644
--- a/docs/ReleaseNotes.rst
+++ b/docs/ReleaseNotes.rst
@@ -64,6 +64,12 @@ Non-comprehensive list of changes in this release
   attributes, which are useful for passing information to code generation. See
   :doc:`HowToUseAttributes` for more details.
 
+* TableGen's syntax for instruction selection patterns has been simplified.
+  Instead of specifying types indirectly with register classes, you should now
+  specify types directly in the input patterns. See ``SparcInstrInfo.td`` for
+  examples of the new syntax. The old syntax using register classes still
+  works, but it will be removed in a future LLVM release.
+
 * ... next change ...
 
 .. NOTE
@@ -90,6 +96,13 @@ in fairly early stages, but we expect successful compilation when:
 Some additional functionality is also implemented, notably DWARF debugging,
 GNU-style thread local storage and inline assembly.
 
+Hexagon Target
+--------------
+
+- Removed support for legacy hexagonv2 and hexagonv3 processor
+  architectures which are no longer in use. Currently supported
+  architectures are hexagonv4 and hexagonv5.
+
 Loop Vectorizer
 ---------------
 
diff --git a/docs/SourceLevelDebugging.rst b/docs/SourceLevelDebugging.rst
index 78ce4e0e53..857479508a 100644
--- a/docs/SourceLevelDebugging.rst
+++ b/docs/SourceLevelDebugging.rst
@@ -1811,11 +1811,11 @@ values, we can clarify the contents of the ``BUCKETS``, ``HASHES`` and
   |  HEADER.header_data_len | uint32_t
   |  HEADER_DATA            | HeaderData
   |-------------------------|
-  |  BUCKETS                | uint32_t[bucket_count] // 32 bit hash indexes
+  |  BUCKETS                | uint32_t[n_buckets] // 32 bit hash indexes
   |-------------------------|
-  |  HASHES                 | uint32_t[hashes_count] // 32 bit hash values
+  |  HASHES                 | uint32_t[n_hashes] // 32 bit hash values
   |-------------------------|
-  |  OFFSETS                | uint32_t[hashes_count] // 32 bit offsets to hash value data
+  |  OFFSETS                | uint32_t[n_hashes] // 32 bit offsets to hash value data
   |-------------------------|
   |  ALL HASH DATA          |
   `-------------------------'
@@ -2080,23 +2080,23 @@ array to be:
   HeaderData.atoms[0].form = DW_FORM_data4;
 
 This defines the contents to be the DIE offset (eAtomTypeDIEOffset) that is
-  encoded as a 32 bit value (DW_FORM_data4).  This allows a single name to have
-  multiple matching DIEs in a single file, which could come up with an inlined
-  function for instance.  Future tables could include more information about the
-  DIE such as flags indicating if the DIE is a function, method, block,
-  or inlined.
+encoded as a 32 bit value (DW_FORM_data4).  This allows a single name to have
+multiple matching DIEs in a single file, which could come up with an inlined
+function for instance.  Future tables could include more information about the
+DIE such as flags indicating if the DIE is a function, method, block,
+or inlined.
 
 The KeyType for the DWARF table is a 32 bit string table offset into the
-  ".debug_str" table.  The ".debug_str" is the string table for the DWARF which
-  may already contain copies of all of the strings.  This helps make sure, with
-  help from the compiler, that we reuse the strings between all of the DWARF
-  sections and keeps the hash table size down.  Another benefit to having the
-  compiler generate all strings as DW_FORM_strp in the debug info, is that
-  DWARF parsing can be made much faster.
+".debug_str" table.  The ".debug_str" is the string table for the DWARF which
+may already contain copies of all of the strings.  This helps make sure, with
+help from the compiler, that we reuse the strings between all of the DWARF
+sections and keeps the hash table size down.  Another benefit to having the
+compiler generate all strings as DW_FORM_strp in the debug info, is that
+DWARF parsing can be made much faster.
 
 After a lookup is made, we get an offset into the hash data.  The hash data
-  needs to be able to deal with 32 bit hash collisions, so the chunk of data
-  at the offset in the hash data consists of a triple:
+needs to be able to deal with 32 bit hash collisions, so the chunk of data
+at the offset in the hash data consists of a triple:
 
 .. code-block:: c
 
@@ -2105,7 +2105,7 @@ After a lookup is made, we get an offset into the hash data.  The hash data
   HashData[hash_data_count]
 
 If "str_offset" is zero, then the bucket contents are done. 99.9% of the
-  hash data chunks contain a single item (no 32 bit hash collision):
+hash data chunks contain a single item (no 32 bit hash collision):
 
 .. code-block:: none
 
diff --git a/docs/TableGen/LangRef.rst b/docs/TableGen/LangRef.rst
index c9e1efba03..bd28a9031d 100644
--- a/docs/TableGen/LangRef.rst
+++ b/docs/TableGen/LangRef.rst
@@ -286,7 +286,7 @@ given values.
 .. productionlist::
    SimpleValue: "(" `DagArg` `DagArgList` ")"
    DagArgList: `DagArg` ("," `DagArg`)*
-   DagArg: `Value` [":" `TokVarName`]
+   DagArg: `Value` [":" `TokVarName`] | `TokVarName`
 
 The initial :token:`DagArg` is called the "operator" of the dag.
 
diff --git a/docs/TestingGuide.rst b/docs/TestingGuide.rst
index 4d8c8ce307..79cedee764 100644
--- a/docs/TestingGuide.rst
+++ b/docs/TestingGuide.rst
@@ -224,16 +224,7 @@ Below is an example of legal RUN lines in a ``.ll`` file:
     ; RUN: diff %t1 %t2
 
 As with a Unix shell, the RUN lines permit pipelines and I/O
-redirection to be used. However, the usage is slightly different than
-for Bash. In general, it's useful to read the code of other tests to figure out
-what you can use in yours. The major differences are:
-
--  You can't do ``2>&1``. That will cause :program:`lit` to write to a file
-   named ``&1``. Usually this is done to get stderr to go through a pipe. You
-   can do that with ``|&`` so replace this idiom:
-   ``... 2>&1 | FileCheck`` with ``... |& FileCheck``
--  You can only redirect to a file, not to another descriptor and not
-   from a here document.
+redirection to be used.
 
 There are some quoting rules that you must pay attention to when writing
 your RUN lines. In general nothing needs to be quoted. :program:`lit` won't
@@ -243,7 +234,7 @@ everything enclosed as one value.
 
 In general, you should strive to keep your RUN lines as simple as possible,
 using them only to run tools that generate textual output you can then examine.
-The recommended way to examine output to figure out if the test passes it using
+The recommended way to examine output to figure out if the test passes is using
 the :doc:`FileCheck tool <CommandGuide/FileCheck>`. *[The usage of grep in RUN
 lines is deprecated - please do not send or commit patches that use it.]*
 
diff --git a/docs/WritingAnLLVMBackend.rst b/docs/WritingAnLLVMBackend.rst
index 6d6c2a1070..a03a5e42c2 100644
--- a/docs/WritingAnLLVMBackend.rst
+++ b/docs/WritingAnLLVMBackend.rst
@@ -760,7 +760,7 @@ target description file (``IntRegs``).
 
   def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
                    "ld [$addr], $dst",
-                   [(set IntRegs:$dst, (load ADDRrr:$addr))]>;
+                   [(set i32:$dst, (load ADDRrr:$addr))]>;
 
 The fourth parameter is the input source, which uses the address operand
 ``MEMrr`` that is defined earlier in ``SparcInstrInfo.td``:
@@ -788,7 +788,7 @@ class is defined:
 
   def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
                    "ld [$addr], $dst",
-                   [(set IntRegs:$dst, (load ADDRri:$addr))]>;
+                   [(set i32:$dst, (load ADDRri:$addr))]>;
 
 Writing these definitions for so many similar instructions can involve a lot of
 cut and paste.  In ``.td`` files, the ``multiclass`` directive enables the
@@ -803,11 +803,11 @@ pattern ``F3_12`` is defined to create 2 instruction classes each time
     def rr  : F3_1 <2, Op3Val,
                    (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
                    !strconcat(OpcStr, " $b, $c, $dst"),
-                   [(set IntRegs:$dst, (OpNode IntRegs:$b, IntRegs:$c))]>;
+                   [(set i32:$dst, (OpNode i32:$b, i32:$c))]>;
     def ri  : F3_2 <2, Op3Val,
                    (outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
                    !strconcat(OpcStr, " $b, $c, $dst"),
-                   [(set IntRegs:$dst, (OpNode IntRegs:$b, simm13:$c))]>;
+                   [(set i32:$dst, (OpNode i32:$b, simm13:$c))]>;
   }
 
 So when the ``defm`` directive is used for the ``XOR`` and ``ADD``
@@ -856,7 +856,7 @@ format instruction having three operands.
   def XNORrr  : F3_1<2, 0b000111,
                      (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
                      "xnor $b, $c, $dst",
-                     [(set IntRegs:$dst, (not (xor IntRegs:$b, IntRegs:$c)))]>;
+                     [(set i32:$dst, (not (xor i32:$b, i32:$c)))]>;
 
 The instruction templates in ``SparcInstrFormats.td`` show the base class for
 ``F3_1`` is ``InstSP``.
@@ -1124,7 +1124,7 @@ a pattern with the store DAG operator.
 .. code-block:: llvm
 
   def STrr  : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
-                   "st $src, [$addr]", [(store IntRegs:$src, ADDRrr:$addr)]>;
+                   "st $src, [$addr]", [(store i32:$src, ADDRrr:$addr)]>;
 
 ``ADDRrr`` is a memory mode that is also defined in ``SparcInstrInfo.td``:
 
@@ -1185,7 +1185,7 @@ instruction.
       SDValue CPTmp0;
       SDValue CPTmp1;
 
-      // Pattern: (st:void IntRegs:i32:$src,
+      // Pattern: (st:void i32:i32:$src,
       //           ADDRrr:i32:$addr)<<P:Predicate_store>>
       // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
       // Pattern complexity = 13  cost = 1  size = 0
diff --git a/docs/index.rst b/docs/index.rst
index 8f22ef2a77..c3bb8089da 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -22,7 +22,6 @@ Several introductory papers and presentations.
    :hidden:
 
    LangRef
-   GetElementPtr
 
 :doc:`LangRef`
   Defines the LLVM intermediate representation.
@@ -48,10 +47,6 @@ Several introductory papers and presentations.
 
   .. __: http://llvm.org/pubs/2002-12-LattnerMSThesis.html
 
-:doc:`GetElementPtr`
-  Answers to some very frequent questions about LLVM's most frequently
-  misunderstood instruction.
-
 `Publications mentioning LLVM <http://llvm.org/pubs>`_
    ..
 
@@ -72,7 +67,6 @@ representation.
    CMake
    HowToBuildOnARM
    CommandGuide/index
-   DeveloperPolicy
    GettingStarted
    GettingStartedVS
    FAQ
@@ -87,6 +81,7 @@ representation.
    ReleaseNotes
    Passes
    YamlIO
+   GetElementPtr
 
 :doc:`GettingStarted`
    Discusses how to get up and running quickly with the LLVM infrastructure.
@@ -108,9 +103,6 @@ representation.
    Tutorials about using LLVM. Includes a tutorial about making a custom
    language with LLVM.
 
-:doc:`DeveloperPolicy`
-   The LLVM project's policy towards developers and their contributions.
-
 :doc:`LLVM Command Guide <CommandGuide/index>`
    A reference manual for the LLVM command line utilities ("man" pages for LLVM
    tools).
@@ -149,25 +141,9 @@ representation.
 :doc:`YamlIO`
    A reference guide for using LLVM's YAML I/O library.
 
-IRC
-===
-
-Users and developers of the LLVM project (including subprojects such as Clang)
-can be found in #llvm on `irc.oftc.net <irc://irc.oftc.net/llvm>`_.
-
-This channel has several bots.
-
-* Buildbot reporters
-
-  * llvmbb - Bot for the main LLVM buildbot master.
-    http://lab.llvm.org:8011/console
-  * bb-chapuni - An individually run buildbot master. http://bb.pgr.jp/console
-  * smooshlab - Apple's internal buildbot master.
-
-* robot - Bugzilla linker. %bug <number>
-
-* clang-bot - A `geordi <http://www.eelis.net/geordi/>`_ instance running
-  near-trunk clang instead of gcc.
+:doc:`GetElementPtr`
+  Answers to some very frequent questions about LLVM's most frequently
+  misunderstood instruction.
 
 Programming Documentation
 =========================
@@ -248,6 +224,7 @@ For API clients and LLVM developers.
    WritingAnLLVMPass
    TableGen/LangRef
    HowToUseAttributes
+   NVPTXUsage
 
 :doc:`WritingAnLLVMPass`
    Information on how to write LLVM transformations and analyses.
@@ -316,6 +293,10 @@ For API clients and LLVM developers.
 :doc:`HowToUseAttributes`
   Answers some questions about the new Attributes infrastructure.
 
+:doc:`NVPTXUsage`
+   This document describes using the NVPTX back-end to compile GPU kernels.
+
+
 Development Process Documentation
 =================================
 
@@ -324,12 +305,16 @@ Information about LLVM's development process.
 .. toctree::
    :hidden:
 
+   DeveloperPolicy
    MakefileGuide
    Projects
    LLVMBuild
    HowToReleaseLLVM
    Packaging
 
+:doc:`DeveloperPolicy`
+   The LLVM project's policy towards developers and their contributions.
+
 :doc:`Projects`
   How-to guide and templates for new projects that *use* the LLVM
   infrastructure.  The templates (directory organization, Makefiles, and test
@@ -349,46 +334,75 @@ Information about LLVM's development process.
 :doc:`Packaging`
    Advice on packaging LLVM into a distribution.
 
+Community
+=========
+
+LLVM has a thriving community of friendly and helpful developers.
+The two primary communication mechanisms in the LLVM community are mailing
+lists and IRC.
+
 Mailing Lists
-=============
+-------------
 
 If you can't find what you need in these docs, try consulting the mailing
 lists.
 
-`LLVM Announcements List`__
-  This is a low volume list that provides important announcements regarding
-  LLVM.  It gets email about once a month.
-
-  .. __: http://lists.cs.uiuc.edu/mailman/listinfo/llvm-announce
-
-`Developer's List`__
+`Developer's List (llvmdev)`__
   This list is for people who want to be included in technical discussions of
   LLVM. People post to this list when they have questions about writing code
   for or using the LLVM tools. It is relatively low volume.
 
   .. __: http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
 
-`Bugs & Patches Archive`__
-  This list gets emailed every time a bug is opened and closed, and when people
-  submit patches to be included in LLVM.  It is higher volume than the LLVMdev
-  list.
-
-  .. __: http://lists.cs.uiuc.edu/pipermail/llvmbugs/
-
-`Commits Archive`__
+`Commits Archive (llvm-commits)`__
   This list contains all commit messages that are made when LLVM developers
-  commit code changes to the repository. It is useful for those who want to
-  stay on the bleeding edge of LLVM development. This list is very high volume.
+  commit code changes to the repository. It also serves as a forum for
+  patch review (i.e. send patches here). It is useful for those who want to
+  stay on the bleeding edge of LLVM development. This list is very high
+  volume.
 
   .. __: http://lists.cs.uiuc.edu/pipermail/llvm-commits/
 
-`Test Results Archive`__
+`Bugs & Patches Archive (llvmbugs)`__
+  This list gets emailed every time a bug is opened and closed. It is
+  higher volume than the LLVMdev list.
+
+  .. __: http://lists.cs.uiuc.edu/pipermail/llvmbugs/
+
+`Test Results Archive (llvm-testresults)`__
   A message is automatically sent to this list by every active nightly tester
   when it completes.  As such, this list gets email several times each day,
   making it a high volume list.
 
   .. __: http://lists.cs.uiuc.edu/pipermail/llvm-testresults/
 
+`LLVM Announcements List (llvm-announce)`__
+  This is a low volume list that provides important announcements regarding
+  LLVM.  It gets email about once a month.
+
+  .. __: http://lists.cs.uiuc.edu/mailman/listinfo/llvm-announce
+
+IRC
+---
+
+Users and developers of the LLVM project (including subprojects such as Clang)
+can be found in #llvm on `irc.oftc.net <irc://irc.oftc.net/llvm>`_.
+
+This channel has several bots.
+
+* Buildbot reporters
+
+  * llvmbb - Bot for the main LLVM buildbot master.
+    http://lab.llvm.org:8011/console
+  * bb-chapuni - An individually run buildbot master. http://bb.pgr.jp/console
+  * smooshlab - Apple's internal buildbot master.
+
+* robot - Bugzilla linker. %bug <number>
+
+* clang-bot - A `geordi <http://www.eelis.net/geordi/>`_ instance running
+  near-trunk clang instead of gcc.
+
+
 Indices and tables
 ==================