author     Alexander Kornienko <alexfh@google.com>    2013-03-14 10:51:38 +0000
committer  Alexander Kornienko <alexfh@google.com>    2013-03-14 10:51:38 +0000
commit     647735c781c5b37061ee03d6e9e6c7dda92218e2 (patch)
tree       5a5e56606d41060263048b5a5586b3d2380898ba /docs
parent     6aed25d93d1cfcde5809a73ffa7dc1b0d6396f66 (diff)
parent     f635ef401786c84df32090251a8cf45981ecca33 (diff)
Updating branches/google/stable to r176857
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/google/stable@177040 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
89 files changed, 5778 insertions, 5636 deletions
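One way to reproduce this diffstat and read the full patch locally, as a sketch: it assumes the commit is reachable from the read-only git mirror http://llvm.org/git/llvm.git quoted later in this patch, and that the SVN server named in the git-svn-id above still serves r177040.

    $ git clone http://llvm.org/git/llvm.git && cd llvm
    $ git show --stat 647735c781c5b37061ee03d6e9e6c7dda92218e2 -- docs   # per-file diffstat, limited to docs/
    $ git show 647735c781c5b37061ee03d6e9e6c7dda92218e2 -- docs          # full patch text for docs/
    $ svn diff -c 177040 https://llvm.org/svn/llvm-project/llvm/branches/google/stable   # same change via SVN, all paths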
diff --git a/docs/AliasAnalysis.rst b/docs/AliasAnalysis.rst index fdaec89cdf..712d57d14b 100644 --- a/docs/AliasAnalysis.rst +++ b/docs/AliasAnalysis.rst @@ -1,5 +1,3 @@ -.. _alias_analysis: - ================================== LLVM Alias Analysis Infrastructure ================================== @@ -205,7 +203,7 @@ look at the `various alias analysis implementations`_ included with LLVM. Different Pass styles --------------------- -The first step to determining what type of `LLVM pass <WritingAnLLVMPass.html>`_ +The first step to determining what type of :doc:`LLVM pass <WritingAnLLVMPass>` you need to use for your Alias Analysis. As is the case with most other analyses and transformations, the answer should be fairly obvious from what type of problem you are trying to solve: @@ -253,25 +251,24 @@ Interfaces which may be specified All of the `AliasAnalysis <http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__ virtual methods -default to providing `chaining`_ to another alias analysis implementation, which -ends up returning conservatively correct information (returning "May" Alias and -"Mod/Ref" for alias and mod/ref queries respectively). Depending on the -capabilities of the analysis you are implementing, you just override the -interfaces you can improve. +default to providing :ref:`chaining <aliasanalysis-chaining>` to another alias +analysis implementation, which ends up returning conservatively correct +information (returning "May" Alias and "Mod/Ref" for alias and mod/ref queries +respectively). Depending on the capabilities of the analysis you are +implementing, you just override the interfaces you can improve. -.. _chaining: -.. _chain: +.. _aliasanalysis-chaining: ``AliasAnalysis`` chaining behavior ----------------------------------- -With only one special exception (the `no-aa`_ pass) every alias analysis pass -chains to another alias analysis implementation (for example, the user can -specify "``-basicaa -ds-aa -licm``" to get the maximum benefit from both alias -analyses). The alias analysis class automatically takes care of most of this -for methods that you don't override. For methods that you do override, in code -paths that return a conservative MayAlias or Mod/Ref result, simply return -whatever the superclass computes. For example: +With only one special exception (the :ref:`-no-aa <aliasanalysis-no-aa>` pass) +every alias analysis pass chains to another alias analysis implementation (for +example, the user can specify "``-basicaa -ds-aa -licm``" to get the maximum +benefit from both alias analyses). The alias analysis class automatically +takes care of most of this for methods that you don't override. For methods +that you do override, in code paths that return a conservative MayAlias or +Mod/Ref result, simply return whatever the superclass computes. For example: .. code-block:: c++ @@ -504,11 +501,11 @@ Available ``AliasAnalysis`` implementations ------------------------------------------- This section lists the various implementations of the ``AliasAnalysis`` -interface. With the exception of the `-no-aa`_ implementation, all of these -`chain`_ to other alias analysis implementations. +interface. With the exception of the :ref:`-no-aa <aliasanalysis-no-aa>` +implementation, all of these :ref:`chain <aliasanalysis-chaining>` to other +alias analysis implementations. -.. _no-aa: -.. _-no-aa: +.. 
_aliasanalysis-no-aa: The ``-no-aa`` pass ^^^^^^^^^^^^^^^^^^^ diff --git a/docs/Atomics.rst b/docs/Atomics.rst index 1bca53e2b1..705d73fbab 100644 --- a/docs/Atomics.rst +++ b/docs/Atomics.rst @@ -1,5 +1,3 @@ -.. _atomics: - ============================================== LLVM Atomic Instructions and Concurrency Guide ============================================== diff --git a/docs/BitCodeFormat.rst b/docs/BitCodeFormat.rst index 333e79b864..c83b6c1801 100644 --- a/docs/BitCodeFormat.rst +++ b/docs/BitCodeFormat.rst @@ -1,5 +1,3 @@ -.. _bitcode_format: - .. role:: raw-html(raw) :format: html diff --git a/docs/BranchWeightMetadata.rst b/docs/BranchWeightMetadata.rst index 2667ce3589..71ecd34c82 100644 --- a/docs/BranchWeightMetadata.rst +++ b/docs/BranchWeightMetadata.rst @@ -1,5 +1,3 @@ -.. _branch_weight: - =========================== LLVM Branch Weight Metadata =========================== diff --git a/docs/Bugpoint.rst b/docs/Bugpoint.rst index 9ccf0cc2d9..1a5fc8c027 100644 --- a/docs/Bugpoint.rst +++ b/docs/Bugpoint.rst @@ -1,5 +1,3 @@ -.. _bugpoint: - ==================================== LLVM bugpoint tool: design and usage ==================================== @@ -136,9 +134,9 @@ non-obvious ways. Here are some hints and tips: It is often useful to capture the output of the program to file. For example, in the C shell, you can run: - .. code-block:: bash + .. code-block:: console - bugpoint ... |& tee bugpoint.log + $ bugpoint ... |& tee bugpoint.log to get a copy of ``bugpoint``'s output in the file ``bugpoint.log``, as well as on your terminal. diff --git a/docs/CMake.rst b/docs/CMake.rst index 7f0420c446..6eab04b970 100644 --- a/docs/CMake.rst +++ b/docs/CMake.rst @@ -1,5 +1,3 @@ -.. _building-with-cmake: - ======================== Building LLVM with CMake ======================== @@ -36,7 +34,7 @@ We use here the command-line, non-interactive CMake interface. #. Create a directory for containing the build. It is not supported to build LLVM on the source directory. cd to this directory: - .. code-block:: bash + .. code-block:: console $ mkdir mybuilddir $ cd mybuilddir @@ -44,7 +42,7 @@ We use here the command-line, non-interactive CMake interface. #. Execute this command on the shell replacing `path/to/llvm/source/root` with the path to the root of your LLVM source tree: - .. code-block:: bash + .. code-block:: console $ cmake path/to/llvm/source/root @@ -80,14 +78,14 @@ the corresponding *Generator* for creating files for your build tool. You can explicitly specify the generator with the command line option ``-G "Name of the generator"``. For knowing the available generators on your platform, execute -.. code-block:: bash +.. code-block:: console $ cmake --help This will list the generator's names at the end of the help text. Generator's names are case-sensitive. Example: -.. code-block:: bash +.. code-block:: console $ cmake -G "Visual Studio 9 2008" path/to/llvm/source/root @@ -110,14 +108,14 @@ Variables customize how the build will be generated. Options are boolean variables, with possible values ON/OFF. Options and variables are defined on the CMake command line like this: -.. code-block:: bash +.. code-block:: console $ cmake -DVARIABLE=value path/to/llvm/source You can set a variable after the initial CMake invocation for changing its value. You can also undefine a variable: -.. code-block:: bash +.. code-block:: console $ cmake -UVARIABLE path/to/llvm/source @@ -127,7 +125,7 @@ on the root of the build directory. Do not hand-edit it. 
Variables are listed here appending its type after a colon. It is correct to write the variable and the type on the CMake command line: -.. code-block:: bash +.. code-block:: console $ cmake -DVARIABLE:TYPE=value path/to/llvm/source @@ -280,7 +278,7 @@ Testing is performed when the *check* target is built. For instance, if you are using makefiles, execute this command while on the top level of your build directory: -.. code-block:: bash +.. code-block:: console $ make check @@ -355,13 +353,15 @@ an equivalent variant of snippet shown above: target_link_libraries(mycompiler ${REQ_LLVM_LIBRARIES}) +.. _cmake-out-of-source-pass: + Developing LLVM pass out of source ---------------------------------- It is possible to develop LLVM passes against installed LLVM. An example of project layout provided below: -.. code-block:: bash +.. code-block:: none <project dir>/ | diff --git a/docs/CodeGenerator.rst b/docs/CodeGenerator.rst index ce23667eb3..b5d4180974 100644 --- a/docs/CodeGenerator.rst +++ b/docs/CodeGenerator.rst @@ -1,5 +1,3 @@ -.. _code_generator: - ========================================== The LLVM Target-Independent Code Generator ========================================== @@ -17,6 +15,8 @@ The LLVM Target-Independent Code Generator .partial { background-color: #F88017 } .yes { background-color: #0F0; } .yes:before { content: "Y" } + .na { background-color: #6666FF; } + .na:before { content: "N/A" } </style> .. contents:: @@ -285,12 +285,10 @@ The ``TargetInstrInfo`` class ----------------------------- The ``TargetInstrInfo`` class is used to describe the machine instructions -supported by the target. It is essentially an array of ``TargetInstrDescriptor`` -objects, each of which describes one instruction the target -supports. Descriptors define things like the mnemonic for the opcode, the number -of operands, the list of implicit register uses and defs, whether the -instruction has certain target-independent properties (accesses memory, is -commutable, etc), and holds any target-specific flags. +supported by the target. Descriptions define things like the mnemonic for +the opcode, the number of operands, the list of implicit register uses and defs, +whether the instruction has certain target-independent properties (accesses +memory, is commutable, etc), and holds any target-specific flags. 
The ``TargetFrameInfo`` class ----------------------------- @@ -1748,12 +1746,14 @@ the key: :raw-html:`<table border="1" cellspacing="0">` :raw-html:`<tr>` :raw-html:`<th>Unknown</th>` +:raw-html:`<th>Not Applicable</th>` :raw-html:`<th>No support</th>` :raw-html:`<th>Partial Support</th>` :raw-html:`<th>Complete Support</th>` :raw-html:`</tr>` :raw-html:`<tr>` :raw-html:`<td class="unknown"></td>` +:raw-html:`<td class="na"></td>` :raw-html:`<td class="no"></td>` :raw-html:`<td class="partial"></td>` :raw-html:`<td class="yes"></td>` @@ -1773,7 +1773,7 @@ Here is the table: :raw-html:`<th>MBlaze</th>` :raw-html:`<th>MSP430</th>` :raw-html:`<th>Mips</th>` -:raw-html:`<th>PTX</th>` +:raw-html:`<th>NVPTX</th>` :raw-html:`<th>PowerPC</th>` :raw-html:`<th>Sparc</th>` :raw-html:`<th>X86</th>` @@ -1787,7 +1787,7 @@ Here is the table: :raw-html:`<td class="no"></td> <!-- MBlaze -->` :raw-html:`<td class="unknown"></td> <!-- MSP430 -->` :raw-html:`<td class="yes"></td> <!-- Mips -->` -:raw-html:`<td class="no"></td> <!-- PTX -->` +:raw-html:`<td class="yes"></td> <!-- NVPTX -->` :raw-html:`<td class="yes"></td> <!-- PowerPC -->` :raw-html:`<td class="yes"></td> <!-- Sparc -->` :raw-html:`<td class="yes"></td> <!-- X86 -->` @@ -1801,7 +1801,7 @@ Here is the table: :raw-html:`<td class="yes"></td> <!-- MBlaze -->` :raw-html:`<td class="no"></td> <!-- MSP430 -->` :raw-html:`<td class="no"></td> <!-- Mips -->` -:raw-html:`<td class="no"></td> <!-- PTX -->` +:raw-html:`<td class="no"></td> <!-- NVPTX -->` :raw-html:`<td class="no"></td> <!-- PowerPC -->` :raw-html:`<td class="no"></td> <!-- Sparc -->` :raw-html:`<td class="yes"></td> <!-- X86 -->` @@ -1815,7 +1815,7 @@ Here is the table: :raw-html:`<td class="yes"></td> <!-- MBlaze -->` :raw-html:`<td class="no"></td> <!-- MSP430 -->` :raw-html:`<td class="no"></td> <!-- Mips -->` -:raw-html:`<td class="no"></td> <!-- PTX -->` +:raw-html:`<td class="na"></td> <!-- NVPTX -->` :raw-html:`<td class="no"></td> <!-- PowerPC -->` :raw-html:`<td class="no"></td> <!-- Sparc -->` :raw-html:`<td class="yes"></td> <!-- X86 -->` @@ -1829,7 +1829,7 @@ Here is the table: :raw-html:`<td class="yes"></td> <!-- MBlaze -->` :raw-html:`<td class="unknown"></td> <!-- MSP430 -->` :raw-html:`<td class="no"></td> <!-- Mips -->` -:raw-html:`<td class="unknown"></td> <!-- PTX -->` +:raw-html:`<td class="yes"></td> <!-- NVPTX -->` :raw-html:`<td class="yes"></td> <!-- PowerPC -->` :raw-html:`<td class="unknown"></td> <!-- Sparc -->` :raw-html:`<td class="yes"></td> <!-- X86 -->` @@ -1843,7 +1843,7 @@ Here is the table: :raw-html:`<td class="no"></td> <!-- MBlaze -->` :raw-html:`<td class="unknown"></td> <!-- MSP430 -->` :raw-html:`<td class="yes"></td> <!-- Mips -->` -:raw-html:`<td class="unknown"></td> <!-- PTX -->` +:raw-html:`<td class="na"></td> <!-- NVPTX -->` :raw-html:`<td class="yes"></td> <!-- PowerPC -->` :raw-html:`<td class="unknown"></td> <!-- Sparc -->` :raw-html:`<td class="yes"></td> <!-- X86 -->` @@ -1857,7 +1857,7 @@ Here is the table: :raw-html:`<td class="yes"></td> <!-- MBlaze -->` :raw-html:`<td class="no"></td> <!-- MSP430 -->` :raw-html:`<td class="no"></td> <!-- Mips -->` -:raw-html:`<td class="no"></td> <!-- PTX -->` +:raw-html:`<td class="na"></td> <!-- NVPTX -->` :raw-html:`<td class="no"></td> <!-- PowerPC -->` :raw-html:`<td class="no"></td> <!-- Sparc -->` :raw-html:`<td class="yes"></td> <!-- X86 -->` @@ -1871,7 +1871,7 @@ Here is the table: :raw-html:`<td class="no"></td> <!-- MBlaze -->` :raw-html:`<td class="unknown"></td> <!-- MSP430 -->` 
:raw-html:`<td class="no"></td> <!-- Mips -->` -:raw-html:`<td class="unknown"></td> <!-- PTX -->` +:raw-html:`<td class="no"></td> <!-- NVPTX -->` :raw-html:`<td class="yes"></td> <!-- PowerPC -->` :raw-html:`<td class="unknown"></td> <!-- Sparc -->` :raw-html:`<td class="yes"></td> <!-- X86 -->` @@ -1885,7 +1885,7 @@ Here is the table: :raw-html:`<td class="no"></td> <!-- MBlaze -->` :raw-html:`<td class="no"></td> <!-- MSP430 -->` :raw-html:`<td class="no"></td> <!-- Mips -->` -:raw-html:`<td class="no"></td> <!-- PTX -->` +:raw-html:`<td class="no"></td> <!-- NVPTX -->` :raw-html:`<td class="no"></td> <!-- PowerPC -->` :raw-html:`<td class="no"></td> <!-- Sparc -->` :raw-html:`<td class="partial"><a href="#feat_segstacks_x86">*</a></td> <!-- X86 -->` @@ -2367,17 +2367,17 @@ Dynamic Allocation TODO - More to come. -The PTX backend ---------------- +The NVPTX backend +----------------- -The PTX code generator lives in the lib/Target/PTX directory. It is currently a -work-in-progress, but already supports most of the code generation functionality -needed to generate correct PTX kernels for CUDA devices. +The NVPTX code generator under lib/Target/NVPTX is an open-source version of +the NVIDIA NVPTX code generator for LLVM. It is contributed by NVIDIA and is +a port of the code generator used in the CUDA compiler (nvcc). It targets the +PTX 3.0/3.1 ISA and can target any compute capability greater than or equal to +2.0 (Fermi). -The code generator can target PTX 2.0+, and shader model 1.0+. The PTX ISA -Reference Manual is used as the primary source of ISA information, though an -effort is made to make the output of the code generator match the output of the -NVidia nvcc compiler, whenever possible. +This target is of production quality and should be completely compatible with +the official NVIDIA toolchain. Code Generator Options: @@ -2387,39 +2387,28 @@ Code Generator Options: :raw-html:`<th>Description</th>` :raw-html:`</tr>` :raw-html:`<tr>` -:raw-html:`<td>``double``</td>` -:raw-html:`<td align="left">If enabled, the map_f64_to_f32 directive is disabled in the PTX output, allowing native double-precision arithmetic</td>` +:raw-html:`<td>sm_20</td>` +:raw-html:`<td align="left">Set shader model/compute capability to 2.0</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>sm_21</td>` +:raw-html:`<td align="left">Set shader model/compute capability to 2.1</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>sm_30</td>` +:raw-html:`<td align="left">Set shader model/compute capability to 3.0</td>` +:raw-html:`</tr>` +:raw-html:`<tr>` +:raw-html:`<td>sm_35</td>` +:raw-html:`<td align="left">Set shader model/compute capability to 3.5</td>` :raw-html:`</tr>` :raw-html:`<tr>` -:raw-html:`<td>``no-fma``</td>` -:raw-html:`<td align="left">Disable generation of Fused-Multiply Add instructions, which may be beneficial for some devices</td>` +:raw-html:`<td>ptx30</td>` +:raw-html:`<td align="left">Target PTX 3.0</td>` :raw-html:`</tr>` :raw-html:`<tr>` -:raw-html:`<td>``smxy / computexy``</td>` -:raw-html:`<td align="left">Set shader model/compute capability to x.y, e.g. 
sm20 or compute13</td>` +:raw-html:`<td>ptx31</td>` +:raw-html:`<td align="left">Target PTX 3.1</td>` :raw-html:`</tr>` :raw-html:`</table>` -Working: - -* Arithmetic instruction selection (including combo FMA) - -* Bitwise instruction selection - -* Control-flow instruction selection - -* Function calls (only on SM 2.0+ and no return arguments) - -* Addresses spaces (0 = global, 1 = constant, 2 = local, 4 = shared) - -* Thread synchronization (bar.sync) - -* Special register reads ([N]TID, [N]CTAID, PMx, CLOCK, etc.) - -In Progress: - -* Robust call instruction selection - -* Stack frame allocation - -* Device-specific instruction scheduling optimizations diff --git a/docs/CodingStandards.rst b/docs/CodingStandards.rst index 8003c12497..4d66ad7574 100644 --- a/docs/CodingStandards.rst +++ b/docs/CodingStandards.rst @@ -1,5 +1,3 @@ -.. _coding_standards: - ===================== LLVM Coding Standards ===================== @@ -1090,6 +1088,34 @@ flushes the output stream. In other words, these are equivalent: Most of the time, you probably have no reason to flush the output stream, so it's better to use a literal ``'\n'``. +Don't use ``inline`` when defining a function in a class definition +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +A member function defined in a class definition is implicitly inline, so don't +put the ``inline`` keyword in this case. + +Don't: + +.. code-block:: c++ + + class Foo { + public: + inline void bar() { + // ... + } + }; + +Do: + +.. code-block:: c++ + + class Foo { + public: + void bar() { + // ... + } + }; + Microscopic Details ------------------- @@ -1304,7 +1330,7 @@ namespace just because it was declared there. See Also ======== -A lot of these comments and recommendations have been culled for other sources. +A lot of these comments and recommendations have been culled from other sources. Two particularly important books for our work are: #. `Effective C++ diff --git a/docs/CommandGuide/FileCheck.rst b/docs/CommandGuide/FileCheck.rst index 256970b362..fce63ba688 100644 --- a/docs/CommandGuide/FileCheck.rst +++ b/docs/CommandGuide/FileCheck.rst @@ -43,7 +43,8 @@ OPTIONS By default, FileCheck canonicalizes input horizontal whitespace (spaces and tabs) which causes it to ignore these differences (a space will match a tab). - The :option:`--strict-whitespace` argument disables this behavior. + The :option:`--strict-whitespace` argument disables this behavior. End-of-line + sequences are canonicalized to UNIX-style '\n' in all modes. .. option:: -version diff --git a/docs/CommandGuide/index.rst b/docs/CommandGuide/index.rst index 73a4835dd7..ac8a944a2e 100644 --- a/docs/CommandGuide/index.rst +++ b/docs/CommandGuide/index.rst @@ -1,5 +1,3 @@ -.. _commands: - LLVM Command Guide ------------------ @@ -30,6 +28,7 @@ Basic Commands llvm-diff llvm-cov llvm-stress + llvm-symbolizer Debugging Tools ~~~~~~~~~~~~~~~ diff --git a/docs/CommandGuide/lit.rst b/docs/CommandGuide/lit.rst index 1dcaff10bf..40c7646260 100644 --- a/docs/CommandGuide/lit.rst +++ b/docs/CommandGuide/lit.rst @@ -151,10 +151,6 @@ ADDITIONAL OPTIONS List the discovered test suites as part of the standard output. -.. option:: --no-tcl-as-sh - - Run Tcl scripts internally (instead of converting to shell scripts). - .. option:: --repeat=N Run each test ``N`` times. Currently this is primarily useful for timing @@ -298,14 +294,13 @@ executed, two important global variables are predefined: tests in the suite. 
**suffixes** For **lit** test formats which scan directories for tests, this - variable is a list of suffixes to identify test files. Used by: *ShTest*, - *TclTest*. + variable is a list of suffixes to identify test files. Used by: *ShTest*. **substitutions** For **lit** test formats which substitute variables into a test - script, the list of substitutions to perform. Used by: *ShTest*, *TclTest*. + script, the list of substitutions to perform. Used by: *ShTest*. **unsupported** Mark an unsupported directory, all tests within it will be - reported as unsupported. Used by: *ShTest*, *TclTest*. + reported as unsupported. Used by: *ShTest*. **parent** The parent configuration, this is the config object for the directory containing the test suite, or None. diff --git a/docs/CommandGuide/llvm-bcanalyzer.rst b/docs/CommandGuide/llvm-bcanalyzer.rst index f1e4eac1be..7254088ec9 100644 --- a/docs/CommandGuide/llvm-bcanalyzer.rst +++ b/docs/CommandGuide/llvm-bcanalyzer.rst @@ -1,424 +1,305 @@ llvm-bcanalyzer - LLVM bitcode analyzer ======================================= - SYNOPSIS -------- - -**llvm-bcanalyzer** [*options*] [*filename*] - +:program:`llvm-bcanalyzer` [*options*] [*filename*] DESCRIPTION ----------- +The :program:`llvm-bcanalyzer` command is a small utility for analyzing bitcode +files. The tool reads a bitcode file (such as generated with the +:program:`llvm-as` tool) and produces a statistical report on the contents of +the bitcode file. The tool can also dump a low level but human readable +version of the bitcode file. This tool is probably not of much interest or +utility except for those working directly with the bitcode file format. Most +LLVM users can just ignore this tool. -The **llvm-bcanalyzer** command is a small utility for analyzing bitcode files. -The tool reads a bitcode file (such as generated with the **llvm-as** tool) and -produces a statistical report on the contents of the bitcode file. The tool -can also dump a low level but human readable version of the bitcode file. -This tool is probably not of much interest or utility except for those working -directly with the bitcode file format. Most LLVM users can just ignore -this tool. - -If *filename* is omitted or is ``-``, then **llvm-bcanalyzer** reads its input -from standard input. This is useful for combining the tool into a pipeline. -Output is written to the standard output. - +If *filename* is omitted or is ``-``, then :program:`llvm-bcanalyzer` reads its +input from standard input. This is useful for combining the tool into a +pipeline. Output is written to the standard output. OPTIONS ------- +.. program:: llvm-bcanalyzer +.. option:: -nodetails -**-nodetails** - - Causes **llvm-bcanalyzer** to abbreviate its output by writing out only a module - level summary. The details for individual functions are not displayed. - - + Causes :program:`llvm-bcanalyzer` to abbreviate its output by writing out only + a module level summary. The details for individual functions are not + displayed. -**-dump** +.. option:: -dump - Causes **llvm-bcanalyzer** to dump the bitcode in a human readable format. This - format is significantly different from LLVM assembly and provides details about - the encoding of the bitcode file. + Causes :program:`llvm-bcanalyzer` to dump the bitcode in a human readable + format. This format is significantly different from LLVM assembly and + provides details about the encoding of the bitcode file. +.. 
option:: -verify - -**-verify** - - Causes **llvm-bcanalyzer** to verify the module produced by reading the - bitcode. This ensures that the statistics generated are based on a consistent + Causes :program:`llvm-bcanalyzer` to verify the module produced by reading the + bitcode. This ensures that the statistics generated are based on a consistent module. - - -**-help** +.. option:: -help Print a summary of command line options. - - - EXIT STATUS ----------- - -If **llvm-bcanalyzer** succeeds, it will exit with 0. Otherwise, if an error -occurs, it will exit with a non-zero value, usually 1. - +If :program:`llvm-bcanalyzer` succeeds, it will exit with 0. Otherwise, if an +error occurs, it will exit with a non-zero value, usually 1. SUMMARY OUTPUT DEFINITIONS -------------------------- - -The following items are always printed by llvm-bcanalyzer. They comprize the +The following items are always printed by llvm-bcanalyzer. They comprize the summary output. - **Bitcode Analysis Of Module** This just provides the name of the module for which bitcode analysis is being generated. - - **Bitcode Version Number** The bitcode version (not LLVM version) of the file read by the analyzer. - - **File Size** The size, in bytes, of the entire bitcode file. - - **Module Bytes** - The size, in bytes, of the module block. Percentage is relative to File Size. - - + The size, in bytes, of the module block. Percentage is relative to File Size. **Function Bytes** - The size, in bytes, of all the function blocks. Percentage is relative to File + The size, in bytes, of all the function blocks. Percentage is relative to File Size. - - **Global Types Bytes** - The size, in bytes, of the Global Types Pool. Percentage is relative to File - Size. This is the size of the definitions of all types in the bitcode file. - - + The size, in bytes, of the Global Types Pool. Percentage is relative to File + Size. This is the size of the definitions of all types in the bitcode file. **Constant Pool Bytes** The size, in bytes, of the Constant Pool Blocks Percentage is relative to File Size. - - **Module Globals Bytes** Ths size, in bytes, of the Global Variable Definitions and their initializers. Percentage is relative to File Size. - - **Instruction List Bytes** The size, in bytes, of all the instruction lists in all the functions. - Percentage is relative to File Size. Note that this value is also included in + Percentage is relative to File Size. Note that this value is also included in the Function Bytes. - - **Compaction Table Bytes** The size, in bytes, of all the compaction tables in all the functions. - Percentage is relative to File Size. Note that this value is also included in + Percentage is relative to File Size. Note that this value is also included in the Function Bytes. - - **Symbol Table Bytes** - The size, in bytes, of all the symbol tables in all the functions. Percentage is - relative to File Size. Note that this value is also included in the Function + The size, in bytes, of all the symbol tables in all the functions. Percentage is + relative to File Size. Note that this value is also included in the Function Bytes. - - **Dependent Libraries Bytes** - The size, in bytes, of the list of dependent libraries in the module. Percentage - is relative to File Size. Note that this value is also included in the Module + The size, in bytes, of the list of dependent libraries in the module. Percentage + is relative to File Size. Note that this value is also included in the Module Global Bytes. 
- - **Number Of Bitcode Blocks** The total number of blocks of any kind in the bitcode file. - - **Number Of Functions** The total number of function definitions in the bitcode file. - - **Number Of Types** The total number of types defined in the Global Types Pool. - - **Number Of Constants** The total number of constants (of any type) defined in the Constant Pool. - - **Number Of Basic Blocks** The total number of basic blocks defined in all functions in the bitcode file. - - **Number Of Instructions** The total number of instructions defined in all functions in the bitcode file. - - **Number Of Long Instructions** The total number of long instructions defined in all functions in the bitcode - file. Long instructions are those taking greater than 4 bytes. Typically long + file. Long instructions are those taking greater than 4 bytes. Typically long instructions are GetElementPtr with several indices, PHI nodes, and calls to functions with large numbers of arguments. - - **Number Of Operands** The total number of operands used in all instructions in the bitcode file. - - **Number Of Compaction Tables** The total number of compaction tables in all functions in the bitcode file. - - **Number Of Symbol Tables** The total number of symbol tables in all functions in the bitcode file. - - **Number Of Dependent Libs** The total number of dependent libraries found in the bitcode file. - - **Total Instruction Size** The total size of the instructions in all functions in the bitcode file. - - **Average Instruction Size** The average number of bytes per instruction across all functions in the bitcode - file. This value is computed by dividing Total Instruction Size by Number Of + file. This value is computed by dividing Total Instruction Size by Number Of Instructions. - - **Maximum Type Slot Number** - The maximum value used for a type's slot number. Larger slot number values take + The maximum value used for a type's slot number. Larger slot number values take more bytes to encode. - - **Maximum Value Slot Number** - The maximum value used for a value's slot number. Larger slot number values take + The maximum value used for a value's slot number. Larger slot number values take more bytes to encode. - - **Bytes Per Value** - The average size of a Value definition (of any type). This is computed by + The average size of a Value definition (of any type). This is computed by dividing File Size by the total number of values of any type. - - **Bytes Per Global** The average size of a global definition (constants and global variables). - - **Bytes Per Function** - The average number of bytes per function definition. This is computed by + The average number of bytes per function definition. This is computed by dividing Function Bytes by Number Of Functions. - - **# of VBR 32-bit Integers** The total number of 32-bit integers encoded using the Variable Bit Rate encoding scheme. - - **# of VBR 64-bit Integers** The total number of 64-bit integers encoded using the Variable Bit Rate encoding scheme. - - **# of VBR Compressed Bytes** The total number of bytes consumed by the 32-bit and 64-bit integers that use the Variable Bit Rate encoding scheme. - - **# of VBR Expanded Bytes** The total number of bytes that would have been consumed by the 32-bit and 64-bit integers had they not been compressed with the Variable Bit Rage encoding scheme. - - **Bytes Saved With VBR** The total number of bytes saved by using the Variable Bit Rate encoding scheme. The percentage is relative to # of VBR Expanded Bytes. 
- - - DETAILED OUTPUT DEFINITIONS --------------------------- - The following definitions occur only if the -nodetails option was not given. The detailed output provides additional information on a per-function basis. - **Type** The type signature of the function. - - **Byte Size** The total number of bytes in the function's block. - - **Basic Blocks** The number of basic blocks defined by the function. - - **Instructions** The number of instructions defined by the function. - - **Long Instructions** The number of instructions using the long instruction format in the function. - - **Operands** The number of operands used by all instructions in the function. - - **Instruction Size** The number of bytes consumed by instructions in the function. - - **Average Instruction Size** - The average number of bytes consumed by the instructions in the function. This - value is computed by dividing Instruction Size by Instructions. - - + The average number of bytes consumed by the instructions in the function. + This value is computed by dividing Instruction Size by Instructions. **Bytes Per Instruction** - The average number of bytes used by the function per instruction. This value is - computed by dividing Byte Size by Instructions. Note that this is not the same - as Average Instruction Size. It computes a number relative to the total function - size not just the size of the instruction list. - - + The average number of bytes used by the function per instruction. This value + is computed by dividing Byte Size by Instructions. Note that this is not the + same as Average Instruction Size. It computes a number relative to the total + function size not just the size of the instruction list. **Number of VBR 32-bit Integers** The total number of 32-bit integers found in this function (for any use). - - **Number of VBR 64-bit Integers** The total number of 64-bit integers found in this function (for any use). - - **Number of VBR Compressed Bytes** The total number of bytes in this function consumed by the 32-bit and 64-bit integers that use the Variable Bit Rate encoding scheme. - - **Number of VBR Expanded Bytes** The total number of bytes in this function that would have been consumed by the 32-bit and 64-bit integers had they not been compressed with the Variable Bit Rate encoding scheme. - - **Bytes Saved With VBR** The total number of bytes saved in this function by using the Variable Bit - Rate encoding scheme. The percentage is relative to # of VBR Expanded Bytes. - - - + Rate encoding scheme. The percentage is relative to # of VBR Expanded Bytes. SEE ALSO -------- +:doc:`/CommandGuide/llvm-dis`, :doc:`/BitCodeFormat` -llvm-dis|llvm-dis, `http://llvm.org/docs/BitCodeFormat.html <http://llvm.org/docs/BitCodeFormat.html>`_ diff --git a/docs/CommandGuide/llvm-symbolizer.rst b/docs/CommandGuide/llvm-symbolizer.rst new file mode 100644 index 0000000000..73babb1e5c --- /dev/null +++ b/docs/CommandGuide/llvm-symbolizer.rst @@ -0,0 +1,65 @@ +llvm-symbolizer - convert addresses into source code locations +============================================================== + +SYNOPSIS +-------- + +:program:`llvm-symbolizer` [options] + +DESCRIPTION +----------- + +:program:`llvm-symbolizer` reads object file names and addresses from standard +input and prints corresponding source code locations to standard output. This +program uses debug info sections and symbol table in the object files. + +EXAMPLE +-------- + +.. 
code-block:: console + + $ cat addr.txt + a.out 0x4004f4 + /tmp/b.out 0x400528 + /tmp/c.so 0x710 + $ llvm-symbolizer < addr.txt + main + /tmp/a.cc:4 + + f(int, int) + /tmp/b.cc:11 + + h_inlined_into_g + /tmp/header.h:2 + g_inlined_into_f + /tmp/header.h:7 + f_inlined_into_main + /tmp/source.cc:3 + main + /tmp/source.cc:8 + +OPTIONS +------- + +.. option:: -functions + + Print function names as well as source file/line locations. Defaults to true. + +.. option:: -use-symbol-table + + Prefer function names stored in symbol table to function names + in debug info sections. Defaults to true. + +.. option:: -demangle + + Print demangled function names. Defaults to true. + +.. option:: -inlining + + If a source code location is in an inlined function, prints all the + inlnied frames. Defaults to true. + +EXIT STATUS +----------- + +:program:`llvm-symbolizer` returns 0. Other exit codes imply internal program error. diff --git a/docs/CommandLine.rst b/docs/CommandLine.rst index 302f5a4cf5..073958b16b 100644 --- a/docs/CommandLine.rst +++ b/docs/CommandLine.rst @@ -1,5 +1,3 @@ -.. _commandline: - ============================== CommandLine 2.0 Library Manual ============================== @@ -68,9 +66,7 @@ CommandLine library to have the following features: This document will hopefully let you jump in and start using CommandLine in your utility quickly and painlessly. Additionally it should be a simple reference -manual to figure out how stuff works. If it is failing in some area (or you -want an extension to the library), nag the author, `Chris -Lattner <mailto:sabre@nondot.org>`_. +manual to figure out how stuff works. Quick Start Guide ================= diff --git a/docs/CompilerWriterInfo.rst b/docs/CompilerWriterInfo.rst index 7504d3c75a..87add670af 100644 --- a/docs/CompilerWriterInfo.rst +++ b/docs/CompilerWriterInfo.rst @@ -1,5 +1,3 @@ -.. _compiler_writer_info: - ======================================================== Architecture & Platform Information for Compiler Writers ======================================================== @@ -12,8 +10,6 @@ Architecture & Platform Information for Compiler Writers This document is a work-in-progress. Additions and clarifications are welcome. - Compiled by `Misha Brukman <http://misha.brukman.net>`_. 
- Hardware ======== @@ -24,6 +20,11 @@ ARM * `ABI <http://www.arm.com/products/DevTools/ABI.html>`_ +AArch64 +------- + +* `ARMv8 Instruction Set Overview <http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.genc010197a/index.html>`_ + Itanium (ia64) -------------- @@ -40,19 +41,15 @@ PowerPC IBM - Official manuals and docs ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -* `PowerPC Architecture Book <http://www-106.ibm.com/developerworks/eserver/articles/archguide.html>`_ - - * Book I: `PowerPC User Instruction Set Architecture <http://www-106.ibm.com/developerworks/eserver/pdfs/archpub1.pdf>`_ - - * Book II: `PowerPC Virtual Environment Architecture <http://www-106.ibm.com/developerworks/eserver/pdfs/archpub2.pdf>`_ +* `Power Instruction Set Architecture, Versions 2.03 through 2.06 (authentication required, free sign-up) <https://www.power.org/technology-introduction/standards-specifications>`_ - * Book III: `PowerPC Operating Environment Architecture <http://www-106.ibm.com/developerworks/eserver/pdfs/archpub3.pdf>`_ +* `PowerPC Compiler Writer's Guide <http://www.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF7785256996007558C6>`_ -* `PowerPC Compiler Writer's Guide <http://www-3.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF7785256996007558C6>`_ +* `Intro to PowerPC Architecture <http://www.ibm.com/developerworks/linux/library/l-powarch/>`_ -* `PowerPC Processor Manuals <http://www-3.ibm.com/chips/techlib/techlib.nsf/products/PowerPC>`_ +* `PowerPC Processor Manuals (embedded) <http://www.ibm.com/chips/techlib/techlib.nsf/products/PowerPC>`_ -* `Intro to PowerPC Architecture <http://www-106.ibm.com/developerworks/linux/library/l-powarch/>`_ +* `Various IBM specifications and white papers <https://www.power.org/documentation/?document_company=105&document_category=all&publish_year=all&grid_order=DESC&grid_sort=title>`_ * `IBM AIX/5L for POWER Assembly Reference <http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixassem/alangref/alangreftfrm.htm>`_ @@ -81,7 +78,7 @@ AMD - Official manuals and docs Intel - Official manuals and docs ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -* `IA-32 manuals <http://developer.intel.com/design/pentium4/manuals/index_new.htm>`_ +* `Intel 64 and IA-32 manuals <http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html>`_ * `Intel Itanium documentation <http://www.intel.com/design/itanium/documentation.htm?iid=ipp_srvr_proc_itanium2+techdocs>`_ Other x86-specific information @@ -101,6 +98,8 @@ Linux ----- * `PowerPC 64-bit ELF ABI Supplement <http://www.linuxbase.org/spec/ELF/ppc64/>`_ +* `Procedure Call Standard for the AArch64 Architecture <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055a/IHI0055A_aapcs64.pdf>`_ +* `ELF for the ARM 64-bit Architecture (AArch64) <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0056a/IHI0056A_aaelf64.pdf>`_ OS X ---- diff --git a/docs/DebuggingJITedCode.rst b/docs/DebuggingJITedCode.rst index eeb2f7787d..d6101d5100 100644 --- a/docs/DebuggingJITedCode.rst +++ b/docs/DebuggingJITedCode.rst @@ -1,11 +1,7 @@ -.. _debugging-jited-code: - ============================== Debugging JIT-ed Code With GDB ============================== -.. sectionauthor:: Reid Kleckner and Eli Bendersky - Background ========== diff --git a/docs/DeveloperPolicy.rst b/docs/DeveloperPolicy.rst index 925e769b86..43bdc85985 100644 --- a/docs/DeveloperPolicy.rst +++ b/docs/DeveloperPolicy.rst @@ -1,5 +1,3 @@ -.. 
_developer_policy: - ===================== LLVM Developer Policy ===================== diff --git a/docs/Dummy.html b/docs/Dummy.html new file mode 100644 index 0000000000..e69de29bb2 --- /dev/null +++ b/docs/Dummy.html diff --git a/docs/ExceptionHandling.rst b/docs/ExceptionHandling.rst index 190f18261d..0a86607556 100644 --- a/docs/ExceptionHandling.rst +++ b/docs/ExceptionHandling.rst @@ -1,5 +1,3 @@ -.. _exception_handling: - ========================== Exception Handling in LLVM ========================== @@ -34,13 +32,13 @@ execution of an application. A more complete description of the Itanium ABI exception handling runtime support of can be found at `Itanium C++ ABI: Exception Handling -<http://www.codesourcery.com/cxx-abi/abi-eh.html>`_. A description of the +<http://mentorembedded.github.com/cxx-abi/abi-eh.html>`_. A description of the exception frame format can be found at `Exception Frames -<http://refspecs.freestandards.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html>`_, +<http://refspecs.linuxfoundation.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html>`_, with details of the DWARF 4 specification at `DWARF 4 Standard <http://dwarfstd.org/Dwarf4Std.php>`_. A description for the C++ exception table formats can be found at `Exception Handling Tables -<http://www.codesourcery.com/cxx-abi/exceptions.pdf>`_. +<http://mentorembedded.github.com/cxx-abi/exceptions.pdf>`_. Setjmp/Longjmp Exception Handling --------------------------------- @@ -151,10 +149,10 @@ type info index are passed in as arguments. The landing pad saves the exception structure reference and then proceeds to select the catch block that corresponds to the type info of the exception object. -The LLVM `landingpad instruction <LangRef.html#i_landingpad>`_ is used to convey -information about the landing pad to the back end. For C++, the ``landingpad`` -instruction returns a pointer and integer pair corresponding to the pointer to -the *exception structure* and the *selector value* respectively. +The LLVM :ref:`i_landingpad` is used to convey information about the landing +pad to the back end. For C++, the ``landingpad`` instruction returns a pointer +and integer pair corresponding to the pointer to the *exception structure* and +the *selector value* respectively. The ``landingpad`` instruction takes a reference to the personality function to be used for this ``try``/``catch`` sequence. The remainder of the instruction is @@ -203,10 +201,9 @@ A cleanup is extra code which needs to be run as part of unwinding a scope. C++ destructors are a typical example, but other languages and language extensions provide a variety of different kinds of cleanups. In general, a landing pad may need to run arbitrary amounts of cleanup code before actually entering a catch -block. To indicate the presence of cleanups, a `landingpad -instruction <LangRef.html#i_landingpad>`_ should have a *cleanup* -clause. Otherwise, the unwinder will not stop at the landing pad if there are no -catches or filters that require it to. +block. To indicate the presence of cleanups, a :ref:`i_landingpad` should have +a *cleanup* clause. Otherwise, the unwinder will not stop at the landing pad if +there are no catches or filters that require it to. .. note:: @@ -226,9 +223,9 @@ Throw Filters C++ allows the specification of which exception types may be thrown from a function. To represent this, a top level landing pad may exist to filter out -invalid types. 
To express this in LLVM code the `landingpad -instruction <LangRef.html#i_landingpad>`_ will have a filter clause. The clause -consists of an array of type infos. ``landingpad`` will return a negative value +invalid types. To express this in LLVM code the :ref:`i_landingpad` will have a +filter clause. The clause consists of an array of type infos. +``landingpad`` will return a negative value if the exception does not match any of the type infos. If no match is found then a call to ``__cxa_call_unexpected`` should be made, otherwise ``_Unwind_Resume``. Each of these functions requires a reference to the @@ -269,8 +266,8 @@ handling information at various points in generated code. .. _llvm.eh.typeid.for: -llvm.eh.typeid.for ------------------- +``llvm.eh.typeid.for`` +---------------------- .. code-block:: llvm @@ -283,8 +280,8 @@ function. This value can be used to compare against the result of .. _llvm.eh.sjlj.setjmp: -llvm.eh.sjlj.setjmp -------------------- +``llvm.eh.sjlj.setjmp`` +----------------------- .. code-block:: llvm @@ -305,8 +302,8 @@ available for use in a target-specific manner. .. _llvm.eh.sjlj.longjmp: -llvm.eh.sjlj.longjmp --------------------- +``llvm.eh.sjlj.longjmp`` +------------------------ .. code-block:: llvm @@ -318,8 +315,8 @@ a buffer populated by `llvm.eh.sjlj.setjmp`_. The frame pointer and stack pointer are restored from the buffer, then control is transferred to the destination address. -llvm.eh.sjlj.lsda ------------------ +``llvm.eh.sjlj.lsda`` +--------------------- .. code-block:: llvm @@ -330,8 +327,8 @@ the address of the Language Specific Data Area (LSDA) for the current function. The SJLJ front-end code stores this address in the exception handling function context for use by the runtime. -llvm.eh.sjlj.callsite ---------------------- +``llvm.eh.sjlj.callsite`` +------------------------- .. code-block:: llvm diff --git a/docs/ExtendingLLVM.rst b/docs/ExtendingLLVM.rst index 6df08eee98..3d8e9ee79a 100644 --- a/docs/ExtendingLLVM.rst +++ b/docs/ExtendingLLVM.rst @@ -1,5 +1,3 @@ -.. _extending_llvm: - ============================================================ Extending LLVM: Adding instructions, intrinsics, types, etc. ============================================================ diff --git a/docs/FAQ.rst b/docs/FAQ.rst index b0e3ca0456..e4ab2c18f7 100644 --- a/docs/FAQ.rst +++ b/docs/FAQ.rst @@ -1,5 +1,3 @@ -.. _faq: - ================================ Frequently Asked Questions (FAQ) ================================ @@ -53,6 +51,29 @@ Some porting problems may exist in the following areas: like the Bourne Shell and sed. Porting to systems without these tools (MacOS 9, Plan 9) will require more effort. +What API do I use to store a value to one of the virtual registers in LLVM IR's SSA representation? +--------------------------------------------------------------------------------------------------- + +In short: you can't. It's actually kind of a silly question once you grok +what's going on. Basically, in code like: + +.. code-block:: llvm + + %result = add i32 %foo, %bar + +, ``%result`` is just a name given to the ``Value`` of the ``add`` +instruction. In other words, ``%result`` *is* the add instruction. The +"assignment" doesn't explicitly "store" anything to any "virtual register"; +the "``=``" is more like the mathematical sense of equality. + +Longer explanation: In order to generate a textual representation of the +IR, some kind of name has to be given to each instruction so that other +instructions can textually reference it. 
However, the isomorphic in-memory +representation that you manipulate from C++ has no such restriction since +instructions can simply keep pointers to any other ``Value``'s that they +reference. In fact, the names of dummy numbered temporaries like ``%1`` are +not explicitly represented in the in-memory representation at all (see +``Value::getName()``). Build Problems ============== @@ -79,7 +100,7 @@ grabbing the wrong linker/assembler/etc, there are two ways to fix it: #. Run ``configure`` with an alternative ``PATH`` that is correct. In a Bourne compatible shell, the syntax would be: -.. code-block:: bash +.. code-block:: console % PATH=[the path without the bad program] ./configure ... @@ -106,7 +127,7 @@ I've modified a Makefile in my source tree, but my build tree keeps using the ol If the Makefile already exists in your object tree, you can just run the following command in the top level directory of your object tree: -.. code-block:: bash +.. code-block:: console % ./config.status <relative path to Makefile>; @@ -133,13 +154,13 @@ This is most likely occurring because you built a profile or release For example, if you built LLVM with the command: -.. code-block:: bash +.. code-block:: console % gmake ENABLE_PROFILING=1 ...then you must run the tests with the following commands: -.. code-block:: bash +.. code-block:: console % cd llvm/test % gmake ENABLE_PROFILING=1 @@ -175,17 +196,17 @@ After Subversion update, rebuilding gives the error "No rule to make target". ----------------------------------------------------------------------------- If the error is of the form: -.. code-block:: bash +.. code-block:: console gmake[2]: *** No rule to make target `/path/to/somefile', - needed by `/path/to/another/file.d'. + needed by `/path/to/another/file.d'. Stop. This may occur anytime files are moved within the Subversion repository or removed entirely. In this case, the best solution is to erase all ``.d`` files, which list dependencies for source files, and rebuild: -.. code-block:: bash +.. code-block:: console % cd $LLVM_OBJ_DIR % rm -f `find . -name \*\.d` diff --git a/docs/GarbageCollection.rst b/docs/GarbageCollection.rst index b0b2718409..5c3a1af23c 100644 --- a/docs/GarbageCollection.rst +++ b/docs/GarbageCollection.rst @@ -5,9 +5,6 @@ Accurate Garbage Collection with LLVM .. contents:: :local: -.. sectionauthor:: Chris Lattner <sabre@nondot.org> and - Gordon Henriksen - Introduction ============ @@ -49,8 +46,6 @@ techniques dominates any low-level losses. This document describes the mechanisms and interfaces provided by LLVM to support accurate garbage collection. -.. _feature: - Goals and non-goals ------------------- @@ -121,8 +116,6 @@ lot of work for the developer of a novel language. However, it's easy to get started quickly and scale up to a more sophisticated implementation as your compiler matures. -.. _quickstart: - Getting started =============== @@ -177,8 +170,6 @@ To help with several of these tasks (those indicated with a \*), LLVM includes a highly portable, built-in ShadowStack code generator. It is compiled into ``llc`` and works even with the interpreter and C backends. -.. _quickstart-compiler: - In your compiler ---------------- @@ -200,8 +191,6 @@ There's no need to use ``@llvm.gcread`` and ``@llvm.gcwrite`` over plain ``load`` and ``store`` for now. You will need them when switching to a more advanced GC. -.. _quickstart-runtime: - In your runtime --------------- @@ -263,8 +252,6 @@ data structure, but there are only 20 lines of meaningful code.) } } -.. 
_shadow-stack: - About the shadow stack ---------------------- @@ -283,8 +270,9 @@ The tradeoff for this simplicity and portability is: * Not thread-safe. Still, it's an easy way to get started. After your compiler and runtime are up -and running, writing a plugin_ will allow you to take advantage of :ref:`more -advanced GC features <collector-algos>` of LLVM in order to improve performance. +and running, writing a :ref:`plugin <plugin>` will allow you to take advantage +of :ref:`more advanced GC features <collector-algos>` of LLVM in order to +improve performance. .. _gc_intrinsics: @@ -300,8 +288,6 @@ These facilities are limited to those strictly necessary; they are not intended to be a complete interface to any garbage collector. A program will need to interface with the GC library using the facilities provided by that program. -.. _gcattr: - Specifying GC code generation: ``gc "..."`` ------------------------------------------- @@ -392,8 +378,6 @@ could be compiled to this LLVM code: store %Object* null, %Object** %X ... -.. _barriers: - Reading and writing references in the heap ------------------------------------------ @@ -423,15 +407,13 @@ pointer: %derived = getelementptr %object, i32 0, i32 2, i32 %n LLVM does not enforce this relationship between the object and derived pointer -(although a plugin_ might). However, it would be an unusual collector that -violated it. +(although a :ref:`plugin <plugin>` might). However, it would be an unusual +collector that violated it. The use of these intrinsics is naturally optional if the target GC does require the corresponding barrier. Such a GC plugin will replace the intrinsic calls with the corresponding ``load`` or ``store`` instruction if they are used. -.. _gcwrite: - Write barrier: ``llvm.gcwrite`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -442,14 +424,12 @@ Write barrier: ``llvm.gcwrite`` For write barriers, LLVM provides the ``llvm.gcwrite`` intrinsic function. It has exactly the same semantics as a non-volatile ``store`` to the derived pointer (the third argument). The exact code generated is specified by a -compiler plugin_. +compiler :ref:`plugin <plugin>`. Many important algorithms require write barriers, including generational and concurrent collectors. Additionally, write barriers could be used to implement reference counting. -.. _gcread: - Read barrier: ``llvm.gcread`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -459,8 +439,8 @@ Read barrier: ``llvm.gcread`` For read barriers, LLVM provides the ``llvm.gcread`` intrinsic function. It has exactly the same semantics as a non-volatile ``load`` from the derived pointer -(the second argument). The exact code generated is specified by a compiler -plugin_. +(the second argument). The exact code generated is specified by a +:ref:`compiler plugin <plugin>`. Read barriers are needed by fewer algorithms than write barriers, and may have a greater performance impact since pointer reads are more frequent than writes. @@ -739,8 +719,6 @@ Since LLVM does not yet compute liveness information, there is no means of distinguishing an uninitialized stack root from an initialized one. Therefore, this feature should be used by all GC plugins. It is enabled by default. -.. 
_custom: - Custom lowering of intrinsics: ``CustomRoots``, ``CustomReadBarriers``, and ``CustomWriteBarriers`` --------------------------------------------------------------------------------------------------- @@ -777,10 +755,10 @@ If ``CustomReadBarriers`` or ``CustomWriteBarriers`` are specified, then ``performCustomLowering`` **must** eliminate the corresponding barriers. ``performCustomLowering`` must comply with the same restrictions as -`FunctionPass::runOnFunction <WritingAnLLVMPass.html#runOnFunction>`__ +:ref:`FunctionPass::runOnFunction <writing-an-llvm-pass-runOnFunction>` Likewise, ``initializeCustomLowering`` has the same semantics as -`Pass::doInitialization(Module&) -<WritingAnLLVMPass.html#doInitialization_mod>`__ +:ref:`Pass::doInitialization(Module&) +<writing-an-llvm-pass-doInitialization-mod>` The following can be used as a template: diff --git a/docs/GetElementPtr.rst b/docs/GetElementPtr.rst index 3b57d78cf1..306a2a87ef 100644 --- a/docs/GetElementPtr.rst +++ b/docs/GetElementPtr.rst @@ -1,5 +1,3 @@ -.. _gep: - ======================================= The Often Misunderstood GEP Instruction ======================================= diff --git a/docs/GettingStarted.rst b/docs/GettingStarted.rst index 8902684c98..539c75e2d7 100644 --- a/docs/GettingStarted.rst +++ b/docs/GettingStarted.rst @@ -1,9 +1,10 @@ -.. _getting_started: - ==================================== Getting Started with the LLVM System ==================================== +.. contents:: + :local: + Overview ======== @@ -68,27 +69,24 @@ Here's the short story for getting up and running quickly with LLVM: * ``../llvm/configure [options]`` Some common options: - * ``--prefix=directory`` --- - - Specify for *directory* the full pathname of where you want the LLVM - tools and libraries to be installed (default ``/usr/local``). + * ``--prefix=directory`` --- Specify for *directory* the full pathname of + where you want the LLVM tools and libraries to be installed (default + ``/usr/local``). - * ``--enable-optimized`` --- + * ``--enable-optimized`` --- Compile with optimizations enabled (default + is NO). - Compile with optimizations enabled (default is NO). - - * ``--enable-assertions`` --- - - Compile with assertion checks enabled (default is YES). + * ``--enable-assertions`` --- Compile with assertion checks enabled + (default is YES). * ``make [-j]`` --- The ``-j`` specifies the number of jobs (commands) to run simultaneously. This builds both LLVM and Clang for Debug+Asserts mode. - The --enabled-optimized configure option is used to specify a Release + The ``--enabled-optimized`` configure option is used to specify a Release build. * ``make check-all`` --- This run the regression tests to ensure everything is in working order. - + * ``make update`` --- This command is used to update all the svn repositories at once, rather then having to ``cd`` into the individual repositories and running ``svn update``. @@ -126,6 +124,8 @@ LLVM is known to work on the following platforms: +-----------------+----------------------+-------------------------+ |Linux | amd64 | GCC | +-----------------+----------------------+-------------------------+ +|Linux | ARM\ :sup:`13` | GCC | ++-----------------+----------------------+-------------------------+ |Solaris | V9 (Ultrasparc) | GCC | +-----------------+----------------------+-------------------------+ |FreeBSD | x86\ :sup:`1` | GCC | @@ -161,8 +161,6 @@ LLVM has partial support for the following platforms: .. 
note:: - Code generation supported for Pentium processors and up - #. Code generation supported for Pentium processors and up #. Code generation supported for 32-bit ABI only #. No native code generation @@ -182,9 +180,9 @@ LLVM has partial support for the following platforms: Windows-specifics that will cause the build to fail. #. To use LLVM modules on Win32-based system, you may configure LLVM with ``--enable-shared``. - #. To compile SPU backend, you need to add ``LDFLAGS=-Wl,--stack,16777216`` to configure. + #. MCJIT not working well pre-v7, old JIT engine not supported any more. Note that you will need about 1-3 GB of space for a full LLVM build in Debug mode, depending on the system (it is so large because of all the debugging @@ -219,11 +217,7 @@ uses the package and provides other details. +--------------------------------------------------------------+-----------------+---------------------------------------------+ | `SVN <http://subversion.tigris.org/project_packages.html>`_ | >=1.3 | Subversion access to LLVM\ :sup:`2` | +--------------------------------------------------------------+-----------------+---------------------------------------------+ -| `DejaGnu <http://savannah.gnu.org/projects/dejagnu>`_ | 1.4.2 | Automated test suite\ :sup:`3` | -+--------------------------------------------------------------+-----------------+---------------------------------------------+ -| `tcl <http://www.tcl.tk/software/tcltk/>`_ | 8.3, 8.4 | Automated test suite\ :sup:`3` | -+--------------------------------------------------------------+-----------------+---------------------------------------------+ -| `expect <http://expect.nist.gov/>`_ | 5.38.0 | Automated test suite\ :sup:`3` | +| `python <http://www.python.org/>`_ | >=2.4 | Automated test suite\ :sup:`3` | +--------------------------------------------------------------+-----------------+---------------------------------------------+ | `perl <http://www.perl.com/download.csp>`_ | >=5.6.0 | Utilities | +--------------------------------------------------------------+-----------------+---------------------------------------------+ @@ -368,6 +362,9 @@ optimizations are turned on. The symptom is an infinite loop in ``-O0``. A test failure in ``test/Assembler/alignstack.ll`` is one symptom of the problem. +**GCC 4.6.3 on ARM**: Miscompiles ``llvm-readobj`` at ``-O3``. A test failure +in ``test/Object/readobj-shared-object.test`` is one symptom of the problem. + **GNU ld 2.16.X**. Some 2.16.X versions of the ld linker will produce very long warning messages complaining that some "``.gnu.linkonce.t.*``" symbol was defined in a discarded section. You can safely ignore these messages as they are @@ -384,6 +381,14 @@ intermittent failures when building LLVM with position independent code. The symptom is an error about cyclic dependencies. We recommend upgrading to a newer version of Gold. +**Clang 3.0 with libstdc++ 4.7.x**: a few Linux distributions (Ubuntu 12.10, +Fedora 17) have both Clang 3.0 and libstdc++ 4.7 in their repositories. Clang +3.0 does not implement a few builtins that are used in this library. We +recommend using the system GCC to compile LLVM and Clang in this case. + +**Clang 3.0 on Mageia 2**. There's a packaging issue: Clang can not find at +least some (``cxxabi.h``) libstdc++ headers. + .. _Getting Started with LLVM: Getting Started with LLVM @@ -459,6 +464,8 @@ The files are as follows, with *x.y* marking the version number: Binary release of the llvm-gcc-4.2 front end for a specific platform. +.. 
_checkout: + Checkout LLVM from Subversion ----------------------------- @@ -505,7 +512,7 @@ directory: If you would like to get the LLVM test suite (a separate package as of 1.4), you get it from the Subversion repository: -.. code-block:: bash +.. code-block:: console % cd llvm/projects % svn co http://llvm.org/svn/llvm-project/test-suite/trunk test-suite @@ -523,13 +530,13 @@ marks (so, you can recreate git-svn metadata locally). Note that right now mirrors reflect only ``trunk`` for each project. You can do the read-only GIT clone of LLVM via: -.. code-block:: bash +.. code-block:: console % git clone http://llvm.org/git/llvm.git If you want to check out clang too, run: -.. code-block:: bash +.. code-block:: console % git clone http://llvm.org/git/llvm.git % cd llvm/tools @@ -540,7 +547,7 @@ pull --rebase`` instead of ``git pull`` to avoid generating a non-linear history in your clone. To configure ``git pull`` to pass ``--rebase`` by default on the master branch, run the following command: -.. code-block:: bash +.. code-block:: console % git config branch.master.rebase true @@ -553,13 +560,13 @@ Assume ``master`` points the upstream and ``mybranch`` points your working branch, and ``mybranch`` is rebased onto ``master``. At first you may check sanity of whitespaces: -.. code-block:: bash +.. code-block:: console % git diff --check master..mybranch The easiest way to generate a patch is as below: -.. code-block:: bash +.. code-block:: console % git diff master..mybranch > /path/to/mybranch.diff @@ -570,14 +577,14 @@ could be accepted with ``patch -p1 -N``. But you may generate patchset with git-format-patch. It generates by-each-commit patchset. To generate patch files to attach to your article: -.. code-block:: bash +.. code-block:: console % git format-patch --no-attach master..mybranch -o /path/to/your/patchset If you would like to send patches directly, you may use git-send-email or git-imap-send. Here is an example to generate the patchset in Gmail's [Drafts]. -.. code-block:: bash +.. code-block:: console % git format-patch --attach master..mybranch --stdout | git imap-send @@ -603,7 +610,7 @@ For developers to work with git-svn To set up clone from which you can submit code using ``git-svn``, run: -.. code-block:: bash +.. code-block:: console % git clone http://llvm.org/git/llvm.git % cd llvm @@ -622,7 +629,7 @@ To set up clone from which you can submit code using ``git-svn``, run: To update this clone without generating git-svn tags that conflict with the upstream git repo, run: -.. code-block:: bash +.. code-block:: console % git fetch && (cd tools/clang && git fetch) # Get matching revisions of both trees. % git checkout master @@ -633,18 +640,61 @@ upstream git repo, run: This leaves your working directories on their master branches, so you'll need to ``checkout`` each working branch individually and ``rebase`` it on top of its -parent branch. (Note: This script is intended for relative newbies to git. If -you have more experience, you can likely improve on it.) +parent branch. + +For those who wish to be able to update an llvm repo in a simpler fashion, +consider placing the following git script in your path under the name +``git-svnup``: + +.. code-block:: bash + + #!/bin/bash + + STATUS=$(git status -s | grep -v "??") + + if [ ! -z "$STATUS" ]; then + STASH="yes" + git stash >/dev/null + fi + + git fetch + OLD_BRANCH=$(git rev-parse --abbrev-ref HEAD) + git checkout master 2> /dev/null + git svn rebase -l + git checkout $OLD_BRANCH 2> /dev/null + + if [ ! 
-z $STASH ]; then + git stash pop >/dev/null + fi + +Then to perform the aforementioned update steps go into your source directory +and just type ``git-svnup`` or ``git svnup`` and everything will just work. + +To commit back changes via git-svn, use ``dcommit``: + +.. code-block:: console + + % git svn dcommit + +Note that git-svn will create one SVN commit for each Git commit you have pending, +so squash and edit each commit before executing ``dcommit`` to make sure they all +conform to the coding standards and the developers' policy. + +On success, ``dcommit`` will rebase against the HEAD of SVN, so to avoid conflict, +please make sure your current branch is up-to-date (via fetch/rebase) before +proceeding. The git-svn metadata can get out of sync after you mess around with branches and ``dcommit``. When that happens, ``git svn dcommit`` stops working, complaining about files with uncommitted changes. The fix is to rebuild the metadata: -.. code-block:: bash +.. code-block:: console % rm -rf .git/svn % git svn rebase -l +Please, refer to the Git-SVN manual (``man git-svn``) for more information. + Local LLVM Configuration ------------------------ @@ -661,14 +711,15 @@ configure the build system: | Variable | Purpose | +============+===========================================================+ | CC | Tells ``configure`` which C compiler to use. By default, | -| | ``configure`` will look for the first GCC C compiler in | -| | ``PATH``. Use this variable to override ``configure``\'s | -| | default behavior. | +| | ``configure`` will check ``PATH`` for ``clang`` and GCC C | +| | compilers (in this order). Use this variable to override | +| | ``configure``\'s default behavior. | +------------+-----------------------------------------------------------+ | CXX | Tells ``configure`` which C++ compiler to use. By | -| | default, ``configure`` will look for the first GCC C++ | -| | compiler in ``PATH``. Use this variable to override | -| | ``configure``'s default behavior. | +| | default, ``configure`` will check ``PATH`` for | +| | ``clang++`` and GCC C++ compilers (in this order). Use | +| | this variable to override ``configure``'s default | +| | behavior. | +------------+-----------------------------------------------------------+ The following options can be used to set or enable LLVM specific options: @@ -722,13 +773,13 @@ To configure LLVM, follow these steps: #. Change directory into the object root directory: - .. code-block:: bash + .. code-block:: console % cd OBJ_ROOT #. Run the ``configure`` script located in the LLVM source tree: - .. code-block:: bash + .. code-block:: console % SRC_ROOT/configure --prefix=/install/path [other options] @@ -764,7 +815,7 @@ Profile Builds Once you have LLVM configured, you can build it by entering the *OBJ_ROOT* directory and issuing the following command: -.. code-block:: bash +.. code-block:: console % gmake @@ -775,7 +826,7 @@ If you have multiple processors in your machine, you may wish to use some of the parallel build options provided by GNU Make. For example, you could use the command: -.. code-block:: bash +.. code-block:: console % gmake -j2 @@ -857,7 +908,7 @@ For instructions on how to install Sphinx, see After following the instructions there for installing Sphinx, build the LLVM HTML documentation by doing the following: -.. code-block:: bash +.. 
code-block:: console $ cd SRC_ROOT/docs $ make -f Makefile.sphinx @@ -893,13 +944,13 @@ This is accomplished in the typical autoconf manner: * Change directory to where the LLVM object files should live: - .. code-block:: bash + .. code-block:: console % cd OBJ_ROOT * Run the ``configure`` script found in the LLVM source directory: - .. code-block:: bash + .. code-block:: console % SRC_ROOT/configure @@ -945,7 +996,7 @@ module, and you have root access on the system, you can set your system up to execute LLVM bitcode files directly. To do this, use commands like this (the first command may not be required if you are already using the module): -.. code-block:: bash +.. code-block:: console % mount -t binfmt_misc none /proc/sys/fs/binfmt_misc % echo ':llvm:M::BC::/path/to/lli:' > /proc/sys/fs/binfmt_misc/register @@ -955,7 +1006,7 @@ first command may not be required if you are already using the module): This allows you to execute LLVM bitcode files directly. On Debian, you can also use this command instead of the 'echo' command above: -.. code-block:: bash +.. code-block:: console % sudo update-binfmts --install llvm /path/to/lli --magic 'BC' @@ -1246,7 +1297,7 @@ Example with clang #. Next, compile the C file into a native executable: - .. code-block:: bash + .. code-block:: console % clang hello.c -o hello @@ -1257,7 +1308,7 @@ Example with clang #. Next, compile the C file into a LLVM bitcode file: - .. code-block:: bash + .. code-block:: console % clang -O3 -emit-llvm hello.c -c -o hello.bc @@ -1267,13 +1318,13 @@ Example with clang #. Run the program in both forms. To run the program, use: - .. code-block:: bash + .. code-block:: console % ./hello and - .. code-block:: bash + .. code-block:: console % lli hello.bc @@ -1282,27 +1333,27 @@ Example with clang #. Use the ``llvm-dis`` utility to take a look at the LLVM assembly code: - .. code-block:: bash + .. code-block:: console % llvm-dis < hello.bc | less #. Compile the program to native assembly using the LLC code generator: - .. code-block:: bash + .. code-block:: console % llc hello.bc -o hello.s #. Assemble the native assembly language file into a program: - .. code-block:: bash + .. code-block:: console - **Solaris:** % /opt/SUNWspro/bin/cc -xarch=v9 hello.s -o hello.native + % /opt/SUNWspro/bin/cc -xarch=v9 hello.s -o hello.native # On Solaris - **Others:** % gcc hello.s -o hello.native + % gcc hello.s -o hello.native # On others #. Execute the native code program: - .. code-block:: bash + .. code-block:: console % ./hello.native diff --git a/docs/GettingStartedVS.rst b/docs/GettingStartedVS.rst index 35f97f04b9..4c80f2c57b 100644 --- a/docs/GettingStartedVS.rst +++ b/docs/GettingStartedVS.rst @@ -1,5 +1,3 @@ -.. _winvs: - ================================================================== Getting Started with the LLVM System using Microsoft Visual Studio ================================================================== diff --git a/docs/GoldPlugin.rst b/docs/GoldPlugin.rst index 300aea9f9a..17bbeb8ba9 100644 --- a/docs/GoldPlugin.rst +++ b/docs/GoldPlugin.rst @@ -1,11 +1,7 @@ -.. _gold-plugin: - ==================== The LLVM gold plugin ==================== -.. sectionauthor:: Nick Lewycky - Introduction ============ diff --git a/docs/HowToAddABuilder.rst b/docs/HowToAddABuilder.rst index b0cd2907f9..893f12d19d 100644 --- a/docs/HowToAddABuilder.rst +++ b/docs/HowToAddABuilder.rst @@ -1,11 +1,7 @@ -.. 
_how_to_add_a_builder: - =================================================================== How To Add Your Build Configuration To LLVM Buildbot Infrastructure =================================================================== -.. sectionauthor:: Galina Kistanova <gkistanova@gmail.com> - Introduction ============ diff --git a/docs/HowToBuildOnARM.rst b/docs/HowToBuildOnARM.rst index d786a7deda..32ae39ba68 100644 --- a/docs/HowToBuildOnARM.rst +++ b/docs/HowToBuildOnARM.rst @@ -1,11 +1,7 @@ -.. _how_to_build_on_arm: - =================================================================== How To Build On ARM =================================================================== -.. sectionauthor:: Wei-Ren Chen (陳韋任) <chenwj@iis.sinica.edu.tw> - Introduction ============ @@ -40,8 +36,8 @@ on the ARMv6 and ARMv7 architectures and may be inapplicable to older chips. .. code-block:: bash - ./configure --build=armv7l-unknown-linux-gnueabihf - --host=armv7l-unknown-linux-gnueabihf - --target=armv7l-unknown-linux-gnueabihf --with-cpu=cortex-a9 - --with-float=hard --with-abi=aapcs-vfp --with-fpu=neon - --enable-targets=arm --disable-optimized --enable-assertions + ./configure --build=armv7l-unknown-linux-gnueabihf \ + --host=armv7l-unknown-linux-gnueabihf \ + --target=armv7l-unknown-linux-gnueabihf --with-cpu=cortex-a9 \ + --with-float=hard --with-abi=aapcs-vfp --with-fpu=neon \ + --enable-targets=arm --enable-optimized --enable-assertions diff --git a/docs/HowToReleaseLLVM.rst b/docs/HowToReleaseLLVM.rst index eb6c838a21..31877bd35a 100644 --- a/docs/HowToReleaseLLVM.rst +++ b/docs/HowToReleaseLLVM.rst @@ -6,11 +6,6 @@ How To Release LLVM To The Public :local: :depth: 1 -.. sectionauthor:: Tanya Lattner <tonic@nondot.org>, - Reid Spencer <rspencer@x10sys.com>, - John Criswell <criswell@cs.uiuc.edu> and - Bill Wendling <wendling@apple.com> - Introduction ============ @@ -201,7 +196,7 @@ Build LLVM Build ``Debug``, ``Release+Asserts``, and ``Release`` versions of ``llvm`` on all supported platforms. Directions to build ``llvm`` -are :ref:`here <getting_started>`. +are :doc:`here <GettingStarted>`. Build Clang Binary Distribution ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -268,7 +263,7 @@ no regressions when using either ``clang`` or ``dragonegg`` with the Qualify Clang ^^^^^^^^^^^^^ -``Clang`` is qualified when front-end specific tests in the ``llvm`` dejagnu +``Clang`` is qualified when front-end specific tests in the ``llvm`` regression test suite all pass, clang's own test suite passes cleanly, and there are no regressions in the ``test-suite``. 
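As a rough sketch of the regression-test half of this qualification, assuming an autoconf build tree set up as described in the Getting Started guide (the external ``test-suite`` run and the comparison against the previous release are configured separately and are not shown here; the target below is illustrative only):

.. code-block:: console

   % make -j4 check-all   # runs the llvm and clang regression test suites
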
@@ -278,26 +273,26 @@ Specific Target Qualification Details +--------------+-------------+----------------+-----------------------------+ | Architecture | OS | clang baseline | tests | +==============+=============+================+=============================+ -| x86-32 | Linux | last release | llvm dejagnu, | -| | | | clang tests, | +| x86-32 | Linux | last release | llvm regression tests, | +| | | | clang regression tests, | | | | | test-suite (including spec) | +--------------+-------------+----------------+-----------------------------+ -| x86-32 | FreeBSD | last release | llvm dejagnu, | -| | | | clang tests, | +| x86-32 | FreeBSD | last release | llvm regression tests, | +| | | | clang regression tests, | | | | | test-suite | +--------------+-------------+----------------+-----------------------------+ | x86-32 | mingw | none | QT | +--------------+-------------+----------------+-----------------------------+ -| x86-64 | Mac OS 10.X | last release | llvm dejagnu, | -| | | | clang tests, | +| x86-64 | Mac OS 10.X | last release | llvm regression tests, | +| | | | clang regression tests, | | | | | test-suite (including spec) | +--------------+-------------+----------------+-----------------------------+ -| x86-64 | Linux | last release | llvm dejagnu, | -| | | | clang tests, | +| x86-64 | Linux | last release | llvm regression tests, | +| | | | clang regression tests, | | | | | test-suite (including spec) | +--------------+-------------+----------------+-----------------------------+ -| x86-64 | FreeBSD | last release | llvm dejagnu, | -| | | | clang tests, | +| x86-64 | FreeBSD | last release | llvm regression tests, | +| | | | clang regression tests, | | | | | test-suite | +--------------+-------------+----------------+-----------------------------+ diff --git a/docs/HowToSetUpLLVMStyleRTTI.rst b/docs/HowToSetUpLLVMStyleRTTI.rst index aa1ad84afe..b906b25621 100644 --- a/docs/HowToSetUpLLVMStyleRTTI.rst +++ b/docs/HowToSetUpLLVMStyleRTTI.rst @@ -1,11 +1,7 @@ -.. _how-to-set-up-llvm-style-rtti: - ====================================================== How to set up LLVM-style RTTI for your class hierarchy ====================================================== -.. sectionauthor:: Sean Silva <silvas@purdue.edu> - .. contents:: Background diff --git a/docs/HowToSubmitABug.rst b/docs/HowToSubmitABug.rst index ff2d649ce3..45be2826b3 100644 --- a/docs/HowToSubmitABug.rst +++ b/docs/HowToSubmitABug.rst @@ -1,11 +1,7 @@ -.. _how-to-submit-a-bug-report: - ================================ How to submit an LLVM bug report ================================ -.. sectionauthor:: Chris Lattner <sabre@nondot.org> and Misha Brukman <http://misha.brukman.net> - Introduction - Got bugs? ======================== diff --git a/docs/HowToUseAttributes.rst b/docs/HowToUseAttributes.rst new file mode 100644 index 0000000000..66c44c01f6 --- /dev/null +++ b/docs/HowToUseAttributes.rst @@ -0,0 +1,81 @@ +===================== +How To Use Attributes +===================== + +.. contents:: + :local: + +Introduction +============ + +Attributes in LLVM have changed in some fundamental ways. It was necessary to +do this to support expanding the attributes to encompass more than a handful of +attributes --- e.g. command line options. The old way of handling attributes +consisted of representing them as a bit mask of values. This bit mask was +stored in a "list" structure that was reference counted. The advantage of this +was that attributes could be manipulated with 'or's and 'and's. 
The +disadvantage of this was that there was limited room for expansion, and +virtually no support for attribute-value pairs other than alignment. + +In the new scheme, an ``Attribute`` object represents a single attribute that's +uniqued. You use the ``Attribute::get`` methods to create a new ``Attribute`` +object. An attribute can be a single "enum" value (the enum being the +``Attribute::AttrKind`` enum), a string representing a target-dependent +attribute, or an attribute-value pair. Some examples: + +* Target-independent: ``noinline``, ``zext`` +* Target-dependent: ``"no-sse"``, ``"thumb2"`` +* Attribute-value pair: ``"cpu" = "cortex-a8"``, ``align = 4`` + +Note: for an attribute value pair, we expect a target-dependent attribute to +have a string for the value. + +``Attribute`` +============= +An ``Attribute`` object is designed to be passed around by value. + +Because attributes are no longer represented as a bit mask, you will need to +convert any code which does treat them as a bit mask to use the new query +methods on the Attribute class. + +``AttributeSet`` +================ + +The ``AttributeSet`` class replaces the old ``AttributeList`` class. The +``AttributeSet`` stores a collection of Attribute objects for each kind of +object that may have an attribute associated with it: the function as a +whole, the return type, or the function's parameters. A function's attributes +are at index ``AttributeSet::FunctionIndex``; the return type's attributes are +at index ``AttributeSet::ReturnIndex``; and the function's parameters' +attributes are at indices 1, ..., n (where 'n' is the number of parameters). +Most methods on the ``AttributeSet`` class take an index parameter. + +An ``AttributeSet`` is also a uniqued and immutable object. You create an +``AttributeSet`` through the ``AttributeSet::get`` methods. You can add and +remove attributes, which result in the creation of a new ``AttributeSet``. + +An ``AttributeSet`` object is designed to be passed around by value. + +Note: It is advised that you do *not* use the ``AttributeSet`` "introspection" +methods (e.g. ``Raw``, ``getRawPointer``, etc.). These methods break +encapsulation, and may be removed in a future release (i.e. LLVM 4.0). + +``AttrBuilder`` +=============== + +Lastly, we have a "builder" class to help create the ``AttributeSet`` object +without having to create several different intermediate uniqued +``AttributeSet`` objects. The ``AttrBuilder`` class allows you to add and +remove attributes at will. The attributes won't be uniqued until you call the +appropriate ``AttributeSet::get`` method. + +An ``AttrBuilder`` object is *not* designed to be passed around by value. It +should be passed by reference. + +Note: It is advised that you do *not* use the ``AttrBuilder::addRawValue()`` +method or the ``AttrBuilder(uint64_t Val)`` constructor. These are for +backwards compatibility and may be removed in a future release (i.e. LLVM 4.0). + +And that's basically it! A lot of functionality is hidden behind these classes, +but the interfaces are pretty straight forward. + diff --git a/docs/HowToUseInstrMappings.rst b/docs/HowToUseInstrMappings.rst index bf9278e770..8a3e7c8d72 100644 --- a/docs/HowToUseInstrMappings.rst +++ b/docs/HowToUseInstrMappings.rst @@ -1,11 +1,7 @@ -.. _how_to_use_instruction_mappings: - =============================== How To Use Instruction Mappings =============================== -.. sectionauthor:: Jyotsna Verma <jverma@codeaurora.org> - .. 
contents:: :local: diff --git a/docs/LangRef.rst b/docs/LangRef.rst index 1ea475dee6..03004f66df 100644 --- a/docs/LangRef.rst +++ b/docs/LangRef.rst @@ -111,7 +111,7 @@ After strength reduction: .. code-block:: llvm - %result = shl i32 %X, i8 3 + %result = shl i32 %X, 3 And the hard way: @@ -148,20 +148,20 @@ symbol table entries. Here is an example of the "hello world" module: .. code-block:: llvm - ; Declare the string constant as a global constant. - @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00" + ; Declare the string constant as a global constant. + @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00" - ; External declaration of the puts function - declare i32 @puts(i8* nocapture) nounwind + ; External declaration of the puts function + declare i32 @puts(i8* nocapture) nounwind ; Definition of main function - define i32 @main() { ; i32()*  - ; Convert [13 x i8]* to i8 *... + define i32 @main() { ; i32()* + ; Convert [13 x i8]* to i8 *... %cast210 = getelementptr [13 x i8]* @.str, i64 0, i64 0 - ; Call puts function to write out the string to stdout. + ; Call puts function to write out the string to stdout. call i32 @puts(i8* %cast210) - ret i32 0 + ret i32 0 } ; Named metadata @@ -262,7 +262,7 @@ linkage: Some languages allow differing globals to be merged, such as two functions with different semantics. Other languages, such as ``C++``, ensure that only equivalent globals are ever merged (the - "one definition rule" — "ODR"). Such languages can use the + "one definition rule" --- "ODR"). Such languages can use the ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the global will only be merged with equivalent globals. These linkage types are otherwise the same as their non-``odr`` versions. @@ -465,11 +465,11 @@ more information on under which circumstances the different models may be used. The target may choose a different TLS model if the specified model is not supported, or if a better choice of model can be made. -A variable may be defined as a global "constant," which indicates that +A variable may be defined as a global ``constant``, which indicates that the contents of the variable will **never** be modified (enabling better optimization, allowing the global data to be placed in the read-only section of an executable, etc). Note that variables that need runtime -initialization cannot be marked "constant" as there is a store to the +initialization cannot be marked ``constant`` as there is a store to the variable. LLVM explicitly allows *declarations* of global variables to be marked @@ -501,6 +501,14 @@ is zero. The address space qualifier must precede any other attributes. LLVM allows an explicit section to be specified for globals. If the target supports it, it will emit globals to the section specified. +By default, global initializers are optimized by assuming that global +variables defined within the module are not modified from their +initial values before the start of the global initializer. This is +true even for variables potentially accessible from outside the +module, including those with external linkage or appearing in +``@llvm.used``. This assumption may be suppressed by marking the +variable with ``externally_initialized``. + An explicit alignment may be specified for a global, which must be a power of 2. 
If not present, or if the alignment is set to zero, the alignment of the global is set by the target to whatever it feels @@ -679,7 +687,7 @@ Currently, only the following parameter attributes are defined: This indicates that the pointer parameter specifies the address of a structure that is the return value of the function in the source program. This pointer must be guaranteed by the caller to be valid: - loads and stores to the structure may be assumed by the callee to + loads and stores to the structure may be assumed by the callee not to trap and to be properly aligned. This may only be applied to the first parameter. This is not a valid attribute for return values. @@ -712,6 +720,11 @@ Currently, only the following parameter attributes are defined: This indicates that the pointer parameter can be excised using the :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid attribute for return values. +``nobuiltin`` + This indicates that the callee function at a call site is not + recognized as a built-in function. LLVM will retain the original call + and not replace it with equivalent code based on the semantics of the + built-in function. .. _gc: @@ -729,6 +742,36 @@ The compiler declares the supported values of *name*. Specifying a collector which will cause the compiler to alter its output in order to support the named garbage collection algorithm. +.. _attrgrp: + +Attribute Groups +---------------- + +Attribute groups are groups of attributes that are referenced by objects within +the IR. They are important for keeping ``.ll`` files readable, because a lot of +functions will use the same set of attributes. In the degenerative case of a +``.ll`` file that corresponds to a single ``.c`` file, the single attribute +group will capture the important command line flags used to build that file. + +An attribute group is a module-level object. To use an attribute group, an +object references the attribute group's ID (e.g. ``#37``). An object may refer +to more than one attribute group. In that situation, the attributes from the +different groups are merged. + +Here is an example of attribute groups for a function that should always be +inlined, has a stack alignment of 4, and which shouldn't use SSE instructions: + +.. code-block:: llvm + + ; Target-independent attributes: + #0 = attributes { alwaysinline alignstack=4 } + + ; Target-dependent attributes: + #1 = attributes { "no-sse" } + + ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse". + define void @f() #0 #1 { ... } + .. _fnattrs: Function Attributes @@ -750,9 +793,6 @@ example: define void @f() alwaysinline optsize { ... } define void @f() optsize { ... } -``address_safety`` - This attribute indicates that the address safety analysis is enabled - for this function. ``alignstack(<n>)`` This attribute indicates that, when emitting the prologue and epilogue, the backend should forcibly align the stack pointer. @@ -774,6 +814,17 @@ example: ``naked`` This attribute disables prologue / epilogue emission for the function. This can have very system-specific consequences. +``noduplicate`` + This attribute indicates that calls to the function cannot be + duplicated. A call to a ``noduplicate`` function may be moved + within its parent function, but may not be duplicated within + its parent function. + + A function containing a ``noduplicate`` call may still + be an inlining candidate, provided that the call is not + duplicated by inlining. 
That implies that the function has + internal linkage and only has one call site, so the original + call is dead after inlining. ``noimplicitfloat`` This attributes disables implicit floating point instructions. ``noinline`` @@ -819,13 +870,27 @@ example: ``setjmp`` is an example of such a function. The compiler disables some optimizations (like tail calls) in the caller of these functions. +``sanitize_address`` + This attribute indicates that AddressSanitizer checks + (dynamic address safety analysis) are enabled for this function. +``sanitize_memory`` + This attribute indicates that MemorySanitizer checks (dynamic detection + of accesses to uninitialized memory) are enabled for this function. +``sanitize_thread`` + This attribute indicates that ThreadSanitizer checks + (dynamic thread safety analysis) are enabled for this function. ``ssp`` This attribute indicates that the function should emit a stack - smashing protector. It is in the form of a "canary"—a random value + smashing protector. It is in the form of a "canary" --- a random value placed on the stack before the local variables that's checked upon return from the function to see if it has been overwritten. A heuristic is used to determine if a function needs stack protectors - or not. + or not. The heuristic used will enable protectors for functions with: + + - Character arrays larger than ``ssp-buffer-size`` (default 8). + - Aggregates containing character arrays larger than ``ssp-buffer-size``. + - Calls to alloca() with variable sizes or constant sizes greater than + ``ssp-buffer-size``. If a function that has an ``ssp`` attribute is inlined into a function that doesn't have an ``ssp`` attribute, then the resulting @@ -837,8 +902,24 @@ example: If a function that has an ``sspreq`` attribute is inlined into a function that doesn't have an ``sspreq`` attribute or which has an - ``ssp`` attribute, then the resulting function will have an - ``sspreq`` attribute. + ``ssp`` or ``sspstrong`` attribute, then the resulting function will have + an ``sspreq`` attribute. +``sspstrong`` + This attribute indicates that the function should emit a stack smashing + protector. This attribute causes a strong heuristic to be used when + determining if a function needs stack protectors. The strong heuristic + will enable protectors for functions with: + + - Arrays of any size and type + - Aggregates containing an array of any size and type. + - Calls to alloca(). + - Local variables that have had their address taken. + + This overrides the ``ssp`` function attribute. + + If a function that has an ``sspstrong`` attribute is inlined into a + function that doesn't have an ``sspstrong`` attribute, then the + resulting function will have an ``sspstrong`` attribute. 
``uwtable`` This attribute indicates that the ABI being targeted requires that an unwind table entry be produce for this function even if we can @@ -939,22 +1020,20 @@ specifications are given in this list: - ``E`` - big endian - ``p:64:64:64`` - 64-bit pointers with 64-bit alignment -- ``p1:32:32:32`` - 32-bit pointers with 32-bit alignment for address - space 1 -- ``p2:16:32:32`` - 16-bit pointers with 32-bit alignment for address - space 2 +- ``S0`` - natural stack alignment is unspecified - ``i1:8:8`` - i1 is 8-bit (byte) aligned - ``i8:8:8`` - i8 is 8-bit (byte) aligned - ``i16:16:16`` - i16 is 16-bit aligned - ``i32:32:32`` - i32 is 32-bit aligned - ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred alignment of 64-bits +- ``f16:16:16`` - half is 16-bit aligned - ``f32:32:32`` - float is 32-bit aligned - ``f64:64:64`` - double is 64-bit aligned +- ``f128:128:128`` - quad is 128-bit aligned - ``v64:64:64`` - 64-bit vector is 64-bit aligned - ``v128:128:128`` - 128-bit vector is 128-bit aligned -- ``a0:0:1`` - aggregates are 8-bit aligned -- ``s0:64:64`` - stack objects are 64-bit aligned +- ``a0:0:64`` - aggregates are 64-bit aligned When LLVM is determining the alignment for a given type, it uses the following rules: @@ -1050,6 +1129,21 @@ volatile operations. The optimizers *may* change the order of volatile operations relative to non-volatile operations. This is not Java's "volatile" and has no cross-thread synchronization behavior. +IR-level volatile loads and stores cannot safely be optimized into +llvm.memcpy or llvm.memmove intrinsics even when those intrinsics are +flagged volatile. Likewise, the backend should never split or merge +target-legal volatile load/store instructions. + +.. admonition:: Rationale + + Platforms may rely on volatile loads and stores of natively supported + data width to be executed as single instruction. For example, in C + this holds for an l-value of volatile primitive type with native + hardware support, but not necessarily for aggregate types. The + frontend upholds these expectations, which are intentionally + unspecified in the IR. The rules above ensure that IR transformation + do not violate the frontend's contract with the language. + .. _memmodel: Memory Model for Concurrent Operations @@ -1543,7 +1637,7 @@ Examples: +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ``i32 (i32)`` | function taking an ``i32``, returning an ``i32`` | +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``float (i16, i32 *) *`` | :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a :ref:`pointer <t_pointer>` to ``i32``, returning ``float``. | +| ``float (i16, i32 *) *`` | :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a :ref:`pointer <t_pointer>` to ``i32``, returning ``float``. | +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ``i32 (i8*, ...)`` | A vararg function that takes at least one :ref:`pointer <t_pointer>` to ``i8`` (char in C), which returns an integer. This is the signature for ``printf`` in LLVM. 
| +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ @@ -1594,7 +1688,7 @@ Examples: +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ``{ i32, i32, i32 }`` | A triple of three ``i32`` values | +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -| ``{ float, i32 (i32) * }`` | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32``, returning an ``i32``. | +| ``{ float, i32 (i32) * }`` | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32``, returning an ``i32``. | +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ``<{ i8, i32 }>`` | A packed struct known to be 5 bytes in size. | +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ @@ -1743,7 +1837,7 @@ and disassembly do not cause any bits to change in the constants. When using the hexadecimal form, constants of types half, float, and double are represented using the 16-digit form shown above (which matches the IEEE754 representation for double); half and float values -must, however, be exactly representable as IEE754 half and single +must, however, be exactly representable as IEEE 754 half and single precision, respectively. Hexadecimal format is always used for long double, and there are three forms of long double. The 80-bit format used by x86 is represented as ``0xK`` followed by 20 hexadecimal digits. The @@ -2068,7 +2162,7 @@ Taking the address of the entry block is illegal. This value only has defined behavior when used as an operand to the ':ref:`indirectbr <i_indirectbr>`' instruction, or for comparisons against null. Pointer equality tests between labels addresses results in -undefined behavior — though, again, comparison against null is ok, and +undefined behavior --- though, again, comparison against null is ok, and no label is equal to the null pointer. This may be passed around as an opaque pointer sized value as long as the bits are not inspected. This allows ``ptrtoint`` and arithmetic to be performed on these values so @@ -2130,7 +2224,7 @@ The following is the syntax for constant expressions: won't fit in the floating point type, the results are undefined. ``ptrtoint (CST to TYPE)`` Convert a pointer typed constant to the corresponding integer - constant ``TYPE`` must be an integer type. ``CST`` must be of + constant. ``TYPE`` must be an integer type. ``CST`` must be of pointer type. The ``CST`` value is zero extended, truncated, or unchanged to make it fit in ``TYPE``. 
``inttoptr (CST to TYPE)`` @@ -2433,14 +2527,132 @@ Examples: !2 = metadata !{ i8 0, i8 2, i8 3, i8 6 } !3 = metadata !{ i8 -2, i8 0, i8 3, i8 6 } +'``llvm.loop``' +^^^^^^^^^^^^^^^ + +It is sometimes useful to attach information to loop constructs. Currently, +loop metadata is implemented as metadata attached to the branch instruction +in the loop latch block. This type of metadata refer to a metadata node that is +guaranteed to be separate for each loop. The loop-level metadata is prefixed +with ``llvm.loop``. + +The loop identifier metadata is implemented using a metadata that refers to +itself to avoid merging it with any other identifier metadata, e.g., +during module linkage or function inlining. That is, each loop should refer +to their own identification metadata even if they reside in separate functions. +The following example contains loop identifier metadata for two separate loop +constructs: + +.. code-block:: llvm + + !0 = metadata !{ metadata !0 } + !1 = metadata !{ metadata !1 } + + +'``llvm.loop.parallel``' Metadata +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This loop metadata can be used to communicate that a loop should be considered +a parallel loop. The semantics of parallel loops in this case is the one +with the strongest cross-iteration instruction ordering freedom: the +iterations in the loop can be considered completely independent of each +other (also known as embarrassingly parallel loops). + +This metadata can originate from a programming language with parallel loop +constructs. In such a case it is completely the programmer's responsibility +to ensure the instructions from the different iterations of the loop can be +executed in an arbitrary order, in parallel, or intertwined. No loop-carried +dependency checking at all must be expected from the compiler. + +In order to fulfill the LLVM requirement for metadata to be safely ignored, +it is important to ensure that a parallel loop is converted to +a sequential loop in case an optimization (agnostic of the parallel loop +semantics) converts the loop back to such. This happens when new memory +accesses that do not fulfill the requirement of free ordering across iterations +are added to the loop. Therefore, this metadata is required, but not +sufficient, to consider the loop at hand a parallel loop. For a loop +to be parallel, all its memory accessing instructions need to be +marked with the ``llvm.mem.parallel_loop_access`` metadata that refer +to the same loop identifier metadata that identify the loop at hand. + +'``llvm.mem``' +^^^^^^^^^^^^^^^ + +Metadata types used to annotate memory accesses with information helpful +for optimizations are prefixed with ``llvm.mem``. + +'``llvm.mem.parallel_loop_access``' Metadata +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +For a loop to be parallel, in addition to using +the ``llvm.loop.parallel`` metadata to mark the loop latch branch instruction, +also all of the memory accessing instructions in the loop body need to be +marked with the ``llvm.mem.parallel_loop_access`` metadata. If there +is at least one memory accessing instruction not marked with the metadata, +the loop, despite it possibly using the ``llvm.loop.parallel`` metadata, +must be considered a sequential loop. This causes parallel loops to be +converted to sequential loops due to optimization passes that are unaware of +the parallel semantics and that insert new memory instructions to the loop +body. 
+ +Example of a loop that is considered parallel due to its correct use of +both ``llvm.loop.parallel`` and ``llvm.mem.parallel_loop_access`` +metadata types that refer to the same loop identifier metadata. + +.. code-block:: llvm + + for.body: + ... + %0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0 + ... + store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0 + ... + br i1 %exitcond, label %for.end, label %for.body, !llvm.loop.parallel !0 + + for.end: + ... + !0 = metadata !{ metadata !0 } + +It is also possible to have nested parallel loops. In that case the +memory accesses refer to a list of loop identifier metadata nodes instead of +the loop identifier metadata node directly: + +.. code-block:: llvm + + outer.for.body: + ... + + inner.for.body: + ... + %0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0 + ... + store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0 + ... + br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop.parallel !1 + + inner.for.end: + ... + %0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0 + ... + store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0 + ... + br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop.parallel !2 + + outer.for.end: ; preds = %for.body + ... + !0 = metadata !{ metadata !1, metadata !2 } ; a list of parallel loop identifiers + !1 = metadata !{ metadata !1 } ; an identifier for the inner parallel loop + !2 = metadata !{ metadata !2 } ; an identifier for the outer parallel loop + + Module Flags Metadata ===================== Information about the module as a whole is difficult to convey to LLVM's subsystems. The LLVM IR isn't sufficient to transmit this information. The ``llvm.module.flags`` named metadata exists in order to facilitate -this. These flags are in the form of key / value pairs — much like a -dictionary — making it easy for any subsystem who cares about a flag to +this. These flags are in the form of key / value pairs --- much like a +dictionary --- making it easy for any subsystem who cares about a flag to look it up. The ``llvm.module.flags`` metadata contains a list of metadata triplets. @@ -2451,14 +2663,16 @@ Each triplet has the following form: (or more) metadata with the same ID. The supported behaviors are described below. - The second element is a metadata string that is a unique ID for the - metadata. How each ID is interpreted is documented below. + metadata. Each module may only have one flag entry for each unique ID (not + including entries with the **Require** behavior). - The third element is the value of the flag. When two (or more) modules are merged together, the resulting -``llvm.module.flags`` metadata is the union of the modules' -``llvm.module.flags`` metadata. The only exception being a flag with the -*Override* behavior, which may override another flag's value (see -below). +``llvm.module.flags`` metadata is the union of the modules' flags. That is, for +each unique metadata ID string, there will be exactly one entry in the merged +modules ``llvm.module.flags`` metadata table, and the value for that entry will +be determined by the merge behavior flag, as described below. The only exception +is that entries with the *Require* behavior are always preserved. The following behaviors are supported: @@ -2471,25 +2685,43 @@ The following behaviors are supported: * - 1 - **Error** - Emits an error if two values disagree. 
It is an error to have an - ID with both an Error and a Warning behavior. + Emits an error if two values disagree, otherwise the resulting value + is that of the operands. * - 2 - **Warning** - Emits a warning if two values disagree. + Emits a warning if two values disagree. The result value will be the + operand for the flag from the first module being linked. * - 3 - **Require** - Emits an error when the specified value is not present or doesn't - have the specified value. It is an error for two (or more) - ``llvm.module.flags`` with the same ID to have the Require behavior - but different values. There may be multiple Require flags per ID. + Adds a requirement that another module flag be present and have a + specified value after linking is performed. The value must be a + metadata pair, where the first element of the pair is the ID of the + module flag to be restricted, and the second element of the pair is + the value the module flag should be restricted to. This behavior can + be used to restrict the allowable results (via triggering of an + error) of linking IDs with the **Override** behavior. * - 4 - **Override** - Uses the specified value if the two values disagree. It is an - error for two (or more) ``llvm.module.flags`` with the same ID - to have the Override behavior but different values. + Uses the specified value, regardless of the behavior or value of the + other module. If both modules specify **Override**, but the values + differ, an error will be emitted. + + * - 5 + - **Append** + Appends the two values, which are required to be metadata nodes. + + * - 6 + - **AppendUnique** + Appends the two values, which are required to be metadata + nodes. However, duplicate entries in the second list are dropped + during the append operation. + +It is an error for a particular unique flag ID to have multiple behaviors, +except in the case of **Require** (which adds restrictions on another metadata +value) or **Override**. An example of module flags: @@ -2511,7 +2743,7 @@ An example of module flags: - Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The behavior if two or more ``!"bar"`` flags are seen is to use the value - '37' if their values are not equal. + '37'. - Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The behavior if two or more ``!"qux"`` flags are seen is to emit a @@ -2523,10 +2755,9 @@ An example of module flags: metadata !{ metadata !"foo", i32 1 } - The behavior is to emit an error if the ``llvm.module.flags`` does - not contain a flag with the ID ``!"foo"`` that has the value '1'. If - two or more ``!"qux"`` flags exist, then they must have the same - value or an error will be issued. + The behavior is to emit an error if the ``llvm.module.flags`` does not + contain a flag with the ID ``!"foo"`` that has the value '1' after linking is + performed. Objective-C Garbage Collection Module Flags Metadata ---------------------------------------------------- @@ -2548,26 +2779,26 @@ following key-value pairs: * - Key - Value - * - ``Objective-C Version`` - - **[Required]** — The Objective-C ABI version. Valid values are 1 and 2. + * - ``Objective-C Version`` + - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2. - * - ``Objective-C Image Info Version`` - - **[Required]** — The version of the image info section. Currently + * - ``Objective-C Image Info Version`` + - **[Required]** --- The version of the image info section. Currently always 0. 
- * - ``Objective-C Image Info Section`` - - **[Required]** — The section to place the metadata. Valid values are + * - ``Objective-C Image Info Section`` + - **[Required]** --- The section to place the metadata. Valid values are ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for Objective-C ABI version 2. - * - ``Objective-C Garbage Collection`` - - **[Required]** — Specifies whether garbage collection is supported or + * - ``Objective-C Garbage Collection`` + - **[Required]** --- Specifies whether garbage collection is supported or not. Valid values are 0, for no garbage collection, and 2, for garbage collection supported. - * - ``Objective-C GC Only`` - - **[Optional]** — Specifies that only garbage collection is supported. + * - ``Objective-C GC Only`` + - **[Optional]** --- Specifies that only garbage collection is supported. If present, its value must be 6. This flag requires that the ``Objective-C Garbage Collection`` flag have the value 2. @@ -2580,6 +2811,40 @@ Some important flag interactions: - A module with ``Objective-C Garbage Collection`` set to 0 cannot be merged with a module with ``Objective-C GC Only`` set to 6. +Automatic Linker Flags Module Flags Metadata +-------------------------------------------- + +Some targets support embedding flags to the linker inside individual object +files. Typically this is used in conjunction with language extensions which +allow source files to explicitly declare the libraries they depend on, and have +these automatically be transmitted to the linker via object files. + +These flags are encoded in the IR using metadata in the module flags section, +using the ``Linker Options`` key. The merge behavior for this flag is required +to be ``AppendUnique``, and the value for the key is expected to be a metadata +node which should be a list of other metadata nodes, each of which should be a +list of metadata strings defining linker options. + +For example, the following metadata section specifies two separate sets of +linker options, presumably to link against ``libz`` and the ``Cocoa`` +framework:: + + !0 = metadata !{ i32 6, metadata !"Linker Options", + metadata !{ + metadata !{ metadata !"-lz" }, + metadata !{ metadata !"-framework", metadata !"Cocoa" } } } + !llvm.module.flags = !{ !0 } + +The metadata encoding as lists of lists of options, as opposed to a collapsed +list of options, is chosen so that the IR encoding can use multiple option +strings to specify e.g., a single library, while still having that specifier be +preserved as an atomic element that can be recognized by a target specific +assembly writer or object file emitter. + +Each individual option is required to be either a valid option for the target's +linker, or an option that is reserved by the target specific assembly writer or +object file emitter. No other aspect of these options is defined by the IR. 
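As a sketch of how the required ``AppendUnique`` behavior combines these flags at link time (hypothetical input modules, shown only to illustrate the merge rule described earlier)::

    ; First module
    !0 = metadata !{ i32 6, metadata !"Linker Options",
          metadata !{ metadata !{ metadata !"-lz" } } }

    ; Second module
    !1 = metadata !{ i32 6, metadata !"Linker Options",
          metadata !{ metadata !{ metadata !"-lz" },
                      metadata !{ metadata !"-framework", metadata !"Cocoa" } } }

    ; Conceptual flag in the linked module: the duplicate "-lz" entry
    ; contributed by the second module is dropped and the remaining
    ; lists are appended.
    !2 = metadata !{ i32 6, metadata !"Linker Options",
          metadata !{ metadata !{ metadata !"-lz" },
                      metadata !{ metadata !"-framework", metadata !"Cocoa" } } }
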
+ Intrinsic Global Variables ========================== @@ -3732,7 +3997,7 @@ Example: <result> = lshr i32 4, 1 ; yields {i32}:result = 2 <result> = lshr i32 4, 2 ; yields {i32}:result = 1 <result> = lshr i8 4, 3 ; yields {i8}:result = 0 - <result> = lshr i8 -2, 1 ; yields {i8}:result = 0x7FFFFFFF + <result> = lshr i8 -2, 1 ; yields {i8}:result = 0x7FFFFFFF <result> = lshr i32 1, 32 ; undefined <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1> @@ -5766,7 +6031,7 @@ Overview: The '``landingpad``' instruction is used by `LLVM's exception handling system <ExceptionHandling.html#overview>`_ to specify that a basic block -is a landing pad — one where the exception lands, and corresponds to the +is a landing pad --- one where the exception lands, and corresponds to the code found in the ``catch`` portion of a ``try``/``catch`` sequence. It defines values supplied by the personality function (``pers_fn``) upon re-entry to the function. The ``resultval`` has the type ``resultty``. @@ -5778,7 +6043,7 @@ This instruction takes a ``pers_fn`` value. This is the personality function associated with the unwinding mechanism. The optional ``cleanup`` flag indicates that the landing pad block is a cleanup. -A ``clause`` begins with the clause type — ``catch`` or ``filter`` — and +A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and contains the global variable representing the "type" that may be caught or filtered respectively. Unlike the ``catch`` clause, the ``filter`` clause takes an array constant as its argument. Use @@ -7377,7 +7642,7 @@ Semantics: """""""""" The '``llvm.sadd.with.overflow``' family of intrinsic functions perform -a signed addition of the two variables. They return a structure — the +a signed addition of the two variables. They return a structure --- the first element of which is the signed summation, and the second element of which is a bit specifying if the signed summation resulted in an overflow. @@ -7427,7 +7692,7 @@ Semantics: """""""""" The '``llvm.uadd.with.overflow``' family of intrinsic functions perform -an unsigned addition of the two arguments. They return a structure — the +an unsigned addition of the two arguments. They return a structure --- the first element of which is the sum, and the second element of which is a bit specifying if the unsigned summation resulted in a carry. @@ -7476,7 +7741,7 @@ Semantics: """""""""" The '``llvm.ssub.with.overflow``' family of intrinsic functions perform -a signed subtraction of the two arguments. They return a structure — the +a signed subtraction of the two arguments. They return a structure --- the first element of which is the subtraction, and the second element of which is a bit specifying if the signed subtraction resulted in an overflow. @@ -7526,7 +7791,7 @@ Semantics: """""""""" The '``llvm.usub.with.overflow``' family of intrinsic functions perform -an unsigned subtraction of the two arguments. They return a structure — +an unsigned subtraction of the two arguments. They return a structure --- the first element of which is the subtraction, and the second element of which is a bit specifying if the unsigned subtraction resulted in an overflow. @@ -7576,7 +7841,7 @@ Semantics: """""""""" The '``llvm.smul.with.overflow``' family of intrinsic functions perform -a signed multiplication of the two arguments. They return a structure — +a signed multiplication of the two arguments. 
They return a structure --- the first element of which is the multiplication, and the second element of which is a bit specifying if the signed multiplication resulted in an overflow. @@ -7626,8 +7891,8 @@ Semantics: """""""""" The '``llvm.umul.with.overflow``' family of intrinsic functions perform -an unsigned multiplication of the two arguments. They return a structure -— the first element of which is the multiplication, and the second +an unsigned multiplication of the two arguments. They return a structure --- +the first element of which is the multiplication, and the second element of which is a bit specifying if the unsigned multiplication resulted in an overflow. @@ -7659,8 +7924,10 @@ Overview: """"""""" The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add -expressions that can be fused if the code generator determines that the -fused expression would be legal and efficient. +expressions that can be fused if the code generator determines that (a) the +target instruction set has support for a fused operation, and (b) that the +fused operation is more efficient than the equivalent, separate pair of mul +and add instructions. Arguments: """""""""" diff --git a/docs/Lexicon.rst b/docs/Lexicon.rst index d568c0b302..11f1341f5c 100644 --- a/docs/Lexicon.rst +++ b/docs/Lexicon.rst @@ -1,5 +1,3 @@ -.. _lexicon: - ================ The LLVM Lexicon ================ @@ -17,11 +15,28 @@ A **ADCE** Aggressive Dead Code Elimination +**AST** + Abstract Syntax Tree. + + Due to Clang's influence (mostly the fact that parsing and semantic + analysis are so intertwined for C and especially C++), the typical + working definition of AST in the LLVM community is roughly "the + compiler's first complete symbolic (as opposed to textual) + representation of an input program". + As such, an "AST" might be a more general graph instead of a "tree" + (consider the symbolic representation for the type of a typical "linked + list node"). This working definition is closer to what some authors + call an "annotated abstract syntax tree". + + Consult your favorite compiler book or search engine for more details. + B - +.. _lexicon-bb-vectorization: + **BB Vectorization** - Basic Block Vectorization + Basic-Block Vectorization **BURS** Bottom Up Rewriting System --- A method of instruction selection for code @@ -185,6 +200,10 @@ S **SCCP** Sparse Conditional Constant Propagation +**SLP** + Superword-Level Parallelism, same as :ref:`Basic-Block Vectorization + <lexicon-bb-vectorization>`. + **SRoA** Scalar Replacement of Aggregates diff --git a/docs/LinkTimeOptimization.rst b/docs/LinkTimeOptimization.rst index 7eacf0bd0d..c15abd325e 100644 --- a/docs/LinkTimeOptimization.rst +++ b/docs/LinkTimeOptimization.rst @@ -1,5 +1,3 @@ -.. _lto: - ====================================================== LLVM Link Time Optimization: Design and Implementation ====================================================== @@ -85,9 +83,10 @@ invokes system linker. return foo1(); } -.. code-block:: bash +To compile, run: + +.. code-block:: console - --- command lines --- % clang -emit-llvm -c a.c -o a.o # <-- a.o is LLVM bitcode file % clang -c main.c -o main.o # <-- main.o is native object file % clang a.o main.o -o main # <-- standard link command without modifications @@ -96,7 +95,7 @@ invokes system linker. visible symbol defined in LLVM bitcode file. The linker completes its usual symbol resolution pass and finds that ``foo2()`` is not used anywhere. 
This information is used by the LLVM optimizer and it - removes ``foo2()``.</li> + removes ``foo2()``. * As soon as ``foo2()`` is removed, the optimizer recognizes that condition ``i < 0`` is always false, which means ``foo3()`` is never used. Hence, the diff --git a/docs/Makefile.sphinx b/docs/Makefile.sphinx index 3746522db6..21f66488b2 100644 --- a/docs/Makefile.sphinx +++ b/docs/Makefile.sphinx @@ -46,10 +46,6 @@ clean: html: $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html @echo - @# FIXME: Remove this `cp` once HTML->Sphinx transition is completed. - @# Kind of a hack, but HTML-formatted docs are on the way out anyway. - @echo "Copying legacy HTML-formatted docs into $(BUILDDIR)/html" - @cp -a *.html $(BUILDDIR)/html @echo "Build finished. The HTML pages are in $(BUILDDIR)/html." dirhtml: diff --git a/docs/MakefileGuide.rst b/docs/MakefileGuide.rst index 2c1d33e962..3e90907886 100644 --- a/docs/MakefileGuide.rst +++ b/docs/MakefileGuide.rst @@ -1,5 +1,3 @@ -.. _makefile_guide: - =================== LLVM Makefile Guide =================== @@ -60,7 +58,7 @@ To use the makefile system, you simply create a file named ``Makefile`` in your directory and declare values for certain variables. The variables and values that you select determine what the makefile system will do. These variables enable rules and processing in the makefile system that automatically Do The -Right Thing™. +Right Thing (C). Including Makefiles ------------------- @@ -170,9 +168,9 @@ openable with the ``dlopen`` function and searchable with the ``dlsym`` function (or your operating system's equivalents). While this isn't strictly necessary on Linux and a few other platforms, it is required on systems like HP-UX and Darwin. You should use ``LOADABLE_MODULE`` for any shared library that you -intend to be loaded into an tool via the ``-load`` option. See the -`WritingAnLLVMPass.html <WritingAnLLVMPass.html#makefile>`_ document for an -example of why you might want to do this. +intend to be loaded into an tool via the ``-load`` option. `Pass documentation +<writing-an-llvm-pass-makefile>`_ has an example of why you might want to do +this. Bitcode Modules ^^^^^^^^^^^^^^^ @@ -241,7 +239,7 @@ and the names of the libraries you wish to link with the tool. For example: says that we are to build a tool name ``mytool`` and that it requires three libraries: ``mylib``, ``LLVMSupport.a`` and ``LLVMSystem.a``. -Note that two different variables are use to indicate which libraries are +Note that two different variables are used to indicate which libraries are linked: ``USEDLIBS`` and ``LLVMLIBS``. This distinction is necessary to support projects. ``LLVMLIBS`` refers to the LLVM libraries found in the LLVM object directory. ``USEDLIBS`` refers to the libraries built by your project. In the @@ -348,9 +346,9 @@ details. This target should be implemented by the ``Makefile`` in the project's ``test`` directory. It is invoked by the ``check`` target elsewhere. Each project is free to define the actions of ``check-local`` as appropriate for that -project. The LLVM project itself uses dejagnu to run a suite of feature and -regresson tests. Other projects may choose to use dejagnu or any other testing -mechanism. +project. The LLVM project itself uses the :doc:`Lit <CommandGuide/lit>` testing +tool to run a suite of feature and regression tests. Other projects may choose +to use :program:`lit` or any other testing mechanism. ``clean`` --------- @@ -358,7 +356,7 @@ mechanism. 
This target cleans the build directory, recursively removing all things that the Makefile builds. The cleaning rules have been made guarded so they shouldn't go awry (via ``rm -f $(UNSET_VARIABLE)/*`` which will attempt to erase the entire -directory structure. +directory structure). ``clean-local`` --------------- @@ -606,8 +604,8 @@ system that tell it what to do for the current directory. the build process, such as code generators (e.g. ``tblgen``). ``OPTIONAL_DIRS`` - Specify a set of directories that may be built, if they exist, but its not - an error for them not to exist. + Specify a set of directories that may be built, if they exist, but it is + not an error for them not to exist. ``PARALLEL_DIRS`` Specify a set of directories to build recursively and in parallel if the @@ -701,6 +699,9 @@ The override variables are given below: ``CFLAGS`` Additional flags to be passed to the 'C' compiler. +``CPPFLAGS`` + Additional flags passed to the C/C++ preprocessor. + ``CXX`` Specifies the path to the C++ compiler. diff --git a/docs/MarkedUpDisassembly.rst b/docs/MarkedUpDisassembly.rst index e1282e102e..cc4dbc817e 100644 --- a/docs/MarkedUpDisassembly.rst +++ b/docs/MarkedUpDisassembly.rst @@ -1,5 +1,3 @@ -.. _marked_up_disassembly: - ======================================= LLVM's Optional Rich Disassembly Output ======================================= diff --git a/docs/Packaging.rst b/docs/Packaging.rst index 6e74158d72..7c2dc95612 100644 --- a/docs/Packaging.rst +++ b/docs/Packaging.rst @@ -1,5 +1,3 @@ -.. _packaging: - ======================== Advice on Packaging LLVM ======================== diff --git a/docs/Passes.html b/docs/Passes.html deleted file mode 100644 index 7bffc54d8d..0000000000 --- a/docs/Passes.html +++ /dev/null @@ -1,2025 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <title>LLVM's Analysis and Transform Passes</title> - <link rel="stylesheet" href="_static/llvm.css" type="text/css"> - <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> -</head> -<body> - -<!-- - -If Passes.html is up to date, the following "one-liner" should print -an empty diff. - -egrep -e '^<tr><td><a href="#.*">-.*</a></td><td>.*</td></tr>$' \ - -e '^ <a name=".*">.*</a>$' < Passes.html >html; \ -perl >help <<'EOT' && diff -u help html; rm -f help html -open HTML, "<Passes.html" or die "open: Passes.html: $!\n"; -while (<HTML>) { - m:^<tr><td><a href="#(.*)">-.*</a></td><td>.*</td></tr>$: or next; - $order{$1} = sprintf("%03d", 1 + int %order); -} -open HELP, "../Release/bin/opt -help|" or die "open: opt -help: $!\n"; -while (<HELP>) { - m:^ -([^ ]+) +- (.*)$: or next; - my $o = $order{$1}; - $o = "000" unless defined $o; - push @x, "$o<tr><td><a href=\"#$1\">-$1</a></td><td>$2</td></tr>\n"; - push @y, "$o <a name=\"$1\">-$1: $2</a>\n"; -} -@x = map { s/^\d\d\d//; $_ } sort @x; -@y = map { s/^\d\d\d//; $_ } sort @y; -print @x, @y; -EOT - -This (real) one-liner can also be helpful when converting comments to HTML: - -perl -e '$/ = undef; for (split(/\n/, <>)) { s:^ *///? 
?::; print " <p>\n" if !$on && $_ =~ /\S/; print " </p>\n" if $on && $_ =~ /^\s*$/; print " $_\n"; $on = ($_ =~ /\S/); } print " </p>\n" if $on' - - --> - -<h1>LLVM's Analysis and Transform Passes</h1> - -<ol> - <li><a href="#intro">Introduction</a></li> - <li><a href="#analyses">Analysis Passes</a> - <li><a href="#transforms">Transform Passes</a></li> - <li><a href="#utilities">Utility Passes</a></li> -</ol> - -<div class="doc_author"> - <p>Written by <a href="mailto:rspencer@x10sys.com">Reid Spencer</a> - and Gordon Henriksen</p> -</div> - -<!-- ======================================================================= --> -<h2><a name="intro">Introduction</a></h2> -<div> - <p>This document serves as a high level summary of the optimization features - that LLVM provides. Optimizations are implemented as Passes that traverse some - portion of a program to either collect information or transform the program. - The table below divides the passes that LLVM provides into three categories. - Analysis passes compute information that other passes can use or for debugging - or program visualization purposes. Transform passes can use (or invalidate) - the analysis passes. Transform passes all mutate the program in some way. - Utility passes provides some utility but don't otherwise fit categorization. - For example passes to extract functions to bitcode or write a module to - bitcode are neither analysis nor transform passes. - <p>The table below provides a quick summary of each pass and links to the more - complete pass description later in the document.</p> - -<table> -<tr><th colspan="2"><b>ANALYSIS PASSES</b></th></tr> -<tr><th>Option</th><th>Name</th></tr> -<tr><td><a href="#aa-eval">-aa-eval</a></td><td>Exhaustive Alias Analysis Precision Evaluator</td></tr> -<tr><td><a href="#basicaa">-basicaa</a></td><td>Basic Alias Analysis (stateless AA impl)</td></tr> -<tr><td><a href="#basiccg">-basiccg</a></td><td>Basic CallGraph Construction</td></tr> -<tr><td><a href="#count-aa">-count-aa</a></td><td>Count Alias Analysis Query Responses</td></tr> -<tr><td><a href="#da">-da</a></td><td>Dependence Analysis</td></tr> -<tr><td><a href="#debug-aa">-debug-aa</a></td><td>AA use debugger</td></tr> -<tr><td><a href="#domfrontier">-domfrontier</a></td><td>Dominance Frontier Construction</td></tr> -<tr><td><a href="#domtree">-domtree</a></td><td>Dominator Tree Construction</td></tr> -<tr><td><a href="#dot-callgraph">-dot-callgraph</a></td><td>Print Call Graph to 'dot' file</td></tr> -<tr><td><a href="#dot-cfg">-dot-cfg</a></td><td>Print CFG of function to 'dot' file</td></tr> -<tr><td><a href="#dot-cfg-only">-dot-cfg-only</a></td><td>Print CFG of function to 'dot' file (with no function bodies)</td></tr> -<tr><td><a href="#dot-dom">-dot-dom</a></td><td>Print dominance tree of function to 'dot' file</td></tr> -<tr><td><a href="#dot-dom-only">-dot-dom-only</a></td><td>Print dominance tree of function to 'dot' file (with no function bodies)</td></tr> -<tr><td><a href="#dot-postdom">-dot-postdom</a></td><td>Print postdominance tree of function to 'dot' file</td></tr> -<tr><td><a href="#dot-postdom-only">-dot-postdom-only</a></td><td>Print postdominance tree of function to 'dot' file (with no function bodies)</td></tr> -<tr><td><a href="#globalsmodref-aa">-globalsmodref-aa</a></td><td>Simple mod/ref analysis for globals</td></tr> -<tr><td><a href="#instcount">-instcount</a></td><td>Counts the various types of Instructions</td></tr> -<tr><td><a href="#intervals">-intervals</a></td><td>Interval Partition 
Construction</td></tr> -<tr><td><a href="#iv-users">-iv-users</a></td><td>Induction Variable Users</td></tr> -<tr><td><a href="#lazy-value-info">-lazy-value-info</a></td><td>Lazy Value Information Analysis</td></tr> -<tr><td><a href="#libcall-aa">-libcall-aa</a></td><td>LibCall Alias Analysis</td></tr> -<tr><td><a href="#lint">-lint</a></td><td>Statically lint-checks LLVM IR</td></tr> -<tr><td><a href="#loops">-loops</a></td><td>Natural Loop Information</td></tr> -<tr><td><a href="#memdep">-memdep</a></td><td>Memory Dependence Analysis</td></tr> -<tr><td><a href="#module-debuginfo">-module-debuginfo</a></td><td>Decodes module-level debug info</td></tr> -<tr><td><a href="#no-aa">-no-aa</a></td><td>No Alias Analysis (always returns 'may' alias)</td></tr> -<tr><td><a href="#no-profile">-no-profile</a></td><td>No Profile Information</td></tr> -<tr><td><a href="#postdomtree">-postdomtree</a></td><td>Post-Dominator Tree Construction</td></tr> -<tr><td><a href="#print-alias-sets">-print-alias-sets</a></td><td>Alias Set Printer</td></tr> -<tr><td><a href="#print-callgraph">-print-callgraph</a></td><td>Print a call graph</td></tr> -<tr><td><a href="#print-callgraph-sccs">-print-callgraph-sccs</a></td><td>Print SCCs of the Call Graph</td></tr> -<tr><td><a href="#print-cfg-sccs">-print-cfg-sccs</a></td><td>Print SCCs of each function CFG</td></tr> -<tr><td><a href="#print-dbginfo">-print-dbginfo</a></td><td>Print debug info in human readable form</td></tr> -<tr><td><a href="#print-dom-info">-print-dom-info</a></td><td>Dominator Info Printer</td></tr> -<tr><td><a href="#print-externalfnconstants">-print-externalfnconstants</a></td><td>Print external fn callsites passed constants</td></tr> -<tr><td><a href="#print-function">-print-function</a></td><td>Print function to stderr</td></tr> -<tr><td><a href="#print-module">-print-module</a></td><td>Print module to stderr</td></tr> -<tr><td><a href="#print-used-types">-print-used-types</a></td><td>Find Used Types</td></tr> -<tr><td><a href="#profile-estimator">-profile-estimator</a></td><td>Estimate profiling information</td></tr> -<tr><td><a href="#profile-loader">-profile-loader</a></td><td>Load profile information from llvmprof.out</td></tr> -<tr><td><a href="#profile-verifier">-profile-verifier</a></td><td>Verify profiling information</td></tr> -<tr><td><a href="#regions">-regions</a></td><td>Detect single entry single exit regions</td></tr> -<tr><td><a href="#scalar-evolution">-scalar-evolution</a></td><td>Scalar Evolution Analysis</td></tr> -<tr><td><a href="#scev-aa">-scev-aa</a></td><td>ScalarEvolution-based Alias Analysis</td></tr> -<tr><td><a href="#targetdata">-targetdata</a></td><td>Target Data Layout</td></tr> - - -<tr><th colspan="2"><b>TRANSFORM PASSES</b></th></tr> -<tr><th>Option</th><th>Name</th></tr> -<tr><td><a href="#adce">-adce</a></td><td>Aggressive Dead Code Elimination</td></tr> -<tr><td><a href="#always-inline">-always-inline</a></td><td>Inliner for always_inline functions</td></tr> -<tr><td><a href="#argpromotion">-argpromotion</a></td><td>Promote 'by reference' arguments to scalars</td></tr> -<tr><td><a href="#bb-vectorize">-bb-vectorize</a></td><td>Combine instructions to form vector instructions within basic blocks</td></tr> -<tr><td><a href="#block-placement">-block-placement</a></td><td>Profile Guided Basic Block Placement</td></tr> -<tr><td><a href="#break-crit-edges">-break-crit-edges</a></td><td>Break critical edges in CFG</td></tr> -<tr><td><a href="#codegenprepare">-codegenprepare</a></td><td>Optimize for code 
generation</td></tr> -<tr><td><a href="#constmerge">-constmerge</a></td><td>Merge Duplicate Global Constants</td></tr> -<tr><td><a href="#constprop">-constprop</a></td><td>Simple constant propagation</td></tr> -<tr><td><a href="#dce">-dce</a></td><td>Dead Code Elimination</td></tr> -<tr><td><a href="#deadargelim">-deadargelim</a></td><td>Dead Argument Elimination</td></tr> -<tr><td><a href="#deadtypeelim">-deadtypeelim</a></td><td>Dead Type Elimination</td></tr> -<tr><td><a href="#die">-die</a></td><td>Dead Instruction Elimination</td></tr> -<tr><td><a href="#dse">-dse</a></td><td>Dead Store Elimination</td></tr> -<tr><td><a href="#functionattrs">-functionattrs</a></td><td>Deduce function attributes</td></tr> -<tr><td><a href="#globaldce">-globaldce</a></td><td>Dead Global Elimination</td></tr> -<tr><td><a href="#globalopt">-globalopt</a></td><td>Global Variable Optimizer</td></tr> -<tr><td><a href="#gvn">-gvn</a></td><td>Global Value Numbering</td></tr> -<tr><td><a href="#indvars">-indvars</a></td><td>Canonicalize Induction Variables</td></tr> -<tr><td><a href="#inline">-inline</a></td><td>Function Integration/Inlining</td></tr> -<tr><td><a href="#insert-edge-profiling">-insert-edge-profiling</a></td><td>Insert instrumentation for edge profiling</td></tr> -<tr><td><a href="#insert-optimal-edge-profiling">-insert-optimal-edge-profiling</a></td><td>Insert optimal instrumentation for edge profiling</td></tr> -<tr><td><a href="#instcombine">-instcombine</a></td><td>Combine redundant instructions</td></tr> -<tr><td><a href="#internalize">-internalize</a></td><td>Internalize Global Symbols</td></tr> -<tr><td><a href="#ipconstprop">-ipconstprop</a></td><td>Interprocedural constant propagation</td></tr> -<tr><td><a href="#ipsccp">-ipsccp</a></td><td>Interprocedural Sparse Conditional Constant Propagation</td></tr> -<tr><td><a href="#jump-threading">-jump-threading</a></td><td>Jump Threading</td></tr> -<tr><td><a href="#lcssa">-lcssa</a></td><td>Loop-Closed SSA Form Pass</td></tr> -<tr><td><a href="#licm">-licm</a></td><td>Loop Invariant Code Motion</td></tr> -<tr><td><a href="#loop-deletion">-loop-deletion</a></td><td>Delete dead loops</td></tr> -<tr><td><a href="#loop-extract">-loop-extract</a></td><td>Extract loops into new functions</td></tr> -<tr><td><a href="#loop-extract-single">-loop-extract-single</a></td><td>Extract at most one loop into a new function</td></tr> -<tr><td><a href="#loop-reduce">-loop-reduce</a></td><td>Loop Strength Reduction</td></tr> -<tr><td><a href="#loop-rotate">-loop-rotate</a></td><td>Rotate Loops</td></tr> -<tr><td><a href="#loop-simplify">-loop-simplify</a></td><td>Canonicalize natural loops</td></tr> -<tr><td><a href="#loop-unroll">-loop-unroll</a></td><td>Unroll loops</td></tr> -<tr><td><a href="#loop-unswitch">-loop-unswitch</a></td><td>Unswitch loops</td></tr> -<tr><td><a href="#loweratomic">-loweratomic</a></td><td>Lower atomic intrinsics to non-atomic form</td></tr> -<tr><td><a href="#lowerinvoke">-lowerinvoke</a></td><td>Lower invoke and unwind, for unwindless code generators</td></tr> -<tr><td><a href="#lowerswitch">-lowerswitch</a></td><td>Lower SwitchInst's to branches</td></tr> -<tr><td><a href="#mem2reg">-mem2reg</a></td><td>Promote Memory to Register</td></tr> -<tr><td><a href="#memcpyopt">-memcpyopt</a></td><td>MemCpy Optimization</td></tr> -<tr><td><a href="#mergefunc">-mergefunc</a></td><td>Merge Functions</td></tr> -<tr><td><a href="#mergereturn">-mergereturn</a></td><td>Unify function exit nodes</td></tr> -<tr><td><a 
href="#partial-inliner">-partial-inliner</a></td><td>Partial Inliner</td></tr> -<tr><td><a href="#prune-eh">-prune-eh</a></td><td>Remove unused exception handling info</td></tr> -<tr><td><a href="#reassociate">-reassociate</a></td><td>Reassociate expressions</td></tr> -<tr><td><a href="#reg2mem">-reg2mem</a></td><td>Demote all values to stack slots</td></tr> -<tr><td><a href="#scalarrepl">-scalarrepl</a></td><td>Scalar Replacement of Aggregates (DT)</td></tr> -<tr><td><a href="#sccp">-sccp</a></td><td>Sparse Conditional Constant Propagation</td></tr> -<tr><td><a href="#simplify-libcalls">-simplify-libcalls</a></td><td>Simplify well-known library calls</td></tr> -<tr><td><a href="#simplifycfg">-simplifycfg</a></td><td>Simplify the CFG</td></tr> -<tr><td><a href="#sink">-sink</a></td><td>Code sinking</td></tr> -<tr><td><a href="#strip">-strip</a></td><td>Strip all symbols from a module</td></tr> -<tr><td><a href="#strip-dead-debug-info">-strip-dead-debug-info</a></td><td>Strip debug info for unused symbols</td></tr> -<tr><td><a href="#strip-dead-prototypes">-strip-dead-prototypes</a></td><td>Strip Unused Function Prototypes</td></tr> -<tr><td><a href="#strip-debug-declare">-strip-debug-declare</a></td><td>Strip all llvm.dbg.declare intrinsics</td></tr> -<tr><td><a href="#strip-nondebug">-strip-nondebug</a></td><td>Strip all symbols, except dbg symbols, from a module</td></tr> -<tr><td><a href="#tailcallelim">-tailcallelim</a></td><td>Tail Call Elimination</td></tr> - - -<tr><th colspan="2"><b>UTILITY PASSES</b></th></tr> -<tr><th>Option</th><th>Name</th></tr> -<tr><td><a href="#deadarghaX0r">-deadarghaX0r</a></td><td>Dead Argument Hacking (BUGPOINT USE ONLY; DO NOT USE)</td></tr> -<tr><td><a href="#extract-blocks">-extract-blocks</a></td><td>Extract Basic Blocks From Module (for bugpoint use)</td></tr> -<tr><td><a href="#instnamer">-instnamer</a></td><td>Assign names to anonymous instructions</td></tr> -<tr><td><a href="#preverify">-preverify</a></td><td>Preliminary module verification</td></tr> -<tr><td><a href="#verify">-verify</a></td><td>Module Verifier</td></tr> -<tr><td><a href="#view-cfg">-view-cfg</a></td><td>View CFG of function</td></tr> -<tr><td><a href="#view-cfg-only">-view-cfg-only</a></td><td>View CFG of function (with no function bodies)</td></tr> -<tr><td><a href="#view-dom">-view-dom</a></td><td>View dominance tree of function</td></tr> -<tr><td><a href="#view-dom-only">-view-dom-only</a></td><td>View dominance tree of function (with no function bodies)</td></tr> -<tr><td><a href="#view-postdom">-view-postdom</a></td><td>View postdominance tree of function</td></tr> -<tr><td><a href="#view-postdom-only">-view-postdom-only</a></td><td>View postdominance tree of function (with no function bodies)</td></tr> -</table> - -</div> - -<!-- ======================================================================= --> -<h2><a name="analyses">Analysis Passes</a></h2> -<div> - <p>This section describes the LLVM Analysis Passes.</p> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="aa-eval">-aa-eval: Exhaustive Alias Analysis Precision Evaluator</a> -</h3> -<div> - <p>This is a simple N^2 alias analysis accuracy evaluator. 
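The removed entries above describe the dominator and CFG printing analyses. As an illustrative sketch that is not taken from the deleted page, the small function below is the kind of input ``-domtree`` and the ``-dot-*`` printers consume; the dominance facts in the comment follow directly from its branch structure:

.. code-block:: llvm

    define i32 @select_like(i1 %c, i32 %x) {
    entry:
      br i1 %c, label %then, label %else

    then:
      %a = add i32 %x, 1
      br label %join

    else:
      %b = mul i32 %x, 2
      br label %join

    join:
      ; 'entry' dominates every block; neither 'then' nor 'else' dominates
      ; 'join', so the immediate dominator of 'join' is 'entry'.
      %r = phi i32 [ %a, %then ], [ %b, %else ]
      ret i32 %r
    }

The ``-dot-cfg`` pass would emit this control flow graph as a ``.dot`` file, as described above.
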
- Basically, for each function in the program, it simply queries to see how the - alias analysis implementation answers alias queries between each pair of - pointers in the function.</p> - - <p>This is inspired and adapted from code by: Naveen Neelakantam, Francesco - Spadini, and Wojciech Stryjewski.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="basicaa">-basicaa: Basic Alias Analysis (stateless AA impl)</a> -</h3> -<div> - <p>A basic alias analysis pass that implements identities (two different - globals cannot alias, etc), but does no stateful analysis.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="basiccg">-basiccg: Basic CallGraph Construction</a> -</h3> -<div> - <p>Yet to be written.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="count-aa">-count-aa: Count Alias Analysis Query Responses</a> -</h3> -<div> - <p> - A pass which can be used to count how many alias queries - are being made and how the alias analysis implementation being used responds. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="da">-da: Dependence Analysis</a> -</h3> -<div> - <p>Dependence analysis framework, which is used to detect dependences in - memory accesses.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="debug-aa">-debug-aa: AA use debugger</a> -</h3> -<div> - <p> - This simple pass checks alias analysis users to ensure that if they - create a new value, they do not query AA without informing it of the value. - It acts as a shim over any other AA pass you want. - </p> - - <p> - Yes keeping track of every value in the program is expensive, but this is - a debugging pass. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="domfrontier">-domfrontier: Dominance Frontier Construction</a> -</h3> -<div> - <p> - This pass is a simple dominator construction algorithm for finding forward - dominator frontiers. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="domtree">-domtree: Dominator Tree Construction</a> -</h3> -<div> - <p> - This pass is a simple dominator construction algorithm for finding forward - dominators. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="dot-callgraph">-dot-callgraph: Print Call Graph to 'dot' file</a> -</h3> -<div> - <p> - This pass, only available in <code>opt</code>, prints the call graph into a - <code>.dot</code> graph. This graph can then be processed with the "dot" tool - to convert it to postscript or some other suitable format. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="dot-cfg">-dot-cfg: Print CFG of function to 'dot' file</a> -</h3> -<div> - <p> - This pass, only available in <code>opt</code>, prints the control flow graph - into a <code>.dot</code> graph. This graph can then be processed with the - "dot" tool to convert it to postscript or some other suitable format. 
- </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="dot-cfg-only">-dot-cfg-only: Print CFG of function to 'dot' file (with no function bodies)</a> -</h3> -<div> - <p> - This pass, only available in <code>opt</code>, prints the control flow graph - into a <code>.dot</code> graph, omitting the function bodies. This graph can - then be processed with the "dot" tool to convert it to postscript or some - other suitable format. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="dot-dom">-dot-dom: Print dominance tree of function to 'dot' file</a> -</h3> -<div> - <p> - This pass, only available in <code>opt</code>, prints the dominator tree - into a <code>.dot</code> graph. This graph can then be processed with the - "dot" tool to convert it to postscript or some other suitable format. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="dot-dom-only">-dot-dom-only: Print dominance tree of function to 'dot' file (with no function bodies)</a> -</h3> -<div> - <p> - This pass, only available in <code>opt</code>, prints the dominator tree - into a <code>.dot</code> graph, omitting the function bodies. This graph can - then be processed with the "dot" tool to convert it to postscript or some - other suitable format. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="dot-postdom">-dot-postdom: Print postdominance tree of function to 'dot' file</a> -</h3> -<div> - <p> - This pass, only available in <code>opt</code>, prints the post dominator tree - into a <code>.dot</code> graph. This graph can then be processed with the - "dot" tool to convert it to postscript or some other suitable format. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="dot-postdom-only">-dot-postdom-only: Print postdominance tree of function to 'dot' file (with no function bodies)</a> -</h3> -<div> - <p> - This pass, only available in <code>opt</code>, prints the post dominator tree - into a <code>.dot</code> graph, omitting the function bodies. This graph can - then be processed with the "dot" tool to convert it to postscript or some - other suitable format. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="globalsmodref-aa">-globalsmodref-aa: Simple mod/ref analysis for globals</a> -</h3> -<div> - <p> - This simple pass provides alias and mod/ref information for global values - that do not have their address taken, and keeps track of whether functions - read or write memory (are "pure"). For this simple (but very common) case, - we can provide pretty accurate and useful information. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="instcount">-instcount: Counts the various types of Instructions</a> -</h3> -<div> - <p> - This pass collects the count of all instructions and reports them - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="intervals">-intervals: Interval Partition Construction</a> -</h3> -<div> - <p> - This analysis calculates and represents the interval partition of a function, - or a preexisting interval partition. 
- </p> - - <p> - In this way, the interval partition may be used to reduce a flow graph down - to its degenerate single node interval partition (unless it is irreducible). - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="iv-users">-iv-users: Induction Variable Users</a> -</h3> -<div> - <p>Bookkeeping for "interesting" users of expressions computed from - induction variables.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="lazy-value-info">-lazy-value-info: Lazy Value Information Analysis</a> -</h3> -<div> - <p>Interface for lazy computation of value constraint information.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="libcall-aa">-libcall-aa: LibCall Alias Analysis</a> -</h3> -<div> - <p>LibCall Alias Analysis.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="lint">-lint: Statically lint-checks LLVM IR</a> -</h3> -<div> - <p>This pass statically checks for common and easily-identified constructs - which produce undefined or likely unintended behavior in LLVM IR.</p> - - <p>It is not a guarantee of correctness, in two ways. First, it isn't - comprehensive. There are checks which could be done statically which are - not yet implemented. Some of these are indicated by TODO comments, but - those aren't comprehensive either. Second, many conditions cannot be - checked statically. This pass does no dynamic instrumentation, so it - can't check for all possible problems.</p> - - <p>Another limitation is that it assumes all code will be executed. A store - through a null pointer in a basic block which is never reached is harmless, - but this pass will warn about it anyway.</p> - - <p>Optimization passes may make conditions that this pass checks for more or - less obvious. If an optimization pass appears to be introducing a warning, - it may be that the optimization pass is merely exposing an existing - condition in the code.</p> - - <p>This code may be run before instcombine. In many cases, instcombine checks - for the same kinds of things and turns instructions with undefined behavior - into unreachable (or equivalent). Because of this, this pass makes some - effort to look through bitcasts and so on. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="loops">-loops: Natural Loop Information</a> -</h3> -<div> - <p> - This analysis is used to identify natural loops and determine the loop depth - of various nodes of the CFG. Note that the loops identified may actually be - several natural loops that share the same header node... not just a single - natural loop. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="memdep">-memdep: Memory Dependence Analysis</a> -</h3> -<div> - <p> - An analysis that determines, for a given memory operation, what preceding - memory operations it depends on. It builds on alias analysis information, and - tries to provide a lazy, caching interface to a common kind of alias - information query. 
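The ``-memdep`` entry just above is the last analysis described before this point. A minimal sketch of the query it answers, written in the typed-pointer IR syntax of this document's era and with invented names:

.. code-block:: llvm

    define i32 @reload(i32* %p) {
    entry:
      store i32 41, i32* %p
      ; memory dependence analysis reports that this load depends on
      ; the store immediately above it
      %v = load i32* %p
      %r = add i32 %v, 1
      ret i32 %r
    }
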
- </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="module-debuginfo">-module-debuginfo: Decodes module-level debug info</a> -</h3> -<div> - <p>This pass decodes the debug info metadata in a module and prints in a - (sufficiently-prepared-) human-readable form. - - For example, run this pass from opt along with the -analyze option, and - it'll print to standard output. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="no-aa">-no-aa: No Alias Analysis (always returns 'may' alias)</a> -</h3> -<div> - <p> - This is the default implementation of the Alias Analysis interface. It always - returns "I don't know" for alias queries. NoAA is unlike other alias analysis - implementations, in that it does not chain to a previous analysis. As such it - doesn't follow many of the rules that other alias analyses must. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="no-profile">-no-profile: No Profile Information</a> -</h3> -<div> - <p> - The default "no profile" implementation of the abstract - <code>ProfileInfo</code> interface. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="postdomfrontier">-postdomfrontier: Post-Dominance Frontier Construction</a> -</h3> -<div> - <p> - This pass is a simple post-dominator construction algorithm for finding - post-dominator frontiers. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="postdomtree">-postdomtree: Post-Dominator Tree Construction</a> -</h3> -<div> - <p> - This pass is a simple post-dominator construction algorithm for finding - post-dominators. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="print-alias-sets">-print-alias-sets: Alias Set Printer</a> -</h3> -<div> - <p>Yet to be written.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="print-callgraph">-print-callgraph: Print a call graph</a> -</h3> -<div> - <p> - This pass, only available in <code>opt</code>, prints the call graph to - standard error in a human-readable form. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="print-callgraph-sccs">-print-callgraph-sccs: Print SCCs of the Call Graph</a> -</h3> -<div> - <p> - This pass, only available in <code>opt</code>, prints the SCCs of the call - graph to standard error in a human-readable form. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="print-cfg-sccs">-print-cfg-sccs: Print SCCs of each function CFG</a> -</h3> -<div> - <p> - This pass, only available in <code>opt</code>, prints the SCCs of each - function CFG to standard error in a human-readable form. 
- </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="print-dbginfo">-print-dbginfo: Print debug info in human readable form</a> -</h3> -<div> - <p>Pass that prints instructions, and associated debug info:</p> - <ul> - - <li>source/line/col information</li> - <li>original variable name</li> - <li>original type name</li> - </ul> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="print-dom-info">-print-dom-info: Dominator Info Printer</a> -</h3> -<div> - <p>Dominator Info Printer.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="print-externalfnconstants">-print-externalfnconstants: Print external fn callsites passed constants</a> -</h3> -<div> - <p> - This pass, only available in <code>opt</code>, prints out call sites to - external functions that are called with constant arguments. This can be - useful when looking for standard library functions we should constant fold - or handle in alias analyses. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="print-function">-print-function: Print function to stderr</a> -</h3> -<div> - <p> - The <code>PrintFunctionPass</code> class is designed to be pipelined with - other <code>FunctionPass</code>es, and prints out the functions of the module - as they are processed. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="print-module">-print-module: Print module to stderr</a> -</h3> -<div> - <p> - This pass simply prints out the entire module when it is executed. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="print-used-types">-print-used-types: Find Used Types</a> -</h3> -<div> - <p> - This pass is used to seek out all of the types in use by the program. Note - that this analysis explicitly does not include types only used by the symbol - table. -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="profile-estimator">-profile-estimator: Estimate profiling information</a> -</h3> -<div> - <p>Profiling information that estimates the profiling information - in a very crude and unimaginative way. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="profile-loader">-profile-loader: Load profile information from llvmprof.out</a> -</h3> -<div> - <p> - A concrete implementation of profiling information that loads the information - from a profile dump file. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="profile-verifier">-profile-verifier: Verify profiling information</a> -</h3> -<div> - <p>Pass that checks profiling information for plausibility.</p> -</div> -<h3> - <a name="regions">-regions: Detect single entry single exit regions</a> -</h3> -<div> - <p> - The <code>RegionInfo</code> pass detects single entry single exit regions in a - function, where a region is defined as any subgraph that is connected to the - remaining graph at only two spots. Furthermore, an hierarchical region tree is - built. 
- </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="scalar-evolution">-scalar-evolution: Scalar Evolution Analysis</a> -</h3> -<div> - <p> - The <code>ScalarEvolution</code> analysis can be used to analyze and - catagorize scalar expressions in loops. It specializes in recognizing general - induction variables, representing them with the abstract and opaque - <code>SCEV</code> class. Given this analysis, trip counts of loops and other - important properties can be obtained. - </p> - - <p> - This analysis is primarily useful for induction variable substitution and - strength reduction. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="scev-aa">-scev-aa: ScalarEvolution-based Alias Analysis</a> -</h3> -<div> - <p>Simple alias analysis implemented in terms of ScalarEvolution queries. - - This differs from traditional loop dependence analysis in that it tests - for dependencies within a single iteration of a loop, rather than - dependencies between different iterations. - - ScalarEvolution has a more complete understanding of pointer arithmetic - than BasicAliasAnalysis' collection of ad-hoc analyses. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="targetdata">-targetdata: Target Data Layout</a> -</h3> -<div> - <p>Provides other passes access to information on how the size and alignment - required by the target ABI for various data types.</p> -</div> - -</div> - -<!-- ======================================================================= --> -<h2><a name="transforms">Transform Passes</a></h2> -<div> - <p>This section describes the LLVM Transform Passes.</p> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="adce">-adce: Aggressive Dead Code Elimination</a> -</h3> -<div> - <p>ADCE aggressively tries to eliminate code. This pass is similar to - <a href="#dce">DCE</a> but it assumes that values are dead until proven - otherwise. This is similar to <a href="#sccp">SCCP</a>, except applied to - the liveness of values.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="always-inline">-always-inline: Inliner for always_inline functions</a> -</h3> -<div> - <p>A custom inliner that handles only functions that are marked as - "always inline".</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="argpromotion">-argpromotion: Promote 'by reference' arguments to scalars</a> -</h3> -<div> - <p> - This pass promotes "by reference" arguments to be "by value" arguments. In - practice, this means looking for internal functions that have pointer - arguments. If it can prove, through the use of alias analysis, that an - argument is *only* loaded, then it can pass the value into the function - instead of the address of the value. This can cause recursive simplification - of code and lead to the elimination of allocas (especially in C++ template - code like the STL). - </p> - - <p> - This pass also handles aggregate arguments that are passed into a function, - scalarizing them if the elements of the aggregate are only loaded. Note that - it refuses to scalarize aggregates which would require passing in more than - three operands to the function, because passing thousands of operands for a - large array or structure is unprofitable! 
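The main ``-argpromotion`` paragraph ends here. A rough before/after sketch of the rewrite it performs (typed-pointer syntax of this era; the ``.promoted`` name exists only to show both versions side by side, the real pass keeps the original name):

.. code-block:: llvm

    ; Before: an internal function whose pointer argument is only loaded.
    define internal i32 @callee(i32* %p) {
      %v = load i32* %p
      %r = add i32 %v, 1
      ret i32 %r
    }

    ; After (conceptually): the load moves to each call site and the
    ; argument is passed by value.
    define internal i32 @callee.promoted(i32 %v) {
      %r = add i32 %v, 1
      ret i32 %r
    }
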
- </p> - - <p> - Note that this transformation could also be done for arguments that are only - stored to (returning the value instead), but does not currently. This case - would be best handled when and if LLVM starts supporting multiple return - values from functions. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="bb-vectorize">-bb-vectorize: Basic-Block Vectorization</a> -</h3> -<div> - <p>This pass combines instructions inside basic blocks to form vector - instructions. It iterates over each basic block, attempting to pair - compatible instructions, repeating this process until no additional - pairs are selected for vectorization. When the outputs of some pair - of compatible instructions are used as inputs by some other pair of - compatible instructions, those pairs are part of a potential - vectorization chain. Instruction pairs are only fused into vector - instructions when they are part of a chain longer than some - threshold length. Moreover, the pass attempts to find the best - possible chain for each pair of compatible instructions. These - heuristics are intended to prevent vectorization in cases where - it would not yield a performance increase of the resulting code. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="block-placement">-block-placement: Profile Guided Basic Block Placement</a> -</h3> -<div> - <p>This pass is a very simple profile guided basic block placement algorithm. - The idea is to put frequently executed blocks together at the start of the - function and hopefully increase the number of fall-through conditional - branches. If there is no profile information for a particular function, this - pass basically orders blocks in depth-first order.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="break-crit-edges">-break-crit-edges: Break critical edges in CFG</a> -</h3> -<div> - <p> - Break all of the critical edges in the CFG by inserting a dummy basic block. - It may be "required" by passes that cannot deal with critical edges. This - transformation obviously invalidates the CFG, but can update forward dominator - (set, immediate dominators, tree, and frontier) information. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="codegenprepare">-codegenprepare: Optimize for code generation</a> -</h3> -<div> - This pass munges the code in the input function to better prepare it for - SelectionDAG-based code generation. This works around limitations in it's - basic-block-at-a-time approach. It should eventually be removed. -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="constmerge">-constmerge: Merge Duplicate Global Constants</a> -</h3> -<div> - <p> - Merges duplicate global constants together into a single constant that is - shared. This is useful because some passes (ie TraceValues) insert a lot of - string constants into the program, regardless of whether or not an existing - string is available. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="constprop">-constprop: Simple constant propagation</a> -</h3> -<div> - <p>This file implements constant propagation and merging. 
It looks for - instructions involving only constant operands and replaces them with a - constant value instead of an instruction. For example:</p> - <blockquote><pre>add i32 1, 2</pre></blockquote> - <p>becomes</p> - <blockquote><pre>i32 3</pre></blockquote> - <p>NOTE: this pass has a habit of making definitions be dead. It is a good - idea to to run a <a href="#die">DIE</a> (Dead Instruction Elimination) pass - sometime after running this pass.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="dce">-dce: Dead Code Elimination</a> -</h3> -<div> - <p> - Dead code elimination is similar to <a href="#die">dead instruction - elimination</a>, but it rechecks instructions that were used by removed - instructions to see if they are newly dead. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="deadargelim">-deadargelim: Dead Argument Elimination</a> -</h3> -<div> - <p> - This pass deletes dead arguments from internal functions. Dead argument - elimination removes arguments which are directly dead, as well as arguments - only passed into function calls as dead arguments of other functions. This - pass also deletes dead arguments in a similar way. - </p> - - <p> - This pass is often useful as a cleanup pass to run after aggressive - interprocedural passes, which add possibly-dead arguments. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="deadtypeelim">-deadtypeelim: Dead Type Elimination</a> -</h3> -<div> - <p> - This pass is used to cleanup the output of GCC. It eliminate names for types - that are unused in the entire translation unit, using the <a - href="#findusedtypes">find used types</a> pass. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="die">-die: Dead Instruction Elimination</a> -</h3> -<div> - <p> - Dead instruction elimination performs a single pass over the function, - removing instructions that are obviously dead. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="dse">-dse: Dead Store Elimination</a> -</h3> -<div> - <p> - A trivial dead store elimination that only considers basic-block local - redundant stores. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="functionattrs">-functionattrs: Deduce function attributes</a> -</h3> -<div> - <p>A simple interprocedural pass which walks the call-graph, looking for - functions which do not access or only read non-local memory, and marking them - readnone/readonly. In addition, it marks function arguments (of pointer type) - 'nocapture' if a call to the function does not create any copies of the pointer - value that outlive the call. This more or less means that the pointer is only - dereferenced, and not returned from the function or stored in a global. - This pass is implemented as a bottom-up traversal of the call-graph. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="globaldce">-globaldce: Dead Global Elimination</a> -</h3> -<div> - <p> - This transform is designed to eliminate unreachable internal globals from the - program. It uses an aggressive algorithm, searching out globals that are - known to be alive. 
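Alongside the ``-dce`` and ``-die`` entries above, here is a minimal sketch of the trivially dead code they remove (function name invented for the example):

.. code-block:: llvm

    define i32 @drop_dead(i32 %x) {
    entry:
      %dead = mul i32 %x, 3     ; never used, so -die / -dce delete it
      %live = add i32 %x, 1
      ret i32 %live
    }
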
After it finds all of the globals which are needed, it - deletes whatever is left over. This allows it to delete recursive chunks of - the program which are unreachable. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="globalopt">-globalopt: Global Variable Optimizer</a> -</h3> -<div> - <p> - This pass transforms simple global variables that never have their address - taken. If obviously true, it marks read/write globals as constant, deletes - variables only stored to, etc. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="gvn">-gvn: Global Value Numbering</a> -</h3> -<div> - <p> - This pass performs global value numbering to eliminate fully and partially - redundant instructions. It also performs redundant load elimination. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="indvars">-indvars: Canonicalize Induction Variables</a> -</h3> -<div> - <p> - This transformation analyzes and transforms the induction variables (and - computations derived from them) into simpler forms suitable for subsequent - analysis and transformation. - </p> - - <p> - This transformation makes the following changes to each loop with an - identifiable induction variable: - </p> - - <ol> - <li>All loops are transformed to have a <em>single</em> canonical - induction variable which starts at zero and steps by one.</li> - <li>The canonical induction variable is guaranteed to be the first PHI node - in the loop header block.</li> - <li>Any pointer arithmetic recurrences are raised to use array - subscripts.</li> - </ol> - - <p> - If the trip count of a loop is computable, this pass also makes the following - changes: - </p> - - <ol> - <li>The exit condition for the loop is canonicalized to compare the - induction value against the exit value. This turns loops like: - <blockquote><pre>for (i = 7; i*i < 1000; ++i)</pre></blockquote> - into - <blockquote><pre>for (i = 0; i != 25; ++i)</pre></blockquote></li> - <li>Any use outside of the loop of an expression derived from the indvar - is changed to compute the derived value outside of the loop, eliminating - the dependence on the exit value of the induction variable. If the only - purpose of the loop is to compute the exit value of some derived - expression, this transformation will make the loop dead.</li> - </ol> - - <p> - This transformation should be followed by strength reduction after all of the - desired loop transformations have been performed. Additionally, on targets - where it is profitable, the loop could be transformed to count down to zero - (the "do loop" optimization). - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="inline">-inline: Function Integration/Inlining</a> -</h3> -<div> - <p> - Bottom-up inlining of functions into callees. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="insert-edge-profiling">-insert-edge-profiling: Insert instrumentation for edge profiling</a> -</h3> -<div> - <p> - This pass instruments the specified program with counters for edge profiling. - Edge profiling can give a reasonable approximation of the hot paths through a - program, and is used for a wide variety of program transformations. - </p> - - <p> - Note that this implementation is very naïve. 
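Looking back at the ``-indvars`` entry above, its first guarantee is a single canonical induction variable that starts at zero and steps by one. A hedged sketch of a loop already in that form (names invented; assumes ``%n`` is at least 1):

.. code-block:: llvm

    define i32 @count_to(i32 %n) {
    entry:
      br label %loop

    loop:
      ; canonical induction variable: first phi in the header,
      ; starts at 0 and steps by 1
      %iv      = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
      %iv.next = add i32 %iv, 1
      %done    = icmp eq i32 %iv.next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %iv.next
    }
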
It inserts a counter for - <em>every</em> edge in the program, instead of using control flow information - to prune the number of counters inserted. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="insert-optimal-edge-profiling">-insert-optimal-edge-profiling: Insert optimal instrumentation for edge profiling</a> -</h3> -<div> - <p>This pass instruments the specified program with counters for edge profiling. - Edge profiling can give a reasonable approximation of the hot paths through a - program, and is used for a wide variety of program transformations. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="instcombine">-instcombine: Combine redundant instructions</a> -</h3> -<div> - <p> - Combine instructions to form fewer, simple - instructions. This pass does not modify the CFG This pass is where algebraic - simplification happens. - </p> - - <p> - This pass combines things like: - </p> - -<blockquote><pre ->%Y = add i32 %X, 1 -%Z = add i32 %Y, 1</pre></blockquote> - - <p> - into: - </p> - -<blockquote><pre ->%Z = add i32 %X, 2</pre></blockquote> - - <p> - This is a simple worklist driven algorithm. - </p> - - <p> - This pass guarantees that the following canonicalizations are performed on - the program: - </p> - - <ul> - <li>If a binary operator has a constant operand, it is moved to the right- - hand side.</li> - <li>Bitwise operators with constant operands are always grouped so that - shifts are performed first, then <code>or</code>s, then - <code>and</code>s, then <code>xor</code>s.</li> - <li>Compare instructions are converted from <code><</code>, - <code>></code>, <code>≤</code>, or <code>≥</code> to - <code>=</code> or <code>≠</code> if possible.</li> - <li>All <code>cmp</code> instructions on boolean values are replaced with - logical operations.</li> - <li><code>add <var>X</var>, <var>X</var></code> is represented as - <code>mul <var>X</var>, 2</code> ⇒ <code>shl <var>X</var>, 1</code></li> - <li>Multiplies with a constant power-of-two argument are transformed into - shifts.</li> - <li>… etc.</li> - </ul> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="internalize">-internalize: Internalize Global Symbols</a> -</h3> -<div> - <p> - This pass loops over all of the functions in the input module, looking for a - main function. If a main function is found, all other functions and all - global variables with initializers are marked as internal. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="ipconstprop">-ipconstprop: Interprocedural constant propagation</a> -</h3> -<div> - <p> - This pass implements an <em>extremely</em> simple interprocedural constant - propagation pass. It could certainly be improved in many different ways, - like using a worklist. This pass makes arguments dead, but does not remove - them. The existing dead argument elimination pass should be run after this - to clean up the mess. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="ipsccp">-ipsccp: Interprocedural Sparse Conditional Constant Propagation</a> -</h3> -<div> - <p> - An interprocedural variant of <a href="#sccp">Sparse Conditional Constant - Propagation</a>. 
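The ``-ipsccp`` entry above is the interprocedural form of ``-sccp``. As an illustrative sketch that is not from the deleted page, SCCP proves the ``phi`` below constant because the branch condition is itself a constant, and then removes the unreachable arm:

.. code-block:: llvm

    define i32 @always_seven(i32 %x) {
    entry:
      br i1 true, label %then, label %else

    then:
      br label %merge

    else:                       ; proven unreachable
      br label %merge

    merge:
      %v = phi i32 [ 7, %then ], [ %x, %else ]
      ret i32 %v                ; -sccp folds this to 'ret i32 7'
    }
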
- </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="jump-threading">-jump-threading: Jump Threading</a> -</h3> -<div> - <p> - Jump threading tries to find distinct threads of control flow running through - a basic block. This pass looks at blocks that have multiple predecessors and - multiple successors. If one or more of the predecessors of the block can be - proven to always cause a jump to one of the successors, we forward the edge - from the predecessor to the successor by duplicating the contents of this - block. - </p> - <p> - An example of when this can occur is code like this: - </p> - - <pre ->if () { ... - X = 4; -} -if (X < 3) {</pre> - - <p> - In this case, the unconditional branch at the end of the first if can be - revectored to the false side of the second if. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="lcssa">-lcssa: Loop-Closed SSA Form Pass</a> -</h3> -<div> - <p> - This pass transforms loops by placing phi nodes at the end of the loops for - all values that are live across the loop boundary. For example, it turns - the left into the right code: - </p> - - <pre ->for (...) for (...) - if (c) if (c) - X1 = ... X1 = ... - else else - X2 = ... X2 = ... - X3 = phi(X1, X2) X3 = phi(X1, X2) -... = X3 + 4 X4 = phi(X3) - ... = X4 + 4</pre> - - <p> - This is still valid LLVM; the extra phi nodes are purely redundant, and will - be trivially eliminated by <code>InstCombine</code>. The major benefit of - this transformation is that it makes many other loop optimizations, such as - LoopUnswitching, simpler. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="licm">-licm: Loop Invariant Code Motion</a> -</h3> -<div> - <p> - This pass performs loop invariant code motion, attempting to remove as much - code from the body of a loop as possible. It does this by either hoisting - code into the preheader block, or by sinking code to the exit blocks if it is - safe. This pass also promotes must-aliased memory locations in the loop to - live in registers, thus hoisting and sinking "invariant" loads and stores. - </p> - - <p> - This pass uses alias analysis for two purposes: - </p> - - <ul> - <li>Moving loop invariant loads and calls out of loops. If we can determine - that a load or call inside of a loop never aliases anything stored to, - we can hoist it or sink it like any other instruction.</li> - <li>Scalar Promotion of Memory - If there is a store instruction inside of - the loop, we try to move the store to happen AFTER the loop instead of - inside of the loop. This can only happen if a few conditions are true: - <ul> - <li>The pointer stored through is loop invariant.</li> - <li>There are no stores or loads in the loop which <em>may</em> alias - the pointer. There are no calls in the loop which mod/ref the - pointer.</li> - </ul> - If these conditions are true, we can promote the loads and stores in the - loop of the pointer to use a temporary alloca'd variable. We then use - the mem2reg functionality to construct the appropriate SSA form for the - variable.</li> - </ul> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="loop-deletion">-loop-deletion: Delete dead loops</a> -</h3> -<div> - <p> - This file implements the Dead Loop Deletion Pass. 
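Returning to the ``-licm`` entry above, a minimal sketch of the hoisting case it describes (invented function; the scalar-promotion case is not shown): both operands of the multiply are defined outside the loop, so the pass can move it into the entry block.

.. code-block:: llvm

    define i32 @sum_scaled(i32 %n, i32 %a, i32 %b) {
    entry:
      br label %loop

    loop:
      %i        = phi i32 [ 0, %entry ], [ %i.next, %loop ]
      %acc      = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
      ; loop-invariant: -licm hoists this multiply into %entry
      %scale    = mul i32 %a, %b
      %acc.next = add i32 %acc, %scale
      %i.next   = add i32 %i, 1
      %done     = icmp eq i32 %i.next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %acc.next
    }
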
This pass is responsible - for eliminating loops with non-infinite computable trip counts that have no - side effects or volatile instructions, and do not contribute to the - computation of the function's return value. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="loop-extract">-loop-extract: Extract loops into new functions</a> -</h3> -<div> - <p> - A pass wrapper around the <code>ExtractLoop()</code> scalar transformation to - extract each top-level loop into its own new function. If the loop is the - <em>only</em> loop in a given function, it is not touched. This is a pass most - useful for debugging via bugpoint. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="loop-extract-single">-loop-extract-single: Extract at most one loop into a new function</a> -</h3> -<div> - <p> - Similar to <a href="#loop-extract">Extract loops into new functions</a>, - this pass extracts one natural loop from the program into a function if it - can. This is used by bugpoint. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="loop-reduce">-loop-reduce: Loop Strength Reduction</a> -</h3> -<div> - <p> - This pass performs a strength reduction on array references inside loops that - have as one or more of their components the loop induction variable. This is - accomplished by creating a new value to hold the initial value of the array - access for the first iteration, and then creating a new GEP instruction in - the loop to increment the value by the appropriate amount. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="loop-rotate">-loop-rotate: Rotate Loops</a> -</h3> -<div> - <p>A simple loop rotation transformation.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="loop-simplify">-loop-simplify: Canonicalize natural loops</a> -</h3> -<div> - <p> - This pass performs several transformations to transform natural loops into a - simpler form, which makes subsequent analyses and transformations simpler and - more effective. - </p> - - <p> - Loop pre-header insertion guarantees that there is a single, non-critical - entry edge from outside of the loop to the loop header. This simplifies a - number of analyses and transformations, such as LICM. - </p> - - <p> - Loop exit-block insertion guarantees that all exit blocks from the loop - (blocks which are outside of the loop that have predecessors inside of the - loop) only have predecessors from inside of the loop (and are thus dominated - by the loop header). This simplifies transformations such as store-sinking - that are built into LICM. - </p> - - <p> - This pass also guarantees that loops will have exactly one backedge. - </p> - - <p> - Note that the simplifycfg pass will clean up blocks which are split out but - end up being unnecessary, so usage of this pass should not pessimize - generated code. - </p> - - <p> - This pass obviously modifies the CFG, but updates loop information and - dominator information. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="loop-unroll">-loop-unroll: Unroll loops</a> -</h3> -<div> - <p> - This pass implements a simple loop unroller. 
It works best when loops have - been canonicalized by the <a href="#indvars"><tt>-indvars</tt></a> pass, - allowing it to determine the trip counts of loops easily. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="loop-unswitch">-loop-unswitch: Unswitch loops</a> -</h3> -<div> - <p> - This pass transforms loops that contain branches on loop-invariant conditions - to have multiple loops. For example, it turns the left into the right code: - </p> - - <pre ->for (...) if (lic) - A for (...) - if (lic) A; B; C - B else - C for (...) - A; C</pre> - - <p> - This can increase the size of the code exponentially (doubling it every time - a loop is unswitched) so we only unswitch if the resultant code will be - smaller than a threshold. - </p> - - <p> - This pass expects LICM to be run before it to hoist invariant conditions out - of the loop, to make the unswitching opportunity obvious. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="loweratomic">-loweratomic: Lower atomic intrinsics to non-atomic form</a> -</h3> -<div> - <p> - This pass lowers atomic intrinsics to non-atomic form for use in a known - non-preemptible environment. - </p> - - <p> - The pass does not verify that the environment is non-preemptible (in - general this would require knowledge of the entire call graph of the - program including any libraries which may not be available in bitcode form); - it simply lowers every atomic intrinsic. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="lowerinvoke">-lowerinvoke: Lower invoke and unwind, for unwindless code generators</a> -</h3> -<div> - <p> - This transformation is designed for use by code generators which do not yet - support stack unwinding. This pass supports two models of exception handling - lowering, the 'cheap' support and the 'expensive' support. - </p> - - <p> - 'Cheap' exception handling support gives the program the ability to execute - any program which does not "throw an exception", by turning 'invoke' - instructions into calls and by turning 'unwind' instructions into calls to - abort(). If the program does dynamically use the unwind instruction, the - program will print a message then abort. - </p> - - <p> - 'Expensive' exception handling support gives the full exception handling - support to the program at the cost of making the 'invoke' instruction - really expensive. It basically inserts setjmp/longjmp calls to emulate the - exception handling as necessary. - </p> - - <p> - Because the 'expensive' support slows down programs a lot, and EH is only - used for a subset of the programs, it must be specifically enabled by the - <tt>-enable-correct-eh-support</tt> option. - </p> - - <p> - Note that after this pass runs the CFG is not entirely accurate (exceptional - control flow edges are not correct anymore) so only very simple things should - be done after the lowerinvoke pass has run (like generation of native code). - This should not be used as a general purpose "my LLVM-to-LLVM pass doesn't - support the invoke instruction yet" lowering pass. 
- </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="lowerswitch">-lowerswitch: Lower SwitchInst's to branches</a> -</h3> -<div> - <p> - Rewrites <tt>switch</tt> instructions with a sequence of branches, which - allows targets to get away with not implementing the switch instruction until - it is convenient. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="mem2reg">-mem2reg: Promote Memory to Register</a> -</h3> -<div> - <p> - This file promotes memory references to be register references. It promotes - <tt>alloca</tt> instructions which only have <tt>load</tt>s and - <tt>store</tt>s as uses. An <tt>alloca</tt> is transformed by using dominator - frontiers to place <tt>phi</tt> nodes, then traversing the function in - depth-first order to rewrite <tt>load</tt>s and <tt>store</tt>s as - appropriate. This is just the standard SSA construction algorithm to construct - "pruned" SSA form. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="memcpyopt">-memcpyopt: MemCpy Optimization</a> -</h3> -<div> - <p> - This pass performs various transformations related to eliminating memcpy - calls, or transforming sets of stores into memset's. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="mergefunc">-mergefunc: Merge Functions</a> -</h3> -<div> - <p>This pass looks for equivalent functions that are mergable and folds them. - - A hash is computed from the function, based on its type and number of - basic blocks. - - Once all hashes are computed, we perform an expensive equality comparison - on each function pair. This takes n^2/2 comparisons per bucket, so it's - important that the hash function be high quality. The equality comparison - iterates through each instruction in each basic block. - - When a match is found the functions are folded. If both functions are - overridable, we move the functionality into a new internal function and - leave two overridable thunks to it. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="mergereturn">-mergereturn: Unify function exit nodes</a> -</h3> -<div> - <p> - Ensure that functions have at most one <tt>ret</tt> instruction in them. - Additionally, it keeps track of which node is the new exit node of the CFG. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="partial-inliner">-partial-inliner: Partial Inliner</a> -</h3> -<div> - <p>This pass performs partial inlining, typically by inlining an if - statement that surrounds the body of the function. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="prune-eh">-prune-eh: Remove unused exception handling info</a> -</h3> -<div> - <p> - This file implements a simple interprocedural pass which walks the call-graph, - turning <tt>invoke</tt> instructions into <tt>call</tt> instructions if and - only if the callee cannot throw an exception. It implements this as a - bottom-up traversal of the call-graph. 
- </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="reassociate">-reassociate: Reassociate expressions</a> -</h3> -<div> - <p> - This pass reassociates commutative expressions in an order that is designed - to promote better constant propagation, GCSE, LICM, PRE, etc. - </p> - - <p> - For example: 4 + (<var>x</var> + 5) ⇒ <var>x</var> + (4 + 5) - </p> - - <p> - In the implementation of this algorithm, constants are assigned rank = 0, - function arguments are rank = 1, and other values are assigned ranks - corresponding to the reverse post order traversal of current function - (starting at 2), which effectively gives values in deep loops higher rank - than values not in loops. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="reg2mem">-reg2mem: Demote all values to stack slots</a> -</h3> -<div> - <p> - This file demotes all registers to memory references. It is intended to be - the inverse of <a href="#mem2reg"><tt>-mem2reg</tt></a>. By converting to - <tt>load</tt> instructions, the only values live across basic blocks are - <tt>alloca</tt> instructions and <tt>load</tt> instructions before - <tt>phi</tt> nodes. It is intended that this should make CFG hacking much - easier. To make later hacking easier, the entry block is split into two, such - that all introduced <tt>alloca</tt> instructions (and nothing else) are in the - entry block. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="scalarrepl">-scalarrepl: Scalar Replacement of Aggregates (DT)</a> -</h3> -<div> - <p> - The well-known scalar replacement of aggregates transformation. This - transform breaks up <tt>alloca</tt> instructions of aggregate type (structure - or array) into individual <tt>alloca</tt> instructions for each member if - possible. Then, if possible, it transforms the individual <tt>alloca</tt> - instructions into nice clean scalar SSA form. - </p> - - <p> - This combines a simple scalar replacement of aggregates algorithm with the <a - href="#mem2reg"><tt>mem2reg</tt></a> algorithm because often interact, - especially for C++ programs. As such, iterating between <tt>scalarrepl</tt>, - then <a href="#mem2reg"><tt>mem2reg</tt></a> until we run out of things to - promote works well. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="sccp">-sccp: Sparse Conditional Constant Propagation</a> -</h3> -<div> - <p> - Sparse conditional constant propagation and merging, which can be summarized - as: - </p> - - <ol> - <li>Assumes values are constant unless proven otherwise</li> - <li>Assumes BasicBlocks are dead unless proven otherwise</li> - <li>Proves values to be constant, and replaces them with constants</li> - <li>Proves conditional branches to be unconditional</li> - </ol> - - <p> - Note that this pass has a habit of making definitions be dead. It is a good - idea to to run a DCE pass sometime after running this pass. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="simplify-libcalls">-simplify-libcalls: Simplify well-known library calls</a> -</h3> -<div> - <p> - Applies a variety of small optimizations for calls to specific well-known - function calls (e.g. runtime library functions). 
For example, a call - <tt>exit(3)</tt> that occurs within the <tt>main()</tt> function can be - transformed into simply <tt>return 3</tt>. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="simplifycfg">-simplifycfg: Simplify the CFG</a> -</h3> -<div> - <p> - Performs dead code elimination and basic block merging. Specifically: - </p> - - <ol> - <li>Removes basic blocks with no predecessors.</li> - <li>Merges a basic block into its predecessor if there is only one and the - predecessor only has one successor.</li> - <li>Eliminates PHI nodes for basic blocks with a single predecessor.</li> - <li>Eliminates a basic block that only contains an unconditional - branch.</li> - </ol> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="sink">-sink: Code sinking</a> -</h3> -<div> - <p>This pass moves instructions into successor blocks, when possible, so that - they aren't executed on paths where their results aren't needed. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="strip">-strip: Strip all symbols from a module</a> -</h3> -<div> - <p> - performs code stripping. this transformation can delete: - </p> - - <ol> - <li>names for virtual registers</li> - <li>symbols for internal globals and functions</li> - <li>debug information</li> - </ol> - - <p> - note that this transformation makes code much less readable, so it should - only be used in situations where the <tt>strip</tt> utility would be used, - such as reducing code size or making it harder to reverse engineer code. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="strip-dead-debug-info">-strip-dead-debug-info: Strip debug info for unused symbols</a> -</h3> -<div> - <p> - performs code stripping. this transformation can delete: - </p> - - <ol> - <li>names for virtual registers</li> - <li>symbols for internal globals and functions</li> - <li>debug information</li> - </ol> - - <p> - note that this transformation makes code much less readable, so it should - only be used in situations where the <tt>strip</tt> utility would be used, - such as reducing code size or making it harder to reverse engineer code. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="strip-dead-prototypes">-strip-dead-prototypes: Strip Unused Function Prototypes</a> -</h3> -<div> - <p> - This pass loops over all of the functions in the input module, looking for - dead declarations and removes them. Dead declarations are declarations of - functions for which no implementation is available (i.e., declarations for - unused library functions). - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="strip-debug-declare">-strip-debug-declare: Strip all llvm.dbg.declare intrinsics</a> -</h3> -<div> - <p>This pass implements code stripping. Specifically, it can delete:</p> - <ul> - <li>names for virtual registers</li> - <li>symbols for internal globals and functions</li> - <li>debug information</li> - </ul> - <p> - Note that this transformation makes code much less readable, so it should - only be used in situations where the 'strip' utility would be used, such as - reducing code size or making it harder to reverse engineer code. 
- </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="strip-nondebug">-strip-nondebug: Strip all symbols, except dbg symbols, from a module</a> -</h3> -<div> - <p>This pass implements code stripping. Specifically, it can delete:</p> - <ul> - <li>names for virtual registers</li> - <li>symbols for internal globals and functions</li> - <li>debug information</li> - </ul> - <p> - Note that this transformation makes code much less readable, so it should - only be used in situations where the 'strip' utility would be used, such as - reducing code size or making it harder to reverse engineer code. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="tailcallelim">-tailcallelim: Tail Call Elimination</a> -</h3> -<div> - <p> - This file transforms calls of the current function (self recursion) followed - by a return instruction with a branch to the entry of the function, creating - a loop. This pass also implements the following extensions to the basic - algorithm: - </p> - - <ul> - <li>Trivial instructions between the call and return do not prevent the - transformation from taking place, though currently the analysis cannot - support moving any really useful instructions (only dead ones). - <li>This pass transforms functions that are prevented from being tail - recursive by an associative expression to use an accumulator variable, - thus compiling the typical naive factorial or <tt>fib</tt> implementation - into efficient code. - <li>TRE is performed if the function returns void, if the return - returns the result returned by the call, or if the function returns a - run-time constant on all exits from the function. It is possible, though - unlikely, that the return returns something else (like constant 0), and - can still be TRE'd. It can be TRE'd if <em>all other</em> return - instructions in the function return the exact same value. - <li>If it can prove that callees do not access theier caller stack frame, - they are marked as eligible for tail call elimination (by the code - generator). - </ul> -</div> - -<!-- ======================================================================= --> -<h2><a name="utilities">Utility Passes</a></h2> -<div> - <p>This section describes the LLVM Utility Passes.</p> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="deadarghaX0r">-deadarghaX0r: Dead Argument Hacking (BUGPOINT USE ONLY; DO NOT USE)</a> -</h3> -<div> - <p> - Same as dead argument elimination, but deletes arguments to functions which - are external. This is only for use by <a - href="Bugpoint.html">bugpoint</a>.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="extract-blocks">-extract-blocks: Extract Basic Blocks From Module (for bugpoint use)</a> -</h3> -<div> - <p> - This pass is used by bugpoint to extract all blocks from the module into their - own functions.</p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="instnamer">-instnamer: Assign names to anonymous instructions</a> -</h3> -<div> - <p>This is a little utility pass that gives instructions names, this is mostly - useful when diffing the effect of an optimization because deleting an - unnamed instruction can change all other instruction numbering, making the - diff very noisy. 
- </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="preverify">-preverify: Preliminary module verification</a> -</h3> -<div> - <p> - Ensures that the module is in the form required by the <a - href="#verifier">Module Verifier</a> pass. - </p> - - <p> - Running the verifier runs this pass automatically, so there should be no need - to use it directly. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="verify">-verify: Module Verifier</a> -</h3> -<div> - <p> - Verifies an LLVM IR code. This is useful to run after an optimization which is - undergoing testing. Note that <tt>llvm-as</tt> verifies its input before - emitting bitcode, and also that malformed bitcode is likely to make LLVM - crash. All language front-ends are therefore encouraged to verify their output - before performing optimizing transformations. - </p> - - <ul> - <li>Both of a binary operator's parameters are of the same type.</li> - <li>Verify that the indices of mem access instructions match other - operands.</li> - <li>Verify that arithmetic and other things are only performed on - first-class types. Verify that shifts and logicals only happen on - integrals f.e.</li> - <li>All of the constants in a switch statement are of the correct type.</li> - <li>The code is in valid SSA form.</li> - <li>It is illegal to put a label into any other type (like a structure) or - to return one.</li> - <li>Only phi nodes can be self referential: <tt>%x = add i32 %x, %x</tt> is - invalid.</li> - <li>PHI nodes must have an entry for each predecessor, with no extras.</li> - <li>PHI nodes must be the first thing in a basic block, all grouped - together.</li> - <li>PHI nodes must have at least one entry.</li> - <li>All basic blocks should only end with terminator insts, not contain - them.</li> - <li>The entry node to a function must not have predecessors.</li> - <li>All Instructions must be embedded into a basic block.</li> - <li>Functions cannot take a void-typed parameter.</li> - <li>Verify that a function's argument list agrees with its declared - type.</li> - <li>It is illegal to specify a name for a void value.</li> - <li>It is illegal to have an internal global value with no initializer.</li> - <li>It is illegal to have a ret instruction that returns a value that does - not agree with the function return value type.</li> - <li>Function call argument types match the function prototype.</li> - <li>All other things that are tested by asserts spread about the code.</li> - </ul> - - <p> - Note that this does not provide full security verification (like Java), but - instead just tries to ensure that code is well-formed. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="view-cfg">-view-cfg: View CFG of function</a> -</h3> -<div> - <p> - Displays the control flow graph using the GraphViz tool. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="view-cfg-only">-view-cfg-only: View CFG of function (with no function bodies)</a> -</h3> -<div> - <p> - Displays the control flow graph using the GraphViz tool, but omitting function - bodies. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="view-dom">-view-dom: View dominance tree of function</a> -</h3> -<div> - <p> - Displays the dominator tree using the GraphViz tool. 
- </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="view-dom-only">-view-dom-only: View dominance tree of function (with no function bodies)</a> -</h3> -<div> - <p> - Displays the dominator tree using the GraphViz tool, but omitting function - bodies. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="view-postdom">-view-postdom: View postdominance tree of function</a> -</h3> -<div> - <p> - Displays the post dominator tree using the GraphViz tool. - </p> -</div> - -<!-------------------------------------------------------------------------- --> -<h3> - <a name="view-postdom-only">-view-postdom-only: View postdominance tree of function (with no function bodies)</a> -</h3> -<div> - <p> - Displays the post dominator tree using the GraphViz tool, but omitting - function bodies. - </p> -</div> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:rspencer@x10sys.com">Reid Spencer</a><br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date$ -</address> - -</body> -</html> diff --git a/docs/Passes.rst b/docs/Passes.rst new file mode 100644 index 0000000000..9cb8ba0c34 --- /dev/null +++ b/docs/Passes.rst @@ -0,0 +1,1261 @@ +.. + If Passes.html is up to date, the following "one-liner" should print + an empty diff. + + egrep -e '^<tr><td><a href="#.*">-.*</a></td><td>.*</td></tr>$' \ + -e '^ <a name=".*">.*</a>$' < Passes.html >html; \ + perl >help <<'EOT' && diff -u help html; rm -f help html + open HTML, "<Passes.html" or die "open: Passes.html: $!\n"; + while (<HTML>) { + m:^<tr><td><a href="#(.*)">-.*</a></td><td>.*</td></tr>$: or next; + $order{$1} = sprintf("%03d", 1 + int %order); + } + open HELP, "../Release/bin/opt -help|" or die "open: opt -help: $!\n"; + while (<HELP>) { + m:^ -([^ ]+) +- (.*)$: or next; + my $o = $order{$1}; + $o = "000" unless defined $o; + push @x, "$o<tr><td><a href=\"#$1\">-$1</a></td><td>$2</td></tr>\n"; + push @y, "$o <a name=\"$1\">-$1: $2</a>\n"; + } + @x = map { s/^\d\d\d//; $_ } sort @x; + @y = map { s/^\d\d\d//; $_ } sort @y; + print @x, @y; + EOT + + This (real) one-liner can also be helpful when converting comments to HTML: + + perl -e '$/ = undef; for (split(/\n/, <>)) { s:^ *///? ?::; print " <p>\n" if !$on && $_ =~ /\S/; print " </p>\n" if $on && $_ =~ /^\s*$/; print " $_\n"; $on = ($_ =~ /\S/); } print " </p>\n" if $on' + +==================================== +LLVM's Analysis and Transform Passes +==================================== + +.. contents:: + :local: + +Introduction +============ + +This document serves as a high level summary of the optimization features that +LLVM provides. Optimizations are implemented as Passes that traverse some +portion of a program to either collect information or transform the program. +The table below divides the passes that LLVM provides into three categories. +Analysis passes compute information that other passes can use or for debugging +or program visualization purposes. Transform passes can use (or invalidate) +the analysis passes. Transform passes all mutate the program in some way. 
+Utility passes provides some utility but don't otherwise fit categorization. +For example passes to extract functions to bitcode or write a module to bitcode +are neither analysis nor transform passes. The table of contents above +provides a quick summary of each pass and links to the more complete pass +description later in the document. + +Analysis Passes +=============== + +This section describes the LLVM Analysis Passes. + +``-aa-eval``: Exhaustive Alias Analysis Precision Evaluator +----------------------------------------------------------- + +This is a simple N^2 alias analysis accuracy evaluator. Basically, for each +function in the program, it simply queries to see how the alias analysis +implementation answers alias queries between each pair of pointers in the +function. + +This is inspired and adapted from code by: Naveen Neelakantam, Francesco +Spadini, and Wojciech Stryjewski. + +``-basicaa``: Basic Alias Analysis (stateless AA impl) +------------------------------------------------------ + +A basic alias analysis pass that implements identities (two different globals +cannot alias, etc), but does no stateful analysis. + +``-basiccg``: Basic CallGraph Construction +------------------------------------------ + +Yet to be written. + +``-count-aa``: Count Alias Analysis Query Responses +--------------------------------------------------- + +A pass which can be used to count how many alias queries are being made and how +the alias analysis implementation being used responds. + +``-da``: Dependence Analysis +---------------------------- + +Dependence analysis framework, which is used to detect dependences in memory +accesses. + +``-debug-aa``: AA use debugger +------------------------------ + +This simple pass checks alias analysis users to ensure that if they create a +new value, they do not query AA without informing it of the value. It acts as +a shim over any other AA pass you want. + +Yes keeping track of every value in the program is expensive, but this is a +debugging pass. + +``-domfrontier``: Dominance Frontier Construction +------------------------------------------------- + +This pass is a simple dominator construction algorithm for finding forward +dominator frontiers. + +``-domtree``: Dominator Tree Construction +----------------------------------------- + +This pass is a simple dominator construction algorithm for finding forward +dominators. + + +``-dot-callgraph``: Print Call Graph to "dot" file +-------------------------------------------------- + +This pass, only available in ``opt``, prints the call graph into a ``.dot`` +graph. This graph can then be processed with the "dot" tool to convert it to +postscript or some other suitable format. + +``-dot-cfg``: Print CFG of function to "dot" file +------------------------------------------------- + +This pass, only available in ``opt``, prints the control flow graph into a +``.dot`` graph. This graph can then be processed with the :program:`dot` tool +to convert it to postscript or some other suitable format. + +``-dot-cfg-only``: Print CFG of function to "dot" file (with no function bodies) +-------------------------------------------------------------------------------- + +This pass, only available in ``opt``, prints the control flow graph into a +``.dot`` graph, omitting the function bodies. This graph can then be processed +with the :program:`dot` tool to convert it to postscript or some other suitable +format. 
+ +``-dot-dom``: Print dominance tree of function to "dot" file +------------------------------------------------------------ + +This pass, only available in ``opt``, prints the dominator tree into a ``.dot`` +graph. This graph can then be processed with the :program:`dot` tool to +convert it to postscript or some other suitable format. + +``-dot-dom-only``: Print dominance tree of function to "dot" file (with no function bodies) +------------------------------------------------------------------------------------------- + +This pass, only available in ``opt``, prints the dominator tree into a ``.dot`` +graph, omitting the function bodies. This graph can then be processed with the +:program:`dot` tool to convert it to postscript or some other suitable format. + +``-dot-postdom``: Print postdominance tree of function to "dot" file +-------------------------------------------------------------------- + +This pass, only available in ``opt``, prints the post dominator tree into a +``.dot`` graph. This graph can then be processed with the :program:`dot` tool +to convert it to postscript or some other suitable format. + +``-dot-postdom-only``: Print postdominance tree of function to "dot" file (with no function bodies) +--------------------------------------------------------------------------------------------------- + +This pass, only available in ``opt``, prints the post dominator tree into a +``.dot`` graph, omitting the function bodies. This graph can then be processed +with the :program:`dot` tool to convert it to postscript or some other suitable +format. + +``-globalsmodref-aa``: Simple mod/ref analysis for globals +---------------------------------------------------------- + +This simple pass provides alias and mod/ref information for global values that +do not have their address taken, and keeps track of whether functions read or +write memory (are "pure"). For this simple (but very common) case, we can +provide pretty accurate and useful information. + +``-instcount``: Counts the various types of ``Instruction``\ s +-------------------------------------------------------------- + +This pass collects the count of all instructions and reports them. + +``-intervals``: Interval Partition Construction +----------------------------------------------- + +This analysis calculates and represents the interval partition of a function, +or a preexisting interval partition. + +In this way, the interval partition may be used to reduce a flow graph down to +its degenerate single node interval partition (unless it is irreducible). + +``-iv-users``: Induction Variable Users +--------------------------------------- + +Bookkeeping for "interesting" users of expressions computed from induction +variables. + +``-lazy-value-info``: Lazy Value Information Analysis +----------------------------------------------------- + +Interface for lazy computation of value constraint information. + +``-libcall-aa``: LibCall Alias Analysis +--------------------------------------- + +LibCall Alias Analysis. + +``-lint``: Statically lint-checks LLVM IR +----------------------------------------- + +This pass statically checks for common and easily-identified constructs which +produce undefined or likely unintended behavior in LLVM IR. + +It is not a guarantee of correctness, in two ways. First, it isn't +comprehensive. There are checks which could be done statically which are not +yet implemented. Some of these are indicated by TODO comments, but those +aren't comprehensive either. 
Second, many conditions cannot be checked +statically. This pass does no dynamic instrumentation, so it can't check for +all possible problems. + +Another limitation is that it assumes all code will be executed. A store +through a null pointer in a basic block which is never reached is harmless, but +this pass will warn about it anyway. + +Optimization passes may make conditions that this pass checks for more or less +obvious. If an optimization pass appears to be introducing a warning, it may +be that the optimization pass is merely exposing an existing condition in the +code. + +This code may be run before :ref:`instcombine <passes-instcombine>`. In many +cases, instcombine checks for the same kinds of things and turns instructions +with undefined behavior into unreachable (or equivalent). Because of this, +this pass makes some effort to look through bitcasts and so on. + +``-loops``: Natural Loop Information +------------------------------------ + +This analysis is used to identify natural loops and determine the loop depth of +various nodes of the CFG. Note that the loops identified may actually be +several natural loops that share the same header node... not just a single +natural loop. + +``-memdep``: Memory Dependence Analysis +--------------------------------------- + +An analysis that determines, for a given memory operation, what preceding +memory operations it depends on. It builds on alias analysis information, and +tries to provide a lazy, caching interface to a common kind of alias +information query. + +``-module-debuginfo``: Decodes module-level debug info +------------------------------------------------------ + +This pass decodes the debug info metadata in a module and prints in a +(sufficiently-prepared-) human-readable form. + +For example, run this pass from ``opt`` along with the ``-analyze`` option, and +it'll print to standard output. + +``-no-aa``: No Alias Analysis (always returns 'may' alias) +---------------------------------------------------------- + +This is the default implementation of the Alias Analysis interface. It always +returns "I don't know" for alias queries. NoAA is unlike other alias analysis +implementations, in that it does not chain to a previous analysis. As such it +doesn't follow many of the rules that other alias analyses must. + +``-no-profile``: No Profile Information +--------------------------------------- + +The default "no profile" implementation of the abstract ``ProfileInfo`` +interface. + +``-postdomfrontier``: Post-Dominance Frontier Construction +---------------------------------------------------------- + +This pass is a simple post-dominator construction algorithm for finding +post-dominator frontiers. + +``-postdomtree``: Post-Dominator Tree Construction +-------------------------------------------------- + +This pass is a simple post-dominator construction algorithm for finding +post-dominators. + +``-print-alias-sets``: Alias Set Printer +---------------------------------------- + +Yet to be written. + +``-print-callgraph``: Print a call graph +---------------------------------------- + +This pass, only available in ``opt``, prints the call graph to standard error +in a human-readable form. + +``-print-callgraph-sccs``: Print SCCs of the Call Graph +------------------------------------------------------- + +This pass, only available in ``opt``, prints the SCCs of the call graph to +standard error in a human-readable form. 
+
+``-print-cfg-sccs``: Print SCCs of each function CFG
+----------------------------------------------------
+
+This pass, only available in ``opt``, prints the SCCs of each function CFG to
+standard error in a human-readable form.
+
+``-print-dbginfo``: Print debug info in human readable form
+-----------------------------------------------------------
+
+Pass that prints instructions and associated debug info:
+
+#. source/line/col information
+#. original variable name
+#. original type name
+
+``-print-dom-info``: Dominator Info Printer
+-------------------------------------------
+
+Dominator Info Printer.
+
+``-print-externalfnconstants``: Print external fn callsites passed constants
+----------------------------------------------------------------------------
+
+This pass, only available in ``opt``, prints out call sites to external
+functions that are called with constant arguments. This can be useful when
+looking for standard library functions we should constant fold or handle in
+alias analyses.
+
+``-print-function``: Print function to stderr
+---------------------------------------------
+
+The ``PrintFunctionPass`` class is designed to be pipelined with other
+``FunctionPasses``, and prints out the functions of the module as they are
+processed.
+
+``-print-module``: Print module to stderr
+-----------------------------------------
+
+This pass simply prints out the entire module when it is executed.
+
+.. _passes-print-used-types:
+
+``-print-used-types``: Find Used Types
+--------------------------------------
+
+This pass is used to seek out all of the types in use by the program. Note
+that this analysis explicitly does not include types only used by the symbol
+table.
+
+``-profile-estimator``: Estimate profiling information
+------------------------------------------------------
+
+A pass that estimates profiling information in a very crude and unimaginative
+way.
+
+``-profile-loader``: Load profile information from ``llvmprof.out``
+-------------------------------------------------------------------
+
+A concrete implementation of profiling information that loads the information
+from a profile dump file.
+
+``-profile-verifier``: Verify profiling information
+---------------------------------------------------
+
+Pass that checks profiling information for plausibility.
+
+``-regions``: Detect single entry single exit regions
+-----------------------------------------------------
+
+The ``RegionInfo`` pass detects single entry single exit regions in a function,
+where a region is defined as any subgraph that is connected to the remaining
+graph at only two spots. Furthermore, a hierarchical region tree is built.
+
+``-scalar-evolution``: Scalar Evolution Analysis
+------------------------------------------------
+
+The ``ScalarEvolution`` analysis can be used to analyze and categorize scalar
+expressions in loops. It specializes in recognizing general induction
+variables, representing them with the abstract and opaque ``SCEV`` class.
+Given this analysis, trip counts of loops and other important properties can be
+obtained.
+
+This analysis is primarily useful for induction variable substitution and
+strength reduction.
+
+``-scev-aa``: ScalarEvolution-based Alias Analysis
+--------------------------------------------------
+
+Simple alias analysis implemented in terms of ``ScalarEvolution`` queries.
+
+This differs from traditional loop dependence analysis in that it tests for
+dependencies within a single iteration of a loop, rather than dependencies
+between different iterations.
+
+``ScalarEvolution`` has a more complete understanding of pointer arithmetic
+than ``BasicAliasAnalysis``' collection of ad-hoc analyses.
+
+``-targetdata``: Target Data Layout
+-----------------------------------
+
+Provides other passes access to information on the size and alignment required
+by the target ABI for various data types.
+
+Transform Passes
+================
+
+This section describes the LLVM Transform Passes.
+
+``-adce``: Aggressive Dead Code Elimination
+-------------------------------------------
+
+ADCE aggressively tries to eliminate code. This pass is similar to :ref:`DCE
+<passes-dce>` but it assumes that values are dead until proven otherwise. This
+is similar to :ref:`SCCP <passes-sccp>`, except applied to the liveness of
+values.
+
+``-always-inline``: Inliner for ``always_inline`` functions
+-----------------------------------------------------------
+
+A custom inliner that handles only functions that are marked as "always
+inline".
+
+``-argpromotion``: Promote 'by reference' arguments to scalars
+--------------------------------------------------------------
+
+This pass promotes "by reference" arguments to be "by value" arguments. In
+practice, this means looking for internal functions that have pointer
+arguments. If it can prove, through the use of alias analysis, that an
+argument is *only* loaded, then it can pass the value into the function instead
+of the address of the value. This can cause recursive simplification of code
+and lead to the elimination of allocas (especially in C++ template code like
+the STL).
+
+This pass also handles aggregate arguments that are passed into a function,
+scalarizing them if the elements of the aggregate are only loaded. Note that
+it refuses to scalarize aggregates which would require passing in more than
+three operands to the function, because passing thousands of operands for a
+large array or structure is unprofitable!
+
+Note that this transformation could also be done for arguments that are only
+stored to (returning the value instead), but this is not done currently. This
+case would be best handled when and if LLVM starts supporting multiple return
+values from functions.
+
+``-bb-vectorize``: Basic-Block Vectorization
+--------------------------------------------
+
+This pass combines instructions inside basic blocks to form vector
+instructions. It iterates over each basic block, attempting to pair compatible
+instructions, repeating this process until no additional pairs are selected for
+vectorization. When the outputs of some pair of compatible instructions are
+used as inputs by some other pair of compatible instructions, those pairs are
+part of a potential vectorization chain. Instruction pairs are only fused into
+vector instructions when they are part of a chain longer than some threshold
+length. Moreover, the pass attempts to find the best possible chain for each
+pair of compatible instructions. These heuristics are intended to prevent
+vectorization in cases where it would not yield a performance increase of the
+resulting code.
+
+``-block-placement``: Profile Guided Basic Block Placement
+----------------------------------------------------------
+
+This pass is a very simple profile guided basic block placement algorithm.
The +idea is to put frequently executed blocks together at the start of the function +and hopefully increase the number of fall-through conditional branches. If +there is no profile information for a particular function, this pass basically +orders blocks in depth-first order. + +``-break-crit-edges``: Break critical edges in CFG +-------------------------------------------------- + +Break all of the critical edges in the CFG by inserting a dummy basic block. +It may be "required" by passes that cannot deal with critical edges. This +transformation obviously invalidates the CFG, but can update forward dominator +(set, immediate dominators, tree, and frontier) information. + +``-codegenprepare``: Optimize for code generation +------------------------------------------------- + +This pass munges the code in the input function to better prepare it for +SelectionDAG-based code generation. This works around limitations in it's +basic-block-at-a-time approach. It should eventually be removed. + +``-constmerge``: Merge Duplicate Global Constants +------------------------------------------------- + +Merges duplicate global constants together into a single constant that is +shared. This is useful because some passes (i.e., TraceValues) insert a lot of +string constants into the program, regardless of whether or not an existing +string is available. + +``-constprop``: Simple constant propagation +------------------------------------------- + +This file implements constant propagation and merging. It looks for +instructions involving only constant operands and replaces them with a constant +value instead of an instruction. For example: + +.. code-block:: llvm + + add i32 1, 2 + +becomes + +.. code-block:: llvm + + i32 3 + +NOTE: this pass has a habit of making definitions be dead. It is a good idea +to to run a :ref:`Dead Instruction Elimination <passes-die>` pass sometime +after running this pass. + +.. _passes-dce: + +``-dce``: Dead Code Elimination +------------------------------- + +Dead code elimination is similar to :ref:`dead instruction elimination +<passes-die>`, but it rechecks instructions that were used by removed +instructions to see if they are newly dead. + +``-deadargelim``: Dead Argument Elimination +------------------------------------------- + +This pass deletes dead arguments from internal functions. Dead argument +elimination removes arguments which are directly dead, as well as arguments +only passed into function calls as dead arguments of other functions. This +pass also deletes dead arguments in a similar way. + +This pass is often useful as a cleanup pass to run after aggressive +interprocedural passes, which add possibly-dead arguments. + +``-deadtypeelim``: Dead Type Elimination +---------------------------------------- + +This pass is used to cleanup the output of GCC. It eliminate names for types +that are unused in the entire translation unit, using the :ref:`find used types +<passes-print-used-types>` pass. + +.. _passes-die: + +``-die``: Dead Instruction Elimination +-------------------------------------- + +Dead instruction elimination performs a single pass over the function, removing +instructions that are obviously dead. + +``-dse``: Dead Store Elimination +-------------------------------- + +A trivial dead store elimination that only considers basic-block local +redundant stores. 
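+
+As a small illustration (the function ``@f`` and its operands are invented for
+this example, not taken from the pass itself), a store that is overwritten
+later in the same block, with no intervening load of that location, is the
+kind of redundancy this pass removes:
+
+.. code-block:: llvm
+
+  ; hypothetical input: the first store is dead
+  define void @f(i32* %p) {
+    store i32 1, i32* %p    ; overwritten below, never read in between
+    store i32 2, i32* %p
+    ret void
+  }
+
+After the pass runs, only the second store remains.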
+ +``-functionattrs``: Deduce function attributes +---------------------------------------------- + +A simple interprocedural pass which walks the call-graph, looking for functions +which do not access or only read non-local memory, and marking them +``readnone``/``readonly``. In addition, it marks function arguments (of +pointer type) "``nocapture``" if a call to the function does not create any +copies of the pointer value that outlive the call. This more or less means +that the pointer is only dereferenced, and not returned from the function or +stored in a global. This pass is implemented as a bottom-up traversal of the +call-graph. + +``-globaldce``: Dead Global Elimination +--------------------------------------- + +This transform is designed to eliminate unreachable internal globals from the +program. It uses an aggressive algorithm, searching out globals that are known +to be alive. After it finds all of the globals which are needed, it deletes +whatever is left over. This allows it to delete recursive chunks of the +program which are unreachable. + +``-globalopt``: Global Variable Optimizer +----------------------------------------- + +This pass transforms simple global variables that never have their address +taken. If obviously true, it marks read/write globals as constant, deletes +variables only stored to, etc. + +``-gvn``: Global Value Numbering +-------------------------------- + +This pass performs global value numbering to eliminate fully and partially +redundant instructions. It also performs redundant load elimination. + +.. _passes-indvars: + +``-indvars``: Canonicalize Induction Variables +---------------------------------------------- + +This transformation analyzes and transforms the induction variables (and +computations derived from them) into simpler forms suitable for subsequent +analysis and transformation. + +This transformation makes the following changes to each loop with an +identifiable induction variable: + +* All loops are transformed to have a *single* canonical induction variable + which starts at zero and steps by one. +* The canonical induction variable is guaranteed to be the first PHI node in + the loop header block. +* Any pointer arithmetic recurrences are raised to use array subscripts. + +If the trip count of a loop is computable, this pass also makes the following +changes: + +* The exit condition for the loop is canonicalized to compare the induction + value against the exit value. This turns loops like: + + .. code-block:: c++ + + for (i = 7; i*i < 1000; ++i) + + into + + .. code-block:: c++ + + for (i = 0; i != 25; ++i) + +* Any use outside of the loop of an expression derived from the indvar is + changed to compute the derived value outside of the loop, eliminating the + dependence on the exit value of the induction variable. If the only purpose + of the loop is to compute the exit value of some derived expression, this + transformation will make the loop dead. + +This transformation should be followed by strength reduction after all of the +desired loop transformations have been performed. Additionally, on targets +where it is profitable, the loop could be transformed to count down to zero +(the "do loop" optimization). + +``-inline``: Function Integration/Inlining +------------------------------------------ + +Bottom-up inlining of functions into callees. 
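+
+As a rough sketch (the functions below are invented for illustration), inlining
+replaces a call site with a copy of the callee's body:
+
+.. code-block:: llvm
+
+  ; hypothetical example, before inlining
+  define internal i32 @callee(i32 %x) {
+    %r = add i32 %x, 1
+    ret i32 %r
+  }
+
+  define i32 @caller(i32 %y) {
+    %v = call i32 @callee(i32 %y)    ; this call is replaced by the callee body
+    ret i32 %v
+  }
+
+After inlining, ``@caller`` computes ``add i32 %y, 1`` directly, and the now
+unused ``internal`` copy of ``@callee`` can be removed by later cleanup passes.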
+
+``-insert-edge-profiling``: Insert instrumentation for edge profiling
+---------------------------------------------------------------------
+
+This pass instruments the specified program with counters for edge profiling.
+Edge profiling can give a reasonable approximation of the hot paths through a
+program, and is used for a wide variety of program transformations.
+
+Note that this implementation is very naïve. It inserts a counter for *every*
+edge in the program, instead of using control flow information to prune the
+number of counters inserted.
+
+``-insert-optimal-edge-profiling``: Insert optimal instrumentation for edge profiling
+-------------------------------------------------------------------------------------
+
+This pass instruments the specified program with counters for edge profiling.
+Edge profiling can give a reasonable approximation of the hot paths through a
+program, and is used for a wide variety of program transformations.
+
+.. _passes-instcombine:
+
+``-instcombine``: Combine redundant instructions
+------------------------------------------------
+
+Combine instructions to form fewer, simple instructions. This pass does not
+modify the CFG. This pass is where algebraic simplification happens.
+
+This pass combines things like:
+
+.. code-block:: llvm
+
+  %Y = add i32 %X, 1
+  %Z = add i32 %Y, 1
+
+into:
+
+.. code-block:: llvm
+
+  %Z = add i32 %X, 2
+
+This is a simple worklist driven algorithm.
+
+This pass guarantees that the following canonicalizations are performed on the
+program:
+
+#. If a binary operator has a constant operand, it is moved to the right-hand
+   side.
+#. Bitwise operators with constant operands are always grouped so that shifts
+   are performed first, then ``or``\ s, then ``and``\ s, then ``xor``\ s.
+#. Compare instructions are converted from ``<``, ``>``, ``≤``, or ``≥`` to
+   ``=`` or ``≠`` if possible.
+#. All ``cmp`` instructions on boolean values are replaced with logical
+   operations.
+#. ``add X, X`` is represented as ``mul X, 2`` ⇒ ``shl X, 1``
+#. Multiplies with a constant power-of-two argument are transformed into
+   shifts.
+#. … etc.
+
+``-internalize``: Internalize Global Symbols
+--------------------------------------------
+
+This pass loops over all of the functions in the input module, looking for a
+main function. If a main function is found, all other functions and all global
+variables with initializers are marked as internal.
+
+``-ipconstprop``: Interprocedural constant propagation
+------------------------------------------------------
+
+This pass implements an *extremely* simple interprocedural constant propagation
+pass. It could certainly be improved in many different ways, like using a
+worklist. This pass makes arguments dead, but does not remove them. The
+existing dead argument elimination pass should be run after this to clean up
+the mess.
+
+``-ipsccp``: Interprocedural Sparse Conditional Constant Propagation
+--------------------------------------------------------------------
+
+An interprocedural variant of :ref:`Sparse Conditional Constant Propagation
+<passes-sccp>`.
+
+``-jump-threading``: Jump Threading
+-----------------------------------
+
+Jump threading tries to find distinct threads of control flow running through a
+basic block. This pass looks at blocks that have multiple predecessors and
+multiple successors.
If one or more of the predecessors of the block can be +proven to always cause a jump to one of the successors, we forward the edge +from the predecessor to the successor by duplicating the contents of this +block. + +An example of when this can occur is code like this: + +.. code-block:: c++ + + if () { ... + X = 4; + } + if (X < 3) { + +In this case, the unconditional branch at the end of the first if can be +revectored to the false side of the second if. + +``-lcssa``: Loop-Closed SSA Form Pass +------------------------------------- + +This pass transforms loops by placing phi nodes at the end of the loops for all +values that are live across the loop boundary. For example, it turns the left +into the right code: + +.. code-block:: c++ + + for (...) for (...) + if (c) if (c) + X1 = ... X1 = ... + else else + X2 = ... X2 = ... + X3 = phi(X1, X2) X3 = phi(X1, X2) + ... = X3 + 4 X4 = phi(X3) + ... = X4 + 4 + +This is still valid LLVM; the extra phi nodes are purely redundant, and will be +trivially eliminated by ``InstCombine``. The major benefit of this +transformation is that it makes many other loop optimizations, such as +``LoopUnswitch``\ ing, simpler. + +.. _passes-licm: + +``-licm``: Loop Invariant Code Motion +------------------------------------- + +This pass performs loop invariant code motion, attempting to remove as much +code from the body of a loop as possible. It does this by either hoisting code +into the preheader block, or by sinking code to the exit blocks if it is safe. +This pass also promotes must-aliased memory locations in the loop to live in +registers, thus hoisting and sinking "invariant" loads and stores. + +This pass uses alias analysis for two purposes: + +#. Moving loop invariant loads and calls out of loops. If we can determine + that a load or call inside of a loop never aliases anything stored to, we + can hoist it or sink it like any other instruction. + +#. Scalar Promotion of Memory. If there is a store instruction inside of the + loop, we try to move the store to happen AFTER the loop instead of inside of + the loop. This can only happen if a few conditions are true: + + #. The pointer stored through is loop invariant. + #. There are no stores or loads in the loop which *may* alias the pointer. + There are no calls in the loop which mod/ref the pointer. + + If these conditions are true, we can promote the loads and stores in the + loop of the pointer to use a temporary alloca'd variable. We then use the + :ref:`mem2reg <passes-mem2reg>` functionality to construct the appropriate + SSA form for the variable. + +``-loop-deletion``: Delete dead loops +------------------------------------- + +This file implements the Dead Loop Deletion Pass. This pass is responsible for +eliminating loops with non-infinite computable trip counts that have no side +effects or volatile instructions, and do not contribute to the computation of +the function's return value. + +.. _passes-loop-extract: + +``-loop-extract``: Extract loops into new functions +--------------------------------------------------- + +A pass wrapper around the ``ExtractLoop()`` scalar transformation to extract +each top-level loop into its own new function. If the loop is the *only* loop +in a given function, it is not touched. This is a pass most useful for +debugging via bugpoint. 
+ +``-loop-extract-single``: Extract at most one loop into a new function +---------------------------------------------------------------------- + +Similar to :ref:`Extract loops into new functions <passes-loop-extract>`, this +pass extracts one natural loop from the program into a function if it can. +This is used by :program:`bugpoint`. + +``-loop-reduce``: Loop Strength Reduction +----------------------------------------- + +This pass performs a strength reduction on array references inside loops that +have as one or more of their components the loop induction variable. This is +accomplished by creating a new value to hold the initial value of the array +access for the first iteration, and then creating a new GEP instruction in the +loop to increment the value by the appropriate amount. + +``-loop-rotate``: Rotate Loops +------------------------------ + +A simple loop rotation transformation. + +``-loop-simplify``: Canonicalize natural loops +---------------------------------------------- + +This pass performs several transformations to transform natural loops into a +simpler form, which makes subsequent analyses and transformations simpler and +more effective. + +Loop pre-header insertion guarantees that there is a single, non-critical entry +edge from outside of the loop to the loop header. This simplifies a number of +analyses and transformations, such as :ref:`LICM <passes-licm>`. + +Loop exit-block insertion guarantees that all exit blocks from the loop (blocks +which are outside of the loop that have predecessors inside of the loop) only +have predecessors from inside of the loop (and are thus dominated by the loop +header). This simplifies transformations such as store-sinking that are built +into LICM. + +This pass also guarantees that loops will have exactly one backedge. + +Note that the :ref:`simplifycfg <passes-simplifycfg>` pass will clean up blocks +which are split out but end up being unnecessary, so usage of this pass should +not pessimize generated code. + +This pass obviously modifies the CFG, but updates loop information and +dominator information. + +``-loop-unroll``: Unroll loops +------------------------------ + +This pass implements a simple loop unroller. It works best when loops have +been canonicalized by the :ref:`indvars <passes-indvars>` pass, allowing it to +determine the trip counts of loops easily. + +``-loop-unswitch``: Unswitch loops +---------------------------------- + +This pass transforms loops that contain branches on loop-invariant conditions +to have multiple loops. For example, it turns the left into the right code: + +.. code-block:: c++ + + for (...) if (lic) + A for (...) + if (lic) A; B; C + B else + C for (...) + A; C + +This can increase the size of the code exponentially (doubling it every time a +loop is unswitched) so we only unswitch if the resultant code will be smaller +than a threshold. + +This pass expects :ref:`LICM <passes-licm>` to be run before it to hoist +invariant conditions out of the loop, to make the unswitching opportunity +obvious. + +``-loweratomic``: Lower atomic intrinsics to non-atomic form +------------------------------------------------------------ + +This pass lowers atomic intrinsics to non-atomic form for use in a known +non-preemptible environment. + +The pass does not verify that the environment is non-preemptible (in general +this would require knowledge of the entire call graph of the program including +any libraries which may not be available in bitcode form); it simply lowers +every atomic intrinsic. 
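+
+As a sketch of the effect (value names are illustrative), an atomic
+read-modify-write operation such as:
+
+.. code-block:: llvm
+
+  %old = atomicrmw add i32* %ptr, i32 1 seq_cst
+
+is lowered to the equivalent non-atomic sequence:
+
+.. code-block:: llvm
+
+  %old = load i32* %ptr
+  %new = add i32 %old, 1
+  store i32 %new, i32* %ptr
+
+which is only correct when nothing else can run concurrently with this code.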
+ +``-lowerinvoke``: Lower invoke and unwind, for unwindless code generators +------------------------------------------------------------------------- + +This transformation is designed for use by code generators which do not yet +support stack unwinding. This pass supports two models of exception handling +lowering, the "cheap" support and the "expensive" support. + +"Cheap" exception handling support gives the program the ability to execute any +program which does not "throw an exception", by turning "``invoke``" +instructions into calls and by turning "``unwind``" instructions into calls to +``abort()``. If the program does dynamically use the "``unwind``" instruction, +the program will print a message then abort. + +"Expensive" exception handling support gives the full exception handling +support to the program at the cost of making the "``invoke``" instruction +really expensive. It basically inserts ``setjmp``/``longjmp`` calls to emulate +the exception handling as necessary. + +Because the "expensive" support slows down programs a lot, and EH is only used +for a subset of the programs, it must be specifically enabled by the +``-enable-correct-eh-support`` option. + +Note that after this pass runs the CFG is not entirely accurate (exceptional +control flow edges are not correct anymore) so only very simple things should +be done after the ``lowerinvoke`` pass has run (like generation of native +code). This should not be used as a general purpose "my LLVM-to-LLVM pass +doesn't support the ``invoke`` instruction yet" lowering pass. + +``-lowerswitch``: Lower ``SwitchInst``\ s to branches +----------------------------------------------------- + +Rewrites switch instructions with a sequence of branches, which allows targets +to get away with not implementing the switch instruction until it is +convenient. + +.. _passes-mem2reg: + +``-mem2reg``: Promote Memory to Register +---------------------------------------- + +This file promotes memory references to be register references. It promotes +alloca instructions which only have loads and stores as uses. An ``alloca`` is +transformed by using dominator frontiers to place phi nodes, then traversing +the function in depth-first order to rewrite loads and stores as appropriate. +This is just the standard SSA construction algorithm to construct "pruned" SSA +form. + +``-memcpyopt``: MemCpy Optimization +----------------------------------- + +This pass performs various transformations related to eliminating ``memcpy`` +calls, or transforming sets of stores into ``memset``\ s. + +``-mergefunc``: Merge Functions +------------------------------- + +This pass looks for equivalent functions that are mergable and folds them. + +A hash is computed from the function, based on its type and number of basic +blocks. + +Once all hashes are computed, we perform an expensive equality comparison on +each function pair. This takes n^2/2 comparisons per bucket, so it's important +that the hash function be high quality. The equality comparison iterates +through each instruction in each basic block. + +When a match is found the functions are folded. If both functions are +overridable, we move the functionality into a new internal function and leave +two overridable thunks to it. + +``-mergereturn``: Unify function exit nodes +------------------------------------------- + +Ensure that functions have at most one ``ret`` instruction in them. +Additionally, it keeps track of which node is the new exit node of the CFG. 
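+
+As a sketch (block and value names below are illustrative), a function with
+two ``ret`` instructions such as:
+
+.. code-block:: llvm
+
+  define i32 @test(i1 %c) {
+  entry:
+    br i1 %c, label %a, label %b
+
+  a:
+    ret i32 1
+
+  b:
+    ret i32 2
+  }
+
+is rewritten so that both paths branch to a single unified return block:
+
+.. code-block:: llvm
+
+  define i32 @test(i1 %c) {
+  entry:
+    br i1 %c, label %a, label %b
+
+  a:
+    br label %UnifiedReturnBlock
+
+  b:
+    br label %UnifiedReturnBlock
+
+  UnifiedReturnBlock:
+    %retval = phi i32 [ 1, %a ], [ 2, %b ]
+    ret i32 %retval
+  }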
+
+``-partial-inliner``: Partial Inliner
+-------------------------------------
+
+This pass performs partial inlining, typically by inlining an ``if`` statement
+that surrounds the body of the function.
+
+``-prune-eh``: Remove unused exception handling info
+----------------------------------------------------
+
+This file implements a simple interprocedural pass which walks the call-graph,
+turning invoke instructions into call instructions if and only if the callee
+cannot throw an exception. It implements this as a bottom-up traversal of the
+call-graph.
+
+``-reassociate``: Reassociate expressions
+-----------------------------------------
+
+This pass reassociates commutative expressions in an order that is designed to
+promote better constant propagation, GCSE, :ref:`LICM <passes-licm>`, PRE, etc.
+
+For example: 4 + (x + 5) ⇒ x + (4 + 5)
+
+In the implementation of this algorithm, constants are assigned rank = 0,
+function arguments are rank = 1, and other values are assigned ranks
+corresponding to the reverse post order traversal of the current function
+(starting at 2), which effectively gives values in deep loops higher rank than
+values not in loops.
+
+``-reg2mem``: Demote all values to stack slots
+----------------------------------------------
+
+This file demotes all registers to memory references. It is intended to be the
+inverse of :ref:`mem2reg <passes-mem2reg>`. By converting to ``load``
+instructions, the only values live across basic blocks are ``alloca``
+instructions and ``load`` instructions before ``phi`` nodes. It is intended
+that this should make CFG hacking much easier. To make later hacking easier,
+the entry block is split into two, such that all introduced ``alloca``
+instructions (and nothing else) are in the entry block.
+
+``-scalarrepl``: Scalar Replacement of Aggregates (DT)
+------------------------------------------------------
+
+The well-known scalar replacement of aggregates transformation. This transform
+breaks up ``alloca`` instructions of aggregate type (structure or array) into
+individual ``alloca`` instructions for each member if possible. Then, if
+possible, it transforms the individual ``alloca`` instructions into nice clean
+scalar SSA form.
+
+This combines a simple scalar replacement of aggregates algorithm with the
+:ref:`mem2reg <passes-mem2reg>` algorithm because they often interact,
+especially for C++ programs. As such, iterating between ``scalarrepl``, then
+:ref:`mem2reg <passes-mem2reg>` until we run out of things to promote works
+well.
+
+.. _passes-sccp:
+
+``-sccp``: Sparse Conditional Constant Propagation
+--------------------------------------------------
+
+Sparse conditional constant propagation and merging, which can be summarized
+as:
+
+* Assumes values are constant unless proven otherwise
+* Assumes BasicBlocks are dead unless proven otherwise
+* Proves values to be constant, and replaces them with constants
+* Proves conditional branches to be unconditional
+
+Note that this pass has a habit of making definitions be dead. It is a good
+idea to run a :ref:`DCE <passes-dce>` pass sometime after running this pass.
+
+``-simplify-libcalls``: Simplify well-known library calls
+---------------------------------------------------------
+
+Applies a variety of small optimizations for calls to specific well-known
+function calls (e.g. runtime library functions). For example, a call
+``exit(3)`` that occurs within the ``main()`` function can be transformed into
+simply ``return 3``.
+
+.. _passes-simplifycfg:
+
+``-simplifycfg``: Simplify the CFG
+----------------------------------
+
+Performs dead code elimination and basic block merging. Specifically:
+
+* Removes basic blocks with no predecessors.
+* Merges a basic block into its predecessor if there is only one and the
+  predecessor only has one successor.
+* Eliminates PHI nodes for basic blocks with a single predecessor.
+* Eliminates a basic block that only contains an unconditional branch.
+
+``-sink``: Code sinking
+-----------------------
+
+This pass moves instructions into successor blocks, when possible, so that they
+aren't executed on paths where their results aren't needed.
+
+``-strip``: Strip all symbols from a module
+-------------------------------------------
+
+Performs code stripping. This transformation can delete:
+
+* names for virtual registers
+* symbols for internal globals and functions
+* debug information
+
+Note that this transformation makes code much less readable, so it should only
+be used in situations where the strip utility would be used, such as reducing
+code size or making it harder to reverse engineer code.
+
+``-strip-dead-debug-info``: Strip debug info for unused symbols
+---------------------------------------------------------------
+
+.. FIXME: this description is the same as for -strip
+
+Performs code stripping. This transformation can delete:
+
+* names for virtual registers
+* symbols for internal globals and functions
+* debug information
+
+Note that this transformation makes code much less readable, so it should only
+be used in situations where the strip utility would be used, such as reducing
+code size or making it harder to reverse engineer code.
+
+``-strip-dead-prototypes``: Strip Unused Function Prototypes
+------------------------------------------------------------
+
+This pass loops over all of the functions in the input module, looking for dead
+declarations and removes them. Dead declarations are declarations of functions
+for which no implementation is available (i.e., declarations for unused library
+functions).
+
+``-strip-debug-declare``: Strip all ``llvm.dbg.declare`` intrinsics
+-------------------------------------------------------------------
+
+.. FIXME: this description is the same as for -strip
+
+This pass implements code stripping. Specifically, it can delete:
+
+#. names for virtual registers
+#. symbols for internal globals and functions
+#. debug information
+
+Note that this transformation makes code much less readable, so it should only
+be used in situations where the 'strip' utility would be used, such as reducing
+code size or making it harder to reverse engineer code.
+
+``-strip-nondebug``: Strip all symbols, except dbg symbols, from a module
+-------------------------------------------------------------------------
+
+.. FIXME: this description is the same as for -strip
+
+This pass implements code stripping. Specifically, it can delete:
+
+#. names for virtual registers
+#. symbols for internal globals and functions
+#. debug information
+
+Note that this transformation makes code much less readable, so it should only
+be used in situations where the 'strip' utility would be used, such as reducing
+code size or making it harder to reverse engineer code.
+
+``-tailcallelim``: Tail Call Elimination
+----------------------------------------
+
+This file transforms calls of the current function (self recursion) followed by
+a return instruction into a branch to the entry of the function, creating a
+loop. This pass also implements the following extensions to the basic
+algorithm:
+
+#. Trivial instructions between the call and return do not prevent the
+   transformation from taking place, though currently the analysis cannot
+   support moving any really useful instructions (only dead ones).
+#. This pass transforms functions that are prevented from being tail recursive
+   by an associative expression to use an accumulator variable, thus compiling
+   the typical naive factorial or fib implementation into efficient code.
+#. TRE is performed if the function returns void, if the return returns the
+   result returned by the call, or if the function returns a run-time constant
+   on all exits from the function. It is possible, though unlikely, that the
+   return returns something else (like constant 0), and can still be TRE'd. It
+   can be TRE'd if *all other* return instructions in the function return the
+   exact same value.
+#. If it can prove that callees do not access their caller's stack frame, they
+   are marked as eligible for tail call elimination (by the code generator).
+
+Utility Passes
+==============
+
+This section describes the LLVM Utility Passes.
+
+``-deadarghaX0r``: Dead Argument Hacking (BUGPOINT USE ONLY; DO NOT USE)
+------------------------------------------------------------------------
+
+Same as dead argument elimination, but deletes arguments to functions which are
+external. This is only for use by :doc:`bugpoint <Bugpoint>`.
+
+``-extract-blocks``: Extract Basic Blocks From Module (for bugpoint use)
+------------------------------------------------------------------------
+
+This pass is used by bugpoint to extract all blocks from the module into their
+own functions.
+
+``-instnamer``: Assign names to anonymous instructions
+------------------------------------------------------
+
+This is a little utility pass that gives instructions names; this is mostly
+useful when diffing the effect of an optimization because deleting an unnamed
+instruction can change all other instruction numbering, making the diff very
+noisy.
+
+``-preverify``: Preliminary module verification
+-----------------------------------------------
+
+Ensures that the module is in the form required by the :ref:`Module Verifier
+<passes-verify>` pass. Running the verifier runs this pass automatically, so
+there should be no need to use it directly.
+
+.. _passes-verify:
+
+``-verify``: Module Verifier
+----------------------------
+
+Verifies LLVM IR code. This is useful to run after an optimization which is
+undergoing testing. Note that llvm-as verifies its input before emitting
+bitcode, and also that malformed bitcode is likely to make LLVM crash. All
+language front-ends are therefore encouraged to verify their output before
+performing optimizing transformations.
+
+#. Both of a binary operator's parameters are of the same type.
+#. Verify that the indices of mem access instructions match other operands.
+#. Verify that arithmetic and other operations are only performed on
+   first-class types. For example, verify that shifts and logicals only
+   happen on integral types.
+#. All of the constants in a switch statement are of the correct type.
+#. The code is in valid SSA form.
+#. It is illegal to put a label into any other type (like a structure) or to
+   return one.
+#. Only phi nodes can be self-referential: ``%x = add i32 %x``, ``%x`` is
+   invalid.
+#. PHI nodes must have an entry for each predecessor, with no extras.
+#. PHI nodes must be the first thing in a basic block, all grouped together.
+#. 
PHI nodes must have at least one entry. +#. All basic blocks should only end with terminator insts, not contain them. +#. The entry node to a function must not have predecessors. +#. All Instructions must be embedded into a basic block. +#. Functions cannot take a void-typed parameter. +#. Verify that a function's argument list agrees with its declared type. +#. It is illegal to specify a name for a void value. +#. It is illegal to have an internal global value with no initializer. +#. It is illegal to have a ``ret`` instruction that returns a value that does + not agree with the function return value type. +#. Function call argument types match the function prototype. +#. All other things that are tested by asserts spread about the code. + +Note that this does not provide full security verification (like Java), but +instead just tries to ensure that code is well-formed. + +``-view-cfg``: View CFG of function +----------------------------------- + +Displays the control flow graph using the GraphViz tool. + +``-view-cfg-only``: View CFG of function (with no function bodies) +------------------------------------------------------------------ + +Displays the control flow graph using the GraphViz tool, but omitting function +bodies. + +``-view-dom``: View dominance tree of function +---------------------------------------------- + +Displays the dominator tree using the GraphViz tool. + +``-view-dom-only``: View dominance tree of function (with no function bodies) +----------------------------------------------------------------------------- + +Displays the dominator tree using the GraphViz tool, but omitting function +bodies. + +``-view-postdom``: View postdominance tree of function +------------------------------------------------------ + +Displays the post dominator tree using the GraphViz tool. + +``-view-postdom-only``: View postdominance tree of function (with no function bodies) +------------------------------------------------------------------------------------- + +Displays the post dominator tree using the GraphViz tool, but omitting function +bodies. + diff --git a/docs/Phabricator.rst b/docs/Phabricator.rst index b45449793e..efab10cd13 100644 --- a/docs/Phabricator.rst +++ b/docs/Phabricator.rst @@ -88,6 +88,12 @@ diffs between different versions of the patch as it was reviewed in the *Revision Update History*. Most features are self descriptive - explore, and if you have a question, drop by on #llvm in IRC to get help. +Note that as e-mail is the system of reference for code reviews, and some +people prefer it over a web interface, we do not generate automated mail +when a review changes state, for example by clicking "Accept Revision" in +the web interface. Thus, please type LGTM into the comment box to accept +a change from Phabricator. + Status ------ diff --git a/docs/ProgrammersManual.rst b/docs/ProgrammersManual.rst index 2b272de425..4fc4597933 100644 --- a/docs/ProgrammersManual.rst +++ b/docs/ProgrammersManual.rst @@ -6,14 +6,7 @@ LLVM Programmer's Manual :local: .. warning:: - This is a work in progress. - -.. sectionauthor:: Chris Lattner <sabre@nondot.org>, - Dinakar Dhurjati <dhurjati@cs.uiuc.edu>, - Gabor Greif <ggreif@gmail.com>, - Joel Stanley <jstanley@cs.uiuc.edu>, - Reid Spencer <rspencer@x10sys.com> and - Owen Anderson <owen@apple.com> + This is always a work in progress. .. _introduction: @@ -84,8 +77,8 @@ Here are some useful links: (even better, get the book) <http://www.mindview.net/Books/TICPP/ThinkingInCPP2e.html>`_. 
-You are also encouraged to take a look at the :ref:`LLVM Coding Standards -<coding_standards>` guide which focuses on how to write maintainable code more +You are also encouraged to take a look at the :doc:`LLVM Coding Standards +<CodingStandards>` guide which focuses on how to write maintainable code more than where to put your curly braces. .. _resources: @@ -185,8 +178,8 @@ rarely have to include this file directly). These five templates can be used with any classes, whether they have a v-table or not. If you want to add support for these templates, see the document -:ref:`How to set up LLVM-style RTTI for your class hierarchy -<how-to-set-up-llvm-style-rtti>` +:doc:`How to set up LLVM-style RTTI for your class hierarchy +<HowToSetUpLLVMStyleRTTI>` .. _string_apis: @@ -1059,6 +1052,22 @@ SparseSet is useful for algorithms that need very fast clear/find/insert/erase and fast iteration over small sets. It is not intended for building composite data structures. +.. _dss_sparsemultiset: + +llvm/ADT/SparseMultiSet.h +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +SparseMultiSet adds multiset behavior to SparseSet, while retaining SparseSet's +desirable attributes. Like SparseSet, it typically uses a lot of memory, but +provides operations that are almost as fast as a vector. Typical keys are +physical registers, virtual registers, or numbered basic blocks. + +SparseMultiSet is useful for algorithms that need very fast +clear/find/insert/erase of the entire collection, and iteration over sets of +elements sharing a key. It is often a more efficient choice than using composite +data structures (e.g. vector-of-vectors, map-of-vectors). It is not intended for +building composite data structures. + .. _dss_FoldingSet: llvm/ADT/FoldingSet.h @@ -2256,13 +2265,13 @@ accomplished by the following scheme: A bit-encoding in the 2 LSBits (least significant bits) of the ``Use::Prev`` allows to find the start of the ``User`` object: -* ``00`` –> binary digit 0 +* ``00`` --- binary digit 0 -* ``01`` –> binary digit 1 +* ``01`` --- binary digit 1 -* ``10`` –> stop and calculate (``s``) +* ``10`` --- stop and calculate (``s``) -* ``11`` –> full stop (``S``) +* ``11`` --- full stop (``S``) Given a ``Use*``, all we have to do is to walk till we get a stop and we either have a ``User`` immediately behind or we have to walk to the next stop picking diff --git a/docs/Projects.rst b/docs/Projects.rst index c5d03d33a0..3246e3ff16 100644 --- a/docs/Projects.rst +++ b/docs/Projects.rst @@ -1,5 +1,3 @@ -.. _projects: - ======================== Creating an LLVM Project ======================== @@ -153,12 +151,10 @@ Underneath your top level directory, you should have the following directories: Currently, the LLVM build system provides basic support for tests. The LLVM system provides the following: -* LLVM provides a ``tcl`` procedure that is used by ``Dejagnu`` to run tests. - It can be found in ``llvm/lib/llvm-dg.exp``. This test procedure uses ``RUN`` +* LLVM contains regression tests in ``llvm/test``. These tests are run by the + :doc:`Lit <CommandGuide/lit>` testing tool. This test procedure uses ``RUN`` lines in the actual test case to determine how to run the test. See the - :doc:`TestingGuide` for more details. You can easily write Makefile - support similar to the Makefiles in ``llvm/test`` to use ``Dejagnu`` to - run your project's tests. + :doc:`TestingGuide` for more details. 
* LLVM contains an optional package called ``llvm-test``, which provides benchmarks and programs that are known to compile with the Clang front diff --git a/docs/README.txt b/docs/README.txt index 5ddd599d8a..22cf930779 100644 --- a/docs/README.txt +++ b/docs/README.txt @@ -1,12 +1,42 @@ LLVM Documentation ================== -The LLVM documentation is currently written in two formats: +LLVM's documentation is written in reStructuredText, a lightweight +plaintext markup language (file extension `.rst`). While the +reStructuredText documentation should be quite readable in source form, it +is mostly meant to be processed by the Sphinx documentation generation +system to create HTML pages which are hosted on <http://llvm.org/docs/> and +updated after every commit. Manpage output is also supported, see below. - * Plain HTML documentation. +If you instead would like to generate and view the HTML locally, install +Sphinx <http://sphinx-doc.org/> and then do: - * reStructured Text documentation using the Sphinx documentation generator. It - is currently tested with Sphinx 1.1.3. + cd docs/ + make -f Makefile.sphinx + $BROWSER _build/html/index.html - For more information, see the "Sphinx Introduction for LLVM Developers" - document. +The mapping between reStructuredText files and generated documentation is +`docs/Foo.rst` <-> `_build/html/Foo.html` <-> `http://llvm.org/docs/Foo.html`. + +If you are interested in writing new documentation, you will want to read +`SphinxQuickstartTemplate.rst` which will get you writing documentation +very fast and includes examples of the most important reStructuredText +markup syntax. + +Manpage Output +=============== + +Building the manpages is similar to building the HTML documentation. The +primary difference is to use the `man` makefile target, instead of the +default (which is `html`). Sphinx then produces the man pages in the +directory `_build/man/`. + + cd docs/ + make -f Makefile.sphinx man + man -l _build/man/FileCheck.1 + +The correspondence between .rst files and man pages is +`docs/CommandGuide/Foo.rst` <-> `_build/man/Foo.1`. +These .rst files are also included during HTML generation so they are also +viewable online (as noted above) at e.g. +`http://llvm.org/docs/CommandGuide/Foo.html`. diff --git a/docs/ReleaseNotes.rst b/docs/ReleaseNotes.rst index a5922ad983..9383c5b3fa 100644 --- a/docs/ReleaseNotes.rst +++ b/docs/ReleaseNotes.rst @@ -1,27 +1,21 @@ -.. raw:: html - - <style> .red {color:red} </style> - -.. role:: red - ====================== -LLVM 3.2 Release Notes +LLVM 3.3 Release Notes ====================== .. contents:: :local: -Written by the `LLVM Team <http://llvm.org/>`_ +.. warning:: + These are in-progress notes for the upcoming LLVM 3.3 release. You may + prefer the `LLVM 3.2 Release Notes <http://llvm.org/releases/3.2/docs + /ReleaseNotes.html>`_. -:red:`These are in-progress notes for the upcoming LLVM 3.2 release. You may -prefer the` `LLVM 3.1 Release Notes <http://llvm.org/releases/3.1/docs -/ReleaseNotes.html>`_. Introduction ============ This document contains the release notes for the LLVM Compiler Infrastructure, -release 3.2. Here we describe the status of LLVM, including major improvements +release 3.3. Here we describe the status of LLVM, including major improvements from the previous release, improvements in various subprojects of LLVM, and some of the current users of the code. All LLVM releases may be downloaded from the `LLVM releases web site <http://llvm.org/releases/>`_. 
@@ -37,517 +31,90 @@ LLVM web page, this document applies to the *next* release, not the current one. To see the release notes for a specific release, please see the `releases page <http://llvm.org/releases/>`_. -Sub-project Status Update -========================= - -The LLVM 3.2 distribution currently consists of code from the core LLVM -repository, which roughly includes the LLVM optimizers, code generators and -supporting tools, and the Clang repository. In addition to this code, the LLVM -Project includes other sub-projects that are in development. Here we include -updates on these subprojects. - -Clang: C/C++/Objective-C Frontend Toolkit ------------------------------------------ - -`Clang <http://clang.llvm.org/>`_ is an LLVM front end for the C, C++, and -Objective-C languages. Clang aims to provide a better user experience through -expressive diagnostics, a high level of conformance to language standards, fast -compilation, and low memory use. Like LLVM, Clang provides a modular, -library-based architecture that makes it suitable for creating or integrating -with other development tools. Clang is considered a production-quality -compiler for C, Objective-C, C++ and Objective-C++ on x86 (32- and 64-bit), and -for Darwin/ARM targets. - -In the LLVM 3.2 time-frame, the Clang team has made many improvements. -Highlights include: - -#. ... - -For more details about the changes to Clang since the 3.1 release, see the -`Clang release notes. <http://clang.llvm.org/docs/ReleaseNotes.html>`_ - -If Clang rejects your code but another compiler accepts it, please take a look -at the `language compatibility <http://clang.llvm.org/compatibility.html>`_ -guide to make sure this is not intentional or a known issue. - -DragonEgg: GCC front-ends, LLVM back-end ----------------------------------------- - -`DragonEgg <http://dragonegg.llvm.org/>`_ is a `gcc plugin -<http://gcc.gnu.org/wiki/plugins>`_ that replaces GCC's optimizers and code -generators with LLVM's. It works with gcc-4.5 and gcc-4.6 (and partially with -gcc-4.7), can target the x86-32/x86-64 and ARM processor families, and has been -successfully used on the Darwin, FreeBSD, KFreeBSD, Linux and OpenBSD -platforms. It fully supports Ada, C, C++ and Fortran. It has partial support -for Go, Java, Obj-C and Obj-C++. - -The 3.2 release has the following notable changes: - -#. ... - -compiler-rt: Compiler Runtime Library -------------------------------------- - -The new LLVM `compiler-rt project <http://compiler-rt.llvm.org/>`_ is a simple -library that provides an implementation of the low-level target-specific hooks -required by code generation and other runtime components. For example, when -compiling for a 32-bit target, converting a double to a 64-bit unsigned integer -is compiled into a runtime call to the ``__fixunsdfdi`` function. The -``compiler-rt`` library provides highly optimized implementations of this and -other low-level routines (some are 3x faster than the equivalent libgcc -routines). - -The 3.2 release has the following notable changes: - -#. ... - -LLDB: Low Level Debugger ------------------------- - -`LLDB <http://lldb.llvm.org>`_ is a ground-up implementation of a command line -debugger, as well as a debugger API that can be used from other applications. -LLDB makes use of the Clang parser to provide high-fidelity expression parsing -(particularly for C++) and uses the LLVM JIT for target support. - -The 3.2 release has the following notable changes: - -#. ... 
- -libc++: C++ Standard Library ----------------------------- - -Like compiler_rt, libc++ is now :ref:`dual licensed -<copyright-license-patents>` under the MIT and UIUC license, allowing it to be -used more permissively. - -Within the LLVM 3.2 time-frame there were the following highlights: - -#. ... - -VMKit ------ - -The `VMKit project <http://vmkit.llvm.org/>`_ is an implementation of a Java -Virtual Machine (Java VM or JVM) that uses LLVM for static and just-in-time -compilation. +Non-comprehensive list of changes in this release +================================================= -The 3.2 release has the following notable changes: +.. NOTE + For small 1-3 sentence descriptions, just add an entry at the end of + this list. If your description won't fit comfortably in one bullet + point (e.g. maybe you would like to give an example of the + functionality, or simply have a lot to talk about), see the `NOTE` below + for adding a new subsection. -#. ... +* The CellSPU port has been removed. It can still be found in older versions. -Polly: Polyhedral Optimizer ---------------------------- +* The IR-level extended linker APIs (for example, to link bitcode files out of + archives) have been removed. Any existing clients of these features should + move to using a linker with integrated LTO support. -`Polly <http://polly.llvm.org/>`_ is an *experimental* optimizer for data -locality and parallelism. It provides high-level loop optimizations and -automatic parallelisation. +* LLVM and Clang's documentation has been migrated to the `Sphinx + <http://sphinx-doc.org/>`_ documentation generation system which uses + easy-to-write reStructuredText. See `llvm/docs/README.txt` for more + information. -Within the LLVM 3.2 time-frame there were the following highlights: +* TargetTransformInfo (TTI) is a new interface that can be used by IR-level + passes to obtain target-specific information, such as the costs of + instructions. Only "Lowering" passes such as LSR and the vectorizer are + allowed to use the TTI infrastructure. -#. isl, the integer set library used by Polly, was relicensed to the MIT license -#. isl based code generation -#. MIT licensed replacement for CLooG (LGPLv2) -#. Fine grained option handling (separation of core and border computations, - control overhead vs. code size) -#. Support for FORTRAN and dragonegg -#. OpenMP code generation fixes +* We've improved the X86 and ARM cost model. -External Open Source Projects Using LLVM 3.2 -============================================ +* The Attributes classes have been completely rewritten and expanded. They now + support not only enumerated attributes and alignments, but "string" + attributes, which are useful for passing information to code generation. See + :doc:`HowToUseAttributes` for more details. -An exciting aspect of LLVM is that it is used as an enabling technology for a -lot of other language and tools projects. This section lists some of the -projects that have already been updated to work with LLVM 3.2. +* ... next change ... -Crack ------ +.. NOTE + If you would like to document a larger change, then you can add a + subsection about it right here. You can copy the following boilerplate + and un-indent it (the indentation causes it to be inside this comment). -`Crack <http://code.google.com/p/crack-language/>`_ aims to provide the ease of -development of a scripting language with the performance of a compiled -language. 
The language derives concepts from C++, Java and Python, -incorporating object-oriented programming, operator overloading and strong -typing. + Special New Feature + ------------------- -FAUST ------ + Makes programs 10x faster by doing Special New Thing. -`FAUST <http://faust.grame.fr/>`_ is a compiled language for real-time audio -signal processing. The name FAUST stands for Functional AUdio STream. Its -programming model combines two approaches: functional programming and block -diagram composition. In addition with the C, C++, Java, JavaScript output -formats, the Faust compiler can generate LLVM bitcode, and works with LLVM -2.7-3.1. +AArch64 target +-------------- -Glasgow Haskell Compiler (GHC) ------------------------------- +We've added support for AArch64, ARM's 64-bit architecture. Development is still +in fairly early stages, but we expect successful compilation when: -`GHC <http://www.haskell.org/ghc/>`_ is an open source compiler and programming -suite for Haskell, a lazy functional programming language. It includes an -optimizing static compiler generating good code for a variety of platforms, -together with an interactive system for convenient, quick development. +- compiling standard compliant C99 and C++03 with Clang; +- using Linux as a target platform; +- where code + static data doesn't exceed 4GB in size (heap allocated data has + no limitation). -GHC 7.0 and onwards include an LLVM code generator, supporting LLVM 2.8 and -later. +Some additional functionality is also implemented, notably DWARF debugging, +GNU-style thread local storage and inline assembly. -Julia ------ - -`Julia <https://github.com/JuliaLang/julia>`_ is a high-level, high-performance -dynamic language for technical computing. It provides a sophisticated -compiler, distributed parallel execution, numerical accuracy, and an extensive -mathematical function library. The compiler uses type inference to generate -fast code without any type declarations, and uses LLVM's optimization passes -and JIT compiler. The `Julia Language <http://julialang.org/>`_ is designed -around multiple dispatch, giving programs a large degree of flexibility. It is -ready for use on many kinds of problems. - -LLVM D Compiler +Loop Vectorizer --------------- -`LLVM D Compiler <https://github.com/ldc-developers/ldc>`_ (LDC) is a compiler -for the D programming Language. It is based on the DMD frontend and uses LLVM -as backend. - -Open Shading Language ---------------------- - -`Open Shading Language (OSL) -<https://github.com/imageworks/OpenShadingLanguage/>`_ is a small but rich -language for programmable shading in advanced global illumination renderers and -other applications, ideal for describing materials, lights, displacement, and -pattern generation. It uses LLVM to JIT complex shader networks to x86 code at -runtime. - -OSL was developed by Sony Pictures Imageworks for use in its in-house renderer -used for feature film animation and visual effects, and is distributed as open -source software with the "New BSD" license. - -Portable OpenCL (pocl) ----------------------- - -In addition to producing an easily portable open source OpenCL implementation, -another major goal of `pocl <http://pocl.sourceforge.net/>`_ is improving -performance portability of OpenCL programs with compiler optimizations, -reducing the need for target-dependent manual optimizations. 
An important part -of pocl is a set of LLVM passes used to statically parallelize multiple -work-items with the kernel compiler, even in the presence of work-group -barriers. This enables static parallelization of the fine-grained static -concurrency in the work groups in multiple ways (SIMD, VLIW, superscalar, ...). - -Pure ----- - -`Pure <http://pure-lang.googlecode.com/>`_ is an algebraic/functional -programming language based on term rewriting. Programs are collections of -equations which are used to evaluate expressions in a symbolic fashion. The -interpreter uses LLVM as a backend to JIT-compile Pure programs to fast native -code. Pure offers dynamic typing, eager and lazy evaluation, lexical closures, -a hygienic macro system (also based on term rewriting), built-in list and -matrix support (including list and matrix comprehensions) and an easy-to-use -interface to C and other programming languages (including the ability to load -LLVM bitcode modules, and inline C, C++, Fortran and Faust code in Pure -programs if the corresponding LLVM-enabled compilers are installed). - -Pure version 0.54 has been tested and is known to work with LLVM 3.1 (and -continues to work with older LLVM releases >= 2.5). - -TTA-based Co-design Environment (TCE) -------------------------------------- - -`TCE <http://tce.cs.tut.fi/>`_ is a toolset for designing application-specific -processors (ASP) based on the Transport triggered architecture (TTA). The -toolset provides a complete co-design flow from C/C++ programs down to -synthesizable VHDL/Verilog and parallel program binaries. Processor -customization points include the register files, function units, supported -operations, and the interconnection network. - -TCE uses Clang and LLVM for C/C++ language support, target independent -optimizations and also for parts of code generation. It generates new -LLVM-based code generators "on the fly" for the designed TTA processors and -loads them in to the compiler backend as runtime libraries to avoid per-target -recompilation of larger parts of the compiler chain. - -Installation Instructions -========================= - -See :doc:`GettingStarted`. - -What's New in LLVM 3.2? -======================= - -This release includes a huge number of bug fixes, performance tweaks and minor -improvements. Some of the major improvements and new features are listed in -this section. - -Major New Features ------------------- - -.. - - Features that need text if they're finished for 3.2: - ARM EHABI - combiner-aa? - strong phi elim - loop dependence analysis - CorrelatedValuePropagation - lib/Transforms/IPO/MergeFunctions.cpp => consider for 3.2. - Integrated assembler on by default for arm/thumb? - - Near dead: - Analysis/RegionInfo.h + Dom Frontiers - SparseBitVector: used in LiveVar. - llvm/lib/Archive - replace with lib object? - - -LLVM 3.2 includes several major changes and big features: - -#. New NVPTX back-end (replacing existing PTX back-end) based on NVIDIA sources -#. ... - -LLVM IR and Core Improvements ------------------------------ - -LLVM IR has several new features for better support of new targets and that -expose new optimization opportunities: - -#. Thread local variables may have a specified TLS model. See the :ref:`Language - Reference Manual <globalvars>`. -#. ... 
- -Optimizer Improvements ----------------------- - -In addition to many minor performance tweaks and bug fixes, this release -includes a few major enhancements and additions to the optimizers: - -Loop Vectorizer - We've added a loop vectorizer and we are now able to -vectorize small loops. The loop vectorizer is disabled by default and can be -enabled using the ``-mllvm -vectorize-loops`` flag. The SIMD vector width can -be specified using the flag ``-mllvm -force-vector-width=4``. The default -value is ``0`` which means auto-select. - -We can now vectorize this function: - -.. code-block:: c++ - - unsigned sum_arrays(int *A, int *B, int start, int end) { - unsigned sum = 0; - for (int i = start; i < end; ++i) - sum += A[i] + B[i] + i; - return sum; - } - -We vectorize under the following loops: - -#. The inner most loops must have a single basic block. -#. The number of iterations are known before the loop starts to execute. -#. The loop counter needs to be incremented by one. -#. The loop trip count **can** be a variable. -#. Loops do **not** need to start at zero. -#. The induction variable can be used inside the loop. -#. Loop reductions are supported. -#. Arrays with affine access pattern do **not** need to be marked as - '``noalias``' and are checked at runtime. -#. ... - -SROA - We've re-written SROA to be significantly more powerful. - -#. Branch weight metadata is preseved through more of the optimizer. -#. ... - -MC Level Improvements ---------------------- - -The LLVM Machine Code (aka MC) subsystem was created to solve a number of -problems in the realm of assembly, disassembly, object file format handling, -and a number of other related areas that CPU instruction-set level tools work -in. For more information, please see the `Intro to the LLVM MC Project Blog -Post <http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html>`_. - -#. ... - -.. _codegen: - -Target Independent Code Generator Improvements ----------------------------------------------- - -Stack Coloring - We have implemented a new optimization pass to merge stack -objects which are used in disjoin areas of the code. This optimization reduces -the required stack space significantly, in cases where it is clear to the -optimizer that the stack slot is not shared. We use the lifetime markers to -tell the codegen that a certain alloca is used within a region. - -We now merge consecutive loads and stores. - -We have put a significant amount of work into the code generator -infrastructure, which allows us to implement more aggressive algorithms and -make it run faster: - -#. ... - -We added new TableGen infrastructure to support bundling for Very Long -Instruction Word (VLIW) architectures. TableGen can now automatically generate -a deterministic finite automaton from a VLIW target's schedule description -which can be queried to determine legal groupings of instructions in a bundle. - -We have added a new target independent VLIW packetizer based on the DFA -infrastructure to group machine instructions into bundles. - -Basic Block Placement -^^^^^^^^^^^^^^^^^^^^^ - -A probability based block placement and code layout algorithm was added to -LLVM's code generator. This layout pass supports probabilities derived from -static heuristics as well as source code annotations such as -``__builtin_expect``. - -X86-32 and X86-64 Target Improvements -------------------------------------- - -New features and major changes in the X86 target include: - -#. ... - -.. 
_ARM: - -ARM Target Improvements ------------------------ - -New features of the ARM target include: - -#. ... - -.. _armintegratedassembler: - -ARM Integrated Assembler -^^^^^^^^^^^^^^^^^^^^^^^^ - -The ARM target now includes a full featured macro assembler, including -direct-to-object module support for clang. The assembler is currently enabled -by default for Darwin only pending testing and any additional necessary -platform specific support for Linux. - -Full support is included for Thumb1, Thumb2 and ARM modes, along with subtarget -and CPU specific extensions for VFP2, VFP3 and NEON. - -The assembler is Unified Syntax only (see ARM Architecural Reference Manual for -details). While there is some, and growing, support for pre-unfied (divided) -syntax, there are still significant gaps in that support. - -MIPS Target Improvements ------------------------- - -New features and major changes in the MIPS target include: - -#. ... - -PowerPC Target Improvements ---------------------------- - -Many fixes and changes across LLVM (and Clang) for better compliance with the -64-bit PowerPC ELF Application Binary Interface, interoperability with GCC, and -overall 64-bit PowerPC support. Some highlights include: - -#. MCJIT support added. -#. PPC64 relocation support and (small code model) TOC handling added. -#. Parameter passing and return value fixes (alignment issues, padding, varargs - support, proper register usage, odd-sized structure support, float support, - extension of return values for i32 return values). -#. Fixes in spill and reload code for vector registers. -#. C++ exception handling enabled. -#. Changes to remediate double-rounding compatibility issues with respect to - GCC behavior. -#. Refactoring to disentangle ``ppc64-elf-linux`` ABI from Darwin ppc64 ABI - support. -#. Assorted new test cases and test case fixes (endian and word size issues). -#. Fixes for big-endian codegen bugs, instruction encodings, and instruction - constraints. -#. Implemented ``-integrated-as`` support. -#. Additional support for Altivec compare operations. -#. IBM long double support. - -There have also been code generation improvements for both 32- and 64-bit code. -Instruction scheduling support for the Freescale e500mc and e5500 cores has -been added. - -PTX/NVPTX Target Improvements ------------------------------ - -The PTX back-end has been replaced by the NVPTX back-end, which is based on the -LLVM back-end used by NVIDIA in their CUDA (nvcc) and OpenCL compiler. Some -highlights include: - -#. Compatibility with PTX 3.1 and SM 3.5. -#. Support for NVVM intrinsics as defined in the NVIDIA Compiler SDK. -#. Full compatibility with old PTX back-end, with much greater coverage of LLVM - SIR. - -Please submit any back-end bugs to the LLVM Bugzilla site. - -Other Target Specific Improvements ----------------------------------- - -#. ... - -Major Changes and Removed Features ----------------------------------- - -If you're already an LLVM user or developer with out-of-tree changes based on -LLVM 3.2, this section lists some "gotchas" that you may run into upgrading -from the previous release. - -#. The CellSPU port has been removed. It can still be found in older versions. -#. ... - -Internal API Changes --------------------- - -In addition, many APIs have changed in this release. Some of the major LLVM -API changes are: - -We've added a new interface for allowing IR-level passes to access -target-specific information. 
A new IR-level pass, called -``TargetTransformInfo`` provides a number of low-level interfaces. LSR and -LowerInvoke already use the new interface. - -The ``TargetData`` structure has been renamed to ``DataLayout`` and moved to -``VMCore`` to remove a dependency on ``Target``. - -#. ... - -Tools Changes -------------- - -In addition, some tools have changed in this release. Some of the changes are: - -#. ... - -Python Bindings ---------------- - -Officially supported Python bindings have been added! Feature support is far -from complete. The current bindings support interfaces to: - -#. ... +We've continued the work on the loop vectorizer. The loop vectorizer now +has the following features: -Known Problems -============== +- Loops with unknown trip count. +- Runtime checks of pointers +- Reductions, Inductions +- If Conversion +- Pointer induction variables +- Reverse iterators +- Vectorization of mixed types +- Vectorization of function calls +- Partial unrolling during vectorization -LLVM is generally a production quality compiler, and is used by a broad range -of applications and shipping in many products. That said, not every subsystem -is as mature as the aggregate, particularly the more obscure1 targets. If you -run into a problem, please check the `LLVM bug database -<http://llvm.org/bugs/>`_ and submit a bug if there isn't already one or ask on -the `LLVMdev list <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>`_. +R600 Backend +------------ -Known problem areas include: +The R600 backend was added in this release, it supports AMD GPUs +(HD2XXX - HD7XXX). This backend is used in AMD's Open Source +graphics / compute drivers which are developed as part of the `Mesa3D +<http://www.mesa3d.org>`_ project. -#. The CellSPU, MSP430, and XCore backends are experimental. -#. The integrated assembler, disassembler, and JIT is not supported by several - targets. If an integrated assembler is not supported, then a system - assembler is required. For more details, see the - :ref:`target-feature-matrix`. Additional Information ====================== diff --git a/docs/SegmentedStacks.rst b/docs/SegmentedStacks.rst index f97d62abda..e44ce42313 100644 --- a/docs/SegmentedStacks.rst +++ b/docs/SegmentedStacks.rst @@ -1,5 +1,3 @@ -.. _segmented_stacks: - ======================== Segmented Stacks in LLVM ======================== diff --git a/docs/SourceLevelDebugging.rst b/docs/SourceLevelDebugging.rst index d7c50d234a..78ce4e0e53 100644 --- a/docs/SourceLevelDebugging.rst +++ b/docs/SourceLevelDebugging.rst @@ -2,8 +2,6 @@ Source Level Debugging with LLVM ================================ -.. sectionauthor:: Chris Lattner <sabre@nondot.org> and Jim Laskey <jlaskey@mac.com> - .. contents:: :local: @@ -300,7 +298,6 @@ Subprogram descriptors metadata, ;; Reference to type descriptor i1, ;; True if the global is local to compile unit (static) i1, ;; True if the global is defined in the compile unit (not extern) - i32, ;; Line number where the scope of the subprogram begins i32, ;; Virtuality, e.g. 
dwarf::DW_VIRTUALITY__virtual i32, ;; Index into a virtual function metadata, ;; indicates which base type contains the vtable pointer for the @@ -310,7 +307,8 @@ Subprogram descriptors Function * , ;; Pointer to LLVM function metadata, ;; Lists function template parameters metadata, ;; Function declaration descriptor - metadata ;; List of function variables + metadata, ;; List of function variables + i32 ;; Line number where the scope of the subprogram begins } These descriptors provide debug information about functions, methods and @@ -408,7 +406,8 @@ Derived type descriptors i32, ;; Flags to encode attributes, e.g. private metadata, ;; Reference to type derived from metadata, ;; (optional) Name of the Objective C property associated with - ;; Objective-C an ivar + ;; Objective-C an ivar, or the type of which this + ;; pointer-to-member is pointing to members of. metadata, ;; (optional) Name of the Objective C property getter selector. metadata, ;; (optional) Name of the Objective C property setter selector. i32 ;; (optional) Objective C property attributes. @@ -420,14 +419,15 @@ values: .. code-block:: llvm - DW_TAG_formal_parameter = 5 - DW_TAG_member = 13 - DW_TAG_pointer_type = 15 - DW_TAG_reference_type = 16 - DW_TAG_typedef = 22 - DW_TAG_const_type = 38 - DW_TAG_volatile_type = 53 - DW_TAG_restrict_type = 55 + DW_TAG_formal_parameter = 5 + DW_TAG_member = 13 + DW_TAG_pointer_type = 15 + DW_TAG_reference_type = 16 + DW_TAG_typedef = 22 + DW_TAG_ptr_to_member_type = 31 + DW_TAG_const_type = 38 + DW_TAG_volatile_type = 53 + DW_TAG_restrict_type = 55 ``DW_TAG_member`` is used to define a member of a :ref:`composite type <format_composite_type>` or :ref:`subprogram <format_subprograms>`. The type @@ -482,14 +482,13 @@ are possible tag values: DW_TAG_enumeration_type = 4 DW_TAG_structure_type = 19 DW_TAG_union_type = 23 - DW_TAG_vector_type = 259 DW_TAG_subroutine_type = 21 DW_TAG_inheritance = 28 The vector flag indicates that an array type is a native packed vector. -The members of array types (tag = ``DW_TAG_array_type``) or vector types (tag = -``DW_TAG_vector_type``) are :ref:`subrange descriptors <format_subrange>`, each +The members of array types (tag = ``DW_TAG_array_type``) are +:ref:`subrange descriptors <format_subrange>`, each representing the range of subscripts at that level of indexing. The members of enumeration types (tag = ``DW_TAG_enumeration_type``) are @@ -583,12 +582,10 @@ value of the tag depends on the usage of the variable: DW_TAG_auto_variable = 256 DW_TAG_arg_variable = 257 - DW_TAG_return_variable = 258 An auto variable is any variable declared in the body of the function. An argument variable is any variable that appears as a formal argument to the -function. A return variable is used to track the result of a function and has -no source correspondent. +function. The context is either the subprogram or block where the variable is defined. Name the source variable name. Context and line indicate where the variable diff --git a/docs/SphinxQuickstartTemplate.rst b/docs/SphinxQuickstartTemplate.rst index 640df63db1..fe6e44a27c 100644 --- a/docs/SphinxQuickstartTemplate.rst +++ b/docs/SphinxQuickstartTemplate.rst @@ -2,8 +2,6 @@ Sphinx Quickstart Template ========================== -.. 
sectionauthor:: Sean Silva <silvas@purdue.edu> - Introduction and Quickstart =========================== @@ -24,7 +22,8 @@ reStructuredText syntax is useful when writing the document, so the last ~half of this document (starting with `Example Section`_) gives examples which should cover 99% of use cases. -Let me say that again: focus on *content*. +Let me say that again: focus on *content*. But if you really need to verify +Sphinx's output, see ``docs/README.txt`` for information. Once you have finished with the content, please send the ``.rst`` file to llvm-commits for review. @@ -65,7 +64,7 @@ Your text can be *emphasized*, **bold**, or ``monospace``. Use blank lines to separate paragraphs. -Headings (like ``Example Section`` just above) give your document +Headings (like ``Example Section`` just above) give your document its structure. Use the same kind of adornments (e.g. ``======`` vs. ``------``) as are used in this document. The adornment must be the same length as the text above it. For Vim users, variations of ``yypVr=`` might be handy. @@ -86,7 +85,7 @@ Lists can be made like this: #. This is a second list element. - #. They nest too. + #. Use indentation to create nested lists. You can also use unordered lists. @@ -104,19 +103,35 @@ You can make blocks of code like this: .. code-block:: c++ int main() { - return 0 + return 0; } -For a shell session, use a ``bash`` code block: +For a shell session, use a ``console`` code block (some existing docs use +``bash``): -.. code-block:: bash +.. code-block:: console $ echo "Goodbye cruel world!" $ rm -rf / If you need to show LLVM IR use the ``llvm`` code block. -You can show preformatted text without any syntax highlighting like this: +.. code-block:: llvm + + define i32 @test1() { + entry: + ret i32 0 + } + +Some other common code blocks you might need are ``c``, ``objc``, ``make``, +and ``cmake``. If you need something beyond that, you can look at the `full +list`_ of supported code blocks. + +.. _`full list`: http://pygments.org/docs/lexers/ + +However, don't waste time fiddling with syntax highlighting when you could +be adding meaningful content. When in doubt, show preformatted text +without any syntax highlighting like this: :: diff --git a/docs/SystemLibrary.rst b/docs/SystemLibrary.rst index 88404f4d81..0d0f4fa994 100644 --- a/docs/SystemLibrary.rst +++ b/docs/SystemLibrary.rst @@ -2,8 +2,6 @@ System Library ============== -.. sectionauthor:: Reid Spencer <rspencer@x10sys.com> - Abstract ======== diff --git a/docs/TableGen/LangRef.rst b/docs/TableGen/LangRef.rst new file mode 100644 index 0000000000..c9e1efba03 --- /dev/null +++ b/docs/TableGen/LangRef.rst @@ -0,0 +1,383 @@ +=========================== +TableGen Language Reference +=========================== + +.. sectionauthor:: Sean Silva <silvas@purdue.edu> + +.. contents:: + :local: + +.. warning:: + This document is extremely rough. If you find something lacking, please + fix it, file a documentation bug, or ask about it on llvmdev. + +Introduction +============ + +This document is meant to be a normative spec about the TableGen language +in and of itself (i.e. how to understand a given construct in terms of how +it affects the final set of records represented by the TableGen file). If +you are unsure if this document is really what you are looking for, please +read :doc:`/TableGenFundamentals` first. + +Notation +======== + +The lexical and syntax notation used here is intended to imitate +`Python's`_. 
In particular, for lexical definitions, the productions +operate at the character level and there is no implied whitespace between +elements. The syntax definitions operate at the token level, so there is +implied whitespace between tokens. + +.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation + +Lexical Analysis +================ + +TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``) +comments. + +The following is a listing of the basic punctuation tokens:: + + - + [ ] { } ( ) < > : ; . = ? # + +Numeric literals take one of the following forms: + +.. TableGen actually will lex some pretty strange sequences an interpret + them as numbers. What is shown here is an attempt to approximate what it + "should" accept. + +.. productionlist:: + TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger` + DecimalInteger: ["+" | "-"] ("0"..."9")+ + HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+ + BinInteger: "0b" ("0" | "1")+ + +One aspect to note is that the :token:`DecimalInteger` token *includes* the +``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as +most languages do. + +TableGen has identifier-like tokens: + +.. productionlist:: + ualpha: "a"..."z" | "A"..."Z" | "_" + TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")* + TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")* + +Note that unlike most languages, TableGen allows :token:`TokIdentifier` to +begin with a number. In case of ambiguity, a token will be interpreted as a +numeric literal rather than an identifier. + +TableGen also has two string-like literals: + +.. productionlist:: + TokString: '"' <non-'"' characters and C-like escapes> '"' + TokCodeFragment: "[{" <shortest text not containing "}]"> "}]" + +.. note:: + The current implementation accepts the following C-like escapes:: + + \\ \' \" \t \n + +TableGen also has the following keywords:: + + bit bits class code dag + def foreach defm field in + int let list multiclass string + +TableGen also has "bang operators" which have a +wide variety of meanings: + +.. productionlist:: + BangOperator: one of + :!eq !if !head !tail !con + :!add !shl !sra !srl + :!cast !empty !subst !foreach !strconcat + +Syntax +====== + +TableGen has an ``include`` mechanism. It does not play a role in the +syntax per se, since it is lexically replaced with the contents of the +included file. + +.. productionlist:: + IncludeDirective: "include" `TokString` + +TableGen's top-level production consists of "objects". + +.. productionlist:: + TableGenFile: `Object`* + Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach` + +``class``\es +------------ + +.. productionlist:: + Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody` + +A ``class`` declaration creates a record which other records can inherit +from. A class can be parametrized by a list of "template arguments", whose +values can be used in the class body. + +A given class can only be defined once. A ``class`` declaration is +considered to define the class if any of the following is true: + +.. break ObjectBody into its consituents so that they are present here? + +#. The :token:`TemplateArgList` is present. +#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty. +#. The :token:`BaseClassList` in the :token:`ObjectBody` is present. + +You can declare an empty class by giving and empty :token:`TemplateArgList` +and an empty :token:`ObjectBody`. 
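For instance, a hypothetical sketch (the class name and field are invented here
for illustration, not taken from any real ``.td`` file) of an empty declaration
followed by the single permitted definition::

    class Foo;

    class Foo {
      int X = 4;
    }
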
This can serve as a restricted form of +forward declaration: note that records deriving from the forward-declared +class will inherit no fields from it since the record expansion is done +when the record is parsed. + +.. productionlist:: + TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">" + +Declarations +------------ + +.. Omitting mention of arcane "field" prefix to discourage its use. + +The declaration syntax is pretty much what you would expect as a C++ +programmer. + +.. productionlist:: + Declaration: `Type` `TokIdentifier` ["=" `Value`] + +It assigns the value to the identifer. + +Types +----- + +.. productionlist:: + Type: "string" | "code" | "bit" | "int" | "dag" + :| "bits" "<" `TokInteger` ">" + :| "list" "<" `Type` ">" + :| `ClassID` + ClassID: `TokIdentifier` + +Both ``string`` and ``code`` correspond to the string type; the difference +is purely to indicate programmer intention. + +The :token:`ClassID` must identify a class that has been previously +declared or defined. + +Values +------ + +.. productionlist:: + Value: `SimpleValue` `ValueSuffix`* + ValueSuffix: "{" `RangeList` "}" + :| "[" `RangeList` "]" + :| "." `TokIdentifier` + RangeList: `RangePiece` ("," `RangePiece`)* + RangePiece: `TokInteger` + :| `TokInteger` "-" `TokInteger` + :| `TokInteger` `TokInteger` + +The peculiar last form of :token:`RangePiece` is due to the fact that the +"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as +two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``, +instead of "1", "-", and "5". +The :token:`RangeList` can be thought of as specifying "list slice" in some +contexts. + + +:token:`SimpleValue` has a number of forms: + + +.. productionlist:: + SimpleValue: `TokIdentifier` + +The value will be the variable referenced by the identifier. It can be one +of: + +.. The code for this is exceptionally abstruse. These examples are a + best-effort attempt. + +* name of a ``def``, such as the use of ``Bar`` in:: + + def Bar : SomeClass { + int X = 5; + } + + def Foo { + SomeClass Baz = Bar; + } + +* value local to a ``def``, such as the use of ``Bar`` in:: + + def Foo { + int Bar = 5; + int Baz = Bar; + } + +* a template arg of a ``class``, such as the use of ``Bar`` in:: + + class Foo<int Bar> { + int Baz = Bar; + } + +* value local to a ``multiclass``, such as the use of ``Bar`` in:: + + multiclass Foo { + int Bar = 5; + int Baz = Bar; + } + +* a template arg to a ``multiclass``, such as the use of ``Bar`` in:: + + multiclass Foo<int Bar> { + int Baz = Bar; + } + +.. productionlist:: + SimpleValue: `TokInteger` + +This represents the numeric value of the integer. + +.. productionlist:: + SimpleValue: `TokString`+ + +Multiple adjacent string literals are concatenated like in C/C++. The value +is the concatenation of the strings. + +.. productionlist:: + SimpleValue: `TokCodeFragment` + +The value is the string value of the code fragment. + +.. productionlist:: + SimpleValue: "?" + +``?`` represents an "unset" initializer. + +.. productionlist:: + SimpleValue: "{" `ValueList` "}" + ValueList: [`ValueListNE`] + ValueListNE: `Value` ("," `Value`)* + +This represents a sequence of bits, as would be used to initialize a +``bits<n>`` field (where ``n`` is the number of bits). + +.. 
productionlist:: + SimpleValue: `ClassID` "<" `ValueListNE` ">" + +This generates a new anonymous record definition (as would be created by an +unnamed ``def`` inheriting from the given class with the given template +arguments) and the value is the value of that record definition. + +.. productionlist:: + SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"] + +A list initializer. The optional :token:`Type` can be used to indicate a +specific element type, otherwise the element type will be deduced from the +given values. + +.. The initial `DagArg` of the dag must start with an identifier or + !cast, but this is more of an implementation detail and so for now just + leave it out. + +.. productionlist:: + SimpleValue: "(" `DagArg` `DagArgList` ")" + DagArgList: `DagArg` ("," `DagArg`)* + DagArg: `Value` [":" `TokVarName`] + +The initial :token:`DagArg` is called the "operator" of the dag. + +.. productionlist:: + SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")" + +Bodies +------ + +.. productionlist:: + ObjectBody: `BaseClassList` `Body` + BaseClassList: [":" `BaseClassListNE`] + BaseClassListNE: `SubClassRef` ("," `SubClassRef`)* + SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"] + DefmID: `TokIdentifier` + +The version with the :token:`MultiClassID` is only valid in the +:token:`BaseClassList` of a ``defm``. +The :token:`MultiClassID` should be the name of a ``multiclass``. + +.. put this somewhere else + +It is after parsing the base class list that the "let stack" is applied. + +.. productionlist:: + Body: ";" | "{" BodyList "}" + BodyList: BodyItem* + BodyItem: `Declaration` ";" + :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";" + +The ``let`` form allows overriding the value of an inherited field. + +``def`` +------- + +.. TODO:: + There can be pastes in the names here, like ``#NAME#``. Look into that + and document it (it boils down to ParseIDValue with IDParseMode == + ParseNameMode). ParseObjectName calls into the general ParseValue, with + the only different from "arbitrary expression parsing" being IDParseMode + == Mode. + +.. productionlist:: + Def: "def" `TokIdentifier` `ObjectBody` + +Defines a record whose name is given by the :token:`TokIdentifier`. The +fields of the record are inherited from the base classes and defined in the +body. + +Special handling occurs if this ``def`` appears inside a ``multiclass`` or +a ``foreach``. + +``defm`` +-------- + +.. productionlist:: + Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";" + +Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must +precede any ``class``'s that appear. + +``foreach`` +----------- + +.. productionlist:: + Foreach: "foreach" `Declaration` "in" "{" `Object`* "}" + :| "foreach" `Declaration` "in" `Object` + +The value assigned to the variable in the declaration is iterated over and +the object or object list is reevaluated with the variable set at each +iterated value. + +Top-Level ``let`` +----------------- + +.. productionlist:: + Let: "let" `LetList` "in" "{" `Object`* "}" + :| "let" `LetList` "in" `Object` + LetList: `LetItem` ("," `LetItem`)* + LetItem: `TokIdentifier` [`RangeList`] "=" `Value` + +This is effectively equivalent to ``let`` inside the body of a record +except that it applies to multiple records at a time. The bindings are +applied at the end of parsing the base classes of a record. + +``multiclass`` +-------------- + +.. 
productionlist:: + MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`] + : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}" + BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)* + MultiClassID: `TokIdentifier` + MultiClassObject: `Def` | `Defm` | `Let` | `Foreach` diff --git a/docs/TableGenFundamentals.rst b/docs/TableGenFundamentals.rst index 356b7d208e..4fe4bb986a 100644 --- a/docs/TableGenFundamentals.rst +++ b/docs/TableGenFundamentals.rst @@ -1,5 +1,3 @@ -.. _tablegen: - ===================== TableGen Fundamentals ===================== @@ -121,11 +119,11 @@ this (at the time of this writing): ... This definition corresponds to the 32-bit register-register ``add`` instruction -of the the x86 architecture. ``def ADD32rr`` defines a record named +of the x86 architecture. ``def ADD32rr`` defines a record named ``ADD32rr``, and the comment at the end of the line indicates the superclasses of the definition. The body of the record contains all of the data that TableGen assembled for the record, indicating that the instruction is part of -the "X86" namespace, the pattern indicating how the the instruction should be +the "X86" namespace, the pattern indicating how the instruction should be emitted into the assembly file, that it is a two-address instruction, has a particular encoding, etc. The contents and semantics of the information in the record are specific to the needs of the X86 backend, and are only shown as an @@ -793,6 +791,10 @@ Expressions used by code generator to describe instructions and isel patterns: TableGen backends ================= +Until we get a step-by-step HowTo for writing TableGen backends, you can at +least grab the boilerplate (build system, new files, etc.) from Clang's +r173931. + TODO: How they work, how to write one. This section should not contain details about any particular backend, except maybe ``-print-enums`` as an example. This should highlight the APIs in ``TableGen/Record.h``. diff --git a/docs/TestSuiteMakefileGuide.rst b/docs/TestSuiteMakefileGuide.rst index b10379ef4d..e2852a0735 100644 --- a/docs/TestSuiteMakefileGuide.rst +++ b/docs/TestSuiteMakefileGuide.rst @@ -2,9 +2,6 @@ LLVM test-suite Makefile Guide ============================== -Written by John T. Criswell, Daniel Dunbar, Reid Spencer, and Tanya -Lattner - .. contents:: :local: diff --git a/docs/TestingGuide.rst b/docs/TestingGuide.rst index f66cae1d14..4d8c8ce307 100644 --- a/docs/TestingGuide.rst +++ b/docs/TestingGuide.rst @@ -2,9 +2,6 @@ LLVM Testing Infrastructure Guide ================================= -Written by John T. Criswell, Daniel Dunbar, Reid Spencer, and Tanya -Lattner - .. contents:: :local: @@ -234,51 +231,58 @@ what you can use in yours. The major differences are: - You can't do ``2>&1``. That will cause :program:`lit` to write to a file named ``&1``. Usually this is done to get stderr to go through a pipe. You can do that with ``|&`` so replace this idiom: - ``... 2>&1 | grep`` with ``... |& grep`` + ``... 2>&1 | FileCheck`` with ``... |& FileCheck`` - You can only redirect to a file, not to another descriptor and not from a here document. There are some quoting rules that you must pay attention to when writing your RUN lines. In general nothing needs to be quoted. :program:`lit` won't strip off any quote characters so they will get passed to the invoked program. -For example: +To avoid this use curly braces to tell :program:`lit` that it should treat +everything enclosed as one value. -.. 
code-block:: bash +In general, you should strive to keep your RUN lines as simple as possible, +using them only to run tools that generate textual output you can then examine. +The recommended way to examine output to figure out if the test passes it using +the :doc:`FileCheck tool <CommandGuide/FileCheck>`. *[The usage of grep in RUN +lines is deprecated - please do not send or commit patches that use it.]* - ... | grep 'find this string' +Fragile tests +------------- -This will fail because the ``'`` characters are passed to ``grep``. This would -make ``grep`` to look for ``'find`` in the files ``this`` and -``string'``. To avoid this use curly braces to tell :program:`lit` that it -should treat everything enclosed as one value. So our example would become: +It is easy to write a fragile test that would fail spuriously if the tool being +tested outputs a full path to the input file. For example, :program:`opt` by +default outputs a ``ModuleID``: -.. code-block:: bash +.. code-block:: console - ... | grep {find this string} + $ cat example.ll + define i32 @main() nounwind { + ret i32 0 + } -In general, you should strive to keep your RUN lines as simple as possible, -using them only to run tools that generate the output you can then examine. The -recommended way to examine output to figure out if the test passes it using the -:doc:`FileCheck tool <CommandGuide/FileCheck>`. The usage of ``grep`` in RUN -lines is discouraged. - -The FileCheck utility ---------------------- - -A powerful feature of the RUN lines is that it allows any arbitrary -commands to be executed as part of the test harness. While standard -(portable) unix tools like ``grep`` work fine on run lines, as you see -above, there are a lot of caveats due to interaction with shell syntax, -and we want to make sure the run lines are portable to a wide range of -systems. Another major problem is that ``grep`` is not very good at checking -to verify that the output of a tools contains a series of different -output in a specific order. The :program:`FileCheck` tool was designed to -help with these problems. - -:program:`FileCheck` is designed to read a file to check from standard input, -and the set of things to verify from a file specified as a command line -argument. :program:`FileCheck` is described in :doc:`the FileCheck man page -<CommandGuide/FileCheck>`. + $ opt -S /path/to/example.ll + ; ModuleID = '/path/to/example.ll' + + define i32 @main() nounwind { + ret i32 0 + } + +``ModuleID`` can unexpetedly match against ``CHECK`` lines. For example: + +.. code-block:: llvm + + ; RUN: opt -S %s | FileCheck + + define i32 @main() nounwind { + ; CHECK-NOT: load + ret i32 0 + } + +This test will fail if placed into a ``download`` directory. + +To make your tests robust, always use ``opt ... < %s`` in the RUN line. +:program:`opt` does not output a ``ModuleID`` when input comes from stdin. Variables and substitutions --------------------------- diff --git a/docs/Vectorizers.rst b/docs/Vectorizers.rst new file mode 100644 index 0000000000..e2d3667bc1 --- /dev/null +++ b/docs/Vectorizers.rst @@ -0,0 +1,338 @@ +========================== +Auto-Vectorization in LLVM +========================== + +.. contents:: + :local: + +LLVM has two vectorizers: The :ref:`Loop Vectorizer <loop-vectorizer>`, +which operates on Loops, and the :ref:`Basic Block Vectorizer +<bb-vectorizer>`, which optimizes straight-line code. These vectorizers +focus on different optimization opportunities and use different techniques. 
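As a rough, illustrative sketch (the functions below are invented for this
comparison and are not taken from the LLVM test suite), the first function is
the kind of counted loop the Loop Vectorizer targets, while the second is the
kind of straight-line code the Basic Block Vectorizer targets:

.. code-block:: c++

  // A counted loop: a candidate for the Loop Vectorizer, which widens the
  // operations to process several consecutive iterations at once.
  void scale(float *A, float K, int n) {
    for (int i = 0; i < n; ++i)
      A[i] *= K;
  }

  // Similar, independent scalar operations in straight-line code: a candidate
  // for the Basic Block (SLP) vectorizer, which merges them into vectors.
  void scale4(float *A, float K) {
    A[0] *= K;
    A[1] *= K;
    A[2] *= K;
    A[3] *= K;
  }
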
+The BB vectorizer merges multiple scalars that are found in the code into +vectors while the Loop Vectorizer widens instructions in the original loop +to operate on multiple consecutive loop iterations. + +.. _loop-vectorizer: + +The Loop Vectorizer +=================== + +Usage +----- + +LLVM's Loop Vectorizer is now available and will be useful for many people. +It is not enabled by default, but can be enabled through clang using the +command line flag: + +.. code-block:: console + + $ clang -fvectorize -O3 file.c + +If the ``-fvectorize`` flag is used then the loop vectorizer will be enabled +when running with ``-O3``, ``-O2``. When ``-Os`` is used, the loop vectorizer +will only vectorize loops that do not require a major increase in code size. + +We plan to enable the Loop Vectorizer by default as part of the LLVM 3.3 release. + +Command line flags +^^^^^^^^^^^^^^^^^^ + +The loop vectorizer uses a cost model to decide on the optimal vectorization factor +and unroll factor. However, users of the vectorizer can force the vectorizer to use +specific values. Both 'clang' and 'opt' support the flags below. + +Users can control the vectorization SIMD width using the command line flag "-force-vector-width". + +.. code-block:: console + + $ clang -mllvm -force-vector-width=8 ... + $ opt -loop-vectorize -force-vector-width=8 ... + +Users can control the unroll factor using the command line flag "-force-vector-unroll" + +.. code-block:: console + + $ clang -mllvm -force-vector-unroll=2 ... + $ opt -loop-vectorize -force-vector-unroll=2 ... + +Features +-------- + +The LLVM Loop Vectorizer has a number of features that allow it to vectorize +complex loops. + +Loops with unknown trip count +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The Loop Vectorizer supports loops with an unknown trip count. +In the loop below, the iteration ``start`` and ``finish`` points are unknown, +and the Loop Vectorizer has a mechanism to vectorize loops that do not start +at zero. In this example, 'n' may not be a multiple of the vector width, and +the vectorizer has to execute the last few iterations as scalar code. Keeping +a scalar copy of the loop increases the code size. + +.. code-block:: c++ + + void bar(float *A, float* B, float K, int start, int end) { + for (int i = start; i < end; ++i) + A[i] *= B[i] + K; + } + +Runtime Checks of Pointers +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In the example below, if the pointers A and B point to consecutive addresses, +then it is illegal to vectorize the code because some elements of A will be +written before they are read from array B. + +Some programmers use the 'restrict' keyword to notify the compiler that the +pointers are disjointed, but in our example, the Loop Vectorizer has no way of +knowing that the pointers A and B are unique. The Loop Vectorizer handles this +loop by placing code that checks, at runtime, if the arrays A and B point to +disjointed memory locations. If arrays A and B overlap, then the scalar version +of the loop is executed. + +.. code-block:: c++ + + void bar(float *A, float* B, float K, int n) { + for (int i = 0; i < n; ++i) + A[i] *= B[i] + K; + } + + +Reductions +^^^^^^^^^^ + +In this example the ``sum`` variable is used by consecutive iterations of +the loop. Normally, this would prevent vectorization, but the vectorizer can +detect that 'sum' is a reduction variable. The variable 'sum' becomes a vector +of integers, and at the end of the loop the elements of the array are added +together to create the correct result. 
We support a number of different +reduction operations, such as addition, multiplication, XOR, AND and OR. + +.. code-block:: c++ + + int foo(int *A, int *B, int n) { + unsigned sum = 0; + for (int i = 0; i < n; ++i) + sum += A[i] + 5; + return sum; + } + +We support floating point reduction operations when `-ffast-math` is used. + +Inductions +^^^^^^^^^^ + +In this example the value of the induction variable ``i`` is saved into an +array. The Loop Vectorizer knows to vectorize induction variables. + +.. code-block:: c++ + + void bar(float *A, float* B, float K, int n) { + for (int i = 0; i < n; ++i) + A[i] = i; + } + +If Conversion +^^^^^^^^^^^^^ + +The Loop Vectorizer is able to "flatten" the IF statement in the code and +generate a single stream of instructions. The Loop Vectorizer supports any +control flow in the innermost loop. The innermost loop may contain complex +nesting of IFs, ELSEs and even GOTOs. + +.. code-block:: c++ + + int foo(int *A, int *B, int n) { + unsigned sum = 0; + for (int i = 0; i < n; ++i) + if (A[i] > B[i]) + sum += A[i] + 5; + return sum; + } + +Pointer Induction Variables +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This example uses the "accumulate" function of the standard c++ library. This +loop uses C++ iterators, which are pointers, and not integer indices. +The Loop Vectorizer detects pointer induction variables and can vectorize +this loop. This feature is important because many C++ programs use iterators. + +.. code-block:: c++ + + int baz(int *A, int n) { + return std::accumulate(A, A + n, 0); + } + +Reverse Iterators +^^^^^^^^^^^^^^^^^ + +The Loop Vectorizer can vectorize loops that count backwards. + +.. code-block:: c++ + + int foo(int *A, int *B, int n) { + for (int i = n; i > 0; --i) + A[i] +=1; + } + +Scatter / Gather +^^^^^^^^^^^^^^^^ + +The Loop Vectorizer can vectorize code that becomes a sequence of scalar instructions +that scatter/gathers memory. + +.. code-block:: c++ + + int foo(int *A, int *B, int n, int k) { + for (int i = 0; i < n; ++i) + A[i*7] += B[i*k]; + } + +Vectorization of Mixed Types +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The Loop Vectorizer can vectorize programs with mixed types. The Vectorizer +cost model can estimate the cost of the type conversion and decide if +vectorization is profitable. + +.. code-block:: c++ + + int foo(int *A, char *B, int n, int k) { + for (int i = 0; i < n; ++i) + A[i] += 4 * B[i]; + } + +Global Structures Alias Analysis +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Access to global structures can also be vectorized, with alias analysis being +used to make sure accesses don't alias. Run-time checks can also be added on +pointer access to structure members. + +Many variations are supported, but some that rely on undefined behaviour being +ignored (as other compilers do) are still being left un-vectorized. + +.. code-block:: c++ + + struct { int A[100], K, B[100]; } Foo; + + int foo() { + for (int i = 0; i < 100; ++i) + Foo.A[i] = Foo.B[i] + 100; + } + +Vectorization of function calls +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The Loop Vectorize can vectorize intrinsic math functions. +See the table below for a list of these functions. 
+ ++-----+-----+---------+ +| pow | exp | exp2 | ++-----+-----+---------+ +| sin | cos | sqrt | ++-----+-----+---------+ +| log |log2 | log10 | ++-----+-----+---------+ +|fabs |floor| ceil | ++-----+-----+---------+ +|fma |trunc|nearbyint| ++-----+-----+---------+ +| | | fmuladd | ++-----+-----+---------+ + +The loop vectorizer knows about special instructions on the target and will +vectorize a loop containing a function call that maps to the instructions. For +example, the loop below will be vectorized on Intel x86 if the SSE4.1 roundps +instruction is available. + +.. code-block:: c++ + + void foo(float *f) { + for (int i = 0; i != 1024; ++i) + f[i] = floorf(f[i]); + } + +Partial unrolling during vectorization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Modern processors feature multiple execution units, and only programs that contain a +high degree of parallelism can fully utilize the entire width of the machine. +The Loop Vectorizer increases the instruction level parallelism (ILP) by +performing partial-unrolling of loops. + +In the example below the entire array is accumulated into the variable 'sum'. +This is inefficient because only a single execution port can be used by the processor. +By unrolling the code the Loop Vectorizer allows two or more execution ports +to be used simultaneously. + +.. code-block:: c++ + + int foo(int *A, int *B, int n) { + unsigned sum = 0; + for (int i = 0; i < n; ++i) + sum += A[i]; + return sum; + } + +The Loop Vectorizer uses a cost model to decide when it is profitable to unroll loops. +The decision to unroll the loop depends on the register pressure and the generated code size. + +Performance +----------- + +This section shows the the execution time of Clang on a simple benchmark: +`gcc-loops <http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/UnitTests/Vectorizer/>`_. +This benchmarks is a collection of loops from the GCC autovectorization +`page <http://gcc.gnu.org/projects/tree-ssa/vectorization.html>`_ by Dorit Nuzman. + +The chart below compares GCC-4.7, ICC-13, and Clang-SVN with and without loop vectorization at -O3, tuned for "corei7-avx", running on a Sandybridge iMac. +The Y-axis shows the time in msec. Lower is better. The last column shows the geomean of all the kernels. + +.. image:: gcc-loops.png + +And Linpack-pc with the same configuration. Result is Mflops, higher is better. + +.. image:: linpack-pc.png + +.. _bb-vectorizer: + +The Basic Block Vectorizer +========================== + +Usage +------ + +The Basic Block Vectorizer is not enabled by default, but it can be enabled +through clang using the command line flag: + +.. code-block:: console + + $ clang -fslp-vectorize file.c + +Details +------- + +The goal of basic-block vectorization (a.k.a. superword-level parallelism) is +to combine similar independent instructions within simple control-flow regions +into vector instructions. Memory accesses, arithemetic operations, comparison +operations and some math functions can all be vectorized using this technique +(subject to the capabilities of the target architecture). + +For example, the following function performs very similar operations on its +inputs (a1, b1) and (a2, b2). The basic-block vectorizer may combine these +into vector operations. + +.. 
code-block:: c++ + + int foo(int a1, int a2, int b1, int b2) { + int r1 = a1*(a1 + b1)/b1 + 50*b1/a1; + int r2 = a2*(a2 + b2)/b2 + 50*b2/a2; + return r1 + r2; + } + + diff --git a/docs/WritingAnLLVMBackend.rst b/docs/WritingAnLLVMBackend.rst index 7803163ae6..6d6c2a1070 100644 --- a/docs/WritingAnLLVMBackend.rst +++ b/docs/WritingAnLLVMBackend.rst @@ -2,7 +2,10 @@ Writing an LLVM Compiler Backend ================================ -.. sectionauthor:: Mason Woo <http://www.woo.com> and Misha Brukman <http://misha.brukman.net> +.. toctree:: + :hidden: + + HowToUseInstrMappings .. contents:: :local: @@ -54,8 +57,8 @@ These essential documents must be read before reading this document: file (``.td`` suffix) and generates C++ code that can be used for code generation. -* `Writing an LLVM Pass <WritingAnLLVMPass.html>`_ --- The assembly printer is - a ``FunctionPass``, as are several SelectionDAG processing steps. +* :doc:`WritingAnLLVMPass` --- The assembly printer is a ``FunctionPass``, as + are several ``SelectionDAG`` processing steps. To follow the SPARC examples in this document, have a copy of `The SPARC Architecture Manual, Version 8 <http://www.sparc.org/standards/V8.pdf>`_ for diff --git a/docs/WritingAnLLVMPass.html b/docs/WritingAnLLVMPass.html deleted file mode 100644 index af1ffa4fb7..0000000000 --- a/docs/WritingAnLLVMPass.html +++ /dev/null @@ -1,1954 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>Writing an LLVM Pass</title> - <link rel="stylesheet" href="_static/llvm.css" type="text/css"> -</head> -<body> - -<h1> - Writing an LLVM Pass -</h1> - -<ol> - <li><a href="#introduction">Introduction - What is a pass?</a></li> - <li><a href="#quickstart">Quick Start - Writing hello world</a> - <ul> - <li><a href="#makefile">Setting up the build environment</a></li> - <li><a href="#basiccode">Basic code required</a></li> - <li><a href="#running">Running a pass with <tt>opt</tt></a></li> - </ul></li> - <li><a href="#passtype">Pass classes and requirements</a> - <ul> - <li><a href="#ImmutablePass">The <tt>ImmutablePass</tt> class</a></li> - <li><a href="#ModulePass">The <tt>ModulePass</tt> class</a> - <ul> - <li><a href="#runOnModule">The <tt>runOnModule</tt> method</a></li> - </ul></li> - <li><a href="#CallGraphSCCPass">The <tt>CallGraphSCCPass</tt> class</a> - <ul> - <li><a href="#doInitialization_scc">The <tt>doInitialization(CallGraph - &)</tt> method</a></li> - <li><a href="#runOnSCC">The <tt>runOnSCC</tt> method</a></li> - <li><a href="#doFinalization_scc">The <tt>doFinalization(CallGraph - &)</tt> method</a></li> - </ul></li> - <li><a href="#FunctionPass">The <tt>FunctionPass</tt> class</a> - <ul> - <li><a href="#doInitialization_mod">The <tt>doInitialization(Module - &)</tt> method</a></li> - <li><a href="#runOnFunction">The <tt>runOnFunction</tt> method</a></li> - <li><a href="#doFinalization_mod">The <tt>doFinalization(Module - &)</tt> method</a></li> - </ul></li> - <li><a href="#LoopPass">The <tt>LoopPass</tt> class</a> - <ul> - <li><a href="#doInitialization_loop">The <tt>doInitialization(Loop *, - LPPassManager &)</tt> method</a></li> - <li><a href="#runOnLoop">The <tt>runOnLoop</tt> method</a></li> - <li><a href="#doFinalization_loop">The <tt>doFinalization() - </tt> method</a></li> - </ul></li> - <li><a href="#RegionPass">The <tt>RegionPass</tt> class</a> - <ul> - <li><a href="#doInitialization_region">The 
<tt>doInitialization(Region *, - RGPassManager &)</tt> method</a></li> - <li><a href="#runOnRegion">The <tt>runOnRegion</tt> method</a></li> - <li><a href="#doFinalization_region">The <tt>doFinalization() - </tt> method</a></li> - </ul></li> - <li><a href="#BasicBlockPass">The <tt>BasicBlockPass</tt> class</a> - <ul> - <li><a href="#doInitialization_fn">The <tt>doInitialization(Function - &)</tt> method</a></li> - <li><a href="#runOnBasicBlock">The <tt>runOnBasicBlock</tt> - method</a></li> - <li><a href="#doFinalization_fn">The <tt>doFinalization(Function - &)</tt> method</a></li> - </ul></li> - <li><a href="#MachineFunctionPass">The <tt>MachineFunctionPass</tt> - class</a> - <ul> - <li><a href="#runOnMachineFunction">The - <tt>runOnMachineFunction(MachineFunction &)</tt> method</a></li> - </ul></li> - </ul> - <li><a href="#registration">Pass Registration</a> - <ul> - <li><a href="#print">The <tt>print</tt> method</a></li> - </ul></li> - <li><a href="#interaction">Specifying interactions between passes</a> - <ul> - <li><a href="#getAnalysisUsage">The <tt>getAnalysisUsage</tt> - method</a></li> - <li><a href="#AU::addRequired">The <tt>AnalysisUsage::addRequired<></tt> and <tt>AnalysisUsage::addRequiredTransitive<></tt> methods</a></li> - <li><a href="#AU::addPreserved">The <tt>AnalysisUsage::addPreserved<></tt> method</a></li> - <li><a href="#AU::examples">Example implementations of <tt>getAnalysisUsage</tt></a></li> - <li><a href="#getAnalysis">The <tt>getAnalysis<></tt> and -<tt>getAnalysisIfAvailable<></tt> methods</a></li> - </ul></li> - <li><a href="#analysisgroup">Implementing Analysis Groups</a> - <ul> - <li><a href="#agconcepts">Analysis Group Concepts</a></li> - <li><a href="#registerag">Using <tt>RegisterAnalysisGroup</tt></a></li> - </ul></li> - <li><a href="#passStatistics">Pass Statistics</a> - <li><a href="#passmanager">What PassManager does</a> - <ul> - <li><a href="#releaseMemory">The <tt>releaseMemory</tt> method</a></li> - </ul></li> - <li><a href="#registering">Registering dynamically loaded passes</a> - <ul> - <li><a href="#registering_existing">Using existing registries</a></li> - <li><a href="#registering_new">Creating new registries</a></li> - </ul></li> - <li><a href="#debughints">Using GDB with dynamically loaded passes</a> - <ul> - <li><a href="#breakpoint">Setting a breakpoint in your pass</a></li> - <li><a href="#debugmisc">Miscellaneous Problems</a></li> - </ul></li> - <li><a href="#future">Future extensions planned</a> - <ul> - <li><a href="#SMP">Multithreaded LLVM</a></li> - </ul></li> -</ol> - -<div class="doc_author"> - <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a> and - <a href="mailto:jlaskey@mac.com">Jim Laskey</a></p> -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="introduction">Introduction - What is a pass?</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>The LLVM Pass Framework is an important part of the LLVM system, because LLVM -passes are where most of the interesting parts of the compiler exist. 
Passes -perform the transformations and optimizations that make up the compiler, they -build the analysis results that are used by these transformations, and they are, -above all, a structuring technique for compiler code.</p> - -<p>All LLVM passes are subclasses of the <tt><a -href="http://llvm.org/doxygen/classllvm_1_1Pass.html">Pass</a></tt> -class, which implement functionality by overriding virtual methods inherited -from <tt>Pass</tt>. Depending on how your pass works, you should inherit from -the <tt><a href="#ModulePass">ModulePass</a></tt>, <tt><a -href="#CallGraphSCCPass">CallGraphSCCPass</a></tt>, <tt><a -href="#FunctionPass">FunctionPass</a></tt>, or <tt><a -href="#LoopPass">LoopPass</a></tt>, or <tt><a -href="#RegionPass">RegionPass</a></tt>, or <tt><a -href="#BasicBlockPass">BasicBlockPass</a></tt> classes, which gives the system -more information about what your pass does, and how it can be combined with -other passes. One of the main features of the LLVM Pass Framework is that it -schedules passes to run in an efficient way based on the constraints that your -pass meets (which are indicated by which class they derive from).</p> - -<p>We start by showing you how to construct a pass, everything from setting up -the code, to compiling, loading, and executing it. After the basics are down, -more advanced features are discussed.</p> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="quickstart">Quick Start - Writing hello world</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Here we describe how to write the "hello world" of passes. The "Hello" pass -is designed to simply print out the name of non-external functions that exist in -the program being compiled. It does not modify the program at all, it just -inspects it. The source code and files for this pass are available in the LLVM -source tree in the <tt>lib/Transforms/Hello</tt> directory.</p> - -<!-- ======================================================================= --> -<h3> - <a name="makefile">Setting up the build environment</a> -</h3> - -<div> - - <p>First, configure and build LLVM. This needs to be done directly inside the - LLVM source tree rather than in a separate objects directory. - Next, you need to create a new directory somewhere in the LLVM source - base. For this example, we'll assume that you made - <tt>lib/Transforms/Hello</tt>. Finally, you must set up a build script - (Makefile) that will compile the source code for the new pass. To do this, - copy the following into <tt>Makefile</tt>:</p> - <hr> - -<div class="doc_code"><pre> -# Makefile for hello pass - -# Path to top level of LLVM hierarchy -LEVEL = ../../.. - -# Name of the library to build -LIBRARYNAME = Hello - -# Make the shared library become a loadable module so the tools can -# dlopen/dlsym on the resulting library. -LOADABLE_MODULE = 1 - -# Include the makefile implementation stuff -include $(LEVEL)/Makefile.common -</pre></div> - -<p>This makefile specifies that all of the <tt>.cpp</tt> files in the current -directory are to be compiled and linked together into a shared object -<tt>$(LEVEL)/Debug+Asserts/lib/Hello.so</tt> that can be dynamically loaded by -the <tt>opt</tt> or <tt>bugpoint</tt> tools via their <tt>-load</tt> options. 
-If your operating system uses a suffix other than .so (such as windows or -Mac OS/X), the appropriate extension will be used.</p> - -<p>If you are used CMake to build LLVM, see -<a href="CMake.html#passdev">Developing an LLVM pass with CMake</a>.</p> - -<p>Now that we have the build scripts set up, we just need to write the code for -the pass itself.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="basiccode">Basic code required</a> -</h3> - -<div> - -<p>Now that we have a way to compile our new pass, we just have to write it. -Start out with:</p> - -<div class="doc_code"> -<pre> -<b>#include</b> "<a href="http://llvm.org/doxygen/Pass_8h-source.html">llvm/Pass.h</a>" -<b>#include</b> "<a href="http://llvm.org/doxygen/Function_8h-source.html">llvm/Function.h</a>" -<b>#include</b> "<a href="http://llvm.org/doxygen/raw__ostream_8h.html">llvm/Support/raw_ostream.h</a>" -</pre> -</div> - -<p>Which are needed because we are writing a <tt><a -href="http://llvm.org/doxygen/classllvm_1_1Pass.html">Pass</a></tt>, -we are operating on <tt><a -href="http://llvm.org/doxygen/classllvm_1_1Function.html">Function</a></tt>'s, -and we will be doing some printing.</p> - -<p>Next we have:</p> - -<div class="doc_code"> -<pre> -<b>using namespace llvm;</b> -</pre> -</div> - -<p>... which is required because the functions from the include files -live in the llvm namespace.</p> - -<p>Next we have:</p> - -<div class="doc_code"> -<pre> -<b>namespace</b> { -</pre> -</div> - -<p>... which starts out an anonymous namespace. Anonymous namespaces are to C++ -what the "<tt>static</tt>" keyword is to C (at global scope). It makes the -things declared inside of the anonymous namespace visible only to the current -file. If you're not familiar with them, consult a decent C++ book for more -information.</p> - -<p>Next, we declare our pass itself:</p> - -<div class="doc_code"> -<pre> - <b>struct</b> Hello : <b>public</b> <a href="#FunctionPass">FunctionPass</a> { -</pre> -</div> - -<p>This declares a "<tt>Hello</tt>" class that is a subclass of <tt><a -href="http://llvm.org/doxygen/classllvm_1_1FunctionPass.html">FunctionPass</a></tt>. -The different builtin pass subclasses are described in detail <a -href="#passtype">later</a>, but for now, know that <a -href="#FunctionPass"><tt>FunctionPass</tt></a>'s operate on a function at a -time.</p> - -<div class="doc_code"> -<pre> - static char ID; - Hello() : FunctionPass(ID) {} -</pre> -</div> - -<p>This declares pass identifier used by LLVM to identify pass. This allows LLVM -to avoid using expensive C++ runtime information.</p> - -<div class="doc_code"> -<pre> - <b>virtual bool</b> <a href="#runOnFunction">runOnFunction</a>(Function &F) { - errs() << "<i>Hello: </i>"; - errs().write_escaped(F.getName()) << "\n"; - <b>return false</b>; - } - }; <i>// end of struct Hello</i> -} <i>// end of anonymous namespace</i> -</pre> -</div> - -<p>We declare a "<a href="#runOnFunction"><tt>runOnFunction</tt></a>" method, -which overloads an abstract virtual method inherited from <a -href="#FunctionPass"><tt>FunctionPass</tt></a>. This is where we are supposed -to do our thing, so we just print out our message with the name of each -function.</p> - -<div class="doc_code"> -<pre> -char Hello::ID = 0; -</pre> -</div> - -<p>We initialize pass ID here. 
LLVM uses ID's address to identify a pass, so -initialization value is not important.</p> - -<div class="doc_code"> -<pre> -static RegisterPass<Hello> X("<i>hello</i>", "<i>Hello World Pass</i>", - false /* Only looks at CFG */, - false /* Analysis Pass */); -</pre> -</div> - -<p>Lastly, we <a href="#registration">register our class</a> <tt>Hello</tt>, -giving it a command line argument "<tt>hello</tt>", and a name "<tt>Hello World -Pass</tt>". The last two arguments describe its behavior: if a pass walks CFG -without modifying it then the third argument is set to <tt>true</tt>; if a pass -is an analysis pass, for example dominator tree pass, then <tt>true</tt> is -supplied as the fourth argument.</p> - -<p>As a whole, the <tt>.cpp</tt> file looks like:</p> - -<div class="doc_code"> -<pre> -<b>#include</b> "<a href="http://llvm.org/doxygen/Pass_8h-source.html">llvm/Pass.h</a>" -<b>#include</b> "<a href="http://llvm.org/doxygen/Function_8h-source.html">llvm/Function.h</a>" -<b>#include</b> "<a href="http://llvm.org/doxygen/raw__ostream_8h.html">llvm/Support/raw_ostream.h</a>" - -<b>using namespace llvm;</b> - -<b>namespace</b> { - <b>struct Hello</b> : <b>public</b> <a href="#FunctionPass">FunctionPass</a> { - - static char ID; - Hello() : FunctionPass(ID) {} - - <b>virtual bool</b> <a href="#runOnFunction">runOnFunction</a>(Function &F) { - errs() << "<i>Hello: </i>"; - errs().write_escaped(F.getName()) << '\n'; - <b>return false</b>; - } - - }; -} - -char Hello::ID = 0; -static RegisterPass<Hello> X("hello", "Hello World Pass", false, false); -</pre> -</div> - -<p>Now that it's all together, compile the file with a simple "<tt>gmake</tt>" -command in the local directory and you should get a new file -"<tt>Debug+Asserts/lib/Hello.so</tt>" under the top level directory of the LLVM -source tree (not in the local directory). Note that everything in this file is -contained in an anonymous namespace — this reflects the fact that passes -are self contained units that do not need external interfaces (although they can -have them) to be useful.</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="running">Running a pass with <tt>opt</tt></a> -</h3> - -<div> - -<p>Now that you have a brand new shiny shared object file, we can use the -<tt>opt</tt> command to run an LLVM program through your pass. Because you -registered your pass with <tt>RegisterPass</tt>, you will be able to -use the <tt>opt</tt> tool to access it, once loaded.</p> - -<p>To test it, follow the example at the end of the <a -href="GettingStarted.html">Getting Started Guide</a> to compile "Hello World" to -LLVM. We can now run the bitcode file (<tt>hello.bc</tt>) for the program -through our transformation like this (or course, any bitcode file will -work):</p> - -<div class="doc_code"><pre> -$ opt -load ../../../Debug+Asserts/lib/Hello.so -hello < hello.bc > /dev/null -Hello: __main -Hello: puts -Hello: main -</pre></div> - -<p>The '<tt>-load</tt>' option specifies that '<tt>opt</tt>' should load your -pass as a shared object, which makes '<tt>-hello</tt>' a valid command line -argument (which is one reason you need to <a href="#registration">register your -pass</a>). 
Because the hello pass does not modify the program in any -interesting way, we just throw away the result of <tt>opt</tt> (sending it to -<tt>/dev/null</tt>).</p> - -<p>To see what happened to the other string you registered, try running -<tt>opt</tt> with the <tt>-help</tt> option:</p> - -<div class="doc_code"><pre> -$ opt -load ../../../Debug+Asserts/lib/Hello.so -help -OVERVIEW: llvm .bc -> .bc modular optimizer - -USAGE: opt [options] <input bitcode> - -OPTIONS: - Optimizations available: -... - -globalopt - Global Variable Optimizer - -globalsmodref-aa - Simple mod/ref analysis for globals - -gvn - Global Value Numbering - <b>-hello - Hello World Pass</b> - -indvars - Induction Variable Simplification - -inline - Function Integration/Inlining - -insert-edge-profiling - Insert instrumentation for edge profiling -... -</pre></div> - -<p>The pass name gets added as the information string for your pass, giving some -documentation to users of <tt>opt</tt>. Now that you have a working pass, you -would go ahead and make it do the cool transformations you want. Once you get -it all working and tested, it may become useful to find out how fast your pass -is. The <a href="#passManager"><tt>PassManager</tt></a> provides a nice command -line option (<tt>--time-passes</tt>) that allows you to get information about -the execution time of your pass along with the other passes you queue up. For -example:</p> - -<div class="doc_code"><pre> -$ opt -load ../../../Debug+Asserts/lib/Hello.so -hello -time-passes < hello.bc > /dev/null -Hello: __main -Hello: puts -Hello: main -=============================================================================== - ... Pass execution timing report ... -=============================================================================== - Total Execution Time: 0.02 seconds (0.0479059 wall clock) - - ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Pass Name --- - 0.0100 (100.0%) 0.0000 ( 0.0%) 0.0100 ( 50.0%) 0.0402 ( 84.0%) Bitcode Writer - 0.0000 ( 0.0%) 0.0100 (100.0%) 0.0100 ( 50.0%) 0.0031 ( 6.4%) Dominator Set Construction - 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0013 ( 2.7%) Module Verifier - <b> 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0033 ( 6.9%) Hello World Pass</b> - 0.0100 (100.0%) 0.0100 (100.0%) 0.0200 (100.0%) 0.0479 (100.0%) TOTAL -</pre></div> - -<p>As you can see, our implementation above is pretty fast :). The additional -passes listed are automatically inserted by the '<tt>opt</tt>' tool to verify -that the LLVM emitted by your pass is still valid and well formed LLVM, which -hasn't been broken somehow.</p> - -<p>Now that you have seen the basics of the mechanics behind passes, we can talk -about some more details of how they work and how to use them.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="passtype">Pass classes and requirements</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>One of the first things that you should do when designing a new pass is to -decide what class you should subclass for your pass. The <a -href="#basiccode">Hello World</a> example uses the <tt><a -href="#FunctionPass">FunctionPass</a></tt> class for its implementation, but we -did not discuss why or when this should occur. 
Here we talk about the classes -available, from the most general to the most specific.</p> - -<p>When choosing a superclass for your Pass, you should choose the <b>most -specific</b> class possible, while still being able to meet the requirements -listed. This gives the LLVM Pass Infrastructure information necessary to -optimize how passes are run, so that the resultant compiler isn't unnecessarily -slow.</p> - -<!-- ======================================================================= --> -<h3> - <a name="ImmutablePass">The <tt>ImmutablePass</tt> class</a> -</h3> - -<div> - -<p>The most plain and boring type of pass is the "<tt><a -href="http://llvm.org/doxygen/classllvm_1_1ImmutablePass.html">ImmutablePass</a></tt>" -class. This pass type is used for passes that do not have to be run, do not -change state, and never need to be updated. This is not a normal type of -transformation or analysis, but can provide information about the current -compiler configuration.</p> - -<p>Although this pass class is very infrequently used, it is important for -providing information about the current target machine being compiled for, and -other static information that can affect the various transformations.</p> - -<p><tt>ImmutablePass</tt>es never invalidate other transformations, are never -invalidated, and are never "run".</p> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="ModulePass">The <tt>ModulePass</tt> class</a> -</h3> - -<div> - -<p>The "<tt><a -href="http://llvm.org/doxygen/classllvm_1_1ModulePass.html">ModulePass</a></tt>" -class is the most general of all superclasses that you can use. Deriving from -<tt>ModulePass</tt> indicates that your pass uses the entire program as a unit, -referring to function bodies in no predictable order, or adding and removing -functions. Because nothing is known about the behavior of <tt>ModulePass</tt> -subclasses, no optimization can be done for their execution.</p> - -<p>A module pass can use function level passes (e.g. dominators) using -the getAnalysis interface -<tt>getAnalysis<DominatorTree>(llvm::Function *)</tt> to provide the -function to retrieve analysis result for, if the function pass does not require -any module or immutable passes. Note that this can only be done for functions for which the -analysis ran, e.g. in the case of dominators you should only ask for the -DominatorTree for function definitions, not declarations.</p> - -<p>To write a correct <tt>ModulePass</tt> subclass, derive from -<tt>ModulePass</tt> and overload the <tt>runOnModule</tt> method with the -following signature:</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="runOnModule">The <tt>runOnModule</tt> method</a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> runOnModule(Module &M) = 0; -</pre></div> - -<p>The <tt>runOnModule</tt> method performs the interesting work of the pass. -It should return true if the module was modified by the transformation and -false otherwise.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="CallGraphSCCPass">The <tt>CallGraphSCCPass</tt> class</a> -</h3> - -<div> - -<p>The "<tt><a -href="http://llvm.org/doxygen/classllvm_1_1CallGraphSCCPass.html">CallGraphSCCPass</a></tt>" -is used by passes that need to traverse the program bottom-up on the call graph -(callees before callers). 
Deriving from CallGraphSCCPass provides some -mechanics for building and traversing the CallGraph, but also allows the system -to optimize execution of CallGraphSCCPass's. If your pass meets the -requirements outlined below, and doesn't meet the requirements of a <tt><a -href="#FunctionPass">FunctionPass</a></tt> or <tt><a -href="#BasicBlockPass">BasicBlockPass</a></tt>, you should derive from -<tt>CallGraphSCCPass</tt>.</p> - -<p><b>TODO</b>: explain briefly what SCC, Tarjan's algo, and B-U mean.</p> - -<p>To be explicit, <tt>CallGraphSCCPass</tt> subclasses are:</p> - -<ol> - -<li>... <em>not allowed</em> to inspect or modify any <tt>Function</tt>s other -than those in the current SCC and the direct callers and direct callees of the -SCC.</li> - -<li>... <em>required</em> to preserve the current CallGraph object, updating it -to reflect any changes made to the program.</li> - -<li>... <em>not allowed</em> to add or remove SCC's from the current Module, -though they may change the contents of an SCC.</li> - -<li>... <em>allowed</em> to add or remove global variables from the current -Module.</li> - -<li>... <em>allowed</em> to maintain state across invocations of - <a href="#runOnSCC"><tt>runOnSCC</tt></a> (including global data).</li> -</ol> - -<p>Implementing a <tt>CallGraphSCCPass</tt> is slightly tricky in some cases -because it has to handle SCCs with more than one node in it. All of the virtual -methods described below should return true if they modified the program, or -false if they didn't.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="doInitialization_scc"> - The <tt>doInitialization(CallGraph &)</tt> method - </a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> doInitialization(CallGraph &CG); -</pre></div> - -<p>The <tt>doIninitialize</tt> method is allowed to do most of the things that -<tt>CallGraphSCCPass</tt>'s are not allowed to do. They can add and remove -functions, get pointers to functions, etc. The <tt>doInitialization</tt> method -is designed to do simple initialization type of stuff that does not depend on -the SCCs being processed. 
The <tt>doInitialization</tt> method call is not -scheduled to overlap with any other pass executions (thus it should be very -fast).</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="runOnSCC">The <tt>runOnSCC</tt> method</a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> runOnSCC(CallGraphSCC &SCC) = 0; -</pre></div> - -<p>The <tt>runOnSCC</tt> method performs the interesting work of the pass, and -should return true if the module was modified by the transformation, false -otherwise.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="doFinalization_scc"> - The <tt>doFinalization(CallGraph &)</tt> method - </a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> doFinalization(CallGraph &CG); -</pre></div> - -<p>The <tt>doFinalization</tt> method is an infrequently used method that is -called when the pass framework has finished calling <a -href="#runOnFunction"><tt>runOnFunction</tt></a> for every function in the -program being compiled.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="FunctionPass">The <tt>FunctionPass</tt> class</a> -</h3> - -<div> - -<p>In contrast to <tt>ModulePass</tt> subclasses, <tt><a -href="http://llvm.org/doxygen/classllvm_1_1Pass.html">FunctionPass</a></tt> -subclasses do have a predictable, local behavior that can be expected by the -system. All <tt>FunctionPass</tt> execute on each function in the program -independent of all of the other functions in the program. -<tt>FunctionPass</tt>'s do not require that they are executed in a particular -order, and <tt>FunctionPass</tt>'s do not modify external functions.</p> - -<p>To be explicit, <tt>FunctionPass</tt> subclasses are not allowed to:</p> - -<ol> -<li>Modify a Function other than the one currently being processed.</li> -<li>Add or remove Function's from the current Module.</li> -<li>Add or remove global variables from the current Module.</li> -<li>Maintain state across invocations of - <a href="#runOnFunction"><tt>runOnFunction</tt></a> (including global data)</li> -</ol> - -<p>Implementing a <tt>FunctionPass</tt> is usually straightforward (See the <a -href="#basiccode">Hello World</a> pass for example). <tt>FunctionPass</tt>'s -may overload three virtual methods to do their work. All of these methods -should return true if they modified the program, or false if they didn't.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="doInitialization_mod"> - The <tt>doInitialization(Module &)</tt> method - </a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> doInitialization(Module &M); -</pre></div> - -<p>The <tt>doIninitialize</tt> method is allowed to do most of the things that -<tt>FunctionPass</tt>'s are not allowed to do. They can add and remove -functions, get pointers to functions, etc. The <tt>doInitialization</tt> method -is designed to do simple initialization type of stuff that does not depend on -the functions being processed. The <tt>doInitialization</tt> method call is not -scheduled to overlap with any other pass executions (thus it should be very -fast).</p> - -<p>A good example of how this method should be used is the <a -href="http://llvm.org/doxygen/LowerAllocations_8cpp-source.html">LowerAllocations</a> -pass. 
This pass converts <tt>malloc</tt> and <tt>free</tt> instructions into -platform dependent <tt>malloc()</tt> and <tt>free()</tt> function calls. It -uses the <tt>doInitialization</tt> method to get a reference to the malloc and -free functions that it needs, adding prototypes to the module if necessary.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="runOnFunction">The <tt>runOnFunction</tt> method</a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> runOnFunction(Function &F) = 0; -</pre></div><p> - -<p>The <tt>runOnFunction</tt> method must be implemented by your subclass to do -the transformation or analysis work of your pass. As usual, a true value should -be returned if the function is modified.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="doFinalization_mod"> - The <tt>doFinalization(Module &)</tt> method - </a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> doFinalization(Module &M); -</pre></div> - -<p>The <tt>doFinalization</tt> method is an infrequently used method that is -called when the pass framework has finished calling <a -href="#runOnFunction"><tt>runOnFunction</tt></a> for every function in the -program being compiled.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="LoopPass">The <tt>LoopPass</tt> class </a> -</h3> - -<div> - -<p> All <tt>LoopPass</tt> execute on each loop in the function independent of -all of the other loops in the function. <tt>LoopPass</tt> processes loops in -loop nest order such that outer most loop is processed last. </p> - -<p> <tt>LoopPass</tt> subclasses are allowed to update loop nest using -<tt>LPPassManager</tt> interface. Implementing a loop pass is usually -straightforward. <tt>LoopPass</tt>'s may overload three virtual methods to -do their work. All these methods should return true if they modified the -program, or false if they didn't. </p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="doInitialization_loop"> - The <tt>doInitialization(Loop *,LPPassManager &)</tt> method - </a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> doInitialization(Loop *, LPPassManager &LPM); -</pre></div> - -<p>The <tt>doInitialization</tt> method is designed to do simple initialization -type of stuff that does not depend on the functions being processed. The -<tt>doInitialization</tt> method call is not scheduled to overlap with any -other pass executions (thus it should be very fast). LPPassManager -interface should be used to access Function or Module level analysis -information.</p> - -</div> - - -<!-- _______________________________________________________________________ --> -<h4> - <a name="runOnLoop">The <tt>runOnLoop</tt> method</a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> runOnLoop(Loop *, LPPassManager &LPM) = 0; -</pre></div><p> - -<p>The <tt>runOnLoop</tt> method must be implemented by your subclass to do -the transformation or analysis work of your pass. As usual, a true value should -be returned if the function is modified. 
<tt>LPPassManager</tt> interface -should be used to update loop nest.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="doFinalization_loop">The <tt>doFinalization()</tt> method</a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> doFinalization(); -</pre></div> - -<p>The <tt>doFinalization</tt> method is an infrequently used method that is -called when the pass framework has finished calling <a -href="#runOnLoop"><tt>runOnLoop</tt></a> for every loop in the -program being compiled. </p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="RegionPass">The <tt>RegionPass</tt> class </a> -</h3> - -<div> - -<p> <tt>RegionPass</tt> is similar to <a href="#LoopPass"><tt>LoopPass</tt></a>, -but executes on each single entry single exit region in the function. -<tt>RegionPass</tt> processes regions in nested order such that the outer most -region is processed last. </p> - -<p> <tt>RegionPass</tt> subclasses are allowed to update the region tree by using -the <tt>RGPassManager</tt> interface. You may overload three virtual methods of -<tt>RegionPass</tt> to implement your own region pass. All these -methods should return true if they modified the program, or false if they didn not. -</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="doInitialization_region"> - The <tt>doInitialization(Region *, RGPassManager &)</tt> method - </a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> doInitialization(Region *, RGPassManager &RGM); -</pre></div> - -<p>The <tt>doInitialization</tt> method is designed to do simple initialization -type of stuff that does not depend on the functions being processed. The -<tt>doInitialization</tt> method call is not scheduled to overlap with any -other pass executions (thus it should be very fast). RPPassManager -interface should be used to access Function or Module level analysis -information.</p> - -</div> - - -<!-- _______________________________________________________________________ --> -<h4> - <a name="runOnRegion">The <tt>runOnRegion</tt> method</a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> runOnRegion(Region *, RGPassManager &RGM) = 0; -</pre></div><p> - -<p>The <tt>runOnRegion</tt> method must be implemented by your subclass to do -the transformation or analysis work of your pass. As usual, a true value should -be returned if the region is modified. <tt>RGPassManager</tt> interface -should be used to update region tree.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="doFinalization_region">The <tt>doFinalization()</tt> method</a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> doFinalization(); -</pre></div> - -<p>The <tt>doFinalization</tt> method is an infrequently used method that is -called when the pass framework has finished calling <a -href="#runOnRegion"><tt>runOnRegion</tt></a> for every region in the -program being compiled. </p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="BasicBlockPass">The <tt>BasicBlockPass</tt> class</a> -</h3> - -<div> - -<p><tt>BasicBlockPass</tt>'s are just like <a -href="#FunctionPass"><tt>FunctionPass</tt></a>'s, except that they must limit -their scope of inspection and modification to a single basic block at a time. 
-As such, they are <b>not</b> allowed to do any of the following:</p> - -<ol> -<li>Modify or inspect any basic blocks outside of the current one</li> -<li>Maintain state across invocations of - <a href="#runOnBasicBlock"><tt>runOnBasicBlock</tt></a></li> -<li>Modify the control flow graph (by altering terminator instructions)</li> -<li>Any of the things forbidden for - <a href="#FunctionPass"><tt>FunctionPass</tt></a>es.</li> -</ol> - -<p><tt>BasicBlockPass</tt>es are useful for traditional local and "peephole" -optimizations. They may override the same <a -href="#doInitialization_mod"><tt>doInitialization(Module &)</tt></a> and <a -href="#doFinalization_mod"><tt>doFinalization(Module &)</tt></a> methods that <a -href="#FunctionPass"><tt>FunctionPass</tt></a>'s have, but also have the following virtual methods that may also be implemented:</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="doInitialization_fn"> - The <tt>doInitialization(Function &)</tt> method - </a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> doInitialization(Function &F); -</pre></div> - -<p>The <tt>doIninitialize</tt> method is allowed to do most of the things that -<tt>BasicBlockPass</tt>'s are not allowed to do, but that -<tt>FunctionPass</tt>'s can. The <tt>doInitialization</tt> method is designed -to do simple initialization that does not depend on the -BasicBlocks being processed. The <tt>doInitialization</tt> method call is not -scheduled to overlap with any other pass executions (thus it should be very -fast).</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="runOnBasicBlock">The <tt>runOnBasicBlock</tt> method</a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> runOnBasicBlock(BasicBlock &BB) = 0; -</pre></div> - -<p>Override this function to do the work of the <tt>BasicBlockPass</tt>. This -function is not allowed to inspect or modify basic blocks other than the -parameter, and are not allowed to modify the CFG. A true value must be returned -if the basic block is modified.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="doFinalization_fn"> - The <tt>doFinalization(Function &)</tt> method - </a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> doFinalization(Function &F); -</pre></div> - -<p>The <tt>doFinalization</tt> method is an infrequently used method that is -called when the pass framework has finished calling <a -href="#runOnBasicBlock"><tt>runOnBasicBlock</tt></a> for every BasicBlock in the -program being compiled. This can be used to perform per-function -finalization.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="MachineFunctionPass">The <tt>MachineFunctionPass</tt> class</a> -</h3> - -<div> - -<p>A <tt>MachineFunctionPass</tt> is a part of the LLVM code generator that -executes on the machine-dependent representation of each LLVM function in the -program.</p> - -<p>Code generator passes are registered and initialized specially by -<tt>TargetMachine::addPassesToEmitFile</tt> and similar routines, so they -cannot generally be run from the <tt>opt</tt> or <tt>bugpoint</tt> -commands.</p> - -<p>A <tt>MachineFunctionPass</tt> is also a <tt>FunctionPass</tt>, so all -the restrictions that apply to a <tt>FunctionPass</tt> also apply to it. 
-<tt>MachineFunctionPass</tt>es also have additional restrictions. In particular, -<tt>MachineFunctionPass</tt>es are not allowed to do any of the following:</p> - -<ol> -<li>Modify or create any LLVM IR Instructions, BasicBlocks, Arguments, - Functions, GlobalVariables, GlobalAliases, or Modules.</li> -<li>Modify a MachineFunction other than the one currently being processed.</li> -<li>Maintain state across invocations of <a -href="#runOnMachineFunction"><tt>runOnMachineFunction</tt></a> (including global -data)</li> -</ol> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="runOnMachineFunction"> - The <tt>runOnMachineFunction(MachineFunction &MF)</tt> method - </a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual bool</b> runOnMachineFunction(MachineFunction &MF) = 0; -</pre></div> - -<p><tt>runOnMachineFunction</tt> can be considered the main entry point of a -<tt>MachineFunctionPass</tt>; that is, you should override this method to do the -work of your <tt>MachineFunctionPass</tt>.</p> - -<p>The <tt>runOnMachineFunction</tt> method is called on every -<tt>MachineFunction</tt> in a <tt>Module</tt>, so that the -<tt>MachineFunctionPass</tt> may perform optimizations on the machine-dependent -representation of the function. If you want to get at the LLVM <tt>Function</tt> -for the <tt>MachineFunction</tt> you're working on, use -<tt>MachineFunction</tt>'s <tt>getFunction()</tt> accessor method -- but -remember, you may not modify the LLVM <tt>Function</tt> or its contents from a -<tt>MachineFunctionPass</tt>.</p> - -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="registration">Pass registration</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>In the <a href="#basiccode">Hello World</a> example pass we illustrated how -pass registration works, and discussed some of the reasons that it is used and -what it does. Here we discuss how and why passes are registered.</p> - -<p>As we saw above, passes are registered with the <b><tt>RegisterPass</tt></b> -template. The template parameter is the name of the pass that is to be used on -the command line to specify that the pass should be added to a program (for -example, with <tt>opt</tt> or <tt>bugpoint</tt>). The first argument is the -name of the pass, which is to be used for the <tt>-help</tt> output of -programs, as -well as for debug output generated by the <tt>--debug-pass</tt> option.</p> - -<p>If you want your pass to be easily dumpable, you should -implement the virtual <tt>print</tt> method:</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="print">The <tt>print</tt> method</a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual void</b> print(std::ostream &O, <b>const</b> Module *M) <b>const</b>; -</pre></div> - -<p>The <tt>print</tt> method must be implemented by "analyses" in order to print -a human readable version of the analysis results. This is useful for debugging -an analysis itself, as well as for other people to figure out how an analysis -works. Use the <tt>opt -analyze</tt> argument to invoke this method.</p> - -<p>The <tt>llvm::OStream</tt> parameter specifies the stream to write the results on, -and the <tt>Module</tt> parameter gives a pointer to the top level module of the -program that has been analyzed. 
Note however that this pointer may be null in -certain circumstances (such as calling the <tt>Pass::dump()</tt> from a -debugger), so it should only be used to enhance debug output, it should not be -depended on.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="interaction">Specifying interactions between passes</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>One of the main responsibilities of the <tt>PassManager</tt> is to make sure -that passes interact with each other correctly. Because <tt>PassManager</tt> -tries to <a href="#passmanager">optimize the execution of passes</a> it must -know how the passes interact with each other and what dependencies exist between -the various passes. To track this, each pass can declare the set of passes that -are required to be executed before the current pass, and the passes which are -invalidated by the current pass.</p> - -<p>Typically this functionality is used to require that analysis results are -computed before your pass is run. Running arbitrary transformation passes can -invalidate the computed analysis results, which is what the invalidation set -specifies. If a pass does not implement the <tt><a -href="#getAnalysisUsage">getAnalysisUsage</a></tt> method, it defaults to not -having any prerequisite passes, and invalidating <b>all</b> other passes.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="getAnalysisUsage">The <tt>getAnalysisUsage</tt> method</a> -</h4> - -<div> - -<div class="doc_code"><pre> -<b>virtual void</b> getAnalysisUsage(AnalysisUsage &Info) <b>const</b>; -</pre></div> - -<p>By implementing the <tt>getAnalysisUsage</tt> method, the required and -invalidated sets may be specified for your transformation. The implementation -should fill in the <tt><a -href="http://llvm.org/doxygen/classllvm_1_1AnalysisUsage.html">AnalysisUsage</a></tt> -object with information about which passes are required and not invalidated. To -do this, a pass may call any of the following methods on the AnalysisUsage -object:</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="AU::addRequired"> - The <tt>AnalysisUsage::addRequired<></tt> - and <tt>AnalysisUsage::addRequiredTransitive<></tt> methods - </a> -</h4> - -<div> -<p> -If your pass requires a previous pass to be executed (an analysis for example), -it can use one of these methods to arrange for it to be run before your pass. -LLVM has many different types of analyses and passes that can be required, -spanning the range from <tt>DominatorSet</tt> to <tt>BreakCriticalEdges</tt>. -Requiring <tt>BreakCriticalEdges</tt>, for example, guarantees that there will -be no critical edges in the CFG when your pass has been run. -</p> - -<p> -Some analyses chain to other analyses to do their job. For example, an <a -href="AliasAnalysis.html">AliasAnalysis</a> implementation is required to <a -href="AliasAnalysis.html#chaining">chain</a> to other alias analysis passes. In -cases where analyses chain, the <tt>addRequiredTransitive</tt> method should be -used instead of the <tt>addRequired</tt> method. This informs the PassManager -that the transitively required pass should be alive as long as the requiring -pass is. 
-</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="AU::addPreserved"> - The <tt>AnalysisUsage::addPreserved<></tt> method - </a> -</h4> - -<div> -<p> -One of the jobs of the PassManager is to optimize how and when analyses are run. -In particular, it attempts to avoid recomputing data unless it needs to. For -this reason, passes are allowed to declare that they preserve (i.e., they don't -invalidate) an existing analysis if it's available. For example, a simple -constant folding pass would not modify the CFG, so it can't possibly affect the -results of dominator analysis. By default, all passes are assumed to invalidate -all others. -</p> - -<p> -The <tt>AnalysisUsage</tt> class provides several methods which are useful in -certain circumstances that are related to <tt>addPreserved</tt>. In particular, -the <tt>setPreservesAll</tt> method can be called to indicate that the pass does -not modify the LLVM program at all (which is true for analyses), and the -<tt>setPreservesCFG</tt> method can be used by transformations that change -instructions in the program but do not modify the CFG or terminator instructions -(note that this property is implicitly set for <a -href="#BasicBlockPass">BasicBlockPass</a>'s). -</p> - -<p> -<tt>addPreserved</tt> is particularly useful for transformations like -<tt>BreakCriticalEdges</tt>. This pass knows how to update a small set of loop -and dominator related analyses if they exist, so it can preserve them, despite -the fact that it hacks on the CFG. -</p> -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="AU::examples"> - Example implementations of <tt>getAnalysisUsage</tt> - </a> -</h4> - -<div> - -<div class="doc_code"><pre> -<i>// This example modifies the program, but does not modify the CFG</i> -<b>void</b> <a href="http://llvm.org/doxygen/structLICM.html">LICM</a>::getAnalysisUsage(AnalysisUsage &AU) <b>const</b> { - AU.setPreservesCFG(); - AU.addRequired<<a href="http://llvm.org/doxygen/classllvm_1_1LoopInfo.html">LoopInfo</a>>(); -} -</pre></div> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="getAnalysis"> - The <tt>getAnalysis<></tt> and - <tt>getAnalysisIfAvailable<></tt> methods - </a> -</h4> - -<div> - -<p>The <tt>Pass::getAnalysis<></tt> method is automatically inherited by -your class, providing you with access to the passes that you declared that you -required with the <a href="#getAnalysisUsage"><tt>getAnalysisUsage</tt></a> -method. It takes a single template argument that specifies which pass class you -want, and returns a reference to that pass. For example:</p> - -<div class="doc_code"><pre> -bool LICM::runOnFunction(Function &F) { - LoopInfo &LI = getAnalysis<LoopInfo>(); - ... -} -</pre></div> - -<p>This method call returns a reference to the pass desired. You may get a -runtime assertion failure if you attempt to get an analysis that you did not -declare as required in your <a -href="#getAnalysisUsage"><tt>getAnalysisUsage</tt></a> implementation. This -method can be called by your <tt>run*</tt> method implementation, or by any -other local method invoked by your <tt>run*</tt> method. - -A module level pass can use function level analysis info using this interface. -For example:</p> - -<div class="doc_code"><pre> -bool ModuleLevelPass::runOnModule(Module &M) { - ... - DominatorTree &DT = getAnalysis<DominatorTree>(Func); - ... 
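-  <i>// Func above is assumed to be some Function defined in M; the pass</i> -  <i>// manager runs the DominatorTree pass on it before returning the reference.</i>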
-} -</pre></div> - -<p>In above example, runOnFunction for DominatorTree is called by pass manager -before returning a reference to the desired pass.</p> - -<p> -If your pass is capable of updating analyses if they exist (e.g., -<tt>BreakCriticalEdges</tt>, as described above), you can use the -<tt>getAnalysisIfAvailable</tt> method, which returns a pointer to the analysis -if it is active. For example:</p> - -<div class="doc_code"><pre> -... -if (DominatorSet *DS = getAnalysisIfAvailable<DominatorSet>()) { - <i>// A DominatorSet is active. This code will update it.</i> -} -... -</pre></div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="analysisgroup">Implementing Analysis Groups</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Now that we understand the basics of how passes are defined, how they are -used, and how they are required from other passes, it's time to get a little bit -fancier. All of the pass relationships that we have seen so far are very -simple: one pass depends on one other specific pass to be run before it can run. -For many applications, this is great, for others, more flexibility is -required.</p> - -<p>In particular, some analyses are defined such that there is a single simple -interface to the analysis results, but multiple ways of calculating them. -Consider alias analysis for example. The most trivial alias analysis returns -"may alias" for any alias query. The most sophisticated analysis a -flow-sensitive, context-sensitive interprocedural analysis that can take a -significant amount of time to execute (and obviously, there is a lot of room -between these two extremes for other implementations). To cleanly support -situations like this, the LLVM Pass Infrastructure supports the notion of -Analysis Groups.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="agconcepts">Analysis Group Concepts</a> -</h4> - -<div> - -<p>An Analysis Group is a single simple interface that may be implemented by -multiple different passes. Analysis Groups can be given human readable names -just like passes, but unlike passes, they need not derive from the <tt>Pass</tt> -class. An analysis group may have one or more implementations, one of which is -the "default" implementation.</p> - -<p>Analysis groups are used by client passes just like other passes are: the -<tt>AnalysisUsage::addRequired()</tt> and <tt>Pass::getAnalysis()</tt> methods. -In order to resolve this requirement, the <a href="#passmanager">PassManager</a> -scans the available passes to see if any implementations of the analysis group -are available. If none is available, the default implementation is created for -the pass to use. All standard rules for <A href="#interaction">interaction -between passes</a> still apply.</p> - -<p>Although <a href="#registration">Pass Registration</a> is optional for normal -passes, all analysis group implementations must be registered, and must use the -<A href="#registerag"><tt>INITIALIZE_AG_PASS</tt></a> template to join the -implementation pool. Also, a default implementation of the interface -<b>must</b> be registered with <A -href="#registerag"><tt>RegisterAnalysisGroup</tt></a>.</p> - -<p>As a concrete example of an Analysis Group in action, consider the <a -href="http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html">AliasAnalysis</a> -analysis group. 
The default implementation of the alias analysis interface (the -<tt><a -href="http://llvm.org/doxygen/structBasicAliasAnalysis.html">basicaa</a></tt> -pass) just does a few simple checks that don't require significant analysis to -compute (such as: two different globals can never alias each other, etc). -Passes that use the <tt><a -href="http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html">AliasAnalysis</a></tt> -interface (for example the <tt><a -href="http://llvm.org/doxygen/structGCSE.html">gcse</a></tt> pass), do -not care which implementation of alias analysis is actually provided, they just -use the designated interface.</p> - -<p>From the user's perspective, commands work just like normal. Issuing the -command '<tt>opt -gcse ...</tt>' will cause the <tt>basicaa</tt> class to be -instantiated and added to the pass sequence. Issuing the command '<tt>opt --somefancyaa -gcse ...</tt>' will cause the <tt>gcse</tt> pass to use the -<tt>somefancyaa</tt> alias analysis (which doesn't actually exist, it's just a -hypothetical example) instead.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="registerag">Using <tt>RegisterAnalysisGroup</tt></a> -</h4> - -<div> - -<p>The <tt>RegisterAnalysisGroup</tt> template is used to register the analysis -group itself, while the <tt>INITIALIZE_AG_PASS</tt> is used to add pass -implementations to the analysis group. First, -an analysis group should be registered, with a human readable name -provided for it. -Unlike registration of passes, there is no command line argument to be specified -for the Analysis Group Interface itself, because it is "abstract":</p> - -<div class="doc_code"><pre> -<b>static</b> RegisterAnalysisGroup<<a href="http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html">AliasAnalysis</a>> A("<i>Alias Analysis</i>"); -</pre></div> - -<p>Once the analysis is registered, passes can declare that they are valid -implementations of the interface by using the following code:</p> - -<div class="doc_code"><pre> -<b>namespace</b> { - //<i> Declare that we implement the AliasAnalysis interface</i> - INITIALIZE_AG_PASS(FancyAA, <a href="http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html">AliasAnalysis</a>, "<i>somefancyaa</i>", - "<i>A more complex alias analysis implementation</i>", - false, // <i>Is CFG Only?</i> - true, // <i>Is Analysis?</i> - false); // <i>Is default Analysis Group implementation?</i> -} -</pre></div> - -<p>This just shows a class <tt>FancyAA</tt> that -uses the <tt>INITIALIZE_AG_PASS</tt> macro both to register and -to "join" the <tt><a href="http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html">AliasAnalysis</a></tt> -analysis group. Every implementation of an analysis group should join using -this macro.</p> - -<div class="doc_code"><pre> -<b>namespace</b> { - //<i> Declare that we implement the AliasAnalysis interface</i> - INITIALIZE_AG_PASS(BasicAA, <a href="http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html">AliasAnalysis</a>, "<i>basicaa</i>", - "<i>Basic Alias Analysis (default AA impl)</i>", - false, // <i>Is CFG Only?</i> - true, // <i>Is Analysis?</i> - true); // <i>Is default Analysis Group implementation?</i> -} -</pre></div> - -<p>Here we show how the default implementation is specified (using the final -argument to the <tt>INITIALIZE_AG_PASS</tt> template). There must be exactly -one default implementation available at all times for an Analysis Group to be -used. 
Only default implementation can derive from <tt>ImmutablePass</tt>. -Here we declare that the - <tt><a href="http://llvm.org/doxygen/structBasicAliasAnalysis.html">BasicAliasAnalysis</a></tt> -pass is the default implementation for the interface.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="passStatistics">Pass Statistics</a> -</h2> -<!-- *********************************************************************** --> - -<div> -<p>The <a -href="http://llvm.org/doxygen/Statistic_8h-source.html"><tt>Statistic</tt></a> -class is designed to be an easy way to expose various success -metrics from passes. These statistics are printed at the end of a -run, when the -stats command line option is enabled on the command -line. See the <a href="http://llvm.org/docs/ProgrammersManual.html#Statistic">Statistics section</a> in the Programmer's Manual for details. - -</div> - - -<!-- *********************************************************************** --> -<h2> - <a name="passmanager">What PassManager does</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>The <a -href="http://llvm.org/doxygen/PassManager_8h-source.html"><tt>PassManager</tt></a> -<a -href="http://llvm.org/doxygen/classllvm_1_1PassManager.html">class</a> -takes a list of passes, ensures their <a href="#interaction">prerequisites</a> -are set up correctly, and then schedules passes to run efficiently. All of the -LLVM tools that run passes use the <tt>PassManager</tt> for execution of these -passes.</p> - -<p>The <tt>PassManager</tt> does two main things to try to reduce the execution -time of a series of passes:</p> - -<ol> -<li><b>Share analysis results</b> - The PassManager attempts to avoid -recomputing analysis results as much as possible. This means keeping track of -which analyses are available already, which analyses get invalidated, and which -analyses are needed to be run for a pass. An important part of work is that the -<tt>PassManager</tt> tracks the exact lifetime of all analysis results, allowing -it to <a href="#releaseMemory">free memory</a> allocated to holding analysis -results as soon as they are no longer needed.</li> - -<li><b>Pipeline the execution of passes on the program</b> - The -<tt>PassManager</tt> attempts to get better cache and memory usage behavior out -of a series of passes by pipelining the passes together. This means that, given -a series of consecutive <a href="#FunctionPass"><tt>FunctionPass</tt></a>'s, it -will execute all of the <a href="#FunctionPass"><tt>FunctionPass</tt></a>'s on -the first function, then all of the <a -href="#FunctionPass"><tt>FunctionPass</tt></a>es on the second function, -etc... until the entire program has been run through the passes. - -<p>This improves the cache behavior of the compiler, because it is only touching -the LLVM program representation for a single function at a time, instead of -traversing the entire program. It reduces the memory consumption of compiler, -because, for example, only one <a -href="http://llvm.org/doxygen/classllvm_1_1DominatorSet.html"><tt>DominatorSet</tt></a> -needs to be calculated at a time. This also makes it possible to implement -some <a -href="#SMP">interesting enhancements</a> in the future.</p></li> - -</ol> - -<p>The effectiveness of the <tt>PassManager</tt> is influenced directly by how -much information it has about the behaviors of the passes it is scheduling. 
For -example, the "preserved" set is intentionally conservative in the face of an -unimplemented <a href="#getAnalysisUsage"><tt>getAnalysisUsage</tt></a> method. -Not implementing when it should be implemented will have the effect of not -allowing any analysis results to live across the execution of your pass.</p> - -<p>The <tt>PassManager</tt> class exposes a <tt>--debug-pass</tt> command line -options that is useful for debugging pass execution, seeing how things work, and -diagnosing when you should be preserving more analyses than you currently are -(To get information about all of the variants of the <tt>--debug-pass</tt> -option, just type '<tt>opt -help-hidden</tt>').</p> - -<p>By using the <tt>--debug-pass=Structure</tt> option, for example, we can see -how our <a href="#basiccode">Hello World</a> pass interacts with other passes. -Lets try it out with the <tt>gcse</tt> and <tt>licm</tt> passes:</p> - -<div class="doc_code"><pre> -$ opt -load ../../../Debug+Asserts/lib/Hello.so -gcse -licm --debug-pass=Structure < hello.bc > /dev/null -Module Pass Manager - Function Pass Manager - Dominator Set Construction - Immediate Dominators Construction - Global Common Subexpression Elimination --- Immediate Dominators Construction --- Global Common Subexpression Elimination - Natural Loop Construction - Loop Invariant Code Motion --- Natural Loop Construction --- Loop Invariant Code Motion - Module Verifier --- Dominator Set Construction --- Module Verifier - Bitcode Writer ---Bitcode Writer -</pre></div> - -<p>This output shows us when passes are constructed and when the analysis -results are known to be dead (prefixed with '<tt>--</tt>'). Here we see that -GCSE uses dominator and immediate dominator information to do its job. The LICM -pass uses natural loop information, which uses dominator sets, but not immediate -dominators. Because immediate dominators are no longer useful after the GCSE -pass, it is immediately destroyed. The dominator sets are then reused to -compute natural loop information, which is then used by the LICM pass.</p> - -<p>After the LICM pass, the module verifier runs (which is automatically added -by the '<tt>opt</tt>' tool), which uses the dominator set to check that the -resultant LLVM code is well formed. After it finishes, the dominator set -information is destroyed, after being computed once, and shared by three -passes.</p> - -<p>Lets see how this changes when we run the <a href="#basiccode">Hello -World</a> pass in between the two passes:</p> - -<div class="doc_code"><pre> -$ opt -load ../../../Debug+Asserts/lib/Hello.so -gcse -hello -licm --debug-pass=Structure < hello.bc > /dev/null -Module Pass Manager - Function Pass Manager - Dominator Set Construction - Immediate Dominators Construction - Global Common Subexpression Elimination -<b>-- Dominator Set Construction</b> --- Immediate Dominators Construction --- Global Common Subexpression Elimination -<b> Hello World Pass --- Hello World Pass - Dominator Set Construction</b> - Natural Loop Construction - Loop Invariant Code Motion --- Natural Loop Construction --- Loop Invariant Code Motion - Module Verifier --- Dominator Set Construction --- Module Verifier - Bitcode Writer ---Bitcode Writer -Hello: __main -Hello: puts -Hello: main -</pre></div> - -<p>Here we see that the <a href="#basiccode">Hello World</a> pass has killed the -Dominator Set pass, even though it doesn't modify the code at all! 
To fix this, -we need to add the following <a -href="#getAnalysisUsage"><tt>getAnalysisUsage</tt></a> method to our pass:</p> - -<div class="doc_code"><pre> -<i>// We don't modify the program, so we preserve all analyses</i> -<b>virtual void</b> getAnalysisUsage(AnalysisUsage &AU) <b>const</b> { - AU.setPreservesAll(); -} -</pre></div> - -<p>Now when we run our pass, we get this output:</p> - -<div class="doc_code"><pre> -$ opt -load ../../../Debug+Asserts/lib/Hello.so -gcse -hello -licm --debug-pass=Structure < hello.bc > /dev/null -Pass Arguments: -gcse -hello -licm -Module Pass Manager - Function Pass Manager - Dominator Set Construction - Immediate Dominators Construction - Global Common Subexpression Elimination --- Immediate Dominators Construction --- Global Common Subexpression Elimination - Hello World Pass --- Hello World Pass - Natural Loop Construction - Loop Invariant Code Motion --- Loop Invariant Code Motion --- Natural Loop Construction - Module Verifier --- Dominator Set Construction --- Module Verifier - Bitcode Writer ---Bitcode Writer -Hello: __main -Hello: puts -Hello: main -</pre></div> - -<p>Which shows that we don't accidentally invalidate dominator information -anymore, and therefore do not have to compute it twice.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="releaseMemory">The <tt>releaseMemory</tt> method</a> -</h4> - -<div> - -<div class="doc_code"><pre> - <b>virtual void</b> releaseMemory(); -</pre></div> - -<p>The <tt>PassManager</tt> automatically determines when to compute analysis -results, and how long to keep them around for. Because the lifetime of the pass -object itself is effectively the entire duration of the compilation process, we -need some way to free analysis results when they are no longer useful. The -<tt>releaseMemory</tt> virtual method is the way to do this.</p> - -<p>If you are writing an analysis or any other pass that retains a significant -amount of state (for use by another pass which "requires" your pass and uses the -<a href="#getAnalysis">getAnalysis</a> method) you should implement -<tt>releaseMemory</tt> to, well, release the memory allocated to maintain this -internal state. This method is called after the <tt>run*</tt> method for the -class, before the next call of <tt>run*</tt> in your pass.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="registering">Registering dynamically loaded passes</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p><i>Size matters</i> when constructing production quality tools using llvm, -both for the purposes of distribution, and for regulating the resident code size -when running on the target system. Therefore, it becomes desirable to -selectively use some passes, while omitting others and maintain the flexibility -to change configurations later on. You want to be able to do all this, and, -provide feedback to the user. This is where pass registration comes into -play.</p> - -<p>The fundamental mechanisms for pass registration are the -<tt>MachinePassRegistry</tt> class and subclasses of -<tt>MachinePassRegistryNode</tt>.</p> - -<p>An instance of <tt>MachinePassRegistry</tt> is used to maintain a list of -<tt>MachinePassRegistryNode</tt> objects. 
This instance maintains the list and -communicates additions and deletions to the command line interface.</p> - -<p>An instance of <tt>MachinePassRegistryNode</tt> subclass is used to maintain -information provided about a particular pass. This information includes the -command line name, the command help string and the address of the function used -to create an instance of the pass. A global static constructor of one of these -instances <i>registers</i> with a corresponding <tt>MachinePassRegistry</tt>, -the static destructor <i>unregisters</i>. Thus a pass that is statically linked -in the tool will be registered at start up. A dynamically loaded pass will -register on load and unregister at unload.</p> - -<!-- _______________________________________________________________________ --> -<h3> - <a name="registering_existing">Using existing registries</a> -</h3> - -<div> - -<p>There are predefined registries to track instruction scheduling -(<tt>RegisterScheduler</tt>) and register allocation (<tt>RegisterRegAlloc</tt>) -machine passes. Here we will describe how to <i>register</i> a register -allocator machine pass.</p> - -<p>Implement your register allocator machine pass. In your register allocator -<tt>.cpp</tt> file add the following include;</p> - -<div class="doc_code"><pre> -#include "llvm/CodeGen/RegAllocRegistry.h" -</pre></div> - -<p>Also in your register allocator .cpp file, define a creator function in the -form; </p> - -<div class="doc_code"><pre> -FunctionPass *createMyRegisterAllocator() { - return new MyRegisterAllocator(); -} -</pre></div> - -<p>Note that the signature of this function should match the type of -<tt>RegisterRegAlloc::FunctionPassCtor</tt>. In the same file add the -"installing" declaration, in the form;</p> - -<div class="doc_code"><pre> -static RegisterRegAlloc myRegAlloc("myregalloc", - "my register allocator help string", - createMyRegisterAllocator); -</pre></div> - -<p>Note the two spaces prior to the help string produces a tidy result on the --help query.</p> - -<div class="doc_code"><pre> -$ llc -help - ... - -regalloc - Register allocator to use (default=linearscan) - =linearscan - linear scan register allocator - =local - local register allocator - =simple - simple register allocator - =myregalloc - my register allocator help string - ... -</pre></div> - -<p>And that's it. The user is now free to use <tt>-regalloc=myregalloc</tt> as -an option. Registering instruction schedulers is similar except use the -<tt>RegisterScheduler</tt> class. Note that the -<tt>RegisterScheduler::FunctionPassCtor</tt> is significantly different from -<tt>RegisterRegAlloc::FunctionPassCtor</tt>.</p> - -<p>To force the load/linking of your register allocator into the llc/lli tools, -add your creator function's global declaration to "Passes.h" and add a "pseudo" -call line to <tt>llvm/Codegen/LinkAllCodegenComponents.h</tt>.</p> - -</div> - - -<!-- _______________________________________________________________________ --> -<h3> - <a name="registering_new">Creating new registries</a> -</h3> - -<div> - -<p>The easiest way to get started is to clone one of the existing registries; we -recommend <tt>llvm/CodeGen/RegAllocRegistry.h</tt>. The key things to modify -are the class name and the <tt>FunctionPassCtor</tt> type.</p> - -<p>Then you need to declare the registry. 
Example: if your pass registry is -<tt>RegisterMyPasses</tt> then define;</p> - -<div class="doc_code"><pre> -MachinePassRegistry RegisterMyPasses::Registry; -</pre></div> - -<p>And finally, declare the command line option for your passes. Example:</p> - -<div class="doc_code"><pre> -cl::opt<RegisterMyPasses::FunctionPassCtor, false, - RegisterPassParser<RegisterMyPasses> > -MyPassOpt("mypass", - cl::init(&createDefaultMyPass), - cl::desc("my pass option help")); -</pre></div> - -<p>Here the command option is "mypass", with createDefaultMyPass as the default -creator.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="debughints">Using GDB with dynamically loaded passes</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Unfortunately, using GDB with dynamically loaded passes is not as easy as it -should be. First of all, you can't set a breakpoint in a shared object that has -not been loaded yet, and second of all there are problems with inlined functions -in shared objects. Here are some suggestions to debugging your pass with -GDB.</p> - -<p>For sake of discussion, I'm going to assume that you are debugging a -transformation invoked by <tt>opt</tt>, although nothing described here depends -on that.</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="breakpoint">Setting a breakpoint in your pass</a> -</h4> - -<div> - -<p>First thing you do is start <tt>gdb</tt> on the <tt>opt</tt> process:</p> - -<div class="doc_code"><pre> -$ <b>gdb opt</b> -GNU gdb 5.0 -Copyright 2000 Free Software Foundation, Inc. -GDB is free software, covered by the GNU General Public License, and you are -welcome to change it and/or distribute copies of it under certain conditions. -Type "show copying" to see the conditions. -There is absolutely no warranty for GDB. Type "show warranty" for details. -This GDB was configured as "sparc-sun-solaris2.6"... -(gdb) -</pre></div> - -<p>Note that <tt>opt</tt> has a lot of debugging information in it, so it takes -time to load. Be patient. Since we cannot set a breakpoint in our pass yet -(the shared object isn't loaded until runtime), we must execute the process, and -have it stop before it invokes our pass, but after it has loaded the shared -object. The most foolproof way of doing this is to set a breakpoint in -<tt>PassManager::run</tt> and then run the process with the arguments you -want:</p> - -<div class="doc_code"><pre> -(gdb) <b>break llvm::PassManager::run</b> -Breakpoint 1 at 0x2413bc: file Pass.cpp, line 70. 
-(gdb) <b>run test.bc -load $(LLVMTOP)/llvm/Debug+Asserts/lib/[libname].so -[passoption]</b> -Starting program: opt test.bc -load $(LLVMTOP)/llvm/Debug+Asserts/lib/[libname].so -[passoption] -Breakpoint 1, PassManager::run (this=0xffbef174, M=@0x70b298) at Pass.cpp:70 -70 bool PassManager::run(Module &M) { return PM->run(M); } -(gdb) -</pre></div> - -<p>Once the <tt>opt</tt> stops in the <tt>PassManager::run</tt> method you are -now free to set breakpoints in your pass so that you can trace through execution -or do other standard debugging stuff.</p> - -</div> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="debugmisc">Miscellaneous Problems</a> -</h4> - -<div> - -<p>Once you have the basics down, there are a couple of problems that GDB has, -some with solutions, some without.</p> - -<ul> -<li>Inline functions have bogus stack information. In general, GDB does a -pretty good job getting stack traces and stepping through inline functions. -When a pass is dynamically loaded however, it somehow completely loses this -capability. The only solution I know of is to de-inline a function (move it -from the body of a class to a .cpp file).</li> - -<li>Restarting the program breaks breakpoints. After following the information -above, you have succeeded in getting some breakpoints planted in your pass. Nex -thing you know, you restart the program (i.e., you type '<tt>run</tt>' again), -and you start getting errors about breakpoints being unsettable. The only way I -have found to "fix" this problem is to <tt>delete</tt> the breakpoints that are -already set in your pass, run the program, and re-set the breakpoints once -execution stops in <tt>PassManager::run</tt>.</li> - -</ul> - -<p>Hopefully these tips will help with common case debugging situations. If -you'd like to contribute some tips of your own, just contact <a -href="mailto:sabre@nondot.org">Chris</a>.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2> - <a name="future">Future extensions planned</a> -</h2> -<!-- *********************************************************************** --> - -<div> - -<p>Although the LLVM Pass Infrastructure is very capable as it stands, and does -some nifty stuff, there are things we'd like to add in the future. Here is -where we are going:</p> - -<!-- _______________________________________________________________________ --> -<h4> - <a name="SMP">Multithreaded LLVM</a> -</h4> - -<div> - -<p>Multiple CPU machines are becoming more common and compilation can never be -fast enough: obviously we should allow for a multithreaded compiler. Because of -the semantics defined for passes above (specifically they cannot maintain state -across invocations of their <tt>run*</tt> methods), a nice clean way to -implement a multithreaded compiler would be for the <tt>PassManager</tt> class -to create multiple instances of each pass object, and allow the separate -instances to be hacking on different parts of the program at the same time.</p> - -<p>This implementation would prevent each of the passes from having to implement -multithreaded constructs, requiring only the LLVM core to have locking in a few -places (for global resources). Although this is a simple extension, we simply -haven't had time (or multiprocessor machines, thus a reason) to implement this. 
-Despite that, we have kept the LLVM passes SMP ready, and you should too.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> - <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date$ -</address> - -</body> -</html> diff --git a/docs/WritingAnLLVMPass.rst b/docs/WritingAnLLVMPass.rst new file mode 100644 index 0000000000..b10d98f87e --- /dev/null +++ b/docs/WritingAnLLVMPass.rst @@ -0,0 +1,1436 @@ +==================== +Writing an LLVM Pass +==================== + +.. contents:: + :local: + +Introduction --- What is a pass? +================================ + +The LLVM Pass Framework is an important part of the LLVM system, because LLVM +passes are where most of the interesting parts of the compiler exist. Passes +perform the transformations and optimizations that make up the compiler, they +build the analysis results that are used by these transformations, and they +are, above all, a structuring technique for compiler code. + +All LLVM passes are subclasses of the `Pass +<http://llvm.org/doxygen/classllvm_1_1Pass.html>`_ class, which implement +functionality by overriding virtual methods inherited from ``Pass``. Depending +on how your pass works, you should inherit from the :ref:`ModulePass +<writing-an-llvm-pass-ModulePass>` , :ref:`CallGraphSCCPass +<writing-an-llvm-pass-CallGraphSCCPass>`, :ref:`FunctionPass +<writing-an-llvm-pass-FunctionPass>` , or :ref:`LoopPass +<writing-an-llvm-pass-LoopPass>`, or :ref:`RegionPass +<writing-an-llvm-pass-RegionPass>`, or :ref:`BasicBlockPass +<writing-an-llvm-pass-BasicBlockPass>` classes, which gives the system more +information about what your pass does, and how it can be combined with other +passes. One of the main features of the LLVM Pass Framework is that it +schedules passes to run in an efficient way based on the constraints that your +pass meets (which are indicated by which class they derive from). + +We start by showing you how to construct a pass, everything from setting up the +code, to compiling, loading, and executing it. After the basics are down, more +advanced features are discussed. + +Quick Start --- Writing hello world +=================================== + +Here we describe how to write the "hello world" of passes. The "Hello" pass is +designed to simply print out the name of non-external functions that exist in +the program being compiled. It does not modify the program at all, it just +inspects it. The source code and files for this pass are available in the LLVM +source tree in the ``lib/Transforms/Hello`` directory. + +.. _writing-an-llvm-pass-makefile: + +Setting up the build environment +-------------------------------- + +.. FIXME: Why does this recommend to build in-tree? + +First, configure and build LLVM. This needs to be done directly inside the +LLVM source tree rather than in a separate objects directory. Next, you need +to create a new directory somewhere in the LLVM source base. For this example, +we'll assume that you made ``lib/Transforms/Hello``. Finally, you must set up +a build script (``Makefile``) that will compile the source code for the new +pass. 
To do this, copy the following into ``Makefile``: + +.. code-block:: make + + # Makefile for hello pass + + # Path to top level of LLVM hierarchy + LEVEL = ../../.. + + # Name of the library to build + LIBRARYNAME = Hello + + # Make the shared library become a loadable module so the tools can + # dlopen/dlsym on the resulting library. + LOADABLE_MODULE = 1 + + # Include the makefile implementation stuff + include $(LEVEL)/Makefile.common + +This makefile specifies that all of the ``.cpp`` files in the current directory +are to be compiled and linked together into a shared object +``$(LEVEL)/Debug+Asserts/lib/Hello.so`` that can be dynamically loaded by the +:program:`opt` or :program:`bugpoint` tools via their :option:`-load` options. +If your operating system uses a suffix other than ``.so`` (such as Windows or Mac +OS X), the appropriate extension will be used. + +If you are using CMake to build LLVM, see :ref:`cmake-out-of-source-pass`. + +Now that we have the build scripts set up, we just need to write the code for +the pass itself. + +.. _writing-an-llvm-pass-basiccode: + +Basic code required +------------------- + +Now that we have a way to compile our new pass, we just have to write it. +Start out with: + +.. code-block:: c++ + + #include "llvm/Pass.h" + #include "llvm/Function.h" + #include "llvm/Support/raw_ostream.h" + +Which are needed because we are writing a `Pass +<http://llvm.org/doxygen/classllvm_1_1Pass.html>`_, we are operating on +`Function <http://llvm.org/doxygen/classllvm_1_1Function.html>`_\ s, and we will +be doing some printing. + +Next we have: + +.. code-block:: c++ + + using namespace llvm; + +... which is required because the functions from the include files live in the +llvm namespace. + +Next we have: + +.. code-block:: c++ + + namespace { + +... which starts out an anonymous namespace. Anonymous namespaces are to C++ +what the "``static``" keyword is to C (at global scope). It makes the things +declared inside of the anonymous namespace visible only to the current file. +If you're not familiar with them, consult a decent C++ book for more +information. + +Next, we declare our pass itself: + +.. code-block:: c++ + + struct Hello : public FunctionPass { + +This declares a "``Hello``" class that is a subclass of :ref:`FunctionPass +<writing-an-llvm-pass-FunctionPass>`. The different builtin pass subclasses +are described in detail :ref:`later <writing-an-llvm-pass-pass-classes>`, but +for now, know that ``FunctionPass`` operates on a function at a time. + +.. code-block:: c++ + + static char ID; + Hello() : FunctionPass(ID) {} + +This declares the pass identifier used by LLVM to identify the pass. This allows LLVM +to avoid using expensive C++ runtime type information. + +.. code-block:: c++ + + virtual bool runOnFunction(Function &F) { + errs() << "Hello: "; + errs().write_escaped(F.getName()) << "\n"; + return false; + } + }; // end of struct Hello + } // end of anonymous namespace + +We declare a :ref:`runOnFunction <writing-an-llvm-pass-runOnFunction>` method, +which overrides an abstract virtual method inherited from :ref:`FunctionPass +<writing-an-llvm-pass-FunctionPass>`. This is where we are supposed to do our +thing, so we just print out our message with the name of each function. + +.. code-block:: c++ + + char Hello::ID = 0; + +We initialize the pass ID here. LLVM uses the ID's address to identify a pass, so +the initialization value is not important. + +.. 
code-block:: c++ + + static RegisterPass<Hello> X("hello", "Hello World Pass", + false /* Only looks at CFG */, + false /* Analysis Pass */); + +Lastly, we :ref:`register our class <writing-an-llvm-pass-registration>` +``Hello``, giving it a command line argument "``hello``", and a name "Hello +World Pass". The last two arguments describe its behavior: if a pass walks the CFG +without modifying it then the third argument is set to ``true``; if a pass is +an analysis pass, for example a dominator tree pass, then ``true`` is supplied as +the fourth argument. + +As a whole, the ``.cpp`` file looks like: + +.. code-block:: c++ + + #include "llvm/Pass.h" + #include "llvm/Function.h" + #include "llvm/Support/raw_ostream.h" + + using namespace llvm; + + namespace { + struct Hello : public FunctionPass { + static char ID; + Hello() : FunctionPass(ID) {} + + virtual bool runOnFunction(Function &F) { + errs() << "Hello: "; + errs().write_escaped(F.getName()) << '\n'; + return false; + } + }; + } + + char Hello::ID = 0; + static RegisterPass<Hello> X("hello", "Hello World Pass", false, false); + +Now that it's all together, compile the file with a simple "``gmake``" command +in the local directory and you should get a new file +"``Debug+Asserts/lib/Hello.so``" under the top level directory of the LLVM +source tree (not in the local directory). Note that everything in this file is +contained in an anonymous namespace --- this reflects the fact that passes +are self contained units that do not need external interfaces (although they +can have them) to be useful. + +Running a pass with ``opt`` +--------------------------- + +Now that you have a brand new shiny shared object file, we can use the +:program:`opt` command to run an LLVM program through your pass. Because you +registered your pass with ``RegisterPass``, you will be able to use the +:program:`opt` tool to access it, once loaded. + +To test it, follow the example at the end of the :doc:`GettingStarted` guide to +compile "Hello World" to LLVM. We can now run the bitcode file (hello.bc) for +the program through our transformation like this (of course, any bitcode file +will work): + +.. code-block:: console + + $ opt -load ../../../Debug+Asserts/lib/Hello.so -hello < hello.bc > /dev/null + Hello: __main + Hello: puts + Hello: main + +The :option:`-load` option specifies that :program:`opt` should load your pass +as a shared object, which makes "``-hello``" a valid command line argument +(which is one reason you need to :ref:`register your pass +<writing-an-llvm-pass-registration>`). Because the Hello pass does not modify +the program in any interesting way, we just throw away the result of +:program:`opt` (sending it to ``/dev/null``). + +To see what happened to the other string you registered, try running +:program:`opt` with the :option:`-help` option: + +.. code-block:: console + + $ opt -load ../../../Debug+Asserts/lib/Hello.so -help + OVERVIEW: llvm .bc -> .bc modular optimizer + + USAGE: opt [options] <input bitcode> + + OPTIONS: + Optimizations available: + ... + -globalopt - Global Variable Optimizer + -globalsmodref-aa - Simple mod/ref analysis for globals + -gvn - Global Value Numbering + -hello - Hello World Pass + -indvars - Induction Variable Simplification + -inline - Function Integration/Inlining + -insert-edge-profiling - Insert instrumentation for edge profiling + ... + +The pass name gets added as the information string for your pass, giving some +documentation to users of :program:`opt`. 
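+As a quick variation (a sketch only, not part of the original Hello pass),
+``runOnFunction`` is free to report anything else it can compute from the
+``Function`` it is given; here ``Function::arg_size()`` is used to print the
+number of formal arguments as well:
+
+.. code-block:: c++
+
+  virtual bool runOnFunction(Function &F) {
+    errs() << "Hello: ";
+    errs().write_escaped(F.getName()) << " (" << F.arg_size()
+                                      << " arguments)\n";
+    return false;  // Still only inspecting the IR, so nothing is modified.
+  }
+
+Because this sketch still returns ``false``, it remains a pure inspection pass
+and is loaded and run with :program:`opt` exactly as shown above.
+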
Now that you have a working pass, +you would go ahead and make it do the cool transformations you want. Once you +get it all working and tested, it may become useful to find out how fast your +pass is. The :ref:`PassManager <writing-an-llvm-pass-passmanager>` provides a +nice command line option (:option:`--time-passes`) that allows you to get +information about the execution time of your pass along with the other passes +you queue up. For example: + +.. code-block:: console + + $ opt -load ../../../Debug+Asserts/lib/Hello.so -hello -time-passes < hello.bc > /dev/null + Hello: __main + Hello: puts + Hello: main + =============================================================================== + ... Pass execution timing report ... + =============================================================================== + Total Execution Time: 0.02 seconds (0.0479059 wall clock) + + ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Pass Name --- + 0.0100 (100.0%) 0.0000 ( 0.0%) 0.0100 ( 50.0%) 0.0402 ( 84.0%) Bitcode Writer + 0.0000 ( 0.0%) 0.0100 (100.0%) 0.0100 ( 50.0%) 0.0031 ( 6.4%) Dominator Set Construction + 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0013 ( 2.7%) Module Verifier + 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0033 ( 6.9%) Hello World Pass + 0.0100 (100.0%) 0.0100 (100.0%) 0.0200 (100.0%) 0.0479 (100.0%) TOTAL + +As you can see, our implementation above is pretty fast. The additional +passes listed are automatically inserted by the :program:`opt` tool to verify +that the LLVM emitted by your pass is still valid and well formed LLVM, which +hasn't been broken somehow. + +Now that you have seen the basics of the mechanics behind passes, we can talk +about some more details of how they work and how to use them. + +.. _writing-an-llvm-pass-pass-classes: + +Pass classes and requirements +============================= + +One of the first things that you should do when designing a new pass is to +decide what class you should subclass for your pass. The :ref:`Hello World +<writing-an-llvm-pass-basiccode>` example uses the :ref:`FunctionPass +<writing-an-llvm-pass-FunctionPass>` class for its implementation, but we did +not discuss why or when this should occur. Here we talk about the classes +available, from the most general to the most specific. + +When choosing a superclass for your ``Pass``, you should choose the **most +specific** class possible, while still being able to meet the requirements +listed. This gives the LLVM Pass Infrastructure information necessary to +optimize how passes are run, so that the resultant compiler isn't unnecessarily +slow. + +The ``ImmutablePass`` class +--------------------------- + +The most plain and boring type of pass is the "`ImmutablePass +<http://llvm.org/doxygen/classllvm_1_1ImmutablePass.html>`_" class. This pass +type is used for passes that do not have to be run, do not change state, and +never need to be updated. This is not a normal type of transformation or +analysis, but can provide information about the current compiler configuration. + +Although this pass class is very infrequently used, it is important for +providing information about the current target machine being compiled for, and +other static information that can affect the various transformations. + +``ImmutablePass``\ es never invalidate other transformations, are never +invalidated, and are never "run". + +.. 
_writing-an-llvm-pass-ModulePass: + +The ``ModulePass`` class +------------------------ + +The `ModulePass <http://llvm.org/doxygen/classllvm_1_1ModulePass.html>`_ class +is the most general of all superclasses that you can use. Deriving from +``ModulePass`` indicates that your pass uses the entire program as a unit, +referring to function bodies in no predictable order, or adding and removing +functions. Because nothing is known about the behavior of ``ModulePass`` +subclasses, no optimization can be done for their execution. + +A module pass can use function level passes (e.g. dominators) using the +``getAnalysis`` interface ``getAnalysis<DominatorTree>(llvm::Function *)`` to +provide the function to retrieve analysis result for, if the function pass does +not require any module or immutable passes. Note that this can only be done +for functions for which the analysis ran, e.g. in the case of dominators you +should only ask for the ``DominatorTree`` for function definitions, not +declarations. + +To write a correct ``ModulePass`` subclass, derive from ``ModulePass`` and +overload the ``runOnModule`` method with the following signature: + +The ``runOnModule`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool runOnModule(Module &M) = 0; + +The ``runOnModule`` method performs the interesting work of the pass. It +should return ``true`` if the module was modified by the transformation and +``false`` otherwise. + +.. _writing-an-llvm-pass-CallGraphSCCPass: + +The ``CallGraphSCCPass`` class +------------------------------ + +The `CallGraphSCCPass +<http://llvm.org/doxygen/classllvm_1_1CallGraphSCCPass.html>`_ is used by +passes that need to traverse the program bottom-up on the call graph (callees +before callers). Deriving from ``CallGraphSCCPass`` provides some mechanics +for building and traversing the ``CallGraph``, but also allows the system to +optimize execution of ``CallGraphSCCPass``\ es. If your pass meets the +requirements outlined below, and doesn't meet the requirements of a +:ref:`FunctionPass <writing-an-llvm-pass-FunctionPass>` or :ref:`BasicBlockPass +<writing-an-llvm-pass-BasicBlockPass>`, you should derive from +``CallGraphSCCPass``. + +``TODO``: explain briefly what SCC, Tarjan's algo, and B-U mean. + +To be explicit, CallGraphSCCPass subclasses are: + +#. ... *not allowed* to inspect or modify any ``Function``\ s other than those + in the current SCC and the direct callers and direct callees of the SCC. +#. ... *required* to preserve the current ``CallGraph`` object, updating it to + reflect any changes made to the program. +#. ... *not allowed* to add or remove SCC's from the current Module, though + they may change the contents of an SCC. +#. ... *allowed* to add or remove global variables from the current Module. +#. ... *allowed* to maintain state across invocations of :ref:`runOnSCC + <writing-an-llvm-pass-runOnSCC>` (including global data). + +Implementing a ``CallGraphSCCPass`` is slightly tricky in some cases because it +has to handle SCCs with more than one node in it. All of the virtual methods +described below should return ``true`` if they modified the program, or +``false`` if they didn't. + +The ``doInitialization(CallGraph &)`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool doInitialization(CallGraph &CG); + +The ``doInitialization`` method is allowed to do most of the things that +``CallGraphSCCPass``\ es are not allowed to do. 
They can add and remove +functions, get pointers to functions, etc. The ``doInitialization`` method is +designed to do simple initialization type of stuff that does not depend on the +SCCs being processed. The ``doInitialization`` method call is not scheduled to +overlap with any other pass executions (thus it should be very fast). + +.. _writing-an-llvm-pass-runOnSCC: + +The ``runOnSCC`` method +^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool runOnSCC(CallGraphSCC &SCC) = 0; + +The ``runOnSCC`` method performs the interesting work of the pass, and should +return ``true`` if the module was modified by the transformation, ``false`` +otherwise. + +The ``doFinalization(CallGraph &)`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool doFinalization(CallGraph &CG); + +The ``doFinalization`` method is an infrequently used method that is called +when the pass framework has finished calling :ref:`runOnFunction +<writing-an-llvm-pass-runOnFunction>` for every function in the program being +compiled. + +.. _writing-an-llvm-pass-FunctionPass: + +The ``FunctionPass`` class +-------------------------- + +In contrast to ``ModulePass`` subclasses, `FunctionPass +<http://llvm.org/doxygen/classllvm_1_1Pass.html>`_ subclasses do have a +predictable, local behavior that can be expected by the system. All +``FunctionPass`` execute on each function in the program independent of all of +the other functions in the program. ``FunctionPass``\ es do not require that +they are executed in a particular order, and ``FunctionPass``\ es do not modify +external functions. + +To be explicit, ``FunctionPass`` subclasses are not allowed to: + +#. Modify a ``Function`` other than the one currently being processed. +#. Add or remove ``Function``\ s from the current ``Module``. +#. Add or remove global variables from the current ``Module``. +#. Maintain state across invocations of:ref:`runOnFunction + <writing-an-llvm-pass-runOnFunction>` (including global data). + +Implementing a ``FunctionPass`` is usually straightforward (See the :ref:`Hello +World <writing-an-llvm-pass-basiccode>` pass for example). +``FunctionPass``\ es may overload three virtual methods to do their work. All +of these methods should return ``true`` if they modified the program, or +``false`` if they didn't. + +.. _writing-an-llvm-pass-doInitialization-mod: + +The ``doInitialization(Module &)`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool doInitialization(Module &M); + +The ``doInitialization`` method is allowed to do most of the things that +``FunctionPass``\ es are not allowed to do. They can add and remove functions, +get pointers to functions, etc. The ``doInitialization`` method is designed to +do simple initialization type of stuff that does not depend on the functions +being processed. The ``doInitialization`` method call is not scheduled to +overlap with any other pass executions (thus it should be very fast). + +A good example of how this method should be used is the `LowerAllocations +<http://llvm.org/doxygen/LowerAllocations_8cpp-source.html>`_ pass. This pass +converts ``malloc`` and ``free`` instructions into platform dependent +``malloc()`` and ``free()`` function calls. It uses the ``doInitialization`` +method to get a reference to the ``malloc`` and ``free`` functions that it +needs, adding prototypes to the module if necessary. + +.. _writing-an-llvm-pass-runOnFunction: + +The ``runOnFunction`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code-block:: c++ + + virtual bool runOnFunction(Function &F) = 0; + +The ``runOnFunction`` method must be implemented by your subclass to do the +transformation or analysis work of your pass. As usual, a ``true`` value +should be returned if the function is modified. + +.. _writing-an-llvm-pass-doFinalization-mod: + +The ``doFinalization(Module &)`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool doFinalization(Module &M); + +The ``doFinalization`` method is an infrequently used method that is called +when the pass framework has finished calling :ref:`runOnFunction +<writing-an-llvm-pass-runOnFunction>` for every function in the program being +compiled. + +.. _writing-an-llvm-pass-LoopPass: + +The ``LoopPass`` class +---------------------- + +All ``LoopPass`` execute on each loop in the function independent of all of the +other loops in the function. ``LoopPass`` processes loops in loop nest order +such that outer most loop is processed last. + +``LoopPass`` subclasses are allowed to update loop nest using ``LPPassManager`` +interface. Implementing a loop pass is usually straightforward. +``LoopPass``\ es may overload three virtual methods to do their work. All +these methods should return ``true`` if they modified the program, or ``false`` +if they didn't. + +The ``doInitialization(Loop *, LPPassManager &)`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool doInitialization(Loop *, LPPassManager &LPM); + +The ``doInitialization`` method is designed to do simple initialization type of +stuff that does not depend on the functions being processed. The +``doInitialization`` method call is not scheduled to overlap with any other +pass executions (thus it should be very fast). ``LPPassManager`` interface +should be used to access ``Function`` or ``Module`` level analysis information. + +.. _writing-an-llvm-pass-runOnLoop: + +The ``runOnLoop`` method +^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool runOnLoop(Loop *, LPPassManager &LPM) = 0; + +The ``runOnLoop`` method must be implemented by your subclass to do the +transformation or analysis work of your pass. As usual, a ``true`` value +should be returned if the function is modified. ``LPPassManager`` interface +should be used to update loop nest. + +The ``doFinalization()`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool doFinalization(); + +The ``doFinalization`` method is an infrequently used method that is called +when the pass framework has finished calling :ref:`runOnLoop +<writing-an-llvm-pass-runOnLoop>` for every loop in the program being compiled. + +.. _writing-an-llvm-pass-RegionPass: + +The ``RegionPass`` class +------------------------ + +``RegionPass`` is similar to :ref:`LoopPass <writing-an-llvm-pass-LoopPass>`, +but executes on each single entry single exit region in the function. +``RegionPass`` processes regions in nested order such that the outer most +region is processed last. + +``RegionPass`` subclasses are allowed to update the region tree by using the +``RGPassManager`` interface. You may overload three virtual methods of +``RegionPass`` to implement your own region pass. All these methods should +return ``true`` if they modified the program, or ``false`` if they did not. + +The ``doInitialization(Region *, RGPassManager &)`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code-block:: c++ + + virtual bool doInitialization(Region *, RGPassManager &RGM); + +The ``doInitialization`` method is designed to do simple initialization type of +stuff that does not depend on the functions being processed. The +``doInitialization`` method call is not scheduled to overlap with any other +pass executions (thus it should be very fast). ``RPPassManager`` interface +should be used to access ``Function`` or ``Module`` level analysis information. + +.. _writing-an-llvm-pass-runOnRegion: + +The ``runOnRegion`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool runOnRegion(Region *, RGPassManager &RGM) = 0; + +The ``runOnRegion`` method must be implemented by your subclass to do the +transformation or analysis work of your pass. As usual, a true value should be +returned if the region is modified. ``RGPassManager`` interface should be used to +update region tree. + +The ``doFinalization()`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool doFinalization(); + +The ``doFinalization`` method is an infrequently used method that is called +when the pass framework has finished calling :ref:`runOnRegion +<writing-an-llvm-pass-runOnRegion>` for every region in the program being +compiled. + +.. _writing-an-llvm-pass-BasicBlockPass: + +The ``BasicBlockPass`` class +---------------------------- + +``BasicBlockPass``\ es are just like :ref:`FunctionPass's +<writing-an-llvm-pass-FunctionPass>` , except that they must limit their scope +of inspection and modification to a single basic block at a time. As such, +they are **not** allowed to do any of the following: + +#. Modify or inspect any basic blocks outside of the current one. +#. Maintain state across invocations of :ref:`runOnBasicBlock + <writing-an-llvm-pass-runOnBasicBlock>`. +#. Modify the control flow graph (by altering terminator instructions) +#. Any of the things forbidden for :ref:`FunctionPasses + <writing-an-llvm-pass-FunctionPass>`. + +``BasicBlockPass``\ es are useful for traditional local and "peephole" +optimizations. They may override the same :ref:`doInitialization(Module &) +<writing-an-llvm-pass-doInitialization-mod>` and :ref:`doFinalization(Module &) +<writing-an-llvm-pass-doFinalization-mod>` methods that :ref:`FunctionPass's +<writing-an-llvm-pass-FunctionPass>` have, but also have the following virtual +methods that may also be implemented: + +The ``doInitialization(Function &)`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool doInitialization(Function &F); + +The ``doInitialization`` method is allowed to do most of the things that +``BasicBlockPass``\ es are not allowed to do, but that ``FunctionPass``\ es +can. The ``doInitialization`` method is designed to do simple initialization +that does not depend on the ``BasicBlock``\ s being processed. The +``doInitialization`` method call is not scheduled to overlap with any other +pass executions (thus it should be very fast). + +.. _writing-an-llvm-pass-runOnBasicBlock: + +The ``runOnBasicBlock`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool runOnBasicBlock(BasicBlock &BB) = 0; + +Override this function to do the work of the ``BasicBlockPass``. This function +is not allowed to inspect or modify basic blocks other than the parameter, and +are not allowed to modify the CFG. A ``true`` value must be returned if the +basic block is modified. + +The ``doFinalization(Function &)`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code-block:: c++ + + virtual bool doFinalization(Function &F); + +The ``doFinalization`` method is an infrequently used method that is called +when the pass framework has finished calling :ref:`runOnBasicBlock +<writing-an-llvm-pass-runOnBasicBlock>` for every ``BasicBlock`` in the program +being compiled. This can be used to perform per-function finalization. + +The ``MachineFunctionPass`` class +--------------------------------- + +A ``MachineFunctionPass`` is a part of the LLVM code generator that executes on +the machine-dependent representation of each LLVM function in the program. + +Code generator passes are registered and initialized specially by +``TargetMachine::addPassesToEmitFile`` and similar routines, so they cannot +generally be run from the :program:`opt` or :program:`bugpoint` commands. + +A ``MachineFunctionPass`` is also a ``FunctionPass``, so all the restrictions +that apply to a ``FunctionPass`` also apply to it. ``MachineFunctionPass``\ es +also have additional restrictions. In particular, ``MachineFunctionPass``\ es +are not allowed to do any of the following: + +#. Modify or create any LLVM IR ``Instruction``\ s, ``BasicBlock``\ s, + ``Argument``\ s, ``Function``\ s, ``GlobalVariable``\ s, + ``GlobalAlias``\ es, or ``Module``\ s. +#. Modify a ``MachineFunction`` other than the one currently being processed. +#. Maintain state across invocations of :ref:`runOnMachineFunction + <writing-an-llvm-pass-runOnMachineFunction>` (including global data). + +.. _writing-an-llvm-pass-runOnMachineFunction: + +The ``runOnMachineFunction(MachineFunction &MF)`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual bool runOnMachineFunction(MachineFunction &MF) = 0; + +``runOnMachineFunction`` can be considered the main entry point of a +``MachineFunctionPass``; that is, you should override this method to do the +work of your ``MachineFunctionPass``. + +The ``runOnMachineFunction`` method is called on every ``MachineFunction`` in a +``Module``, so that the ``MachineFunctionPass`` may perform optimizations on +the machine-dependent representation of the function. If you want to get at +the LLVM ``Function`` for the ``MachineFunction`` you're working on, use +``MachineFunction``'s ``getFunction()`` accessor method --- but remember, you +may not modify the LLVM ``Function`` or its contents from a +``MachineFunctionPass``. + +.. _writing-an-llvm-pass-registration: + +Pass registration +----------------- + +In the :ref:`Hello World <writing-an-llvm-pass-basiccode>` example pass we +illustrated how pass registration works, and discussed some of the reasons that +it is used and what it does. Here we discuss how and why passes are +registered. + +As we saw above, passes are registered with the ``RegisterPass`` template. The +template parameter is the name of the pass that is to be used on the command +line to specify that the pass should be added to a program (for example, with +:program:`opt` or :program:`bugpoint`). The first argument is the name of the +pass, which is to be used for the :option:`-help` output of programs, as well +as for debug output generated by the :option:`--debug-pass` option. + +If you want your pass to be easily dumpable, you should implement the virtual +print method: + +The ``print`` method +^^^^^^^^^^^^^^^^^^^^ + +.. 
code-block:: c++ + + virtual void print(llvm::raw_ostream &O, const Module *M) const; + +The ``print`` method must be implemented by "analyses" in order to print a +human readable version of the analysis results. This is useful for debugging +an analysis itself, as well as for other people to figure out how an analysis +works. Use the opt ``-analyze`` argument to invoke this method. + +The ``llvm::raw_ostream`` parameter specifies the stream to write the results +on, and the ``Module`` parameter gives a pointer to the top level module of the +program that has been analyzed. Note however that this pointer may be ``NULL`` +in certain circumstances (such as calling the ``Pass::dump()`` from a +debugger), so it should only be used to enhance debug output, it should not be +depended on. + +.. _writing-an-llvm-pass-interaction: + +Specifying interactions between passes +-------------------------------------- + +One of the main responsibilities of the ``PassManager`` is to make sure that +passes interact with each other correctly. Because ``PassManager`` tries to +:ref:`optimize the execution of passes <writing-an-llvm-pass-passmanager>` it +must know how the passes interact with each other and what dependencies exist +between the various passes. To track this, each pass can declare the set of +passes that are required to be executed before the current pass, and the passes +which are invalidated by the current pass. + +Typically this functionality is used to require that analysis results are +computed before your pass is run. Running arbitrary transformation passes can +invalidate the computed analysis results, which is what the invalidation set +specifies. If a pass does not implement the :ref:`getAnalysisUsage +<writing-an-llvm-pass-getAnalysisUsage>` method, it defaults to not having any +prerequisite passes, and invalidating **all** other passes. + +.. _writing-an-llvm-pass-getAnalysisUsage: + +The ``getAnalysisUsage`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual void getAnalysisUsage(AnalysisUsage &Info) const; + +By implementing the ``getAnalysisUsage`` method, the required and invalidated +sets may be specified for your transformation. The implementation should fill +in the `AnalysisUsage +<http://llvm.org/doxygen/classllvm_1_1AnalysisUsage.html>`_ object with +information about which passes are required and not invalidated. To do this, a +pass may call any of the following methods on the ``AnalysisUsage`` object: + +The ``AnalysisUsage::addRequired<>`` and ``AnalysisUsage::addRequiredTransitive<>`` methods +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If your pass requires a previous pass to be executed (an analysis for example), +it can use one of these methods to arrange for it to be run before your pass. +LLVM has many different types of analyses and passes that can be required, +spanning the range from ``DominatorSet`` to ``BreakCriticalEdges``. Requiring +``BreakCriticalEdges``, for example, guarantees that there will be no critical +edges in the CFG when your pass has been run. + +Some analyses chain to other analyses to do their job. For example, an +`AliasAnalysis <AliasAnalysis>` implementation is required to :ref:`chain +<aliasanalysis-chaining>` to other alias analysis passes. In cases where +analyses chain, the ``addRequiredTransitive`` method should be used instead of +the ``addRequired`` method. 
This informs the ``PassManager`` that the +transitively required pass should be alive as long as the requiring pass is. + +The ``AnalysisUsage::addPreserved<>`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +One of the jobs of the ``PassManager`` is to optimize how and when analyses are +run. In particular, it attempts to avoid recomputing data unless it needs to. +For this reason, passes are allowed to declare that they preserve (i.e., they +don't invalidate) an existing analysis if it's available. For example, a +simple constant folding pass would not modify the CFG, so it can't possibly +affect the results of dominator analysis. By default, all passes are assumed +to invalidate all others. + +The ``AnalysisUsage`` class provides several methods which are useful in +certain circumstances that are related to ``addPreserved``. In particular, the +``setPreservesAll`` method can be called to indicate that the pass does not +modify the LLVM program at all (which is true for analyses), and the +``setPreservesCFG`` method can be used by transformations that change +instructions in the program but do not modify the CFG or terminator +instructions (note that this property is implicitly set for +:ref:`BasicBlockPass <writing-an-llvm-pass-BasicBlockPass>`\ es). + +``addPreserved`` is particularly useful for transformations like +``BreakCriticalEdges``. This pass knows how to update a small set of loop and +dominator related analyses if they exist, so it can preserve them, despite the +fact that it hacks on the CFG. + +Example implementations of ``getAnalysisUsage`` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + // This example modifies the program, but does not modify the CFG + void LICM::getAnalysisUsage(AnalysisUsage &AU) const { + AU.setPreservesCFG(); + AU.addRequired<LoopInfo>(); + } + +.. _writing-an-llvm-pass-getAnalysis: + +The ``getAnalysis<>`` and ``getAnalysisIfAvailable<>`` methods +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``Pass::getAnalysis<>`` method is automatically inherited by your class, +providing you with access to the passes that you declared that you required +with the :ref:`getAnalysisUsage <writing-an-llvm-pass-getAnalysisUsage>` +method. It takes a single template argument that specifies which pass class +you want, and returns a reference to that pass. For example: + +.. code-block:: c++ + + bool LICM::runOnFunction(Function &F) { + LoopInfo &LI = getAnalysis<LoopInfo>(); + //... + } + +This method call returns a reference to the pass desired. You may get a +runtime assertion failure if you attempt to get an analysis that you did not +declare as required in your :ref:`getAnalysisUsage +<writing-an-llvm-pass-getAnalysisUsage>` implementation. This method can be +called by your ``run*`` method implementation, or by any other local method +invoked by your ``run*`` method. + +A module level pass can use function level analysis info using this interface. +For example: + +.. code-block:: c++ + + bool ModuleLevelPass::runOnModule(Module &M) { + //... + DominatorTree &DT = getAnalysis<DominatorTree>(Func); + //... + } + +In above example, ``runOnFunction`` for ``DominatorTree`` is called by pass +manager before returning a reference to the desired pass. + +If your pass is capable of updating analyses if they exist (e.g., +``BreakCriticalEdges``, as described above), you can use the +``getAnalysisIfAvailable`` method, which returns a pointer to the analysis if +it is active. For example: + +.. 
code-block:: c++ + + if (DominatorSet *DS = getAnalysisIfAvailable<DominatorSet>()) { + // A DominatorSet is active. This code will update it. + } + +Implementing Analysis Groups +---------------------------- + +Now that we understand the basics of how passes are defined, how they are used, +and how they are required from other passes, it's time to get a little bit +fancier. All of the pass relationships that we have seen so far are very +simple: one pass depends on one other specific pass to be run before it can +run. For many applications, this is great, for others, more flexibility is +required. + +In particular, some analyses are defined such that there is a single simple +interface to the analysis results, but multiple ways of calculating them. +Consider alias analysis for example. The most trivial alias analysis returns +"may alias" for any alias query. The most sophisticated analysis a +flow-sensitive, context-sensitive interprocedural analysis that can take a +significant amount of time to execute (and obviously, there is a lot of room +between these two extremes for other implementations). To cleanly support +situations like this, the LLVM Pass Infrastructure supports the notion of +Analysis Groups. + +Analysis Group Concepts +^^^^^^^^^^^^^^^^^^^^^^^ + +An Analysis Group is a single simple interface that may be implemented by +multiple different passes. Analysis Groups can be given human readable names +just like passes, but unlike passes, they need not derive from the ``Pass`` +class. An analysis group may have one or more implementations, one of which is +the "default" implementation. + +Analysis groups are used by client passes just like other passes are: the +``AnalysisUsage::addRequired()`` and ``Pass::getAnalysis()`` methods. In order +to resolve this requirement, the :ref:`PassManager +<writing-an-llvm-pass-passmanager>` scans the available passes to see if any +implementations of the analysis group are available. If none is available, the +default implementation is created for the pass to use. All standard rules for +:ref:`interaction between passes <writing-an-llvm-pass-interaction>` still +apply. + +Although :ref:`Pass Registration <writing-an-llvm-pass-registration>` is +optional for normal passes, all analysis group implementations must be +registered, and must use the :ref:`INITIALIZE_AG_PASS +<writing-an-llvm-pass-RegisterAnalysisGroup>` template to join the +implementation pool. Also, a default implementation of the interface **must** +be registered with :ref:`RegisterAnalysisGroup +<writing-an-llvm-pass-RegisterAnalysisGroup>`. + +As a concrete example of an Analysis Group in action, consider the +`AliasAnalysis <http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`_ +analysis group. The default implementation of the alias analysis interface +(the `basicaa <http://llvm.org/doxygen/structBasicAliasAnalysis.html>`_ pass) +just does a few simple checks that don't require significant analysis to +compute (such as: two different globals can never alias each other, etc). +Passes that use the `AliasAnalysis +<http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`_ interface (for +example the `gcse <http://llvm.org/doxygen/structGCSE.html>`_ pass), do not +care which implementation of alias analysis is actually provided, they just use +the designated interface. + +From the user's perspective, commands work just like normal. Issuing the +command ``opt -gcse ...`` will cause the ``basicaa`` class to be instantiated +and added to the pass sequence. 
Issuing the command ``opt -somefancyaa -gcse +...`` will cause the ``gcse`` pass to use the ``somefancyaa`` alias analysis +(which doesn't actually exist, it's just a hypothetical example) instead. + +.. _writing-an-llvm-pass-RegisterAnalysisGroup: + +Using ``RegisterAnalysisGroup`` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``RegisterAnalysisGroup`` template is used to register the analysis group +itself, while the ``INITIALIZE_AG_PASS`` is used to add pass implementations to +the analysis group. First, an analysis group should be registered, with a +human readable name provided for it. Unlike registration of passes, there is +no command line argument to be specified for the Analysis Group Interface +itself, because it is "abstract": + +.. code-block:: c++ + + static RegisterAnalysisGroup<AliasAnalysis> A("Alias Analysis"); + +Once the analysis is registered, passes can declare that they are valid +implementations of the interface by using the following code: + +.. code-block:: c++ + + namespace { + // Declare that we implement the AliasAnalysis interface + INITIALIZE_AG_PASS(FancyAA, AliasAnalysis , "somefancyaa", + "A more complex alias analysis implementation", + false, // Is CFG Only? + true, // Is Analysis? + false); // Is default Analysis Group implementation? + } + +This just shows a class ``FancyAA`` that uses the ``INITIALIZE_AG_PASS`` macro +both to register and to "join" the `AliasAnalysis +<http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`_ analysis group. +Every implementation of an analysis group should join using this macro. + +.. code-block:: c++ + + namespace { + // Declare that we implement the AliasAnalysis interface + INITIALIZE_AG_PASS(BasicAA, AliasAnalysis, "basicaa", + "Basic Alias Analysis (default AA impl)", + false, // Is CFG Only? + true, // Is Analysis? + true); // Is default Analysis Group implementation? + } + +Here we show how the default implementation is specified (using the final +argument to the ``INITIALIZE_AG_PASS`` template). There must be exactly one +default implementation available at all times for an Analysis Group to be used. +Only default implementation can derive from ``ImmutablePass``. Here we declare +that the `BasicAliasAnalysis +<http://llvm.org/doxygen/structBasicAliasAnalysis.html>`_ pass is the default +implementation for the interface. + +Pass Statistics +=============== + +The `Statistic <http://llvm.org/doxygen/Statistic_8h-source.html>`_ class is +designed to be an easy way to expose various success metrics from passes. +These statistics are printed at the end of a run, when the :option:`-stats` +command line option is enabled on the command line. See the :ref:`Statistics +section <Statistic>` in the Programmer's Manual for details. + +.. _writing-an-llvm-pass-passmanager: + +What PassManager does +--------------------- + +The `PassManager <http://llvm.org/doxygen/PassManager_8h-source.html>`_ `class +<http://llvm.org/doxygen/classllvm_1_1PassManager.html>`_ takes a list of +passes, ensures their :ref:`prerequisites <writing-an-llvm-pass-interaction>` +are set up correctly, and then schedules passes to run efficiently. All of the +LLVM tools that run passes use the PassManager for execution of these passes. + +The PassManager does two main things to try to reduce the execution time of a +series of passes: + +#. **Share analysis results.** The ``PassManager`` attempts to avoid + recomputing analysis results as much as possible. 
This means keeping track + of which analyses are available already, which analyses get invalidated, and + which analyses are needed to be run for a pass. An important part of work + is that the ``PassManager`` tracks the exact lifetime of all analysis + results, allowing it to :ref:`free memory + <writing-an-llvm-pass-releaseMemory>` allocated to holding analysis results + as soon as they are no longer needed. + +#. **Pipeline the execution of passes on the program.** The ``PassManager`` + attempts to get better cache and memory usage behavior out of a series of + passes by pipelining the passes together. This means that, given a series + of consecutive :ref:`FunctionPass <writing-an-llvm-pass-FunctionPass>`, it + will execute all of the :ref:`FunctionPass + <writing-an-llvm-pass-FunctionPass>` on the first function, then all of the + :ref:`FunctionPasses <writing-an-llvm-pass-FunctionPass>` on the second + function, etc... until the entire program has been run through the passes. + + This improves the cache behavior of the compiler, because it is only + touching the LLVM program representation for a single function at a time, + instead of traversing the entire program. It reduces the memory consumption + of compiler, because, for example, only one `DominatorSet + <http://llvm.org/doxygen/classllvm_1_1DominatorSet.html>`_ needs to be + calculated at a time. This also makes it possible to implement some + :ref:`interesting enhancements <writing-an-llvm-pass-SMP>` in the future. + +The effectiveness of the ``PassManager`` is influenced directly by how much +information it has about the behaviors of the passes it is scheduling. For +example, the "preserved" set is intentionally conservative in the face of an +unimplemented :ref:`getAnalysisUsage <writing-an-llvm-pass-getAnalysisUsage>` +method. Not implementing when it should be implemented will have the effect of +not allowing any analysis results to live across the execution of your pass. + +The ``PassManager`` class exposes a ``--debug-pass`` command line options that +is useful for debugging pass execution, seeing how things work, and diagnosing +when you should be preserving more analyses than you currently are. (To get +information about all of the variants of the ``--debug-pass`` option, just type +"``opt -help-hidden``"). + +By using the --debug-pass=Structure option, for example, we can see how our +:ref:`Hello World <writing-an-llvm-pass-basiccode>` pass interacts with other +passes. Lets try it out with the gcse and licm passes: + +.. code-block:: console + + $ opt -load ../../../Debug+Asserts/lib/Hello.so -gcse -licm --debug-pass=Structure < hello.bc > /dev/null + Module Pass Manager + Function Pass Manager + Dominator Set Construction + Immediate Dominators Construction + Global Common Subexpression Elimination + -- Immediate Dominators Construction + -- Global Common Subexpression Elimination + Natural Loop Construction + Loop Invariant Code Motion + -- Natural Loop Construction + -- Loop Invariant Code Motion + Module Verifier + -- Dominator Set Construction + -- Module Verifier + Bitcode Writer + --Bitcode Writer + +This output shows us when passes are constructed and when the analysis results +are known to be dead (prefixed with "``--``"). Here we see that GCSE uses +dominator and immediate dominator information to do its job. The LICM pass +uses natural loop information, which uses dominator sets, but not immediate +dominators. 
Because immediate dominators are no longer useful after the GCSE +pass, it is immediately destroyed. The dominator sets are then reused to +compute natural loop information, which is then used by the LICM pass. + +After the LICM pass, the module verifier runs (which is automatically added by +the :program:`opt` tool), which uses the dominator set to check that the +resultant LLVM code is well formed. After it finishes, the dominator set +information is destroyed, after being computed once, and shared by three +passes. + +Lets see how this changes when we run the :ref:`Hello World +<writing-an-llvm-pass-basiccode>` pass in between the two passes: + +.. code-block:: console + + $ opt -load ../../../Debug+Asserts/lib/Hello.so -gcse -hello -licm --debug-pass=Structure < hello.bc > /dev/null + Module Pass Manager + Function Pass Manager + Dominator Set Construction + Immediate Dominators Construction + Global Common Subexpression Elimination + -- Dominator Set Construction + -- Immediate Dominators Construction + -- Global Common Subexpression Elimination + Hello World Pass + -- Hello World Pass + Dominator Set Construction + Natural Loop Construction + Loop Invariant Code Motion + -- Natural Loop Construction + -- Loop Invariant Code Motion + Module Verifier + -- Dominator Set Construction + -- Module Verifier + Bitcode Writer + --Bitcode Writer + Hello: __main + Hello: puts + Hello: main + +Here we see that the :ref:`Hello World <writing-an-llvm-pass-basiccode>` pass +has killed the Dominator Set pass, even though it doesn't modify the code at +all! To fix this, we need to add the following :ref:`getAnalysisUsage +<writing-an-llvm-pass-getAnalysisUsage>` method to our pass: + +.. code-block:: c++ + + // We don't modify the program, so we preserve all analyses + virtual void getAnalysisUsage(AnalysisUsage &AU) const { + AU.setPreservesAll(); + } + +Now when we run our pass, we get this output: + +.. code-block:: console + + $ opt -load ../../../Debug+Asserts/lib/Hello.so -gcse -hello -licm --debug-pass=Structure < hello.bc > /dev/null + Pass Arguments: -gcse -hello -licm + Module Pass Manager + Function Pass Manager + Dominator Set Construction + Immediate Dominators Construction + Global Common Subexpression Elimination + -- Immediate Dominators Construction + -- Global Common Subexpression Elimination + Hello World Pass + -- Hello World Pass + Natural Loop Construction + Loop Invariant Code Motion + -- Loop Invariant Code Motion + -- Natural Loop Construction + Module Verifier + -- Dominator Set Construction + -- Module Verifier + Bitcode Writer + --Bitcode Writer + Hello: __main + Hello: puts + Hello: main + +Which shows that we don't accidentally invalidate dominator information +anymore, and therefore do not have to compute it twice. + +.. _writing-an-llvm-pass-releaseMemory: + +The ``releaseMemory`` method +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: c++ + + virtual void releaseMemory(); + +The ``PassManager`` automatically determines when to compute analysis results, +and how long to keep them around for. Because the lifetime of the pass object +itself is effectively the entire duration of the compilation process, we need +some way to free analysis results when they are no longer useful. The +``releaseMemory`` virtual method is the way to do this. 
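+
+As a rough sketch (the pass, its data member, and the stored result type here
+are hypothetical, not an existing LLVM pass), an analysis that caches
+per-function results might override it like this:
+
+.. code-block:: c++
+
+  struct MyExpensiveAnalysis : public FunctionPass {
+    static char ID;
+    MyExpensiveAnalysis() : FunctionPass(ID) {}
+
+    virtual bool runOnFunction(Function &F) {
+      // ... compute results and store them in BlockCounts ...
+      return false;  // an analysis does not modify the IR
+    }
+
+    // Called by the PassManager once the cached results are no longer needed.
+    virtual void releaseMemory() {
+      BlockCounts.clear();
+    }
+
+    std::map<const BasicBlock*, unsigned> BlockCounts;  // cached results
+  };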
+ +If you are writing an analysis or any other pass that retains a significant +amount of state (for use by another pass which "requires" your pass and uses +the :ref:`getAnalysis <writing-an-llvm-pass-getAnalysis>` method) you should +implement ``releaseMemory`` to, well, release the memory allocated to maintain +this internal state. This method is called after the ``run*`` method for the +class, before the next call of ``run*`` in your pass. + +Registering dynamically loaded passes +===================================== + +*Size matters* when constructing production quality tools using LLVM, both for +the purposes of distribution, and for regulating the resident code size when +running on the target system. Therefore, it becomes desirable to selectively +use some passes, while omitting others and maintain the flexibility to change +configurations later on. You want to be able to do all this, and, provide +feedback to the user. This is where pass registration comes into play. + +The fundamental mechanisms for pass registration are the +``MachinePassRegistry`` class and subclasses of ``MachinePassRegistryNode``. + +An instance of ``MachinePassRegistry`` is used to maintain a list of +``MachinePassRegistryNode`` objects. This instance maintains the list and +communicates additions and deletions to the command line interface. + +An instance of ``MachinePassRegistryNode`` subclass is used to maintain +information provided about a particular pass. This information includes the +command line name, the command help string and the address of the function used +to create an instance of the pass. A global static constructor of one of these +instances *registers* with a corresponding ``MachinePassRegistry``, the static +destructor *unregisters*. Thus a pass that is statically linked in the tool +will be registered at start up. A dynamically loaded pass will register on +load and unregister at unload. + +Using existing registries +------------------------- + +There are predefined registries to track instruction scheduling +(``RegisterScheduler``) and register allocation (``RegisterRegAlloc``) machine +passes. Here we will describe how to *register* a register allocator machine +pass. + +Implement your register allocator machine pass. In your register allocator +``.cpp`` file add the following include: + +.. code-block:: c++ + + #include "llvm/CodeGen/RegAllocRegistry.h" + +Also in your register allocator ``.cpp`` file, define a creator function in the +form: + +.. code-block:: c++ + + FunctionPass *createMyRegisterAllocator() { + return new MyRegisterAllocator(); + } + +Note that the signature of this function should match the type of +``RegisterRegAlloc::FunctionPassCtor``. In the same file add the "installing" +declaration, in the form: + +.. code-block:: c++ + + static RegisterRegAlloc myRegAlloc("myregalloc", + "my register allocator help string", + createMyRegisterAllocator); + +Note the two spaces prior to the help string produces a tidy result on the +:option:`-help` query. + +.. code-block:: console + + $ llc -help + ... + -regalloc - Register allocator to use (default=linearscan) + =linearscan - linear scan register allocator + =local - local register allocator + =simple - simple register allocator + =myregalloc - my register allocator help string + ... + +And that's it. The user is now free to use ``-regalloc=myregalloc`` as an +option. Registering instruction schedulers is similar except use the +``RegisterScheduler`` class. 
Note that the +``RegisterScheduler::FunctionPassCtor`` is significantly different from +``RegisterRegAlloc::FunctionPassCtor``. + +To force the load/linking of your register allocator into the +:program:`llc`/:program:`lli` tools, add your creator function's global +declaration to ``Passes.h`` and add a "pseudo" call line to +``llvm/Codegen/LinkAllCodegenComponents.h``. + +Creating new registries +----------------------- + +The easiest way to get started is to clone one of the existing registries; we +recommend ``llvm/CodeGen/RegAllocRegistry.h``. The key things to modify are +the class name and the ``FunctionPassCtor`` type. + +Then you need to declare the registry. Example: if your pass registry is +``RegisterMyPasses`` then define: + +.. code-block:: c++ + + MachinePassRegistry RegisterMyPasses::Registry; + +And finally, declare the command line option for your passes. Example: + +.. code-block:: c++ + + cl::opt<RegisterMyPasses::FunctionPassCtor, false, + RegisterPassParser<RegisterMyPasses> > + MyPassOpt("mypass", + cl::init(&createDefaultMyPass), + cl::desc("my pass option help")); + +Here the command option is "``mypass``", with ``createDefaultMyPass`` as the +default creator. + +Using GDB with dynamically loaded passes +---------------------------------------- + +Unfortunately, using GDB with dynamically loaded passes is not as easy as it +should be. First of all, you can't set a breakpoint in a shared object that +has not been loaded yet, and second of all there are problems with inlined +functions in shared objects. Here are some suggestions to debugging your pass +with GDB. + +For sake of discussion, I'm going to assume that you are debugging a +transformation invoked by :program:`opt`, although nothing described here +depends on that. + +Setting a breakpoint in your pass +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +First thing you do is start gdb on the opt process: + +.. code-block:: console + + $ gdb opt + GNU gdb 5.0 + Copyright 2000 Free Software Foundation, Inc. + GDB is free software, covered by the GNU General Public License, and you are + welcome to change it and/or distribute copies of it under certain conditions. + Type "show copying" to see the conditions. + There is absolutely no warranty for GDB. Type "show warranty" for details. + This GDB was configured as "sparc-sun-solaris2.6"... + (gdb) + +Note that :program:`opt` has a lot of debugging information in it, so it takes +time to load. Be patient. Since we cannot set a breakpoint in our pass yet +(the shared object isn't loaded until runtime), we must execute the process, +and have it stop before it invokes our pass, but after it has loaded the shared +object. The most foolproof way of doing this is to set a breakpoint in +``PassManager::run`` and then run the process with the arguments you want: + +.. code-block:: console + + $ (gdb) break llvm::PassManager::run + Breakpoint 1 at 0x2413bc: file Pass.cpp, line 70. + (gdb) run test.bc -load $(LLVMTOP)/llvm/Debug+Asserts/lib/[libname].so -[passoption] + Starting program: opt test.bc -load $(LLVMTOP)/llvm/Debug+Asserts/lib/[libname].so -[passoption] + Breakpoint 1, PassManager::run (this=0xffbef174, M=@0x70b298) at Pass.cpp:70 + 70 bool PassManager::run(Module &M) { return PM->run(M); } + (gdb) + +Once the :program:`opt` stops in the ``PassManager::run`` method you are now +free to set breakpoints in your pass so that you can trace through execution or +do other standard debugging stuff. 
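+
+For example, a session might continue along these lines (the file name, line
+number, and address are illustrative; substitute the location of your own
+``runOnFunction``):
+
+.. code-block:: console
+
+  (gdb) break Hello.cpp:10
+  Breakpoint 2 at 0x12345678: file Hello.cpp, line 10.
+  (gdb) continue
+  Continuing.
+  Breakpoint 2, Hello::runOnFunction (this=..., F=...) at Hello.cpp:10
+  (gdb)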
+ +Miscellaneous Problems +^^^^^^^^^^^^^^^^^^^^^^ + +Once you have the basics down, there are a couple of problems that GDB has, +some with solutions, some without. + +* Inline functions have bogus stack information. In general, GDB does a pretty + good job getting stack traces and stepping through inline functions. When a + pass is dynamically loaded however, it somehow completely loses this + capability. The only solution I know of is to de-inline a function (move it + from the body of a class to a ``.cpp`` file). + +* Restarting the program breaks breakpoints. After following the information + above, you have succeeded in getting some breakpoints planted in your pass. + Nex thing you know, you restart the program (i.e., you type "``run``" again), + and you start getting errors about breakpoints being unsettable. The only + way I have found to "fix" this problem is to delete the breakpoints that are + already set in your pass, run the program, and re-set the breakpoints once + execution stops in ``PassManager::run``. + +Hopefully these tips will help with common case debugging situations. If you'd +like to contribute some tips of your own, just contact `Chris +<mailto:sabre@nondot.org>`_. + +Future extensions planned +------------------------- + +Although the LLVM Pass Infrastructure is very capable as it stands, and does +some nifty stuff, there are things we'd like to add in the future. Here is +where we are going: + +.. _writing-an-llvm-pass-SMP: + +Multithreaded LLVM +^^^^^^^^^^^^^^^^^^ + +Multiple CPU machines are becoming more common and compilation can never be +fast enough: obviously we should allow for a multithreaded compiler. Because +of the semantics defined for passes above (specifically they cannot maintain +state across invocations of their ``run*`` methods), a nice clean way to +implement a multithreaded compiler would be for the ``PassManager`` class to +create multiple instances of each pass object, and allow the separate instances +to be hacking on different parts of the program at the same time. + +This implementation would prevent each of the passes from having to implement +multithreaded constructs, requiring only the LLVM core to have locking in a few +places (for global resources). Although this is a simple extension, we simply +haven't had time (or multiprocessor machines, thus a reason) to implement this. +Despite that, we have kept the LLVM passes SMP ready, and you should too. + diff --git a/docs/YamlIO.rst b/docs/YamlIO.rst new file mode 100644 index 0000000000..ac50292f4a --- /dev/null +++ b/docs/YamlIO.rst @@ -0,0 +1,860 @@ +===================== +YAML I/O +===================== + +.. contents:: + :local: + +Introduction to YAML +==================== + +YAML is a human readable data serialization language. The full YAML language +spec can be read at `yaml.org +<http://www.yaml.org/spec/1.2/spec.html#Introduction>`_. The simplest form of +yaml is just "scalars", "mappings", and "sequences". A scalar is any number +or string. The pound/hash symbol (#) begins a comment line. A mapping is +a set of key-value pairs where the key ends with a colon. For example: + +.. code-block:: yaml + + # a mapping + name: Tom + hat-size: 7 + +A sequence is a list of items where each item starts with a leading dash ('-'). +For example: + +.. code-block:: yaml + + # a sequence + - x86 + - x86_64 + - PowerPC + +You can combine mappings and sequences by indenting. For example a sequence +of mappings in which one of the mapping values is itself a sequence: + +.. 
code-block:: yaml + + # a sequence of mappings with one key's value being a sequence + - name: Tom + cpus: + - x86 + - x86_64 + - name: Bob + cpus: + - x86 + - name: Dan + cpus: + - PowerPC + - x86 + +Sometime sequences are known to be short and the one entry per line is too +verbose, so YAML offers an alternate syntax for sequences called a "Flow +Sequence" in which you put comma separated sequence elements into square +brackets. The above example could then be simplified to : + + +.. code-block:: yaml + + # a sequence of mappings with one key's value being a flow sequence + - name: Tom + cpus: [ x86, x86_64 ] + - name: Bob + cpus: [ x86 ] + - name: Dan + cpus: [ PowerPC, x86 ] + + +Introduction to YAML I/O +======================== + +The use of indenting makes the YAML easy for a human to read and understand, +but having a program read and write YAML involves a lot of tedious details. +The YAML I/O library structures and simplifies reading and writing YAML +documents. + +YAML I/O assumes you have some "native" data structures which you want to be +able to dump as YAML and recreate from YAML. The first step is to try +writing example YAML for your data structures. You may find after looking at +possible YAML representations that a direct mapping of your data structures +to YAML is not very readable. Often the fields are not in the order that +a human would find readable. Or the same information is replicated in multiple +locations, making it hard for a human to write such YAML correctly. + +In relational database theory there is a design step called normalization in +which you reorganize fields and tables. The same considerations need to +go into the design of your YAML encoding. But, you may not want to change +your existing native data structures. Therefore, when writing out YAML +there may be a normalization step, and when reading YAML there would be a +corresponding denormalization step. + +YAML I/O uses a non-invasive, traits based design. YAML I/O defines some +abstract base templates. You specialize those templates on your data types. +For instance, if you have an enumerated type FooBar you could specialize +ScalarEnumerationTraits on that type and define the enumeration() method: + +.. code-block:: c++ + + using llvm::yaml::ScalarEnumerationTraits; + using llvm::yaml::IO; + + template <> + struct ScalarEnumerationTraits<FooBar> { + static void enumeration(IO &io, FooBar &value) { + ... + } + }; + + +As with all YAML I/O template specializations, the ScalarEnumerationTraits is used for +both reading and writing YAML. That is, the mapping between in-memory enum +values and the YAML string representation is only in place. +This assures that the code for writing and parsing of YAML stays in sync. + +To specify a YAML mappings, you define a specialization on +llvm::yaml::MappingTraits. +If your native data structure happens to be a struct that is already normalized, +then the specialization is simple. For example: + +.. code-block:: c++ + + using llvm::yaml::MappingTraits; + using llvm::yaml::IO; + + template <> + struct MappingTraits<Person> { + static void mapping(IO &io, Person &info) { + io.mapRequired("name", info.name); + io.mapOptional("hat-size", info.hatSize); + } + }; + + +A YAML sequence is automatically inferred if you data type has begin()/end() +iterators and a push_back() method. Therefore any of the STL containers +(such as std::vector<>) will automatically translate to YAML sequences. 
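+
+For the examples in this section, assume the native ``Person`` type is a plain
+struct roughly like the following (the exact field types are an assumption;
+any of the built-in types listed below would do):
+
+.. code-block:: c++
+
+    struct Person {
+      llvm::StringRef name;     // scalar fields use types YAML I/O understands
+      uint32_t        hatSize;
+      Person() : hatSize(0) {}  // default-constructible, so it can be read into
+    };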
+ +Once you have defined specializations for your data types, you can +programmatically use YAML I/O to write a YAML document: + +.. code-block:: c++ + + using llvm::yaml::Output; + + Person tom; + tom.name = "Tom"; + tom.hatSize = 8; + Person dan; + dan.name = "Dan"; + dan.hatSize = 7; + std::vector<Person> persons; + persons.push_back(tom); + persons.push_back(dan); + + Output yout(llvm::outs()); + yout << persons; + +This would write the following: + +.. code-block:: yaml + + - name: Tom + hat-size: 8 + - name: Dan + hat-size: 7 + +And you can also read such YAML documents with the following code: + +.. code-block:: c++ + + using llvm::yaml::Input; + + typedef std::vector<Person> PersonList; + std::vector<PersonList> docs; + + Input yin(document.getBuffer()); + yin >> docs; + + if ( yin.error() ) + return; + + // Process read document + for ( PersonList &pl : docs ) { + for ( Person &person : pl ) { + cout << "name=" << person.name; + } + } + +One other feature of YAML is the ability to define multiple documents in a +single file. That is why reading YAML produces a vector of your document type. + + + +Error Handling +============== + +When parsing a YAML document, if the input does not match your schema (as +expressed in your XxxTraits<> specializations). YAML I/O +will print out an error message and your Input object's error() method will +return true. For instance the following document: + +.. code-block:: yaml + + - name: Tom + shoe-size: 12 + - name: Dan + hat-size: 7 + +Has a key (shoe-size) that is not defined in the schema. YAML I/O will +automatically generate this error: + +.. code-block:: yaml + + YAML:2:2: error: unknown key 'shoe-size' + shoe-size: 12 + ^~~~~~~~~ + +Similar errors are produced for other input not conforming to the schema. + + +Scalars +======= + +YAML scalars are just strings (i.e. not a sequence or mapping). The YAML I/O +library provides support for translating between YAML scalars and specific +C++ types. + + +Built-in types +-------------- +The following types have built-in support in YAML I/O: + +* bool +* float +* double +* StringRef +* int64_t +* int32_t +* int16_t +* int8_t +* uint64_t +* uint32_t +* uint16_t +* uint8_t + +That is, you can use those types in fields of MappingTraits or as element type +in sequence. When reading, YAML I/O will validate that the string found +is convertible to that type and error out if not. + + +Unique types +------------ +Given that YAML I/O is trait based, the selection of how to convert your data +to YAML is based on the type of your data. But in C++ type matching, typedefs +do not generate unique type names. That means if you have two typedefs of +unsigned int, to YAML I/O both types look exactly like unsigned int. To +facilitate make unique type names, YAML I/O provides a macro which is used +like a typedef on built-in types, but expands to create a class with conversion +operators to and from the base type. For example: + +.. code-block:: c++ + + LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyFooFlags) + LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyBarFlags) + +This generates two classes MyFooFlags and MyBarFlags which you can use in your +native data structures instead of uint32_t. They are implicitly +converted to and from uint32_t. The point of creating these unique types +is that you can now specify traits on them to get different YAML conversions. 
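+
+For instance (``FooBarInfo`` and its field names are made up for illustration),
+the two typedefs can be used side by side as distinct field types, and each one
+can later be given its own traits specialization, as the following sections
+show:
+
+.. code-block:: c++
+
+    LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyFooFlags)
+    LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyBarFlags)
+
+    struct FooBarInfo {
+      MyFooFlags  fooFlags;   // could be converted as, say, a bit set
+      MyBarFlags  barFlags;   // could be converted as, say, hexadecimal
+    };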
+
+Hex types
+---------
+An example use of a unique type is that YAML I/O provides fixed-sized unsigned
+integers that are written with YAML I/O as hexadecimal instead of the decimal
+format used by the built-in integer types:
+
+* Hex64
+* Hex32
+* Hex16
+* Hex8
+
+You can use llvm::yaml::Hex32 instead of uint32_t and the only difference will
+be that when YAML I/O writes out that type it will be formatted in hexadecimal.
+
+
+ScalarEnumerationTraits
+-----------------------
+YAML I/O supports translating between in-memory enumerations and a set of string
+values in YAML documents. This is done by specializing ScalarEnumerationTraits<>
+on your enumeration type and defining an enumeration() method.
+For instance, suppose you had an enumeration of CPUs and a struct with it as
+a field:
+
+.. code-block:: c++
+
+    enum CPUs {
+      cpu_x86_64  = 5,
+      cpu_x86     = 7,
+      cpu_PowerPC = 8
+    };
+
+    struct Info {
+      CPUs      cpu;
+      uint32_t  flags;
+    };
+
+To support reading and writing of this enumeration, you can define a
+ScalarEnumerationTraits specialization on CPUs, which can then be used
+as a field type:
+
+.. code-block:: c++
+
+    using llvm::yaml::ScalarEnumerationTraits;
+    using llvm::yaml::MappingTraits;
+    using llvm::yaml::IO;
+
+    template <>
+    struct ScalarEnumerationTraits<CPUs> {
+      static void enumeration(IO &io, CPUs &value) {
+        io.enumCase(value, "x86_64",  cpu_x86_64);
+        io.enumCase(value, "x86",     cpu_x86);
+        io.enumCase(value, "PowerPC", cpu_PowerPC);
+      }
+    };
+
+    template <>
+    struct MappingTraits<Info> {
+      static void mapping(IO &io, Info &info) {
+        io.mapRequired("cpu",   info.cpu);
+        io.mapOptional("flags", info.flags, 0);
+      }
+    };
+
+When reading YAML, if the string found does not match any of the strings
+specified by the enumCase() methods, an error is automatically generated.
+When writing YAML, if the value being written does not match any of the values
+specified by the enumCase() methods, a runtime assertion is triggered.
+
+
+BitValue
+--------
+Another common data structure in C++ is a field where each bit has a unique
+meaning.  This is often used in a "flags" field.  YAML I/O has support for
+converting such fields to a flow sequence.  For instance, suppose you
+had the following bit flags defined:
+
+.. code-block:: c++
+
+    enum {
+      flagsPointy = 1,
+      flagsHollow = 2,
+      flagsFlat   = 4,
+      flagsRound  = 8
+    };
+
+    LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyFlags)
+
+To support reading and writing of MyFlags, you specialize ScalarBitSetTraits<>
+on MyFlags and provide the bit values and their names.
+
+.. code-block:: c++
+
+    using llvm::yaml::ScalarBitSetTraits;
+    using llvm::yaml::MappingTraits;
+    using llvm::yaml::IO;
+
+    template <>
+    struct ScalarBitSetTraits<MyFlags> {
+      static void bitset(IO &io, MyFlags &value) {
+        io.bitSetCase(value, "hollow",  flagsHollow);
+        io.bitSetCase(value, "flat",    flagsFlat);
+        io.bitSetCase(value, "round",   flagsRound);
+        io.bitSetCase(value, "pointy",  flagsPointy);
+      }
+    };
+
+    struct Info {
+      StringRef   name;
+      MyFlags     flags;
+    };
+
+    template <>
+    struct MappingTraits<Info> {
+      static void mapping(IO &io, Info &info) {
+        io.mapRequired("name",  info.name);
+        io.mapRequired("flags", info.flags);
+      }
+    };
+
+With the above, YAML I/O (when writing) will test each value in the bitset
+trait against the flags field (by masking), and each one that matches will
+cause the corresponding string to be added to the flow sequence.  The opposite
+is done when reading, and any unknown string values will result in an error. With
+the above schema, a sample valid YAML document is:
+
+.. 
+
+
+Custom Scalar
+-------------
+Sometimes for readability a scalar needs to be formatted in a custom way. For
+instance, your internal data structure may use an integer for time (seconds
+since some epoch), but in YAML it would be much nicer to express that integer
+in some time format (e.g. 4-May-2012 10:30pm).  YAML I/O has a way to support
+custom formatting and parsing of scalar types by specializing ScalarTraits<> on
+your data type.  When writing, YAML I/O will provide the native type and
+your specialization must format it onto the supplied stream.  When reading,
+YAML I/O will provide an llvm::StringRef containing the scalar and your
+specialization must convert that to your native data type.  An outline of a
+custom scalar type looks like:
+
+.. code-block:: c++
+
+   using llvm::yaml::ScalarTraits;
+   using llvm::yaml::IO;
+
+   template <>
+   struct ScalarTraits<MyCustomType> {
+     static void output(const MyCustomType &value, llvm::raw_ostream &out) {
+       out << value;  // do custom formatting here
+     }
+     static StringRef input(StringRef scalar, MyCustomType &value) {
+       // do custom parsing here.  Return the empty string on success,
+       // or an error message on failure.
+       return StringRef();
+     }
+   };
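+
+For a more concrete (and purely hypothetical) sketch, here is a ScalarTraits
+specialization for an invented Seconds wrapper type that is written in YAML
+with a trailing "s" suffix; the type, the suffix format, and the error
+messages are made up for this example:
+
+.. code-block:: c++
+
+   using llvm::yaml::ScalarTraits;
+
+   struct Seconds {
+     uint64_t value;
+   };
+
+   template <>
+   struct ScalarTraits<Seconds> {
+     static void output(const Seconds &secs, llvm::raw_ostream &out) {
+       out << secs.value << "s";                    // e.g. "90s"
+     }
+     static StringRef input(StringRef scalar, Seconds &secs) {
+       if ( scalar.empty() || scalar.back() != 's' )
+         return "expected a number followed by 's'";
+       unsigned long long n;
+       if ( scalar.substr(0, scalar.size() - 1).getAsInteger(10, n) )
+         return "not a valid number";
+       secs.value = n;
+       return StringRef();                          // empty string means success
+     }
+   };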
+
+
+Mappings
+========
+
+To be translated to or from a YAML mapping for your type T you must specialize
+llvm::yaml::MappingTraits on T and implement the "void mapping(IO &io, T&)"
+method. If your native data structures use pointers to a class everywhere,
+you can specialize on the class pointer.  Examples:
+
+.. code-block:: c++
+
+   using llvm::yaml::MappingTraits;
+   using llvm::yaml::IO;
+
+   // Example of struct Foo which is used by value
+   template <>
+   struct MappingTraits<Foo> {
+     static void mapping(IO &io, Foo &foo) {
+       io.mapOptional("size",      foo.size);
+       ...
+     }
+   };
+
+   // Example of struct Bar which is natively always a pointer
+   template <>
+   struct MappingTraits<Bar*> {
+     static void mapping(IO &io, Bar *&bar) {
+       io.mapOptional("size",    bar->size);
+       ...
+     }
+   };
+
+
+No Normalization
+----------------
+
+The mapping() method is responsible, if needed, for normalizing and
+denormalizing. In a simple case where the native data structure requires no
+normalization, the mapping method just uses mapOptional() or mapRequired() to
+bind the struct's fields to YAML key names.  For example:
+
+.. code-block:: c++
+
+   using llvm::yaml::MappingTraits;
+   using llvm::yaml::IO;
+
+   template <>
+   struct MappingTraits<Person> {
+     static void mapping(IO &io, Person &info) {
+       io.mapRequired("name",         info.name);
+       io.mapOptional("hat-size",     info.hatSize);
+     }
+   };
+
+
+Normalization
+-------------
+
+When [de]normalization is required, the mapping() method needs a way to access
+normalized values as fields.  To help with this, there is
+a template MappingNormalization<> which you can use to automatically
+do the normalization and denormalization.  The template is used to create
+a local variable in your mapping() method which contains the normalized keys.
+
+Suppose you have a native data type
+Polar which specifies a position in polar coordinates (distance, angle):
+
+.. code-block:: c++
+
+   struct Polar {
+     float distance;
+     float angle;
+   };
+
+but you've decided the normalized YAML form should be in x,y coordinates. That
+is, you want the YAML to look like:
+
+.. code-block:: yaml
+
+   x:   10.3
+   y:   -4.7
+
+You can support this by defining a MappingTraits that normalizes the polar
+coordinates to x,y coordinates when writing YAML and denormalizes x,y
+coordinates into polar when reading YAML.
+
+.. code-block:: c++
+
+   using llvm::yaml::MappingTraits;
+   using llvm::yaml::IO;
+
+   template <>
+   struct MappingTraits<Polar> {
+
+     class NormalizedPolar {
+     public:
+       NormalizedPolar(IO &io)
+         : x(0.0), y(0.0) {
+       }
+       NormalizedPolar(IO &, Polar &polar)
+         : x(polar.distance * cos(polar.angle)),
+           y(polar.distance * sin(polar.angle)) {
+       }
+       Polar denormalize(IO &) {
+         return Polar(sqrt(x*x + y*y), atan2(y, x));
+       }
+
+       float        x;
+       float        y;
+     };
+
+     static void mapping(IO &io, Polar &polar) {
+       MappingNormalization<NormalizedPolar, Polar> keys(io, polar);
+
+       io.mapRequired("x",    keys->x);
+       io.mapRequired("y",    keys->y);
+     }
+   };
+
+When writing YAML, the local variable "keys" will be a stack allocated
+instance of NormalizedPolar, constructed from the supplied polar object, which
+initializes its x and y fields.  The mapRequired() methods then write out the x
+and y values as key/value pairs.
+
+When reading YAML, the local variable "keys" will be a stack allocated instance
+of NormalizedPolar, constructed by the empty constructor.  The mapRequired()
+methods will find the matching key in the YAML document and fill in the x and y
+fields of the NormalizedPolar object keys.  At the end of the mapping() method,
+when the local keys variable goes out of scope, the denormalize() method will
+automatically be called to convert the read values back to polar coordinates,
+which are then assigned back to the second parameter to mapping().
+
+In some cases, the normalized class may be a subclass of the native type and
+could be returned by the denormalize() method, except that the temporary
+normalized instance is stack allocated.  In these cases, the utility template
+MappingNormalizationHeap<> can be used instead.  It is just like
+MappingNormalization<> except that it heap allocates the normalized object
+when reading YAML.  It never destroys the normalized object.  The denormalize()
+method can thus return "this".
+
+
+Default values
+--------------
+Within a mapping() method, calls to io.mapRequired() mean that that key is
+required to exist when parsing YAML documents; otherwise, YAML I/O will issue
+an error.
+
+On the other hand, keys registered with io.mapOptional() are allowed to not
+exist in the YAML document being read.  So what value is put in the field
+for those optional keys?
+There are two steps to how those optional fields are filled in. First, the
+second parameter to the mapping() method is a reference to a native class. That
+native class must have a default constructor. Whatever value the default
+constructor initially sets for an optional field will be that field's value.
+Second, the mapOptional() method has an optional third parameter.  If provided,
+it is the value that mapOptional() should set that field to if the YAML document
+does not have that key.
+
+There is one important difference between those two ways (default constructor
+and third parameter to mapOptional).  When YAML I/O generates a YAML document
+and the mapOptional() third parameter is used, the key/value pair is not
+written if the actual value being written is the same as (using ==) the
+default value.
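+
+To make the interaction concrete, consider this small hypothetical mapping
+(the Widget struct and key names are invented for the sketch): when reading,
+a document without a "count" key leaves count set to 1; when writing, the
+"count" key is omitted whenever count == 1.
+
+.. code-block:: c++
+
+   using llvm::yaml::MappingTraits;
+   using llvm::yaml::IO;
+
+   struct Widget {
+     StringRef  name;
+     uint32_t   count;
+   };
+
+   template <>
+   struct MappingTraits<Widget> {
+     static void mapping(IO &io, Widget &w) {
+       io.mapRequired("name",  w.name);
+       // 1 is the default value for "count"
+       io.mapOptional("count", w.count, uint32_t(1));
+     }
+   };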
+
+
+Order of Keys
+--------------
+
+When writing out a YAML document, the keys are written in the order that the
+calls to mapRequired()/mapOptional() are made in the mapping() method.  This
+gives you a chance to write the fields in an order that a human reader of
+the YAML document would find natural.  This may be different from the order
+of the fields in the native class.
+
+When reading in a YAML document, the keys in the document can be in any order,
+but they are processed in the order that the calls to mapRequired()/mapOptional()
+are made in the mapping() method.  That enables some interesting
+functionality.  For instance, if the first field bound is the cpu and the second
+field bound is flags, and the flags are cpu-specific, you can programmatically
+switch how the flags are converted to and from YAML based on the cpu.
+This works for both reading and writing. For example:
+
+.. code-block:: c++
+
+   using llvm::yaml::MappingTraits;
+   using llvm::yaml::IO;
+
+   struct Info {
+     CPUs        cpu;
+     uint32_t    flags;
+   };
+
+   template <>
+   struct MappingTraits<Info> {
+     static void mapping(IO &io, Info &info) {
+       io.mapRequired("cpu",       info.cpu);
+       // flags must come after cpu for this to work when reading yaml
+       if ( info.cpu == cpu_x86_64 )
+         io.mapRequired("flags",  *(My86_64Flags*)&info.flags);
+       else
+         io.mapRequired("flags",  *(My86Flags*)&info.flags);
+     }
+   };
+
+
+Sequence
+========
+
+To be translated to or from a YAML sequence for your type T you must specialize
+llvm::yaml::SequenceTraits on T and implement two methods:
+``size_t size(IO &io, T&)`` and
+``T::value_type& element(IO &io, T&, size_t index)``.  For example:
+
+.. code-block:: c++
+
+   template <>
+   struct SequenceTraits<MySeq> {
+     static size_t size(IO &io, MySeq &list) { ... }
+     static MySeqEl &element(IO &io, MySeq &list, size_t index) { ... }
+   };
+
+The size() method returns how many elements are currently in your sequence.
+The element() method returns a reference to the i'th element in the sequence.
+When parsing YAML, the element() method may be called with an index one bigger
+than the current size.  Your element() method should allocate space for one
+more element (using the default constructor if the element is a C++ object)
+and return a reference to that newly allocated space.
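+
+For instance, a SequenceTraits specialization for a hypothetical NumberList
+wrapper around std::vector might implement that grow-on-demand behavior like
+this (the type and field names are invented for the sketch):
+
+.. code-block:: c++
+
+   using llvm::yaml::SequenceTraits;
+   using llvm::yaml::IO;
+
+   struct NumberList {
+     std::vector<int32_t> values;
+   };
+
+   template <>
+   struct SequenceTraits<NumberList> {
+     static size_t size(IO &io, NumberList &list) {
+       return list.values.size();
+     }
+     static int32_t &element(IO &io, NumberList &list, size_t index) {
+       // When reading, index may be one past the current size, so grow
+       // the vector on demand before handing back a reference.
+       if ( index >= list.values.size() )
+         list.values.resize(index + 1);
+       return list.values[index];
+     }
+   };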
+
+
+Flow Sequence
+-------------
+A YAML "flow sequence" is a sequence that, when written to YAML, uses the
+inline notation (e.g. [ foo, bar ]).  To specify that a sequence type should
+be written in YAML as a flow sequence, your SequenceTraits specialization should
+add "static const bool flow = true;".  For instance:
+
+.. code-block:: c++
+
+   template <>
+   struct SequenceTraits<MyList> {
+     static size_t size(IO &io, MyList &list) { ... }
+     static MyListEl &element(IO &io, MyList &list, size_t index) { ... }
+
+     // The existence of this member causes YAML I/O to use a flow sequence
+     static const bool flow = true;
+   };
+
+With the above, if you use MyList as the data type in your native data
+structures, then when it is converted to YAML, a flow sequence of integers
+will be used (e.g. [ 10, -3, 4 ]).
+
+
+Utility Macros
+--------------
+Since a common source of sequences is std::vector<>, YAML I/O provides macros:
+LLVM_YAML_IS_SEQUENCE_VECTOR() and LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR() which
+can be used to easily specify SequenceTraits<> on a std::vector type.  YAML
+I/O does not partially specialize SequenceTraits on std::vector<> because that
+would force all vectors to be sequences.  An example use of the macros:
+
+.. code-block:: c++
+
+   // Given native data structures that use std::vector<MyType1> and
+   // std::vector<MyType2> as sequences:
+   LLVM_YAML_IS_SEQUENCE_VECTOR(MyType1)
+   LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(MyType2)
+
+
+
+Document List
+=============
+
+YAML allows you to define multiple "documents" in a single YAML file.  Each
+new document starts with a left aligned "---" token.  The end of all documents
+is denoted with a left aligned "..." token.  Many users of YAML will never
+have need for multiple documents.  The top level node in their YAML schema
+will be a mapping or sequence. For those cases, the following is not needed.
+But for cases where you do want multiple documents, you can specify a
+trait for your document list type.  The trait has the same methods as
+SequenceTraits but is named DocumentListTraits.  For example:
+
+.. code-block:: c++
+
+   template <>
+   struct DocumentListTraits<MyDocList> {
+     static size_t size(IO &io, MyDocList &list) { ... }
+     static MyDocType &element(IO &io, MyDocList &list, size_t index) { ... }
+   };
+
+
+User Context Data
+=================
+When an llvm::yaml::Input or llvm::yaml::Output object is created, its
+constructor takes an optional "context" parameter.  This is a pointer to
+whatever state information you might need.
+
+For instance, in a previous example we showed how the conversion type for a
+flags field could be determined at runtime based on the value of another field
+in the mapping. But what if an inner mapping needs to know some field value
+of an outer mapping?  That is where the "context" parameter comes in. You
+can set values in the context in the outer map's mapping() method and
+retrieve those values in the inner map's mapping() method.
+
+The context value is just a void*.  All of your traits which use the context
+and operate on your native data types need to agree on what the context value
+actually is.  It could be a pointer to an object or struct which your various
+traits use to share context-sensitive information.
+
+
+Output
+======
+
+The llvm::yaml::Output class is used to generate a YAML document from your
+in-memory data structures, using traits defined on your data types.
+To instantiate an Output object you need an llvm::raw_ostream, and optionally
+a context pointer:
+
+.. code-block:: c++
+
+      class Output : public IO {
+      public:
+        Output(llvm::raw_ostream &, void *context=NULL);
+
+Once you have an Output object, you can use the C++ stream operator on it
+to write your native data as YAML. One thing to recall is that a YAML file
+can contain multiple "documents".  If the top level data structure you are
+streaming as YAML is a mapping, scalar, or sequence, then Output assumes you
+are generating one document and wraps the output with a leading "``---``"
+and trailing "``...``".
+
+.. code-block:: c++
+
+   using llvm::yaml::Output;
+
+   void dumpMyMapDoc(const MyMapType &info) {
+     Output yout(llvm::outs());
+     yout << info;
+   }
+
+The above could produce output like:
+
+.. code-block:: yaml
+
+   ---
+   name:      Tom
+   hat-size:  7
+   ...
+
+On the other hand, if the top level data structure you are streaming as YAML
+has a DocumentListTraits specialization, then Output walks through each element
+of your DocumentList and generates a "---" before the start of each element
+and ends with a "...".
+
+.. code-block:: c++
+
+   using llvm::yaml::Output;
+
+   void dumpMyDocList(const MyDocListType &docList) {
+     Output yout(llvm::outs());
+     yout << docList;
+   }
+
+The above could produce output like:
+
+.. code-block:: yaml
+
+   ---
+   name:      Tom
+   hat-size:  7
+   ---
+   name:      Tom
+   shoe-size: 11
+   ...
+ +Input +===== + +The llvm::yaml::Input class is used to parse YAML document(s) into your native +data structures. To instantiate an Input +object you need a StringRef to the entire YAML file, and optionally a context +pointer: + +.. code-block:: c++ + + class Input : public IO { + public: + Input(StringRef inputContent, void *context=NULL); + +Once you have an Input object, you can use the C++ stream operator to read +the document(s). If you expect there might be multiple YAML documents in +one file, you'll need to specialize DocumentListTraits on a list of your +document type and stream in that document list type. Otherwise you can +just stream in the document type. Also, you can check if there was +any syntax errors in the YAML be calling the error() method on the Input +object. For example: + +.. code-block:: c++ + + // Reading a single document + using llvm::yaml::Input; + + Input yin(mb.getBuffer()); + + // Parse the YAML file + MyDocType theDoc; + yin >> theDoc; + + // Check for error + if ( yin.error() ) + return; + + +.. code-block:: c++ + + // Reading multiple documents in one file + using llvm::yaml::Input; + + LLVM_YAML_IS_DOCUMENT_LIST_VECTOR(std::vector<MyDocType>) + + Input yin(mb.getBuffer()); + + // Parse the YAML file + std::vector<MyDocType> theDocList; + yin >> theDocList; + + // Check for error + if ( yin.error() ) + return; + + diff --git a/docs/conf.py b/docs/conf.py index 919bb3bc9d..0ac3b7836b 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -40,7 +40,7 @@ master_doc = 'index' # General information about the project. project = u'LLVM' -copyright = u'2012, LLVM Project' +copyright = u'2003-2013, LLVM Project' # The version info for the project you're documenting, acts as replacement for # |version| and |release|, also used in various other places throughout the @@ -95,7 +95,7 @@ html_theme = 'llvm-theme' # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. -#html_theme_options = {} +html_theme_options = { "nosidebar": True } # Add any paths that contain custom themes here, relative to this directory. html_theme_path = ["_themes"] diff --git a/docs/design_and_overview.rst b/docs/design_and_overview.rst deleted file mode 100644 index 22e8125bb6..0000000000 --- a/docs/design_and_overview.rst +++ /dev/null @@ -1,37 +0,0 @@ -.. _design_and_overview: - -LLVM Design & Overview -====================== - -.. toctree:: - :hidden: - - LangRef - GetElementPtr - -* :doc:`LangRef` - - Defines the LLVM intermediate representation. - -* `Introduction to the LLVM Compiler <http://llvm.org/pubs/2008-10-04-ACAT-LLVM-Intro.html>`_ - - Presentation providing a users introduction to LLVM. - -* `Intro to LLVM <http://www.aosabook.org/en/llvm.html>`_ - - Book chapter providing a compiler hacker's introduction to LLVM. - -* `LLVM: A Compilation Framework forLifelong Program Analysis & Transformation - <http://llvm.org/pubs/2004-01-30-CGO-LLVM.html>`_ - - Design overview. - -* `LLVM: An Infrastructure for Multi-Stage Optimization - <http://llvm.org/pubs/2002-12-LattnerMSThesis.html>`_ - - More details (quite old now). - -* :ref:`gep` - - Answers to some very frequent questions about LLVM's most frequently - misunderstood instruction. diff --git a/docs/development_process.rst b/docs/development_process.rst deleted file mode 100644 index ecd4c6a616..0000000000 --- a/docs/development_process.rst +++ /dev/null @@ -1,32 +0,0 @@ -.. 
_development_process: - -Development Process Documentation -================================= - -.. toctree:: - :hidden: - - MakefileGuide - Projects - LLVMBuild - HowToReleaseLLVM - -* :ref:`projects` - - How-to guide and templates for new projects that *use* the LLVM - infrastructure. The templates (directory organization, Makefiles, and test - tree) allow the project code to be located outside (or inside) the ``llvm/`` - tree, while using LLVM header files and libraries. - -* :doc:`LLVMBuild` - - Describes the LLVMBuild organization and files used by LLVM to specify - component descriptions. - -* :ref:`makefile_guide` - - Describes how the LLVM makefiles work and how to use them. - -* :doc:`HowToReleaseLLVM` - - This is a guide to preparing LLVM releases. Most developers can ignore it. diff --git a/docs/doxygen.footer b/docs/doxygen.footer index c492e7df6c..95d5434f67 100644 --- a/docs/doxygen.footer +++ b/docs/doxygen.footer @@ -3,7 +3,7 @@ Generated on $datetime for <a href="http://llvm.org/">$projectname</a> by <a href="http://www.doxygen.org"><img src="doxygen.png" alt="Doxygen" align="middle" border="0"/>$doxygenversion</a><br> -Copyright © 2003-2012 University of Illinois at Urbana-Champaign. +Copyright © 2003-2013 University of Illinois at Urbana-Champaign. All Rights Reserved.</p> <hr> diff --git a/docs/gcc-loops.png b/docs/gcc-loops.png Binary files differnew file mode 100644 index 0000000000..8923a31153 --- /dev/null +++ b/docs/gcc-loops.png diff --git a/docs/index.rst b/docs/index.rst index d406b52574..8f22ef2a77 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -1,5 +1,3 @@ -.. _contents: - Overview ======== @@ -15,54 +13,382 @@ research projects. Similarly, documentation is broken down into several high-level groupings targeted at different audiences: -* **Design & Overview** +LLVM Design & Overview +====================== + +Several introductory papers and presentations. + +.. toctree:: + :hidden: + + LangRef + GetElementPtr + +:doc:`LangRef` + Defines the LLVM intermediate representation. + +`Introduction to the LLVM Compiler`__ + Presentation providing a users introduction to LLVM. + + .. __: http://llvm.org/pubs/2008-10-04-ACAT-LLVM-Intro.html + +`Intro to LLVM`__ + Book chapter providing a compiler hacker's introduction to LLVM. + + .. __: http://www.aosabook.org/en/llvm.html + + +`LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation`__ + Design overview. + + .. __: http://llvm.org/pubs/2004-01-30-CGO-LLVM.html + +`LLVM: An Infrastructure for Multi-Stage Optimization`__ + More details (quite old now). + + .. __: http://llvm.org/pubs/2002-12-LattnerMSThesis.html + +:doc:`GetElementPtr` + Answers to some very frequent questions about LLVM's most frequently + misunderstood instruction. + +`Publications mentioning LLVM <http://llvm.org/pubs>`_ + .. + +User Guides +=========== + +For those new to the LLVM system. + +NOTE: If you are a user who is only interested in using LLVM-based +compilers, you should look into `Clang <http://clang.llvm.org>`_ or +`DragonEgg <http://dragonegg.llvm.org>`_ instead. The documentation here is +intended for users who have a need to work with the intermediate LLVM +representation. + +.. 
toctree:: + :hidden: + + CMake + HowToBuildOnARM + CommandGuide/index + DeveloperPolicy + GettingStarted + GettingStartedVS + FAQ + Lexicon + HowToAddABuilder + yaml2obj + HowToSubmitABug + SphinxQuickstartTemplate + Phabricator + TestingGuide + tutorial/index + ReleaseNotes + Passes + YamlIO + +:doc:`GettingStarted` + Discusses how to get up and running quickly with the LLVM infrastructure. + Everything from unpacking and compilation of the distribution to execution + of some tools. + +:doc:`CMake` + An addendum to the main Getting Started guide for those using the `CMake + build system <http://www.cmake.org>`_. + +:doc:`HowToBuildOnARM` + Notes on building and testing LLVM/Clang on ARM. + +:doc:`GettingStartedVS` + An addendum to the main Getting Started guide for those using Visual Studio + on Windows. + +:doc:`tutorial/index` + Tutorials about using LLVM. Includes a tutorial about making a custom + language with LLVM. + +:doc:`DeveloperPolicy` + The LLVM project's policy towards developers and their contributions. + +:doc:`LLVM Command Guide <CommandGuide/index>` + A reference manual for the LLVM command line utilities ("man" pages for LLVM + tools). + +:doc:`Passes` + A list of optimizations and analyses implemented in LLVM. + +:doc:`FAQ` + A list of common questions and problems and their solutions. + +:doc:`Release notes for the current release <ReleaseNotes>` + This describes new features, known bugs, and other limitations. + +:doc:`HowToSubmitABug` + Instructions for properly submitting information about any bugs you run into + in the LLVM system. + +:doc:`SphinxQuickstartTemplate` + A template + tutorial for writing new Sphinx documentation. It is meant + to be read in source form. + +:doc:`LLVM Testing Infrastructure Guide <TestingGuide>` + A reference manual for using the LLVM testing infrastructure. + +`How to build the C, C++, ObjC, and ObjC++ front end`__ + Instructions for building the clang front-end from source. + + .. __: http://clang.llvm.org/get_started.html + +:doc:`Lexicon` + Definition of acronyms, terms and concepts used in LLVM. + +:doc:`HowToAddABuilder` + Instructions for adding new builder to LLVM buildbot master. + +:doc:`YamlIO` + A reference guide for using LLVM's YAML I/O library. + +IRC +=== + +Users and developers of the LLVM project (including subprojects such as Clang) +can be found in #llvm on `irc.oftc.net <irc://irc.oftc.net/llvm>`_. + +This channel has several bots. + +* Buildbot reporters + + * llvmbb - Bot for the main LLVM buildbot master. + http://lab.llvm.org:8011/console + * bb-chapuni - An individually run buildbot master. http://bb.pgr.jp/console + * smooshlab - Apple's internal buildbot master. + +* robot - Bugzilla linker. %bug <number> - Several introductory papers and presentations are available at - :ref:`design_and_overview`. +* clang-bot - A `geordi <http://www.eelis.net/geordi/>`_ instance running + near-trunk clang instead of gcc. -* **Publications** +Programming Documentation +========================= - The list of `publications <http://llvm.org/pubs>`_ based on LLVM. +For developers of applications which use LLVM as a library. -* **User Guides** +.. toctree:: + :hidden: + + Atomics + CodingStandards + CommandLine + CompilerWriterInfo + ExtendingLLVM + HowToSetUpLLVMStyleRTTI + ProgrammersManual + +:doc:`LLVM Language Reference Manual <LangRef>` + Defines the LLVM intermediate representation and the assembly form of the + different nodes. - Those new to the LLVM system should first visit the :ref:`userguides`. 
+:doc:`Atomics` + Information about LLVM's concurrency model. - NOTE: If you are a user who is only interested in using LLVM-based - compilers, you should look into `Clang <http://clang.llvm.org>`_ or - `DragonEgg <http://dragonegg.llvm.org>`_ instead. The documentation here is - intended for users who have a need to work with the intermediate LLVM - representation. +:doc:`ProgrammersManual` + Introduction to the general layout of the LLVM sourcebase, important classes + and APIs, and some tips & tricks. -* **API Clients** +:doc:`CommandLine` + Provides information on using the command line parsing library. - Developers of applications which use LLVM as a library should visit the - :ref:`programming`. +:doc:`CodingStandards` + Details the LLVM coding standards and provides useful information on writing + efficient C++ code. -* **Subsystems** +:doc:`HowToSetUpLLVMStyleRTTI` + How to make ``isa<>``, ``dyn_cast<>``, etc. available for clients of your + class hierarchy. - API clients and LLVM developers may be interested in the - :ref:`subsystems` documentation. +:doc:`ExtendingLLVM` + Look here to see how to add instructions and intrinsics to LLVM. -* **Development Process** +`Doxygen generated documentation <http://llvm.org/doxygen/>`_ + (`classes <http://llvm.org/doxygen/inherits.html>`_) + (`tarball <http://llvm.org/doxygen/doxygen.tar.gz>`_) - Additional documentation on the LLVM project can be found at - :ref:`development_process`. +`ViewVC Repository Browser <http://llvm.org/viewvc/>`_ + .. -* **Mailing Lists** +:doc:`CompilerWriterInfo` + A list of helpful links for compiler writers. - For more information, consider consulting the LLVM :ref:`mailing_lists`. +Subsystem Documentation +======================= + +For API clients and LLVM developers. .. toctree:: - :maxdepth: 2 - - design_and_overview - userguides - programming - subsystems - development_process - mailing_lists - + :hidden: + + AliasAnalysis + BitCodeFormat + BranchWeightMetadata + Bugpoint + CodeGenerator + ExceptionHandling + LinkTimeOptimization + SegmentedStacks + TableGenFundamentals + DebuggingJITedCode + GoldPlugin + MarkedUpDisassembly + SystemLibrary + SourceLevelDebugging + Vectorizers + WritingAnLLVMBackend + GarbageCollection + WritingAnLLVMPass + TableGen/LangRef + HowToUseAttributes + +:doc:`WritingAnLLVMPass` + Information on how to write LLVM transformations and analyses. + +:doc:`WritingAnLLVMBackend` + Information on how to write LLVM backends for machine targets. + +:doc:`CodeGenerator` + The design and implementation of the LLVM code generator. Useful if you are + working on retargetting LLVM to a new architecture, designing a new codegen + pass, or enhancing existing components. + +:doc:`TableGenFundamentals` + Describes the TableGen tool, which is used heavily by the LLVM code + generator. + +:doc:`AliasAnalysis` + Information on how to write a new alias analysis implementation or how to + use existing analyses. + +:doc:`GarbageCollection` + The interfaces source-language compilers should use for compiling GC'd + programs. + +:doc:`Source Level Debugging with LLVM <SourceLevelDebugging>` + This document describes the design and philosophy behind the LLVM + source-level debugger. + +:doc:`Vectorizers` + This document describes the current status of vectorization in LLVM. + +:doc:`ExceptionHandling` + This document describes the design and implementation of exception handling + in LLVM. + +:doc:`Bugpoint` + Automatic bug finder and test-case reducer description and usage + information. 
+ +:doc:`BitCodeFormat` + This describes the file format and encoding used for LLVM "bc" files. + +:doc:`System Library <SystemLibrary>` + This document describes the LLVM System Library (``lib/System``) and + how to keep LLVM source code portable + +:doc:`LinkTimeOptimization` + This document describes the interface between LLVM intermodular optimizer + and the linker and its design + +:doc:`GoldPlugin` + How to build your programs with link-time optimization on Linux. + +:doc:`DebuggingJITedCode` + How to debug JITed code with GDB. + +:doc:`BranchWeightMetadata` + Provides information about Branch Prediction Information. + +:doc:`SegmentedStacks` + This document describes segmented stacks and how they are used in LLVM. + +:doc:`MarkedUpDisassembly` + This document describes the optional rich disassembly output syntax. + +:doc:`HowToUseAttributes` + Answers some questions about the new Attributes infrastructure. + +Development Process Documentation +================================= + +Information about LLVM's development process. + +.. toctree:: + :hidden: + + MakefileGuide + Projects + LLVMBuild + HowToReleaseLLVM + Packaging + +:doc:`Projects` + How-to guide and templates for new projects that *use* the LLVM + infrastructure. The templates (directory organization, Makefiles, and test + tree) allow the project code to be located outside (or inside) the ``llvm/`` + tree, while using LLVM header files and libraries. + +:doc:`LLVMBuild` + Describes the LLVMBuild organization and files used by LLVM to specify + component descriptions. + +:doc:`MakefileGuide` + Describes how the LLVM makefiles work and how to use them. + +:doc:`HowToReleaseLLVM` + This is a guide to preparing LLVM releases. Most developers can ignore it. + +:doc:`Packaging` + Advice on packaging LLVM into a distribution. + +Mailing Lists +============= + +If you can't find what you need in these docs, try consulting the mailing +lists. + +`LLVM Announcements List`__ + This is a low volume list that provides important announcements regarding + LLVM. It gets email about once a month. + + .. __: http://lists.cs.uiuc.edu/mailman/listinfo/llvm-announce + +`Developer's List`__ + This list is for people who want to be included in technical discussions of + LLVM. People post to this list when they have questions about writing code + for or using the LLVM tools. It is relatively low volume. + + .. __: http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev + +`Bugs & Patches Archive`__ + This list gets emailed every time a bug is opened and closed, and when people + submit patches to be included in LLVM. It is higher volume than the LLVMdev + list. + + .. __: http://lists.cs.uiuc.edu/pipermail/llvmbugs/ + +`Commits Archive`__ + This list contains all commit messages that are made when LLVM developers + commit code changes to the repository. It is useful for those who want to + stay on the bleeding edge of LLVM development. This list is very high volume. + + .. __: http://lists.cs.uiuc.edu/pipermail/llvm-commits/ + +`Test Results Archive`__ + A message is automatically sent to this list by every active nightly tester + when it completes. As such, this list gets email several times each day, + making it a high volume list. + + .. 
__: http://lists.cs.uiuc.edu/pipermail/llvm-testresults/ + Indices and tables ================== diff --git a/docs/linpack-pc.png b/docs/linpack-pc.png Binary files differnew file mode 100644 index 0000000000..bbbee7d67e --- /dev/null +++ b/docs/linpack-pc.png diff --git a/docs/mailing_lists.rst b/docs/mailing_lists.rst deleted file mode 100644 index 106f1da48f..0000000000 --- a/docs/mailing_lists.rst +++ /dev/null @@ -1,35 +0,0 @@ -.. _mailing_lists: - -Mailing Lists -============= - - * `LLVM Announcements List - <http://lists.cs.uiuc.edu/mailman/listinfo/llvm-announce>`_ - - This is a low volume list that provides important announcements regarding - LLVM. It gets email about once a month. - - * `Developer's List <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev>`_ - - This list is for people who want to be included in technical discussions of - LLVM. People post to this list when they have questions about writing code - for or using the LLVM tools. It is relatively low volume. - - * `Bugs & Patches Archive <http://lists.cs.uiuc.edu/pipermail/llvmbugs/>`_ - - This list gets emailed every time a bug is opened and closed, and when people - submit patches to be included in LLVM. It is higher volume than the LLVMdev - list. - - * `Commits Archive <http://lists.cs.uiuc.edu/pipermail/llvm-commits/>`_ - - This list contains all commit messages that are made when LLVM developers - commit code changes to the repository. It is useful for those who want to - stay on the bleeding edge of LLVM development. This list is very high volume. - - * `Test Results Archive - <http://lists.cs.uiuc.edu/pipermail/llvm-testresults/>`_ - - A message is automatically sent to this list by every active nightly tester - when it completes. As such, this list gets email several times each day, - making it a high volume list. diff --git a/docs/programming.rst b/docs/programming.rst deleted file mode 100644 index 3fea6ed427..0000000000 --- a/docs/programming.rst +++ /dev/null @@ -1,58 +0,0 @@ -.. _programming: - -Programming Documentation -========================= - -.. toctree:: - :hidden: - - Atomics - CodingStandards - CommandLine - CompilerWriterInfo - ExtendingLLVM - HowToSetUpLLVMStyleRTTI - ProgrammersManual - -* `LLVM Language Reference Manual <LangRef.html>`_ - - Defines the LLVM intermediate representation and the assembly form of the - different nodes. - -* :ref:`atomics` - - Information about LLVM's concurrency model. - -* :doc:`ProgrammersManual` - - Introduction to the general layout of the LLVM sourcebase, important classes - and APIs, and some tips & tricks. - -* :ref:`commandline` - - Provides information on using the command line parsing library. - -* :ref:`coding_standards` - - Details the LLVM coding standards and provides useful information on writing - efficient C++ code. - -* :doc:`HowToSetUpLLVMStyleRTTI` - - How to make ``isa<>``, ``dyn_cast<>``, etc. available for clients of your - class hierarchy. - -* :ref:`extending_llvm` - - Look here to see how to add instructions and intrinsics to LLVM. - -* `Doxygen generated documentation <http://llvm.org/doxygen/>`_ - - (`classes <http://llvm.org/doxygen/inherits.html>`_) - (`tarball <http://llvm.org/doxygen/doxygen.tar.gz>`_) - -* `ViewVC Repository Browser <http://llvm.org/viewvc/>`_ - -* :ref:`compiler_writer_info` - - A list of helpful links for compiler writers. diff --git a/docs/subsystems.rst b/docs/subsystems.rst deleted file mode 100644 index 275955be6e..0000000000 --- a/docs/subsystems.rst +++ /dev/null @@ -1,114 +0,0 @@ -.. 
_subsystems: - -Subsystem Documentation -======================= - -.. toctree:: - :hidden: - - AliasAnalysis - BitCodeFormat - BranchWeightMetadata - Bugpoint - CodeGenerator - ExceptionHandling - LinkTimeOptimization - SegmentedStacks - TableGenFundamentals - DebuggingJITedCode - GoldPlugin - MarkedUpDisassembly - HowToUseInstrMappings - SystemLibrary - SourceLevelDebugging - WritingAnLLVMBackend - GarbageCollection - -.. FIXME: once LangRef is Sphinxified, HowToUseInstrMappings should be put - under LangRef's toctree instead of this page's toctree. - -* `Writing an LLVM Pass <WritingAnLLVMPass.html>`_ - - Information on how to write LLVM transformations and analyses. - -* :doc:`WritingAnLLVMBackend` - - Information on how to write LLVM backends for machine targets. - -* :ref:`code_generator` - - The design and implementation of the LLVM code generator. Useful if you are - working on retargetting LLVM to a new architecture, designing a new codegen - pass, or enhancing existing components. - -* :ref:`tablegen` - - Describes the TableGen tool, which is used heavily by the LLVM code - generator. - -* :ref:`alias_analysis` - - Information on how to write a new alias analysis implementation or how to - use existing analyses. - -* :doc:`GarbageCollection` - - The interfaces source-language compilers should use for compiling GC'd - programs. - -* :doc:`Source Level Debugging with LLVM <SourceLevelDebugging>` - - This document describes the design and philosophy behind the LLVM - source-level debugger. - -* :ref:`exception_handling` - - This document describes the design and implementation of exception handling - in LLVM. - -* :ref:`bugpoint` - - Automatic bug finder and test-case reducer description and usage - information. - -* :ref:`bitcode_format` - - This describes the file format and encoding used for LLVM "bc" files. - -* :doc:`System Library <SystemLibrary>` - - This document describes the LLVM System Library (``lib/System``) and - how to keep LLVM source code portable - -* :ref:`lto` - - This document describes the interface between LLVM intermodular optimizer - and the linker and its design - -* :ref:`gold-plugin` - - How to build your programs with link-time optimization on Linux. - -* :ref:`debugging-jited-code` - - How to debug JITed code with GDB. - -* :ref:`branch_weight` - - Provides information about Branch Prediction Information. - -* :ref:`segmented_stacks` - - This document describes segmented stacks and how they are used in LLVM. - -* `Howto: Implementing LLVM Integrated Assembler`_ - - A simple guide for how to implement an LLVM integrated assembler for an - architecture. - -.. _`Howto: Implementing LLVM Integrated Assembler`: http://www.embecosm.com/download/ean10.html - -* :ref:`marked_up_disassembly` - - This document describes the optional rich disassembly output syntax. - diff --git a/docs/tutorial/LangImpl1.rst b/docs/tutorial/LangImpl1.rst index eb84e4c923..aa619cf19f 100644 --- a/docs/tutorial/LangImpl1.rst +++ b/docs/tutorial/LangImpl1.rst @@ -5,8 +5,6 @@ Kaleidoscope: Tutorial Introduction and the Lexer .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ - Tutorial Introduction ===================== diff --git a/docs/tutorial/LangImpl2.rst b/docs/tutorial/LangImpl2.rst index 0d62894a24..7262afa8f3 100644 --- a/docs/tutorial/LangImpl2.rst +++ b/docs/tutorial/LangImpl2.rst @@ -5,8 +5,6 @@ Kaleidoscope: Implementing a Parser and AST .. 
contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ - Chapter 2 Introduction ====================== diff --git a/docs/tutorial/LangImpl3.rst b/docs/tutorial/LangImpl3.rst index 01935a443b..9d5f90839e 100644 --- a/docs/tutorial/LangImpl3.rst +++ b/docs/tutorial/LangImpl3.rst @@ -5,8 +5,6 @@ Kaleidoscope: Code generation to LLVM IR .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ - Chapter 3 Introduction ====================== diff --git a/docs/tutorial/LangImpl4.rst b/docs/tutorial/LangImpl4.rst index 8484c57f9d..96c06d124e 100644 --- a/docs/tutorial/LangImpl4.rst +++ b/docs/tutorial/LangImpl4.rst @@ -5,8 +5,6 @@ Kaleidoscope: Adding JIT and Optimizer Support .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ - Chapter 4 Introduction ====================== diff --git a/docs/tutorial/LangImpl5.rst b/docs/tutorial/LangImpl5.rst index 8405e1a917..80d5f37bc4 100644 --- a/docs/tutorial/LangImpl5.rst +++ b/docs/tutorial/LangImpl5.rst @@ -5,8 +5,6 @@ Kaleidoscope: Extending the Language: Control Flow .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ - Chapter 5 Introduction ====================== diff --git a/docs/tutorial/LangImpl6.rst b/docs/tutorial/LangImpl6.rst index 30f4e90d03..a5a60bffe0 100644 --- a/docs/tutorial/LangImpl6.rst +++ b/docs/tutorial/LangImpl6.rst @@ -5,8 +5,6 @@ Kaleidoscope: Extending the Language: User-defined Operators .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ - Chapter 6 Introduction ====================== diff --git a/docs/tutorial/LangImpl7.rst b/docs/tutorial/LangImpl7.rst index 602dcb5f6f..6dde2fe41d 100644 --- a/docs/tutorial/LangImpl7.rst +++ b/docs/tutorial/LangImpl7.rst @@ -5,8 +5,6 @@ Kaleidoscope: Extending the Language: Mutable Variables .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ - Chapter 7 Introduction ====================== diff --git a/docs/tutorial/LangImpl8.rst b/docs/tutorial/LangImpl8.rst index 4058991f19..3534b2e0c9 100644 --- a/docs/tutorial/LangImpl8.rst +++ b/docs/tutorial/LangImpl8.rst @@ -5,8 +5,6 @@ Kaleidoscope: Conclusion and other useful LLVM tidbits .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ - Tutorial Conclusion =================== diff --git a/docs/tutorial/OCamlLangImpl1.rst b/docs/tutorial/OCamlLangImpl1.rst index daa482507d..94ca3a5aa4 100644 --- a/docs/tutorial/OCamlLangImpl1.rst +++ b/docs/tutorial/OCamlLangImpl1.rst @@ -5,9 +5,6 @@ Kaleidoscope: Tutorial Introduction and the Lexer .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Erick -Tryzelaar <mailto:idadesub@users.sourceforge.net>`_ - Tutorial Introduction ===================== diff --git a/docs/tutorial/OCamlLangImpl2.rst b/docs/tutorial/OCamlLangImpl2.rst index 07490e1f67..83a22ab22d 100644 --- a/docs/tutorial/OCamlLangImpl2.rst +++ b/docs/tutorial/OCamlLangImpl2.rst @@ -5,9 +5,6 @@ Kaleidoscope: Implementing a Parser and AST .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Erick -Tryzelaar <mailto:idadesub@users.sourceforge.net>`_ - Chapter 2 Introduction ====================== diff --git a/docs/tutorial/OCamlLangImpl3.rst b/docs/tutorial/OCamlLangImpl3.rst index d2a47b486c..fd9f0e5cd3 100644 --- a/docs/tutorial/OCamlLangImpl3.rst +++ b/docs/tutorial/OCamlLangImpl3.rst @@ -5,9 +5,6 @@ Kaleidoscope: Code generation to LLVM IR .. 
contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Erick -Tryzelaar <mailto:idadesub@users.sourceforge.net>`_ - Chapter 3 Introduction ====================== diff --git a/docs/tutorial/OCamlLangImpl4.rst b/docs/tutorial/OCamlLangImpl4.rst index 865a03dfb7..b13b2afa88 100644 --- a/docs/tutorial/OCamlLangImpl4.rst +++ b/docs/tutorial/OCamlLangImpl4.rst @@ -5,9 +5,6 @@ Kaleidoscope: Adding JIT and Optimizer Support .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Erick -Tryzelaar <mailto:idadesub@users.sourceforge.net>`_ - Chapter 4 Introduction ====================== diff --git a/docs/tutorial/OCamlLangImpl5.rst b/docs/tutorial/OCamlLangImpl5.rst index 203fb6f73b..b8ae3c58dd 100644 --- a/docs/tutorial/OCamlLangImpl5.rst +++ b/docs/tutorial/OCamlLangImpl5.rst @@ -5,9 +5,6 @@ Kaleidoscope: Extending the Language: Control Flow .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Erick -Tryzelaar <mailto:idadesub@users.sourceforge.net>`_ - Chapter 5 Introduction ====================== diff --git a/docs/tutorial/OCamlLangImpl6.rst b/docs/tutorial/OCamlLangImpl6.rst index 7665647736..36bffa8e96 100644 --- a/docs/tutorial/OCamlLangImpl6.rst +++ b/docs/tutorial/OCamlLangImpl6.rst @@ -5,9 +5,6 @@ Kaleidoscope: Extending the Language: User-defined Operators .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Erick -Tryzelaar <mailto:idadesub@users.sourceforge.net>`_ - Chapter 6 Introduction ====================== diff --git a/docs/tutorial/OCamlLangImpl7.rst b/docs/tutorial/OCamlLangImpl7.rst index 07da3a8ff9..cfb49312c5 100644 --- a/docs/tutorial/OCamlLangImpl7.rst +++ b/docs/tutorial/OCamlLangImpl7.rst @@ -5,9 +5,6 @@ Kaleidoscope: Extending the Language: Mutable Variables .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Erick -Tryzelaar <mailto:idadesub@users.sourceforge.net>`_ - Chapter 7 Introduction ====================== diff --git a/docs/tutorial/OCamlLangImpl8.rst b/docs/tutorial/OCamlLangImpl8.rst index 4058991f19..3534b2e0c9 100644 --- a/docs/tutorial/OCamlLangImpl8.rst +++ b/docs/tutorial/OCamlLangImpl8.rst @@ -5,8 +5,6 @@ Kaleidoscope: Conclusion and other useful LLVM tidbits .. contents:: :local: -Written by `Chris Lattner <mailto:sabre@nondot.org>`_ - Tutorial Conclusion =================== diff --git a/docs/tutorial/index.rst b/docs/tutorial/index.rst index 4658f9619e..69a9aee096 100644 --- a/docs/tutorial/index.rst +++ b/docs/tutorial/index.rst @@ -22,6 +22,19 @@ Kaleidoscope: Implementing a Language with LLVM in Objective Caml OCamlLangImpl* +External Tutorials +================== + +`Tutorial: Creating an LLVM Backend for the Cpu0 Architecture <http://jonathan2251.github.com/lbd/>`_ + A step-by-step tutorial for developing an LLVM backend. Under + active development at `<https://github.com/Jonathan2251/lbd>`_ (please + contribute!). + +`Howto: Implementing LLVM Integrated Assembler`_ + A simple guide for how to implement an LLVM integrated assembler for an + architecture. + +.. _`Howto: Implementing LLVM Integrated Assembler`: http://www.embecosm.com/download/ean10.html Advanced Topics =============== diff --git a/docs/userguides.rst b/docs/userguides.rst deleted file mode 100644 index cfb6dbeb5e..0000000000 --- a/docs/userguides.rst +++ /dev/null @@ -1,106 +0,0 @@ -.. _userguides: - -User Guides -=========== - -.. 
toctree:: - :hidden: - - CMake - HowToBuildOnARM - CommandGuide/index - DeveloperPolicy - GettingStarted - GettingStartedVS - FAQ - Lexicon - Packaging - HowToAddABuilder - yaml2obj - HowToSubmitABug - SphinxQuickstartTemplate - Phabricator - TestingGuide - tutorial/index - ReleaseNotes - -* :ref:`getting_started` - - Discusses how to get up and running quickly with the LLVM infrastructure. - Everything from unpacking and compilation of the distribution to execution - of some tools. - -* :ref:`building-with-cmake` - - An addendum to the main Getting Started guide for those using the `CMake - build system <http://www.cmake.org>`_. - -* :ref:`how_to_build_on_arm` - - Notes on building and testing LLVM/Clang on ARM. - -* :doc:`GettingStartedVS` - - An addendum to the main Getting Started guide for those using Visual Studio - on Windows. - -* :doc:`tutorial/index` - - A walk through the process of using LLVM for a custom language, and the - facilities LLVM offers in tutorial form. - -* :ref:`developer_policy` - - The LLVM project's policy towards developers and their contributions. - -* :ref:`LLVM Command Guide <commands>` - - A reference manual for the LLVM command line utilities ("man" pages for LLVM - tools). - -* `LLVM's Analysis and Transform Passes <Passes.html>`_ - - A list of optimizations and analyses implemented in LLVM. - -* :ref:`faq` - - A list of common questions and problems and their solutions. - -* :doc:`Release notes for the current release <ReleaseNotes>` - - This describes new features, known bugs, and other limitations. - -* :ref:`how-to-submit-a-bug-report` - - Instructions for properly submitting information about any bugs you run into - in the LLVM system. -* :doc:`SphinxQuickstartTemplate` - - A template + tutorial for writing new Sphinx documentation. It is meant - to be read in source form. - -* :doc:`LLVM Testing Infrastructure Guide <TestingGuide>` - - A reference manual for using the LLVM testing infrastructure. - -* `How to build the C, C++, ObjC, and ObjC++ front end <http://clang.llvm.org/get_started.html>`_ - - Instructions for building the clang front-end from source. - -* :ref:`packaging` - - Advice on packaging LLVM into a distribution. - -* :ref:`lexicon` - - Definition of acronyms, terms and concepts used in LLVM. - -* :ref:`how_to_add_a_builder` - - Instructions for adding new builder to LLVM buildbot master. - -* **IRC** -- You can probably find help on the unofficial LLVM IRC. - - We often are on irc.oftc.net in the #llvm channel. If you are using the - mozilla browser, and have chatzilla installed, you can `join #llvm on - irc.oftc.net <irc://irc.oftc.net/llvm>`_. diff --git a/docs/yaml2obj.rst b/docs/yaml2obj.rst index d051e7e22c..b269806e06 100644 --- a/docs/yaml2obj.rst +++ b/docs/yaml2obj.rst @@ -1,5 +1,3 @@ -.. _yaml2obj: - yaml2obj ======== |