====================
Writing an LLVM Pass
====================

.. contents::
    :local:

Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Jim Laskey <mailto:jlaskey@mac.com>`_

Introduction --- What is a pass?
================================

The LLVM Pass Framework is an important part of the LLVM system, because LLVM passes are where most of the interesting parts of the compiler exist.  Passes perform the transformations and optimizations that make up the compiler, they build the analysis results that are used by these transformations, and they are, above all, a structuring technique for compiler code.

All LLVM passes are subclasses of the `Pass <http://llvm.org/doxygen/classllvm_1_1Pass.html>`_ class, which implement functionality by overriding virtual methods inherited from ``Pass``.  Depending on how your pass works, you should inherit from the :ref:`ModulePass <writing-an-llvm-pass-ModulePass>`, :ref:`CallGraphSCCPass <writing-an-llvm-pass-CallGraphSCCPass>`, :ref:`FunctionPass <writing-an-llvm-pass-FunctionPass>`, :ref:`LoopPass <writing-an-llvm-pass-LoopPass>`, :ref:`RegionPass <writing-an-llvm-pass-RegionPass>`, or :ref:`BasicBlockPass <writing-an-llvm-pass-BasicBlockPass>` class, which gives the system more information about what your pass does, and how it can be combined with other passes.  One of the main features of the LLVM Pass Framework is that it schedules passes to run in an efficient way based on the constraints that your pass meets (which are indicated by the class it derives from).

We start by showing you how to construct a pass, everything from setting up the code, to compiling, loading, and executing it.  After the basics are down, more advanced features are discussed.

Quick Start --- Writing hello world
===================================

Here we describe how to write the "hello world" of passes.  The "Hello" pass is designed to simply print out the name of non-external functions that exist in the program being compiled.  It does not modify the program at all, it just inspects it.  The source code and files for this pass are available in the LLVM source tree in the ``lib/Transforms/Hello`` directory.

.. _writing-an-llvm-pass-makefile:

Setting up the build environment
--------------------------------

.. FIXME: Why does this recommend to build in-tree?

First, configure and build LLVM.  This needs to be done directly inside the LLVM source tree rather than in a separate objects directory.  Next, you need to create a new directory somewhere in the LLVM source base.  For this example, we'll assume that you made ``lib/Transforms/Hello``.  Finally, you must set up a build script (``Makefile``) that will compile the source code for the new pass.  To do this, copy the following into ``Makefile``:

.. code-block:: make

  # Makefile for hello pass

  # Path to top level of LLVM hierarchy
  LEVEL = ../../..

  # Name of the library to build
  LIBRARYNAME = Hello

  # Make the shared library become a loadable module so the tools can
  # dlopen/dlsym on the resulting library.
  LOADABLE_MODULE = 1

  # Include the makefile implementation stuff
  include $(LEVEL)/Makefile.common

This makefile specifies that all of the ``.cpp`` files in the current directory are to be compiled and linked together into a shared object ``$(LEVEL)/Debug+Asserts/lib/Hello.so`` that can be dynamically loaded by the :program:`opt` or :program:`bugpoint` tools via their :option:`-load` options.  If your operating system uses a suffix other than ``.so`` (such as Windows or Mac OS X), the appropriate extension will be used.

If you are using CMake to build LLVM, see :ref:`cmake-out-of-source-pass`.

Now that we have the build scripts set up, we just need to write the code for the pass itself.

.. _writing-an-llvm-pass-basiccode:

Basic code required
-------------------

Now that we have a way to compile our new pass, we just have to write it.  Start out with:

.. code-block:: c++

  #include "llvm/Pass.h"
  #include "llvm/Function.h"
  #include "llvm/Support/raw_ostream.h"

These are needed because we are writing a `Pass <http://llvm.org/doxygen/classllvm_1_1Pass.html>`_, we are operating on `Function <http://llvm.org/doxygen/classllvm_1_1Function.html>`_\ s, and we will be doing some printing.

Next we have:

.. code-block:: c++

  using namespace llvm;

... which is required because the functions from the include files live in the llvm namespace.

Next we have:

.. code-block:: c++

  namespace {

... which starts out an anonymous namespace.  Anonymous namespaces are to C++ what the "``static``" keyword is to C (at global scope).  It makes the things declared inside of the anonymous namespace visible only to the current file.  If you're not familiar with them, consult a decent C++ book for more information.

Next, we declare our pass itself:

.. code-block:: c++

  struct Hello : public FunctionPass {

This declares a "``Hello``" class that is a subclass of :ref:`FunctionPass <writing-an-llvm-pass-FunctionPass>`.  The different builtin pass subclasses are described in detail :ref:`later <writing-an-llvm-pass-pass-classes>`, but for now, know that ``FunctionPass`` operates on a function at a time.

.. code-block:: c++

  static char ID;
  Hello() : FunctionPass(ID) {}

This declares a pass identifier that LLVM uses to identify the pass.  This allows LLVM to avoid using expensive C++ runtime information.

.. code-block:: c++

    virtual bool runOnFunction(Function &F) {
      errs() << "Hello: ";
      errs().write_escaped(F.getName()) << "\n";
      return false;
    }
  }; // end of struct Hello
  }  // end of anonymous namespace

We declare a :ref:`runOnFunction <writing-an-llvm-pass-runOnFunction>` method, which overrides an abstract virtual method inherited from :ref:`FunctionPass <writing-an-llvm-pass-FunctionPass>`.  This is where we are supposed to do our thing, so we just print out our message with the name of each function.

.. code-block:: c++

  char Hello::ID = 0;

We initialize the pass ID here.  LLVM uses the address of ``ID`` to identify a pass, so the initialization value is not important.

.. code-block:: c++

  static RegisterPass<Hello> X("hello", "Hello World Pass",
                               false /* Only looks at CFG */,
                               false /* Analysis Pass */);

Lastly, we :ref:`register our class <writing-an-llvm-pass-registration>` ``Hello``, giving it a command line argument "``hello``", and a name "Hello World Pass".
The last two arguments describe its behavior: if a pass walks the CFG without modifying it, then the third argument is set to ``true``; if a pass is an analysis pass, for example a dominator tree pass, then ``true`` is supplied as the fourth argument.

As a whole, the ``.cpp`` file looks like:

.. code-block:: c++

  #include "llvm/Pass.h"
  #include "llvm/Function.h"
  #include "llvm/Support/raw_ostream.h"

  using namespace llvm;

  namespace {
    struct Hello : public FunctionPass {
      static char ID;
      Hello() : FunctionPass(ID) {}

      virtual bool runOnFunction(Function &F) {
        errs() << "Hello: ";
        errs().write_escaped(F.getName()) << '\n';
        return false;
      }
    };
  }

  char Hello::ID = 0;
  static RegisterPass<Hello> X("hello", "Hello World Pass", false, false);

Now that it's all together, compile the file with a simple "``gmake``" command in the local directory and you should get a new file "``Debug+Asserts/lib/Hello.so``" under the top level directory of the LLVM source tree (not in the local directory).  Note that everything in this file is contained in an anonymous namespace --- this reflects the fact that passes are self contained units that do not need external interfaces (although they can have them) to be useful.

Running a pass with ``opt``
---------------------------

Now that you have a brand new shiny shared object file, we can use the :program:`opt` command to run an LLVM program through your pass.  Because you registered your pass with ``RegisterPass``, you will be able to use the :program:`opt` tool to access it, once loaded.

To test it, follow the example at the end of the :doc:`GettingStarted` to compile "Hello World" to LLVM.  We can now run the bitcode file (hello.bc) for the program through our transformation like this (of course, any bitcode file will work):

.. code-block:: console

  $ opt -load ../../../Debug+Asserts/lib/Hello.so -hello < hello.bc > /dev/null
  Hello: __main
  Hello: puts
  Hello: main

The :option:`-load` option specifies that :program:`opt` should load your pass as a shared object, which makes "``-hello``" a valid command line argument (which is one reason you need to :ref:`register your pass <writing-an-llvm-pass-registration>`).  Because the Hello pass does not modify the program in any interesting way, we just throw away the result of :program:`opt` (sending it to ``/dev/null``).

To see what happened to the other string you registered, try running :program:`opt` with the :option:`-help` option:

.. code-block:: console

  $ opt -load ../../../Debug+Asserts/lib/Hello.so -help
  OVERVIEW: llvm .bc -> .bc modular optimizer

  USAGE: opt [options] <input bitcode>

  OPTIONS:
    Optimizations available:
  ...
      -globalopt               - Global Variable Optimizer
      -globalsmodref-aa        - Simple mod/ref analysis for globals
      -gvn                     - Global Value Numbering
      -hello                   - Hello World Pass
      -indvars                 - Induction Variable Simplification
      -inline                  - Function Integration/Inlining
      -insert-edge-profiling   - Insert instrumentation for edge profiling
  ...

The pass name gets added as the information string for your pass, giving some documentation to users of :program:`opt`.  Now that you have a working pass, you would go ahead and make it do the cool transformations you want.  Once you get it all working and tested, it may become useful to find out how fast your pass is.
The :ref:`PassManager <writing-an-llvm-pass-passmanager>` provides a nice command line option (:option:`--time-passes`) that allows you to get information about the execution time of your pass along with the other passes you queue up.  For example:

.. code-block:: console

  $ opt -load ../../../Debug+Asserts/lib/Hello.so -hello -time-passes < hello.bc > /dev/null
  Hello: __main
  Hello: puts
  Hello: main
  ===============================================================================
                        ... Pass execution timing report ...
  ===============================================================================
    Total Execution Time: 0.02 seconds (0.0479059 wall clock)

     ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Pass Name ---
     0.0100 (100.0%)   0.0000 (  0.0%)   0.0100 ( 50.0%)   0.0402 ( 84.0%)  Bitcode Writer
     0.0000 (  0.0%)   0.0100 (100.0%)   0.0100 ( 50.0%)   0.0031 (  6.4%)  Dominator Set Construction
     0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0013 (  2.7%)  Module Verifier
     0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0033 (  6.9%)  Hello World Pass
     0.0100 (100.0%)   0.0100 (100.0%)   0.0200 (100.0%)   0.0479 (100.0%)  TOTAL

As you can see, our implementation above is pretty fast.  The additional passes listed are automatically inserted by the :program:`opt` tool to verify that the LLVM emitted by your pass is still valid and well formed LLVM, which hasn't been broken somehow.

Now that you have seen the basics of the mechanics behind passes, we can talk about some more details of how they work and how to use them.

.. _writing-an-llvm-pass-pass-classes:

Pass classes and requirements
=============================

One of the first things that you should do when designing a new pass is to decide what class you should subclass for your pass.  The :ref:`Hello World <writing-an-llvm-pass-basiccode>` example uses the :ref:`FunctionPass <writing-an-llvm-pass-FunctionPass>` class for its implementation, but we did not discuss why or when this should occur.  Here we talk about the classes available, from the most general to the most specific.

When choosing a superclass for your ``Pass``, you should choose the **most specific** class possible, while still being able to meet the requirements listed.  This gives the LLVM Pass Infrastructure information necessary to optimize how passes are run, so that the resultant compiler isn't unnecessarily slow.

The ``ImmutablePass`` class
---------------------------

The most plain and boring type of pass is the "`ImmutablePass <http://llvm.org/doxygen/classllvm_1_1ImmutablePass.html>`_" class.  This pass type is used for passes that do not have to be run, do not change state, and never need to be updated.  This is not a normal type of transformation or analysis, but can provide information about the current compiler configuration.

Although this pass class is very infrequently used, it is important for providing information about the current target machine being compiled for, and other static information that can affect the various transformations.

``ImmutablePass``\ es never invalidate other transformations, are never invalidated, and are never "run".
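The text above does not include code for this class, so here is a minimal sketch in the style of the Hello pass.  The class name ``StaticConfig``, its accessor, and the command line argument are invented for illustration only; an immutable pass typically just publishes static configuration data that other passes can query:

.. code-block:: c++

  #include "llvm/Pass.h"

  using namespace llvm;

  namespace {
    // Hypothetical immutable pass: it is never "run" and never invalidated;
    // it only exposes static configuration data to passes that require it.
    struct StaticConfig : public ImmutablePass {
      static char ID;
      StaticConfig() : ImmutablePass(ID) {}

      // Passes that declare addRequired<StaticConfig>() can call this.
      bool isUnrollingEnabled() const { return true; }
    };
  }

  char StaticConfig::ID = 0;
  static RegisterPass<StaticConfig>
      X("static-config", "Example immutable configuration pass",
        false, true /* behaves like an analysis: it modifies nothing */);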
.. _writing-an-llvm-pass-ModulePass:

The ``ModulePass`` class
------------------------

The `ModulePass <http://llvm.org/doxygen/classllvm_1_1ModulePass.html>`_ class is the most general of all superclasses that you can use.  Deriving from ``ModulePass`` indicates that your pass uses the entire program as a unit, referring to function bodies in no predictable order, or adding and removing functions.  Because nothing is known about the behavior of ``ModulePass`` subclasses, no optimization can be done for their execution.

A module pass can use function level passes (e.g. dominators) via the ``getAnalysis`` interface, ``getAnalysis<DominatorTree>(llvm::Function *)``, providing the function to retrieve the analysis result for, if the function pass does not require any module or immutable passes.  Note that this can only be done for functions for which the analysis has actually been run; e.g., in the case of dominators you should only ask for the ``DominatorTree`` for function definitions, not declarations.

To write a correct ``ModulePass`` subclass, derive from ``ModulePass`` and overload the ``runOnModule`` method with the following signature:

The ``runOnModule`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool runOnModule(Module &M) = 0;

The ``runOnModule`` method performs the interesting work of the pass.  It should return ``true`` if the module was modified by the transformation and ``false`` otherwise.
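The guide gives no complete module pass, so here is a minimal sketch in the same spirit as the Hello pass (the class name and command line argument are made up for illustration).  It only inspects the module, so ``runOnModule`` returns ``false``:

.. code-block:: c++

  #include "llvm/Pass.h"
  #include "llvm/Module.h"
  #include "llvm/Support/raw_ostream.h"

  using namespace llvm;

  namespace {
    // Hypothetical module pass: counts the function definitions in the module.
    struct CountDefinedFunctions : public ModulePass {
      static char ID;
      CountDefinedFunctions() : ModulePass(ID) {}

      virtual bool runOnModule(Module &M) {
        unsigned Count = 0;
        for (Module::iterator F = M.begin(), E = M.end(); F != E; ++F)
          if (!F->isDeclaration())
            ++Count;
        errs() << "Defined functions: " << Count << "\n";
        return false;  // Nothing in the module was changed.
      }
    };
  }

  char CountDefinedFunctions::ID = 0;
  static RegisterPass<CountDefinedFunctions>
      X("count-defined", "Count defined functions", false, false);

Loaded with :option:`-load` as shown earlier, this would behave much like the Hello pass, except that it is invoked once for the whole module rather than once per function.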
.. _writing-an-llvm-pass-CallGraphSCCPass:

The ``CallGraphSCCPass`` class
------------------------------

The `CallGraphSCCPass <http://llvm.org/doxygen/classllvm_1_1CallGraphSCCPass.html>`_ is used by passes that need to traverse the program bottom-up on the call graph (callees before callers).  Deriving from ``CallGraphSCCPass`` provides some mechanics for building and traversing the ``CallGraph``, but also allows the system to optimize execution of ``CallGraphSCCPass``\ es.  If your pass meets the requirements outlined below, and doesn't meet the requirements of a :ref:`FunctionPass <writing-an-llvm-pass-FunctionPass>` or :ref:`BasicBlockPass <writing-an-llvm-pass-BasicBlockPass>`, you should derive from ``CallGraphSCCPass``.

``TODO``: explain briefly what SCC, Tarjan's algo, and B-U mean.

To be explicit, ``CallGraphSCCPass`` subclasses are:

#. ... *not allowed* to inspect or modify any ``Function``\ s other than those in the current SCC and the direct callers and direct callees of the SCC.
#. ... *required* to preserve the current ``CallGraph`` object, updating it to reflect any changes made to the program.
#. ... *not allowed* to add or remove SCCs from the current ``Module``, though they may change the contents of an SCC.
#. ... *allowed* to add or remove global variables from the current ``Module``.
#. ... *allowed* to maintain state across invocations of :ref:`runOnSCC <writing-an-llvm-pass-runOnSCC>` (including global data).

Implementing a ``CallGraphSCCPass`` is slightly tricky in some cases because it has to handle SCCs with more than one node in them.  All of the virtual methods described below should return ``true`` if they modified the program, or ``false`` if they didn't.

The ``doInitialization(CallGraph &)`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool doInitialization(CallGraph &CG);

The ``doInitialization`` method is allowed to do most of the things that ``CallGraphSCCPass``\ es are not allowed to do.  They can add and remove functions, get pointers to functions, etc.  The ``doInitialization`` method is designed to do simple initialization type of stuff that does not depend on the SCCs being processed.  The ``doInitialization`` method call is not scheduled to overlap with any other pass executions (thus it should be very fast).

.. _writing-an-llvm-pass-runOnSCC:

The ``runOnSCC`` method
^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool runOnSCC(CallGraphSCC &SCC) = 0;

The ``runOnSCC`` method performs the interesting work of the pass, and should return ``true`` if the module was modified by the transformation, ``false`` otherwise.

The ``doFinalization(CallGraph &)`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool doFinalization(CallGraph &CG);

The ``doFinalization`` method is an infrequently used method that is called when the pass framework has finished calling :ref:`runOnSCC <writing-an-llvm-pass-runOnSCC>` for every SCC in the program being compiled.
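As a rough sketch of how these pieces fit together (the pass name is hypothetical, and the header location of ``CallGraphSCCPass.h`` has varied between LLVM versions), a pass that merely reports the size of each SCC it visits might look like this:

.. code-block:: c++

  #include "llvm/CallGraphSCCPass.h"  // newer trees: llvm/Analysis/CallGraphSCCPass.h
  #include "llvm/Analysis/CallGraph.h"
  #include "llvm/Support/raw_ostream.h"

  using namespace llvm;

  namespace {
    // Hypothetical pass: prints how many nodes each SCC contains.
    struct SCCSizes : public CallGraphSCCPass {
      static char ID;
      SCCSizes() : CallGraphSCCPass(ID) {}

      virtual bool runOnSCC(CallGraphSCC &SCC) {
        unsigned Size = 0;
        for (CallGraphSCC::iterator I = SCC.begin(), E = SCC.end(); I != E; ++I) {
          ++Size;
          // Each CallGraphNode may wrap a Function (null for external nodes).
          if (Function *F = (*I)->getFunction())
            errs() << "  member: " << F->getName() << "\n";
        }
        errs() << "SCC with " << Size << " node(s)\n";
        return false;  // Nothing modified; the CallGraph is preserved as-is.
      }
    };
  }

  char SCCSizes::ID = 0;
  static RegisterPass<SCCSizes> X("scc-sizes", "Print SCC sizes", false, false);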
.. _writing-an-llvm-pass-FunctionPass:

The ``FunctionPass`` class
--------------------------

In contrast to ``ModulePass`` subclasses, `FunctionPass <http://llvm.org/doxygen/classllvm_1_1Pass.html>`_ subclasses do have a predictable, local behavior that can be expected by the system.  All ``FunctionPass``\ es execute on each function in the program independent of all of the other functions in the program.  ``FunctionPass``\ es do not require that they are executed in a particular order, and ``FunctionPass``\ es do not modify external functions.

To be explicit, ``FunctionPass`` subclasses are not allowed to:

#. Modify a ``Function`` other than the one currently being processed.
#. Add or remove ``Function``\ s from the current ``Module``.
#. Add or remove global variables from the current ``Module``.
#. Maintain state across invocations of :ref:`runOnFunction <writing-an-llvm-pass-runOnFunction>` (including global data).

Implementing a ``FunctionPass`` is usually straightforward (see the :ref:`Hello World <writing-an-llvm-pass-basiccode>` pass for example).  ``FunctionPass``\ es may overload three virtual methods to do their work.  All of these methods should return ``true`` if they modified the program, or ``false`` if they didn't.

.. _writing-an-llvm-pass-doInitialization-mod:

The ``doInitialization(Module &)`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool doInitialization(Module &M);

The ``doInitialization`` method is allowed to do most of the things that ``FunctionPass``\ es are not allowed to do.  They can add and remove functions, get pointers to functions, etc.  The ``doInitialization`` method is designed to do simple initialization type of stuff that does not depend on the functions being processed.  The ``doInitialization`` method call is not scheduled to overlap with any other pass executions (thus it should be very fast).

A good example of how this method should be used is the `LowerAllocations <http://llvm.org/doxygen/LowerAllocations_8cpp-source.html>`_ pass.  This pass converts ``malloc`` and ``free`` instructions into platform dependent ``malloc()`` and ``free()`` function calls.  It uses the ``doInitialization`` method to get a reference to the ``malloc`` and ``free`` functions that it needs, adding prototypes to the module if necessary.

.. _writing-an-llvm-pass-runOnFunction:

The ``runOnFunction`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool runOnFunction(Function &F) = 0;

The ``runOnFunction`` method must be implemented by your subclass to do the transformation or analysis work of your pass.  As usual, a ``true`` value should be returned if the function is modified.

.. _writing-an-llvm-pass-doFinalization-mod:

The ``doFinalization(Module &)`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool doFinalization(Module &M);

The ``doFinalization`` method is an infrequently used method that is called when the pass framework has finished calling :ref:`runOnFunction <writing-an-llvm-pass-runOnFunction>` for every function in the program being compiled.

.. _writing-an-llvm-pass-LoopPass:

The ``LoopPass`` class
----------------------

All ``LoopPass``\ es execute on each loop in the function independent of all of the other loops in the function.  ``LoopPass`` processes loops in loop nest order such that the outermost loop is processed last.

``LoopPass`` subclasses are allowed to update the loop nest using the ``LPPassManager`` interface.  Implementing a loop pass is usually straightforward.  ``LoopPass``\ es may overload three virtual methods to do their work.  All these methods should return ``true`` if they modified the program, or ``false`` if they didn't.

The ``doInitialization(Loop *, LPPassManager &)`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool doInitialization(Loop *, LPPassManager &LPM);

The ``doInitialization`` method is designed to do simple initialization type of stuff that does not depend on the functions being processed.  The ``doInitialization`` method call is not scheduled to overlap with any other pass executions (thus it should be very fast).  The ``LPPassManager`` interface should be used to access ``Function`` or ``Module`` level analysis information.

.. _writing-an-llvm-pass-runOnLoop:

The ``runOnLoop`` method
^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool runOnLoop(Loop *, LPPassManager &LPM) = 0;

The ``runOnLoop`` method must be implemented by your subclass to do the transformation or analysis work of your pass.  As usual, a ``true`` value should be returned if the function is modified.  The ``LPPassManager`` interface should be used to update the loop nest.

The ``doFinalization()`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool doFinalization();

The ``doFinalization`` method is an infrequently used method that is called when the pass framework has finished calling :ref:`runOnLoop <writing-an-llvm-pass-runOnLoop>` for every loop in the program being compiled.
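The guide does not include a worked loop pass, so here is a minimal sketch along the same lines as the Hello pass (names are invented for illustration).  It only reports facts about each loop, so it returns ``false`` and leaves the loop nest alone:

.. code-block:: c++

  #include "llvm/Analysis/LoopPass.h"
  #include "llvm/Support/raw_ostream.h"

  using namespace llvm;

  namespace {
    // Hypothetical loop pass: prints the depth and size of every loop visited.
    struct LoopStats : public LoopPass {
      static char ID;
      LoopStats() : LoopPass(ID) {}

      virtual bool runOnLoop(Loop *L, LPPassManager &LPM) {
        errs() << "loop at depth " << L->getLoopDepth()
               << " containing " << L->getNumBlocks() << " blocks\n";
        return false;  // Analysis only; the loop nest is untouched.
      }
    };
  }

  char LoopStats::ID = 0;
  static RegisterPass<LoopStats> X("loop-stats", "Print loop statistics", false, false);

Because loops are processed innermost first, a nest of loops would be reported from the deepest loop outward.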
.. _writing-an-llvm-pass-RegionPass:

The ``RegionPass`` class
------------------------

``RegionPass`` is similar to :ref:`LoopPass <writing-an-llvm-pass-LoopPass>`, but executes on each single-entry single-exit region in the function.  ``RegionPass`` processes regions in nested order such that the outermost region is processed last.

``RegionPass`` subclasses are allowed to update the region tree by using the ``RGPassManager`` interface.  You may overload three virtual methods of ``RegionPass`` to implement your own region pass.  All these methods should return ``true`` if they modified the program, or ``false`` if they did not.

The ``doInitialization(Region *, RGPassManager &)`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool doInitialization(Region *, RGPassManager &RGM);

The ``doInitialization`` method is designed to do simple initialization type of stuff that does not depend on the functions being processed.  The ``doInitialization`` method call is not scheduled to overlap with any other pass executions (thus it should be very fast).  The ``RGPassManager`` interface should be used to access ``Function`` or ``Module`` level analysis information.

.. _writing-an-llvm-pass-runOnRegion:

The ``runOnRegion`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool runOnRegion(Region *, RGPassManager &RGM) = 0;

The ``runOnRegion`` method must be implemented by your subclass to do the transformation or analysis work of your pass.  As usual, a ``true`` value should be returned if the region is modified.  The ``RGPassManager`` interface should be used to update the region tree.

The ``doFinalization()`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool doFinalization();

The ``doFinalization`` method is an infrequently used method that is called when the pass framework has finished calling :ref:`runOnRegion <writing-an-llvm-pass-runOnRegion>` for every region in the program being compiled.

.. _writing-an-llvm-pass-BasicBlockPass:

The ``BasicBlockPass`` class
----------------------------

``BasicBlockPass``\ es are just like :ref:`FunctionPasses <writing-an-llvm-pass-FunctionPass>`, except that they must limit their scope of inspection and modification to a single basic block at a time.  As such, they are **not** allowed to do any of the following:

#. Modify or inspect any basic blocks outside of the current one.
#. Maintain state across invocations of :ref:`runOnBasicBlock <writing-an-llvm-pass-runOnBasicBlock>`.
#. Modify the control flow graph (by altering terminator instructions).
#. Any of the things forbidden for :ref:`FunctionPasses <writing-an-llvm-pass-FunctionPass>`.

``BasicBlockPass``\ es are useful for traditional local and "peephole" optimizations.  They may override the same :ref:`doInitialization(Module &) <writing-an-llvm-pass-doInitialization-mod>` and :ref:`doFinalization(Module &) <writing-an-llvm-pass-doFinalization-mod>` methods that :ref:`FunctionPasses <writing-an-llvm-pass-FunctionPass>` have, but also have the following virtual methods that may also be implemented:

The ``doInitialization(Function &)`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool doInitialization(Function &F);

The ``doInitialization`` method is allowed to do most of the things that ``BasicBlockPass``\ es are not allowed to do, but that ``FunctionPass``\ es can.  The ``doInitialization`` method is designed to do simple initialization that does not depend on the ``BasicBlock``\ s being processed.  The ``doInitialization`` method call is not scheduled to overlap with any other pass executions (thus it should be very fast).

.. _writing-an-llvm-pass-runOnBasicBlock:

The ``runOnBasicBlock`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool runOnBasicBlock(BasicBlock &BB) = 0;

Override this function to do the work of the ``BasicBlockPass``.  This function is not allowed to inspect or modify basic blocks other than the parameter, and is not allowed to modify the CFG.  A ``true`` value must be returned if the basic block is modified.

The ``doFinalization(Function &)`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool doFinalization(Function &F);

The ``doFinalization`` method is an infrequently used method that is called when the pass framework has finished calling :ref:`runOnBasicBlock <writing-an-llvm-pass-runOnBasicBlock>` for every ``BasicBlock`` in the program being compiled.  This can be used to perform per-function finalization.
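As a small illustrative sketch (not from the original text; the class name and command line argument are invented), a block-local pass skeleton looks much like the Hello pass, only scoped to a single block:

.. code-block:: c++

  #include "llvm/Pass.h"
  #include "llvm/BasicBlock.h"
  #include "llvm/Support/raw_ostream.h"

  using namespace llvm;

  namespace {
    // Hypothetical block-local pass: reports the size of each basic block.
    struct BlockSize : public BasicBlockPass {
      static char ID;
      BlockSize() : BasicBlockPass(ID) {}

      virtual bool runOnBasicBlock(BasicBlock &BB) {
        errs() << BB.getName() << ": " << BB.size() << " instructions\n";
        return false;  // The block is only inspected, never modified.
      }
    };
  }

  char BlockSize::ID = 0;
  static RegisterPass<BlockSize> X("block-size", "Print basic block sizes", false, false);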
The ``MachineFunctionPass`` class
---------------------------------

A ``MachineFunctionPass`` is a part of the LLVM code generator that executes on the machine-dependent representation of each LLVM function in the program.

Code generator passes are registered and initialized specially by ``TargetMachine::addPassesToEmitFile`` and similar routines, so they cannot generally be run from the :program:`opt` or :program:`bugpoint` commands.

A ``MachineFunctionPass`` is also a ``FunctionPass``, so all the restrictions that apply to a ``FunctionPass`` also apply to it.  ``MachineFunctionPass``\ es also have additional restrictions.  In particular, ``MachineFunctionPass``\ es are not allowed to do any of the following:

#. Modify or create any LLVM IR ``Instruction``\ s, ``BasicBlock``\ s, ``Argument``\ s, ``Function``\ s, ``GlobalVariable``\ s, ``GlobalAlias``\ es, or ``Module``\ s.
#. Modify a ``MachineFunction`` other than the one currently being processed.
#. Maintain state across invocations of :ref:`runOnMachineFunction <writing-an-llvm-pass-runOnMachineFunction>` (including global data).

.. _writing-an-llvm-pass-runOnMachineFunction:

The ``runOnMachineFunction(MachineFunction &MF)`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual bool runOnMachineFunction(MachineFunction &MF) = 0;

``runOnMachineFunction`` can be considered the main entry point of a ``MachineFunctionPass``; that is, you should override this method to do the work of your ``MachineFunctionPass``.

The ``runOnMachineFunction`` method is called on every ``MachineFunction`` in a ``Module``, so that the ``MachineFunctionPass`` may perform optimizations on the machine-dependent representation of the function.  If you want to get at the LLVM ``Function`` for the ``MachineFunction`` you're working on, use ``MachineFunction``'s ``getFunction()`` accessor method --- but remember, you may not modify the LLVM ``Function`` or its contents from a ``MachineFunctionPass``.
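For orientation only, a bare skeleton might look like the following sketch (the class name is invented, and remember that code generator passes are added through the target's pass pipeline rather than registered for :program:`opt`):

.. code-block:: c++

  #include "llvm/CodeGen/MachineFunctionPass.h"
  #include "llvm/CodeGen/MachineFunction.h"
  #include "llvm/Function.h"
  #include "llvm/Support/raw_ostream.h"

  using namespace llvm;

  namespace {
    // Hypothetical code generator pass: reports the number of machine basic
    // blocks; it inspects only the machine-level representation and changes nothing.
    struct MachineBlockCount : public MachineFunctionPass {
      static char ID;
      MachineBlockCount() : MachineFunctionPass(ID) {}

      virtual bool runOnMachineFunction(MachineFunction &MF) {
        errs() << MF.getFunction()->getName() << ": "
               << MF.size() << " machine basic blocks\n";
        return false;
      }
    };
  }

  char MachineBlockCount::ID = 0;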
.. _writing-an-llvm-pass-registration:

Pass registration
-----------------

In the :ref:`Hello World <writing-an-llvm-pass-basiccode>` example pass we illustrated how pass registration works, and discussed some of the reasons that it is used and what it does.  Here we discuss how and why passes are registered.

As we saw above, passes are registered with the ``RegisterPass`` template.  The template parameter is your pass class.  The first argument is the name of the pass, which is to be used on the command line to specify that the pass should be added to a program (for example, with :program:`opt` or :program:`bugpoint`).  The second argument is a human readable name for the pass, which is used for the :option:`-help` output of programs, as well as for debug output generated by the :option:`--debug-pass` option.

If you want your pass to be easily dumpable, you should implement the virtual print method:

The ``print`` method
^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual void print(llvm::raw_ostream &O, const Module *M) const;

The ``print`` method must be implemented by "analyses" in order to print a human readable version of the analysis results.  This is useful for debugging an analysis itself, as well as for other people to figure out how an analysis works.  Use the opt ``-analyze`` argument to invoke this method.

The ``llvm::raw_ostream`` parameter specifies the stream to write the results on, and the ``Module`` parameter gives a pointer to the top level module of the program that has been analyzed.  Note however that this pointer may be ``NULL`` in certain circumstances (such as calling ``Pass::dump()`` from a debugger), so it should only be used to enhance debug output; it should not be depended on.

.. _writing-an-llvm-pass-interaction:

Specifying interactions between passes
--------------------------------------

One of the main responsibilities of the ``PassManager`` is to make sure that passes interact with each other correctly.  Because ``PassManager`` tries to :ref:`optimize the execution of passes <writing-an-llvm-pass-passmanager>` it must know how the passes interact with each other and what dependencies exist between the various passes.  To track this, each pass can declare the set of passes that are required to be executed before the current pass, and the passes which are invalidated by the current pass.

Typically this functionality is used to require that analysis results are computed before your pass is run.  Running arbitrary transformation passes can invalidate the computed analysis results, which is what the invalidation set specifies.  If a pass does not implement the :ref:`getAnalysisUsage <writing-an-llvm-pass-getAnalysisUsage>` method, it defaults to not having any prerequisite passes, and invalidating **all** other passes.

.. _writing-an-llvm-pass-getAnalysisUsage:

The ``getAnalysisUsage`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  virtual void getAnalysisUsage(AnalysisUsage &Info) const;

By implementing the ``getAnalysisUsage`` method, the required and invalidated sets may be specified for your transformation.  The implementation should fill in the `AnalysisUsage <http://llvm.org/doxygen/classllvm_1_1AnalysisUsage.html>`_ object with information about which passes are required and not invalidated.  To do this, a pass may call any of the following methods on the ``AnalysisUsage`` object:

The ``AnalysisUsage::addRequired<>`` and ``AnalysisUsage::addRequiredTransitive<>`` methods
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If your pass requires a previous pass to be executed (an analysis for example), it can use one of these methods to arrange for it to be run before your pass.  LLVM has many different types of analyses and passes that can be required, spanning the range from ``DominatorSet`` to ``BreakCriticalEdges``.  Requiring ``BreakCriticalEdges``, for example, guarantees that there will be no critical edges in the CFG when your pass has been run.

Some analyses chain to other analyses to do their job.  For example, an `AliasAnalysis <AliasAnalysis>` implementation is required to :ref:`chain <aliasanalysis-chaining>` to other alias analysis passes.  In cases where analyses chain, the ``addRequiredTransitive`` method should be used instead of the ``addRequired`` method.  This informs the ``PassManager`` that the transitively required pass should be alive as long as the requiring pass is.
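For instance (a sketch, not taken from the text above; the class name is hypothetical), an analysis that keeps querying ``AliasAnalysis`` while its own clients are using it would declare the requirement transitively:

.. code-block:: c++

  // Hypothetical analysis that holds on to AliasAnalysis results while its own
  // clients are still running, so the requirement must be transitive.
  void MyDependenceInfo::getAnalysisUsage(AnalysisUsage &AU) const {
    AU.setPreservesAll();                       // pure analysis, modifies nothing
    AU.addRequiredTransitive<AliasAnalysis>();  // keep AA alive as long as we are
  }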
The ``AnalysisUsage::addPreserved<>`` method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

One of the jobs of the ``PassManager`` is to optimize how and when analyses are run.  In particular, it attempts to avoid recomputing data unless it needs to.  For this reason, passes are allowed to declare that they preserve (i.e., they don't invalidate) an existing analysis if it's available.  For example, a simple constant folding pass would not modify the CFG, so it can't possibly affect the results of dominator analysis.  By default, all passes are assumed to invalidate all others.

The ``AnalysisUsage`` class provides several methods which are useful in certain circumstances that are related to ``addPreserved``.  In particular, the ``setPreservesAll`` method can be called to indicate that the pass does not modify the LLVM program at all (which is true for analyses), and the ``setPreservesCFG`` method can be used by transformations that change instructions in the program but do not modify the CFG or terminator instructions (note that this property is implicitly set for :ref:`BasicBlockPass <writing-an-llvm-pass-BasicBlockPass>`\ es).

``addPreserved`` is particularly useful for transformations like ``BreakCriticalEdges``.  This pass knows how to update a small set of loop and dominator related analyses if they exist, so it can preserve them, despite the fact that it hacks on the CFG.

Example implementations of ``getAnalysisUsage``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

  // This example modifies the program, but does not modify the CFG
  void LICM::getAnalysisUsage(AnalysisUsage &AU) const {
    AU.setPreservesCFG();
    AU.addRequired<LoopInfo>();
  }

.. _writing-an-llvm-pass-getAnalysis:

The ``getAnalysis<>`` and ``getAnalysisIfAvailable<>`` methods
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``Pass::getAnalysis<>`` method is automatically inherited by your class, providing you with access to the passes that you declared that you required with the :ref:`getAnalysisUsage <writing-an-llvm-pass-getAnalysisUsage>` method.  It takes a single template argument that specifies which pass class you want, and returns a reference to that pass.  For example:

.. code-block:: c++

  bool LICM::runOnFunction(Function &F) {
    LoopInfo &LI = getAnalysis<LoopInfo>();
    //...
  }

This method call returns a reference to the pass desired.  You may get a runtime assertion failure if you attempt to get an analysis that you did not declare as required in your :ref:`getAnalysisUsage <writing-an-llvm-pass-getAnalysisUsage>` implementation.  This method can be called by your ``run*`` method implementation, or by any other local method invoked by your ``run*`` method.

A module level pass can use function level analysis info using this interface.  For example:

.. code-block:: c++

  bool ModuleLevelPass::runOnModule(Module &M) {
    //... here Func is a Function definition obtained from M ...
    DominatorTree &DT = getAnalysis<DominatorTree>(Func);
    //...
  }

In the above example, ``runOnFunction`` for ``DominatorTree`` is called by the pass manager before returning a reference to the desired pass.

If your pass is capable of updating analyses if they exist (e.g., ``BreakCriticalEdges``, as described above), you can use the ``getAnalysisIfAvailable`` method, which returns a pointer to the analysis if it is active.  For example:

.. code-b