author | John Criswell <criswell@uiuc.edu> | 2005-11-02 18:05:50 +0000 |
---|---|---|
committer | John Criswell <criswell@uiuc.edu> | 2005-11-02 18:05:50 +0000 |
commit | cfa435f79bf39fead32263a8b71c9ae440b55214 (patch) | |
tree | 2f1ef0a4c3fb5549b8bbb014891f92866d46e042 /docs/ProgrammersManual.html |
Mark these as failing on sparc instead of sparcv9.
The configure script no longer tells us that we're configuring for SparcV9
specifically.
2004-06-17-UnorderedCompares may work on SparcV8, but it's experimental
anyway.
2005-02-20-AggregateSAVEEXPR should fail on any Solaris machine, as Solaris
doesn't provide complex number support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_16@24155 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs/ProgrammersManual.html')
-rw-r--r-- | docs/ProgrammersManual.html | 2282 |
1 files changed, 2282 insertions, 0 deletions
diff --git a/docs/ProgrammersManual.html b/docs/ProgrammersManual.html new file mode 100644 index 0000000000..e4d50039f6 --- /dev/null +++ b/docs/ProgrammersManual.html @@ -0,0 +1,2282 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" + "http://www.w3.org/TR/html4/strict.dtd"> +<html> +<head> + <title>LLVM Programmer's Manual</title> + <link rel="stylesheet" href="llvm.css" type="text/css"> +</head> +<body> + +<div class="doc_title"> + LLVM Programmer's Manual +</div> + +<ol> + <li><a href="#introduction">Introduction</a></li> + <li><a href="#general">General Information</a> + <ul> + <li><a href="#stl">The C++ Standard Template Library</a></li> +<!-- + <li>The <tt>-time-passes</tt> option</li> + <li>How to use the LLVM Makefile system</li> + <li>How to write a regression test</li> + +--> + </ul> + </li> + <li><a href="#apis">Important and useful LLVM APIs</a> + <ul> + <li><a href="#isa">The <tt>isa<></tt>, <tt>cast<></tt> +and <tt>dyn_cast<></tt> templates</a> </li> + <li><a href="#DEBUG">The <tt>DEBUG()</tt> macro & <tt>-debug</tt> +option</a> + <ul> + <li><a href="#DEBUG_TYPE">Fine grained debug info with <tt>DEBUG_TYPE</tt> +and the <tt>-debug-only</tt> option</a> </li> + </ul> + </li> + <li><a href="#Statistic">The <tt>Statistic</tt> template & <tt>-stats</tt> +option</a></li> +<!-- + <li>The <tt>InstVisitor</tt> template + <li>The general graph API +--> + <li><a href="#ViewGraph">Viewing graphs while debugging code</a></li> + </ul> + </li> + <li><a href="#common">Helpful Hints for Common Operations</a> + <ul> + <li><a href="#inspection">Basic Inspection and Traversal Routines</a> + <ul> + <li><a href="#iterate_function">Iterating over the <tt>BasicBlock</tt>s +in a <tt>Function</tt></a> </li> + <li><a href="#iterate_basicblock">Iterating over the <tt>Instruction</tt>s +in a <tt>BasicBlock</tt></a> </li> + <li><a href="#iterate_institer">Iterating over the <tt>Instruction</tt>s +in a <tt>Function</tt></a> </li> + <li><a href="#iterate_convert">Turning an 
iterator into a +class pointer</a> </li> + <li><a href="#iterate_complex">Finding call sites: a more +complex example</a> </li> + <li><a href="#calls_and_invokes">Treating calls and invokes +the same way</a> </li> + <li><a href="#iterate_chains">Iterating over def-use & +use-def chains</a> </li> + </ul> + </li> + <li><a href="#simplechanges">Making simple changes</a> + <ul> + <li><a href="#schanges_creating">Creating and inserting new + <tt>Instruction</tt>s</a> </li> + <li><a href="#schanges_deleting">Deleting <tt>Instruction</tt>s</a> </li> + <li><a href="#schanges_replacing">Replacing an <tt>Instruction</tt> +with another <tt>Value</tt></a> </li> + </ul> + </li> +<!-- + <li>Working with the Control Flow Graph + <ul> + <li>Accessing predecessors and successors of a <tt>BasicBlock</tt> + <li> + <li> + </ul> +--> + </ul> + </li> + + <li><a href="#advanced">Advanced Topics</a> + <ul> + <li><a href="#TypeResolve">LLVM Type Resolution</a> + <ul> + <li><a href="#BuildRecType">Basic Recursive Type Construction</a></li> + <li><a href="#refineAbstractTypeTo">The <tt>refineAbstractTypeTo</tt> method</a></li> + <li><a href="#PATypeHolder">The PATypeHolder Class</a></li> + <li><a href="#AbstractTypeUser">The AbstractTypeUser Class</a></li> + </ul></li> + + <li><a href="#SymbolTable">The <tt>SymbolTable</tt> class </a></li> + </ul></li> + + <li><a href="#coreclasses">The Core LLVM Class Hierarchy Reference</a> + <ul> + <li><a href="#Value">The <tt>Value</tt> class</a> + <ul> + <li><a href="#User">The <tt>User</tt> class</a> + <ul> + <li><a href="#Instruction">The <tt>Instruction</tt> class</a> + <ul> + <li><a href="#GetElementPtrInst">The <tt>GetElementPtrInst</tt> class</a></li> + </ul> + </li> + <li><a href="#Module">The <tt>Module</tt> class</a></li> + <li><a href="#Constant">The <tt>Constant</tt> class</a> + <ul> + <li><a href="#GlobalValue">The <tt>GlobalValue</tt> class</a> + <ul> + <li><a href="#BasicBlock">The <tt>BasicBlock</tt>class</a></li> + <li><a 
href="#Function">The <tt>Function</tt> class</a></li> + <li><a href="#GlobalVariable">The <tt>GlobalVariable</tt> class</a></li> + </ul> + </li> + </ul> + </li> + </ul> + </li> + <li><a href="#Type">The <tt>Type</tt> class</a> </li> + <li><a href="#Argument">The <tt>Argument</tt> class</a></li> + </ul> + </li> + </ul> + </li> +</ol> + +<div class="doc_author"> + <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a>, + <a href="mailto:dhurjati@cs.uiuc.edu">Dinakar Dhurjati</a>, + <a href="mailto:jstanley@cs.uiuc.edu">Joel Stanley</a>, and + <a href="mailto:rspencer@x10sys.com">Reid Spencer</a></p> +</div> + +<!-- *********************************************************************** --> +<div class="doc_section"> + <a name="introduction">Introduction </a> +</div> +<!-- *********************************************************************** --> + +<div class="doc_text"> + +<p>This document is meant to highlight some of the important classes and +interfaces available in the LLVM source-base. This manual is not +intended to explain what LLVM is, how it works, and what LLVM code looks +like. It assumes that you know the basics of LLVM and are interested +in writing transformations or otherwise analyzing or manipulating the +code.</p> + +<p>This document should get you oriented so that you can find your +way in the continuously growing source code that makes up the LLVM +infrastructure. Note that this manual is not intended to serve as a +replacement for reading the source code, so if you think there should be +a method in one of these classes to do something, but it's not listed, +check the source. Links to the <a href="/doxygen/">doxygen</a> sources +are provided to make this as easy as possible.</p> + +<p>The first section of this document describes general information that is +useful to know when working in the LLVM infrastructure, and the second describes +the Core LLVM classes. 
In the future this manual will be extended with +information describing how to use extension libraries, such as dominator +information, CFG traversal routines, and useful utilities like the <tt><a +href="/doxygen/InstVisitor_8h-source.html">InstVisitor</a></tt> template.</p> + +</div> + +<!-- *********************************************************************** --> +<div class="doc_section"> + <a name="general">General Information</a> +</div> +<!-- *********************************************************************** --> + +<div class="doc_text"> + +<p>This section contains general information that is useful if you are working +in the LLVM source-base, but that isn't specific to any particular API.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="stl">The C++ Standard Template Library</a> +</div> + +<div class="doc_text"> + +<p>LLVM makes heavy use of the C++ Standard Template Library (STL), +perhaps much more than you are used to, or have seen before. Because of +this, you might want to do a little background reading in the +techniques used and capabilities of the library. There are many good +pages that discuss the STL, and several books on the subject that you +can get, so it will not be discussed in this document.</p> + +<p>Here are some useful links:</p> + +<ol> + +<li><a href="http://www.dinkumware.com/refxcpp.html">Dinkumware C++ Library +reference</a> - an excellent reference for the STL and other parts of the +standard C++ library.</li> + +<li><a href="http://www.tempest-sw.com/cpp/">C++ In a Nutshell</a> - This is an +O'Reilly book in the making. 
It has a decent +Standard Library +Reference that rivals Dinkumware's, and is unfortunately no longer free since the book has been +published.</li> + +<li><a href="http://www.parashift.com/c++-faq-lite/">C++ Frequently Asked +Questions</a></li> + +<li><a href="http://www.sgi.com/tech/stl/">SGI's STL Programmer's Guide</a> - +Contains a useful <a +href="http://www.sgi.com/tech/stl/stl_introduction.html">Introduction to the +STL</a>.</li> + +<li><a href="http://www.research.att.com/%7Ebs/C++.html">Bjarne Stroustrup's C++ +Page</a></li> + +<li><a href="http://64.78.49.204/"> +Bruce Eckel's Thinking in C++, 2nd ed. Volume 2 Revision 4.0 (even better, get +the book).</a></li> + +</ol> + +<p>You are also encouraged to take a look at the <a +href="CodingStandards.html">LLVM Coding Standards</a> guide which focuses on how +to write maintainable code more than where to put your curly braces.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="stl">Other useful references</a> +</div> + +<div class="doc_text"> + +<ol> +<li><a href="http://www.psc.edu/%7Esemke/cvs_branches.html">CVS +Branch and Tag Primer</a></li> +<li><a href="http://www.fortran-2000.com/ArnaudRecipes/sharedlib.html">Using +static and shared libraries across platforms</a></li> +</ol> + +</div> + +<!-- *********************************************************************** --> +<div class="doc_section"> + <a name="apis">Important and useful LLVM APIs</a> +</div> +<!-- *********************************************************************** --> + +<div class="doc_text"> + +<p>Here we highlight some LLVM APIs that are generally useful and good to +know about when writing transformations.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="isa">The isa<>, cast<> and dyn_cast<> templates</a> +</div> + +<div class="doc_text"> + +<p>The LLVM 
source-base makes extensive use of a custom form of RTTI. +These templates have many similarities to the C++ <tt>dynamic_cast<></tt> +operator, but they don't have some of its drawbacks (primarily stemming from +the fact that <tt>dynamic_cast<></tt> only works on classes that +have a v-table). Because they are used so often, you must know what they +do and how they work. All of these templates are defined in the <a + href="/doxygen/Casting_8h-source.html"><tt>llvm/Support/Casting.h</tt></a> +file (note that you very rarely have to include this file directly).</p> + +<dl> + <dt><tt>isa<></tt>: </dt> + + <dd>The <tt>isa<></tt> operator works exactly like the Java + "<tt>instanceof</tt>" operator. It returns true or false depending on whether + a reference or pointer points to an instance of the specified class. This can + be very useful for constraint checking of various sorts (example below).</dd> + + <dt><tt>cast<></tt>: </dt> + + <dd>The <tt>cast<></tt> operator is a "checked cast" operation. It + converts a pointer or reference from a base class to a derived class, causing + an assertion failure if it is not really an instance of the right type. This + should be used in cases where you have some information that makes you believe + that something is of the right type. 
An example of the <tt>isa<></tt> + and <tt>cast<></tt> template is: + + <pre> + static bool isLoopInvariant(const <a href="#Value">Value</a> *V, const Loop *L) { + if (isa<<a href="#Constant">Constant</a>>(V) || isa<<a href="#Argument">Argument</a>>(V) || isa<<a href="#GlobalValue">GlobalValue</a>>(V)) + return true; + + <i>// Otherwise, it must be an instruction...</i> + return !L->contains(cast<<a href="#Instruction">Instruction</a>>(V)->getParent()); + } + </pre> + + <p>Note that you should <b>not</b> use an <tt>isa<></tt> test followed + by a <tt>cast<></tt>, for that use the <tt>dyn_cast<></tt> + operator.</p> + + </dd> + + <dt><tt>dyn_cast<></tt>:</dt> + + <dd>The <tt>dyn_cast<></tt> operator is a "checking cast" operation. It + checks to see if the operand is of the specified type, and if so, returns a + pointer to it (this operator does not work with references). If the operand is + not of the correct type, a null pointer is returned. Thus, this works very + much like the <tt>dynamic_cast</tt> operator in C++, and should be used in the + same circumstances. Typically, the <tt>dyn_cast<></tt> operator is used + in an <tt>if</tt> statement or some other flow control statement like this: + + <pre> + if (<a href="#AllocationInst">AllocationInst</a> *AI = dyn_cast<<a href="#AllocationInst">AllocationInst</a>>(Val)) { + ... + } + </pre> + + <p> This form of the <tt>if</tt> statement effectively combines together a + call to <tt>isa<></tt> and a call to <tt>cast<></tt> into one + statement, which is very convenient.</p> + + <p>Note that the <tt>dyn_cast<></tt> operator, like C++'s + <tt>dynamic_cast</tt> or Java's <tt>instanceof</tt> operator, can be abused. + In particular you should not use big chained <tt>if/then/else</tt> blocks to + check for lots of different variants of classes. 
If you find yourself + wanting to do this, it is much cleaner and more efficient to use the + <tt>InstVisitor</tt> class to dispatch over the instruction type directly.</p> + + </dd> + + <dt><tt>cast_or_null<></tt>: </dt> + + <dd>The <tt>cast_or_null<></tt> operator works just like the + <tt>cast<></tt> operator, except that it allows for a null pointer as + an argument (which it then propagates). This can sometimes be useful, + allowing you to combine several null checks into one.</dd> + + <dt><tt>dyn_cast_or_null<></tt>: </dt> + + <dd>The <tt>dyn_cast_or_null<></tt> operator works just like the + <tt>dyn_cast<></tt> operator, except that it allows for a null pointer + as an argument (which it then propagates). This can sometimes be useful, + allowing you to combine several null checks into one.</dd> + + </dl> + +<p>These five templates can be used with any classes, whether they have a +v-table or not. To add support for these templates, you simply need to add +<tt>classof</tt> static methods to the class you are interested in casting +to. Describing this is currently outside the scope of this document, but there +are lots of examples in the LLVM source base.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="DEBUG">The <tt>DEBUG()</tt> macro & <tt>-debug</tt> option</a> +</div> + +<div class="doc_text"> + +<p>Often when working on your pass you will put a bunch of debugging printouts +and other code into your pass. After you get it working, you want to remove +it... but you may need it again in the future (to work out new bugs that you run +across).</p> + +<p> Naturally, because of this, you don't want to delete the debug printouts, +but you don't want them to always be noisy. 
A standard compromise is to comment +them out, allowing you to enable them if you need them in the future.</p> + +<p>The "<tt><a href="/doxygen/Debug_8h-source.html">llvm/Support/Debug.h</a></tt>" +file provides a macro named <tt>DEBUG()</tt> that is a much nicer solution to +this problem. Basically, you can put arbitrary code into the argument of the +<tt>DEBUG</tt> macro, and it is only executed if '<tt>opt</tt>' (or any other +tool) is run with the '<tt>-debug</tt>' command line argument:</p> + + <pre> ... <br> DEBUG(std::cerr << "I am here!\n");<br> ...<br></pre> + +<p>Then you can run your pass like this:</p> + + <pre> $ opt < a.bc > /dev/null -mypass<br> <no output><br> $ opt < a.bc > /dev/null -mypass -debug<br> I am here!<br> $<br></pre> + +<p>Using the <tt>DEBUG()</tt> macro instead of a home-brewed solution allows you +to not have to create "yet another" command line option for the debug output for +your pass. Note that <tt>DEBUG()</tt> macros are disabled for optimized builds, +so they do not cause a performance impact at all (for the same reason, they +should also not contain side-effects!).</p> + +<p>One additional nice thing about the <tt>DEBUG()</tt> macro is that you can +enable or disable it directly in gdb. Just use "<tt>set DebugFlag=0</tt>" or +"<tt>set DebugFlag=1</tt>" from the gdb if the program is running. If the +program hasn't been started yet, you can always just run it with +<tt>-debug</tt>.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> + <a name="DEBUG_TYPE">Fine grained debug info with <tt>DEBUG_TYPE</tt> and + the <tt>-debug-only</tt> option</a> +</div> + +<div class="doc_text"> + +<p>Sometimes you may find yourself in a situation where enabling <tt>-debug</tt> +just turns on <b>too much</b> information (such as when working on the code +generator). 
If you want to enable debug information with more fine-grained +control, you define the <tt>DEBUG_TYPE</tt> macro and use the <tt>-debug-only</tt> +option as follows:</p> + + <pre> ...<br> DEBUG(std::cerr << "No debug type\n");<br> #undef DEBUG_TYPE<br> #define DEBUG_TYPE "foo"<br> DEBUG(std::cerr << "'foo' debug type\n");<br> #undef DEBUG_TYPE<br> #define DEBUG_TYPE "bar"<br> DEBUG(std::cerr << "'bar' debug type\n");<br> #undef DEBUG_TYPE<br> #define DEBUG_TYPE ""<br> DEBUG(std::cerr << "No debug type (2)\n");<br> ...<br></pre> + +<p>Then you can run your pass like this:</p> + + <pre> $ opt < a.bc > /dev/null -mypass<br> <no output><br> $ opt < a.bc > /dev/null -mypass -debug<br> No debug type<br> 'foo' debug type<br> 'bar' debug type<br> No debug type (2)<br> $ opt < a.bc > /dev/null -mypass -debug-only=foo<br> 'foo' debug type<br> $ opt < a.bc > /dev/null -mypass -debug-only=bar<br> 'bar' debug type<br> $<br></pre> + +<p>Of course, in practice, you should only set <tt>DEBUG_TYPE</tt> at the top of +a file, to specify the debug type for the entire module (if you do this before +you <tt>#include "llvm/Support/Debug.h"</tt>, you don't have to insert the ugly +<tt>#undef</tt>'s). Also, you should use names more meaningful than "foo" and +"bar", because there is no system in place to ensure that names do not +conflict. If two different modules use the same string, they will all be turned +on when the name is specified. 
This allows, for example, all debug information +for instruction scheduling to be enabled with <tt>-debug-only=InstrSched</tt>, +even if the source lives in multiple files.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="Statistic">The <tt>Statistic</tt> template & <tt>-stats</tt> + option</a> +</div> + +<div class="doc_text"> + +<p>The "<tt><a +href="/doxygen/Statistic_8h-source.html">llvm/ADT/Statistic.h</a></tt>" file +provides a template named <tt>Statistic</tt> that is used as a unified way to +keep track of what the LLVM compiler is doing and how effective various +optimizations are. It is useful to see what optimizations are contributing to +making a particular program run faster.</p> + +<p>Often you may run your pass on some big program, and you're interested to see +how many times it makes a certain transformation. Although you can do this with +hand inspection, or some ad-hoc method, this is a real pain and not very useful +for big programs. Using the <tt>Statistic</tt> template makes it very easy to +keep track of this information, and the calculated information is presented in a +uniform manner with the rest of the passes being executed.</p> + +<p>There are many examples of <tt>Statistic</tt> uses, but the basics of using +it are as follows:</p> + +<ol> + <li>Define your statistic like this: + <pre>static Statistic<> NumXForms("mypassname", "The # of times I did stuff");<br></pre> + + <p>The <tt>Statistic</tt> template can emulate just about any data-type, + but if you do not specify a template argument, it defaults to acting like + an unsigned int counter (this is usually what you want).</p></li> + + <li>Whenever you make a transformation, bump the counter: + <pre> ++NumXForms; // I did stuff<br></pre> + </li> + </ol> + + <p>That's all you have to do. 
To get '<tt>opt</tt>' to print out the + statistics gathered, use the '<tt>-stats</tt>' option:</p> + + <pre> $ opt -stats -mypassname < program.bc > /dev/null<br> ... statistic output ...<br></pre> + + <p> When running <tt>gccas</tt> on a C file from the SPEC benchmark +suite, it gives a report that looks like this:</p> + + <pre> 7646 bytecodewriter - Number of normal instructions<br> 725 bytecodewriter - Number of oversized instructions<br> 129996 bytecodewriter - Number of bytecode bytes written<br> 2817 raise - Number of insts DCEd or constprop'd<br> 3213 raise - Number of cast-of-self removed<br> 5046 raise - Number of expression trees converted<br> 75 raise - Number of other getelementptr's formed<br> 138 raise - Number of load/store peepholes<br> 42 deadtypeelim - Number of unused typenames removed from symtab<br> 392 funcresolve - Number of varargs functions resolved<br> 27 globaldce - Number of global variables removed<br> 2 adce - Number of basic blocks removed<br> 134 cee - Number of branches revectored<br> 49 cee - Number of setcc instruction eliminated<br> 532 gcse - Number of loads removed<br> 2919 gcse - Number of instructions removed<br> 86 indvars - Number of canonical indvars added<br> 87 indvars - Number of aux indvars removed<br> 25 instcombine - Number of dead inst eliminate<br> 434 instcombine - Number of insts combined<br> 248 licm - Number of load insts hoisted<br> 1298 licm - Number of insts hoisted to a loop pre-header<br> 3 licm - Number of insts hoisted to multiple loop preds (bad, no loop pre-header)<br> 75 mem2reg - Number of alloca's promoted<br> 1444 cfgsimplify - Number of blocks simplified<br></pre> + +<p>Obviously, with so many optimizations, having a unified framework for this +stuff is very nice. 
Making your pass fit well into the framework makes it more +maintainable and useful.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="ViewGraph">Viewing graphs while debugging code</a> +</div> + +<div class="doc_text"> + +<p>Several of the important data structures in LLVM are graphs: for example +CFGs made out of LLVM <a href="#BasicBlock">BasicBlock</a>s, CFGs made out of +LLVM <a href="CodeGenerator.html#machinebasicblock">MachineBasicBlock</a>s, and +<a href="CodeGenerator.html#selectiondag_intro">Instruction Selection +DAGs</a>. In many cases, while debugging various parts of the compiler, it is +nice to instantly visualize these graphs.</p> + +<p>LLVM provides several callbacks that are available in a debug build to do +exactly that. If you call the <tt>Function::viewCFG()</tt> method, for example, +the current LLVM tool will pop up a window containing the CFG for the function +where each basic block is a node in the graph, and each node contains the +instructions in the block. Similarly, there also exist +<tt>Function::viewCFGOnly()</tt> (does not include the instructions), the +<tt>MachineFunction::viewCFG()</tt> and <tt>MachineFunction::viewCFGOnly()</tt>, +and the <tt>SelectionDAG::viewGraph()</tt> methods. Within GDB, for example, +you can usually use something like "<tt>call DAG.viewGraph()</tt>" to pop +up a window. Alternatively, you can sprinkle calls to these functions in your +code in places you want to debug.</p> + +<p>Getting this to work requires a small amount of configuration. On Unix +systems with X11, install the <a href="http://www.graphviz.org">graphviz</a> +toolkit, and make sure 'dot' and 'gv' are in your path. If you are running on +Mac OS/X, download and install the Mac OS/X <a +href="http://www.pixelglow.com/graphviz/">Graphviz program</a>, and add +<tt>/Applications/Graphviz.app/Contents/MacOS/</tt> (or wherever you install +it) to your path. 
Once your system and path are set up, rerun the LLVM +configure script and rebuild LLVM to enable this functionality.</p> + +</div> + + +<!-- *********************************************************************** --> +<div class="doc_section"> + <a name="common">Helpful Hints for Common Operations</a> +</div> +<!-- *********************************************************************** --> + +<div class="doc_text"> + +<p>This section describes how to perform some very simple transformations of +LLVM code. This is meant to give examples of common idioms used, showing the +practical side of LLVM transformations. <p> Because this is a "how-to" section, +you should also read about the main classes that you will be working with. The +<a href="#coreclasses">Core LLVM Class Hierarchy Reference</a> contains details +and descriptions of the main classes that you should know about.</p> + +</div> + +<!-- NOTE: this section should be heavy on example code --> +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="inspection">Basic Inspection and Traversal Routines</a> +</div> + +<div class="doc_text"> + +<p>The LLVM compiler infrastructure has many different data structures that may +be traversed. Following the example of the C++ standard template library, the +techniques used to traverse these various data structures are all basically the +same. For an enumerable sequence of values, the <tt>XXXbegin()</tt> function (or +method) returns an iterator to the start of the sequence, the <tt>XXXend()</tt> +function returns an iterator pointing to one past the last valid element of the +sequence, and there is some <tt>XXXiterator</tt> data type that is common +between the two operations.</p> + +<p>Because the pattern for iteration is common across many different aspects of +the program representation, the standard template library algorithms may be used +on them, and it is easier to remember how to iterate. 
First we show a few common +examples of the data structures that need to be traversed. Other data +structures are traversed in very similar ways.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> + <a name="iterate_function">Iterating over the </a><a + href="#BasicBlock"><tt>BasicBlock</tt></a>s in a <a + href="#Function"><tt>Function</tt></a> +</div> + +<div class="doc_text"> + +<p>It's quite common to have a <tt>Function</tt> instance that you'd like to +transform in some way; in particular, you'd like to manipulate its +<tt>BasicBlock</tt>s. To facilitate this, you'll need to iterate over all of +the <tt>BasicBlock</tt>s that constitute the <tt>Function</tt>. The following is +an example that prints the name of a <tt>BasicBlock</tt> and the number of +<tt>Instruction</tt>s it contains:</p> + + <pre> // func is a pointer to a Function instance<br> for (Function::iterator i = func->begin(), e = func->end(); i != e; ++i) {<br><br> // print out the name of the basic block if it has one, and then the<br> // number of instructions that it contains<br><br> cerr << "Basic block (name=" << i->getName() << ") has " <br> << i->size() << " instructions.\n";<br> }<br></pre> + +<p>Note that <tt>i</tt> can be used as if it were a pointer for the purposes of +invoking member functions of the <tt>BasicBlock</tt> class. This is +because the indirection operator is overloaded for the iterator +classes. 
In the above code, the expression <tt>i->size()</tt> is +exactly equivalent to <tt>(*i).size()</tt> just like you'd expect.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> + <a name="iterate_basicblock">Iterating over the </a><a + href="#Instruction"><tt>Instruction</tt></a>s in a <a + href="#BasicBlock"><tt>BasicBlock</tt></a> +</div> + +<div class="doc_text"> + +<p>Just like when dealing with <tt>BasicBlock</tt>s in <tt>Function</tt>s, it's +easy to iterate over the individual instructions that make up +<tt>BasicBlock</tt>s. Here's a code snippet that prints out each instruction in +a <tt>BasicBlock</tt>:</p> + +<pre> + // blk is a pointer to a BasicBlock instance + for (BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i) + // the next statement works since operator<<(ostream&,...) + // is overloaded for Instruction& + std::cerr << *i << "\n"; +</pre> + +<p>However, this isn't really the best way to print out the contents of a +<tt>BasicBlock</tt>! Since the ostream operators are overloaded for virtually +anything you'll care about, you could have just invoked the print routine on the +basic block itself: <tt>std::cerr << *blk << "\n";</tt>.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> + <a name="iterate_institer">Iterating over the </a><a + href="#Instruction"><tt>Instruction</tt></a>s in a <a + href="#Function"><tt>Function</tt></a> +</div> + +<div class="doc_text"> + +<p>If you're finding that you commonly iterate over a <tt>Function</tt>'s +<tt>BasicBlock</tt>s and then that <tt>BasicBlock</tt>'s <tt>Instruction</tt>s, +<tt>InstIterator</tt> should be used instead. You'll need to include <a +href="/doxygen/InstIterator_8h-source.html"><tt>llvm/Support/InstIterator.h</tt></a>, +and then instantiate <tt>InstIterator</tt>s explicitly in your code. 
Here's a +small example that shows how to dump all instructions in a function to the standard error stream:<p> + + <pre>#include "<a href="/doxygen/InstIterator_8h-source.html">llvm/Support/InstIterator.h</a>"<br>...<br>// Suppose F is a ptr to a function<br>for (inst_iterator i = inst_begin(F), e = inst_end(F); i != e; ++i)<br> cerr << *i << "\n";<br></pre> +Easy, isn't it? You can also use <tt>InstIterator</tt>s to fill a +worklist with its initial contents. For example, if you wanted to +initialize a worklist to contain all instructions in a <tt>Function</tt> +F, all you would need to do is something like: + <pre>std::set<Instruction*> worklist;<br>worklist.insert(inst_begin(F), inst_end(F));<br></pre> + +<p>The STL set <tt>worklist</tt> would now contain all instructions in the +<tt>Function</tt> pointed to by F.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> + <a name="iterate_convert">Turning an iterator into a class pointer (and + vice-versa)</a> +</div> + +<div class="doc_text"> + +<p>Sometimes, it'll be useful to grab a reference (or pointer) to a class +instance when all you've got at hand is an iterator. Well, extracting +a reference or a pointer from an iterator is very straight-forward. +Assuming that <tt>i</tt> is a <tt>BasicBlock::iterator</tt> and <tt>j</tt> +is a <tt>BasicBlock::const_iterator</tt>:</p> + + <pre> Instruction& inst = *i; // grab reference to instruction reference<br> Instruction* pinst = &*i; // grab pointer to instruction reference<br> const Instruction& inst = *j;<br></pre> + +<p>However, the iterators you'll be working with in the LLVM framework are +special: they will automatically convert to a ptr-to-instance type whenever they +need to. 
Instead of dereferencing the iterator and then taking the address of +the result, you can simply assign the iterator to the proper pointer type and +you get the dereference and address-of operation as a result of the assignment +(behind the scenes, this is a result of overloading casting mechanisms). Thus +the last line of the last example,</p> + + <pre>Instruction* pinst = &*i;</pre> + +<p>is semantically equivalent to</p> + + <pre>Instruction* pinst = i;</pre> + +<p>It's also possible to turn a class pointer into the corresponding iterator, +and this is a constant time operation (very efficient). The following code +snippet illustrates use of the conversion constructors provided by LLVM +iterators. By using these, you can explicitly grab the iterator of something +without actually obtaining it via iteration over some structure:</p> + + <pre>void printNextInstruction(Instruction* inst) {<br> BasicBlock::iterator it(inst);<br> ++it; // after this line, it refers to the instruction after *inst.<br> if (it != inst->getParent()->end()) cerr << *it << "\n";<br>}<br></pre> + +</div> + +<!--_______________________________________________________________________--> +<div class="doc_subsubsection"> + <a name="iterate_complex">Finding call sites: a slightly more complex + example</a> +</div> + +<div class="doc_text"> + +<p>Say that you're writing a FunctionPass and would like to count all the +locations in the entire module (that is, across every <tt>Function</tt>) where a +certain function (i.e., some <tt>Function</tt>*) is already in scope. As you'll +learn later, you may want to use an <tt>InstVisitor</tt> to accomplish this in a +much more straight-forward manner, but this example will allow us to explore how +you'd do it if you didn't have <tt>InstVisitor</tt> around. 
In pseudocode, this +is what we want to do:</p> + + <pre>initialize callCounter to zero<br>for each Function f in the Module<br> for each BasicBlock b in f<br> for each Instruction i in b<br> if (i is a CallInst and calls the given function)<br> increment callCounter<br></pre> + +<p>And the actual code is (remember, since we're writing a +<tt>FunctionPass</tt>, our <tt>FunctionPass</tt>-derived class simply has to +override the <tt>runOnFunction</tt> method...):</p> + + <pre>Function* targetFunc = ...;<br><br>class OurFunctionPass : pub |