Diffstat (limited to 'docs/LangRef.html')
-rw-r--r-- docs/LangRef.html 3320
1 files changed, 3320 insertions, 0 deletions
diff --git a/docs/LangRef.html b/docs/LangRef.html
new file mode 100644
index 0000000000..6b5bb962c6
--- /dev/null
+++ b/docs/LangRef.html
@@ -0,0 +1,3320 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
+ "http://www.w3.org/TR/html4/strict.dtd">
+<html>
+<head>
+ <title>LLVM Assembly Language Reference Manual</title>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+ <meta name="author" content="Chris Lattner">
+ <meta name="description"
+ content="LLVM Assembly Language Reference Manual.">
+ <link rel="stylesheet" href="llvm.css" type="text/css">
+</head>
+
+<body>
+
+<div class="doc_title"> LLVM Language Reference Manual </div>
+<ol>
+ <li><a href="#abstract">Abstract</a></li>
+ <li><a href="#introduction">Introduction</a></li>
+ <li><a href="#identifiers">Identifiers</a></li>
+ <li><a href="#highlevel">High Level Structure</a>
+ <ol>
+ <li><a href="#modulestructure">Module Structure</a></li>
+ <li><a href="#linkage">Linkage Types</a></li>
+ <li><a href="#callingconv">Calling Conventions</a></li>
+ <li><a href="#globalvars">Global Variables</a></li>
+ <li><a href="#functionstructure">Function Structure</a></li>
+ </ol>
+ </li>
+ <li><a href="#typesystem">Type System</a>
+ <ol>
+ <li><a href="#t_primitive">Primitive Types</a>
+ <ol>
+ <li><a href="#t_classifications">Type Classifications</a></li>
+ </ol>
+ </li>
+ <li><a href="#t_derived">Derived Types</a>
+ <ol>
+ <li><a href="#t_array">Array Type</a></li>
+ <li><a href="#t_function">Function Type</a></li>
+ <li><a href="#t_pointer">Pointer Type</a></li>
+ <li><a href="#t_struct">Structure Type</a></li>
+ <li><a href="#t_packed">Packed Type</a></li>
+ <li><a href="#t_opaque">Opaque Type</a></li>
+ </ol>
+ </li>
+ </ol>
+ </li>
+ <li><a href="#constants">Constants</a>
+ <ol>
+ <li><a href="#simpleconstants">Simple Constants</a></li>
+ <li><a href="#aggregateconstants">Aggregate Constants</a></li>
+ <li><a href="#globalconstants">Global Variable and Function Addresses</a></li>
+ <li><a href="#undefvalues">Undefined Values</a></li>
+ <li><a href="#constantexprs">Constant Expressions</a></li>
+ </ol>
+ </li>
+ <li><a href="#instref">Instruction Reference</a>
+ <ol>
+ <li><a href="#terminators">Terminator Instructions</a>
+ <ol>
+ <li><a href="#i_ret">'<tt>ret</tt>' Instruction</a></li>
+ <li><a href="#i_br">'<tt>br</tt>' Instruction</a></li>
+ <li><a href="#i_switch">'<tt>switch</tt>' Instruction</a></li>
+ <li><a href="#i_invoke">'<tt>invoke</tt>' Instruction</a></li>
+ <li><a href="#i_unwind">'<tt>unwind</tt>' Instruction</a></li>
+ <li><a href="#i_unreachable">'<tt>unreachable</tt>' Instruction</a></li>
+ </ol>
+ </li>
+ <li><a href="#binaryops">Binary Operations</a>
+ <ol>
+ <li><a href="#i_add">'<tt>add</tt>' Instruction</a></li>
+ <li><a href="#i_sub">'<tt>sub</tt>' Instruction</a></li>
+ <li><a href="#i_mul">'<tt>mul</tt>' Instruction</a></li>
+ <li><a href="#i_div">'<tt>div</tt>' Instruction</a></li>
+ <li><a href="#i_rem">'<tt>rem</tt>' Instruction</a></li>
+ <li><a href="#i_setcc">'<tt>set<i>cc</i></tt>' Instructions</a></li>
+ </ol>
+ </li>
+ <li><a href="#bitwiseops">Bitwise Binary Operations</a>
+ <ol>
+ <li><a href="#i_and">'<tt>and</tt>' Instruction</a></li>
+ <li><a href="#i_or">'<tt>or</tt>' Instruction</a></li>
+ <li><a href="#i_xor">'<tt>xor</tt>' Instruction</a></li>
+ <li><a href="#i_shl">'<tt>shl</tt>' Instruction</a></li>
+ <li><a href="#i_shr">'<tt>shr</tt>' Instruction</a></li>
+ </ol>
+ </li>
+ <li><a href="#memoryops">Memory Access Operations</a>
+ <ol>
+ <li><a href="#i_malloc">'<tt>malloc</tt>' Instruction</a></li>
+ <li><a href="#i_free">'<tt>free</tt>' Instruction</a></li>
+ <li><a href="#i_alloca">'<tt>alloca</tt>' Instruction</a></li>
+ <li><a href="#i_load">'<tt>load</tt>' Instruction</a></li>
+ <li><a href="#i_store">'<tt>store</tt>' Instruction</a></li>
+ <li><a href="#i_getelementptr">'<tt>getelementptr</tt>' Instruction</a></li>
+ </ol>
+ </li>
+ <li><a href="#otherops">Other Operations</a>
+ <ol>
+ <li><a href="#i_phi">'<tt>phi</tt>' Instruction</a></li>
+ <li><a href="#i_cast">'<tt>cast .. to</tt>' Instruction</a></li>
+ <li><a href="#i_select">'<tt>select</tt>' Instruction</a></li>
+ <li><a href="#i_call">'<tt>call</tt>' Instruction</a></li>
+ <li><a href="#i_vaarg">'<tt>vaarg</tt>' Instruction</a></li>
+ </ol>
+ </li>
+ </ol>
+ </li>
+ <li><a href="#intrinsics">Intrinsic Functions</a>
+ <ol>
+ <li><a href="#int_varargs">Variable Argument Handling Intrinsics</a>
+ <ol>
+ <li><a href="#i_va_start">'<tt>llvm.va_start</tt>' Intrinsic</a></li>
+ <li><a href="#i_va_end">'<tt>llvm.va_end</tt>' Intrinsic</a></li>
+ <li><a href="#i_va_copy">'<tt>llvm.va_copy</tt>' Intrinsic</a></li>
+ </ol>
+ </li>
+ <li><a href="#int_gc">Accurate Garbage Collection Intrinsics</a>
+ <ol>
+ <li><a href="#i_gcroot">'<tt>llvm.gcroot</tt>' Intrinsic</a></li>
+ <li><a href="#i_gcread">'<tt>llvm.gcread</tt>' Intrinsic</a></li>
+ <li><a href="#i_gcwrite">'<tt>llvm.gcwrite</tt>' Intrinsic</a></li>
+ </ol>
+ </li>
+ <li><a href="#int_codegen">Code Generator Intrinsics</a>
+ <ol>
+ <li><a href="#i_returnaddress">'<tt>llvm.returnaddress</tt>' Intrinsic</a></li>
+ <li><a href="#i_frameaddress">'<tt>llvm.frameaddress</tt>' Intrinsic</a></li>
+ <li><a href="#i_prefetch">'<tt>llvm.prefetch</tt>' Intrinsic</a></li>
+ <li><a href="#i_pcmarker">'<tt>llvm.pcmarker</tt>' Intrinsic</a></li>
+ </ol>
+ </li>
+ <li><a href="#int_os">Operating System Intrinsics</a>
+ <ol>
+ <li><a href="#i_readport">'<tt>llvm.readport</tt>' Intrinsic</a></li>
+ <li><a href="#i_writeport">'<tt>llvm.writeport</tt>' Intrinsic</a></li>
+ <li><a href="#i_readio">'<tt>llvm.readio</tt>' Intrinsic</a></li>
+ <li><a href="#i_writeio">'<tt>llvm.writeio</tt>' Intrinsic</a></li>
+ </ol>
+ </li>
+ <li><a href="#int_libc">Standard C Library Intrinsics</a>
+ <ol>
+ <li><a href="#i_memcpy">'<tt>llvm.memcpy</tt>' Intrinsic</a></li>
+ <li><a href="#i_memmove">'<tt>llvm.memmove</tt>' Intrinsic</a></li>
+ <li><a href="#i_memset">'<tt>llvm.memset</tt>' Intrinsic</a></li>
+ <li><a href="#i_isunordered">'<tt>llvm.isunordered</tt>' Intrinsic</a></li>
+ <li><a href="#i_sqrt">'<tt>llvm.sqrt</tt>' Intrinsic</a></li>
+
+ </ol>
+ </li>
+ <li><a href="#int_count">Bit counting Intrinsics</a>
+ <ol>
+ <li><a href="#int_ctpop">'<tt>llvm.ctpop</tt>' Intrinsic </a></li>
+ <li><a href="#int_ctlz">'<tt>llvm.ctlz</tt>' Intrinsic </a></li>
+ <li><a href="#int_cttz">'<tt>llvm.cttz</tt>' Intrinsic </a></li>
+ </ol>
+ </li>
+ <li><a href="#int_debugger">Debugger intrinsics</a></li>
+ </ol>
+ </li>
+</ol>
+
+<div class="doc_author">
+ <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a>
+ and <a href="mailto:vadve@cs.uiuc.edu">Vikram Adve</a></p>
+</div>
+
+<!-- *********************************************************************** -->
+<div class="doc_section"> <a name="abstract">Abstract </a></div>
+<!-- *********************************************************************** -->
+
+<div class="doc_text">
+<p>This document is a reference manual for the LLVM assembly language.
+LLVM is an SSA-based representation that provides type safety,
+low-level operations, flexibility, and the capability of representing
+'all' high-level languages cleanly. It is the common code
+representation used throughout all phases of the LLVM compilation
+strategy.</p>
+</div>
+
+<!-- *********************************************************************** -->
+<div class="doc_section"> <a name="introduction">Introduction</a> </div>
+<!-- *********************************************************************** -->
+
+<div class="doc_text">
+
+<p>The LLVM code representation is designed to be used in three
+different forms: as an in-memory compiler IR, as an on-disk bytecode
+representation (suitable for fast loading by a Just-In-Time compiler),
+and as a human readable assembly language representation. This allows
+LLVM to provide a powerful intermediate representation for efficient
+compiler transformations and analysis, while providing a natural means
+to debug and visualize the transformations. The three different forms
+of LLVM are all equivalent. This document describes the human readable
+representation and notation.</p>
+
+<p>The LLVM representation aims to be light-weight and low-level
+while being expressive, typed, and extensible at the same time. It
+aims to be a "universal IR" of sorts, by being at a low enough level
+that high-level ideas may be cleanly mapped to it (similar to how
+microprocessors are "universal IRs", allowing many source languages to
+be mapped to them). By providing type information, LLVM can be used as
+the target of optimizations: for example, through pointer analysis, it
+can be proven that a C automatic variable is never accessed outside of
+the current function, allowing it to be promoted to a simple SSA
+value instead of a memory location.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection"> <a name="wellformed">Well-Formedness</a> </div>
+
+<div class="doc_text">
+
+<p>It is important to note that this document describes 'well formed'
+LLVM assembly language. There is a difference between what the parser
+accepts and what is considered 'well formed'. For example, the
+following instruction is syntactically okay, but not well formed:</p>
+
+<pre>
+ %x = <a href="#i_add">add</a> int 1, %x
+</pre>
+
+<p>...because the definition of <tt>%x</tt> does not dominate all of
+its uses. The LLVM infrastructure provides a verification pass that may
+be used to verify that an LLVM module is well formed. This pass is
+automatically run by the parser after parsing input assembly and by
+the optimizer before it outputs bytecode. The violations pointed out
+by the verifier pass indicate bugs in transformation passes or input to
+the parser.</p>
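+
+<p>As a contrasting sketch (the function <tt>%f</tt> and its argument
+<tt>%a</tt> are hypothetical names used only for illustration), the following
+fragment should satisfy the dominance requirement, because every value is
+defined before it is used:</p>
+
+<pre>
+int %f(int %a) {
+  %x = <a href="#i_add">add</a> int 1, %a     <i>; %a is an argument, so it dominates this use</i>
+  %y = <a href="#i_add">add</a> int %x, %x    <i>; %x is defined above, before both uses</i>
+  <a href="#i_ret">ret</a> int %y
+}
+</pre>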
+
+<!-- Describe the typesetting conventions here. --> </div>
+
+<!-- *********************************************************************** -->
+<div class="doc_section"> <a name="identifiers">Identifiers</a> </div>
+<!-- *********************************************************************** -->
+
+<div class="doc_text">
+
+<p>LLVM uses three different forms of identifiers, for different
+purposes:</p>
+
+<ol>
+ <li>Named values are represented as a string of characters with a '%' prefix.
+ For example, %foo, %DivisionByZero, %a.really.long.identifier. The actual
+ regular expression used is '<tt>%[a-zA-Z$._][a-zA-Z$._0-9]*</tt>'.
+ Identifiers which require other characters in their names can be surrounded
+ with quotes. In this way, anything except a <tt>"</tt> character can be used
+ in a name.</li>
+
+ <li>Unnamed values are represented as an unsigned numeric value with a '%'
+ prefix. For example, %12, %2, %44.</li>
+
+ <li>Constants, which are described in a <a href="#constants">section about
+ constants</a>, below.</li>
+</ol>
+
+<p>LLVM requires that values start with a '%' sign for two reasons: Compilers
+don't need to worry about name clashes with reserved words, and the set of
+reserved words may be expanded in the future without penalty. Additionally,
+unnamed identifiers allow a compiler to quickly come up with a temporary
+variable without having to avoid symbol table conflicts.</p>
+
+<p>Reserved words in LLVM are very similar to reserved words in other
+languages. There are keywords for different opcodes ('<tt><a
+href="#i_add">add</a></tt>', '<tt><a href="#i_cast">cast</a></tt>', '<tt><a
+href="#i_ret">ret</a></tt>', etc...), for primitive type names ('<tt><a
+href="#t_void">void</a></tt>', '<tt><a href="#t_uint">uint</a></tt>', etc...),
+and others. These reserved words cannot conflict with variable names, because
+none of them start with a '%' character.</p>
+
+<p>Here is an example of LLVM code to multiply the integer variable
+'<tt>%X</tt>' by 8:</p>
+
+<p>The easy way:</p>
+
+<pre>
+ %result = <a href="#i_mul">mul</a> uint %X, 8
+</pre>
+
+<p>After strength reduction:</p>
+
+<pre>
+ %result = <a href="#i_shl">shl</a> uint %X, ubyte 3
+</pre>
+
+<p>And the hard way:</p>
+
+<pre>
+ <a href="#i_add">add</a> uint %X, %X <i>; yields {uint}:%0</i>
+ <a href="#i_add">add</a> uint %0, %0 <i>; yields {uint}:%1</i>
+ %result = <a href="#i_add">add</a> uint %1, %1
+</pre>
+
+<p>This last way of multiplying <tt>%X</tt> by 8 illustrates several
+important lexical features of LLVM:</p>
+
+<ol>
+
+ <li>Comments are delimited with a '<tt>;</tt>' and go until the end of
+ line.</li>
+
+ <li>Unnamed temporaries are created when the result of a computation is not
+ assigned to a named value.</li>
+
+ <li>Unnamed temporaries are numbered sequentially.</li>
+
+</ol>
+
+<p>...and it also shows a convention that we follow in this document. When
+demonstrating instructions, we will follow an instruction with a comment that
+defines the type and name of value produced. Comments are shown in italic
+text.</p>
+
+</div>
+
+<!-- *********************************************************************** -->
+<div class="doc_section"> <a name="highlevel">High Level Structure</a> </div>
+<!-- *********************************************************************** -->
+
+<!-- ======================================================================= -->
+<div class="doc_subsection"> <a name="modulestructure">Module Structure</a>
+</div>
+
+<div class="doc_text">
+
+<p>LLVM programs are composed of "Module"s, each of which is a
+translation unit of the input programs. Each module consists of
+functions, global variables, and symbol table entries. Modules may be
+combined together with the LLVM linker, which merges function (and
+global variable) definitions, resolves forward declarations, and merges
+symbol table entries. Here is an example of the "hello world" module:</p>
+
+<pre><i>; Declare the string constant as a global constant...</i>
+<a href="#identifiers">%.LC0</a> = <a href="#linkage_internal">internal</a> <a
+ href="#globalvars">constant</a> <a href="#t_array">[13 x sbyte]</a> c"hello world\0A\00" <i>; [13 x sbyte]*</i>
+
+<i>; External declaration of the puts function</i>
+<a href="#functionstructure">declare</a> int %puts(sbyte*) <i>; int(sbyte*)* </i>
+
+<i>; Definition of main function</i>
+int %main() { <i>; int()* </i>
+ <i>; Convert [13 x sbyte]* to sbyte*...</i>
+ %cast210 = <a
+ href="#i_getelementptr">getelementptr</a> [13 x sbyte]* %.LC0, long 0, long 0 <i>; sbyte*</i>
+
+ <i>; Call puts function to write out the string to stdout...</i>
+ <a
+ href="#i_call">call</a> int %puts(sbyte* %cast210) <i>; int</i>
+ <a
+ href="#i_ret">ret</a> int 0<br>}<br></pre>
+
+<p>This example is made up of a <a href="#globalvars">global variable</a>
+named "<tt>.LC0</tt>", an external declaration of the "<tt>puts</tt>"
+function, and a <a href="#functionstructure">function definition</a>
+for "<tt>main</tt>".</p>
+
+<p>In general, a module is made up of a list of global values,
+where both functions and global variables are global values. Global values are
+represented by a pointer to a memory location (in this case, a pointer to an
+array of char, and a pointer to a function), and have one of the following <a
+href="#linkage">linkage types</a>.</p>
+
+</div>
+
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="linkage">Linkage Types</a>
+</div>
+
+<div class="doc_text">
+
+<p>
+All Global Variables and Functions have one of the following types of linkage:
+</p>
+
+<dl>
+
+ <dt><tt><b><a name="linkage_internal">internal</a></b></tt>: </dt>
+
+ <dd>Global values with internal linkage are only directly accessible by
+ objects in the current module. In particular, linking code into a module with
+ an internal global value may cause the internal to be renamed as necessary to
+ avoid collisions. Because the symbol is internal to the module, all
+ references can be updated. This corresponds to the notion of the
+ '<tt>static</tt>' keyword in C, or the idea of "anonymous namespaces" in C++.
+ </dd>
+
+ <dt><tt><b><a name="linkage_linkonce">linkonce</a></b></tt>: </dt>
+
+ <dd>"<tt>linkonce</tt>" linkage is similar to <tt>internal</tt> linkage, with
+ the twist that linking together two modules defining the same
+ <tt>linkonce</tt> globals will cause one of the globals to be discarded. This
+ is typically used to implement inline functions. Unreferenced
+ <tt>linkonce</tt> globals are allowed to be discarded.
+ </dd>
+
+ <dt><tt><b><a name="linkage_weak">weak</a></b></tt>: </dt>
+
+ <dd>"<tt>weak</tt>" linkage is exactly the same as <tt>linkonce</tt> linkage,
+ except that unreferenced <tt>weak</tt> globals may not be discarded. This is
+ used to implement constructs in C such as "<tt>int X;</tt>" at global scope.
+ </dd>
+
+ <dt><tt><b><a name="linkage_appending">appending</a></b></tt>: </dt>
+
+ <dd>"<tt>appending</tt>" linkage may only be applied to global variables of
+ pointer to array type. When two global variables with appending linkage are
+ linked together, the two global arrays are appended together. This is the
+ LLVM, typesafe, equivalent of having the system linker append together
+ "sections" with identical names when .o files are linked.
+ </dd>
+
+ <dt><tt><b><a name="linkage_external">externally visible</a></b></tt>:</dt>
+
+ <dd>If none of the above identifiers are used, the global is externally
+ visible, meaning that it participates in linkage and can be used to resolve
+ external symbol references.
+ </dd>
+</dl>
+
+<p>For example, since the "<tt>.LC0</tt>"
+variable is defined to be internal, if another module defined a "<tt>.LC0</tt>"
+variable and was linked with this one, one of the two would be renamed,
+preventing a collision. Since "<tt>main</tt>" and "<tt>puts</tt>" are
+external (i.e., lacking any linkage declarations), they are accessible
+outside of the current module. It is illegal for a function <i>declaration</i>
+to have any linkage type other than "externally visible".</p>
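+
+<p>As an illustrative sketch (all global names here are hypothetical, and the
+exact spelling should be checked against the assembler in use), the linkage
+keyword is written before <tt>global</tt> or <tt>constant</tt> for variables,
+and before the return type for function definitions:</p>
+
+<pre>
+%.str    = internal constant [4 x sbyte] c"abc\00"   <i>; private to this module</i>
+%counter = weak global int 0                          <i>; like "int counter;" at global scope in C</i>
+%table   = appending global [1 x int] [ int 7 ]       <i>; arrays are appended together when linked</i>
+%shared  = global int 0                               <i>; no keyword: externally visible</i>
+
+linkonce int %helper(int %x) {                        <i>; one of several identical definitions is kept</i>
+  ret int %x
+}
+</pre>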
+
+</div>
+
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="callingconv">Calling Conventions</a>
+</div>
+
+<div class="doc_text">
+
+<p>LLVM <a href="#functionstructure">functions</a>, <a href="#i_call">calls</a>
+and <a href="#i_invoke">invokes</a> can all have an optional calling convention
+specified for the call. The calling convention of any pair of dynamic
+caller/callee must match, or the behavior of the program is undefined. The
+following calling conventions are supported by LLVM, and more may be added in
+the future:</p>
+
+<dl>
+ <dt><b>"<tt>ccc</tt>" - The C calling convention</b>:</dt>
+
+ <dd>This calling convention (the default if no other calling convention is
+ specified) matches the target C calling conventions. This calling convention
+ supports varargs function calls and tolerates some mismatch in the declared
+ prototype and implemented declaration of the function (as does normal C).
+ </dd>
+
+ <dt><b>"<tt>fastcc</tt>" - The fast calling convention</b>:</dt>
+
+ <dd>This calling convention attempts to make calls as fast as possible
+ (e.g. by passing things in registers). This calling convention allows the
+ target to use whatever tricks it wants to produce fast code for the target,
+ without having to conform to an externally specified ABI. Implementations of
+ this convention should allow arbitrary tail call optimization to be supported.
+ This calling convention does not support varargs and requires the prototype of
+ all callees to exactly match the prototype of the function definition.
+ </dd>
+
+ <dt><b>"<tt>coldcc</tt>" - The cold calling convention</b>:</dt>
+
+ <dd>This calling convention attempts to make code in the caller as efficient
+ as possible under the assumption that the call is not commonly executed. As
+ such, these calls often preserve all registers so that the call does not break
+ any live ranges in the caller side. This calling convention does not support
+ varargs and requires the prototype of all callees to exactly match the
+ prototype of the function definition.
+ </dd>
+
+ <dt><b>"<tt>cc &lt;<em>n</em>&gt;</tt>" - Numbered convention</b>:</dt>
+
+ <dd>Any calling convention may be specified by number, allowing
+ target-specific calling conventions to be used. Target specific calling
+ conventions start at 64.
+ </dd>
+</dl>
+
+<p>More calling conventions can be added/defined on an as-needed basis, to
+support Pascal conventions or any other well-known target-independent
+convention.</p>
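+
+<p>The matching requirement can be sketched as follows. The functions
+<tt>%add1</tt> and <tt>%caller</tt> are hypothetical; the point is only that
+the convention named on the call is the same one named on the callee's
+definition:</p>
+
+<pre>
+fastcc int %add1(int %x) {
+  %r = <a href="#i_add">add</a> int %x, 1
+  <a href="#i_ret">ret</a> int %r
+}
+
+int %caller() {                                 <i>; no keyword: defaults to ccc</i>
+  %v = <a href="#i_call">call</a> fastcc int %add1(int 41)      <i>; convention matches the callee</i>
+  <a href="#i_ret">ret</a> int %v
+}
+</pre>
+
+<p>Omitting <tt>fastcc</tt> on the call while keeping it on the definition
+would produce a caller/callee mismatch, which the text above defines as
+undefined behavior.</p>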
+
+</div>
+
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="globalvars">Global Variables</a>
+</div>
+
+<div class="doc_text">
+
+<p>Global variables define regions of memory allocated at compilation time
+instead of run-time. Global variables may optionally be initialized. A
+variable may be defined as a global "constant," which indicates that the
+contents of the variable will <b>never</b> be modified (enabling better
+optimization, allowing the global data to be placed in the read-only section of
+an executable, etc). Note that variables that need runtime initialization
+cannot be marked "constant" as there is a store to the variable.</p>
+
+<p>
+LLVM explicitly allows <em>declarations</em> of global variables to be marked
+constant, even if the final definition of the global is not. This capability
+can be used to enable slightly better optimization of the program, but requires
+the language definition to guarantee that optimizations based on the
+'constantness' are valid for the translation units that do not include the
+definition.
+</p>
+
+<p>As SSA values, global variables define pointer values that are in
+scope in (i.e. they dominate) all basic blocks in the program. Global
+variables always define a pointer to their "content" type because they
+describe a region of memory, and all memory objects in LLVM are
+accessed through pointers.</p>
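+
+<p>A minimal sketch of these rules, using hypothetical names: each global
+below is a pointer to its content type, so it is read and written through
+<tt><a href="#i_load">load</a></tt> and <tt><a href="#i_store">store</a></tt>
+rather than used directly as a value.</p>
+
+<pre>
+%G     = global int 0                 <i>; %G has type int*</i>
+%Limit = constant int 100             <i>; contents never modified</i>
+%Ext   = external global int          <i>; declaration, defined in another module</i>
+%ExtC  = external constant int        <i>; declaration marked constant</i>
+
+void %bump() {
+  %old = <a href="#i_load">load</a> int* %G
+  %new = <a href="#i_add">add</a> int %old, 1
+  <a href="#i_store">store</a> int %new, int* %G            <i>; legal: %G is not constant</i>
+  <a href="#i_ret">ret</a> void
+}
+</pre>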
+
+</div>
+
+
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+ <a name="functionstructure">Functions</a>
+</div>
+
+<div class="doc_text">
+
+<p>LLVM function definitions consist of an optional <a href="#linkage">linkage
+type</a>, an optional <a href="#callingconv">calling convention</a>, a return
+type, a function name, a (possibly empty) argument list, an opening curly brace,
+a list of basic blocks, and a closing curly brace. LLVM function declarations
+are defined with the "<tt>declare</tt>" keyword, an optional <a
+href="#callingconv">calling convention</a>, a return type, a function name, and
+a possibly empty list of arguments.</p>
+
+<p>A function definition contains a list of basic blocks, forming the CFG for
+the function. Each basic block may optionally start with a label (giving the
+basic block a symbol table entry), contains a list of instructions, and ends
+with a <a href="#terminators">terminator</a> instruction (such as a branch or
+function return).</p>
+
+<p>The first basic block in a function is special in two ways: it is immediately
+executed on entrance to the function, and it is not allowed to have predecessor
+basic blocks (i.e. there cannot be any branches to the entry block of a
+function). Because the block can have no predecessors, it also cannot have any
+<a href="#i_phi">PHI nodes</a>.</p>
+
+<p>LLVM functions are identified by their name and type signature. Hence, two
+functions with the same name but different parameter lists or return values are
+considered different functions, and LLVM will resolve references to each
+appropriately.</p>
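+
+<p>The structure described above can be sketched with a small, hypothetical
+function. Each labeled block ends in a terminator, the entry block has no
+predecessors, and the <a href="#i_phi">PHI node</a> appears only in a block
+that does have predecessors:</p>
+
+<pre>
+int %max(int %a, int %b) {
+entry:                                          <i>; executed first, never branched to</i>
+  %cond = <a href="#i_setcc">setgt</a> int %a, %b
+  <a href="#i_br">br</a> bool %cond, label %then, label %else
+then:
+  <a href="#i_br">br</a> label %done
+else:
+  <a href="#i_br">br</a> label %done
+done:                                           <i>; two predecessors: %then and %else</i>
+  %r = <a href="#i_phi">phi</a> int [ %a, %then ], [ %b, %else ]
+  <a href="#i_ret">ret</a> int %r
+}
+</pre>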
+
+</div>
+
+
+
+<!-- *********************************************************************** -->
+<div class="doc_section"> <a name="typesystem">Type System</a> </div>
+<!-- *********************************************************************** -->
+
+<div class="doc_text">
+
+<p>The LLVM type system is one of the most important features of the
+intermediate representation. Being typed enables a number of
+optimizations to be performed on the IR directly, without having to do
+extra analyses on the side before the transformation. A strong type
+system makes it easier to read the generated code and enables novel
+analyses and transformations that are not feasible to perform on normal
+three address code representations.</p>
+
+</div>
+
+<!-- ======================================================================= -->
+<div class="doc_subsection"> <a name="t_primitive">Primitive Types</a> </div>
+<div class="doc_text">
+<p>The primitive types are the fundamental building blocks of the LLVM
+system. The current set of primitive types is as follows:</p>
+
+<table class="layout">
+ <tr class="layout">
+ <td class="left">
+ <table>
+ <tbody>
+ <tr><th>Type</th><th>Description</th></tr>
+ <tr><td><tt>void</tt></td><td>No value</td></tr>
+ <tr><td><tt>ubyte</tt></td><td>Unsigned 8-bit value</td></tr>
+ <tr><td><tt>ushort</tt></td><td>Unsigned 16-bit value</td></tr>
+ <tr><td><tt>uint</tt></td><td>Unsigned 32-bit value</td></tr>
+ <tr><td><tt>ulong</tt></td><td>Unsigned 64-bit value</td></tr>
+ <tr><td><tt>float</tt></td><td>32-bit floating point value</td></tr>
+ <tr><td><tt>label</tt></td><td>Branch destination</td></tr>
+ </tbody>
+ </table>
+ </td>
+ <td class="right">
+ <table>
+ <tbody>
+ <tr><th>Type</th><th>Description</th></tr>
+ <tr><td><tt>bool</tt></td><td>True or False value</td></tr>
+ <tr><td><tt>sbyte</tt></td><td>Signed 8-bit value</td></tr>
+ <tr><td><tt>short</tt></td><td>Signed 16-bit value</td></tr>
+ <tr><td><tt>int</tt></td><td>Signed 32-bit value</td></tr>
+ <tr><td><tt>long</tt></td><td>Signed 64-bit value</td></tr>
+ <tr><td><tt>double</tt></td><td>64-bit floating point value</td></tr>
+ </tbody>
+ </table>
+ </td>
+ </tr>
+</table>
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection"> <a name="t_classifications">Type
+Classifications</a> </div>
+<div class="doc_text">
+<p>These different primitive types fall into a few useful
+classifications:</p>
+
+<table border="1" cellspacing="0" cellpadding="4">
+ <tbody>
+ <tr><th>Classification</th><th>Types</th></tr>
+ <tr>
+ <td><a name="t_signed">signed</a></td>
+ <td><tt>sbyte, short, int, long, float, double</tt></td>
+ </tr>
+ <tr>
+ <td><a name="t_unsigned">unsigned</a></td>
+ <td><tt>ubyte, ushort, uint, ulong</tt></td>
+ </tr>
+ <tr>
+ <td><a name="t_integer">integer</a></td>
+ <td><tt>ubyte, sbyte, ushort, short, uint, int, ulong, long</tt></td>
+ </tr>
+ <tr>
+ <td><a name="t_integral">integral</a></td>
+ <td><tt>bool, ubyte, sbyte, ushort, short, uint, int, ulong, long</tt>
+ </td>
+ </tr>
+ <tr>
+ <td><a name="t_floating">floating point</a></td>
+ <td><tt>float, double</tt></td>
+ </tr>
+ <tr>
+ <td><a name="t_firstclass">first class</a></td>
+ <td><tt>bool, ubyte, sbyte, ushort, short, uint, int, ulong, long,<br>
+ float, double, <a href="#t_pointer">pointer</a>,
+ <a href="#t_packed">packed</a></tt></td>
+ </tr>
+ </tbody>
+</table>
+
+<p>The <a href="#t_firstclass">first class</a> types are perhaps the
+most important. Values of these types are the only ones which can be
+produced by instructions, passed as arguments, or used as operands to
+instructions. This means that all structures and arrays must be
+manipulated either by pointer or by component.</p>
+</div>
+
+<!-- ======================================================================= -->
+<div class="doc_subsection"> <a name="t_derived">Derived Types</a> </div>
+
+<div class="doc_text">
+
+<p>The real power in LLVM comes from the derived types in the system.
+This is what allows a programmer to represent arrays, functions,
+pointers, and other useful types. Note that these derived types may be
+recursive: for example, it is possible to have a two-dimensional array.</p>
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection"> <a name="t_array">Array Type</a> </div>
+
+<div class="doc_text">
+
+<h5>Overview:</h5>
+
+<p>The array type is a very simple derived type that arranges elements
+sequentially in memory. The array type requires a size (number of
+elements) and an underlying data type.</p>
+
+<h5>Syntax:</h5>
+
+<pre>
+ [&lt;# elements&gt; x &lt;elementtype&gt;]
+</pre>
+
+<p>The number of elements is a constant integer value; elementtype may
+be any type with a size.</p>
+
+<h5>Examples:</h5>
+<table class="layout">
+ <tr class="layout">
+ <td class="left">
+ <tt>[40 x int ]</tt><br/>
+ <tt>[41 x int ]</tt><br/>
+ <tt>[40 x uint]</tt><br/>
+ </td>
+ <td class="left">
+ Array of 40 integer values.<br/>
+ Array of 41 integer values.<br/>
+ Array of 40 unsigned integer values.<br/>
+ </td>
+ </tr>
+</table>
+<p>Here are some examples of multidimensional arrays:</p>
+<table class="layout">
+ <tr class="layout">
+ <td class="left">
+ <tt>[3 x [4 x int]]</tt><br/>
+ <tt>[12 x [10 x float]]</tt><br/>
+ <tt>[2 x [3 x [4 x uint]]]</tt><br/>
+ </td>
+ <td class="left">
+ 3x4 array of integer values.<br/>
+ 12x10 array of single precision floating point values.<br/>
+ 2x3x4 array of unsigned integer values.<br/>
+ </td>
+ </tr>
+</table>
+
+<p>Note that 'variable sized arrays' can be implemented in LLVM with a zero
+length array. Normally, accesses past the end of an array are undefined in
+LLVM (e.g. it is illegal to access the 5th element of a 3 element array).
+As a special case, however, zero length arrays are recognized to be variable
+length. This allows implementation of 'pascal style arrays' with the LLVM
+type "{ int, [0 x float]}", for example.</p>
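+
+<p>As a sketch of how a multidimensional array is accessed (the global
+<tt>%M</tt> and function <tt>%get</tt> are hypothetical), the
+'<tt><a href="#i_getelementptr">getelementptr</a></tt>' instruction takes one
+index to step through the pointer and then one index per array dimension:</p>
+
+<pre>
+%M = global [3 x [4 x int]] zeroinitializer     <i>; [3 x [4 x int]]*</i>
+
+int %get(long %i, long %j) {
+  <i>; Compute the address of element [%i][%j] of %M</i>
+  %p = <a href="#i_getelementptr">getelementptr</a> [3 x [4 x int]]* %M, long 0, long %i, long %j   <i>; int*</i>
+  %v = <a href="#i_load">load</a> int* %p
+  <a href="#i_ret">ret</a> int %v
+}
+</pre>
+
+<p>The indices are not range checked here; per the note above, an access past
+the end of either dimension is undefined.</p>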
+
+</div>
+
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection"> <a name="t_function">Function Type</a> </div>
+<div class="doc_text">
+<h5>Overview:</h5>
+<p>The function type can be thought of as a function signature. It
+consists of a return type and a list of formal parameter types.
+Function types are usually used to build virtual function tables
+(which are structures of pointers to functions), for indirect function
+calls, and when defining a function.</p>
+<p>
+The return type of a function type cannot be an aggregate type.
+</p>
+<h5>Syntax:</h5>
+<pre> &lt;returntype&gt; (&lt;parameter list&gt;)<br></pre>
+<p>...where '<tt>&lt;parameter list&gt;</tt>' is a comma-separated list of type
+specifiers. Optionally, the parameter list may include a type <tt>...</tt>,
+which indicates that the function takes a variable number of arguments.
+Variable argument functions can access their arguments with the <a
+ href="#int_varargs">variable argument handling intrinsic</a> functions.</p>
+<h5>Examples:</h5>
+<table class="layout">
+ <tr class="layout">
+ <td class="left">
+ <tt>int (int)</tt> <br/>
+ <tt>float (int, int *) *</tt><br/>
+ <tt>int (sbyte *, ...)</tt><br/>
+ </td>
+ <td class="left">
+ function taking an <tt>int</tt>, returning an <tt>int</tt><br/>
+ <a href="#t_pointer">Pointer</a> to a function that takes an
+ <tt>int</tt> and a <a href="#t_pointer">pointer</a> to <tt>int</tt>,
+ returning <tt>float</tt>.<br/>
+ A vararg function that takes at least one <a href="#t_pointer">pointer</a>
+ to <tt>sbyte</tt> (signed char in C), which returns an integer. This is
+ the signature for <tt>printf</tt> in LLVM.<br/>
+ </td>
+ </tr>
+</table>
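+
+<p>A brief sketch of an indirect call through a pointer to a function type
+(the names <tt>%apply</tt>, <tt>%f</tt>, and <tt>%x</tt> are hypothetical):</p>
+
+<pre>
+int %apply(int (int)* %f, int %x) {
+  %r = <a href="#i_call">call</a> int %f(int %x)        <i>; %f has type int (int)*</i>
+  <a href="#i_ret">ret</a> int %r
+}
+</pre>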
+
+</div>
+<!-- _______________________________________________________________________ -->
+<div class="doc_subsubsection"> <a name="t_struct">Structure Type</a> </div>
+<div class="doc_text">
+<h5>Overview:</h5>
+<p>The structure type is used to represent a collection of data members
+together in memory. The packing of the field types is defined to match
+the ABI of the underlying processor. The elements of a structure may
+be any type that has a size.</p>
+<p>Structures are accessed using '<tt><a href="#i_load">load</a></tt>'
+and '<tt><a href="#i_store">store</a></tt>' by getting a pointer to a
+field with the '<tt><a href="#i_getelementptr">getelementptr</a></tt>'
+instruction.</p>
+<h5>Syntax:</h5>