Diffstat (limited to 'docs/LangRef.html')
-rw-r--r-- | docs/LangRef.html | 3320 |
1 files changed, 3320 insertions, 0 deletions
diff --git a/docs/LangRef.html b/docs/LangRef.html new file mode 100644 index 0000000000..6b5bb962c6 --- /dev/null +++ b/docs/LangRef.html @@ -0,0 +1,3320 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" + "http://www.w3.org/TR/html4/strict.dtd"> +<html> +<head> + <title>LLVM Assembly Language Reference Manual</title> + <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> + <meta name="author" content="Chris Lattner"> + <meta name="description" + content="LLVM Assembly Language Reference Manual."> + <link rel="stylesheet" href="llvm.css" type="text/css"> +</head> + +<body> + +<div class="doc_title"> LLVM Language Reference Manual </div> +<ol> + <li><a href="#abstract">Abstract</a></li> + <li><a href="#introduction">Introduction</a></li> + <li><a href="#identifiers">Identifiers</a></li> + <li><a href="#highlevel">High Level Structure</a> + <ol> + <li><a href="#modulestructure">Module Structure</a></li> + <li><a href="#linkage">Linkage Types</a></li> + <li><a href="#callingconv">Calling Conventions</a></li> + <li><a href="#globalvars">Global Variables</a></li> + <li><a href="#functionstructure">Function Structure</a></li> + </ol> + </li> + <li><a href="#typesystem">Type System</a> + <ol> + <li><a href="#t_primitive">Primitive Types</a> + <ol> + <li><a href="#t_classifications">Type Classifications</a></li> + </ol> + </li> + <li><a href="#t_derived">Derived Types</a> + <ol> + <li><a href="#t_array">Array Type</a></li> + <li><a href="#t_function">Function Type</a></li> + <li><a href="#t_pointer">Pointer Type</a></li> + <li><a href="#t_struct">Structure Type</a></li> + <li><a href="#t_packed">Packed Type</a></li> + <li><a href="#t_opaque">Opaque Type</a></li> + </ol> + </li> + </ol> + </li> + <li><a href="#constants">Constants</a> + <ol> + <li><a href="#simpleconstants">Simple Constants</a> + <li><a href="#aggregateconstants">Aggregate Constants</a> + <li><a href="#globalconstants">Global Variable and Function Addresses</a> + <li><a 
href="#undefvalues">Undefined Values</a> + <li><a href="#constantexprs">Constant Expressions</a> + </ol> + </li> + <li><a href="#instref">Instruction Reference</a> + <ol> + <li><a href="#terminators">Terminator Instructions</a> + <ol> + <li><a href="#i_ret">'<tt>ret</tt>' Instruction</a></li> + <li><a href="#i_br">'<tt>br</tt>' Instruction</a></li> + <li><a href="#i_switch">'<tt>switch</tt>' Instruction</a></li> + <li><a href="#i_invoke">'<tt>invoke</tt>' Instruction</a></li> + <li><a href="#i_unwind">'<tt>unwind</tt>' Instruction</a></li> + <li><a href="#i_unreachable">'<tt>unreachable</tt>' Instruction</a></li> + </ol> + </li> + <li><a href="#binaryops">Binary Operations</a> + <ol> + <li><a href="#i_add">'<tt>add</tt>' Instruction</a></li> + <li><a href="#i_sub">'<tt>sub</tt>' Instruction</a></li> + <li><a href="#i_mul">'<tt>mul</tt>' Instruction</a></li> + <li><a href="#i_div">'<tt>div</tt>' Instruction</a></li> + <li><a href="#i_rem">'<tt>rem</tt>' Instruction</a></li> + <li><a href="#i_setcc">'<tt>set<i>cc</i></tt>' Instructions</a></li> + </ol> + </li> + <li><a href="#bitwiseops">Bitwise Binary Operations</a> + <ol> + <li><a href="#i_and">'<tt>and</tt>' Instruction</a></li> + <li><a href="#i_or">'<tt>or</tt>' Instruction</a></li> + <li><a href="#i_xor">'<tt>xor</tt>' Instruction</a></li> + <li><a href="#i_shl">'<tt>shl</tt>' Instruction</a></li> + <li><a href="#i_shr">'<tt>shr</tt>' Instruction</a></li> + </ol> + </li> + <li><a href="#memoryops">Memory Access Operations</a> + <ol> + <li><a href="#i_malloc">'<tt>malloc</tt>' Instruction</a></li> + <li><a href="#i_free">'<tt>free</tt>' Instruction</a></li> + <li><a href="#i_alloca">'<tt>alloca</tt>' Instruction</a></li> + <li><a href="#i_load">'<tt>load</tt>' Instruction</a></li> + <li><a href="#i_store">'<tt>store</tt>' Instruction</a></li> + <li><a href="#i_getelementptr">'<tt>getelementptr</tt>' Instruction</a></li> + </ol> + </li> + <li><a href="#otherops">Other Operations</a> + <ol> + <li><a 
href="#i_phi">'<tt>phi</tt>' Instruction</a></li> + <li><a href="#i_cast">'<tt>cast .. to</tt>' Instruction</a></li> + <li><a href="#i_select">'<tt>select</tt>' Instruction</a></li> + <li><a href="#i_call">'<tt>call</tt>' Instruction</a></li> + <li><a href="#i_vaarg">'<tt>vaarg</tt>' Instruction</a></li> + </ol> + </li> + </ol> + </li> + <li><a href="#intrinsics">Intrinsic Functions</a> + <ol> + <li><a href="#int_varargs">Variable Argument Handling Intrinsics</a> + <ol> + <li><a href="#i_va_start">'<tt>llvm.va_start</tt>' Intrinsic</a></li> + <li><a href="#i_va_end">'<tt>llvm.va_end</tt>' Intrinsic</a></li> + <li><a href="#i_va_copy">'<tt>llvm.va_copy</tt>' Intrinsic</a></li> + </ol> + </li> + <li><a href="#int_gc">Accurate Garbage Collection Intrinsics</a> + <ol> + <li><a href="#i_gcroot">'<tt>llvm.gcroot</tt>' Intrinsic</a></li> + <li><a href="#i_gcread">'<tt>llvm.gcread</tt>' Intrinsic</a></li> + <li><a href="#i_gcwrite">'<tt>llvm.gcwrite</tt>' Intrinsic</a></li> + </ol> + </li> + <li><a href="#int_codegen">Code Generator Intrinsics</a> + <ol> + <li><a href="#i_returnaddress">'<tt>llvm.returnaddress</tt>' Intrinsic</a></li> + <li><a href="#i_frameaddress">'<tt>llvm.frameaddress</tt>' Intrinsic</a></li> + <li><a href="#i_prefetch">'<tt>llvm.prefetch</tt>' Intrinsic</a></li> + <li><a href="#i_pcmarker">'<tt>llvm.pcmarker</tt>' Intrinsic</a></li> + </ol> + </li> + <li><a href="#int_os">Operating System Intrinsics</a> + <ol> + <li><a href="#i_readport">'<tt>llvm.readport</tt>' Intrinsic</a></li> + <li><a href="#i_writeport">'<tt>llvm.writeport</tt>' Intrinsic</a></li> + <li><a href="#i_readio">'<tt>llvm.readio</tt>' Intrinsic</a></li> + <li><a href="#i_writeio">'<tt>llvm.writeio</tt>' Intrinsic</a></li> + </ol> + <li><a href="#int_libc">Standard C Library Intrinsics</a> + <ol> + <li><a href="#i_memcpy">'<tt>llvm.memcpy</tt>' Intrinsic</a></li> + <li><a href="#i_memmove">'<tt>llvm.memmove</tt>' Intrinsic</a></li> + <li><a href="#i_memset">'<tt>llvm.memset</tt>' 
Intrinsic</a></li> + <li><a href="#i_isunordered">'<tt>llvm.isunordered</tt>' Intrinsic</a></li> + <li><a href="#i_sqrt">'<tt>llvm.sqrt</tt>' Intrinsic</a></li> + + </ol> + </li> + <li><a href="#int_count">Bit counting Intrinsics</a> + <ol> + <li><a href="#int_ctpop">'<tt>llvm.ctpop</tt>' Intrinsic </a></li> + <li><a href="#int_ctlz">'<tt>llvm.ctlz</tt>' Intrinsic </a></li> + <li><a href="#int_cttz">'<tt>llvm.cttz</tt>' Intrinsic </a></li> + </ol> + </li> + <li><a href="#int_debugger">Debugger intrinsics</a></li> + </ol> + </li> +</ol> + +<div class="doc_author"> + <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a> + and <a href="mailto:vadve@cs.uiuc.edu">Vikram Adve</a></p> +</div> + +<!-- *********************************************************************** --> +<div class="doc_section"> <a name="abstract">Abstract </a></div> +<!-- *********************************************************************** --> + +<div class="doc_text"> +<p>This document is a reference manual for the LLVM assembly language. +LLVM is an SSA based representation that provides type safety, +low-level operations, flexibility, and the capability of representing +'all' high-level languages cleanly. It is the common code +representation used throughout all phases of the LLVM compilation +strategy.</p> +</div> + +<!-- *********************************************************************** --> +<div class="doc_section"> <a name="introduction">Introduction</a> </div> +<!-- *********************************************************************** --> + +<div class="doc_text"> + +<p>The LLVM code representation is designed to be used in three +different forms: as an in-memory compiler IR, as an on-disk bytecode +representation (suitable for fast loading by a Just-In-Time compiler), +and as a human readable assembly language representation. 
This allows +LLVM to provide a powerful intermediate representation for efficient +compiler transformations and analysis, while providing a natural means +to debug and visualize the transformations. The three different forms +of LLVM are all equivalent. This document describes the human readable +representation and notation.</p> + +<p>The LLVM representation aims to be light-weight and low-level +while being expressive, typed, and extensible at the same time. It +aims to be a "universal IR" of sorts, by being at a low enough level +that high-level ideas may be cleanly mapped to it (similar to how +microprocessors are "universal IR's", allowing many source languages to +be mapped to them). By providing type information, LLVM can be used as +the target of optimizations: for example, through pointer analysis, it +can be proven that a C automatic variable is never accessed outside of +the current function... allowing it to be promoted to a simple SSA +value instead of a memory location.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> <a name="wellformed">Well-Formedness</a> </div> + +<div class="doc_text"> + +<p>It is important to note that this document describes 'well formed' +LLVM assembly language. There is a difference between what the parser +accepts and what is considered 'well formed'. For example, the +following instruction is syntactically okay, but not well formed:</p> + +<pre> + %x = <a href="#i_add">add</a> int 1, %x +</pre> + +<p>...because the definition of <tt>%x</tt> does not dominate all of +its uses. The LLVM infrastructure provides a verification pass that may +be used to verify that an LLVM module is well formed. This pass is +automatically run by the parser after parsing input assembly and by +the optimizer before it outputs bytecode. 
The violations pointed out +by the verifier pass indicate bugs in transformation passes or input to +the parser.</p> + +<!-- Describe the typesetting conventions here. --> </div> + +<!-- *********************************************************************** --> +<div class="doc_section"> <a name="identifiers">Identifiers</a> </div> +<!-- *********************************************************************** --> + +<div class="doc_text"> + +<p>LLVM uses three different forms of identifiers, for different +purposes:</p> + +<ol> + <li>Named values are represented as a string of characters with a '%' prefix. + For example, %foo, %DivisionByZero, %a.really.long.identifier. The actual + regular expression used is '<tt>%[a-zA-Z$._][a-zA-Z$._0-9]*</tt>'. + Identifiers which require other characters in their names can be surrounded + with quotes. In this way, anything except a <tt>"</tt> character can be used + in a name.</li> + + <li>Unnamed values are represented as an unsigned numeric value with a '%' + prefix. For example, %12, %2, %44.</li> + + <li>Constants, which are described in a <a href="#constants">section about + constants</a>, below.</li> +</ol> + +<p>LLVM requires that values start with a '%' sign for two reasons: Compilers +don't need to worry about name clashes with reserved words, and the set of +reserved words may be expanded in the future without penalty. Additionally, +unnamed identifiers allow a compiler to quickly come up with a temporary +variable without having to avoid symbol table conflicts.</p> + +<p>Reserved words in LLVM are very similar to reserved words in other +languages. There are keywords for different opcodes ('<tt><a +href="#i_add">add</a></tt>', '<tt><a href="#i_cast">cast</a></tt>', '<tt><a +href="#i_ret">ret</a></tt>', etc...), for primitive type names ('<tt><a +href="#t_void">void</a></tt>', '<tt><a href="#t_uint">uint</a></tt>', etc...), +and others. 
These reserved words cannot conflict with variable names, because +none of them start with a '%' character.</p> + +<p>Here is an example of LLVM code to multiply the integer variable +'<tt>%X</tt>' by 8:</p> + +<p>The easy way:</p> + +<pre> + %result = <a href="#i_mul">mul</a> uint %X, 8 +</pre> + +<p>After strength reduction:</p> + +<pre> + %result = <a href="#i_shl">shl</a> uint %X, ubyte 3 +</pre> + +<p>And the hard way:</p> + +<pre> + <a href="#i_add">add</a> uint %X, %X <i>; yields {uint}:%0</i> + <a href="#i_add">add</a> uint %0, %0 <i>; yields {uint}:%1</i> + %result = <a href="#i_add">add</a> uint %1, %1 +</pre> + +<p>This last way of multiplying <tt>%X</tt> by 8 illustrates several +important lexical features of LLVM:</p> + +<ol> + + <li>Comments are delimited with a '<tt>;</tt>' and go until the end of + line.</li> + + <li>Unnamed temporaries are created when the result of a computation is not + assigned to a named value.</li> + + <li>Unnamed temporaries are numbered sequentially</li> + +</ol> + +<p>...and it also shows a convention that we follow in this document. When +demonstrating instructions, we will follow an instruction with a comment that +defines the type and name of value produced. Comments are shown in italic +text.</p> + +</div> + +<!-- *********************************************************************** --> +<div class="doc_section"> <a name="highlevel">High Level Structure</a> </div> +<!-- *********************************************************************** --> + +<!-- ======================================================================= --> +<div class="doc_subsection"> <a name="modulestructure">Module Structure</a> +</div> + +<div class="doc_text"> + +<p>LLVM programs are composed of "Module"s, each of which is a +translation unit of the input programs. Each module consists of +functions, global variables, and symbol table entries. 
Modules may be +combined together with the LLVM linker, which merges function (and +global variable) definitions, resolves forward declarations, and merges +symbol table entries. Here is an example of the "hello world" module:</p> + +<pre><i>; Declare the string constant as a global constant...</i> +<a href="#identifiers">%.LC0</a> = <a href="#linkage_internal">internal</a> <a + href="#globalvars">constant</a> <a href="#t_array">[13 x sbyte]</a> c"hello world\0A\00" <i>; [13 x sbyte]*</i> + +<i>; External declaration of the puts function</i> +<a href="#functionstructure">declare</a> int %puts(sbyte*) <i>; int(sbyte*)* </i> + +<i>; Definition of main function</i> +int %main() { <i>; int()* </i> + <i>; Convert [13x sbyte]* to sbyte *...</i> + %cast210 = <a + href="#i_getelementptr">getelementptr</a> [13 x sbyte]* %.LC0, long 0, long 0 <i>; sbyte*</i> + + <i>; Call puts function to write out the string to stdout...</i> + <a + href="#i_call">call</a> int %puts(sbyte* %cast210) <i>; int</i> + <a + href="#i_ret">ret</a> int 0<br>}<br></pre> + +<p>This example is made up of a <a href="#globalvars">global variable</a> +named "<tt>.LC0</tt>", an external declaration of the "<tt>puts</tt>" +function, and a <a href="#functionstructure">function definition</a> +for "<tt>main</tt>".</p> + +<p>In general, a module is made up of a list of global values, +where both functions and global variables are global values. 
Global values are +represented by a pointer to a memory location (in this case, a pointer to an +array of char, and a pointer to a function), and have one of the following <a +href="#linkage">linkage types</a>.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="linkage">Linkage Types</a> +</div> + +<div class="doc_text"> + +<p> +All Global Variables and Functions have one of the following types of linkage: +</p> + +<dl> + + <dt><tt><b><a name="linkage_internal">internal</a></b></tt> </dt> + + <dd>Global values with internal linkage are only directly accessible by + objects in the current module. In particular, linking code into a module with + an internal global value may cause the internal to be renamed as necessary to + avoid collisions. Because the symbol is internal to the module, all + references can be updated. This corresponds to the notion of the + '<tt>static</tt>' keyword in C, or the idea of "anonymous namespaces" in C++. + </dd> + + <dt><tt><b><a name="linkage_linkonce">linkonce</a></b></tt>: </dt> + + <dd>"<tt>linkonce</tt>" linkage is similar to <tt>internal</tt> linkage, with + the twist that linking together two modules defining the same + <tt>linkonce</tt> globals will cause one of the globals to be discarded. This + is typically used to implement inline functions. Unreferenced + <tt>linkonce</tt> globals are allowed to be discarded. + </dd> + + <dt><tt><b><a name="linkage_weak">weak</a></b></tt>: </dt> + + <dd>"<tt>weak</tt>" linkage is exactly the same as <tt>linkonce</tt> linkage, + except that unreferenced <tt>weak</tt> globals may not be discarded. This is + used to implement constructs in C such as "<tt>int X;</tt>" at global scope. + </dd> + + <dt><tt><b><a name="linkage_appending">appending</a></b></tt>: </dt> + + <dd>"<tt>appending</tt>" linkage may only be applied to global variables of + pointer to array type. 
When two global variables with appending linkage are + linked together, the two global arrays are appended together. This is the + LLVM, typesafe, equivalent of having the system linker append together + "sections" with identical names when .o files are linked. + </dd> + + <dt><tt><b><a name="linkage_external">externally visible</a></b></tt>:</dt> + + <dd>If none of the above identifiers are used, the global is externally + visible, meaning that it participates in linkage and can be used to resolve + external symbol references. + </dd> +</dl> + +<p><a name="linkage_external">For example, since the "<tt>.LC0</tt>" +variable is defined to be internal, if another module defined a "<tt>.LC0</tt>" +variable and was linked with this one, one of the two would be renamed, +preventing a collision. Since "<tt>main</tt>" and "<tt>puts</tt>" are +external (i.e., lacking any linkage declarations), they are accessible +outside of the current module. It is illegal for a function <i>declaration</i> +to have any linkage type other than "externally visible".</a></p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="callingconv">Calling Conventions</a> +</div> + +<div class="doc_text"> + +<p>LLVM <a href="#functionstructure">functions</a>, <a href="#i_call">calls</a> +and <a href="#i_invoke">invokes</a> can all have an optional calling convention +specified for the call. The calling convention of any pair of dynamic +caller/callee must match, or the behavior of the program is undefined. The +following calling conventions are supported by LLVM, and more may be added in +the future:</p> + +<dl> + <dt><b>"<tt>ccc</tt>" - The C calling convention</b>:</dt> + + <dd>This calling convention (the default if no other calling convention is + specified) matches the target C calling conventions. 
This calling convention + supports varargs function calls and tolerates some mismatch in the declared + prototype and implemented declaration of the function (as does normal C). + </dd> + + <dt><b>"<tt>fastcc</tt>" - The fast calling convention</b>:</dt> + + <dd>This calling convention attempts to make calls as fast as possible + (e.g. by passing things in registers). This calling convention allows the + target to use whatever tricks it wants to produce fast code for the target, + without having to conform to an externally specified ABI. Implementations of + this convention should allow arbitrary tail call optimization to be supported. + This calling convention does not support varargs and requires the prototype of + all callees to exactly match the prototype of the function definition. + </dd> + + <dt><b>"<tt>coldcc</tt>" - The cold calling convention</b>:</dt> + + <dd>This calling convention attempts to make code in the caller as efficient + as possible under the assumption that the call is not commonly executed. As + such, these calls often preserve all registers so that the call does not break + any live ranges in the caller side. This calling convention does not support + varargs and requires the prototype of all callees to exactly match the + prototype of the function definition. + </dd> + + <dt><b>"<tt>cc &lt;<em>n</em>&gt;</tt>" - Numbered convention</b>:</dt> + + <dd>Any calling convention may be specified by number, allowing + target-specific calling conventions to be used. Target specific calling + conventions start at 64. 
+ </dd> +</dl> + +<p>More calling conventions can be added/defined on an as-needed basis, to +support pascal conventions or any other well-known target-independent +convention.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="globalvars">Global Variables</a> +</div> + +<div class="doc_text"> + +<p>Global variables define regions of memory allocated at compilation time +instead of run-time. Global variables may optionally be initialized. A +variable may be defined as a global "constant," which indicates that the +contents of the variable will <b>never</b> be modified (enabling better +optimization, allowing the global data to be placed in the read-only section of +an executable, etc). Note that variables that need runtime initialization +cannot be marked "constant" as there is a store to the variable.</p> + +<p> +LLVM explicitly allows <em>declarations</em> of global variables to be marked +constant, even if the final definition of the global is not. This capability +can be used to enable slightly better optimization of the program, but requires +the language definition to guarantee that optimizations based on the +'constantness' are valid for the translation units that do not include the +definition. +</p> + +<p>As SSA values, global variables define pointer values that are in +scope (i.e. they dominate) all basic blocks in the program. 
Global +variables always define a pointer to their "content" type because they +describe a region of memory, and all memory objects in LLVM are +accessed through pointers.</p> + +</div> + + +<!-- ======================================================================= --> +<div class="doc_subsection"> + <a name="functionstructure">Functions</a> +</div> + +<div class="doc_text"> + +<p>LLVM function definitions consist of an optional <a href="#linkage">linkage +type</a>, an optional <a href="#callingconv">calling convention</a>, a return +type, a function name, a (possibly empty) argument list, an opening curly brace, +a list of basic blocks, and a closing curly brace. LLVM function declarations +are defined with the "<tt>declare</tt>" keyword, an optional <a +href="#callingconv">calling convention</a>, a return type, a function name, and +a possibly empty list of arguments.</p> + +<p>A function definition contains a list of basic blocks, forming the CFG for +the function. Each basic block may optionally start with a label (giving the +basic block a symbol table entry), contains a list of instructions, and ends +with a <a href="#terminators">terminator</a> instruction (such as a branch or +function return).</p> + +<p>The first basic block in a function is special in two ways: it is immediately +executed on entrance to the function, and it is not allowed to have predecessor +basic blocks (i.e. there cannot be any branches to the entry block of a +function). Because the block can have no predecessors, it also cannot have any +<a href="#i_phi">PHI nodes</a>.</p> + +<p>LLVM functions are identified by their name and type signature. 
Hence, two +functions with the same name but different parameter lists or return values are +considered different functions, and LLVM will resolve references to each +appropriately.</p> + +</div> + + + +<!-- *********************************************************************** --> +<div class="doc_section"> <a name="typesystem">Type System</a> </div> +<!-- *********************************************************************** --> + +<div class="doc_text"> + +<p>The LLVM type system is one of the most important features of the +intermediate representation. Being typed enables a number of +optimizations to be performed on the IR directly, without having to do +extra analyses on the side before the transformation. A strong type +system makes it easier to read the generated code and enables novel +analyses and transformations that are not feasible to perform on normal +three address code representations.</p> + +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> <a name="t_primitive">Primitive Types</a> </div> +<div class="doc_text"> +<p>The primitive types are the fundamental building blocks of the LLVM +system. 
The current set of primitive types is as follows:</p> + +<table class="layout"> + <tr class="layout"> + <td class="left"> + <table> + <tbody> + <tr><th>Type</th><th>Description</th></tr> + <tr><td><tt>void</tt></td><td>No value</td></tr> + <tr><td><tt>ubyte</tt></td><td>Unsigned 8-bit value</td></tr> + <tr><td><tt>ushort</tt></td><td>Unsigned 16-bit value</td></tr> + <tr><td><tt>uint</tt></td><td>Unsigned 32-bit value</td></tr> + <tr><td><tt>ulong</tt></td><td>Unsigned 64-bit value</td></tr> + <tr><td><tt>float</tt></td><td>32-bit floating point value</td></tr> + <tr><td><tt>label</tt></td><td>Branch destination</td></tr> + </tbody> + </table> + </td> + <td class="right"> + <table> + <tbody> + <tr><th>Type</th><th>Description</th></tr> + <tr><td><tt>bool</tt></td><td>True or False value</td></tr> + <tr><td><tt>sbyte</tt></td><td>Signed 8-bit value</td></tr> + <tr><td><tt>short</tt></td><td>Signed 16-bit value</td></tr> + <tr><td><tt>int</tt></td><td>Signed 32-bit value</td></tr> + <tr><td><tt>long</tt></td><td>Signed 64-bit value</td></tr> + <tr><td><tt>double</tt></td><td>64-bit floating point value</td></tr> + </tbody> + </table> + </td> + </tr> +</table> +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> <a name="t_classifications">Type +Classifications</a> </div> +<div class="doc_text"> +<p>These different primitive types fall into a few useful +classifications:</p> + +<table border="1" cellspacing="0" cellpadding="4"> + <tbody> + <tr><th>Classification</th><th>Types</th></tr> + <tr> + <td><a name="t_signed">signed</a></td> + <td><tt>sbyte, short, int, long, float, double</tt></td> + </tr> + <tr> + <td><a name="t_unsigned">unsigned</a></td> + <td><tt>ubyte, ushort, uint, ulong</tt></td> + </tr> + <tr> + <td><a name="t_integer">integer</a></td> + <td><tt>ubyte, sbyte, ushort, short, uint, int, ulong, long</tt></td> + </tr> + <tr> + <td><a name="t_integral">integral</a></td> + <td><tt>bool, 
ubyte, sbyte, ushort, short, uint, int, ulong, long</tt> + </td> + </tr> + <tr> + <td><a name="t_floating">floating point</a></td> + <td><tt>float, double</tt></td> + </tr> + <tr> + <td><a name="t_firstclass">first class</a></td> + <td><tt>bool, ubyte, sbyte, ushort, short, uint, int, ulong, long,<br> + float, double, <a href="#t_pointer">pointer</a>, + <a href="#t_packed">packed</a></tt></td> + </tr> + </tbody> +</table> + +<p>The <a href="#t_firstclass">first class</a> types are perhaps the +most important. Values of these types are the only ones which can be +produced by instructions, passed as arguments, or used as operands to +instructions. This means that all structures and arrays must be +manipulated either by pointer or by component.</p> +</div> + +<!-- ======================================================================= --> +<div class="doc_subsection"> <a name="t_derived">Derived Types</a> </div> + +<div class="doc_text"> + +<p>The real power in LLVM comes from the derived types in the system. +This is what allows a programmer to represent arrays, functions, +pointers, and other useful types. Note that these derived types may be +recursive: For example, it is possible to have a two dimensional array.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> <a name="t_array">Array Type</a> </div> + +<div class="doc_text"> + +<h5>Overview:</h5> + +<p>The array type is a very simple derived type that arranges elements +sequentially in memory. 
The array type requires a size (number of +elements) and an underlying data type.</p> + +<h5>Syntax:</h5> + +<pre> + [&lt;# elements&gt; x &lt;elementtype&gt;] +</pre> + +<p>The number of elements is a constant integer value; elementtype may +be any type with a size.</p> + +<h5>Examples:</h5> +<table class="layout"> + <tr class="layout"> + <td class="left"> + <tt>[40 x int ]</tt><br/> + <tt>[41 x int ]</tt><br/> + <tt>[40 x uint]</tt><br/> + </td> + <td class="left"> + Array of 40 integer values.<br/> + Array of 41 integer values.<br/> + Array of 40 unsigned integer values.<br/> + </td> + </tr> +</table> +<p>Here are some examples of multidimensional arrays:</p> +<table class="layout"> + <tr class="layout"> + <td class="left"> + <tt>[3 x [4 x int]]</tt><br/> + <tt>[12 x [10 x float]]</tt><br/> + <tt>[2 x [3 x [4 x uint]]]</tt><br/> + </td> + <td class="left"> + 3x4 array of integer values.<br/> + 12x10 array of single precision floating point values.<br/> + 2x3x4 array of unsigned integer values.<br/> + </td> + </tr> +</table> + +<p>Note that 'variable sized arrays' can be implemented in LLVM with a zero +length array. Normally, accesses past the end of an array are undefined in +LLVM (e.g. it is illegal to access the 5th element of a 3 element array). +As a special case, however, zero length arrays are recognized to be variable +length. This allows implementation of 'pascal style arrays' with the LLVM +type "{ int, [0 x float]}", for example.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> <a name="t_function">Function Type</a> </div> +<div class="doc_text"> +<h5>Overview:</h5> +<p>The function type can be thought of as a function signature. It +consists of a return type and a list of formal parameter types. 
+Function types are usually used to build virtual function tables +(which are structures of pointers to functions), for indirect function +calls, and when defining a function.</p> +<p> +The return type of a function type cannot be an aggregate type. +</p> +<h5>Syntax:</h5> +<pre> &lt;returntype&gt; (&lt;parameter list&gt;)<br></pre> +<p>...where '<tt>&lt;parameter list&gt;</tt>' is a comma-separated list of type +specifiers. Optionally, the parameter list may include a type <tt>...</tt>, +which indicates that the function takes a variable number of arguments. +Variable argument functions can access their arguments with the <a + href="#int_varargs">variable argument handling intrinsic</a> functions.</p> +<h5>Examples:</h5> +<table class="layout"> + <tr class="layout"> + <td class="left"> + <tt>int (int)</tt> <br/> + <tt>float (int, int *) *</tt><br/> + <tt>int (sbyte *, ...)</tt><br/> + </td> + <td class="left"> + function taking an <tt>int</tt>, returning an <tt>int</tt><br/> + <a href="#t_pointer">Pointer</a> to a function that takes an + <tt>int</tt> and a <a href="#t_pointer">pointer</a> to <tt>int</tt>, + returning <tt>float</tt>.<br/> + A vararg function that takes at least one <a href="#t_pointer">pointer</a> + to <tt>sbyte</tt> (signed char in C), which returns an integer. This is + the signature for <tt>printf</tt> in LLVM.<br/> + </td> + </tr> +</table> + +</div> +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> <a name="t_struct">Structure Type</a> </div> +<div class="doc_text"> +<h5>Overview:</h5> +<p>The structure type is used to represent a collection of data members +together in memory. The packing of the field types is defined to match +the ABI of the underlying processor. 
The elements of a structure may +be any type that has a size.</p> +<p>Structures are accessed using '<tt><a href="#i_load">load</a></tt>' +and '<tt><a href="#i_store">store</a></tt>' by getting a pointer to a +field with the '<tt><a href="#i_getelementptr">getelementptr</a></tt>' +instruction.</p> +<h5>Syntax:</h5> |