author | Bill Wendling <isanbard@gmail.com> | 2009-04-15 02:12:37 +0000 |
---|---|---|
committer | Bill Wendling <isanbard@gmail.com> | 2009-04-15 02:12:37 +0000 |
commit | 801188066a71467fc031721903ab8986dd11acbe | |
tree | 0ee46a9914f3c2891411429adc8a0180dc945c04 /docs/CodeGenerator.html | |
parent | 61e08bd97fce1c84dd0e6f01f71d5f56f5f5ad0f |
More obsessive reformatting. Fixed some validation errors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@69130 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs/CodeGenerator.html')
-rw-r--r-- | docs/CodeGenerator.html | 2054 |
1 files changed, 1045 insertions, 1009 deletions
diff --git a/docs/CodeGenerator.html b/docs/CodeGenerator.html index 66d793dfa0..2471e9d20c 100644 --- a/docs/CodeGenerator.html +++ b/docs/CodeGenerator.html @@ -119,52 +119,51 @@ <div class="doc_text"> <p>The LLVM target-independent code generator is a framework that provides a -suite of reusable components for translating the LLVM internal representation to -the machine code for a specified target—either in assembly form (suitable -for a static compiler) or in binary machine code format (usable for a JIT -compiler). The LLVM target-independent code generator consists of five main -components:</p> + suite of reusable components for translating the LLVM internal representation + to the machine code for a specified target—either in assembly form + (suitable for a static compiler) or in binary machine code format (usable for + a JIT compiler). The LLVM target-independent code generator consists of five + main components:</p> <ol> -<li><a href="#targetdesc">Abstract target description</a> interfaces which -capture important properties about various aspects of the machine, independently -of how they will be used. These interfaces are defined in -<tt>include/llvm/Target/</tt>.</li> - -<li>Classes used to represent the <a href="#codegendesc">machine code</a> being -generated for a target. These classes are intended to be abstract enough to -represent the machine code for <i>any</i> target machine. These classes are -defined in <tt>include/llvm/CodeGen/</tt>.</li> - -<li><a href="#codegenalgs">Target-independent algorithms</a> used to implement -various phases of native code generation (register allocation, scheduling, stack -frame representation, etc). This code lives in <tt>lib/CodeGen/</tt>.</li> - -<li><a href="#targetimpls">Implementations of the abstract target description -interfaces</a> for particular targets. 
These machine descriptions make use of -the components provided by LLVM, and can optionally provide custom -target-specific passes, to build complete code generators for a specific target. -Target descriptions live in <tt>lib/Target/</tt>.</li> - -<li><a href="#jit">The target-independent JIT components</a>. The LLVM JIT is -completely target independent (it uses the <tt>TargetJITInfo</tt> structure to -interface for target-specific issues. The code for the target-independent -JIT lives in <tt>lib/ExecutionEngine/JIT</tt>.</li> - + <li><a href="#targetdesc">Abstract target description</a> interfaces which + capture important properties about various aspects of the machine, + independently of how they will be used. These interfaces are defined in + <tt>include/llvm/Target/</tt>.</li> + + <li>Classes used to represent the <a href="#codegendesc">machine code</a> + being generated for a target. These classes are intended to be abstract + enough to represent the machine code for <i>any</i> target machine. These + classes are defined in <tt>include/llvm/CodeGen/</tt>.</li> + + <li><a href="#codegenalgs">Target-independent algorithms</a> used to implement + various phases of native code generation (register allocation, scheduling, + stack frame representation, etc). This code lives + in <tt>lib/CodeGen/</tt>.</li> + + <li><a href="#targetimpls">Implementations of the abstract target description + interfaces</a> for particular targets. These machine descriptions make + use of the components provided by LLVM, and can optionally provide custom + target-specific passes, to build complete code generators for a specific + target. Target descriptions live in <tt>lib/Target/</tt>.</li> + + <li><a href="#jit">The target-independent JIT components</a>. The LLVM JIT is + completely target independent (it uses the <tt>TargetJITInfo</tt> + structure to interface for target-specific issues. 
The code for the + target-independent JIT lives in <tt>lib/ExecutionEngine/JIT</tt>.</li> </ol> -<p> -Depending on which part of the code generator you are interested in working on, -different pieces of this will be useful to you. In any case, you should be -familiar with the <a href="#targetdesc">target description</a> and <a -href="#codegendesc">machine code representation</a> classes. If you want to add -a backend for a new target, you will need to <a href="#targetimpls">implement the -target description</a> classes for your new target and understand the <a -href="LangRef.html">LLVM code representation</a>. If you are interested in -implementing a new <a href="#codegenalgs">code generation algorithm</a>, it -should only depend on the target-description and machine code representation -classes, ensuring that it is portable. -</p> +<p>Depending on which part of the code generator you are interested in working + on, different pieces of this will be useful to you. In any case, you should + be familiar with the <a href="#targetdesc">target description</a> + and <a href="#codegendesc">machine code representation</a> classes. If you + want to add a backend for a new target, you will need + to <a href="#targetimpls">implement the target description</a> classes for + your new target and understand the <a href="LangRef.html">LLVM code + representation</a>. If you are interested in implementing a + new <a href="#codegenalgs">code generation algorithm</a>, it should only + depend on the target-description and machine code representation classes, + ensuring that it is portable.</p> </div> @@ -176,27 +175,27 @@ classes, ensuring that it is portable. <div class="doc_text"> <p>The two pieces of the LLVM code generator are the high-level interface to the -code generator and the set of reusable components that can be used to build -target-specific backends. 
The two most important interfaces (<a -href="#targetmachine"><tt>TargetMachine</tt></a> and <a -href="#targetdata"><tt>TargetData</tt></a>) are the only ones that are -required to be defined for a backend to fit into the LLVM system, but the others -must be defined if the reusable code generator components are going to be -used.</p> + code generator and the set of reusable components that can be used to build + target-specific backends. The two most important interfaces + (<a href="#targetmachine"><tt>TargetMachine</tt></a> + and <a href="#targetdata"><tt>TargetData</tt></a>) are the only ones that are + required to be defined for a backend to fit into the LLVM system, but the + others must be defined if the reusable code generator components are going to + be used.</p> <p>This design has two important implications. The first is that LLVM can -support completely non-traditional code generation targets. For example, the C -backend does not require register allocation, instruction selection, or any of -the other standard components provided by the system. As such, it only -implements these two interfaces, and does its own thing. Another example of a -code generator like this is a (purely hypothetical) backend that converts LLVM -to the GCC RTL form and uses GCC to emit machine code for a target.</p> + support completely non-traditional code generation targets. For example, the + C backend does not require register allocation, instruction selection, or any + of the other standard components provided by the system. As such, it only + implements these two interfaces, and does its own thing. Another example of + a code generator like this is a (purely hypothetical) backend that converts + LLVM to the GCC RTL form and uses GCC to emit machine code for a target.</p> -<p>This design also implies that it is possible to design and -implement radically different code generators in the LLVM system that do not -make use of any of the built-in components. 
Doing so is not recommended at all, -but could be required for radically different targets that do not fit into the -LLVM machine description model: FPGAs for example.</p> +<p>This design also implies that it is possible to design and implement + radically different code generators in the LLVM system that do not make use + of any of the built-in components. Doing so is not recommended at all, but + could be required for radically different targets that do not fit into the + LLVM machine description model: FPGAs for example.</p> </div> @@ -207,75 +206,73 @@ LLVM machine description model: FPGAs for example.</p> <div class="doc_text"> -<p>The LLVM target-independent code generator is designed to support efficient and -quality code generation for standard register-based microprocessors. Code -generation in this model is divided into the following stages:</p> +<p>The LLVM target-independent code generator is designed to support efficient + and quality code generation for standard register-based microprocessors. + Code generation in this model is divided into the following stages:</p> <ol> -<li><b><a href="#instselect">Instruction Selection</a></b> - This phase -determines an efficient way to express the input LLVM code in the target -instruction set. -This stage produces the initial code for the program in the target instruction -set, then makes use of virtual registers in SSA form and physical registers that -represent any required register assignments due to target constraints or calling -conventions. This step turns the LLVM code into a DAG of target -instructions.</li> - -<li><b><a href="#selectiondag_sched">Scheduling and Formation</a></b> - This -phase takes the DAG of target instructions produced by the instruction selection -phase, determines an ordering of the instructions, then emits the instructions -as <tt><a href="#machineinstr">MachineInstr</a></tt>s with that ordering. 
Note -that we describe this in the <a href="#instselect">instruction selection -section</a> because it operates on a <a -href="#selectiondag_intro">SelectionDAG</a>. -</li> - -<li><b><a href="#ssamco">SSA-based Machine Code Optimizations</a></b> - This -optional stage consists of a series of machine-code optimizations that -operate on the SSA-form produced by the instruction selector. Optimizations -like modulo-scheduling or peephole optimization work here. -</li> - -<li><b><a href="#regalloc">Register Allocation</a></b> - The -target code is transformed from an infinite virtual register file in SSA form -to the concrete register file used by the target. This phase introduces spill -code and eliminates all virtual register references from the program.</li> - -<li><b><a href="#proepicode">Prolog/Epilog Code Insertion</a></b> - Once the -machine code has been generated for the function and the amount of stack space -required is known (used for LLVM alloca's and spill slots), the prolog and -epilog code for the function can be inserted and "abstract stack location -references" can be eliminated. This stage is responsible for implementing -optimizations like frame-pointer elimination and stack packing.</li> - -<li><b><a href="#latemco">Late Machine Code Optimizations</a></b> - Optimizations -that operate on "final" machine code can go here, such as spill code scheduling -and peephole optimizations.</li> - -<li><b><a href="#codeemit">Code Emission</a></b> - The final stage actually -puts out the code for the current function, either in the target assembler -format or in machine code.</li> - + <li><b><a href="#instselect">Instruction Selection</a></b> — This phase + determines an efficient way to express the input LLVM code in the target + instruction set. 
This stage produces the initial code for the program in + the target instruction set, then makes use of virtual registers in SSA + form and physical registers that represent any required register + assignments due to target constraints or calling conventions. This step + turns the LLVM code into a DAG of target instructions.</li> + + <li><b><a href="#selectiondag_sched">Scheduling and Formation</a></b> — + This phase takes the DAG of target instructions produced by the + instruction selection phase, determines an ordering of the instructions, + then emits the instructions + as <tt><a href="#machineinstr">MachineInstr</a></tt>s with that ordering. + Note that we describe this in the <a href="#instselect">instruction + selection section</a> because it operates on + a <a href="#selectiondag_intro">SelectionDAG</a>.</li> + + <li><b><a href="#ssamco">SSA-based Machine Code Optimizations</a></b> — + This optional stage consists of a series of machine-code optimizations + that operate on the SSA-form produced by the instruction selector. + Optimizations like modulo-scheduling or peephole optimization work + here.</li> + + <li><b><a href="#regalloc">Register Allocation</a></b> — The target code + is transformed from an infinite virtual register file in SSA form to the + concrete register file used by the target. This phase introduces spill + code and eliminates all virtual register references from the program.</li> + + <li><b><a href="#proepicode">Prolog/Epilog Code Insertion</a></b> — Once + the machine code has been generated for the function and the amount of + stack space required is known (used for LLVM alloca's and spill slots), + the prolog and epilog code for the function can be inserted and "abstract + stack location references" can be eliminated. 
This stage is responsible + for implementing optimizations like frame-pointer elimination and stack + packing.</li> + + <li><b><a href="#latemco">Late Machine Code Optimizations</a></b> — + Optimizations that operate on "final" machine code can go here, such as + spill code scheduling and peephole optimizations.</li> + + <li><b><a href="#codeemit">Code Emission</a></b> — The final stage + actually puts out the code for the current function, either in the target + assembler format or in machine code.</li> </ol> <p>The code generator is based on the assumption that the instruction selector -will use an optimal pattern matching selector to create high-quality sequences of -native instructions. Alternative code generator designs based on pattern -expansion and aggressive iterative peephole optimization are much slower. This -design permits efficient compilation (important for JIT environments) and -aggressive optimization (used when generating code offline) by allowing -components of varying levels of sophistication to be used for any step of -compilation.</p> + will use an optimal pattern matching selector to create high-quality + sequences of native instructions. Alternative code generator designs based + on pattern expansion and aggressive iterative peephole optimization are much + slower. This design permits efficient compilation (important for JIT + environments) and aggressive optimization (used when generating code offline) + by allowing components of varying levels of sophistication to be used for any + step of compilation.</p> <p>In addition to these stages, target implementations can insert arbitrary -target-specific passes into the flow. For example, the X86 target uses a -special pass to handle the 80x87 floating point stack architecture. Other -targets with unusual requirements can be supported with custom passes as -needed.</p> + target-specific passes into the flow. 
For example, the X86 target uses a + special pass to handle the 80x87 floating point stack architecture. Other + targets with unusual requirements can be supported with custom passes as + needed.</p> </div> - <!-- ======================================================================= --> <div class="doc_subsection"> <a name="tablegen">Using TableGen for target description</a> @@ -284,24 +281,23 @@ needed.</p> <div class="doc_text"> <p>The target description classes require a detailed description of the target -architecture. These target descriptions often have a large amount of common -information (e.g., an <tt>add</tt> instruction is almost identical to a -<tt>sub</tt> instruction). -In order to allow the maximum amount of commonality to be factored out, the LLVM -code generator uses the <a href="TableGenFundamentals.html">TableGen</a> tool to -describe big chunks of the target machine, which allows the use of -domain-specific and target-specific abstractions to reduce the amount of -repetition.</p> + architecture. These target descriptions often have a large amount of common + information (e.g., an <tt>add</tt> instruction is almost identical to a + <tt>sub</tt> instruction). In order to allow the maximum amount of + commonality to be factored out, the LLVM code generator uses + the <a href="TableGenFundamentals.html">TableGen</a> tool to describe big + chunks of the target machine, which allows the use of domain-specific and + target-specific abstractions to reduce the amount of repetition.</p> <p>As LLVM continues to be developed and refined, we plan to move more and more -of the target description to the <tt>.td</tt> form. Doing so gives us a -number of advantages. The most important is that it makes it easier to port -LLVM because it reduces the amount of C++ code that has to be written, and the -surface area of the code generator that needs to be understood before someone -can get something working. Second, it makes it easier to change things. 
In -particular, if tables and other things are all emitted by <tt>tblgen</tt>, we -only need a change in one place (<tt>tblgen</tt>) to update all of the targets -to a new interface.</p> + of the target description to the <tt>.td</tt> form. Doing so gives us a + number of advantages. The most important is that it makes it easier to port + LLVM because it reduces the amount of C++ code that has to be written, and + the surface area of the code generator that needs to be understood before + someone can get something working. Second, it makes it easier to change + things. In particular, if tables and other things are all emitted + by <tt>tblgen</tt>, we only need a change in one place (<tt>tblgen</tt>) to + update all of the targets to a new interface.</p> </div> @@ -314,18 +310,18 @@ to a new interface.</p> <div class="doc_text"> <p>The LLVM target description classes (located in the -<tt>include/llvm/Target</tt> directory) provide an abstract description of the -target machine independent of any particular client. These classes are -designed to capture the <i>abstract</i> properties of the target (such as the -instructions and registers it has), and do not incorporate any particular pieces -of code generation algorithms.</p> + <tt>include/llvm/Target</tt> directory) provide an abstract description of + the target machine independent of any particular client. These classes are + designed to capture the <i>abstract</i> properties of the target (such as the + instructions and registers it has), and do not incorporate any particular + pieces of code generation algorithms.</p> -<p>All of the target description classes (except the <tt><a -href="#targetdata">TargetData</a></tt> class) are designed to be subclassed by -the concrete target implementation, and have virtual methods implemented. 
To -get to these implementations, the <tt><a -href="#targetmachine">TargetMachine</a></tt> class provides accessors that -should be implemented by the target.</p> +<p>All of the target description classes (except the + <tt><a href="#targetdata">TargetData</a></tt> class) are designed to be + subclassed by the concrete target implementation, and have virtual methods + implemented. To get to these implementations, the + <tt><a href="#targetmachine">TargetMachine</a></tt> class provides accessors + that should be implemented by the target.</p> </div> @@ -337,19 +333,18 @@ should be implemented by the target.</p> <div class="doc_text"> <p>The <tt>TargetMachine</tt> class provides virtual methods that are used to -access the target-specific implementations of the various target description -classes via the <tt>get*Info</tt> methods (<tt>getInstrInfo</tt>, -<tt>getRegisterInfo</tt>, <tt>getFrameInfo</tt>, etc.). This class is -designed to be specialized by -a concrete target implementation (e.g., <tt>X86TargetMachine</tt>) which -implements the various virtual methods. The only required target description -class is the <a href="#targetdata"><tt>TargetData</tt></a> class, but if the -code generator components are to be used, the other interfaces should be -implemented as well.</p> + access the target-specific implementations of the various target description + classes via the <tt>get*Info</tt> methods (<tt>getInstrInfo</tt>, + <tt>getRegisterInfo</tt>, <tt>getFrameInfo</tt>, etc.). This class is + designed to be specialized by a concrete target implementation + (e.g., <tt>X86TargetMachine</tt>) which implements the various virtual + methods. 
The only required target description class is + the <a href="#targetdata"><tt>TargetData</tt></a> class, but if the code + generator components are to be used, the other interfaces should be + implemented as well.</p> </div> - <!-- ======================================================================= --> <div class="doc_subsection"> <a name="targetdata">The <tt>TargetData</tt> class</a> @@ -358,11 +353,11 @@ implemented as well.</p> <div class="doc_text"> <p>The <tt>TargetData</tt> class is the only required target description class, -and it is the only class that is not extensible (you cannot derived a new -class from it). <tt>TargetData</tt> specifies information about how the target -lays out memory for structures, the alignment requirements for various data -types, the size of pointers in the target, and whether the target is -little-endian or big-endian.</p> + and it is the only class that is not extensible (you cannot derived a new + class from it). <tt>TargetData</tt> specifies information about how the + target lays out memory for structures, the alignment requirements for various + data types, the size of pointers in the target, and whether the target is + little-endian or big-endian.</p> </div> @@ -374,14 +369,18 @@ little-endian or big-endian.</p> <div class="doc_text"> <p>The <tt>TargetLowering</tt> class is used by SelectionDAG based instruction -selectors primarily to describe how LLVM code should be lowered to SelectionDAG -operations. Among other things, this class indicates:</p> + selectors primarily to describe how LLVM code should be lowered to + SelectionDAG operations. 
Among other things, this class indicates:</p> <ul> - <li>an initial register class to use for various <tt>ValueType</tt>s</li> - <li>which operations are natively supported by the target machine</li> - <li>the return type of <tt>setcc</tt> operations</li> - <li>the type to use for shift amounts</li> + <li>an initial register class to use for various <tt>ValueType</tt>s,</li> + + <li>which operations are natively supported by the target machine,</li> + + <li>the return type of <tt>setcc</tt> operations,</li> + + <li>the type to use for shift amounts, and</li> + <li>various high-level characteristics, like whether it is profitable to turn division by a constant into a multiplication sequence</li> </ul> @@ -395,32 +394,30 @@ operations. Among other things, this class indicates:</p> <div class="doc_text"> -<p>The <tt>TargetRegisterInfo</tt> class is used to describe the register -file of the target and any interactions between the registers.</p> +<p>The <tt>TargetRegisterInfo</tt> class is used to describe the register file + of the target and any interactions between the registers.</p> <p>Registers in the code generator are represented in the code generator by -unsigned integers. Physical registers (those that actually exist in the target -description) are unique small numbers, and virtual registers are generally -large. Note that register #0 is reserved as a flag value.</p> + unsigned integers. Physical registers (those that actually exist in the + target description) are unique small numbers, and virtual registers are + generally large. Note that register #0 is reserved as a flag value.</p> <p>Each register in the processor description has an associated -<tt>TargetRegisterDesc</tt> entry, which provides a textual name for the -register (used for assembly output and debugging dumps) and a set of aliases -(used to indicate whether one register overlaps with another). 
-</p> + <tt>TargetRegisterDesc</tt> entry, which provides a textual name for the + register (used for assembly output and debugging dumps) and a set of aliases + (used to indicate whether one register overlaps with another).</p> <p>In addition to the per-register description, the <tt>TargetRegisterInfo</tt> -class exposes a set of processor specific register classes (instances of the -<tt>TargetRegisterClass</tt> class). Each register class contains sets of -registers that have the same properties (for example, they are all 32-bit -integer registers). Each SSA virtual register created by the instruction -selector has an associated register class. When the register allocator runs, it -replaces virtual registers with a physical register in the set.</p> + class exposes a set of processor specific register classes (instances of the + <tt>TargetRegisterClass</tt> class). Each register class contains sets of + registers that have the same properties (for example, they are all 32-bit + integer registers). Each SSA virtual register created by the instruction + selector has an associated register class. When the register allocator runs, + it replaces virtual registers with a physical register in the set.</p> -<p> -The target-specific implementations of these classes is auto-generated from a <a -href="TableGenFundamentals.html">TableGen</a> description of the register file. -</p> +<p>The target-specific implementations of these classes is auto-generated from + a <a href="TableGenFundamentals.html">TableGen</a> description of the + register file.</p> </div> @@ -430,14 +427,16 @@ href="TableGenFundamentals.html">TableGen</a> description of the register file. </div> <div class="doc_text"> - <p>The <tt>TargetInstrInfo</tt> class is used to describe the machine - instructions supported by the target. It is essentially an array of - <tt>TargetInstrDescriptor</tt> objects, each of which describes one - instruction the target supports. 
Descriptors define things like the mnemonic - for the opcode, the number of operands, the list of implicit register uses - and defs, whether the instruction has certain target-independent properties - (accesses memory, is commutable, etc), and holds any target-specific - flags.</p> + +<p>The <tt>TargetInstrInfo</tt> class is used to describe the machine + instructions supported by the target. It is essentially an array of + <tt>TargetInstrDescriptor</tt> objects, each of which describes one + instruction the target supports. Descriptors define things like the mnemonic + for the opcode, the number of operands, the list of implicit register uses + and defs, whether the instruction has certain target-independent properties + (accesses memory, is commutable, etc), and holds any target-specific + flags.</p> + </div> <!-- ======================================================================= --> @@ -446,12 +445,14 @@ href="TableGenFundamentals.html">TableGen</a> description of the register file. </div> <div class="doc_text"> - <p>The <tt>TargetFrameInfo</tt> class is used to provide information about the - stack frame layout of the target. It holds the direction of stack growth, - the known stack alignment on entry to each function, and the offset to the - local area. The offset to the local area is the offset from the stack - pointer on function entry to the first location where function data (local - variables, spill locations) can be stored.</p> + +<p>The <tt>TargetFrameInfo</tt> class is used to provide information about the + stack frame layout of the target. It holds the direction of stack growth, the + known stack alignment on entry to each function, and the offset to the local + area. 
The offset to the local area is the offset from the stack pointer on + function entry to the first location where function data (local variables, + spill locations) can be stored.</p> + </div> <!-- ======================================================================= --> @@ -460,11 +461,13 @@ href="TableGenFundamentals.html">TableGen</a> description of the register file. </div> <div class="doc_text"> - <p>The <tt>TargetSubtarget</tt> class is used to provide information about the - specific chip set being targeted. A sub-target informs code generation of - which instructions are supported, instruction latencies and instruction - execution itinerary; i.e., which processing units are used, in what order, and - for how long.</p> + +<p>The <tt>TargetSubtarget</tt> class is used to provide information about the + specific chip set being targeted. A sub-target informs code generation of + which instructions are supported, instruction latencies and instruction + execution itinerary; i.e., which processing units are used, in what order, + and for how long.</p> + </div> @@ -474,11 +477,13 @@ href="TableGenFundamentals.html">TableGen</a> description of the register file. </div> <div class="doc_text"> - <p>The <tt>TargetJITInfo</tt> class exposes an abstract interface used by the - Just-In-Time code generator to perform target-specific activities, such as - emitting stubs. If a <tt>TargetMachine</tt> supports JIT code generation, it - should provide one of these objects through the <tt>getJITInfo</tt> - method.</p> + +<p>The <tt>TargetJITInfo</tt> class exposes an abstract interface used by the + Just-In-Time code generator to perform target-specific activities, such as + emitting stubs. 
If a <tt>TargetMachine</tt> supports JIT code generation, it + should provide one of these objects through the <tt>getJITInfo</tt> + method.</p> + </div> <!-- *********************************************************************** --> @@ -490,15 +495,15 @@ href="TableGenFundamentals.html">TableGen</a> description of the register file. <div class="doc_text"> <p>At the high-level, LLVM code is translated to a machine specific -representation formed out of -<a href="#machinefunction"><tt>MachineFunction</tt></a>, -<a href="#machinebasicblock"><tt>MachineBasicBlock</tt></a>, and <a -href="#machineinstr"><tt>MachineInstr</tt></a> instances -(defined in <tt>include/llvm/CodeGen</tt>). This representation is completely -target agnostic, representing instructions in their most abstract form: an -opcode and a series of operands. This representation is designed to support -both an SSA representation for machine code, as well as a register allocated, -non-SSA form.</p> + representation formed out of + <a href="#machinefunction"><tt>MachineFunction</tt></a>, + <a href="#machinebasicblock"><tt>MachineBasicBlock</tt></a>, + and <a href="#machineinstr"><tt>MachineInstr</tt></a> instances (defined + in <tt>include/llvm/CodeGen</tt>). This representation is completely target + agnostic, representing instructions in their most abstract form: an opcode + and a series of operands. This representation is designed to support both an + SSA representation for machine code, as well as a register allocated, non-SSA + form.</p> </div> @@ -510,34 +515,34 @@ non-SSA form.</p> <div class="doc_text"> <p>Target machine instructions are represented as instances of the -<tt>MachineInstr</tt> class. This class is an extremely abstract way of -representing machine instructions. In particular, it only keeps track of -an opcode number and a set of operands.</p> - -<p>The opcode number is a simple unsigned integer that only has meaning to a -specific backend. 
All of the instructions for a target should be defined in -the <tt>*InstrInfo.td</tt> file for the target. The opcode enum values -are auto-generated from this description. The <tt>MachineInstr</tt> class does -not have any information about how to interpret the instruction (i.e., what the -semantics of the instruction are); for that you must refer to the -<tt><a href="#targetinstrinfo">TargetInstrInfo</a></tt> class.</p> - -<p>The operands of a machine instruction can be of several different types: -a register reference, a constant integer, a basic block reference, etc. In -addition, a machine operand should be marked as a def or a use of the value -(though only registers are allowed to be defs).</p> + <tt>MachineInstr</tt> class. This class is an extremely abstract way of + representing machine instructions. In particular, it only keeps track of an + opcode number and a set of operands.</p> + +<p>The opcode number is a simple unsigned integer that only has meaning to a + specific backend. All of the instructions for a target should be defined in + the <tt>*InstrInfo.td</tt> file for the target. The opcode enum values are + auto-generated from this description. The <tt>MachineInstr</tt> class does + not have any information about how to interpret the instruction (i.e., what + the semantics of the instruction are); for that you must refer to the + <tt><a href="#targetinstrinfo">TargetInstrInfo</a></tt> class.</p> + +<p>The operands of a machine instruction can be of several different types: a + register reference, a constant integer, a basic block reference, etc. In + addition, a machine operand should be marked as a def or a use of the value + (though only registers are allowed to be defs).</p> <p>By convention, the LLVM code generator orders instruction operands so that -all register definitions come before the register uses, even on architectures -that are normally printed in other orders. 
For example, the SPARC add
-instruction: "<tt>add %i1, %i2, %i3</tt>" adds the "%i1", and "%i2" registers
-and stores the result into the "%i3" register. In the LLVM code generator,
-the operands should be stored as "<tt>%i3, %i1, %i2</tt>": with the destination
-first.</p>
+ all register definitions come before the register uses, even on architectures
+ that are normally printed in other orders. For example, the SPARC add
+ instruction: "<tt>add %i1, %i2, %i3</tt>" adds the "%i1", and "%i2" registers
+ and stores the result into the "%i3" register. In the LLVM code generator,
+ the operands should be stored as "<tt>%i3, %i1, %i2</tt>": with the
+ destination first.</p>
-<p>Keeping destination (definition) operands at the beginning of the operand
-list has several advantages. In particular, the debugging printer will print
-the instruction like this:</p>
+<p>Keeping destination (definition) operands at the beginning of the operand
+ list has several advantages. In particular, the debugging printer will print
+ the instruction like this:</p>
 <div class="doc_code">
 <pre>
@@ -545,9 +550,8 @@ the instruction like this:</p>
 </pre>
 </div>
-<p>Also if the first operand is a def, it is easier to <a
-href="#buildmi">create instructions</a> whose only def is the first
-operand.</p>
+<p>Also if the first operand is a def, it is easier to <a href="#buildmi">create
+ instructions</a> whose only def is the first operand.</p>
 </div>
@@ -559,9 +563,9 @@ operand.</p>
 <div class="doc_text">
 <p>Machine instructions are created by using the <tt>BuildMI</tt> functions,
-located in the <tt>include/llvm/CodeGen/MachineInstrBuilder.h</tt> file. The
-<tt>BuildMI</tt> functions make it easy to build arbitrary machine
-instructions. Usage of the <tt>BuildMI</tt> functions look like this:</p>
+ located in the <tt>include/llvm/CodeGen/MachineInstrBuilder.h</tt> file. The
+ <tt>BuildMI</tt> functions make it easy to build arbitrary machine
+ instructions.
Usage of the <tt>BuildMI</tt> functions look like this:</p>
 <div class="doc_code">
 <pre>
@@ -588,11 +592,11 @@ BuildMI(MBB, X86::JNE, 1).addMBB(&MBB);
 </div>
 <p>The key thing to remember with the <tt>BuildMI</tt> functions is that you
-have to specify the number of operands that the machine instruction will take.
-This allows for efficient memory allocation. You also need to specify if
-operands default to be uses of values, not definitions. If you need to add a
-definition operand (other than the optional destination register), you must
-explicitly mark it as such:</p>
+ have to specify the number of operands that the machine instruction will
+ take. This allows for efficient memory allocation. You also need to specify
+ if operands default to be uses of values, not definitions. If you need to
+ add a definition operand (other than the optional destination register), you
+ must explicitly mark it as such:</p>
 <div class="doc_code">
 <pre>
@@ -610,13 +614,14 @@ MI.addReg(Reg, MachineOperand::Def);
 <div class="doc_text">
 <p>One important issue that the code generator needs to be aware of is the
-presence of fixed registers. In particular, there are often places in the
-instruction stream where the register allocator <em>must</em> arrange for a
-particular value to be in a particular register. This can occur due to
-limitations of the instruction set (e.g., the X86 can only do a 32-bit divide
-with the <tt>EAX</tt>/<tt>EDX</tt> registers), or external factors like calling
-conventions. In any case, the instruction selector should emit code that
-copies a virtual register into or out of a physical register when needed.</p>
+ presence of fixed registers. In particular, there are often places in the
+ instruction stream where the register allocator <em>must</em> arrange for a
+ particular value to be in a particular register.
This can occur due to
+ limitations of the instruction set (e.g., the X86 can only do a 32-bit divide
+ with the <tt>EAX</tt>/<tt>EDX</tt> registers), or external factors like
+ calling conventions. In any |