diff options
author | Bill Wendling <isanbard@gmail.com> | 2006-09-01 21:46:00 +0000 |
---|---|---|
committer | Bill Wendling <isanbard@gmail.com> | 2006-09-01 21:46:00 +0000 |
commit | a396ee8e179c3dff516bb55fe8cdf80d00a08aff (patch) | |
tree | d57a895362a7722a3a5724ac24e6dbd4f3fb9d85 /docs/CodeGenerator.html | |
parent | 84b3598bd0ea7a1e376c36b49b52db4740a3f9af (diff) |
Added documentation Fernando Magno Quintao Pereira wrote for the register
allocator. (First draft)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@30031 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs/CodeGenerator.html')
-rw-r--r-- | docs/CodeGenerator.html | 344 |
1 files changed, 341 insertions, 3 deletions
diff --git a/docs/CodeGenerator.html b/docs/CodeGenerator.html index f45a98c724..fb8afd103b 100644 --- a/docs/CodeGenerator.html +++ b/docs/CodeGenerator.html @@ -58,6 +58,18 @@ <li><a href="#selectiondag_future">Future directions for the SelectionDAG</a></li> </ul></li> + <ul> + <li><a href="#regalloc">Register Allocation</a> + <ul> + <li><a href="#regAlloc_represent">How registers are represented in + LLVM</a></li> + <li><a href="#regAlloc_howTo">Mapping virtual registers to physical + registers</a></li> + <li><a href="#regAlloc_twoAddr">Handling two address instructions</a></li> + <li><a href="#regAlloc_ssaDecon">The SSA deconstruction phase</a></li> + <li><a href="#regAlloc_fold">Instruction folding</a></li> + <li><a href="#regAlloc_builtIn">Built in register allocators</a></li> + </ul></li> <li><a href="#codeemit">Code Emission</a> <ul> <li><a href="#codeemit_asm">Generating Assembly Code</a></li> @@ -74,8 +86,10 @@ </ol> <div class="doc_author"> - <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a> & - <a href="mailto:isanbard@gmail.com">Bill Wendling</a></p> + <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a>, + <a href="mailto:isanbard@gmail.com">Bill Wendling</a>, and + <a href="mailto:pronesto@gmail.com">Fernando Magno Quintao + Pereira</a></p> </div> <div class="doc_warning"> @@ -1140,11 +1154,335 @@ SelectionDAGs.</p> <a name="ssamco">SSA-based Machine Code Optimizations</a> </div> <div class="doc_text"><p>To Be Written</p></div> + <!-- ======================================================================= --> <div class="doc_subsection"> <a name="regalloc">Register Allocation</a> </div> -<div class="doc_text"><p>To Be Written</p></div> + +<div class="doc_text"> + +<p>The <i>Register Allocation problem</i> consists in mapping a +program <i>P<sub>v</sub></i>, that can use an unbounded number of +virtual registers, to a program <i>P<sub>p</sub></i> that contains a +finite (possibly small) number of physical registers. Each target +architecture has a different number of physical registers. If the +number of physical registers is not enough to accommodate all the +virtual registers, some of them will have to be mapped into +memory. These virtuals are called <i>spilled virtuals</i>.</p> + +</div> + +<!-- _______________________________________________________________________ --> + +<div class="doc_subsubsection"> + <a name="regAlloc_represent">How registers are represented in LLVM</a> +</div> + +<div class="doc_text"> + +<p>In LLVM, physical registers are denoted by integer numbers that +normally range from 1 to 1023. To see how this numbering is defined +for a particular architecture, you can read the +<tt>GenRegisterNames.inc</tt> file for that architecture. For +instance, by inspecting +<tt>lib/Target/X86/X86GenRegisterNames.inc</tt> we see that the 32-bit +register <tt>EAX</tt> is denoted by 15, and the MMX register +<tt>MM0</tt> is mapped to 48.</p> + +<p>Some architectures contain registers that share the same physical +location. A notable example is the X86 platform. For instance, in the +X86 architecture, the registers <tt>EAX</tt>, <tt>AX</tt> and +<tt>AL</tt> share the first eight bits. These physical registers are +marked as <i>aliased</i> in LLVM. Given a particular architecture, you +can check which registers are aliased by inspecting its +<tt>RegisterInfo.td</tt> file. Moreover, the method +<tt>MRegisterInfo::getAliasSet(p_reg)</tt> returns an array containing +all the physical registers aliased to the register <tt>p_reg</tt>.</p> + +<p>Physical registers, in LLVM, are grouped in <i>Register Classes</i>. +Elements in the same register class are functionally equivalent, and can +be interchangeably used. Each virtual register can only be mapped to +physical registers of a particular class. For instance, in the X86 +architecture, some virtuals can only be allocated to 8 bit registers. +A register class is described by <tt>TargetRegisterClass</tt> objects. +To discover if a virtual register is compatible with a given physical, +this code can be used: +</p> + +<div class="doc_code"> +<pre> +bool RegMapping_Fer::compatible_class(MachineFunction &mf, + unsigned v_reg, + unsigned p_reg) { + assert(MRegisterInfo::isPhysicalRegister(p_reg) && + "Target register must be physical"); + const TargetRegisterClass *trc = mf.getSSARegMap()->getRegClass(v_reg); + return trc->contains(p_reg); +} +</pre> +</div> + +<p>Sometimes, mostly for debugging purposes, it is useful to change +the number of physical registers available in the target +architecture. This must be done statically, inside the +<tt>TargetRegsterInfo.td</tt> file. Just <tt>grep</tt> for +<tt>RegisterClass</tt>, the last parameter of which is a list of +registers. Just commenting some out is one simple way to avoid them +being used. A more polite way is to explicitly exclude some registers +from the <i>allocation order</i>. See the definition of the +<tt>GR</tt> register class in +<tt>lib/Target/IA64/IA64RegisterInfo.td</tt> for an example of this +(e.g., <tt>numReservedRegs</tt> registers are hidden.)</p> + +<p>Virtual registers are also denoted by integer numbers. Contrary to +physical registers, different virtual registers never share the same +number. The smallest virtual register is normally assigned the number +1024. This may change, so, in order to know which is the first virtual +register, you should access +<tt>MRegisterInfo::FirstVirtualRegister</tt>. Any register whose +number is greater than or equal to +<tt>MRegisterInfo::FirstVirtualRegister</tt> is considered a virtual +register. Whereas physical registers are statically defined in a +<tt>TargetRegisterInfo.td</tt> file and cannot be created by the +application developer, that is not the case with virtual registers. +In order to create new virtual registers, use the method +<tt>SSARegMap::createVirtualRegister()</tt>. This method will return a +virtual register with the highest code. +</p> + +<p>Before register allocation, the operands of an instruction are +mostly virtual registers, although physical registers may also be +used. In order to check if a given machine operand is a register, use +the boolean function <tt>MachineOperand::isRegister()</tt>. To obtain +the integer code of a register, use +<tt>MachineOperand::getReg()</tt>. An instruction may define or use a +register. For instance, <tt>ADD reg:1026 := reg:1025 reg:1024</tt> +defines the registers 1024, and uses registers 1025 and 1026. Given a +register operand, the method <tt>MachineOperand::isUse()</tt> informs +if that register is being used by the instruction. The method +<tt>MachineOperand::isDef()</tt> informs if that registers is being +defined.</p> + +<p>We will call physical registers present in the LLVM bytecode before +register allocation <i>pre-colored registers</i>. Pre-colored +registers are used in many different situations, for instance, to pass +parameters of functions calls, and to store results of particular +instructions. There are two types of pre-colored registers: the ones +<i>implicitly</i> defined, and those <i>explicitly</i> +defined. Explicitly defined registers are normal operands, and can be +accessed with <tt>MachineInstr::getOperand(int)::getReg()</tt>. In +order to check which registers are implicitly defined by an +instruction, use the +<tt>TargetInstrInfo::get(opcode)::ImplicitDefs</tt>, where +<tt>opcode</tt> is the opcode of the target instruction. One important +difference between explicit and implicit physical registers is that +the latter are defined statically for each instruction, whereas the +former may vary depending on the program being compiled. For example, +an instruction that represents a function call will always implicitly +define or use the same set of physical registers. To read the +registers implicitly used by an instruction, use +<tt>TargetInstrInfo::get(opcode)::ImplicitUses</tt>. Pre-colored +registers impose constraints on any register allocation algorithm. The +register allocator must make sure that none of them is been +overwritten by the values of virtual registers while still alive.</p> + +</div> + +<!-- _______________________________________________________________________ --> + +<div class="doc_subsubsection"> + <a name="regAlloc_howTo">Mapping virtual registers to physical registers</a> +</div> + +<div class="doc_text"> + +<p>There are two ways to map virtual registers to physical registers (or to +memory slots). The first way, that we will call <i>direct mapping</i>, +is based on the use of methods of the classes <tt>MRegisterInfo</tt>, +and <tt>MachineOperand</tt>. The second way, that we will call +<i>indirect mapping</i>, relies on the <tt>VirtRegMap</tt> class in +order to insert loads and stores sending and getting values to and from +memory.</p> + +<p>The direct mapping provides more flexibility to the developer of +the register allocator; however, it is more error prone, and demands +more implementation work. Basically, the programmer will have to +specify where load and store instructions should be inserted in the +target function being compiled in order to get and store values in +memory. To assign a physical register to a virtual register present in +a given operand, use <tt>MachineOperand::setReg(p_reg)</tt>. To insert +a store instruction, use +<tt>MRegisterInfo::storeRegToStackSlot(...)</tt>, and to insert a load +instruction, use <tt>MRegisterInfo::loadRegFromStackSlot</tt>.</p> + +<p>The indirect mapping shields the application developer from the +complexities of inserting load and store instructions. In order to map +a virtual register to a physical one, use +<tt>VirtRegMap::assignVirt2Phys(vreg, preg)</tt>. In order to map a +certain virtual register to memory, use +<tt>VirtRegMap::assignVirt2StackSlot(vreg)</tt>. This method will +return the stack slot where <tt>vreg</tt>'s value will be located. If +it is necessary to map another virtual register to the same stack +slot, use <tt>VirtRegMap::assignVirt2StackSlot(vreg, +stack_location)</tt>. One important point to consider when using the +indirect mapping, is that even if a virtual register is mapped to +memory, it still needs to be mapped to a physical register. This +physical register is the location where the virtual register is +supposed to be found before being stored or after being reloaded.</p> + +<p>If the indirect strategy is used, after all the virtual registers +have been mapped to physical registers or stack slots, it is necessary +to use a spiller object to place load and store instructions in the +code. Every virtual that has been mapped to a stack slot will be +stored to memory after been defined and will be loaded before being +used. The implementation of the spiller tries to recycle load/store +instructions, avoiding unnecessary instructions. For an example of how +to invoke the spiller, see +<tt>RegAllocLinearScan::runOnMachineFunction</tt> in +<tt>lib/CodeGen/RegAllocLinearScan.cpp</tt>.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> + <a name="regAlloc_twoAddr">Handling two address instructions</a> +</div> + +<div class="doc_text"> + +<p>With very rare exceptions (e.g., function calls), the LLVM machine +code instructions are three address instructions. That is, each +instruction is expected to define at most one register, and to use at +most two registers. However, some architectures use two address +instructions. In this case, the defined register is also one of the +used register. For instance, an instruction such as <tt>ADD %EAX, +%EBX</tt>, in X86 is actually equivalent to <tt>%EAX = %EAX + +%EBX</tt>.</p> + +<p>In order to produce correct code, LLVM must convert three address +instructions that represent two address instructions into true two +address instructions. LLVM provides the pass +<tt>TwoAddressInstructionPass</tt> for this specific purpose. It must +be run before register allocation takes place. After its execution, +the resulting code may no longer be in SSA form. This happens, for +instance, in situations where an instruction such as <tt>%a = ADD %b +%c</tt> is converted to two instructions such as:</p> + +<div class="doc_code"> +<pre> +%a = MOVE %b +%a = ADD %a %b +</pre> +</div> + +<p>Notice that, internally, the second instruction is represented as +<tt>ADD %a[def/use] %b</tt>. I.e., the register operand <tt>%a</tt> is +both used and defined by the instruction.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> + <a name="regAlloc_ssaDecon">The SSA deconstruction phase</a> +</div> + +<div class="doc_text"> + +<p>An important transformation that happens during register allocation is called +the <i>SSA Deconstruction Phase</i>. The SSA form simplifies many +analyses that are performed on the control flow graph of +programs. However, traditional instruction sets do not implement +PHI instructions. Thus, in order to generate executable code, compilers +must replace PHI instructions with other instructions that preserve their +semantics.</p> + +<p>There are many ways in which PHI instructions can safely be removed +from the target code. The most traditional PHI deconstruction +algorithm replaces PHI instructions with copy instructions. That is +the strategy adopted by LLVM. The SSA deconstruction algorithm is +implemented in n<tt>lib/CodeGen/>PHIElimination.cpp</tt>. In order to +invoke this pass, the identifier <tt>PHIEliminationID</tt> must be +marked as required in the code of the register allocator.</p> + +</div> + +<!-- _______________________________________________________________________ --> +<div class="doc_subsubsection"> + <a name="regAlloc_fold">Instruction folding</a> +</div> + +<div class="doc_text"> + +<p><i>Instruction folding</i> is an optimization performed during +register allocation that removes unnecessary copy instructions. For +instance, a sequence of instructions such as:</p> + +<div class="doc_code"> +<pre> +%EBX = LOAD %mem_address +%EAX = COPY %EBX +</pre> +</div> + +<p>can be safely substituted by the single instruction: + +<div class="doc_code"> +<pre> +%EAX = LOAD %mem_address +</pre> +</div> + +<p>Instructions can be folded with the +<tt>MRegisterInfo::foldMemoryOperand(...)</tt> method. Care must be +taken when folding instructions; a folded instruction can be quite +different from the original instruction. See +<tt>LiveIntervals::addIntervalsForSpills</tt> in +<tt>lib/CodeGen/LiveIntervalAnalysis.cpp</tt> for an example of its use.</p> + +</div> + +<!-- _______________________________________________________________________ --> + +<div class="doc_subsubsection"> + <a name="regAlloc_builtIn">Built in register allocators</a> +</div> + +<div class="doc_text"> + +<p>The LLVM infrastructure provides the application developer with +three different register allocators:</p> + +<ul> + <li><i>Simple</i> - This is a very simple implementation that does + not keep values in registers across instructions. This register + allocator immediately spills every value right after it is + computed, and reloads all used operands from memory to temporary + registers before each instruction.</li> + <li><i>Local</i> - This register allocator is an improvement on the + <i>Simple</i> implementation. It allocates registers on a basic + block level, attempting to keep values in registers and reusing + registers as appropriate.</li> + <li><i>Linear Scan</i> - <i>The default allocator</i>. This is the + well-know linear scan register allocator. Whereas the + <i>Simple</i> and <i>Local</i> algorithms use a direct mapping + implementation technique, the <i>Linear Scan</i> implementation + uses a spiller in order to place load and stores.</li> +</ul> + +<p>The type of register allocator used in <tt>llc</tt> can be chosen with the +command line option <tt>-regalloc=...</tt>:</p> + +<div class="doc_code"> +<pre> +$ llc -f -regalloc=simple file.bc -o sp.s; +$ llc -f -regalloc=local file.bc -o lc.s; +$ llc -f -regalloc=linearscan file.bc -o ln.s; +</pre> +</div> + +</div> + <!-- ======================================================================= --> <div class="doc_subsection"> <a name="proepicode">Prolog/Epilog Code Insertion</a> |