author | Bill Wendling <isanbard@gmail.com> | 2012-06-21 06:58:24 +0000 |
---|---|---|
committer | Bill Wendling <isanbard@gmail.com> | 2012-06-21 06:58:24 +0000 |
commit | bd96e0de3fb68b7c8587fed84b4233fc5aeb177a | |
tree | 8ceaf376c02f69219716999397b6255e775149bb /docs | |
parent | dc13d2ed2feb3fd9d4953a1dd49d6a93d6867bc5 |
Sphinxify the tablegen document.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158903 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r-- | docs/TableGenFundamentals.html | 978 |
-rw-r--r-- | docs/TableGenFundamentals.rst | 799 |
-rw-r--r-- | docs/subsystems.rst | 5 |
3 files changed, 802 insertions, 980 deletions
diff --git a/docs/TableGenFundamentals.html b/docs/TableGenFundamentals.html deleted file mode 100644 index 5490eebb4f..0000000000 --- a/docs/TableGenFundamentals.html +++ /dev/null @@ -1,978 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <title>TableGen Fundamentals</title> - <link rel="stylesheet" href="_static/llvm.css" type="text/css"> -</head> -<body> - -<h1>TableGen Fundamentals</h1> - -<div> -<ul> - <li><a href="#introduction">Introduction</a> - <ol> - <li><a href="#concepts">Basic concepts</a></li> - <li><a href="#example">An example record</a></li> - <li><a href="#running">Running TableGen</a></li> - </ol></li> - <li><a href="#syntax">TableGen syntax</a> - <ol> - <li><a href="#primitives">TableGen primitives</a> - <ol> - <li><a href="#comments">TableGen comments</a></li> - <li><a href="#types">The TableGen type system</a></li> - <li><a href="#values">TableGen values and expressions</a></li> - </ol></li> - <li><a href="#classesdefs">Classes and definitions</a> - <ol> - <li><a href="#valuedef">Value definitions</a></li> - <li><a href="#recordlet">'let' expressions</a></li> - <li><a href="#templateargs">Class template arguments</a></li> - <li><a href="#multiclass">Multiclass definitions and instances</a></li> - </ol></li> - <li><a href="#filescope">File scope entities</a> - <ol> - <li><a href="#include">File inclusion</a></li> - <li><a href="#globallet">'let' expressions</a></li> - <li><a href="#foreach">'foreach' blocks</a></li> - </ol></li> - </ol></li> - <li><a href="#backends">TableGen backends</a> - <ol> - <li><a href="#">todo</a></li> - </ol></li> -</ul> -</div> - -<div class="doc_author"> - <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> -</div> - -<!-- *********************************************************************** --> -<h2><a name="introduction">Introduction</a></h2> -<!-- 
*********************************************************************** --> - -<div> - -<p>TableGen's purpose is to help a human develop and maintain records of -domain-specific information. Because there may be a large number of these -records, it is specifically designed to allow writing flexible descriptions and -for common features of these records to be factored out. This reduces the -amount of duplication in the description, reduces the chance of error, and -makes it easier to structure domain specific information.</p> - -<p>The core part of TableGen <a href="#syntax">parses a file</a>, instantiates -the declarations, and hands the result off to a domain-specific "<a -href="#backends">TableGen backend</a>" for processing. The current major user -of TableGen is the <a href="CodeGenerator.html">LLVM code generator</a>.</p> - -<p>Note that if you work on TableGen much, and use emacs or vim, that you can -find an emacs "TableGen mode" and a vim language file in the -<tt>llvm/utils/emacs</tt> and <tt>llvm/utils/vim</tt> directories of your LLVM -distribution, respectively.</p> - -<!-- ======================================================================= --> -<h3><a name="concepts">Basic concepts</a></h3> - -<div> - -<p>TableGen files consist of two key parts: 'classes' and 'definitions', both -of which are considered 'records'.</p> - -<p><b>TableGen records</b> have a unique name, a list of values, and a list of -superclasses. The list of values is the main data that TableGen builds for each -record; it is this that holds the domain specific information for the -application. The interpretation of this data is left to a specific <a -href="#backends">TableGen backend</a>, but the structure and format rules are -taken care of and are fixed by TableGen.</p> - -<p><b>TableGen definitions</b> are the concrete form of 'records'. 
These -generally do not have any undefined values, and are marked with the -'<tt>def</tt>' keyword.</p> - -<p><b>TableGen classes</b> are abstract records that are used to build and -describe other records. These 'classes' allow the end-user to build -abstractions for either the domain they are targeting (such as "Register", -"RegisterClass", and "Instruction" in the LLVM code generator) or for the -implementor to help factor out common properties of records (such as "FPInst", -which is used to represent floating point instructions in the X86 backend). -TableGen keeps track of all of the classes that are used to build up a -definition, so the backend can find all definitions of a particular class, such -as "Instruction".</p> - -<p><b>TableGen multiclasses</b> are groups of abstract records that are -instantiated all at once. Each instantiation can result in multiple -TableGen definitions. If a multiclass inherits from another multiclass, -the definitions in the sub-multiclass become part of the current -multiclass, as if they were declared in the current multiclass.</p> - -</div> - -<!-- ======================================================================= --> -<h3><a name="example">An example record</a></h3> - -<div> - -<p>With no other arguments, TableGen parses the specified file and prints out -all of the classes, then all of the definitions. This is a good way to see what -the various definitions expand to fully. Running this on the <tt>X86.td</tt> -file prints this (at the time of this writing):</p> - -<div class="doc_code"> -<pre> -... 
-<b>def</b> ADD32rr { <i>// Instruction X86Inst I</i> - <b>string</b> Namespace = "X86"; - <b>dag</b> OutOperandList = (outs GR32:$dst); - <b>dag</b> InOperandList = (ins GR32:$src1, GR32:$src2); - <b>string</b> AsmString = "add{l}\t{$src2, $dst|$dst, $src2}"; - <b>list</b><dag> Pattern = [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]; - <b>list</b><Register> Uses = []; - <b>list</b><Register> Defs = [EFLAGS]; - <b>list</b><Predicate> Predicates = []; - <b>int</b> CodeSize = 3; - <b>int</b> AddedComplexity = 0; - <b>bit</b> isReturn = 0; - <b>bit</b> isBranch = 0; - <b>bit</b> isIndirectBranch = 0; - <b>bit</b> isBarrier = 0; - <b>bit</b> isCall = 0; - <b>bit</b> canFoldAsLoad = 0; - <b>bit</b> mayLoad = 0; - <b>bit</b> mayStore = 0; - <b>bit</b> isImplicitDef = 0; - <b>bit</b> isConvertibleToThreeAddress = 1; - <b>bit</b> isCommutable = 1; - <b>bit</b> isTerminator = 0; - <b>bit</b> isReMaterializable = 0; - <b>bit</b> isPredicable = 0; - <b>bit</b> hasDelaySlot = 0; - <b>bit</b> usesCustomInserter = 0; - <b>bit</b> hasCtrlDep = 0; - <b>bit</b> isNotDuplicable = 0; - <b>bit</b> hasSideEffects = 0; - <b>bit</b> neverHasSideEffects = 0; - InstrItinClass Itinerary = NoItinerary; - <b>string</b> Constraints = ""; - <b>string</b> DisableEncoding = ""; - <b>bits</b><8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 }; - Format Form = MRMDestReg; - <b>bits</b><6> FormBits = { 0, 0, 0, 0, 1, 1 }; - ImmType ImmT = NoImm; - <b>bits</b><3> ImmTypeBits = { 0, 0, 0 }; - <b>bit</b> hasOpSizePrefix = 0; - <b>bit</b> hasAdSizePrefix = 0; - <b>bits</b><4> Prefix = { 0, 0, 0, 0 }; - <b>bit</b> hasREX_WPrefix = 0; - FPFormat FPForm = ?; - <b>bits</b><3> FPFormBits = { 0, 0, 0 }; -} -... -</pre> -</div> - -<p>This definition corresponds to a 32-bit register-register add instruction in -the X86. The string after the '<tt>def</tt>' string indicates the name of the -record—"<tt>ADD32rr</tt>" in this case—and the comment at the end of -the line indicates the superclasses of the definition. 
The body of the record -contains all of the data that TableGen assembled for the record, indicating that -the instruction is part of the "X86" namespace, the pattern indicating how the -the instruction should be emitted into the assembly file, that it is a -two-address instruction, has a particular encoding, etc. The contents and -semantics of the information in the record is specific to the needs of the X86 -backend, and is only shown as an example.</p> - -<p>As you can see, a lot of information is needed for every instruction -supported by the code generator, and specifying it all manually would be -unmaintainable, prone to bugs, and tiring to do in the first place. Because we -are using TableGen, all of the information was derived from the following -definition:</p> - -<div class="doc_code"> -<pre> -let Defs = [EFLAGS], - isCommutable = 1, <i>// X = ADD Y,Z --> X = ADD Z,Y</i> - isConvertibleToThreeAddress = 1 <b>in</b> <i>// Can transform into LEA.</i> -def ADD32rr : I<0x01, MRMDestReg, (outs GR32:$dst), - (ins GR32:$src1, GR32:$src2), - "add{l}\t{$src2, $dst|$dst, $src2}", - [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>; -</pre> -</div> - -<p>This definition makes use of the custom class <tt>I</tt> (extended from the -custom class <tt>X86Inst</tt>), which is defined in the X86-specific TableGen -file, to factor out the common features that instructions of its class share. A -key feature of TableGen is that it allows the end-user to define the -abstractions they prefer to use when describing their information.</p> - -<p>Each def record has a special entry called "NAME." This is the -name of the def ("ADD32rr" above). In the general case def names can -be formed from various kinds of string processing expressions and NAME -resolves to the final value obtained after resolving all of those -expressions. The user may refer to NAME anywhere she desires to use -the ultimate name of the def. 
NAME should not be defined anywhere -else in user code to avoid conflict problems.</p> - -</div> - -<!-- ======================================================================= --> -<h3><a name="running">Running TableGen</a></h3> - -<div> - -<p>TableGen runs just like any other LLVM tool. The first (optional) argument -specifies the file to read. If a filename is not specified, -<tt>llvm-tblgen</tt> reads from standard input.</p> - -<p>To be useful, one of the <a href="#backends">TableGen backends</a> must be -used. These backends are selectable on the command line (type '<tt>llvm-tblgen --help</tt>' for a list). For example, to get a list of all of the definitions -that subclass a particular type (which can be useful for building up an enum -list of these records), use the <tt>-print-enums</tt> option:</p> - -<div class="doc_code"> -<pre> -$ llvm-tblgen X86.td -print-enums -class=Register -AH, AL, AX, BH, BL, BP, BPL, BX, CH, CL, CX, DH, DI, DIL, DL, DX, EAX, EBP, EBX, -ECX, EDI, EDX, EFLAGS, EIP, ESI, ESP, FP0, FP1, FP2, FP3, FP4, FP5, FP6, IP, -MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10W, R11, R11B, R11D, -R11W, R12, R12B, R12D, R12W, R13, R13B, R13D, R13W, R14, R14B, R14D, R14W, R15, -R15B, R15D, R15W, R8, R8B, R8D, R8W, R9, R9B, R9D, R9W, RAX, RBP, RBX, RCX, RDI, -RDX, RIP, RSI, RSP, SI, SIL, SP, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7, -XMM0, XMM1, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM2, XMM3, XMM4, XMM5, -XMM6, XMM7, XMM8, XMM9, - -$ llvm-tblgen X86.td -print-enums -class=Instruction -ABS_F, ABS_Fp32, ABS_Fp64, ABS_Fp80, ADC32mi, ADC32mi8, ADC32mr, ADC32ri, -ADC32ri8, ADC32rm, ADC32rr, ADC64mi32, ADC64mi8, ADC64mr, ADC64ri32, ADC64ri8, -ADC64rm, ADC64rr, ADD16mi, ADD16mi8, ADD16mr, ADD16ri, ADD16ri8, ADD16rm, -ADD16rr, ADD32mi, ADD32mi8, ADD32mr, ADD32ri, ADD32ri8, ADD32rm, ADD32rr, -ADD64mi32, ADD64mi8, ADD64mr, ADD64ri32, ... 
-</pre> -</div> - -<p>The default backend prints out all of the records, as described <a -href="#example">above</a>.</p> - -<p>If you plan to use TableGen, you will most likely have to <a -href="#backends">write a backend</a> that extracts the information specific to -what you need and formats it in the appropriate way.</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="syntax">TableGen syntax</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>TableGen doesn't care about the meaning of data (that is up to the backend to -define), but it does care about syntax, and it enforces a simple type system. -This section describes the syntax and the constructs allowed in a TableGen file. -</p> - -<!-- ======================================================================= --> -<h3><a name="primitives">TableGen primitives</a></h3> - -<div> - -<!-- --------------------------------------------------------------------------> -<h4><a name="comments">TableGen comments</a></h4> - -<div> - -<p>TableGen supports BCPL style "<tt>//</tt>" comments, which run to the end of -the line, and it also supports <b>nestable</b> "<tt>/* */</tt>" comments.</p> - -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="types">The TableGen type system</a> -</h4> - -<div> - -<p>TableGen files are strongly typed, in a simple (but complete) type-system. -These types are used to perform automatic conversions, check for errors, and to -help interface designers constrain the input that they allow. Every <a -href="#valuedef">value definition</a> is required to have an associated type. -</p> - -<p>TableGen supports a mixture of very low-level types (such as <tt>bit</tt>) -and very high-level types (such as <tt>dag</tt>). This flexibility is what -allows it to describe a wide range of information conveniently and compactly. 
-The TableGen types are:</p> - -<dl> -<dt><tt><b>bit</b></tt></dt> - <dd>A 'bit' is a boolean value that can hold either 0 or 1.</dd> - -<dt><tt><b>int</b></tt></dt> - <dd>The 'int' type represents a simple 32-bit integer value, such as 5.</dd> - -<dt><tt><b>string</b></tt></dt> - <dd>The 'string' type represents an ordered sequence of characters of - arbitrary length.</dd> - -<dt><tt><b>bits</b><n></tt></dt> - <dd>A 'bits' type is an arbitrary, but fixed, size integer that is broken up - into individual bits. This type is useful because it can handle some bits - being defined while others are undefined.</dd> - -<dt><tt><b>list</b><ty></tt></dt> - <dd>This type represents a list whose elements are some other type. The - contained type is arbitrary: it can even be another list type.</dd> - -<dt>Class type</dt> - <dd>Specifying a class name in a type context means that the defined value - must be a subclass of the specified class. This is useful in conjunction with - the <b><tt>list</tt></b> type, for example, to constrain the elements of the - list to a common base class (e.g., a <tt><b>list</b><Register></tt> can - only contain definitions derived from the "<tt>Register</tt>" class).</dd> - -<dt><tt><b>dag</b></tt></dt> - <dd>This type represents a nestable directed graph of elements.</dd> - -<dt><tt><b>code</b></tt></dt> - <dd>This represents a big hunk of text. This is lexically distinct from - string values because it doesn't require escapeing double quotes and other - common characters that occur in code.</dd> -</dl> - -<p>To date, these types have been sufficient for describing things that -TableGen has been used for, but it is straight-forward to extend this list if -needed.</p> - -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="values">TableGen values and expressions</a> -</h4> - -<div> - -<p>TableGen allows for a pretty reasonable number of different expression forms -when building up values. 
These forms allow the TableGen file to be written in a -natural syntax and flavor for the application. The current expression forms -supported include:</p> - -<dl> -<dt><tt>?</tt></dt> - <dd>uninitialized field</dd> -<dt><tt>0b1001011</tt></dt> - <dd>binary integer value</dd> -<dt><tt>07654321</tt></dt> - <dd>octal integer value (indicated by a leading 0)</dd> -<dt><tt>7</tt></dt> - <dd>decimal integer value</dd> -<dt><tt>0x7F</tt></dt> - <dd>hexadecimal integer value</dd> -<dt><tt>"foo"</tt></dt> - <dd>string value</dd> -<dt><tt>[{ ... }]</tt></dt> - <dd>code fragment</dd> -<dt><tt>[ X, Y, Z ]<type></tt></dt> - <dd>list value. <type> is the type of the list -element and is usually optional. In rare cases, -TableGen is unable to deduce the element type in -which case the user must specify it explicitly.</dd> -<dt><tt>{ a, b, c }</tt></dt> - <dd>initializer for a "bits<3>" value</dd> -<dt><tt>value</tt></dt> - <dd>value reference</dd> -<dt><tt>value{17}</tt></dt> - <dd>access to one bit of a value</dd> -<dt><tt>value{15-17}</tt></dt> - <dd>access to multiple bits of a value</dd> -<dt><tt>DEF</tt></dt> - <dd>reference to a record definition</dd> -<dt><tt>CLASS<val list></tt></dt> - <dd>reference to a new anonymous definition of CLASS with the specified - template arguments.</dd> -<dt><tt>X.Y</tt></dt> - <dd>reference to the subfield of a value</dd> -<dt><tt>list[4-7,17,2-3]</tt></dt> - <dd>A slice of the 'list' list, including elements 4,5,6,7,17,2, and 3 from - it. Elements may be included multiple times.</dd> -<dt><tt>foreach <var> = [ <list> ] in { <body> }</tt></dt> -<dt><tt>foreach <var> = [ <list> ] in <def></tt></dt> - <dd> Replicate <body> or <def>, replacing instances of - <var> with each value in <list>. <var> is scoped at the - level of the <tt>foreach</tt> loop and must not conflict with any other object - introduced in <body> or <def>. Currently only <tt>def</tt>s are - expanded within <body>. 
- </dd> -<dt><tt>foreach <var> = 0-15 in ...</tt></dt> -<dt><tt>foreach <var> = {0-15,32-47} in ...</tt></dt> - <dd>Loop over ranges of integers. The braces are required for multiple - ranges.</dd> -<dt><tt>(DEF a, b)</tt></dt> - <dd>a dag value. The first element is required to be a record definition, the - remaining elements in the list may be arbitrary other values, including nested - `<tt>dag</tt>' values.</dd> -<dt><tt>!strconcat(a, b)</tt></dt> - <dd>A string value that is the result of concatenating the 'a' and 'b' - strings.</dd> -<dt><tt>str1#str2</tt></dt> - <dd>"#" (paste) is a shorthand for !strconcat. It may concatenate - things that are not quoted strings, in which case an implicit - !cast<string> is done on the operand of the paste.</dd> -<dt><tt>!cast<type>(a)</tt></dt> - <dd>A symbol of type <em>type</em> obtained by looking up the string 'a' in -the symbol table. If the type of 'a' does not match <em>type</em>, TableGen -aborts with an error. !cast<string> is a special case in that the argument must -be an object defined by a 'def' construct.</dd> -<dt><tt>!subst(a, b, c)</tt></dt> - <dd>If 'a' and 'b' are of string type or are symbol references, substitute -'b' for 'a' in 'c.' This operation is analogous to $(subst) in GNU make.</dd> -<dt><tt>!foreach(a, b, c)</tt></dt> - <dd>For each member 'b' of dag or list 'a' apply operator 'c.' 'b' is a -dummy variable that should be declared as a member variable of an instantiated -class. This operation is analogous to $(foreach) in GNU make.</dd> -<dt><tt>!head(a)</tt></dt> - <dd>The first element of list 'a.'</dd> -<dt><tt>!tail(a)</tt></dt> - <dd>The 2nd-N elements of list 'a.'</dd> -<dt><tt>!empty(a)</tt></dt> - <dd>An integer {0,1} indicating whether list 'a' is empty.</dd> -<dt><tt>!if(a,b,c)</tt></dt> - <dd>'b' if the result of 'int' or 'bit' operator 'a' is nonzero, - 'c' otherwise.</dd> -<dt><tt>!eq(a,b)</tt></dt> - <dd>'bit 1' if string a is equal to string b, 0 otherwise. 
This - only operates on string, int and bit objects. Use !cast<string> to - compare other types of objects.</dd> -</dl> - -<p>Note that all of the values have rules specifying how they convert to values -for different types. These rules allow you to assign a value like "<tt>7</tt>" -to a "<tt>bits<4></tt>" value, for example.</p> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="classesdefs">Classes and definitions</a> -</h3> - -<div> - -<p>As mentioned in the <a href="#concepts">intro</a>, classes and definitions -(collectively known as 'records') in TableGen are the main high-level unit of -information that TableGen collects. Records are defined with a <tt>def</tt> or -<tt>class</tt> keyword, the record name, and an optional list of "<a -href="#templateargs">template arguments</a>". If the record has superclasses, -they are specified as a comma separated list that starts with a colon character -("<tt>:</tt>"). If <a href="#valuedef">value definitions</a> or <a -href="#recordlet">let expressions</a> are needed for the class, they are -enclosed in curly braces ("<tt>{}</tt>"); otherwise, the record ends with a -semicolon.</p> - -<p>Here is a simple TableGen file:</p> - -<div class="doc_code"> -<pre> -<b>class</b> C { <b>bit</b> V = 1; } -<b>def</b> X : C; -<b>def</b> Y : C { - <b>string</b> Greeting = "hello"; -} -</pre> -</div> - -<p>This example defines two definitions, <tt>X</tt> and <tt>Y</tt>, both of -which derive from the <tt>C</tt> class. Because of this, they both get the -<tt>V</tt> bit value. The <tt>Y</tt> definition also gets the Greeting member -as well.</p> - -<p>In general, classes are useful for collecting together the commonality -between a group of records and isolating it in a single place. 
Also, classes -permit the specification of default values for their subclasses, allowing the -subclasses to override them as they wish.</p> - -<!----------------------------------------------------------------------------> -<h4> - <a name="valuedef">Value definitions</a> -</h4> - -<div> - -<p>Value definitions define named entries in records. A value must be defined -before it can be referred to as the operand for another value definition or -before the value is reset with a <a href="#recordlet">let expression</a>. A -value is defined by specifying a <a href="#types">TableGen type</a> and a name. -If an initial value is available, it may be specified after the type with an -equal sign. Value definitions require terminating semicolons.</p> - -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="recordlet">'let' expressions</a> -</h4> - -<div> - -<p>A record-level let expression is used to change the value of a value -definition in a record. This is primarily useful when a superclass defines a -value that a derived class or definition wants to override. Let expressions -consist of the '<tt>let</tt>' keyword followed by a value name, an equal sign -("<tt>=</tt>"), and a new value. For example, a new class could be added to the -example above, redefining the <tt>V</tt> field for all of its subclasses:</p> - -<div class="doc_code"> -<pre> -<b>class</b> D : C { let V = 0; } -<b>def</b> Z : D; -</pre> -</div> - -<p>In this case, the <tt>Z</tt> definition will have a zero value for its "V" -value, despite the fact that it derives (indirectly) from the <tt>C</tt> class, -because the <tt>D</tt> class overrode its value.</p> - -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="templateargs">Class template arguments</a> -</h4> - -<div> - -<p>TableGen permits the definition of parameterized classes as well as normal -concrete classes. 
Parameterized TableGen classes specify a list of variable -bindings (which may optionally have defaults) that are bound when used. Here is -a simple example:</p> - -<div class="doc_code"> -<pre> -<b>class</b> FPFormat<<b>bits</b><3> val> { - <b>bits</b><3> Value = val; -} -<b>def</b> NotFP : FPFormat<0>; -<b>def</b> ZeroArgFP : FPFormat<1>; -<b>def</b> OneArgFP : FPFormat<2>; -<b>def</b> OneArgFPRW : FPFormat<3>; -<b>def</b> TwoArgFP : FPFormat<4>; -<b>def</b> CompareFP : FPFormat<5>; -<b>def</b> CondMovFP : FPFormat<6>; -<b>def</b> SpecialFP : FPFormat<7>; -</pre> -</div> - -<p>In this case, template arguments are used as a space efficient way to specify -a list of "enumeration values", each with a "<tt>Value</tt>" field set to the -specified integer.</p> - -<p>The more esoteric forms of <a href="#values">TableGen expressions</a> are -useful in conjunction with template arguments. As an example:</p> - -<div class="doc_code"> -<pre> -<b>class</b> ModRefVal<<b>bits</b><2> val> { - <b>bits</b><2> Value = val; -} - -<b>def</b> None : ModRefVal<0>; -<b>def</b> Mod : ModRefVal<1>; -<b>def</b> Ref : ModRefVal<2>; -<b>def</b> ModRef : ModRefVal<3>; - -<b>class</b> Value<ModRefVal MR> { - <i>// Decode some information into a more convenient format, while providing - // a nice interface to the user of the "Value" class.</i> - <b>bit</b> isMod = MR.Value{0}; - <b>bit</b> isRef = MR.Value{1}; - - <i>// other stuff...</i> -} - -<i>// Example uses</i> -<b>def</b> bork : Value<Mod>; -<b>def</b> zork : Value<Ref>; -<b>def</b> hork : Value<ModRef>; -</pre> -</div> - -<p>This is obviously a contrived example, but it shows how template arguments -can be used to decouple the interface provided to the user of the class from the -actual internal data representation expected by the class. 
In this case, -running <tt>llvm-tblgen</tt> on the example prints the following -definitions:</p> - -<div class="doc_code"> -<pre> -<b>def</b> bork { <i>// Value</i> - <b>bit</b> isMod = 1; - <b>bit</b> isRef = 0; -} -<b>def</b> hork { <i>// Value</i> - <b>bit</b> isMod = 1; - <b>bit</b> isRef = 1; -} -<b>def</b> zork { <i>// Value</i> - <b>bit</b> isMod = 0; - <b>bit</b> isRef = 1; -} -</pre> -</div> - -<p> This shows that TableGen was able to dig into the argument and extract a -piece of information that was requested by the designer of the "Value" class. -For more realistic examples, please see existing users of TableGen, such as the -X86 backend.</p> - -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="multiclass">Multiclass definitions and instances</a> -</h4> - -<div> - -<p> -While classes with template arguments are a good way to factor commonality -between two instances of a definition, multiclasses allow a convenient notation -for defining multiple definitions at once (instances of implicitly constructed -classes). For example, consider an 3-address instruction set whose instructions -come in two forms: "<tt>reg = reg op reg</tt>" and "<tt>reg = reg op imm</tt>" -(e.g. SPARC). In this case, you'd like to specify in one place that this -commonality exists, then in a separate place indicate what all the ops are. 
-</p> - -<p> -Here is an example TableGen fragment that shows this idea: -</p> - -<div class="doc_code"> -<pre> -<b>def</b> ops; -<b>def</b> GPR; -<b>def</b> Imm; -<b>class</b> inst<<b>int</b> opc, <b>string</b> asmstr, <b>dag</b> operandlist>; - -<b>multiclass</b> ri_inst<<b>int</b> opc, <b>string</b> asmstr> { - def _rr : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), - (ops GPR:$dst, GPR:$src1, GPR:$src2)>; - def _ri : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), - (ops GPR:$dst, GPR:$src1, Imm:$src2)>; -} - -<i>// Instantiations of the ri_inst multiclass.</i> -<b>defm</b> ADD : ri_inst<0b111, "add">; -<b>defm</b> SUB : ri_inst<0b101, "sub">; -<b>defm</b> MUL : ri_inst<0b100, "mul">; -... -</pre> -</div> - -<p>The name of the resultant definitions has the multidef fragment names - appended to them, so this defines <tt>ADD_rr</tt>, <tt>ADD_ri</tt>, - <tt>SUB_rr</tt>, etc. A defm may inherit from multiple multiclasses, - instantiating definitions from each multiclass. Using a multiclass - this way is exactly equivalent to instantiating the classes multiple - times yourself, e.g. by writing:</p> - -<div class="doc_code"> -<pre> -<b>def</b> ops; -<b>def</b> GPR; -<b>def</b> Imm; -<b>class</b> inst<<b>int</b> opc, <b>string</b> asmstr, <b>dag</b> operandlist>; - -<b>class</b> rrinst<<b>int</b> opc, <b>string</b> asmstr> - : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), - (ops GPR:$dst, GPR:$src1, GPR:$src2)>; - -<b>class</b> riinst<<b>int</b> opc, <b>string</b> asmstr> - : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), - (ops GPR:$dst, GPR:$src1, Imm:$src2)>; - -<i>// Instantiations of the ri_inst multiclass.</i> -<b>def</b> ADD_rr : rrinst<0b111, "add">; -<b>def</b> ADD_ri : riinst<0b111, "add">; -<b>def</b> SUB_rr : rrinst<0b101, "sub">; -<b>def</b> SUB_ri : riinst<0b101, "sub">; -<b>def</b> MUL_rr : rrinst<0b100, "mul">; -<b>def</b> MUL_ri : riinst<0b100, "mul">; -... 
-</pre> -</div> - -<p> -A defm can also be used inside a multiclass providing several levels of -multiclass instanciations. -</p> - -<div class="doc_code"> -<pre> -<b>class</b> Instruction<bits<4> opc, string Name> { - bits<4> opcode = opc; - string name = Name; -} - -<b>multiclass</b> basic_r<bits<4> opc> { - <b>def</b> rr : Instruction<opc, "rr">; - <b>def</b> rm : Instruction<opc, "rm">; -} - -<b>multiclass</b> basic_s<bits<4> opc> { - <b>defm</b> SS : basic_r<opc>; - <b>defm</b> SD : basic_r<opc>; - <b>def</b> X : Instruction<opc, "x">; -} - -<b>multiclass</b> basic_p<bits<4> opc> { - <b>defm</b> PS : basic_r<opc>; - <b>defm</b> PD : basic_r<opc>; - <b>def</b> Y : Instruction<opc, "y">; -} - -<b>defm</b> ADD : basic_s<0xf>, basic_p<0xf>; -... - -<i>// Results</i> -<b>def</b> ADDPDrm { ... -<b>def</b> ADDPDrr { ... -<b>def</b> ADDPSrm { ... -<b>def</b> ADDPSrr { ... -<b>def</b> ADDSDrm { ... -<b>def</b> ADDSDrr { ... -<b>def</b> ADDY { ... -<b>def</b> ADDX { ... -</pre> -</div> - -<p> -defm declarations can inherit from classes too, the -rule to follow is that the class list must start after the -last multiclass, and there must be at least one multiclass -before them. -</p> - -<div class="doc_code"> -<pre> -<b>class</b> XD { bits<4> Prefix = 11; } -<b>class</b> XS { bits<4> Prefix = 12; } - -<b>class</b> I<bits<4> op> { - bits<4> opcode = op; -} - -<b>multiclass</b> R { - <b>def</b> rr : I<4>; - <b>def</b> rm : I<2>; -} - -<b>multiclass</b> Y { - <b>defm</b> SS : R, XD; - <b>defm</b> SD : R, XS; -} - -<b>defm</b> Instr : Y; - -<i>// Results</i> -<b>def</b> InstrSDrm { - bits<4> opcode = { 0, 0, 1, 0 }; - bits<4> Prefix = { 1, 1, 0, 0 }; -} -... 
-<b>def</b> InstrSSrr { - bits<4> opcode = { 0, 1, 0, 0 }; - bits<4> Prefix = { 1, 0, 1, 1 }; -} -</pre> -</div> - -</div> - -</div> - -<!-- ======================================================================= --> -<h3> - <a name="filescope">File scope entities</a> -</h3> - -<div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="include">File inclusion</a> -</h4> - -<div> -<p>TableGen supports the '<tt>include</tt>' token, which textually substitutes -the specified file in place of the include directive. The filename should be -specified as a double quoted string immediately after the '<tt>include</tt>' -keyword. Example:</p> - -<div class="doc_code"> -<pre> -<b>include</b> "foo.td" -</pre> -</div> - -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="globallet">'let' expressions</a> -</h4> - -<div> - -<p>"Let" expressions at file scope are similar to <a href="#recordlet">"let" -expressions within a record</a>, except they can specify a value binding for -multiple records at a time, and may be useful in certain other cases. -File-scope let expressions are really just another way that TableGen allows the -end-user to factor out commonality from the records.</p> - -<p>File-scope "let" expressions take a comma-separated list of bindings to -apply, and one or more records to bind the values in. 
Here are some -examples:</p> - -<div class="doc_code"> -<pre> -<b>let</b> isTerminator = 1, isReturn = 1, isBarrier = 1, hasCtrlDep = 1 <b>in</b> - <b>def</b> RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>; - -<b>let</b> isCall = 1 <b>in</b> - <i>// All calls clobber the non-callee saved registers...</i> - <b>let</b> Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0, - MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, - XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] <b>in</b> { - <b>def</b> CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst,variable_ops), - "call\t${dst:call}", []>; - <b>def</b> CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops), - "call\t{*}$dst", [(X86call GR32:$dst)]>; - <b>def</b> CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops), - "call\t{*}$dst", []>; - } -</pre> -</div> - -<p>File-scope "let" expressions are often useful when a couple of definitions -need to be added to several records, and the records do not otherwise need to be -opened, as in the case with the <tt>CALL*</tt> instructions above.</p> - -<p>It's also possible to use "let" expressions inside multiclasses, providing -more ways to factor out commonality from the records, specially if using -several levels of multiclass instanciations. 
This also avoids the need of using -"let" expressions within subsequent records inside a multiclass.</p> - -<pre class="doc_code"> -<b>multiclass </b>basic_r<bits<4> opc> { - <b>let </b>Predicates = [HasSSE2] in { - <b>def </b>rr : Instruction<opc, "rr">; - <b>def </b>rm : Instruction<opc, "rm">; - } - <b>let </b>Predicates = [HasSSE3] in - <b>def </b>rx : Instruction<opc, "rx">; -} - -<b>multiclass </b>basic_ss<bits<4> opc> { - <b>let </b>IsDouble = 0 in - <b>defm </b>SS : basic_r<opc>; - - <b>let </b>IsDouble = 1 in - <b>defm </b>SD : basic_r<opc>; -} - -<b>defm </b>ADD : basic_ss<0xf>; -</pre> -</div> - -<!-- --------------------------------------------------------------------------> -<h4> - <a name="foreach">Looping</a> -</h4> - -<div> -<p>TableGen supports the '<tt>foreach</tt>' block, which textually replicates -the loop body, substituting iterator values for iterator references in the -body. Example:</p> - -<div class="doc_code"> -<pre> -<b>foreach</b> i = [0, 1, 2, 3] in { - <b>def</b> R#i : Register<...>; - <b>def</b> F#i : Register<...>; -} -</pre> -</div> - -<p>This will create objects <tt>R0</tt>, <tt>R1</tt>, <tt>R2</tt> and -<tt>R3</tt>. <tt>foreach</tt> blocks may be nested. If there is only -one item in the body the braces may be elided:</p> - -<div class="doc_code"> -<pre> -<b>foreach</b> i = [0, 1, 2, 3] in - <b>def</b> R#i : Register<...>; - -</pre> -</div> - -</div> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="codegen">Code Generator backend info</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>Expressions used by code generator to describe instructions and isel -patterns:</p> - -<dl> -<dt><tt>(implicit a)</tt></dt> - <dd>an implicitly defined physical register. 
This tells the dag instruction - selection emitter the input pattern's extra definitions matches implicit - physical register definitions.</dd> -</dl> -</div> - -<!-- *********************************************************************** --> -<h2><a name="backends">TableGen backends</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>TODO: How they work, how to write one. This section should not contain -details about any particular backend, except maybe -print-enums as an example. -This should highlight the APIs in <tt>TableGen/Record.h</tt>.</p> - -</div> - -<!-- *********************************************************************** --> - -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a> - - <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> - <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> - Last modified: $Date$ -</address> - -</body> -</html> diff --git a/docs/TableGenFundamentals.rst b/docs/TableGenFundamentals.rst new file mode 100644 index 0000000000..56f06aabc7 --- /dev/null +++ b/docs/TableGenFundamentals.rst @@ -0,0 +1,799 @@ +.. _tablegen: + +===================== +TableGen Fundamentals +===================== + +.. contents:: + :local: + +Introduction +============ + +TableGen's purpose is to help a human develop and maintain records of +domain-specific information. Because there may be a large number of these +records, it is specifically designed to allow writing flexible descriptions and +for common features of these records to be factored out. This reduces the +amount of duplication in the description, reduces the chance of error, and makes +it easier to structure domain specific information. 
+ +The core part of TableGen `parses a file`_, instantiates the declarations, and +hands the result off to a domain-specific `TableGen backend`_ for processing. +The current major user of TableGen is the `LLVM code +generator <CodeGenerator.html>`_. + +Note that if you work on TableGen much, and use emacs or vim, that you can find +an emacs "TableGen mode" and a vim language file in the ``llvm/utils/emacs`` and +``llvm/utils/vim`` directories of your LLVM distribution, respectively. + +.. _intro: + +Basic concepts +-------------- + +TableGen files consist of two key parts: 'classes' and 'definitions', both of +which are considered 'records'. + +**TableGen records** have a unique name, a list of values, and a list of +superclasses. The list of values is the main data that TableGen builds for each +record; it is this that holds the domain specific information for the +application. The interpretation of this data is left to a specific `TableGen +backend`_, but the structure and format rules are taken care of and are fixed by +TableGen. + +**TableGen definitions** are the concrete form of 'records'. These generally do +not have any undefined values, and are marked with the '``def``' keyword. + +**TableGen classes** are abstract records that are used to build and describe +other records. These 'classes' allow the end-user to build abstractions for +either the domain they are targeting (such as "Register", "RegisterClass", and +"Instruction" in the LLVM code generator) or for the implementor to help factor +out common properties of records (such as "FPInst", which is used to represent +floating point instructions in the X86 backend). TableGen keeps track of all of +the classes that are used to build up a definition, so the backend can find all +definitions of a particular class, such as "Instruction". + +**TableGen multiclasses** are groups of abstract records that are instantiated +all at once. Each instantiation can result in multiple TableGen definitions. 
+If a multiclass inherits from another multiclass, the definitions in the +sub-multiclass become part of the current multiclass, as if they were declared +in the current multiclass. + +.. _described above: + +An example record +----------------- + +With no other arguments, TableGen parses the specified file and prints out all +of the classes, then all of the definitions. This is a good way to see what the +various definitions expand to fully. Running this on the ``X86.td`` file prints +this (at the time of this writing): + +.. code-block:: llvm + + ... + def ADD32rr { // Instruction X86Inst I + string Namespace = "X86"; + dag OutOperandList = (outs GR32:$dst); + dag InOperandList = (ins GR32:$src1, GR32:$src2); + string AsmString = "add{l}\t{$src2, $dst|$dst, $src2}"; + list<dag> Pattern = [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]; + list<Register> Uses = []; + list<Register> Defs = [EFLAGS]; + list<Predicate> Predicates = []; + int CodeSize = 3; + int AddedComplexity = 0; + bit isReturn = 0; + bit isBranch = 0; + bit isIndirectBranch = 0; + bit isBarrier = 0; + bit isCall = 0; + bit canFoldAsLoad = 0; + bit mayLoad = 0; + bit mayStore = 0; + bit isImplicitDef = 0; + bit isConvertibleToThreeAddress = 1; + bit isCommutable = 1; + bit isTerminator = 0; + bit isReMaterializable = 0; + bit isPredicable = 0; + bit hasDelaySlot = 0; + bit usesCustomInserter = 0; + bit hasCtrlDep = 0; + bit isNotDuplicable = 0; + bit hasSideEffects = 0; + bit neverHasSideEffects = 0; + InstrItinClass Itinerary = NoItinerary; + string Constraints = ""; + string DisableEncoding = ""; + bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 }; + Format Form = MRMDestReg; + bits<6> FormBits = { 0, 0, 0, 0, 1, 1 }; + ImmType ImmT = NoImm; + bits<3> ImmTypeBits = { 0, 0, 0 }; + bit hasOpSizePrefix = 0; + bit hasAdSizePrefix = 0; + bits<4> Prefix = { 0, 0, 0, 0 }; + bit hasREX_WPrefix = 0; + FPFormat FPForm = ?; + bits<3> FPFormBits = { 0, 0, 0 }; + } + ... 
+ +This definition corresponds to a 32-bit register-register add instruction in the +X86. The string after the '``def``' keyword indicates the name of the +record---"``ADD32rr``" in this case---and the comment at the end of the line +indicates the superclasses of the definition. The body of the record contains +all of the data that TableGen assembled for the record, indicating that the +instruction is part of the "X86" namespace, the pattern indicating how the +instruction should be emitted into the assembly file, that it is a two-address +instruction, has a particular encoding, etc. The contents and semantics of the +information in the record are specific to the needs of the X86 backend, and are +only shown as an example. + +As you can see, a lot of information is needed for every instruction supported +by the code generator, and specifying it all manually would be unmaintainable, +prone to bugs, and tiring to do in the first place. Because we are using +TableGen, all of the information was derived from the following definition: + +.. code-block:: llvm + + let Defs = [EFLAGS], + isCommutable = 1, // X = ADD Y,Z --> X = ADD Z,Y + isConvertibleToThreeAddress = 1 in // Can transform into LEA. + def ADD32rr : I<0x01, MRMDestReg, (outs GR32:$dst), + (ins GR32:$src1, GR32:$src2), + "add{l}\t{$src2, $dst|$dst, $src2}", + [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>; + +This definition makes use of the custom class ``I`` (extended from the custom +class ``X86Inst``), which is defined in the X86-specific TableGen file, to +factor out the common features that instructions of its class share. A key +feature of TableGen is that it allows the end-user to define the abstractions +they prefer to use when describing their information. + +Each def record has a special entry called "``NAME``." This is the name of the +def ("``ADD32rr``" above). 
In the general case def names can be formed from +various kinds of string processing expressions and ``NAME`` resolves to the +final value obtained after resolving all of those expressions. The user may +refer to ``NAME`` anywhere she desires to use the ultimate name of the def. +``NAME`` should not be defined anywhere else in user code to avoid conflict +problems. + +Running TableGen +---------------- + +TableGen runs just like any other LLVM tool. The first (optional) argument +specifies the file to read. If a filename is not specified, ``llvm-tblgen`` +reads from standard input. + +To be useful, one of the `TableGen backends`_ must be used. These backends are +selectable on the command line (type '``llvm-tblgen -help``' for a list). For +example, to get a list of all of the definitions that subclass a particular type +(which can be useful for building up an enum list of these records), use the +``-print-enums`` option: + +.. code-block:: bash + + $ llvm-tblgen X86.td -print-enums -class=Register + AH, AL, AX, BH, BL, BP, BPL, BX, CH, CL, CX, DH, DI, DIL, DL, DX, EAX, EBP, EBX, + ECX, EDI, EDX, EFLAGS, EIP, ESI, ESP, FP0, FP1, FP2, FP3, FP4, FP5, FP6, IP, + MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10W, R11, R11B, R11D, + R11W, R12, R12B, R12D, R12W, R13, R13B, R13D, R13W, R14, R14B, R14D, R14W, R15, + R15B, R15D, R15W, R8, R8B, R8D, R8W, R9, R9B, R9D, R9W, RAX, RBP, RBX, RCX, RDI, + RDX, RIP, RSI, RSP, SI, SIL, SP, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7, + XMM0, XMM1, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM2, XMM3, XMM4, XMM5, + XMM6, XMM7, XMM8, XMM9, + + $ llvm-tblgen X86.td -print-enums -class=Instruction + ABS_F, ABS_Fp32, ABS_Fp64, ABS_Fp80, ADC32mi, ADC32mi8, ADC32mr, ADC32ri, + ADC32ri8, ADC32rm, ADC32rr, ADC64mi32, ADC64mi8, ADC64mr, ADC64ri32, ADC64ri8, + ADC64rm, ADC64rr, ADD16mi, ADD16mi8, ADD16mr, ADD16ri, ADD16ri8, ADD16rm, + ADD16rr, ADD32mi, ADD32mi8, ADD32mr, ADD32ri, ADD32ri8, ADD32rm, ADD32rr, + ADD64mi32, ADD64mi8, 
ADD64mr, ADD64ri32, ... + +The default backend prints out all of the records, as `described above`_. + +If you plan to use TableGen, you will most likely have to `write a backend`_ +that extracts the information specific to what you need and formats it in the +appropriate way. + +.. _parses a file: + +TableGen syntax +=============== + +TableGen doesn't care about the meaning of data (that is up to the backend to +define), but it does care about syntax, and it enforces a simple type system. +This section describes the syntax and the constructs allowed in a TableGen file. + +TableGen primitives +------------------- + +TableGen comments +^^^^^^^^^^^^^^^^^ + +TableGen supports BCPL style "``//``" comments, which run to the end of the +line, and it also supports **nestable** "``/* */``" comments. + +.. _TableGen type: + +The TableGen type system +^^^^^^^^^^^^^^^^^^^^^^^^ + +TableGen files are strongly typed, in a simple (but complete) type-system. +These types are used to perform automatic conversions, check for errors, and to +help interface designers constrain the input that they allow. Every `value +definition`_ is required to have an associated type. + +TableGen supports a mixture of very low-level types (such as ``bit``) and very +high-level types (such as ``dag``). This flexibility is what allows it to +describe a wide range of information conveniently and compactly. The TableGen +types are: + +``bit`` + A 'bit' is a boolean value that can hold either 0 or 1. + +``int`` + The 'int' type represents a simple 32-bit integer value, such as 5. + +``string`` + The 'string' type represents an ordered sequence of characters of arbitrary + length. + +``bits<n>`` + A 'bits' type is an arbitrary, but fixed, size integer that is broken up + into individual bits. This type is useful because it can handle some bits + being defined while others are undefined. + +``list<ty>`` + This type represents a list whose elements are some other type. 
The + contained type is arbitrary: it can even be another list type. + +Class type + Specifying a class name in a type context means that the defined value must + be a subclass of the specified class. This is useful in conjunction with + the ``list`` type, for example, to constrain the elements of the list to + a common base class (e.g., a ``list<Register>`` can only contain + definitions derived from the "``Register``" class). + +``dag`` + This type represents a nestable directed graph of elements. + +``code`` + This represents a large block of text. This is lexically distinct from string + values because it doesn't require escaping double quotes and other common + characters that occur in code. + +To date, these types have been sufficient for describing things that TableGen +has been used for, but it is straightforward to extend this list if needed. + +.. _TableGen expressions: + +TableGen values and expressions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +TableGen allows for a fair number of different expression forms +when building up values. These forms allow the TableGen file to be written in a +natural syntax and flavor for the application. The current expression forms +supported include: + +``?`` + uninitialized field + +``0b1001011`` + binary integer value + +``07654321`` + octal integer value (indicated by a leading 0) + +``7`` + decimal integer value + +``0x7F`` + hexadecimal integer value + +``"foo"`` + string value + +``[{ ... }]`` + code fragment + +``[ X, Y, Z ]<type>`` + list value. <type> is the type of the list element and is usually optional. + In rare cases, TableGen is unable to deduce the element type, in which case + the user must specify it explicitly. 
+ +``{ a, b, c }`` + initializer for a "bits<3>" value + +``value`` + value reference + +``value{17}`` + access to one bit of a value + +``value{15-17}`` + access to multiple bits of a value + +``DEF`` + reference to a record definition + +``CLASS<val list>`` + reference to a new anonymous definition of CLASS with the specified template + arguments. + +``X.Y`` + reference to the subfield of a value + +``list[4-7,17,2-3]`` + A slice of the 'list' list, including elements 4, 5, 6, 7, 17, 2, and 3 from it. + Elements may be included multiple times. + +``foreach <var> = [ <list> ] in { <body> }`` + +``foreach <var> = [ <list> ] in <def>`` + Replicate <body> or <def>, replacing instances of <var> with each value + in <list>. <var> is scoped at the level of the ``foreach`` loop and must + not conflict with any other object introduced in <body> or <def>. Currently + only ``def``\s are expanded within <body>. + +``foreach <var> = 0-15 in ...`` + +``foreach <var> = {0-15,32-47} in ...`` + Loop over ranges of integers. The braces are required for multiple ranges. + +``(DEF a, b)`` + a dag value. The first element is required to be a record definition; the + remaining elements in the list may be arbitrary other values, including + nested '``dag``' values. + +``!strconcat(a, b)`` + A string value that is the result of concatenating the 'a' and 'b' strings. + +``str1#str2`` + "#" (paste) is a shorthand for !strconcat. It may concatenate things that + are not quoted strings, in which case an implicit !cast<string> is done on + the operand of the paste. + +``!cast<type>(a)`` + A symbol of type *type* obtained by looking up the string 'a' in the symbol + table. If the type of 'a' does not match *type*, TableGen aborts with an + error. !cast<string> is a special case in that the argument must be an + object defined by a 'def' construct. + +``!subst(a, b, c)`` + If 'a' and 'b' are of string type or are symbol references, substitute 'b' + for 'a' in 'c.' 
This operation is analogous to $(subst) in GNU make. + +``!foreach(a, b, c)`` + For each member 'b' of dag or list 'a' apply operator 'c.' 'b' is a dummy + variable that should be declared as a member variable of an instantiated + class. This operation is analogous to $(foreach) in GNU make. + +``!head(a)`` + The first element of list 'a.' + +``!tail(a)`` + The 2nd-N elements of list 'a.' + +``!empty(a)`` + An integer {0,1} indicating whether list 'a' is empty. + +``!if(a,b,c)`` + 'b' if the result of 'int' or 'bit' operator 'a' is nonzero, 'c' otherwise. + +``!eq(a,b)`` + 'bit 1' if string a is equal to string b, 0 otherwise. This only operates + on string, int and bit objects. Use !cast<string> to compare other types of + objects. + +Note that all of the values have rules specifying how they convert to values +for different types. These rules allow you to assign a value like "``7``" +to a "``bits<4>``" value, for example. + +Classes and definitions +----------------------- + +As mentioned in the `intro`_, classes and definitions (collectively known as +'records') in TableGen are the main high-level unit of information that TableGen +collects. Records are defined with a ``def`` or ``class`` keyword, the record +name, and an optional list of "`template arguments`_". If the record has +superclasses, they are specified as a comma separated list that starts with a +colon character ("``:``"). If `value definitions`_ or `let expressions`_ are +needed for the class, they are enclosed in curly braces ("``{}``"); otherwise, +the record ends with a semicolon. + +Here is a simple TableGen file: + +.. code-block:: llvm + + class C { bit V = 1; } + def X : C; + def Y : C { + string Greeting = "hello"; + } + +This example defines two definitions, ``X`` and ``Y``, both of which derive from +the ``C`` class. Because of this, they both get the ``V`` bit value. The ``Y`` +definition also gets the Greeting member as well. 
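As a rough sketch of what this means in practice, running ``llvm-tblgen`` with no backend option on the simple file above would be expected to print records along the following lines. The layout is modeled on the record dumps shown elsewhere in this document and is an assumption: the exact output differs between TableGen versions and may include extra entries such as ``NAME``.

.. code-block:: llvm

  def X { // C
    bit V = 1;
  }
  def Y { // C
    bit V = 1;
    string Greeting = "hello";
  }

Both records carry ``V`` because both list ``C`` as a superclass; only ``Y`` carries ``Greeting`` because that value is defined in its own body.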
+ +In general, classes are useful for collecting together the commonality between a +group of records and isolating it in a single place. Also, classes permit the +specification of default values for their subclasses, allowing the subclasses to +override them as they wish. + +.. _value definition: +.. _value definitions: + +Value definitions +^^^^^^^^^^^^^^^^^ + +Value definitions define named entries in records. A value must be defined +before it can be referred to as the operand for another value definition or +before the value is reset with a `let expression`_. A value is defined by +specifying a `TableGen type`_ and a name. If an initial value is available, it +may be specified after the type with an equal sign. Value definitions require +terminating semicolons. + +.. _let expression: +.. _let expressions: +.. _"let" expressions within a record: + +'let' expressions +^^^^^^^^^^^^^^^^^ + +A record-level let expression is used to change the value of a value definition +in a record. This is primarily useful when a superclass defines a value that a +derived class or definition wants to override. Let expressions consist of the +'``let``' keyword followed by a value name, an equal sign ("``=``"), and a new +value. For example, a new class could be added to the example above, redefining +the ``V`` field for all of its subclasses: + +.. code-block:: llvm + + class D : C { let V = 0; } + def Z : D; + +In this case, the ``Z`` definition will have a zero value for its ``V`` value, +despite the fact that it derives (indirectly) from the ``C`` class, because the +``D`` class overrode its value. + +.. _template arguments: + +Class template arguments +^^^^^^^^^^^^^^^^^^^^^^^^ + +TableGen permits the definition of parameterized classes as well as normal +concrete classes. Parameterized TableGen classes specify a list of variable +bindings (which may optionally have defaults) that are bound when used. Here is +a simple example: + +.. 
code-block:: llvm + + class FPFormat<bits<3> val> { + bits<3> Value = val; + } + def NotFP : FPFormat<0>; + def ZeroArgFP : FPFormat<1>; + def OneArgFP : FPFormat<2>; + def OneArgFPRW : FPFormat<3>; + def TwoArgFP : FPFormat<4>; + def CompareFP : FPFormat<5>; + def CondMovFP : FPFormat<6>; + def SpecialFP : FPFormat<7>; + +In this case, template arguments are used as a space efficient way to specify a +list of "enumeration values", each with a "``Value``" field set to the specified +integer. + +The more esoteric forms of `TableGen expressions`_ are useful in conjunction +with template arguments. As an example: + +.. code-block:: llvm + + class ModRefVal<bits<2> val> { + bits<2> Value = val; + } + + def None : ModRefVal<0>; + def Mod : ModRefVal<1>; + def Ref : ModRefVal<2>; + def ModRef : ModRefVal<3>; + + class Value<ModRefVal MR> { + // Decode some information into a more convenient format, while providing + // a nice interface to the user of the "Value" class. + bit isMod = MR.Value{0}; + bit isRef = MR.Value{1}; + + // other stuff... + } + + // Example uses + def bork : Value<Mod>; + def zork : Value<Ref>; + def hork : Value<ModRef>; + +This is obviously a contrived example, but it shows how template arguments can +be used to decouple the interface provided to the user of the class from the +actual internal data representation expected by the class. In this case, +running ``llvm-tblgen`` on the example prints the following definitions: + +.. code-block:: llvm + + def bork { // Value + bit isMod = 1; + bit isRef = 0; + } + def hork { // Value + bit isMod = 1; + bit isRef = 1; + } + def zork { // Value + bit isMod = 0; + bit isRef = 1; + } + +This shows that TableGen was able to dig into the argument and extract a piece +of information that was requested by the designer of the "Value" class. For +more realistic examples, please see existing users of TableGen, such as the X86 +backend. 
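The default values mentioned at the start of this section can be illustrated with a minimal sketch. The ``Reg`` class, its fields, and the ``R0`` and ``R1`` records are hypothetical names invented for this illustration; they are not taken from any LLVM target.

.. code-block:: llvm

  // Hypothetical class: 'enc' has a default, so a use may omit it.
  class Reg<string n, bits<5> enc = 0> {
    string AsmName = n;
    bits<5> Encoding = enc;
  }

  def R0 : Reg<"r0">;      // 'enc' falls back to its default of 0.
  def R1 : Reg<"r1", 1>;   // 'enc' is bound explicitly to 1.

The argument with a default is placed last in this sketch so that a use can omit it by supplying only the leading argument.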
+ +Multiclass definitions and instances +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +While classes with template arguments are a good way to factor out commonality +between two instances of a definition, multiclasses allow a convenient notation +for defining multiple definitions at once (instances of implicitly constructed +classes). For example, consider a 3-address instruction set whose instructions +come in two forms: "``reg = reg op reg``" and "``reg = reg op imm``" +(e.g. SPARC). In this case, you'd like to specify in one place that this +commonality exists, then in a separate place indicate what all the ops are. + +Here is an example TableGen fragment that shows this idea: + +.. code-block:: llvm + + def ops; + def GPR; + def Imm; + class inst<int opc, string asmstr, dag operandlist>; + + multiclass ri_inst<int opc, string asmstr> { + def _rr : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), + (ops GPR:$dst, GPR:$src1, GPR:$src2)>; + def _ri : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), + (ops GPR:$dst, GPR:$src1, Imm:$src2)>; + } + + // Instantiations of the ri_inst multiclass. + defm ADD : ri_inst<0b111, "add">; + defm SUB : ri_inst<0b101, "sub">; + defm MUL : ri_inst<0b100, "mul">; + ... + +The names of the resulting definitions have the multiclass fragment names appended +to them, so this defines ``ADD_rr``, ``ADD_ri``, ``SUB_rr``, etc. A ``defm`` may +inherit from multiple multiclasses, instantiating definitions from each +multiclass. Using a multiclass this way is exactly equivalent to instantiating +the classes multiple times yourself, e.g. by writing: + +.. 
code-block:: llvm + + def ops; + def GPR; + def Imm; + class inst<int opc, string asmstr, dag operandlist>; + + class rrinst<int opc, string asmstr> + : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), + (ops GPR:$dst, GPR:$src1, GPR:$src2)>; + + class riinst<int opc, string asmstr> + : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"), + (ops GPR:$dst, GPR:$src1, Imm:$src2)>; + + // Instantiations of the ri_inst multiclass. + def ADD_rr : rrinst<0b111, "add">; + def ADD_ri : riinst<0b111, "add">; + def SUB_rr : rrinst<0b101, "sub">; + def SUB_ri : riinst<0b101, "sub">; + def MUL_rr : rrinst<0b100, "mul">; + def MUL_ri : riinst<0b100, "mul">; + ... + +A ``defm`` can also be used inside a multiclass, providing several levels of +multiclass instantiation. + +.. code-block:: llvm + + class Instruction<bits<4> opc, string Name> { + bits<4> opcode = opc; + string name = Name; + } + + multiclass basic_r<bits<4> opc> { + def rr : Instruction<opc, "rr">; + def rm : Instruction<opc, "rm">; + } + + multiclass basic_s<bits<4> opc> { + defm SS : basic_r<opc>; + defm SD : basic_r<opc>; + def X : Instruction<opc, "x">; + } + + multiclass basic_p<bits<4> opc> { + defm PS : basic_r<opc>; + defm PD : basic_r<opc>; + def Y : Instruction<opc, "y">; + } + + defm ADD : basic_s<0xf>, basic_p<0xf>; + ... + + // Results + def ADDPDrm { ... + def ADDPDrr { ... + def ADDPSrm { ... + def ADDPSrr { ... + def ADDSDrm { ... + def ADDSDrr { ... + def ADDY { ... + def ADDX { ... + +``defm`` declarations can inherit from classes too. The rule to follow is that +the class list must start after the last multiclass, and there must be at least +one multiclass before it. + +.. 
code-block:: llvm + + class XD { bits<4> Prefix = 11; } + class XS { bits<4> Prefix = 12; } + + class I<bits<4> op> { + bits<4> opcode = op; + } + + multiclass R { + def rr : I<4>; + def rm : I<2>; + } + + multiclass Y { + defm SS : R, XD; + defm SD : R, XS; + } + + defm Instr : Y; + + // Results + def InstrSDrm { + bits<4> opcode = { 0, 0, 1, 0 }; + bits<4> Prefix = { 1, 1, 0, 0 }; + } + ... + def InstrSSrr { + bits<4> opcode = { 0, 1, 0, 0 }; + bits<4> Prefix = { 1, 0, 1, 1 }; + } + +File scope entities +------------------- + +File inclusion +^^^^^^^^^^^^^^ + +TableGen supports the '``include``' token, which textually substitutes the +specified file in place of the include directive. The filename should be +specified as a double quoted string immediately after the '``include``' keyword. +Example: + +.. code-block:: llvm + + include "foo.td" + +'let' expressions +^^^^^^^^^^^^^^^^^ + +"Let" expressions at file scope are similar to `"let" expressions within a +record`_, except they can specify a value binding for multiple records at a +time, and may be useful in certain other cases. File-scope let expressions are +really just another way that TableGen allows the end-user to factor out +commonality from the records. + +File-scope "let" expressions take a comma-separated list of bindings to apply, +and one or more records to bind the values in. Here are some examples: + +.. code-block:: llvm + + let isTerminator = 1, isReturn = 1, isBarrier = 1, hasCtrlDep = 1 in + def RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>; + + let isCall = 1 in + // All calls clobber the non-callee saved registers... 
+ + let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0, + MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, + XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in { + def CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst,variable_ops), + "call\t${dst:call}", []>; + def CALL32r : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops), + "call\t{*}$dst", [(X86call GR32:$dst)]>; + def CALL32m : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops), + "call\t{*}$dst", []>; + } + +File-scope "let" expressions are often useful when a couple of definitions need +to be added to several records, and the records do not otherwise need to be +opened, as is the case with the ``CALL*`` instructions above. + +It's also possible to use "let" expressions inside multiclasses, providing more +ways to factor out commonality from the records, especially when using several +levels of multiclass instantiation. This also avoids the need to use "let" +expressions within subsequent records inside a multiclass. + +.. code-block:: llvm + + multiclass basic_r<bits<4> opc> { + let Predicates = [HasSSE2] in { + def rr : Instruction<opc, "rr">; + def rm : Instruction<opc, "rm">; + } + let Predicates = [HasSSE3] in + def rx : Instruction<opc, "rx">; + } + + multiclass basic_ss<bits<4> opc> { + let IsDouble = 0 in + defm SS : basic_r<opc>; + + let IsDouble = 1 in + defm SD : basic_r<opc>; + } + + defm ADD : basic_ss<0xf>; + +Looping +^^^^^^^ + +TableGen supports the '``foreach``' block, which textually replicates the loop +body, substituting iterator values for iterator references in the body. +Example: + +.. code-block:: llvm + + foreach i = [0, 1, 2, 3] in { + def R#i : Register<...>; + def F#i : Register<...>; + } + +This will create objects ``R0`` through ``R3`` and ``F0`` through ``F3``. ``foreach`` blocks +may be nested. If there is only one item in the body, the braces may be +elided: + +.. 
code-block:: llvm + + foreach i = [0, 1, 2, 3] in + def R#i : Register<...>; + +Code Generator backend info +=========================== + +Expressions used by the code generator to describe instructions and isel patterns: + +``(implicit a)`` + an implicitly defined physical register. This tells the DAG instruction + selection emitter that the input pattern's extra definitions match implicit + physical register definitions. + +.. _TableGen backend: +.. _TableGen backends: +.. _write a backend: + +TableGen backends +================= + +TODO: How they work, how to write one. This section should not contain details +about any particular backend, except maybe ``-print-enums`` as an example. This +should highlight the APIs in ``TableGen/Record.h``. diff --git a/docs/subsystems.rst b/docs/subsystems.rst index 9ceb842420..e643e7d4f3 100644 --- a/docs/subsystems.rst +++ b/docs/subsystems.rst @@ -10,6 +10,7 @@ Subsystem Documentation BranchWeightMetadata LinkTimeOptimization SegmentedStacks + TableGenFundamentals * `Writing an LLVM Pass <WritingAnLLVMPass.html>`_ @@ -25,8 +26,8 @@ Subsystem Documentation working on retargetting LLVM to a new architecture, designing a new codegen pass, or enhancing existing components. -* `TableGen Fundamentals <TableGenFundamentals.html>`_ - +* :ref:`tablegen` + Describes the TableGen tool, which is used heavily by the LLVM code generator.