diff options
Diffstat (limited to 'docs/tutorial/LangImpl4.html')
-rw-r--r-- | docs/tutorial/LangImpl4.html | 1152 |
1 files changed, 0 insertions, 1152 deletions
diff --git a/docs/tutorial/LangImpl4.html b/docs/tutorial/LangImpl4.html deleted file mode 100644 index 53695924b2..0000000000 --- a/docs/tutorial/LangImpl4.html +++ /dev/null @@ -1,1152 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> - -<html> -<head> - <title>Kaleidoscope: Adding JIT and Optimizer Support</title> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <meta name="author" content="Chris Lattner"> - <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> -</head> - -<body> - -<h1>Kaleidoscope: Adding JIT and Optimizer Support</h1> - -<ul> -<li><a href="index.html">Up to Tutorial Index</a></li> -<li>Chapter 4 - <ol> - <li><a href="#intro">Chapter 4 Introduction</a></li> - <li><a href="#trivialconstfold">Trivial Constant Folding</a></li> - <li><a href="#optimizerpasses">LLVM Optimization Passes</a></li> - <li><a href="#jit">Adding a JIT Compiler</a></li> - <li><a href="#code">Full Code Listing</a></li> - </ol> -</li> -<li><a href="LangImpl5.html">Chapter 5</a>: Extending the Language: Control -Flow</li> -</ul> - -<div class="doc_author"> - <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> -</div> - -<!-- *********************************************************************** --> -<h2><a name="intro">Chapter 4 Introduction</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>Welcome to Chapter 4 of the "<a href="index.html">Implementing a language -with LLVM</a>" tutorial. Chapters 1-3 described the implementation of a simple -language and added support for generating LLVM IR. This chapter describes -two new techniques: adding optimizer support to your language, and adding JIT -compiler support. These additions will demonstrate how to get nice, efficient code -for the Kaleidoscope language.</p> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="trivialconstfold">Trivial Constant Folding</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p> -Our demonstration for Chapter 3 is elegant and easy to extend. Unfortunately, -it does not produce wonderful code. The IRBuilder, however, does give us -obvious optimizations when compiling simple code:</p> - -<div class="doc_code"> -<pre> -ready> <b>def test(x) 1+2+x;</b> -Read function definition: -define double @test(double %x) { -entry: - %addtmp = fadd double 3.000000e+00, %x - ret double %addtmp -} -</pre> -</div> - -<p>This code is not a literal transcription of the AST built by parsing the -input. That would be: - -<div class="doc_code"> -<pre> -ready> <b>def test(x) 1+2+x;</b> -Read function definition: -define double @test(double %x) { -entry: - %addtmp = fadd double 2.000000e+00, 1.000000e+00 - %addtmp1 = fadd double %addtmp, %x - ret double %addtmp1 -} -</pre> -</div> - -<p>Constant folding, as seen above, in particular, is a very common and very -important optimization: so much so that many language implementors implement -constant folding support in their AST representation.</p> - -<p>With LLVM, you don't need this support in the AST. Since all calls to build -LLVM IR go through the LLVM IR builder, the builder itself checked to see if -there was a constant folding opportunity when you call it. If so, it just does -the constant fold and return the constant instead of creating an instruction. - -<p>Well, that was easy :). In practice, we recommend always using -<tt>IRBuilder</tt> when generating code like this. It has no -"syntactic overhead" for its use (you don't have to uglify your compiler with -constant checks everywhere) and it can dramatically reduce the amount of -LLVM IR that is generated in some cases (particular for languages with a macro -preprocessor or that use a lot of constants).</p> - -<p>On the other hand, the <tt>IRBuilder</tt> is limited by the fact -that it does all of its analysis inline with the code as it is built. If you -take a slightly more complex example:</p> - -<div class="doc_code"> -<pre> -ready> <b>def test(x) (1+2+x)*(x+(1+2));</b> -ready> Read function definition: -define double @test(double %x) { -entry: - %addtmp = fadd double 3.000000e+00, %x - %addtmp1 = fadd double %x, 3.000000e+00 - %multmp = fmul double %addtmp, %addtmp1 - ret double %multmp -} -</pre> -</div> - -<p>In this case, the LHS and RHS of the multiplication are the same value. We'd -really like to see this generate "<tt>tmp = x+3; result = tmp*tmp;</tt>" instead -of computing "<tt>x+3</tt>" twice.</p> - -<p>Unfortunately, no amount of local analysis will be able to detect and correct -this. This requires two transformations: reassociation of expressions (to -make the add's lexically identical) and Common Subexpression Elimination (CSE) -to delete the redundant add instruction. Fortunately, LLVM provides a broad -range of optimizations that you can use, in the form of "passes".</p> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="optimizerpasses">LLVM Optimization Passes</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>LLVM provides many optimization passes, which do many different sorts of -things and have different tradeoffs. Unlike other systems, LLVM doesn't hold -to the mistaken notion that one set of optimizations is right for all languages -and for all situations. LLVM allows a compiler implementor to make complete -decisions about what optimizations to use, in which order, and in what -situation.</p> - -<p>As a concrete example, LLVM supports both "whole module" passes, which look -across as large of body of code as they can (often a whole file, but if run -at link time, this can be a substantial portion of the whole program). It also -supports and includes "per-function" passes which just operate on a single -function at a time, without looking at other functions. For more information -on passes and how they are run, see the <a href="../WritingAnLLVMPass.html">How -to Write a Pass</a> document and the <a href="../Passes.html">List of LLVM -Passes</a>.</p> - -<p>For Kaleidoscope, we are currently generating functions on the fly, one at -a time, as the user types them in. We aren't shooting for the ultimate -optimization experience in this setting, but we also want to catch the easy and -quick stuff where possible. As such, we will choose to run a few per-function -optimizations as the user types the function in. If we wanted to make a "static -Kaleidoscope compiler", we would use exactly the code we have now, except that -we would defer running the optimizer until the entire file has been parsed.</p> - -<p>In order to get per-function optimizations going, we need to set up a -<a href="../WritingAnLLVMPass.html#passmanager">FunctionPassManager</a> to hold and -organize the LLVM optimizations that we want to run. Once we have that, we can -add a set of optimizations to run. The code looks like this:</p> - -<div class="doc_code"> -<pre> - FunctionPassManager OurFPM(TheModule); - - // Set up the optimizer pipeline. Start with registering info about how the - // target lays out data structures. - OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout())); - // Provide basic AliasAnalysis support for GVN. - OurFPM.add(createBasicAliasAnalysisPass()); - // Do simple "peephole" optimizations and bit-twiddling optzns. - OurFPM.add(createInstructionCombiningPass()); - // Reassociate expressions. - OurFPM.add(createReassociatePass()); - // Eliminate Common SubExpressions. - OurFPM.add(createGVNPass()); - // Simplify the control flow graph (deleting unreachable blocks, etc). - OurFPM.add(createCFGSimplificationPass()); - - OurFPM.doInitialization(); - - // Set the global so the code gen can use this. - TheFPM = &OurFPM; - - // Run the main "interpreter loop" now. - MainLoop(); -</pre> -</div> - -<p>This code defines a <tt>FunctionPassManager</tt>, "<tt>OurFPM</tt>". It -requires a pointer to the <tt>Module</tt> to construct itself. Once it is set -up, we use a series of "add" calls to add a bunch of LLVM passes. The first -pass is basically boilerplate, it adds a pass so that later optimizations know -how the data structures in the program are laid out. The -"<tt>TheExecutionEngine</tt>" variable is related to the JIT, which we will get -to in the next section.</p> - -<p>In this case, we choose to add 4 optimization passes. The passes we chose -here are a pretty standard set of "cleanup" optimizations that are useful for -a wide variety of code. I won't delve into what they do but, believe me, -they are a good starting place :).</p> - -<p>Once the PassManager is set up, we need to make use of it. We do this by -running it after our newly created function is constructed (in -<tt>FunctionAST::Codegen</tt>), but before it is returned to the client:</p> - -<div class="doc_code"> -<pre> - if (Value *RetVal = Body->Codegen()) { - // Finish off the function. - Builder.CreateRet(RetVal); - - // Validate the generated code, checking for consistency. - verifyFunction(*TheFunction); - - <b>// Optimize the function. - TheFPM->run(*TheFunction);</b> - - return TheFunction; - } -</pre> -</div> - -<p>As you can see, this is pretty straightforward. The -<tt>FunctionPassManager</tt> optimizes and updates the LLVM Function* in place, -improving (hopefully) its body. With this in place, we can try our test above -again:</p> - -<div class="doc_code"> -<pre> -ready> <b>def test(x) (1+2+x)*(x+(1+2));</b> -ready> Read function definition: -define double @test(double %x) { -entry: - %addtmp = fadd double %x, 3.000000e+00 - %multmp = fmul double %addtmp, %addtmp - ret double %multmp -} -</pre> -</div> - -<p>As expected, we now get our nicely optimized code, saving a floating point -add instruction from every execution of this function.</p> - -<p>LLVM provides a wide variety of optimizations that can be used in certain -circumstances. Some <a href="../Passes.html">documentation about the various -passes</a> is available, but it isn't very complete. Another good source of -ideas can come from looking at the passes that <tt>Clang</tt> runs to get -started. The "<tt>opt</tt>" tool allows you to experiment with passes from the -command line, so you can see if they do anything.</p> - -<p>Now that we have reasonable code coming out of our front-end, lets talk about -executing it!</p> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="jit">Adding a JIT Compiler</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>Code that is available in LLVM IR can have a wide variety of tools -applied to it. For example, you can run optimizations on it (as we did above), -you can dump it out in textual or binary forms, you can compile the code to an -assembly file (.s) for some target, or you can JIT compile it. The nice thing -about the LLVM IR representation is that it is the "common currency" between -many different parts of the compiler. -</p> - -<p>In this section, we'll add JIT compiler support to our interpreter. The -basic idea that we want for Kaleidoscope is to have the user enter function -bodies as they do now, but immediately evaluate the top-level expressions they -type in. For example, if they type in "1 + 2;", we should evaluate and print -out 3. If they define a function, they should be able to call it from the -command line.</p> - -<p>In order to do this, we first declare and initialize the JIT. This is done -by adding a global variable and a call in <tt>main</tt>:</p> - -<div class="doc_code"> -<pre> -<b>static ExecutionEngine *TheExecutionEngine;</b> -... -int main() { - .. - <b>// Create the JIT. This takes ownership of the module. - TheExecutionEngine = EngineBuilder(TheModule).create();</b> - .. -} -</pre> -</div> - -<p>This creates an abstract "Execution Engine" which can be either a JIT -compiler or the LLVM interpreter. LLVM will automatically pick a JIT compiler -for you if one is available for your platform, otherwise it will fall back to -the interpreter.</p> - -<p>Once the <tt>ExecutionEngine</tt> is created, the JIT is ready to be used. -There are a variety of APIs that are useful, but the simplest one is the -"<tt>getPointerToFunction(F)</tt>" method. This method JIT compiles the -specified LLVM Function and returns a function pointer to the generated machine -code. In our case, this means that we can change the code that parses a -top-level expression to look like this:</p> - -<div class="doc_code"> -<pre> -static void HandleTopLevelExpression() { - // Evaluate a top-level expression into an anonymous function. - if (FunctionAST *F = ParseTopLevelExpr()) { - if (Function *LF = F->Codegen()) { - LF->dump(); // Dump the function for exposition purposes. - - <b>// JIT the function, returning a function pointer. - void *FPtr = TheExecutionEngine->getPointerToFunction(LF); - - // Cast it to the right type (takes no arguments, returns a double) so we - // can call it as a native function. - double (*FP)() = (double (*)())(intptr_t)FPtr; - fprintf(stderr, "Evaluated to %f\n", FP());</b> - } -</pre> -</div> - -<p>Recall that we compile top-level expressions into a self-contained LLVM -function that takes no arguments and returns the computed double. Because the -LLVM JIT compiler matches the native platform ABI, this means that you can just -cast the result pointer to a function pointer of that type and call it directly. -This means, there is no difference between JIT compiled code and native machine -code that is statically linked into your application.</p> - -<p>With just these two changes, lets see how Kaleidoscope works now!</p> - -<div class="doc_code"> -<pre> -ready> <b>4+5;</b> -Read top-level expression: -define double @0() { -entry: - ret double 9.000000e+00 -} - -<em>Evaluated to 9.000000</em> -</pre> -</div> - -<p>Well this looks like it is basically working. The dump of the function -shows the "no argument function that always returns double" that we synthesize -for each top-level expression that is typed in. This demonstrates very basic -functionality, but can we do more?</p> - -<div class="doc_code"> -<pre> -ready> <b>def testfunc(x y) x + y*2; </b> -Read function definition: -define double @testfunc(double %x, double %y) { -entry: - %multmp = fmul double %y, 2.000000e+00 - %addtmp = fadd double %multmp, %x - ret double %addtmp -} - -ready> <b>testfunc(4, 10);</b> -Read top-level expression: -define double @1() { -entry: - %calltmp = call double @testfunc(double 4.000000e+00, double 1.000000e+01) - ret double %calltmp -} - -<em>Evaluated to 24.000000</em> -</pre> -</div> - -<p>This illustrates that we can now call user code, but there is something a bit -subtle going on here. Note that we only invoke the JIT on the anonymous -functions that <em>call testfunc</em>, but we never invoked it -on <em>testfunc</em> itself. What actually happened here is that the JIT -scanned for all non-JIT'd functions transitively called from the anonymous -function and compiled all of them before returning -from <tt>getPointerToFunction()</tt>.</p> - -<p>The JIT provides a number of other more advanced interfaces for things like -freeing allocated machine code, rejit'ing functions to update them, etc. -However, even with this simple code, we get some surprisingly powerful -capabilities - check this out (I removed the dump of the anonymous functions, -you should get the idea by now :) :</p> - -<div class="doc_code"> -<pre> -ready> <b>extern sin(x);</b> -Read extern: -declare double @sin(double) - -ready> <b>extern cos(x);</b> -Read extern: -declare double @cos(double) - -ready> <b>sin(1.0);</b> -Read top-level expression: -define double @2() { -entry: - ret double 0x3FEAED548F090CEE -} - -<em>Evaluated to 0.841471</em> - -ready> <b>def foo(x) sin(x)*sin(x) + cos(x)*cos(x);</b> -Read function definition: -define double @foo(double %x) { -entry: - %calltmp = call double @sin(double %x) - %multmp = fmul double %calltmp, %calltmp - %calltmp2 = call double @cos(double %x) - %multmp4 = fmul double %calltmp2, %calltmp2 - %addtmp = fadd double %multmp, %multmp4 - ret double %addtmp -} - -ready> <b>foo(4.0);</b> -Read top-level expression: -define double @3() { -entry: - %calltmp = call double @foo(double 4.000000e+00) - ret double %calltmp -} - -<em>Evaluated to 1.000000</em> -</pre> -</div> - -<p>Whoa, how does the JIT know about sin and cos? The answer is surprisingly -simple: in this -example, the JIT started execution of a function and got to a function call. It -realized that the function was not yet JIT compiled and invoked the standard set -of routines to resolve the function. In this case, there is no body defined -for the function, so the JIT ended up calling "<tt>dlsym("sin")</tt>" on the -Kaleidoscope process itself. -Since "<tt>sin</tt>" is defined within the JIT's address space, it simply -patches up calls in the module to call the libm version of <tt>sin</tt> -directly.</p> - -<p>The LLVM JIT provides a number of interfaces (look in the -<tt>ExecutionEngine.h</tt> file) for controlling how unknown functions get -resolved. It allows you to establish explicit mappings between IR objects and -addresses (useful for LLVM global variables that you want to map to static -tables, for example), allows you to dynamically decide on the fly based on the -function name, and even allows you to have the JIT compile functions lazily the -first time they're called.</p> - -<p>One interesting application of this is that we can now extend the language -by writing arbitrary C++ code to implement operations. For example, if we add: -</p> - -<div class="doc_code"> -<pre> -/// putchard - putchar that takes a double and returns 0. -extern "C" -double putchard(double X) { - putchar((char)X); - return 0; -} -</pre> -</div> - -<p>Now we can produce simple output to the console by using things like: -"<tt>extern putchard(x); putchard(120);</tt>", which prints a lowercase 'x' on -the console (120 is the ASCII code for 'x'). Similar code could be used to -implement file I/O, console input, and many other capabilities in -Kaleidoscope.</p> - -<p>This completes the JIT and optimizer chapter of the Kaleidoscope tutorial. At -this point, we can compile a non-Turing-complete programming language, optimize -and JIT compile it in a user-driven way. Next up we'll look into <a -href="LangImpl5.html">extending the language with control flow constructs</a>, -tackling some interesting LLVM IR issues along the way.</p> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="code">Full Code Listing</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p> -Here is the complete code listing for our running example, enhanced with the -LLVM JIT and optimizer. To build this example, use: -</p> - -<div class="doc_code"> -<pre> -# Compile -clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy -# Run -./toy -</pre> -</div> - -<p> -If you are compiling this on Linux, make sure to add the "-rdynamic" option -as well. This makes sure that the external functions are resolved properly -at runtime.</p> - -<p>Here is the code:</p> - -<div class="doc_code"> -<pre> -#include "llvm/DerivedTypes.h" -#include "llvm/ExecutionEngine/ExecutionEngine.h" -#include "llvm/ExecutionEngine/JIT.h" -#include "llvm/IRBuilder.h" -#include "llvm/LLVMContext.h" -#include "llvm/Module.h" -#include "llvm/PassManager.h" -#include "llvm/Analysis/Verifier.h" -#include "llvm/Analysis/Passes.h" -#include "llvm/DataLayout.h" -#include "llvm/Transforms/Scalar.h" -#include "llvm/Support/TargetSelect.h" -#include <cstdio> -#include <string> -#include <map> -#include <vector> -using namespace llvm; - -//===----------------------------------------------------------------------===// -// Lexer -//===----------------------------------------------------------------------===// - -// The lexer returns tokens [0-255] if it is an unknown character, otherwise one -// of these for known things. -enum Token { - tok_eof = -1, - - // commands - tok_def = -2, tok_extern = -3, - - // primary - tok_identifier = -4, tok_number = -5 -}; - -static std::string IdentifierStr; // Filled in if tok_identifier -static double NumVal; // Filled in if tok_number - -/// gettok - Return the next token from standard input. -static int gettok() { - static int LastChar = ' '; - - // Skip any whitespace. - while (isspace(LastChar)) - LastChar = getchar(); - - if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]* - IdentifierStr = LastChar; - while (isalnum((LastChar = getchar()))) - IdentifierStr += LastChar; - - if (IdentifierStr == "def") return tok_def; - if (IdentifierStr == "extern") return tok_extern; - return tok_identifier; - } - - if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+ - std::string NumStr; - do { - NumStr += LastChar; - LastChar = getchar(); - } while (isdigit(LastChar) || LastChar == '.'); - - NumVal = strtod(NumStr.c_str(), 0); - return tok_number; - } - - if (LastChar == '#') { - // Comment until end of line. - do LastChar = getchar(); - while (LastChar != EOF && LastChar != '\n' && LastChar != '\r'); - - if (LastChar != EOF) - return gettok(); - } - - // Check for end of file. Don't eat the EOF. - if (LastChar == EOF) - return tok_eof; - - // Otherwise, just return the character as its ascii value. - int ThisChar = LastChar; - LastChar = getchar(); - return ThisChar; -} - -//===----------------------------------------------------------------------===// -// Abstract Syntax Tree (aka Parse Tree) -//===----------------------------------------------------------------------===// - -/// ExprAST - Base class for all expression nodes. -class ExprAST { -public: - virtual ~ExprAST() {} - virtual Value *Codegen() = 0; -}; - -/// NumberExprAST - Expression class for numeric literals like "1.0". -class NumberExprAST : public ExprAST { - double Val; -public: - NumberExprAST(double val) : Val(val) {} - virtual Value *Codegen(); -}; - -/// VariableExprAST - Expression class for referencing a variable, like "a". -class VariableExprAST : public ExprAST { - std::string Name; -public: - VariableExprAST(const std::string &name) : Name(name) {} - virtual Value *Codegen(); -}; - -/// BinaryExprAST - Expression class for a binary operator. -class BinaryExprAST : public ExprAST { - char Op; - ExprAST *LHS, *RHS; -public: - BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) - : Op(op), LHS(lhs), RHS(rhs) {} - virtual Value *Codegen(); -}; - -/// CallExprAST - Expression class for function calls. -class CallExprAST : public ExprAST { - std::string Callee; - std::vector<ExprAST*> Args; -public: - CallExprAST(const std::string &callee, std::vector<ExprAST*> &args) - : Callee(callee), Args(args) {} - virtual Value *Codegen(); -}; - -/// PrototypeAST - This class represents the "prototype" for a function, -/// which captures its name, and its argument names (thus implicitly the number -/// of arguments the function takes). -class PrototypeAST { - std::string Name; - std::vector<std::string> Args; -public: - PrototypeAST(const std::string &name, const std::vector<std::string> &args) - : Name(name), Args(args) {} - - Function *Codegen(); -}; - -/// FunctionAST - This class represents a function definition itself. -class FunctionAST { - PrototypeAST *Proto; - ExprAST *Body; -public: - FunctionAST(PrototypeAST *proto, ExprAST *body) - : Proto(proto), Body(body) {} - - Function *Codegen(); -}; - -//===----------------------------------------------------------------------===// -// Parser -//===----------------------------------------------------------------------===// - -/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current -/// token the parser is looking at. getNextToken reads another token from the -/// lexer and updates CurTok with its results. -static int CurTok; -static int getNextToken() { - return CurTok = gettok(); -} - -/// BinopPrecedence - This holds the precedence for each binary operator that is -/// defined. -static std::map<char, int> BinopPrecedence; - -/// GetTokPrecedence - Get the precedence of the pending binary operator token. -static int GetTokPrecedence() { - if (!isascii(CurTok)) - return -1; - - // Make sure it's a declared binop. - int TokPrec = BinopPrecedence[CurTok]; - if (TokPrec <= 0) return -1; - return TokPrec; -} - -/// Error* - These are little helper functions for error handling. -ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;} -PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; } -FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; } - -static ExprAST *ParseExpression(); - -/// identifierexpr -/// ::= identifier -/// ::= identifier '(' expression* ')' -static ExprAST *ParseIdentifierExpr() { - std::string IdName = IdentifierStr; - - getNextToken(); // eat identifier. - - if (CurTok != '(') // Simple variable ref. - return new VariableExprAST(IdName); - - // Call. - getNextToken(); // eat ( - std::vector<ExprAST*> Args; - if (CurTok != ')') { - while (1) { - ExprAST *Arg = ParseExpression(); - if (!Arg) return 0; - Args.push_back(Arg); - - if (CurTok == ')') break; - - if (CurTok != ',') - return Error("Expected ')' or ',' in argument list"); - getNextToken(); - } - } - - // Eat the ')'. - getNextToken(); - - return new CallExprAST(IdName, Args); -} - -/// numberexpr ::= number -static ExprAST *ParseNumberExpr() { - ExprAST *Result = new NumberExprAST(NumVal); - getNextToken(); // consume the number - return Result; -} - -/// parenexpr ::= '(' expression ')' -static ExprAST *ParseParenExpr() { - getNextToken(); // eat (. - ExprAST *V = ParseExpression(); - if (!V) return 0; - - if (CurTok != ')') - return Error("expected ')'"); - getNextToken(); // eat ). - return V; -} - -/// primary -/// ::= identifierexpr -/// ::= numberexpr -/// ::= parenexpr -static ExprAST *ParsePrimary() { - switch (CurTok) { - default: return Error("unknown token when expecting an expression"); - case tok_identifier: return ParseIdentifierExpr(); - case tok_number: return ParseNumberExpr(); - case '(': return ParseParenExpr(); - } -} - -/// binoprhs -/// ::= ('+' primary)* -static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) { - // If this is a binop, find its precedence. - while (1) { - int TokPrec = GetTokPrecedence(); - - // If this is a binop that binds at least as tightly as the current binop, - // consume it, otherwise we are done. - if (TokPrec < ExprPrec) - return LHS; - - // Okay, we know this is a binop. - int BinOp = CurTok; - getNextToken(); // eat binop - - // Parse the primary expression after the binary operator. - ExprAST *RHS = ParsePrimary(); - if (!RHS) return 0; - - // If BinOp binds less tightly with RHS than the operator after RHS, let - // the pending operator take RHS as its LHS. - int NextPrec = GetTokPrecedence(); - if (TokPrec < NextPrec) { - RHS = ParseBinOpRHS(TokPrec+1, RHS); - if (RHS == 0) return 0; - } - - // Merge LHS/RHS. - LHS = new BinaryExprAST(BinOp, LHS, RHS); - } -} - -/// expression -/// ::= primary binoprhs -/// -static ExprAST *ParseExpression() { - ExprAST *LHS = ParsePrimary(); - if (!LHS) return 0; - - return ParseBinOpRHS(0, LHS); -} - -/// prototype -/// ::= id '(' id* ')' -static PrototypeAST *ParsePrototype() { - if (CurTok != tok_identifier) - return ErrorP("Expected function name in prototype"); - - std::string FnName = IdentifierStr; - getNextToken(); - - if (CurTok != '(') - return ErrorP("Expected '(' in prototype"); - - std::vector<std::string> ArgNames; - while (getNextToken() == tok_identifier) - ArgNames.push_back(IdentifierStr); - if (CurTok != ')') - return ErrorP("Expected ')' in prototype"); - - // success. - getNextToken(); // eat ')'. - - return new PrototypeAST(FnName, ArgNames); -} - -/// definition ::= 'def' prototype expression -static FunctionAST *ParseDefinition() { - getNextToken(); // eat def. - PrototypeAST *Proto = ParsePrototype(); - if (Proto == 0) return 0; - - if (ExprAST *E = ParseExpression()) - return new FunctionAST(Proto, E); - return 0; -} - -/// toplevelexpr ::= expression -static FunctionAST *ParseTopLevelExpr() { - if (ExprAST *E = ParseExpression()) { - // Make an anonymous proto. - PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>()); - return new FunctionAST(Proto, E); - } - return 0; -} - -/// external ::= 'extern' prototype -static PrototypeAST *ParseExtern() { - getNextToken(); // eat extern. - return ParsePrototype(); -} - -//===----------------------------------------------------------------------===// -// Code Generation -//===----------------------------------------------------------------------===// - -static Module *TheModule; -static IRBuilder<> Builder(getGlobalContext()); -static std::map<std::string, Value*> NamedValues; -static FunctionPassManager *TheFPM; - -Value *ErrorV(const char *Str) { Error(Str); return 0; } - -Value *NumberExprAST::Codegen() { - return ConstantFP::get(getGlobalContext(), APFloat(Val)); -} - -Value *VariableExprAST::Codegen() { - // Look this variable up in the function. - Value *V = NamedValues[Name]; - return V ? V : ErrorV("Unknown variable name"); -} - -Value *BinaryExprAST::Codegen() { - Value *L = LHS->Codegen(); - Value *R = RHS->Codegen(); - if (L == 0 || R == 0) return 0; - - switch (Op) { - case '+': return Builder.CreateFAdd(L, R, "addtmp"); - case '-': return Builder.CreateFSub(L, R, "subtmp"); - case '*': return Builder.CreateFMul(L, R, "multmp"); - case '<': - L = Builder.CreateFCmpULT(L, R, "cmptmp"); - // Convert bool 0/1 to double 0.0 or 1.0 - return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()), - "booltmp"); - default: return ErrorV("invalid binary operator"); - } -} - -Value *CallExprAST::Codegen() { - // Look up the name in the global module table. - Function *CalleeF = TheModule->getFunction(Callee); - if (CalleeF == 0) - return ErrorV("Unknown function referenced"); - - // If argument mismatch error. - if (CalleeF->arg_size() != Args.size()) - return ErrorV("Incorrect # arguments passed"); - - std::vector<Value*> ArgsV; - for (unsigned i = 0, e = Args.size(); i != e; ++i) { - ArgsV.push_back(Args[i]->Codegen()); - if (ArgsV.back() == 0) return 0; - } - - return Builder.CreateCall(CalleeF, ArgsV, "calltmp"); -} - -Function *PrototypeAST::Codegen() { - // Make the function type: double(double,double) etc. - std::vector<Type*> Doubles(Args.size(), - Type::getDoubleTy(getGlobalContext())); - FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()), - Doubles, false); - - Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule); - - // If F conflicted, there was already something named 'Name'. If it has a - // body, don't allow redefinition or reextern. - if (F->getName() != Name) { - // Delete the one we just made and get the existing one. - F->eraseFromParent(); - F = TheModule->getFunction(Name); - - // If F already has a body, reject this. - if (!F->empty()) { - ErrorF("redefinition of function"); - return 0; - } - - // If F took a different number of args, reject. - if (F->arg_size() != Args.size()) { - ErrorF("redefinition of function with different # args"); - return 0; - } - } - - // Set names for all arguments. - unsigned Idx = 0; - for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size(); - ++AI, ++Idx) { - AI->setName(Args[Idx]); - - // Add arguments to variable symbol table. - NamedValues[Args[Idx]] = AI; - } - - return F; -} - -Function *FunctionAST::Codegen() { - NamedValues.clear(); - - Function *TheFunction = Proto->Codegen(); - if (TheFunction == 0) - return 0; - - // Create a new basic block to start insertion into. - BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction); - Builder.SetInsertPoint(BB); - - if (Value *RetVal = Body->Codegen()) { - // Finish off the function. - Builder.CreateRet(RetVal); - - // Validate the generated code, checking for consistency. - verifyFunction(*TheFunction); - - // Optimize the function. - TheFPM->run(*TheFunction); - - return TheFunction; - } - - // Error reading body, remove function. - TheFunction->eraseFromParent(); - return 0; -} - -//===----------------------------------------------------------------------===// -// Top-Level parsing and JIT Driver -//===----------------------------------------------------------------------===// - -static ExecutionEngine *TheExecutionEngine; - -static void HandleDefinition() { - if (FunctionAST *F = ParseDefinition()) { - if (Function *LF = F->Codegen()) { - fprintf(stderr, "Read function definition:"); - LF->dump(); - } - } else { - // Skip token for error recovery. - getNextToken(); - } -} - -static void HandleExtern() { - if (PrototypeAST *P = ParseExtern()) { - if (Function *F = P->Codegen()) { - fprintf(stderr, "Read extern: "); - F->dump(); - } - } else { - // Skip token for error recovery. - getNextToken(); - } -} - -static void HandleTopLevelExpression() { - // Evaluate a top-level expression into an anonymous function. - if (FunctionAST *F = ParseTopLevelExpr()) { - if (Function *LF = F->Codegen()) { - fprintf(stderr, "Read top-level expression:"); - LF->dump(); - - // JIT the function, returning a function pointer. - void *FPtr = TheExecutionEngine->getPointerToFunction(LF); |