<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Kaleidoscope: Implementing code generation to LLVM IR</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="author" content="Chris Lattner">
<meta name="author" content="Erick Tryzelaar">
<link rel="stylesheet" href="../llvm.css" type="text/css">
</head>
<body>
<div class="doc_title">Kaleidoscope: Code generation to LLVM IR</div>
<ul>
<li><a href="index.html">Up to Tutorial Index</a></li>
<li>Chapter 3
<ol>
<li><a href="#intro">Chapter 3 Introduction</a></li>
<li><a href="#basics">Code Generation Setup</a></li>
<li><a href="#exprs">Expression Code Generation</a></li>
<li><a href="#funcs">Function Code Generation</a></li>
<li><a href="#driver">Driver Changes and Closing Thoughts</a></li>
<li><a href="#code">Full Code Listing</a></li>
</ol>
</li>
<li><a href="OCamlLangImpl4.html">Chapter 4</a>: Adding JIT and Optimizer
Support</li>
</ul>
<div class="doc_author">
<p>
Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a>
and <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a>
</p>
</div>
<!-- *********************************************************************** -->
<div class="doc_section"><a name="intro">Chapter 3 Introduction</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>Welcome to Chapter 3 of the "<a href="index.html">Implementing a language
with LLVM</a>" tutorial. This chapter shows you how to transform the <a
href="OCamlLangImpl2.html">Abstract Syntax Tree</a>, built in Chapter 2, into
LLVM IR. This will teach you a little bit about how LLVM does things, as well
as demonstrate how easy it is to use. It's much more work to build a lexer and
parser than it is to generate LLVM IR code. :)
</p>
<p><b>Please note</b>: the code in this and later chapters requires LLVM 2.3 or
LLVM SVN. It will not work with LLVM 2.2 or earlier.</p>
</div>
<!-- *********************************************************************** -->
<div class="doc_section"><a name="basics">Code Generation Setup</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>
To generate LLVM IR, we need a little setup first. We start by defining a code
generation (codegen) function that pattern-matches on each kind of AST node:</p>
<div class="doc_code">
<pre>
let rec codegen_expr = function
| Ast.Number n -> ...
| Ast.Variable name -> ...
</pre>
</div>
<p>The <tt>Codegen.codegen_expr</tt> function emits IR for an AST node along with
everything that node depends on, and it returns an LLVM value (an
<tt>llvalue</tt> in the OCaml bindings). An <tt>llvalue</tt> represents a "<a
href="http://en.wikipedia.org/wiki/Static_single_assignment_form">Static Single
Assignment (SSA)</a> register" or "SSA value" in LLVM. The most distinctive aspect
of SSA values is that their value is computed as the related instruction
executes, and it does not get a new value until (and if) the instruction
re-executes. In other words, there is no way to "change" an SSA value. For
more information, please read up on <a
href="http://en.wikipedia.org/wiki/Static_single_assignment_form">Static Single
Assignment</a>; the concepts are really quite natural once you grok them.</p>
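<p>To make the "defined once, never changed" property concrete, here is a
minimal sketch that walks a toy expression tree and prints one line per
operation, giving every result a fresh name. It is an illustration only: the
<tt>expr</tt> type is defined locally and merely mirrors the shape of the
Chapter 2 AST, <tt>fresh</tt> and <tt>emit</tt> are hypothetical helpers, and
the output is plain text rather than real LLVM IR.</p>
<div class="doc_code">
<pre>
(* Toy AST, defined locally so the sketch is self-contained; it only
   mirrors the shape of the Chapter 2 Ast module. *)
type expr =
  | Number of float
  | Variable of string
  | Binary of char * expr * expr

(* Hypothetical helper: hands out a new register name on every call,
   so no name is ever defined twice. *)
let counter = ref 0
let fresh () =
  let name = Printf.sprintf "%%tmp%d" !counter in
  incr counter;
  name

(* Returns the name that holds the result of the expression and appends
   one line per operation to [out]. Operands are emitted before the
   operation that uses them. *)
let rec emit out = function
  | Number n -> Printf.sprintf "%g" n
  | Variable name -> "%" ^ name
  | Binary (op, lhs, rhs) ->
      let l = emit out lhs in
      let r = emit out rhs in
      let dst = fresh () in
      Buffer.add_string out (Printf.sprintf "%s = %s %c %s\n" dst l op r);
      dst
</pre>
</div>
<p>Calling <tt>emit</tt> on the tree for <tt>x + 2 * y</tt> appends two lines
to the buffer, one for the multiplication and one for the addition, and each
line defines a name that no other line assigns to.</p>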
<p>The
second thing we want is an "Error" exception like we used for the parser, which
will be used to report errors found during code generation (for example, use of
an undeclared parameter):</p>
<div class="doc_code">
<pre>
exception Error of string

let context = global_context ()
let the_module = create_module context "my cool jit"
let builder = builder context
let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
let double_type = double_type context
</pre>
</div>
<p>These top-level values are used during code generation.
<tt>Codegen.the_module</tt> is the LLVM construct that contains all of the
functions and global variables in a chunk of code. In many ways, it is the
top-level structure that the LLVM IR uses to contain code.</p>
<p>The <tt>Codegen.builder</tt> object is a helper object that makes it easy to
generate LLVM instructions. Instances of the <a
href="http://llvm.org/doxygen/IRBuilder_8h-source.html"><tt>IRBuilder</tt></a>
class keep track of the current place to insert instructions and have methods to
create new instructions.</p>
<p>The <tt>Codegen.named_values</tt> map keeps track of which values are defined
in the current scope and what their LLVM representation is. (In other words, it
is a symbol table for the code). In this form of Kaleidoscope, the only things
that can be referenced are function parameters. As such, function parameters
will be in this map when generating code for their function body.</p>
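<p>The symbol-table behaviour described above can be sketched with the standard
<tt>Hashtbl</tt> module alone. This is a sketch under assumptions, not the
tutorial's implementation: <tt>value</tt> is a string stand-in for
<tt>llvalue</tt> so that the code runs without the LLVM bindings, and
<tt>enter_function</tt> and <tt>lookup</tt> are hypothetical helper names.</p>
<div class="doc_code">
<pre>
exception Error of string

(* Stand-in for Llvm.llvalue so the sketch runs without the LLVM
   bindings; here a "value" is just a description string. *)
type value = string

let named_values : (string, value) Hashtbl.t = Hashtbl.create 10

(* Hypothetical helper: clear the table and register each parameter
   before generating code for a function body. *)
let enter_function params =
  Hashtbl.clear named_values;
  List.iter (fun p -> Hashtbl.add named_values p ("param " ^ p)) params

(* Hypothetical helper: resolve a variable reference, or report an
   error if the name is not in the current scope. *)
let lookup name =
  try Hashtbl.find named_values name
  with Not_found -> raise (Error "unknown variable name")
</pre>
</div>
<p>After <tt>enter_function ["x"; "y"]</tt>, <tt>lookup "x"</tt> succeeds while
<tt>lookup "z"</tt> raises <tt>Error</tt>. Clearing the table on entry keeps
one function's parameters from leaking into the next.</p>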
<p>
With these basics in place, we can start talking about how to generate code for
each expression. Note that this assumes that the <tt>Codegen.builder</tt> has
been set up to generate code <em>into</em> something. For now, we'll assume
that this has already been done, and we'll just use it to emit code.</p>
</div>
<!-- *********************************************************************** -->
<div class="doc_section"><a name="exprs">Expression Code Generation</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>Generating LLVM code for expres