<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Kaleidoscope: Implementing a Parser and AST</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="author" content="Chris Lattner">
<link rel="stylesheet" href="../llvm.css" type="text/css">
</head>
<body>
<div class="doc_title">Kaleidoscope: Implementing a Parser and AST</div>
<ul>
<li>Chapter 2
<ol>
<li><a href="#intro">Chapter 2 Introduction</a></li>
<li><a href="#ast">The Abstract Syntax Tree (AST)</a></li>
<li><a href="#parserbasics">Parser Basics</a></li>
<li><a href="#parserprimexprs">Basic Expression Parsing</a></li>
<li><a href="#parserbinops">Binary Expression Parsing</a></li>
<li><a href="#parsertop">Parsing the Rest</a></li>
<li><a href="#driver">The Driver</a></li>
<li><a href="#conclusions">Conclusions</a></li>
<li><a href="#code">Full Code Listing</a></li>
</ol>
</li>
</ul>
<div class="doc_author">
<p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
</div>
<!-- *********************************************************************** -->
<div class="doc_section"><a name="intro">Chapter 2 Introduction</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>Welcome to Chapter 2 of the "<a href="index.html">Implementing a language
with LLVM</a>" tutorial. This chapter shows you how to use the <a
href="LangImpl1.html">Lexer built in Chapter 1</a> to build a full <a
href="http://en.wikipedia.org/wiki/Parsing">parser</a> for
our Kaleidoscope language and build an <a
href="http://en.wikipedia.org/wiki/Abstract_syntax_tree">Abstract Syntax
Tree</a> (AST).</p>
<p>The parser we will build uses a combination of <a
href="http://en.wikipedia.org/wiki/Recursive_descent_parser">Recursive Descent
Parsing</a> and <a href=
"http://en.wikipedia.org/wiki/Operator-precedence_parser">Operator-Precedence
Parsing</a> to parse the Kaleidoscope language (the later for binary expression
and the former for everything else). Before we get to parsing though, lets talk
about the output of the parser: the Abstract Syntax Tree.</p>
</div>
<!-- *********************************************************************** -->
<div class="doc_section"><a name="ast">The Abstract Syntax Tree (AST)</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>The AST for a program captures its behavior in a way that it is easy for
later stages of the compiler (e.g. code generation) to interpret. We basically
want one object for each construct in the language, and the AST should closely
model the language. In Kaleidoscope, we have expressions, a prototype, and a
function object. We'll start with expressions first:</p>
<div class="doc_code">
<pre>
/// ExprAST - Base class for all expression nodes.
class ExprAST {
public:
virtual ~ExprAST() {}
};
/// NumberExprAST - Expression class for numeric literals like "1.0".
class NumberExprAST : public ExprAST {
double Val;
public:
explicit NumberExprAST(double val) : Val(val) {}
};
</pre>
</div>
<p>The code above shows the definition of the base ExprAST class and one
subclass which we use for numeric literals. The important thing about this is
that the NumberExprAST class captures the numeric value of the literal in the
class, so that later phases of the compiler can know what it is.</p>
<p>Right now we only create the AST, so there are no useful accessor methods on
them. It would be very easy to add a virtual method to pretty print the code,
for example. Here are the other expression AST node definitions that we'll use
in the basic form of the Kaleidoscope language.
</p>
<div class="doc_code">
<pre>
/// VariableExprAST - Expression class for referencing a variable, like "a".
class VariableExprAST