author | Derek Schuff <dschuff@chromium.org> | 2013-01-09 16:55:43 -0800 |
---|---|---|
committer | Derek Schuff <dschuff@chromium.org> | 2013-01-11 13:47:37 -0800 |
commit | b770d0e0636a4b5ad61b1ca661caee67576c05fc (patch) | |
tree | c486ce032d41f97313c50629bd5b879f53e6ccbf /docs/tutorial/OCamlLangImpl8.html | |
parent | b835840cf112a6178506d834b58aa625f59a8994 (diff) | |
parent | 1ad9253c9d34ccbce3e7e4ea5d87c266cbf93410 (diff) |
Merge commit '1ad9253c9d34ccbce3e7e4ea5d87c266cbf93410'
deplib features commented out due to removal upstream;
will add back as a localmod
Conflicts:
include/llvm/ADT/Triple.h
include/llvm/MC/MCAssembler.h
include/llvm/Target/TargetFrameLowering.h
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
lib/CodeGen/AsmPrinter/DwarfDebug.h
lib/CodeGen/BranchFolding.cpp
lib/LLVMBuild.txt
lib/Linker/LinkArchives.cpp
lib/MC/MCAssembler.cpp
lib/MC/MCELFStreamer.cpp
lib/Makefile
lib/Target/ARM/ARMExpandPseudoInsts.cpp
lib/Target/ARM/ARMFrameLowering.cpp
lib/Target/ARM/ARMISelLowering.cpp
lib/Target/ARM/ARMSubtarget.h
lib/Target/ARM/ARMTargetObjectFile.cpp
lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp
lib/Target/Mips/MipsInstrFPU.td
lib/Target/Mips/MipsInstrInfo.td
lib/Target/X86/X86CodeEmitter.cpp
lib/Target/X86/X86Subtarget.h
lib/VMCore/Module.cpp
test/MC/MachO/ARM/nop-armv4-padding.s
tools/Makefile
tools/llc/llc.cpp
tools/lto/LTOModule.cpp
tools/lto/lto.cpp
Diffstat (limited to 'docs/tutorial/OCamlLangImpl8.html')
-rw-r--r-- | docs/tutorial/OCamlLangImpl8.html | 359 |
1 files changed, 0 insertions, 359 deletions
diff --git a/docs/tutorial/OCamlLangImpl8.html b/docs/tutorial/OCamlLangImpl8.html deleted file mode 100644 index 7c1a500a21..0000000000 --- a/docs/tutorial/OCamlLangImpl8.html +++ /dev/null @@ -1,359 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> - -<html> -<head> - <title>Kaleidoscope: Conclusion and other useful LLVM tidbits</title> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> - <meta name="author" content="Chris Lattner"> - <link rel="stylesheet" href="../_static/llvm.css" type="text/css"> -</head> - -<body> - -<h1>Kaleidoscope: Conclusion and other useful LLVM tidbits</h1> - -<ul> -<li><a href="index.html">Up to Tutorial Index</a></li> -<li>Chapter 8 - <ol> - <li><a href="#conclusion">Tutorial Conclusion</a></li> - <li><a href="#llvmirproperties">Properties of LLVM IR</a> - <ul> - <li><a href="#targetindep">Target Independence</a></li> - <li><a href="#safety">Safety Guarantees</a></li> - <li><a href="#langspecific">Language-Specific Optimizations</a></li> - </ul> - </li> - <li><a href="#tipsandtricks">Tips and Tricks</a> - <ul> - <li><a href="#offsetofsizeof">Implementing portable - offsetof/sizeof</a></li> - <li><a href="#gcstack">Garbage Collected Stack Frames</a></li> - </ul> - </li> - </ol> -</li> -</ul> - - -<div class="doc_author"> - <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> -</div> - -<!-- *********************************************************************** --> -<h2><a name="conclusion">Tutorial Conclusion</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>Welcome to the final chapter of the "<a href="index.html">Implementing a -language with LLVM</a>" tutorial. In the course of this tutorial, we have grown -our little Kaleidoscope language from being a useless toy, to being a -semi-interesting (but probably still useless) toy. 
:)</p> - -<p>It is interesting to see how far we've come, and how little code it has -taken. We built the entire lexer, parser, AST, code generator, and an -interactive run-loop (with a JIT!) by-hand in under 700 lines of -(non-comment/non-blank) code.</p> - -<p>Our little language supports a couple of interesting features: it supports -user defined binary and unary operators, it uses JIT compilation for immediate -evaluation, and it supports a few control flow constructs with SSA construction. -</p> - -<p>Part of the idea of this tutorial was to show you how easy and fun it can be -to define, build, and play with languages. Building a compiler need not be a -scary or mystical process! Now that you've seen some of the basics, I strongly -encourage you to take the code and hack on it. For example, try adding:</p> - -<ul> -<li><b>global variables</b> - While global variables have questional value in -modern software engineering, they are often useful when putting together quick -little hacks like the Kaleidoscope compiler itself. Fortunately, our current -setup makes it very easy to add global variables: just have value lookup check -to see if an unresolved variable is in the global variable symbol table before -rejecting it. To create a new global variable, make an instance of the LLVM -<tt>GlobalVariable</tt> class.</li> - -<li><b>typed variables</b> - Kaleidoscope currently only supports variables of -type double. This gives the language a very nice elegance, because only -supporting one type means that you never have to specify types. Different -languages have different ways of handling this. The easiest way is to require -the user to specify types for every variable definition, and record the type -of the variable in the symbol table along with its Value*.</li> - -<li><b>arrays, structs, vectors, etc</b> - Once you add types, you can start -extending the type system in all sorts of interesting ways. 
Simple arrays are -very easy and are quite useful for many different applications. Adding them is -mostly an exercise in learning how the LLVM <a -href="../LangRef.html#i_getelementptr">getelementptr</a> instruction works: it -is so nifty/unconventional, it <a -href="../GetElementPtr.html">has its own FAQ</a>! If you add support -for recursive types (e.g. linked lists), make sure to read the <a -href="../ProgrammersManual.html#TypeResolve">section in the LLVM -Programmer's Manual</a> that describes how to construct them.</li> - -<li><b>standard runtime</b> - Our current language allows the user to access -arbitrary external functions, and we use it for things like "printd" and -"putchard". As you extend the language to add higher-level constructs, often -these constructs make the most sense if they are lowered to calls into a -language-supplied runtime. For example, if you add hash tables to the language, -it would probably make sense to add the routines to a runtime, instead of -inlining them all the way.</li> - -<li><b>memory management</b> - Currently we can only access the stack in -Kaleidoscope. It would also be useful to be able to allocate heap memory, -either with calls to the standard libc malloc/free interface or with a garbage -collector. If you would like to use garbage collection, note that LLVM fully -supports <a href="../GarbageCollection.html">Accurate Garbage Collection</a> -including algorithms that move objects and need to scan/update the stack.</li> - -<li><b>debugger support</b> - LLVM supports generation of <a -href="../SourceLevelDebugging.html">DWARF Debug info</a> which is understood by -common debuggers like GDB. Adding support for debug info is fairly -straightforward. 
The best way to understand it is to compile some C/C++ code -with "<tt>llvm-gcc -g -O0</tt>" and taking a look at what it produces.</li> - -<li><b>exception handling support</b> - LLVM supports generation of <a -href="../ExceptionHandling.html">zero cost exceptions</a> which interoperate -with code compiled in other languages. You could also generate code by -implicitly making every function return an error value and checking it. You -could also make explicit use of setjmp/longjmp. There are many different ways -to go here.</li> - -<li><b>object orientation, generics, database access, complex numbers, -geometric programming, ...</b> - Really, there is -no end of crazy features that you can add to the language.</li> - -<li><b>unusual domains</b> - We've been talking about applying LLVM to a domain -that many people are interested in: building a compiler for a specific language. -However, there are many other domains that can use compiler technology that are -not typically considered. For example, LLVM has been used to implement OpenGL -graphics acceleration, translate C++ code to ActionScript, and many other -cute and clever things. Maybe you will be the first to JIT compile a regular -expression interpreter into native code with LLVM?</li> - -</ul> - -<p> -Have fun - try doing something crazy and unusual. Building a language like -everyone else always has, is much less fun than trying something a little crazy -or off the wall and seeing how it turns out. If you get stuck or want to talk -about it, feel free to email the <a -href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">llvmdev mailing -list</a>: it has lots of people who are interested in languages and are often -willing to help out. -</p> - -<p>Before we end this tutorial, I want to talk about some "tips and tricks" for generating -LLVM IR. 
These are some of the more subtle things that may not be obvious, but -are very useful if you want to take advantage of LLVM's capabilities.</p> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="llvmirproperties">Properties of the LLVM IR</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>We have a couple common questions about code in the LLVM IR form - lets just -get these out of the way right now, shall we?</p> - -<!-- ======================================================================= --> -<h4><a name="targetindep">Target Independence</a></h4> -<!-- ======================================================================= --> - -<div> - -<p>Kaleidoscope is an example of a "portable language": any program written in -Kaleidoscope will work the same way on any target that it runs on. Many other -languages have this property, e.g. lisp, java, haskell, javascript, python, etc -(note that while these languages are portable, not all their libraries are).</p> - -<p>One nice aspect of LLVM is that it is often capable of preserving target -independence in the IR: you can take the LLVM IR for a Kaleidoscope-compiled -program and run it on any target that LLVM supports, even emitting C code and -compiling that on targets that LLVM doesn't support natively. You can trivially -tell that the Kaleidoscope compiler generates target-independent code because it -never queries for any target-specific information when generating code.</p> - -<p>The fact that LLVM provides a compact, target-independent, representation for -code gets a lot of people excited. Unfortunately, these people are usually -thinking about C or a language from the C family when they are asking questions -about language portability. 
I say "unfortunately", because there is really no -way to make (fully general) C code portable, other than shipping the source code -around (and of course, C source code is not actually portable in general -either - ever port a really old application from 32- to 64-bits?).</p> - -<p>The problem with C (again, in its full generality) is that it is heavily -laden with target specific assumptions. As one simple example, the preprocessor -often destructively removes target-independence from the code when it processes -the input text:</p> - -<div class="doc_code"> -<pre> -#ifdef __i386__ - int X = 1; -#else - int X = 42; -#endif -</pre> -</div> - -<p>While it is possible to engineer more and more complex solutions to problems -like this, it cannot be solved in full generality in a way that is better than shipping -the actual source code.</p> - -<p>That said, there are interesting subsets of C that can be made portable. If -you are willing to fix primitive types to a fixed size (say int = 32-bits, -and long = 64-bits), don't care about ABI compatibility with existing binaries, -and are willing to give up some other minor features, you can have portable -code. This can make sense for specialized domains such as an -in-kernel language.</p> - -</div> - -<!-- ======================================================================= --> -<h4><a name="safety">Safety Guarantees</a></h4> -<!-- ======================================================================= --> - -<div> - -<p>Many of the languages above are also "safe" languages: it is impossible for -a program written in Java to corrupt its address space and crash the process -(assuming the JVM has no bugs). -Safety is an interesting property that requires a combination of language -design, runtime support, and often operating system support.</p> - -<p>It is certainly possible to implement a safe language in LLVM, but LLVM IR -does not itself guarantee safety. 
The LLVM IR allows unsafe pointer casts, -use after free bugs, buffer over-runs, and a variety of other problems. Safety -needs to be implemented as a layer on top of LLVM and, conveniently, several -groups have investigated this. Ask on the <a -href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">llvmdev mailing -list</a> if you are interested in more details.</p> - -</div> - -<!-- ======================================================================= --> -<h4><a name="langspecific">Language-Specific Optimizations</a></h4> -<!-- ======================================================================= --> - -<div> - -<p>One thing about LLVM that turns off many people is that it does not solve all -the world's problems in one system (sorry 'world hunger', someone else will have -to solve you some other day). One specific complaint is that people perceive -LLVM as being incapable of performing high-level language-specific optimization: -LLVM "loses too much information".</p> - -<p>Unfortunately, this is really not the place to give you a full and unified -version of "Chris Lattner's theory of compiler design". Instead, I'll make a -few observations:</p> - -<p>First, you're right that LLVM does lose information. For example, as of this -writing, there is no way to distinguish in the LLVM IR whether an SSA-value came -from a C "int" or a C "long" on an ILP32 machine (other than debug info). Both -get compiled down to an 'i32' value and the information about what it came from -is lost. The more general issue here, is that the LLVM type system uses -"structural equivalence" instead of "name equivalence". Another place this -surprises people is if you have two types in a high-level language that have the -same structure (e.g. 
two different structs that have a single int field): these -types will compile down into a single LLVM type and it will be impossible to -tell what it came from.</p> - -<p>Second, while LLVM does lose information, LLVM is not a fixed target: we -continue to enhance and improve it in many different ways. In addition to -adding new features (LLVM did not always support exceptions or debug info), we -also extend the IR to capture important information for optimization (e.g. -whether an argument is sign or zero extended, information about pointers -aliasing, etc). Many of the enhancements are user-driven: people want LLVM to -include some specific feature, so they go ahead and extend it.</p> - -<p>Third, it is <em>possible and easy</em> to add language-specific -optimizations, and you have a number of choices in how to do it. As one trivial -example, it is easy to add language-specific optimization passes that -"know" things about code compiled for a language. In the case of the C family, -there is an optimization pass that "knows" about the standard C library -functions. If you call "exit(0)" in main(), it knows that it is safe to -optimize that into "return 0;" because C specifies what the 'exit' -function does.</p> - -<p>In addition to simple library knowledge, it is possible to embed a variety of -other language-specific information into the LLVM IR. If you have a specific -need and run into a wall, please bring the topic up on the llvmdev list. At the -very worst, you can always treat LLVM as if it were a "dumb code generator" and -implement the high-level optimizations you desire in your front-end, on the -language-specific AST. 
-</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<h2><a name="tipsandtricks">Tips and Tricks</a></h2> -<!-- *********************************************************************** --> - -<div> - -<p>There is a variety of useful tips and tricks that you come to know after -working on/with LLVM that aren't obvious at first glance. Instead of letting -everyone rediscover them, this section talks about some of these issues.</p> - -<!-- ======================================================================= --> -<h4><a name="offsetofsizeof">Implementing portable offsetof/sizeof</a></h4> -<!-- ======================================================================= --> - -<div> - -<p>One interesting thing that comes up, if you are trying to keep the code -generated by your compiler "target independent", is that you often need to know -the size of some LLVM type or the offset of some field in an llvm structure. -For example, you might need to pass the size of a type into a function that -allocates memory.</p> - -<p>Unfortunately, this can vary widely across targets: for example the width of -a pointer is trivially target-specific. However, there is a <a -href="http://nondot.org/sabre/LLVMNotes/SizeOf-OffsetOf-VariableSizedStructs.txt">clever -way to use the getelementptr instruction</a> that allows you to compute this -in a portable way.</p> - -</div> - -<!-- ======================================================================= --> -<h4><a name="gcstack">Garbage Collected Stack Frames</a></h4> -<!-- ======================================================================= --> - -<div> - -<p>Some languages want to explicitly manage their stack frames, often so that -they are garbage collected or to allow easy implementation of closures. 
There -are often better ways to implement these features than explicit stack frames, -but <a -href="http://nondot.org/sabre/LLVMNotes/ExplicitlyManagedStackFrames.txt">LLVM -does support them,</a> if you want. It requires your front-end to convert the -code into <a -href="http://en.wikipedia.org/wiki/Continuation-passing_style">Continuation -Passing Style</a> and the use of tail calls (which LLVM also supports).</p> - -</div> - -</div> - -<!-- *********************************************************************** --> -<hr> -<address> - <a href="http://jigsaw.w3.org/css-validator/check/referer"><img - src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a> - <a href="http://validator.w3.org/check/referer"><img - src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a> - - <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> - <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> - Last modified: $Date$ -</address> -</body> -</html> |