diff options
author | Alon Zakai <alonzakai@gmail.com> | 2011-04-06 20:42:39 -0700 |
---|---|---|
committer | Alon Zakai <alonzakai@gmail.com> | 2011-04-06 20:42:39 -0700 |
commit | 8a6e2d67c156d9eaedf88b752be4d1cf4242e088 (patch) | |
tree | a44c76ec103edd1d7418ee146397869096c49ef1 /docs | |
parent | 407baeff8f823b6146e558a4eeed48a5a4ad784b (diff) |
paper 1.0
Diffstat (limited to 'docs')
-rw-r--r-- | docs/paper.pdf | bin | 200747 -> 203766 bytes | |||
-rw-r--r-- | docs/paper.tex | 67 |
2 files changed, 33 insertions, 34 deletions
diff --git a/docs/paper.pdf b/docs/paper.pdf Binary files differindex 8de5e182..2c0c1d23 100644 --- a/docs/paper.pdf +++ b/docs/paper.pdf diff --git a/docs/paper.tex b/docs/paper.tex index 9a350944..38341b0c 100644 --- a/docs/paper.tex +++ b/docs/paper.tex @@ -98,11 +98,11 @@ and AFAX \cite{afax}, which compiles F\# to JavaScript; see also \cite{ashkenas} for additional examples. While useful, such tools usually only allow a subset of the original language to be compiled. For example, multithreaded code (with shared memory) is -not possible on the web, so compiling code of that sort is subject to +not possible on the web, so compiling code of that sort is not directly possible. There are also often limitations of the conversion process, for example, Pyjamas compiles Python to JavaScript in a nearly 1-to-1 manner, and as a consequence the underlying semantics are those of JavaScript, -not Python so for example division of integers can yield unexpected results +not Python, so for example division of integers can yield unexpected results (it should yield an integer in Python 2.x, but in JavaScript and in Pyjamas a floating-point number can be generated). @@ -129,7 +129,7 @@ run on the web, using one of the following methods: Emscripten. It is then possible to run code in that runtime on the web. This is a useful approach if a language's runtime is written in a language for which an LLVM - frontend exists, but the language iself has no such frontend. For + frontend exists, but the language itself has no such frontend. For example, there is currently no frontend for Python, however it is possible to compile CPython -- the standard implementation of Python, written in C -- into JavaScript, and run Python code on that @@ -158,7 +158,7 @@ a substantial part of the original high-level structure of the original code, so that compiling LLVM assembly, while more difficult, can still yield good results. Another challenge in Emscripten is to achieve good performance. LLVM assembly -is an abstraction of how modern CPUs are programmmed for, and its basic +is an abstraction of how modern CPUs are programmed for, and its basic operations are not all directly possible in JavaScript. For example, if in LLVM we are to add two unsigned 8-bit numbers $x$ and $y$, with overflowing (e.g., 255 plus 1 should give 0), then there is no single operation in JavaScript which @@ -174,7 +174,7 @@ We conclude this introduction with a list of this paper's main contributions: which we detail its approach in compiling LLVM into JavaScript. \item We give details of Emscripten's `Relooper' algorithm, which generates high-level loop structures from low-level branching data, and prove - it's validity. + its validity. \end{itemize} In addition, the following are the main contributions of Emscripten itself, that to our knowledge were not previously possible: @@ -182,7 +182,7 @@ itself, that to our knowledge were not previously possible: \item It allows compiling a very large subset of C and C++ code into JavaScript, which can then be run on the web. \item By compiling their runtimes, it allows running languages such as Python - on the web. + on the web (with their normal semantics). \end{itemize} The remainder of this paper is structured as follows. In Section~\ref{sec:compapp} we @@ -270,7 +270,7 @@ though, the example assembly above can be seen to define a function main(), then allocate some values on the stack (alloca), then load and store various values (load and store). We do not have the high-level code structure as we had in C++ (with a loop), instead -we have labelled code fragments, called LLVM basic blocks, and code flow moves +we have labeled code fragments, called LLVM basic blocks, and code flow moves from one to another by branch (br) instructions. (Label 2 is the condition check in the loop; label 5 is the body, label 9 is the increment, and label 12 is the final part of the function, outside @@ -632,15 +632,15 @@ parameters used in running it. The source code to all the benchmarks can be found at \url{https://github.com/kripken/emscripten/tree/master/tests} (each in a separate file with its name, except for `primes', which is embedded inside runner.py in the function test\_primes). The second -column is the elapsed time when running the compiled code (generated using all Emscripten and LLVM +column is the elapsed time (in seconds) when running the compiled code (generated using all Emscripten and LLVM optimizations as well as the Closure Compiler) in the SpiderMonkey JavaScript engine (specifically, the JaegerMonkey branch, checked out April 2nd, 2011, running with \emph{-m -n -a}). -The third column is the elapsed time when running the original code compiled with \emph{gcc -O3}, +The third column is the elapsed time (in seconds) when running the original code compiled with \emph{gcc -O3}, using GCC 4.4.4. The last column is the ratio, that is, how much slower the JavaScript code is when compared to gcc. All the tests were run on a Lenovo N500 laptop with an Intel T6400 Core 2 Duo CPU clocked at 1.2GHz, running on Ubuntu 10.10. -Clearly the results greatly vary by the benchmark, with the generated JavaScripting from 3.06 to 32.66 times +Clearly the results greatly vary by the benchmark, with the generated JavaScript running from 3.06 to 32.66 times slower. It appears that code that does simple numerical operations -- like the primes test -- can run fairly fast, while code that has a lot of memory accesses, for example due to using structures -- like the raytrace test -- @@ -654,7 +654,8 @@ is a significant slowdown, it is still more than fast enough for many purposes, and the main point of course is that the code can run anywhere the web can be accessed. Further work on Emscripten is expected to improve the speed as well, as are improvements to LLVM, the Closure -Compiler, and JavaScript engines themselves. +Compiler, and JavaScript engines themselves; see further discussion +in the Summary. \section{Emscripten's Architecture} \label{sec:emarch} @@ -724,12 +725,13 @@ since JavaScript engines are tuned to run such code very quickly (for example, a tracing JIT as in SpiderMonkey will only trace normal loops). Returning to the LLVM assembly code on page~\pageref{code:examplellvm}, it -has has the following structure (where arrows denote potential paths of execution): +has the following structure (where arrows denote potential paths of execution): \includegraphics[width=80mm]{graph.png} In this simple example, it is fairly straightforward to see that a natural way to implement it using normal loop structures is +\newpage \begin{verbatim} ENTRY while (true) do @@ -745,7 +747,7 @@ for example the original C code relied heavily on \emph{goto} instructions. In practice, however, almost all real-world C and C++ code tends to be amenable to loop recreation. -We now begin to describe the Relooper algorithm. As mentioned before, it takes as input a `soup of labelled LLVM blocks' as described above, +We now begin to describe the Relooper algorithm. As mentioned before, it takes as input a `soup of labeled LLVM blocks' as described above, and generates a structured set of Emscripten code blocks, which are each a set of LLVM blocks with some logical structure. For simplicity we call LLVM blocks `labels' and Emscripten blocks `blocks' in the following. @@ -852,7 +854,7 @@ generated code, both subjectively and at least in some simple benchmarks. Additional details of the algorithm include \begin{itemize} -\item The technical mechanism by which execution flow is controlled involves +\item The technical mechanism by which execution flow is controlled in the generated code involves the \emph{\_\_label\_\_} variable, mentioned earlier. Whenever we enter a block with more than one entry, we set \emph{\_\_label\_\_} before we branch into it, and we check its value when we enter that block. So, for example, when we @@ -876,16 +878,13 @@ the \emph{break} and \emph{continue} commands are guaranteed, by the semantics of JavaScript, to get us to where we need to go -- they do not need any further work for them to work properly). -\end{itemize} - -Emscripten also does an additional pass after running the Relooper algorithm -which has been described. The Relooper is guaranteed to produce valid output (see below). -The second pass takes that valid output and optimizes it, by +\item Emscripten also does an additional pass after what has been +described thus far, which was the first pass. The first pass is guaranteed to produce valid output (see below), while +the second pass takes that valid output and optimizes it, by making minor changes such as removing \emph{continue} commands that occur at the very end of loops -(where they are not needed), etc. In other words, the first pass focuses on -generating high-level code flow structures that are correct, -while the second pass simplifies and optimizes that structure. +(where they are not needed), etc. +\end{itemize} We now turn to an analysis of the Relooper algorithm. It is straightforward to see that the output of the algorithm, assuming it completes successfully -- that is, that if finishes in finite time, and does @@ -918,10 +917,10 @@ The remaining issue is whether we can reach a situation where we \emph{cannot} successfully create a block, which is if we reach the final part of the relooper algorithm, but cannot create a Loop block there. For that to occur, we must not be able to return to any of the entries (or else we would create a Loop -block). But if that is so, we can, at minimum, create a Multiple -block with entries for all the current entries, since the entry -labels themselves cannot be reached by the others as we have just -assumed (when we ruled out creating a Loop block here), contradicting the assumption +block). Assume that indeed we cannot return to any of the entries. But if that is so, we can create a Multiple +block with Handled blocks that each include one of the entries (and possibly additional labels as well), since each entry +label cannot be reached from any other label by our assumption earlier, thus +contradicting that assumption and concluding the proof. (We have not, of course, proven that the shape of the blocks is optimal @@ -934,18 +933,18 @@ construction to more natural JavaScript code flow structures.) Emscripten has been run successfully on several real-world codebases. We present some examples here to give an idea of the various opportunities made possible -by Emscripten: +by Emscripten. \begin{itemize} \item \textbf{Python}: It is possible to run variants of Python on -the web in various ways, including pyjamas, IronPython on SilverLight and +the web in various ways, including Pyjamas, IronPython on SilverLight and Jython in Java. However, all of these are slightly nonstandard in the -the Python code they run, while the latter two also require plugins to be +Python code they run, while the latter two also require plugins to be installed. With Emscripten, on the other hand, it is possible to compile CPython itself -- the standard, reference implementation of Python -- and run that on the web, which allows running standard Python code. An online demo is available at \url{http://syntensity.com/static/python.html}. -Another example of a language runtime that Emscripten can convert to run -on the web is Lua, an online demo is available at \url{http://syntensity.com/static/lua.html}. +(Another example of a language runtime that Emscripten can convert to run +on the web is Lua; an online demo is available at \url{http://syntensity.com/static/lua.html}.) \item \textbf{Poppler and FreeType}: Poppler\footnote{\url{http://poppler.freedesktop.org/}} is an open source PDF rendering library. In combination with FreeType\footnote{\url{http://www.freetype.org/}}, an open source font engine, it can be used to render PDF files. By compiling it with Emscripten, @@ -1032,8 +1031,8 @@ static analysis can be time-consuming, there are practical ways for achieving similar results quickly, such as tracing and type inference, which would help on such code very significantly, and are already in use or being worked on in mainstream JavaScript engines (e.g., SpiderMonkey). As -a consequence, it may soon be possible to write code in languages such as -C and C++ and run them on web with near-native speed. +a consequence, it may soon be possible to run code written in languages such as +C and C++ on the web with near-native speed. %\appendix %\section{Appendix Title} @@ -1041,7 +1040,7 @@ C and C++ and run them on web with near-native speed. \acks -Thank you to the following people: David LaPalomento, Daniƫl Heres, Brian Crowder, Brian McKenna, dglead and tuba. +We thank the following people for their contributions to Emscripten: David LaPalomento, Daniel Heres, Brian Crowder, Brian McKenna, dglead and tuba. \bibliographystyle{abbrvnat} |