LLVM Programmer's Manual

- -

This document is meant to highlight some of the important classes and -interfaces available in the LLVM source-base. This manual is not -intended to explain what LLVM is, how it works, and what LLVM code looks -like. It assumes that you know the basics of LLVM and are interested -in writing transformations or otherwise analyzing or manipulating the -code.

- -

This document should get you oriented so that you can find your -way in the continuously growing source code that makes up the LLVM -infrastructure. Note that this manual is not intended to serve as a -replacement for reading the source code, so if you think there should be -a method in one of these classes to do something, but it's not listed, -check the source. Links to the doxygen sources -are provided to make this as easy as possible.

- -

The first section of this document describes general information that is -useful to know when working in the LLVM infrastructure, and the second describes -the Core LLVM classes. In the future this manual will be extended with -information describing how to use extension libraries, such as dominator -information, CFG traversal routines, and useful utilities like the InstVisitor template.

- -

This section contains general information that is useful if you are working -in the LLVM source-base, but that isn't specific to any particular API.

- -

LLVM makes heavy use of the C++ Standard Template Library (STL), -perhaps much more than you are used to, or have seen before. Because of -this, you might want to do a little background reading in the -techniques used and capabilities of the library. There are many good -pages that discuss the STL, and several books on the subject that you -can get, so it will not be discussed in this document.

- -

Here are some useful links:

- -

Dinkumware C++ Library -reference - an excellent reference for the STL and other parts of the -standard C++ library.
C++ In a Nutshell - This is an -O'Reilly book in the making. It has a decent Standard Library -Reference that rivals Dinkumware's, and is unfortunately no longer free since the -book has been published.
C++ Frequently Asked -Questions
SGI's STL Programmer's Guide - -Contains a useful Introduction to the -STL.
Bjarne Stroustrup's C++ -Page
-Bruce Eckel's Thinking in C++, 2nd ed. Volume 2 Revision 4.0 (even better, get -the book).

- -

You are also encouraged to take a look at the LLVM Coding Standards guide which focuses on how -to write maintainable code more than where to put your curly braces.

- -

Here we highlight some LLVM APIs that are generally useful and good to -know about when writing transformations.

- -

The LLVM source-base makes extensive use of a custom form of RTTI. -These templates have many similarities to the C++ dynamic_cast<> -operator, but they don't have some drawbacks (primarily stemming from -the fact that dynamic_cast<> only works on classes that -have a v-table). Because they are used so often, you must know what they -do and how they work. All of these templates are defined in the llvm/Support/Casting.h -file (note that you very rarely have to include this file directly).

- -

isa<>:

The isa<> operator works exactly like the Java - "instanceof" operator. It returns true or false depending on whether - a reference or pointer points to an instance of the specified class. This can - be very useful for constraint checking of various sorts (example below).

cast<>:

The cast<> operator is a "checked cast" operation. It - converts a pointer or reference from a base class to a derived class, causing - an assertion failure if it is not really an instance of the right type. This - should be used in cases where you have some information that makes you believe - that something is of the right type. An example of the isa<> - and cast<> template is:

- -

-static bool isLoopInvariant(const Value *V, const Loop *L) {
-  if (isa<Constant>(V) || isa<Argument>(V) || isa<GlobalValue>(V))
-    return true;
-
-  // Otherwise, it must be an instruction...
-  return !L->contains(cast<Instruction>(V)->getParent());
-}
-

- -

Note that you should not use an isa<> test followed - by a cast<>, for that use the dyn_cast<> - operator.

- -

dyn_cast<>:

The dyn_cast<> operator is a "checking cast" operation. - It checks to see if the operand is of the specified type, and if so, returns a - pointer to it (this operator does not work with references). If the operand is - not of the correct type, a null pointer is returned. Thus, this works very - much like the dynamic_cast<> operator in C++, and should be - used in the same circumstances. Typically, the dyn_cast<> - operator is used in an if statement or some other flow control - statement like this:

- -

-if (AllocationInst *AI = dyn_cast<AllocationInst>(Val)) {
-  // ...
-}
-

- -

This form of the if statement effectively combines together a call - to isa<> and a call to cast<> into one - statement, which is very convenient.

- -

Note that the dyn_cast<> operator, like C++'s - dynamic_cast<> or Java's instanceof operator, can be - abused. In particular, you should not use big chained if/then/else - blocks to check for lots of different variants of classes. If you find - yourself wanting to do this, it is much cleaner and more efficient to use the - InstVisitor class to dispatch over the instruction type directly.

- -

cast_or_null<>:

The cast_or_null<> operator works just like the - cast<> operator, except that it allows for a null pointer as an - argument (which it then propagates). This can sometimes be useful, allowing - you to combine several null checks into one.

dyn_cast_or_null<>:

The dyn_cast_or_null<> operator works just like the - dyn_cast<> operator, except that it allows for a null pointer - as an argument (which it then propagates). This can sometimes be useful, - allowing you to combine several null checks into one.

- -

These five templates can be used with any classes, whether they have a -v-table or not. To add support for these templates, you simply need to add -classof static methods to the class you are interested casting -to. Describing this is currently outside the scope of this document, but there -are lots of examples in the LLVM source base.

- -

Although LLVM generally does not do much string manipulation, we do have -several important APIs which take strings. Two important examples are the -Value class -- which has names for instructions, functions, etc. -- and the -StringMap class which is used extensively in LLVM and Clang.

- -

These are generic classes, and they need to be able to accept strings which -may have embedded null characters. Therefore, they cannot simply take -a const char *, and taking a const std::string& requires -clients to perform a heap allocation which is usually unnecessary. Instead, -many LLVM APIs use a const StringRef& or a const -Twine& for passing strings efficiently.

- -

The StringRef data type represents a reference to a constant string -(a character array and a length) and supports the common operations available -on std:string, but does not require heap allocation.

- -

It can be implicitly constructed using a C style null-terminated string, -an std::string, or explicitly with a character pointer and length. -For example, the StringRef find function is declared as:

- -

- iterator find(const StringRef &Key); -

- -

and clients can call it using any one of:

- -

-  Map.find("foo");                 // Lookup "foo"
-  Map.find(std::string("bar"));    // Lookup "bar"
-  Map.find(StringRef("\0baz", 4)); // Lookup "\0baz"
-

- -

Similarly, APIs which need to return a string may return a StringRef -instance, which can be used directly or converted to an std::string -using the str member function. See -"llvm/ADT/StringRef.h" -for more information.

- -

You should rarely use the StringRef class directly, because it contains -pointers to external memory it is not generally safe to store an instance of the -class (unless you know that the external storage will not be freed).

- -

The Twine class is an efficient way for APIs to accept concatenated -strings. For example, a common LLVM paradigm is to name one instruction based on -the name of another instruction with a suffix, for example:

- -

-    New = CmpInst::Create(..., SO->getName() + ".cmp");
-

- -

The Twine class is effectively a -lightweight rope -which points to temporary (stack allocated) objects. Twines can be implicitly -constructed as the result of the plus operator applied to strings (i.e., a C -strings, an std::string, or a StringRef). The twine delays the -actual concatenation of strings until it is actually required, at which point -it can be efficiently rendered directly into a character array. This avoids -unnecessary heap allocation involved in constructing the temporary results of -string concatenation. See -"llvm/ADT/Twine.h" -for more information.

- -

As with a StringRef, Twine objects point to external memory -and should almost never be stored or mentioned directly. They are intended -solely for use when defining a function which should be able to efficiently -accept concatenated strings.

- -

Often when working on your pass you will put a bunch of debugging printouts -and other code into your pass. After you get it working, you want to remove -it, but you may need it again in the future (to work out new bugs that you run -across).

- -

Naturally, because of this, you don't want to delete the debug printouts, -but you don't want them to always be noisy. A standard compromise is to comment -them out, allowing you to enable them if you need them in the future.

- -

The "llvm/Support/Debug.h" -file provides a macro named DEBUG() that is a much nicer solution to -this problem. Basically, you can put arbitrary code into the argument of the -DEBUG macro, and it is only executed if 'opt' (or any other -tool) is run with the '-debug' command line argument:

- -

-DEBUG(errs() << "I am here!\n");
-

- -

Then you can run your pass like this:

- -

-$ opt < a.bc > /dev/null -mypass
-<no output>
-$ opt < a.bc > /dev/null -mypass -debug
-I am here!
-

- -

Using the DEBUG() macro instead of a home-brewed solution allows you -to not have to create "yet another" command line option for the debug output for -your pass. Note that DEBUG() macros are disabled for optimized builds, -so they do not cause a performance impact at all (for the same reason, they -should also not contain side-effects!).

- -

One additional nice thing about the DEBUG() macro is that you can -enable or disable it directly in gdb. Just use "set DebugFlag=0" or -"set DebugFlag=1" from the gdb if the program is running. If the -program hasn't been started yet, you can always just run it with --debug.

- -

Sometimes you may find yourself in a situation where enabling -debug -just turns on too much information (such as when working on the code -generator). If you want to enable debug information with more fine-grained -control, you define the DEBUG_TYPE macro and the -debug only -option as follows:

- -

-#undef  DEBUG_TYPE
-DEBUG(errs() << "No debug type\n");
-#define DEBUG_TYPE "foo"
-DEBUG(errs() << "'foo' debug type\n");
-#undef  DEBUG_TYPE
-#define DEBUG_TYPE "bar"
-DEBUG(errs() << "'bar' debug type\n"));
-#undef  DEBUG_TYPE
-#define DEBUG_TYPE ""
-DEBUG(errs() << "No debug type (2)\n");
-

- -

Then you can run your pass like this:

- -

-$ opt < a.bc > /dev/null -mypass
-<no output>
-$ opt < a.bc > /dev/null -mypass -debug
-No debug type
-'foo' debug type
-'bar' debug type
-No debug type (2)
-$ opt < a.bc > /dev/null -mypass -debug-only=foo
-'foo' debug type
-$ opt < a.bc > /dev/null -mypass -debug-only=bar
-'bar' debug type
-

- -

Of course, in practice, you should only set DEBUG_TYPE at the top of -a file, to specify the debug type for the entire module (if you do this before -you #include "llvm/Support/Debug.h", you don't have to insert the ugly -#undef's). Also, you should use names more meaningful than "foo" and -"bar", because there is no system in place to ensure that names do not -conflict. If two different modules use the same string, they will all be turned -on when the name is specified. This allows, for example, all debug information -for instruction scheduling to be enabled with -debug-type=InstrSched, -even if the source lives in multiple files.

- -

The DEBUG_WITH_TYPE macro is also available for situations where you -would like to set DEBUG_TYPE, but only for one specific DEBUG -statement. It takes an additional first parameter, which is the type to use. For -example, the preceding example could be written as:

- - -

-DEBUG_WITH_TYPE("", errs() << "No debug type\n");
-DEBUG_WITH_TYPE("foo", errs() << "'foo' debug type\n");
-DEBUG_WITH_TYPE("bar", errs() << "'bar' debug type\n"));
-DEBUG_WITH_TYPE("", errs() << "No debug type (2)\n");
-

- -

The "llvm/ADT/Statistic.h" file -provides a class named Statistic that is used as a unified way to -keep track of what the LLVM compiler is doing and how effective various -optimizations are. It is useful to see what optimizations are contributing to -making a particular program run faster.

- -

Often you may run your pass on some big program, and you're interested to see -how many times it makes a certain transformation. Although you can do this with -hand inspection, or some ad-hoc method, this is a real pain and not very useful -for big programs. Using the Statistic class makes it very easy to -keep track of this information, and the calculated information is presented in a -uniform manner with the rest of the passes being executed.

- -

There are many examples of Statistic uses, but the basics of using -it are as follows:

- -

Define your statistic like this:
- -
-
```
-#define DEBUG_TYPE "mypassname"   // This goes before any #includes.
-STATISTIC(NumXForms, "The # of times I did stuff");
-
```
-
- -
The STATISTIC macro defines a static variable, whose name is - specified by the first argument. The pass name is taken from the DEBUG_TYPE - macro, and the description is taken from the second argument. The variable - defined ("NumXForms" in this case) acts like an unsigned integer.
Whenever you make a transformation, bump the counter:
- -
-
```
-++NumXForms;   // I did stuff!
-
```
-
- -

- -

That's all you have to do. To get 'opt' to print out the - statistics gathered, use the '-stats' option:

- -

-$ opt -stats -mypassname < program.bc > /dev/null
-... statistics output ...
-

- -

When running opt on a C file from the SPEC benchmark -suite, it gives a report that looks like this:

- -

-   7646 bitcodewriter   - Number of normal instructions
-    725 bitcodewriter   - Number of oversized instructions
- 129996 bitcodewriter   - Number of bitcode bytes written
-   2817 raise           - Number of insts DCEd or constprop'd
-   3213 raise           - Number of cast-of-self removed
-   5046 raise           - Number of expression trees converted
-     75 raise           - Number of other getelementptr's formed
-    138 raise           - Number of load/store peepholes
-     42 deadtypeelim    - Number of unused typenames removed from symtab
-    392 funcresolve     - Number of varargs functions resolved
-     27 globaldce       - Number of global variables removed
-      2 adce            - Number of basic blocks removed
-    134 cee             - Number of branches revectored
-     49 cee             - Number of setcc instruction eliminated
-    532 gcse            - Number of loads removed
-   2919 gcse            - Number of instructions removed
-     86 indvars         - Number of canonical indvars added
-     87 indvars         - Number of aux indvars removed
-     25 instcombine     - Number of dead inst eliminate
-    434 instcombine     - Number of insts combined
-    248 licm            - Number of load insts hoisted
-   1298 licm            - Number of insts hoisted to a loop pre-header
-      3 licm            - Number of insts hoisted to multiple loop preds (bad, no loop pre-header)
-     75 mem2reg         - Number of alloca's promoted
-   1444 cfgsimplify     - Number of blocks simplified
-

- -

Obviously, with so many optimizations, having a unified framework for this -stuff is very nice. Making your pass fit well into the framework makes it more -maintainable and useful.

- -

Several of the important data structures in LLVM are graphs: for example -CFGs made out of LLVM BasicBlocks, CFGs made out of -LLVM MachineBasicBlocks, and -Instruction Selection -DAGs. In many cases, while debugging various parts of the compiler, it is -nice to instantly visualize these graphs.

- -

LLVM provides several callbacks that are available in a debug build to do -exactly that. If you call the Function::viewCFG() method, for example, -the current LLVM tool will pop up a window containing the CFG for the function -where each basic block is a node in the graph, and each node contains the -instructions in the block. Similarly, there also exists -Function::viewCFGOnly() (does not include the instructions), the -MachineFunction::viewCFG() and MachineFunction::viewCFGOnly(), -and the SelectionDAG::viewGraph() methods. Within GDB, for example, -you can usually use something like call DAG.viewGraph() to pop -up a window. Alternatively, you can sprinkle calls to these functions in your -code in places you want to debug.

- -

Getting this to work requires a small amount of configuration. On Unix -systems with X11, install the graphviz -toolkit, and make sure 'dot' and 'gv' are in your path. If you are running on -Mac OS/X, download and install the Mac OS/X Graphviz program, and add -/Applications/Graphviz.app/Contents/MacOS/ (or wherever you install -it) to your path. Once in your system and path are set up, rerun the LLVM -configure script and rebuild LLVM to enable this functionality.

- -

SelectionDAG has been extended to make it easier to locate -interesting nodes in large complex graphs. From gdb, if you -call DAG.setGraphColor(node, "color"), then the -next call DAG.viewGraph() would highlight the node in the -specified color (choices of colors can be found at colors.) More -complex node attributes can be provided with call -DAG.setGraphAttrs(node, "attributes") (choices can be -found at Graph -Attributes.) If you want to restart and clear all the current graph -attributes, then you can call DAG.clearGraphAttrs().

- -

LLVM has a plethora of data structures in the llvm/ADT/ directory, - and we commonly use STL data structures. This section describes the trade-offs - you should consider when you pick one.

- -

-The first step is a choose your own adventure: do you want a sequential -container, a set-like container, or a map-like container? The most important -thing when choosing a container is the algorithmic properties of how you plan to -access the container. Based on that, you should use:

- -

a map-like container if you need efficient look-up - of an value based on another value. Map-like containers also support - efficient queries for containment (whether a key is in the map). Map-like - containers generally do not support efficient reverse mapping (values to - keys). If you need that, use two maps. Some map-like containers also - support efficient iteration through the keys in sorted order. Map-like - containers are the most expensive sort, only use them if you need one of - these capabilities.
a set-like container if you need to put a bunch of - stuff into a container that automatically eliminates duplicates. Some - set-like containers support efficient iteration through the elements in - sorted order. Set-like containers are more expensive than sequential - containers. -
a sequential container provides - the most efficient way to add elements and keeps track of the order they are - added to the collection. They permit duplicates and support efficient - iteration, but do not support efficient look-up based on a key. -
a string container is a specialized sequential - container or reference structure that is used for character or byte - arrays.
a bit container provides an efficient way to store and - perform set operations on sets of numeric id's, while automatically - eliminating duplicates. Bit containers require a maximum of 1 bit for each - identifier you want to store. -

- -

-Once the proper category of container is determined, you can fine tune the -memory use, constant factors, and cache behaviors of access by intelligently -picking a member of the category. Note that constant factors and cache behavior -can be a big deal. If you have a vector that usually only contains a few -elements (but could contain many), for example, it's much better to use -SmallVector than vector -. Doing so avoids (relatively) expensive malloc/free calls, which dwarf the -cost of adding the elements to the container.

- -

-There are a variety of sequential containers available for you, based on your -needs. Pick the first in this section that will do what you want. -

Fixed size arrays are very simple and very fast. They are good if you know -exactly how many elements you have, or you have a (low) upper bound on how many -you have.

Heap allocated arrays (new[] + delete[]) are also simple. They are good if -the number of elements is variable, if you know how many elements you will need -before the array is allocated, and if the array is usually large (if not, -consider a SmallVector). The cost of a heap -allocated array is the cost of the new/delete (aka malloc/free). Also note that -if you are allocating an array of a type with a constructor, the constructor and -destructors will be run for every element in the array (re-sizable vectors only -construct those elements actually used).

SmallVector<Type, N> is a simple class that looks and smells -just like vector<Type>: -it supports efficient iteration, lays out elements in memory order (so you can -do pointer arithmetic between elements), supports efficient push_back/pop_back -operations, supports efficient random access to its elements, etc.

- -

The advantage of SmallVector is that it allocates space for -some number of elements (N) in the object itself. Because of this, if -the SmallVector is dynamically smaller than N, no malloc is performed. This can -be a big win in cases where the malloc/free call is far more expensive than the -code that fiddles around with the elements.

- -

This is good for vectors that are "usually small" (e.g. the number of -predecessors/successors of a block is usually less than 8). On the other hand, -this makes the size of the SmallVector itself large, so you don't want to -allocate lots of them (doing so will waste a lot of space). As such, -SmallVectors are most useful when on the stack.

- -

SmallVector also provides a nice portable and efficient replacement for -alloca.

- -

-std::vector is well loved and respected. It is useful when SmallVector isn't: -when the size of the vector is often large (thus the small optimization will -rarely be a benefit) or if you will be allocating many instances of the vector -itself (which would waste space for elements that aren't in the container). -vector is also useful when interfacing with code that expects vectors :). -

- -

One worthwhile note about std::vector: avoid code like this:

- -

-for ( ... ) {
-   std::vector<foo> V;
-   use V;
-}
-

- -

Instead, write this as:

- -

-std::vector<foo> V;
-for ( ... ) {
-   use V;
-   V.clear();
-}
-

- -

Doing so will save (at least) one heap allocation and free per iteration of -the loop.

- -

std::deque is, in some senses, a generalized version of std::vector. Like -std::vector, it provides constant time random access and other similar -properties, but it also provides efficient access to the front of the list. It -does not guarantee continuity of elements within memory.

- -

In exchange for this extra flexibility, std::deque has significantly higher -constant factor costs than std::vector. If possible, use std::vector or -something cheaper.

std::list is an extremely inefficient class that is rarely useful. -It performs a heap allocation for every element inserted into it, thus having an -extremely high constant factor, particularly for small data types. std::list -also only supports bidirectional iteration, not random access iteration.

- -

In exchange for this high cost, std::list supports efficient access to both -ends of the list (like std::deque, but unlike std::vector or SmallVector). In -addition, the iterator invalidation characteristics of std::list are stronger -than that of a vector class: inserting or removing an element into the list does -not invalidate iterator or pointers to other elements in the list.

ilist<T> implements an 'intrusive' doubly-linked list. It is -intrusive, because it requires the element to store and provide access to the -prev/next pointers for the list.

- -

ilist has the same drawbacks as std::list, and additionally -requires an ilist_traits implementation for the element type, but it -provides some novel characteristics. In particular, it can efficiently store -polymorphic objects, the traits class is informed when an element is inserted or -removed from the list, and ilists are guaranteed to support a -constant-time splice operation.

- -

These properties are exactly what we want for things like -Instructions and basic blocks, which is why these are implemented with -ilists.

- -Related classes of interest are explained in the following subsections: -

ilist_traits
iplist
llvm/ADT/ilist_node.h
Sentinels

ilist_traits<T> is ilist<T>'s customization -mechanism. iplist<T> (and consequently ilist<T>) -publicly derive from this traits class.

iplist<T> is ilist<T>'s base and as such -supports a slightly narrower interface. Notably, inserters from -T& are absent.

- -

ilist_traits<T> is a public base of this class and can be -used for a wide variety of customizations.

ilist_node<T> implements a the forward and backward links -that are expected by the ilist<T> (and analogous containers) -in the default manner.

- -

ilist_node<T>s are meant to be embedded in the node type -T, usually T publicly derives from -ilist_node<T>.

ilists have another specialty that must be considered. To be a good -citizen in the C++ ecosystem, it needs to support the standard container -operations, such as begin and end iterators, etc. Also, the -operator-- must work correctly on the end iterator in the -case of non-empty ilists.

- -

The only sensible solution to this problem is to allocate a so-called -sentinel along with the intrusive list, which serves as the end -iterator, providing the back-link to the last element. However conforming to the -C++ convention it is illegal to operator++ beyond the sentinel and it -also must not be dereferenced.

- -

These constraints allow for some implementation freedom to the ilist -how to allocate and store the sentinel. The corresponding policy is dictated -by ilist_traits<T>. By default a T gets heap-allocated -whenever the need for a sentinel arises.

- -

While the default policy is sufficient in most cases, it may break down when -T does not provide a default constructor. Also, in the case of many -instances of ilists, the memory overhead of the associated sentinels -is wasted. To alleviate the situation with numerous and voluminous -T-sentinels, sometimes a trick is employed, leading to ghostly -sentinels.

- -

Ghostly sentinels are obtained by specially-crafted ilist_traits<T> -which superpose the sentinel with the ilist instance in memory. Pointer -arithmetic is used to obtain the sentinel, which is relative to the -ilist's this pointer. The ilist is augmented by an -extra pointer, which serves as the back-link of the sentinel. This is the only -field in the ghostly sentinel which can be legally accessed.

Other STL containers are available, such as std::string.

- -

There are also various STL adapter classes such as std::queue, -std::priority_queue, std::stack, etc. These provide simplified access to an -underlying container but don't affect the cost of the container itself.

- -

Set-like containers are useful when you need to canonicalize multiple values -into a single representation. There are several different choices for how to do -this, providing various trade-offs.

- -

If you intend to insert a lot of elements, then do a lot of queries, a -great approach is to use a vector (or other sequential container) with -std::sort+std::unique to remove duplicates. This approach works really well if -your usage pattern has these two distinct phases (insert then query), and can be -coupled with a good choice of sequential container. -

- -

-This combination provides the several nice properties: the result data is -contiguous in memory (good for cache locality), has few allocations, is easy to -address (iterators in the final vector are just indices or pointers), and can be -efficiently queried with a standard binary or radix search.

- -

If you have a set-like data structure that is usually small and whose elements -are reasonably small, a SmallSet<Type, N> is a good choice. This set -has space for N elements in place (thus, if the set is dynamically smaller than -N, no malloc traffic is required) and accesses them with a simple linear search. -When the set grows beyond 'N' elements, it allocates a more expensive representation that -guarantees efficient access (for most types, it falls back to std::set, but for -pointers it uses something far better, SmallPtrSet).

- -

The magic of this class is that it handles small sets extremely efficiently, -but gracefully handles extremely large sets without loss of efficiency. The -drawback is that the interface is quite small: it supports insertion, queries -and erasing, but does not support iteration.

- -

SmallPtrSet has all the advantages of SmallSet (and a SmallSet of pointers is -transparently implemented with a SmallPtrSet), but also supports iterators. If -more than 'N' insertions are performed, a single quadratically -probed hash table is allocated and grows as needed, providing extremely -efficient access (constant time insertion/deleting/queries with low constant -factors) and is very stingy with malloc traffic.

- -

Note that, unlike std::set, the iterators of SmallPtrSet are invalidated -whenever an insertion occurs. Also, the values visited by the iterators are not -visited in sorted order.

- -

-DenseSet is a simple quadratically probed hash table. It excels at supporting -small values: it uses a single allocation to hold all of the pairs that -are currently inserted in the set. DenseSet is a great way to unique small -values that are not simple pointers (use SmallPtrSet for pointers). Note that DenseSet has -the same requirements for the value type that DenseMap has. -

- -

-FoldingSet is an aggregate class that is really good at uniquing -expensive-to-create or polymorphic objects. It is a combination of a chained -hash table with intrusive links (uniqued objects are required to inherit from -FoldingSetNode) that uses SmallVector as part of -its ID process.

- -

Consider a case where you want to implement a "getOrCreateFoo" method for -a complex object (for example, a node in the code generator). The client has a -description of *what* it wants to generate (it knows the opcode and all the -operands), but we don't want to 'new' a node, then try inserting it into a set -only to find out it already exists, at which point we would have to delete it -and return the node that already exists. -

- -

To support this style of client, FoldingSet perform a query with a -FoldingSetNodeID (which wraps SmallVector) that can be used to describe the -element that we want to query for. The query either returns the element -matching the ID or it returns an opaque ID that indicates where insertion should -take place. Construction of the ID usually does not require heap traffic.

- -

Because FoldingSet uses intrusive links, it can support polymorphic objects -in the set (for example, you can have SDNode instances mixed with LoadSDNodes). -Because the elements are individually allocated, pointers to the elements are -stable: inserting or removing elements does not invalidate any pointers to other -elements. -

- -

std::set is a reasonable all-around set class, which is decent at -many things but great at nothing. std::set allocates memory for each element -inserted (thus it is very malloc intensive) and typically stores three pointers -per element in the set (thus adding a large amount of per-element space -overhead). It offers guaranteed log(n) performance, which is not particularly -fast from a complexity standpoint (particularly if the elements of the set are -expensive to compare, like strings), and has extremely high constant factors for -lookup, insertion and removal.

- -

The advantages of std::set are that its iterators are stable (deleting or -inserting an element from the set does not affect iterators or pointers to other -elements) and that iteration over the set is guaranteed to be in sorted order. -If the elements in the set are large, then the relative overhead of the pointers -and malloc traffic is not a big deal, but if the elements of the set are small, -std::set is almost never a good choice.

- -

LLVM's SetVector<Type> is an adapter class that combines your choice of -a set-like container along with a Sequential -Container. The important property -that this provides is efficient insertion with uniquing (duplicate elements are -ignored) with iteration support. It implements this by inserting elements into -both a set-like container and the sequential container, using the set-like -container for uniquing and the sequential container for iteration. -

- -

The difference between SetVector and other sets is that the order of -iteration is guaranteed to match the order of insertion into the SetVector. -This property is really important for things like sets of pointers. Because -pointer values are non-deterministic (e.g. vary across runs of the program on -different machines), iterating over the pointers in the set will -not be in a well-defined order.

- -

-The drawback of SetVector is that it requires twice as much space as a normal -set and has the sum of constant factors from the set-like container and the -sequential container that it uses. Use it *only* if you need to iterate over -the elements in a deterministic order. SetVector is also expensive to delete -elements out of (linear time), unless you use it's "pop_back" method, which is -faster. -

- -

SetVector is an adapter class that defaults to using std::vector and std::set -for the underlying containers, so it is quite expensive. However, -"llvm/ADT/SetVector.h" also provides a SmallSetVector class, which -defaults to using a SmallVector and SmallSet of a specified size. If you use -this, and if your sets are dynamically smaller than N, you will save a lot of -heap traffic.

- -

-UniqueVector is similar to SetVector, but it -retains a unique ID for each element inserted into the set. It internally -contains a map and a vector, and it assigns a unique ID for each value inserted -into the set.

- -

UniqueVector is very expensive: its cost is the sum of the cost of -maintaining both the map and vector, it has high complexity, high constant -factors, and produces a lot of malloc traffic. It should be avoided.

- -

-The STL provides several other options, such as std::multiset and the various -"hash_set" like containers (whether from C++ TR1 or from the SGI library). We -never use hash_set and unordered_set because they are generally very expensive -(each insertion requires a malloc) and very non-portable. -

- -

std::multiset is useful if you're not interested in elimination of -duplicates, but has all the drawbacks of std::set. A sorted vector (where you -don't delete duplicate entries) or some other approach is almost always -better.

- -

-Map-like containers are useful when you want to associate data to a key. As -usual, there are a lot of different ways to do this. :) -

- -

-If your usage pattern follows a strict insert-then-query approach, you can -trivially use the same approach as sorted vectors -for set-like containers. The only difference is that your query function -(which uses std::lower_bound to get efficient log(n) lookup) should only compare -the key, not both the key and value. This yields the same advantages as sorted -vectors for sets. -

- -

-Strings are commonly used as keys in maps, and they are difficult to support -efficiently: they are variable length, inefficient to hash and compare when -long, expensive to copy, etc. StringMap is a specialized container designed to -cope with these issues. It supports mapping an arbitrary range of bytes to an -arbitrary other object.

- -

The StringMap implementation uses a quadratically-probed hash table, where -the buckets store a pointer to the heap allocated entries (and some other -stuff). The entries in the map must be heap allocated because the strings are -variable length. The string data (key) and the element object (value) are -stored in the same allocation with the string data immediately after the element -object. This container guarantees the "(char*)(&Value+1)" points -to the key string for a value.

- -

The StringMap is very fast for several reasons: quadratic probing is very -cache efficient for lookups, the hash value of strings in buckets is not -recomputed when lookup up an element, StringMap rarely has to touch the -memory for unrelated objects when looking up a value (even when hash collisions -happen), hash table growth does not recompute the hash values for strings -already in the table, and each pair in the map is store in a single allocation -(the string data is stored in the same allocation as the Value of a pair).

- -

StringMap also provides query methods that take byte ranges, so it only ever -copies a string if a value is inserted into the table.

-IndexedMap is a specialized container for mapping small dense integers (or -values that can be mapped to small dense integers) to some other type. It is -internally implemented as a vector with a mapping function that maps the keys to -the dense integer range. -

- -

-This is useful for cases like virtual registers in the LLVM code generator: they -have a dense mapping that is offset by a compile-time constant (the first -virtual register ID).

- -

-DenseMap is a simple quadratically probed hash table. It excels at supporting -small keys and values: it uses a single allocation to hold all of the pairs that -are currently inserted in the map. DenseMap is a great way to map pointers to -pointers, or map other small types to each other. -

- -

-There are several aspects of DenseMap that you should be aware of, however. The -iterators in a densemap are invalidated whenever an insertion occurs, unlike -map. Also, because DenseMap allocates space for a large number of key/value -pairs (it starts with 64 by default), it will waste a lot of space if your keys -or values are large. Finally, you must implement a partial specialization of -DenseMapInfo for the key that you want, if it isn't already supported. This -is required to tell DenseMap about two special marker values (which