author | Dmitri Gribenko <gribozavr@gmail.com> | 2012-11-22 11:56:02 +0000 |
---|---|---|
committer | Dmitri Gribenko <gribozavr@gmail.com> | 2012-11-22 11:56:02 +0000 |
commit | bbef5ead4cc90d7f7ca2f5dded41751ca3ff3dc9 (patch) | |
tree | 41991cf65a641c93b3de40a2f17b5fd0c6b5f9a9 /docs/SourceLevelDebugging.rst | |
parent | 7a3b7e5efc44c3852c5b34b245bd4eedeeac886f (diff) |
Documentation: convert SourceLevelDebugging.html to reST
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168493 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs/SourceLevelDebugging.rst')
-rw-r--r-- | docs/SourceLevelDebugging.rst | 2285 |
1 files changed, 2285 insertions, 0 deletions
diff --git a/docs/SourceLevelDebugging.rst b/docs/SourceLevelDebugging.rst new file mode 100644 index 0000000000..2bbf2e2c35 --- /dev/null +++ b/docs/SourceLevelDebugging.rst @@ -0,0 +1,2285 @@ +================================ +Source Level Debugging with LLVM +================================ + +.. sectionauthor:: Chris Lattner <sabre@nondot.org> and Jim Laskey <jlaskey@mac.com> + +.. contents:: + :local: + +Introduction +============ + +This document is the central repository for all information pertaining to debug +information in LLVM. It describes the :ref:`actual format that the LLVM debug +information takes <format>`, which is useful for those interested in creating +front-ends or dealing directly with the information. Further, this document +provides specific examples of what debug information for C/C++ looks like. + +Philosophy behind LLVM debugging information +-------------------------------------------- + +The idea of the LLVM debugging information is to capture how the important +pieces of the source-language's Abstract Syntax Tree map onto LLVM code. +Several design aspects have shaped the solution that appears here. The +important ones are: + +* Debugging information should have very little impact on the rest of the + compiler. No transformations, analyses, or code generators should need to + be modified because of debugging information. + +* LLVM optimizations should interact in :ref:`well-defined and easily described + ways <intro_debugopt>` with the debugging information. + +* Because LLVM is designed to support arbitrary programming languages, + LLVM-to-LLVM tools should not need to know anything about the semantics of + the source-level-language. + +* Source-level languages are often **widely** different from one another. + LLVM should not put any restrictions on the flavor of the source-language, + and the debugging information should work with any language. 
+ +* With code generator support, it should be possible to use an LLVM compiler + to compile a program to native machine code and standard debugging + formats. This allows compatibility with traditional machine-code level + debuggers, like GDB or DBX. + +The approach used by the LLVM implementation is to use a small set of +:ref:`intrinsic functions <format_common_intrinsics>` to define a mapping +between LLVM program objects and the source-level objects. The description of +the source-level program is maintained in LLVM metadata in an +:ref:`implementation-defined format <ccxx_frontend>` (the C/C++ front-end +currently uses working draft 7 of the `DWARF 3 standard +<http://www.eagercon.com/dwarf/dwarf3std.htm>`_). + +When a program is being debugged, a debugger interacts with the user and turns +the stored debug information into source-language specific information. As +such, a debugger must be aware of the source-language, and is thus tied to a +specific language or family of languages. + +Debug information consumers +--------------------------- + +The role of debug information is to provide meta information normally stripped +away during the compilation process. This meta information provides an LLVM +user a relationship between generated code and the original program source +code. + +Currently, debug information is consumed by DwarfDebug to produce dwarf +information used by the gdb debugger. Other targets could use the same +information to produce stabs or other debug forms. + +It would also be reasonable to use debug information to feed profiling tools +for analysis of generated code, or, tools for reconstructing the original +source from generated code. + +TODO - expound a bit more. + +.. _intro_debugopt: + +Debugging optimized code +------------------------ + +An extremely high priority of LLVM debugging information is to make it interact +well with optimizations and analysis. 
In particular, the LLVM debug +information provides the following guarantees: + +* LLVM debug information **always provides information to accurately read + the source-level state of the program**, regardless of which LLVM + optimizations have been run, and without any modification to the + optimizations themselves. However, some optimizations may impact the + ability to modify the current state of the program with a debugger, such + as setting program variables, or calling functions that have been + deleted. + +* As desired, LLVM optimizations can be upgraded to be aware of the LLVM + debugging information, allowing them to update the debugging information + as they perform aggressive optimizations. This means that, with effort, + the LLVM optimizers could optimize debug code just as well as non-debug + code. + +* LLVM debug information does not prevent optimizations from + happening (for example inlining, basic block reordering/merging/cleanup, + tail duplication, etc). + +* LLVM debug information is automatically optimized along with the rest of + the program, using existing facilities. For example, duplicate + information is automatically merged by the linker, and unused information + is automatically removed. + +Basically, the debug information allows you to compile a program with +"``-O0 -g``" and get full debug information, allowing you to arbitrarily modify +the program as it executes from a debugger. Compiling a program with +"``-O3 -g``" gives you full debug information that is always available and +accurate for reading (e.g., you get accurate stack traces despite tail call +elimination and inlining), but you might lose the ability to modify the program +and call functions that were optimized out of the program, or inlined away +completely. + +The :ref:`LLVM test suite <test-suite-quickstart>` provides a framework to test +the optimizer's handling of debugging information. It can be run like this: + +.. 
code-block:: bash + + % cd llvm/projects/test-suite/MultiSource/Benchmarks # or some other level + % make TEST=dbgopt + +This will test impact of debugging information on optimization passes. If +debugging information influences optimization passes then it will be reported +as a failure. See :doc:`TestingGuide` for more information on LLVM test +infrastructure and how to run various tests. + +.. _format: + +Debugging information format +============================ + +LLVM debugging information has been carefully designed to make it possible for +the optimizer to optimize the program and debugging information without +necessarily having to know anything about debugging information. In +particular, the use of metadata avoids duplicated debugging information from +the beginning, and the global dead code elimination pass automatically deletes +debugging information for a function if it decides to delete the function. + +To do this, most of the debugging information (descriptors for types, +variables, functions, source files, etc) is inserted by the language front-end +in the form of LLVM metadata. + +Debug information is designed to be agnostic about the target debugger and +debugging information representation (e.g. DWARF/Stabs/etc). It uses a generic +pass to decode the information that represents variables, types, functions, +namespaces, etc: this allows for arbitrary source-language semantics and +type-systems to be used, as long as there is a module written for the target +debugger to interpret the information. + +To provide basic functionality, the LLVM debugger does have to make some +assumptions about the source-level language being debugged, though it keeps +these to a minimum. The only common features that the LLVM debugger assumes +exist are :ref:`source files <format_files>`, and :ref:`program objects +<format_global_variables>`. These abstract objects are used by a debugger to +form stack traces, show information about local variables, etc. 
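The descriptor layouts in this document store a DWARF tag and the debug version together in one ``i32``: the text gives ``LLVMDebugVersion = 8 << 16``, so a compile unit tag of 17 appears as 524305. A minimal sketch of how a consumer might split or build such a field follows. The helper names are hypothetical and are not part of LLVM's API, and the sketch assumes the low 16 bits hold the tag, which follows from the ``<< 16`` shift.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not LLVM API. Assumed layout, as stated in this
// document: first descriptor field = DWARF tag + (debug version << 16).
constexpr std::uint32_t kVersionShift = 16;
constexpr std::uint32_t kTagMask = (1u << kVersionShift) - 1;

// Low 16 bits: the DWARF-style tag (for example 17 for DW_TAG_compile_unit).
constexpr std::uint32_t dwarfTag(std::uint32_t field) {
  return field & kTagMask;
}

// High 16 bits: the debug info version the producer used.
constexpr std::uint32_t debugVersion(std::uint32_t field) {
  return field >> kVersionShift;
}

// Combine a tag and a version into the value stored in the descriptor.
constexpr std::uint32_t makeTagField(std::uint32_t tag, std::uint32_t version) {
  return tag + (version << kVersionShift);
}

// Checks against tag values quoted in this document's examples.
inline void checkDocumentValues() {
  assert(makeTagField(17, 8) == 524305u);  // DW_TAG_compile_unit, version 8
  assert(dwarfTag(524329u) == 41u);        // DW_TAG_file_type
  assert(debugVersion(524329u) == 8u);
}
```

Splitting the field this way also shows that the examples in this document were produced with different debug versions: 458769 and 786449 both carry tag 17 but decode to versions 7 and 12.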
+ +This section of the documentation first describes the representation aspects +common to any source-language. :ref:`ccxx_frontend` describes the data layout +conventions used by the C and C++ front-ends. + +Debug information descriptors +----------------------------- + +In consideration of the complexity and volume of debug information, LLVM +provides a specification for well-formed debug descriptors. + +Consumers of LLVM debug information expect the descriptors for program objects +to start in a canonical format, but the descriptors can include additional +information appended at the end that is source-language specific. All LLVM +debugging information is versioned, allowing backwards compatibility in the +case that the core structures need to change in some way. Also, all debugging +information objects start with a tag to indicate what type of object it is. +The source-language is allowed to define its own objects, by using unreserved +tag numbers. We recommend using tags in the range 0x1000 through 0x2000 +(there is a defined ``enum DW_TAG_user_base = 0x1000``). + +The fields of debug descriptors used internally by LLVM are restricted to only +the simple data types ``i32``, ``i1``, ``float``, ``double``, ``mdstring`` and +``mdnode``. + +.. code-block:: llvm + + !1 = metadata !{ + i32, ;; A tag + ... + } + +The first field of a descriptor is always an +``i32`` containing a tag value identifying the content of the descriptor. +The remaining fields are specific to the descriptor. The values of tags are +loosely bound to the tag values of DWARF information entries. However, that +does not restrict the use of the information supplied to DWARF targets. To +facilitate versioning of debug information, the tag is augmented with the +current debug version (``LLVMDebugVersion = 8 << 16`` or 0x80000 or +524288). + +The details of the various descriptors follow. + +Compile unit descriptors +^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code-block:: llvm + + !0 = metadata !{ + i32, ;; Tag = 17 + LLVMDebugVersion (DW_TAG_compile_unit) + i32, ;; Unused field. + i32, ;; DWARF language identifier (ex. DW_LANG_C89) + metadata, ;; Source file name + metadata, ;; Source file directory (includes trailing slash) + metadata, ;; Producer (ex. "4.0.1 LLVM (LLVM research group)") + i1, ;; True if this is a main compile unit. + i1, ;; True if this is optimized. + metadata, ;; Flags + i32, ;; Runtime version + metadata, ;; List of enum types + metadata, ;; List of retained types + metadata, ;; List of subprograms + metadata ;; List of global variables + } + +These descriptors contain a source language ID for the file (we use the DWARF +3.0 ID numbers, such as ``DW_LANG_C89``, ``DW_LANG_C_plus_plus``, +``DW_LANG_Cobol74``, etc) and three strings: the filename, the working +directory of the compiler, and an identifier string for the compiler that +produced it. + +Compile unit descriptors provide the root context for objects declared in a +specific compilation unit. File descriptors are defined using this context. +These descriptors are collected by the named metadata ``!llvm.dbg.cu``. A compile +unit descriptor keeps track of subprograms, global variables and type +information. + +.. _format_files: + +File descriptors +^^^^^^^^^^^^^^^^ + +.. code-block:: llvm + + !0 = metadata !{ + i32, ;; Tag = 41 + LLVMDebugVersion (DW_TAG_file_type) + metadata, ;; Source file name + metadata, ;; Source file directory (includes trailing slash) + metadata ;; Unused + } + +These descriptors contain information for a file. Global variables and top +level functions would be defined using this context. File descriptors also +provide context for source line correspondence. + +Each input file is encoded as a separate file descriptor in LLVM debugging +information output. + +.. _format_global_variables: + +Global variable descriptors +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code-block:: llvm + + !1 = metadata !{ + i32, ;; Tag = 52 + LLVMDebugVersion (DW_TAG_variable) + i32, ;; Unused field. + metadata, ;; Reference to context descriptor + metadata, ;; Name + metadata, ;; Display name (fully qualified C++ name) + metadata, ;; MIPS linkage name (for C++) + metadata, ;; Reference to file where defined + i32, ;; Line number where defined + metadata, ;; Reference to type descriptor + i1, ;; True if the global is local to compile unit (static) + i1, ;; True if the global is defined in the compile unit (not extern) + {}* ;; Reference to the global variable + } + +These descriptors provide debug information about global variables. They +provide details such as name, type and where the variable is defined. All +global variables are collected inside the named metadata ``!llvm.dbg.cu``. + +.. _format_subprograms: + +Subprogram descriptors +^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: llvm + + !2 = metadata !{ + i32, ;; Tag = 46 + LLVMDebugVersion (DW_TAG_subprogram) + i32, ;; Unused field. + metadata, ;; Reference to context descriptor + metadata, ;; Name + metadata, ;; Display name (fully qualified C++ name) + metadata, ;; MIPS linkage name (for C++) + metadata, ;; Reference to file where defined + i32, ;; Line number where defined + metadata, ;; Reference to type descriptor + i1, ;; True if the global is local to compile unit (static) + i1, ;; True if the global is defined in the compile unit (not extern) + i32, ;; Line number where the scope of the subprogram begins + i32, ;; Virtuality, e.g. dwarf::DW_VIRTUALITY_virtual + i32, ;; Index into a virtual function + metadata, ;; indicates which base type contains the vtable pointer for the + ;; derived class + i32, ;; Flags - Artificial, Private, Protected, Explicit, Prototyped. 
+ i1, ;; isOptimized + Function *, ;; Pointer to LLVM function + metadata, ;; Lists function template parameters + metadata, ;; Function declaration descriptor + metadata ;; List of function variables + } + +These descriptors provide debug information about functions, methods and +subprograms. They provide details such as name, return type and the source +location where the subprogram is defined. + +Block descriptors +^^^^^^^^^^^^^^^^^ + +.. code-block:: llvm + + !3 = metadata !{ + i32, ;; Tag = 11 + LLVMDebugVersion (DW_TAG_lexical_block) + metadata,;; Reference to context descriptor + i32, ;; Line number + i32, ;; Column number + metadata,;; Reference to source file + i32 ;; Unique ID to identify blocks from a template function + } + +This descriptor provides debug information about nested blocks within a +subprogram. The line and column numbers are used to distinguish two +lexical blocks at the same depth. + +.. code-block:: llvm + + !3 = metadata !{ + i32, ;; Tag = 11 + LLVMDebugVersion (DW_TAG_lexical_block) + metadata,;; Reference to the scope we're annotating with a file change + metadata ;; Reference to the file the scope is enclosed in. + } + +This descriptor provides a wrapper around a lexical scope to handle file +changes in the middle of a lexical block. + +.. _format_basic_type: + +Basic type descriptors +^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: llvm + + !4 = metadata !{ + i32, ;; Tag = 36 + LLVMDebugVersion (DW_TAG_base_type) + metadata, ;; Reference to context + metadata, ;; Name (may be "" for anonymous types) + metadata, ;; Reference to file where defined (may be NULL) + i32, ;; Line number where defined (may be 0) + i64, ;; Size in bits + i64, ;; Alignment in bits + i64, ;; Offset in bits + i32, ;; Flags + i32 ;; DWARF type encoding + } + +These descriptors define primitive types used in the code. Examples are ``int``, +``bool`` and ``float``. The context provides the scope of the type, which is +usually the top level. 
Since basic types are not usually user defined, the +context and line number can be left as NULL and 0. The size, alignment and +offset are expressed in bits and can be 64 bit values. The alignment is used +to round the offset when embedded in a :ref:`composite type +<format_composite_type>` (for example, to keep doubles on 64 bit boundaries). +The offset is the bit offset if embedded in a :ref:`composite type +<format_composite_type>`. + +The type encoding provides the details of the type. The values are typically +one of the following: + +.. code-block:: llvm + + DW_ATE_address = 1 + DW_ATE_boolean = 2 + DW_ATE_float = 4 + DW_ATE_signed = 5 + DW_ATE_signed_char = 6 + DW_ATE_unsigned = 7 + DW_ATE_unsigned_char = 8 + +.. _format_derived_type: + +Derived type descriptors +^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: llvm + + !5 = metadata !{ + i32, ;; Tag (see below) + metadata, ;; Reference to context + metadata, ;; Name (may be "" for anonymous types) + metadata, ;; Reference to file where defined (may be NULL) + i32, ;; Line number where defined (may be 0) + i64, ;; Size in bits + i64, ;; Alignment in bits + i64, ;; Offset in bits + i32, ;; Flags to encode attributes, e.g. private + metadata, ;; Reference to type derived from + metadata, ;; (optional) Name of the Objective C property associated with + ;; an Objective-C ivar + metadata, ;; (optional) Name of the Objective C property getter selector. + metadata, ;; (optional) Name of the Objective C property setter selector. + i32 ;; (optional) Objective C property attributes. + } + +These descriptors are used to define types derived from other types. The value +of the tag varies depending on the meaning. The following are possible tag +values: + +.. 
code-block:: llvm + + DW_TAG_formal_parameter = 5 + DW_TAG_member = 13 + DW_TAG_pointer_type = 15 + DW_TAG_reference_type = 16 + DW_TAG_typedef = 22 + DW_TAG_const_type = 38 + DW_TAG_volatile_type = 53 + DW_TAG_restrict_type = 55 + +``DW_TAG_member`` is used to define a member of a :ref:`composite type +<format_composite_type>` or :ref:`subprogram <format_subprograms>`. The type +of the member is the :ref:`derived type <format_derived_type>`. +``DW_TAG_formal_parameter`` is used to define a member which is a formal +argument of a subprogram. + +``DW_TAG_typedef`` is used to provide a name for the derived type. + +``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``, +``DW_TAG_volatile_type`` and ``DW_TAG_restrict_type`` are used to qualify the +:ref:`derived type <format_derived_type>`. + +:ref:`Derived type <format_derived_type>` location can be determined from the +context and line number. The size, alignment and offset are expressed in bits +and can be 64 bit values. The alignment is used to round the offset when +embedded in a :ref:`composite type <format_composite_type>` (example to keep +float doubles on 64 bit boundaries.) The offset is the bit offset if embedded +in a :ref:`composite type <format_composite_type>`. + +Note that the ``void *`` type is expressed as a type derived from NULL. + +.. _format_composite_type: + +Composite type descriptors +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code-block:: llvm + + !6 = metadata !{ + i32, ;; Tag (see below) + metadata, ;; Reference to context + metadata, ;; Name (may be "" for anonymous types) + metadata, ;; Reference to file where defined (may be NULL) + i32, ;; Line number where defined (may be 0) + i64, ;; Size in bits + i64, ;; Alignment in bits + i64, ;; Offset in bits + i32, ;; Flags + metadata, ;; Reference to type derived from + metadata, ;; Reference to array of member descriptors + i32 ;; Runtime languages + } + +These descriptors are used to define types that are composed of 0 or more +elements. The value of the tag varies depending on the meaning. The following +are possible tag values: + +.. code-block:: llvm + + DW_TAG_array_type = 1 + DW_TAG_enumeration_type = 4 + DW_TAG_structure_type = 19 + DW_TAG_union_type = 23 + DW_TAG_vector_type = 259 + DW_TAG_subroutine_type = 21 + DW_TAG_inheritance = 28 + +The vector flag indicates that an array type is a native packed vector. + +The members of array types (tag = ``DW_TAG_array_type``) or vector types (tag = +``DW_TAG_vector_type``) are :ref:`subrange descriptors <format_subrange>`, each +representing the range of subscripts at that level of indexing. + +The members of enumeration types (tag = ``DW_TAG_enumeration_type``) are +:ref:`enumerator descriptors <format_enumerator>`, each representing the +definition of enumeration value for the set. All enumeration type descriptors +are collected inside the named metadata ``!llvm.dbg.cu``. + +The members of structure (tag = ``DW_TAG_structure_type``) or union (tag = +``DW_TAG_union_type``) types are any one of the :ref:`basic +<format_basic_type>`, :ref:`derived <format_derived_type>` or :ref:`composite +<format_composite_type>` type descriptors, each representing a field member of +the structure or union. + +For C++ classes (tag = ``DW_TAG_structure_type``), member descriptors provide +information about base classes, static members and member functions. 
If a +member is a :ref:`derived type descriptor <format_derived_type>` and has a tag +of ``DW_TAG_inheritance``, then the type represents a base class. If the member +is a :ref:`global variable descriptor <format_global_variables>` then it +represents a static member. And, if the member is a :ref:`subprogram +descriptor <format_subprograms>` then it represents a member function. For +static members and member functions, ``getName()`` returns the member's linkage name or +the C++ mangled name. ``getDisplayName()`` returns the simplified version of the name. + +The first element of a subroutine type (tag = ``DW_TAG_subroutine_type``) +is the return type for the subroutine. The remaining elements are the formal +arguments to the subroutine. + +:ref:`Composite type <format_composite_type>` location can be determined from +the context and line number. The size, alignment and offset are expressed in +bits and can be 64 bit values. The alignment is used to round the offset when +embedded in a :ref:`composite type <format_composite_type>` (as an example, to +keep doubles on 64 bit boundaries). The offset is the bit offset if +embedded in a :ref:`composite type <format_composite_type>`. + +.. _format_subrange: + +Subrange descriptors +^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: llvm + + !42 = metadata !{ + i32, ;; Tag = 33 + LLVMDebugVersion (DW_TAG_subrange_type) + i64, ;; Low value + i64 ;; High value + } + +These descriptors are used to define ranges of array subscripts for an array +:ref:`composite type <format_composite_type>`. The low value defines the lower +bound, typically zero for C/C++. The high value is the upper bound. Values +are 64 bit. ``High - Low + 1`` is the size of the array. If ``Low > High`` +the array bounds are not included in generated debugging information. + +.. _format_enumerator: + +Enumerator descriptors +^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code-block:: llvm + + !6 = metadata !{ + i32, ;; Tag = 40 + LLVMDebugVersion (DW_TAG_enumerator) + metadata, ;; Name + i64 ;; Value + } + +These descriptors are used to define members of an enumeration :ref:`composite +type <format_composite_type>`; each one associates a name with a value. + +Local variables +^^^^^^^^^^^^^^^ + +.. code-block:: llvm + + !7 = metadata !{ + i32, ;; Tag (see below) + metadata, ;; Context + metadata, ;; Name + metadata, ;; Reference to file where defined + i32, ;; 24 bit - Line number where defined + ;; 8 bit - Argument number. 1 indicates 1st argument. + metadata, ;; Type descriptor + i32, ;; flags + metadata ;; (optional) Reference to inline location + } + +These descriptors are used to define variables local to a subprogram. The +value of the tag depends on the usage of the variable: + +.. code-block:: llvm + + DW_TAG_auto_variable = 256 + DW_TAG_arg_variable = 257 + DW_TAG_return_variable = 258 + +An auto variable is any variable declared in the body of the function. An +argument variable is any variable that appears as a formal argument to the +function. A return variable is used to track the result of a function and has +no source correspondent. + +The context is either the subprogram or block where the variable is defined. +Name is the source variable name. Context and line indicate where the variable +was defined. The type descriptor defines the declared type of the variable. + +.. _format_common_intrinsics: + +Debugger intrinsic functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +LLVM uses several intrinsic functions (name prefixed with "``llvm.dbg``") to +provide debug information at various points in generated code. + +``llvm.dbg.declare`` +^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: llvm + + void %llvm.dbg.declare(metadata, metadata) + +This intrinsic provides information about a local element (e.g., variable). +The first argument is metadata holding the alloca for the variable. 
The second +argument is metadata containing a description of the variable. + +``llvm.dbg.value`` +^^^^^^^^^^^^^^^^^^ + +.. code-block:: llvm + + void %llvm.dbg.value(metadata, i64, metadata) + +This intrinsic provides information when a user source variable is set to a new +value. The first argument is the new value (wrapped as metadata). The second +argument is the offset in the user source variable where the new value is +written. The third argument is metadata containing a description of the user +source variable. + +Object lifetimes and scoping +============================ + +In many languages, the local variables in functions can have their lifetimes or +scopes limited to a subset of a function. In the C family of languages, for +example, variables are only live (readable and writable) within the source +block that they are defined in. In functional languages, values are only +readable after they have been defined. Though this is a very obvious concept, +it is non-trivial to model in LLVM, because it has no notion of scoping in this +sense, and does not want to be tied to a language's scoping rules. + +In order to handle this, the LLVM debug format uses the metadata attached to +llvm instructions to encode line number and scoping information. Consider the +following C fragment, for example: + +.. code-block:: c + + 1. void foo() { + 2. int X = 21; + 3. int Y = 22; + 4. { + 5. int Z = 23; + 6. Z = X; + 7. } + 8. X = Y; + 9. } + +Compiled to LLVM, this function would be represented like this: + +.. 
code-block:: llvm + + define void @foo() nounwind ssp { + entry: + %X = alloca i32, align 4 ; <i32*> [#uses=4] + %Y = alloca i32, align 4 ; <i32*> [#uses=4] + %Z = alloca i32, align 4 ; <i32*> [#uses=3] + %0 = bitcast i32* %X to {}* ; <{}*> [#uses=1] + call void @llvm.dbg.declare(metadata !{i32 * %X}, metadata !0), !dbg !7 + store i32 21, i32* %X, !dbg !8 + %1 = bitcast i32* %Y to {}* ; <{}*> [#uses=1] + call void @llvm.dbg.declare(metadata !{i32 * %Y}, metadata !9), !dbg !10 + store i32 22, i32* %Y, !dbg !11 + %2 = bitcast i32* %Z to {}* ; <{}*> [#uses=1] + call void @llvm.dbg.declare(metadata !{i32 * %Z}, metadata !12), !dbg !14 + store i32 23, i32* %Z, !dbg !15 + %tmp = load i32* %X, !dbg !16 ; <i32> [#uses=1] + %tmp1 = load i32* %Y, !dbg !16 ; <i32> [#uses=1] + %add = add nsw i32 %tmp, %tmp1, !dbg !16 ; <i32> [#uses=1] + store i32 %add, i32* %Z, !dbg !16 + %tmp2 = load i32* %Y, !dbg !17 ; <i32> [#uses=1] + store i32 %tmp2, i32* %X, !dbg !17 + ret void, !dbg !18 + } + + declare void @llvm.dbg.declare(metadata, metadata) nounwind readnone + + !0 = metadata !{i32 459008, metadata !1, metadata !"X", + metadata !3, i32 2, metadata !6}; [ DW_TAG_auto_variable ] + !1 = metadata !{i32 458763, metadata !2}; [DW_TAG_lexical_block ] + !2 = metadata !{i32 458798, i32 0, metadata !3, metadata !"foo", metadata !"foo", + metadata !"foo", metadata !3, i32 1, metadata !4, + i1 false, i1 true}; [DW_TAG_subprogram ] + !3 = metadata !{i32 458769, i32 0, i32 12, metadata !"foo.c", + metadata !"/private/tmp", metadata !"clang 1.1", i1 true, + i1 false, metadata !"", i32 0}; [DW_TAG_compile_unit ] + !4 = metadata !{i32 458773, metadata !3, metadata !"", null, i32 0, i64 0, i64 0, + i64 0, i32 0, null, metadata !5, i32 0}; [DW_TAG_subroutine_type ] + !5 = metadata !{null} + !6 = metadata !{i32 458788, metadata !3, metadata !"int", metadata !3, i32 0, + i64 32, i64 32, i64 0, i32 0, i32 5}; [DW_TAG_base_type ] + !7 = metadata !{i32 2, i32 7, metadata !1, null} + !8 = metadata !{i32 2, 
i32 3, metadata !1, null} + !9 = metadata !{i32 459008, metadata !1, metadata !"Y", metadata !3, i32 3, + metadata !6}; [ DW_TAG_auto_variable ] + !10 = metadata !{i32 3, i32 7, metadata !1, null} + !11 = metadata !{i32 3, i32 3, metadata !1, null} + !12 = metadata !{i32 459008, metadata !13, metadata !"Z", metadata !3, i32 5, + metadata !6}; [ DW_TAG_auto_variable ] + !13 = metadata !{i32 458763, metadata !1}; [DW_TAG_lexical_block ] + !14 = metadata !{i32 5, i32 9, metadata !13, null} + !15 = metadata !{i32 5, i32 5, metadata !13, null} + !16 = metadata !{i32 6, i32 5, metadata !13, null} + !17 = metadata !{i32 8, i32 3, metadata !1, null} + !18 = metadata !{i32 9, i32 1, metadata !2, null} + +This example illustrates a few important details about LLVM debugging +information. In particular, it shows how the ``llvm.dbg.declare`` intrinsic and +location information, which are attached to an instruction, are applied +together to allow a debugger to analyze the relationship between statements, +variable definitions, and the code used to implement the function. + +.. code-block:: llvm + + call void @llvm.dbg.declare(metadata, metadata !0), !dbg !7 + +The first intrinsic ``%llvm.dbg.declare`` encodes debugging information for the +variable ``X``. The metadata ``!dbg !7`` attached to the intrinsic provides +scope information for the variable ``X``. + +.. code-block:: llvm + + !7 = metadata !{i32 2, i32 7, metadata !1, null} + !1 = metadata !{i32 458763, metadata !2}; [DW_TAG_lexical_block ] + !2 = metadata !{i32 458798, i32 0, metadata !3, metadata !"foo", + metadata !"foo", metadata !"foo", metadata !3, i32 1, + metadata !4, i1 false, i1 true}; [DW_TAG_subprogram ] + +Here ``!7`` is metadata providing location information. It has four fields: +line number, column number, scope, and original scope. The original scope +represents inline location if this instruction is inlined inside a caller, and +is null otherwise. In this example, scope is encoded by ``!1``. 
``!1`` +represents a lexical block inside the scope ``!2``, where ``!2`` is a +:ref:`subprogram descriptor <format_subprograms>`. This way the location +information attached to the intrinsics indicates that the variable ``X`` is +declared at line number 2 at function-level scope in function ``foo``. + +Now let's take another example. + +.. code-block:: llvm + + call void @llvm.dbg.declare(metadata, metadata !12), !dbg !14 + +The third ``%llvm.dbg.declare`` call in the listing encodes debugging information for +variable ``Z``. The metadata ``!dbg !14`` attached to the intrinsic provides +scope information for the variable ``Z``. + +.. code-block:: llvm + + !13 = metadata !{i32 458763, metadata !1}; [DW_TAG_lexical_block ] + !14 = metadata !{i32 5, i32 9, metadata !13, null} + +Here ``!14`` indicates that ``Z`` is declared at line number 5 and +column number 9 inside of lexical scope ``!13``. The lexical scope itself +resides inside of lexical scope ``!1`` described above. + +The scope information attached to each instruction provides a straightforward +way to find instructions covered by a scope. + +.. _ccxx_frontend: + +C/C++ front-end specific debug information +========================================== + +The C and C++ front-ends represent information about the program in a format +that is effectively identical to `DWARF 3.0 +<http://www.eagercon.com/dwarf/dwarf3std.htm>`_ in terms of information +content. This allows code generators to trivially support native debuggers by +generating standard DWARF information, and contains enough information for +non-DWARF targets to translate it as needed. + +This section describes the forms used to represent C and C++ programs. Other +languages could pattern themselves after this (which itself is tuned to +representing programs in the same way that DWARF 3 does), or they could choose +to provide completely different forms if they don't fit into the DWARF model. 
+As support for debugging information gets added to the various LLVM +source-language front-ends, the information used should be documented here. + +The following sections provide examples of various C/C++ constructs and the +debug information that would best describe those constructs. + +C/C++ source file information +----------------------------- + +Given the source files ``MySource.cpp`` and ``MyHeader.h`` located in the +directory ``/Users/mine/sources``, the following code: + +.. code-block:: c + + #include "MyHeader.h" + + int main(int argc, char *argv[]) { + return 0; + } + +a C/C++ front-end would generate the following descriptors: + +.. code-block:: llvm + + ... + ;; + ;; Define the compile unit for the main source file "/Users/mine/sources/MySource.cpp". + ;; + !2 = metadata !{ + i32 524305, ;; Tag + i32 0, ;; Unused + i32 4, ;; Language Id + metadata !"MySource.cpp", + metadata !"/Users/mine/sources", + metadata !"4.2.1 (Based on Apple Inc. build 5649) (LLVM build 00)", + i1 true, ;; Main Compile Unit + i1 false, ;; Optimized compile unit + metadata !"", ;; Compiler flags + i32 0} ;; Runtime version + + ;; + ;; Define the file descriptor for the file "/Users/mine/sources/MySource.cpp". + ;; + !1 = metadata !{ + i32 524329, ;; Tag + metadata !"MySource.cpp", + metadata !"/Users/mine/sources", + metadata !2 ;; Compile unit + } + + ;; + ;; Define the file descriptor for the file "/Users/mine/sources/MyHeader.h" + ;; + !3 = metadata !{ + i32 524329, ;; Tag + metadata !"MyHeader.h", + metadata !"/Users/mine/sources", + metadata !2 ;; Compile unit + } + + ... + +``llvm::Instruction`` provides easy access to metadata attached to an +instruction. One can extract line number information encoded in LLVM IR using +``Instruction::getMetadata()`` and ``DILocation::getLineNumber()``. + +.. 
code-block:: c++ + + if (MDNode *N = I->getMetadata("dbg")) { // Here I is an LLVM instruction + DILocation Loc(N); // DILocation is in DebugInfo.h + unsigned Line = Loc.getLineNumber(); + StringRef File = Loc.getFilename(); + StringRef Dir = Loc.getDirectory(); + } + +C/C++ global variable information +--------------------------------- + +Given an integer global variable declared as follows: + +.. code-block:: c + + int MyGlobal = 100; + +a C/C++ front-end would generate the following descriptors: + +.. code-block:: llvm + + ;; + ;; Define the global itself. + ;; + %MyGlobal = global int 100 + ... + ;; + ;; List of debug info of globals + ;; + !llvm.dbg.cu = !{!0} + + ;; Define the compile unit. + !0 = metadata !{ + i32 786449, ;; Tag + i32 0, ;; Context + i32 4, ;; Language + metadata !"foo.cpp", ;; File + metadata !"/Volumes/Data/tmp", ;; Directory + metadata !"clang version 3.1 ", ;; Producer + i1 true, ;; Deprecated field + i1 false, ;; "isOptimized"? + metadata !"", ;; Flags + i32 0, ;; Runtime Version + metadata !1, ;; Enum Types + metadata !1, ;; Retained Types + metadata !1, ;; Subprograms + metadata !3 ;; Global Variables + } ; [ DW_TAG_compile_unit ] + + ;; The Array of Global Variables + !3 = metadata !{ + metadata !4 + } + + !4 = metadata !{ + metadata !5 + } |