diff options
-rw-r--r-- | docs/LangRef.rst | 111 | ||||
-rw-r--r-- | include/llvm/Analysis/LoopInfo.h | 14 | ||||
-rw-r--r-- | lib/Analysis/LoopInfo.cpp | 50 | ||||
-rw-r--r-- | lib/Transforms/Vectorize/LoopVectorize.cpp | 8 |
4 files changed, 183 insertions, 0 deletions
diff --git a/docs/LangRef.rst b/docs/LangRef.rst index f8e22c8d7b..c3b776ea01 100644 --- a/docs/LangRef.rst +++ b/docs/LangRef.rst @@ -2522,6 +2522,117 @@ Examples: !2 = metadata !{ i8 0, i8 2, i8 3, i8 6 } !3 = metadata !{ i8 -2, i8 0, i8 3, i8 6 } +'``llvm.loop``' +^^^^^^^^^^^^^^^ + +It is sometimes useful to attach information to loop constructs. Currently, +loop metadata is implemented as metadata attached to the branch instruction +in the loop latch block. This type of metadata refer to a metadata node that is +guaranteed to be separate for each loop. The loop-level metadata is prefixed +with ``llvm.loop``. + +The loop identifier metadata is implemented using a metadata that refers to +itself as follows: + +.. code-block:: llvm + !0 = metadata !{ metadata !0 } + +'``llvm.loop.parallel``' Metadata +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This loop metadata can be used to communicate that a loop should be considered +a parallel loop. The semantics of parallel loops in this case is the one +with the strongest cross-iteration instruction ordering freedom: the +iterations in the loop can be considered completely independent of each +other (also known as embarrassingly parallel loops). + +This metadata can originate from a programming language with parallel loop +constructs. In such a case it is completely the programmer's responsibility +to ensure the instructions from the different iterations of the loop can be +executed in an arbitrary order, in parallel, or intertwined. No loop-carried +dependency checking at all must be expected from the compiler. + +In order to fulfill the LLVM requirement for metadata to be safely ignored, +it is important to ensure that a parallel loop is converted to +a sequential loop in case an optimization (agnostic of the parallel loop +semantics) converts the loop back to such. This happens when new memory +accesses that do not fulfill the requirement of free ordering across iterations +are added to the loop. Therefore, this metadata is required, but not +sufficient, to consider the loop at hand a parallel loop. For a loop +to be parallel, all its memory accessing instructions need to be +marked with the ``llvm.mem.parallel_loop_access`` metadata that refer +to the same loop identifier metadata that identify the loop at hand. + +'``llvm.mem``' +^^^^^^^^^^^^^^^ + +Metadata types used to annotate memory accesses with information helpful +for optimizations are prefixed with ``llvm.mem``. + +'``llvm.mem.parallel_loop_access``' Metadata +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +For a loop to be parallel, in addition to using +the ``llvm.loop.parallel`` metadata to mark the loop latch branch instruction, +also all of the memory accessing instructions in the loop body need to be +marked with the ``llvm.mem.parallel_loop_access`` metadata. If there +is at least one memory accessing instruction not marked with the metadata, +the loop, despite it possibly using the ``llvm.loop.parallel`` metadata, +must be considered a sequential loop. This causes parallel loops to be +converted to sequential loops due to optimization passes that are unaware of +the parallel semantics and that insert new memory instructions to the loop +body. + +Example of a loop that is considered parallel due to its correct use of +both ``llvm.loop.parallel`` and ``llvm.mem.parallel_loop_access`` +metadata types that refer to the same loop identifier metadata. + +.. code-block:: llvm + + for.body: + ... + %0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0 + ... + store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0 + ... + br i1 %exitcond, label %for.end, label %for.body, !llvm.loop.parallel !0 + + for.end: + ... + !0 = metadata !{ metadata !0 } + +It is also possible to have nested parallel loops. In that case the +memory accesses refer to a list of loop identifier metadata nodes instead of +the loop identifier metadata node directly: + +.. code-block:: llvm + + outer.for.body: + ... + + inner.for.body: + ... + %0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0 + ... + store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0 + ... + br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop.parallel !1 + + inner.for.end: + ... + %0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0 + ... + store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0 + ... + br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop.parallel !2 + + outer.for.end: ; preds = %for.body + ... + !0 = metadata !{ metadata !1, metadata !2 } ; a list of parallel loop identifiers + !1 = metadata !{ metadata !1 } ; an identifier for the inner parallel loop + !2 = metadata !{ metadata !2 } ; an identifier for the outer parallel loop + + Module Flags Metadata ===================== diff --git a/include/llvm/Analysis/LoopInfo.h b/include/llvm/Analysis/LoopInfo.h index 9caa69a1d4..a20e065785 100644 --- a/include/llvm/Analysis/LoopInfo.h +++ b/include/llvm/Analysis/LoopInfo.h @@ -377,6 +377,20 @@ public: /// isSafeToClone - Return true if the loop body is safe to clone in practice. bool isSafeToClone() const; + /// Returns true if the loop is annotated parallel. + /// + /// A parallel loop can be assumed to not contain any dependencies between + /// iterations by the compiler. That is, any loop-carried dependency checking + /// can be skipped completely when parallelizing the loop on the target + /// machine. Thus, if the parallel loop information originates from the + /// programmer, e.g. via the OpenMP parallel for pragma, it is the + /// programmer's responsibility to ensure there are no loop-carried + /// dependencies. The final execution order of the instructions across + /// iterations is not guaranteed, thus, the end result might or might not + /// implement actual concurrent execution of instructions across multiple + /// iterations. + bool isAnnotatedParallel() const; + /// hasDedicatedExits - Return true if no exit block for the loop /// has a predecessor that is outside the loop. bool hasDedicatedExits() const; diff --git a/lib/Analysis/LoopInfo.cpp b/lib/Analysis/LoopInfo.cpp index 4d4c627165..f1ad6506e4 100644 --- a/lib/Analysis/LoopInfo.cpp +++ b/lib/Analysis/LoopInfo.cpp @@ -24,6 +24,7 @@ #include "llvm/Assembly/Writer.h" #include "llvm/IR/Constants.h" #include "llvm/IR/Instructions.h" +#include "llvm/IR/Metadata.h" #include "llvm/Support/CFG.h" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" @@ -233,6 +234,55 @@ bool Loop::isSafeToClone() const { return true; } +bool Loop::isAnnotatedParallel() const { + + BasicBlock *latch = getLoopLatch(); + if (latch == NULL) + return false; + + MDNode *desiredLoopIdMetadata = + latch->getTerminator()->getMetadata("llvm.loop.parallel"); + + if (!desiredLoopIdMetadata) + return false; + + // The loop branch contains the parallel loop metadata. In order to ensure + // that any parallel-loop-unaware optimization pass hasn't added loop-carried + // dependencies (thus converted the loop back to a sequential loop), check + // that all the memory instructions in the loop contain parallelism metadata + // that point to the same unique "loop id metadata" the loop branch does. + for (block_iterator BB = block_begin(), BE = block_end(); BB != BE; ++BB) { + for (BasicBlock::iterator II = (*BB)->begin(), EE = (*BB)->end(); + II != EE; II++) { + + if (!II->mayReadOrWriteMemory()) + continue; + + if (!II->getMetadata("llvm.mem.parallel_loop_access")) + return false; + + // The memory instruction can refer to the loop identifier metadata + // directly or indirectly through another list metadata (in case of + // nested parallel loops). The loop identifier metadata refers to + // itself so we can check both cases with the same routine. + MDNode *loopIdMD = + dyn_cast<MDNode>(II->getMetadata("llvm.mem.parallel_loop_access")); + bool loopIdMDFound = false; + for (unsigned i = 0, e = loopIdMD->getNumOperands(); i < e; ++i) { + if (loopIdMD->getOperand(i) == desiredLoopIdMetadata) { + loopIdMDFound = true; + break; + } + } + + if (!loopIdMDFound) + return false; + } + } + return true; +} + + /// hasDedicatedExits - Return true if no exit block for the loop /// has a predecessor that is outside the loop. bool Loop::hasDedicatedExits() const { diff --git a/lib/Transforms/Vectorize/LoopVectorize.cpp b/lib/Transforms/Vectorize/LoopVectorize.cpp index 9fb451b0a9..842ae0291e 100644 --- a/lib/Transforms/Vectorize/LoopVectorize.cpp +++ b/lib/Transforms/Vectorize/LoopVectorize.cpp @@ -2276,6 +2276,14 @@ void LoopVectorizationLegality::collectLoopUniforms() { } bool LoopVectorizationLegality::canVectorizeMemory() { + + if (TheLoop->isAnnotatedParallel()) { + DEBUG(dbgs() + << "LV: A loop annotated parallel, ignore memory dependency " + << "checks.\n"); + return true; + } + typedef SmallVector<Value*, 16> ValueVector; typedef SmallPtrSet<Value*, 16> ValueSet; // Holds the Load and Store *instructions*. |