aboutsummaryrefslogtreecommitdiff
path: root/test/Transforms
AgeCommit message (Collapse)Author
2013-04-14Reorders two transforms that collide with each otherDavid Majnemer
One performs: (X == 13 | X == 14) -> X-13 <u 2 The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1 The problem is that there are certain values of C1 and C2 that trigger both transforms but the first one blocks out the second, this generates suboptimal code. Reordering the transforms should be better in every case and allows us to do interesting stuff like turn: %shr = lshr i32 %X, 4 %and = and i32 %shr, 15 %add = add i32 %and, -14 %tobool = icmp ne i32 %add, 0 into: %and = and i32 %X, 240 %tobool = icmp ne i32 %and, 224 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179493 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-14Make the command line triple match the module triple.Nadav Rotem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179492 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-14Remove unused function attributes.Nadav Rotem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179476 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-14SLPVectorizer: Add support for trees that don't start at binary operators, ↵Nadav Rotem
and add the cost of extracting values from the roots of the tree. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179475 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-14SLPVectorizer: add initial support for reduction variable vectorization.Nadav Rotem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179470 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-13GlobalDCE: Fix an oversight in my last commit that could lead to crashes.Benjamin Kramer
There is a Constant with non-constant operands: blockaddress. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179460 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-13Fix a scalability issue with complex ConstantExprs.Benjamin Kramer
This is basically the same fix in three different places. We use a set to avoid walking the whole tree of a big ConstantExprs multiple times. For example: (select cmp, (add big_expr 1), (add big_expr 2)) We don't want to visit big_expr twice here, it may consist of thousands of nodes. The testcase exercises this by creating an insanely large ConstantExprs out of a loop. It's questionable if the optimizer should ever create those, but this can be triggered with real C code. Fixes PR15714. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179458 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-12InstCombine: Check the operand types before merging fcmp ord & fcmp ord.Benjamin Kramer
Fixes PR15737. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179417 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-12SLPVectorizer: add support for vectorization of diamond shaped trees. We now ↵Nadav Rotem
perform a preliminary traversal of the graph to collect values with multiple users and check where the users came from. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179414 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-12CostModel: increase the default cost of supported floating point operations ↵Nadav Rotem
from 1 to two. Fixed a few tests that changes because now the cost of one insert + a vector operation on two doubles is lower than two scalar operations on doubles. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179413 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-12Have the StripMetadata pass also strip unsupported named metadata.Jan Voung
Previously it was just instruction attachments. Depends on a driver change to actually work: https://codereview.chromium.org/14072004/ BUG= http://code.google.com/p/nativeclient/issues/detail?id=3348 Review URL: https://codereview.chromium.org/14022009
2013-04-12PNaCl: extend GlobalCleanup to null-out extern_weak function references, and ↵Derek Schuff
extend the ABI checker to verify. Also promote weak symbols to internal ones, which was required because the scons barebones tests define weak versions of memcpy/memset BUG= https://code.google.com/p/nativeclient/issues/detail?id=3339 TEST= llvm-regression (globalcleanup.ll), scons barebones tests Review URL: https://codereview.chromium.org/13989005
2013-04-12Simplify (A & ~B) in icmp if A is a power of 2David Majnemer
The transform will execute like so: (A & ~B) == 0 --> (A & B) != 0 (A & ~B) != 0 --> (A & B) == 0 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179386 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-12LoopVectorizer: integer division is not a reduction operationArnold Schwaighofer
Don't classify idiv/udiv as a reduction operation. Integer division is lossy. For example : (1 / 2) * 4 != 4/2. Example: int a[] = { 2, 5, 2, 2} int x = 80; for() x /= a[i]; Scalar: x /= 2 // = 40 x /= 5 // = 8 x /= 2 // = 4 x /= 2 // = 2 Vectorized: <80, 1> / <2,5> //= <40,0> <40, 0> / <2,2> //= <20,0> 20*0 = 0 radar://13640654 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179381 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-11Optimize icmp involving addition betterDavid Majnemer
Allows LLVM to optimize sequences like the following: %add = add nsw i32 %x, 1 %cmp = icmp sgt i32 %add, %y into: %cmp = icmp sge i32 %x, %y as well as: %add1 = add nsw i32 %x, 20 %add2 = add nsw i32 %y, 57 %cmp = icmp sge i32 %add1, %add2 into: %add = add nsw i32 %y, 37 %cmp = icmp sle i32 %cmp, %x git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179316 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-11Fix for wrong instcombine on vector insert/extractBenjamin Kramer
When trying to collapse sequences of insertelement/extractelement instructions into single shuffle instructions, there is one specific case where the Instruction Combiner wrongly updates the resulting Mask of shuffle indexes. The problem is in function CollectShuffleElments. If we have a sequence of insert/extract element instructions like the one below: %tmp1 = extractelement <4 x float> %LHS, i32 0 %tmp2 = insertelement <4 x float> %RHS, float %tmp1, i32 1 %tmp3 = extractelement <4 x float> %RHS, i32 2 %tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 3 Where: . %RHS will have a mask of [4,5,6,7] . %LHS will have a mask of [0,1,2,3] The Mask of shuffle indexes is wrongly computed to [4,1,6,7] instead of [4,0,6,7]. When analyzing %tmp2 in order to compute the Mask for the resulting shuffle instruction, the algorithm forgets to update the mask index at position 1 with the index associated to the element extracted from %LHS by instruction %tmp1. Patch by Andrea DiBiagio! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179291 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-11Add missing colons to check lines.Benjamin Kramer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179277 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-11FileCheckize a bunch of tests.Benjamin Kramer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179276 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-10Make the SLP store-merger less paranoid about function calls. We check for ↵Nadav Rotem
function calls when we check if it is safe to sink instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179207 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09Add support for bottom-up SLP vectorization infrastructure.Nadav Rotem
This commit adds the infrastructure for performing bottom-up SLP vectorization (and other optimizations) on parallel computations. The infrastructure has three potential users: 1. The loop vectorizer needs to be able to vectorize AOS data structures such as (sum += A[i] + A[i+1]). 2. The BB-vectorizer needs this infrastructure for bottom-up SLP vectorization, because bottom-up vectorization is faster to compute. 3. A loop-roller needs to be able to analyze consecutive chains and roll them into a loop, in order to reduce code size. A loop roller does not need to create vector instructions, and this infrastructure separates the chain analysis from the vectorization. This patch also includes a simple (100 LOC) bottom up SLP vectorizer that uses the infrastructure, and can vectorize this code: void SAXPY(int *x, int *y, int a, int i) { x[i] = a * x[i] + y[i]; x[i+1] = a * x[i+1] + y[i+1]; x[i+2] = a * x[i+2] + y[i+2]; x[i+3] = a * x[i+3] + y[i+3]; } git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179117 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09Revert r176408 and r176407 to address PR15540.Nadav Rotem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179111 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09Converted 8x tests of SimplifyCFG to use FileCheck instead of grep.Michael Gottesman
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179087 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09Revert 179071 because it is not the right way to support non standard ↵Nadav Rotem
new/new[] operators. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179084 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-08c++ new operators are not malloc-like functions because they do not return ↵Nadav Rotem
uninitialized memory. Users may overide new-operators and implement any function that they like. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179071 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07Fix PR15674 (and PR15603): a SROA think-o.Chandler Carruth
The fix for PR14972 in r177055 introduced a real think-o in the *store* side, likely because I was much more focused on the load side. While we can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily widen a value to be stored, as that changes the width of memory access! Lock down the code path in the store rewriting which would do this to only handle the intended circumstance. All of the existing tests continue to pass, and I've added a test from the PR. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178974 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05An objc_retain can serve as a use for a different pointer.Michael Gottesman
This is the counterpart to commit r160637, except it performs the action in the bottomup portion of the data flow analysis. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178922 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05Properly model precise lifetime when given an incomplete dataflow sequence.Michael Gottesman
The normal dataflow sequence in the ARC optimizer consists of the following states: Retain -> CanRelease -> Use -> Release The optimizer before this patch stored the uses that determine the lifetime of the retainable object pointer when it bottom up hits a retain or when top down it hits a release. This is correct for an imprecise lifetime scenario since what we are trying to do is remove retains/releases while making sure that no ``CanRelease'' (which is usually a call) deallocates the given pointer before we get to the ``Use'' (since that would cause a segfault). If we are considering the precise lifetime scenario though, this is not correct. In such a situation, we *DO* care about the previous sequence, but additionally, we wish to track the uses resulting from the following incomplete sequences: Retain -> CanRelease -> Release (TopDown) Retain <- Use <- Release (BottomUp) *NOTE* This patch looks large but the most of it consists of updating test cases. Additionally this fix exposed an additional bug. I removed the test case that expressed said bug and will recommit it with the fix in a little bit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178921 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05Disable the optimization about promoting vector-element-access with symbolic ↵Shuxin Yang
index. This optimization is unstable at this moment; it 1) block us on a very important application 2) PR15200 3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll (the CHECK command compare the output against wrong result) I personally believe this optimization should not have any impact on the autovectorized code, as auto-vectorizer is supposed to put gather/scatter in a "right" way. Although in theory downstream optimizaters might reveal some gather/scatter optimization opportunities, the chance is quite slim. For the hand-crafted vectorizing code, in term of redundancy elimination, load-CSE, copy-propagation and DSE can collectively achieve the same result, but in much simpler way. On the other hand, these optimizers are able to improve the code in a incremental way; in contrast, SROA is sort of all-or-none approach. However, SROA might slighly win in stack size, as it tries to figure out a stretch of memory tightenly cover the area accessed by the dynamic index. rdar://13174884 PR15200 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178912 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04LoopVectorizer: Pass OperandValueKind information to the cost modelArnold Schwaighofer
Pass down the fact that an operand is going to be a vector of constants. This should bring the performance of MultiSource/Benchmarks/PAQ8p/paq8p on x86 back. It had degraded to scalar performance due to my pervious shift cost change that made all shifts expensive on x86. radar://13576547 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178809 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04PNaCl: Change ExpandVarArgs to work around invalid use of va_argMark Seaborn
I've encountered two uses of va_arg on an empty argument list: NaCl's open() wrapper (now fixed), and 176.gcc in Spec2k. Although this is invalid use of va_arg (giving undefined behaviour), it's too awkward to fix 176.gcc. Work around this invalid usage by ensuring that the argument list pointer points to a location on the stack rather than being uninitialised. va_arg will return undefined values as it does in the usual native ABIs, rather than crashing. We could add options for non-strict and strict modes (possibly with a bounds check), but that's too much complication for now. BUG=https://code.google.com/p/nativeclient/issues/detail?id=3338 TEST=test/Transforms/NaCl/*.ll + ran 176.gcc from Spec2k Review URL: https://codereview.chromium.org/13510004
2013-04-03Remove an optimization where we were changing an objc_autorelease into an ↵Michael Gottesman
objc_autoreleaseReturnValue. The semantics of ARC implies that a pointer passed into an objc_autorelease must live until some point (potentially down the stack) where an autorelease pool is popped. On the other hand, an objc_autoreleaseReturnValue just signifies that the object must live until the end of the given function at least. Thus objc_autorelease is stronger than objc_autoreleaseReturnValue in terms of the semantics of ARC* implying that performing the given strength reduction without any knowledge of how this relates to the autorelease pool pop that is further up the stack violates the semantics of ARC. *Even though objc_autoreleaseReturnValue if you know that no RV optimization will occur is more computationally expensive. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178612 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02Use a worklist to avoid a sneaky iterator invalidation.Bill Wendling
The iterator could be invalidated when it's recursively deleting a whole bunch of constant expressions in a constant initializer. Note: This was only reproducible if `opt' was run on a `.bc' file. If `opt' was run on a `.ll' file, it wouldn't crash. This is why the test first pushes the `.ll' file through `llvm-as' before feeding it to `opt'. PR15440 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178531 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-01Correct assertion conditionShuxin Yang
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178484 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-01X86TTI: Add accurate costs for itofp operations, based on the actual ↵Benjamin Kramer
instruction counts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178459 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-30Implement XOR reassociation. It is based on following rules:Shuxin Yang
rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2), only useful when c1=c2 rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2)) rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2 rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2 It reduces an application's size (in terms of # of instructions) by 8.9%. Reviwed by Pete Cooper. Thanks a lot! rdar://13212115 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178409 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-29PNaCl: Fix ExpandTls to handle a couple of corner cases involving PHI nodesMark Seaborn
ExpandTls's use of replaceUsesOfWith() didn't work for PHI nodes containing the same Constant twice (which needs to work for same or differing incoming BasicBlocks). The same applies to ExpandTlsConstantExpr. I noticed this while implementing ExpandConstantExpr. Fix this and factor out some common code that all three passes can use. BUG=https://code.google.com/p/nativeclient/issues/detail?id=2837 TEST=test/Transforms/NaCl/*.ll Review URL: https://codereview.chromium.org/13128002
2013-03-29Updated test0 of retain-not-declared.ll to reflect the fact that ↵Michael Gottesman
objc-arc-expand runs before objc-arc/objc-arc-contract. Specifically, objc-arc-expand will make sure that the objc_retainAutoreleasedReturnValue, objc_autoreleaseReturnValue, and ret will all have %call as an argument. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178382 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-29Add clang.arc.used to ModuleHasARC so ARC always runs if said call is ↵Michael Gottesman
present in a module. clang.arc.used is an interesting call for ARC since ObjCARCContract needs to run to remove said intrinsic to avoid a linker error (since the call does not exist). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178369 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-28Add a pass to strip bitcode metadata.Jan Voung
This only works on instruction attachments for now. Since it is a ModulePass we can add something to strip NamedMetadata based on a whitelist, if we want to retain some of that. It does not touch debug metadata, and leaves -strip-debug to handle that. BUG= https://code.google.com/p/nativeclient/issues/detail?id=3348 Review URL: https://codereview.chromium.org/12844027
2013-03-28Non optimizable objc_retainBlock calls are not forwarding.Michael Gottesman
Since we handle optimizable objc_retainBlocks through strength reduction in OptimizableIndividualCalls, we know that all code after that point will only see non-optimizable objc_retainBlock calls. IsForwarding is only called by functions after that point, so it is ok to just classify objc_retainBlock as non-forwarding. <rdar://problem/13249661>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178285 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-28[ObjCARC] Strength reduce objc_retainBlock -> objc_retain if the ↵Michael Gottesman
objc_retainBlock is optimizable. If an objc_retainBlock has the copy_on_escape metadata attached to it AND if the block pointer argument only escapes down the stack, we are allowed to strength reduce the objc_retainBlock to to an objc_retain and thus optimize it. Current there is logic in the ARC data flow analysis to handle this case which is complicated and involved making distinctions in between objc_retainBlock and objc_retain in certain places and considering them the same in others. This patch simplifies said code by: 1. Performing the strength reduction in the initial ARC peephole analysis (ObjCARCOpts::OptimizeIndividualCalls). 2. Changes the ARC dataflow analysis (which runs after the peephole analysis) to consider all objc_retainBlock calls to not be optimizable (since if the call was optimizable, we would have strength reduced it already). This patch leaves in the infrastructure in the ARC dataflow analysis to handle this case, which due to 2 will just be dead code. I am doing this on purpose to separate the removal of the old code from the testing of the new code. <rdar://problem/13249661>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178284 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-28Remove -O3.Akira Hatanaka
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178278 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-28Revert "Adding DIImportedModules to DIScopes."David Blaikie
This reverts commit 342d92c7a0adeabc9ab00f3f0d88d739fe7da4c7. Turns out we're going with a different schema design to represent DW_TAG_imported_modules so we won't need this extra field. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178215 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-28Check if Type is a vector before calling function Type::getVectorNumElements.Akira Hatanaka
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178208 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-27PNaCl: Fix ExpandVarArgs to handle "byval" struct arguments properlyMark Seaborn
* Ensure that the "byval" attribute is preserved for the fixed arguments. Before, it was stripped off from function calls but left in for function definitions, which broke passing struct arguments (which PNaCl Clang generates as "byval"). * Ensure that function and return attributes are preserved too. * Handle "byval" variable arguments in calls too by dereferencing the pointer. These are not yet usable with PNaCl Clang, which does not allow va_arg on structs (see https://code.google.com/p/nativeclient/issues/detail?id=2381), but we might as well make this pass ready to handle this. BUG=https://code.google.com/p/nativeclient/issues/detail?id=3338 TEST=test/Transforms/NaCl/expand-varargs*.ll Review URL: https://codereview.chromium.org/13100002
2013-03-27Added back in the test for arc-annotations.Michael Gottesman
The test was removed since I had not turned off the test during release builds. This fails since ARC annotations support is conditionally compiled out during release builds. I added the proper requires header to assuage this issue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178101 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-27Adding DIImportedModules to DIScopes.David Blaikie
This is just the basic groundwork for supporting DW_TAG_imported_module but I wanted to commit this before pushing support further into Clang or LLVM so that this rather churny change is isolated from the rest of the work. The major churn here is obviously adding another field (within the common DIScope prefix) to all DIScopes (files, classes, namespaces, lexical scopes, etc). This should be the last big churny change needed for DW_TAG_imported_module/using directive support/PR14606. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178099 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-26PNaCl: Add ExpandGetElementPtr pass for converting GetElementPtr to arithmeticMark Seaborn
This is similar to the GEP handling in visitGetElementPtr() in lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp. Once this pass is enabled, it will simplify the language to reduce the set of constructs that a PNaCl translator needs to handle as part of a stable wire format for PNaCl. BUG=https://code.google.com/p/nativeclient/issues/detail?id=3343 TEST=test/Transforms/NaCl/expand-getelementptr.ll Review URL: https://codereview.chromium.org/12849009
2013-03-26Add test case for commit r178031.Ulrich Weigand
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178038 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-26Remove testcase. It's failing on some platforms but not others.Bill Wendling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177956 91177308-0d34-0410-b5e6-96231b3b80d8