path: root/test/CodeGen
Age  Commit message  [Author]
2012-06-19  Add DAG-combines for aggressive FMA formation.  [Lang Hames]
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or FSUB + FMUL. The combines are performed when:
(a) either the AllowExcessFPPrecision option (-enable-excess-fp-precision for llc) or the UnsafeFPMath option (-enable-unsafe-fp-math) is set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of the FADD/FSUB, and
(c) the FMUL has only one user (the FADD/FSUB).
If your target has fast FMA instructions you can make use of these combines by overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for types supported by your FMA instruction, and adding patterns to match ISD::FMA to your FMA instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158757 91177308-0d34-0410-b5e6-96231b3b80d8
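A fused multiply-add rounds once, while a separate FMUL followed by FADD rounds twice, so the two forms can produce different results; that is presumably why the combines are gated on the precision-related options. The following is a minimal Python sketch of the effect, not LLVM code. `fused_mul_add` is a hypothetical helper that emulates a single-rounding FMA with exact rational arithmetic.

```python
from fractions import Fraction


def mul_then_add(a: float, b: float, c: float) -> float:
    """Separate FMUL + FADD: the product is rounded before the add."""
    return a * b + c


def fused_mul_add(a: float, b: float, c: float) -> float:
    """Emulated FMA (hypothetical helper): compute a*b + c exactly, round once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Inputs chosen so that the exact product 1 - 2**-54 is not representable
# as a double and rounds to 1.0 when the multiply is performed on its own.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

unfused = mul_then_add(a, b, c)
fused = fused_mul_add(a, b, c)
print("FMUL + FADD:", unfused)
print("FMA        :", fused)
```

With these inputs the unfused form loses the low-order bits of the product, while the emulated FMA keeps them.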
2012-06-19  Add a triple.  [Jakob Stoklund Olesen]
The test was failing on Linux because of asm syntax differences. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158748 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19  Implement PPCInstrInfo::isCoalescableExtInstr().  [Jakob Stoklund Olesen]
The PPC::EXTSW instruction preserves the low 32 bits of its input, just like some of the x86 instructions. Use it to reduce register pressure when the low 32 bits have multiple uses. This requires a small change to PeepholeOptimizer since EXTSW takes a 64-bit input register. This is related to PR5997. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158743 91177308-0d34-0410-b5e6-96231b3b80d8
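The property this commit relies on, that sign-extending a word leaves the low 32 bits untouched, can be checked with a small model. This is a sketch in Python under the assumption that registers are modelled as unsigned 64-bit integers; `extsw` here is a hypothetical stand-in for the instruction, not LLVM code.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def extsw(reg64: int) -> int:
    """Model of sign-extending the low 32 bits of a 64-bit register."""
    low = reg64 & MASK32
    if low & 0x80000000:
        # Bit 31 is set: fill the upper 32 bits with ones.
        return (low | 0xFFFFFFFF00000000) & MASK64
    return low


# The low 32 bits of the result always equal the low 32 bits of the input,
# so users of the low half can read either register.
for value in (0, 1, 0x7FFFFFFF, 0x80000000, 0xDEADBEEF12345678):
    assert extsw(value) & MASK32 == value & MASK32
```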
2012-06-19  Add support for generating reg+reg preinc stores on PPC.  [Hal Finkel]
PPC will now generate STWUX and friends. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158698 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19  really add a triple :-(  [Rafael Espindola]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158696 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19  Add a triple to the test.  [Rafael Espindola]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158695 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19  Move the support for using .init_array from ARM to the generic TargetLoweringObjectFileELF.  [Rafael Espindola]
Use this to support it on X86. Unlike ARM, on X86 it is not easy to find out if .init_array should be used or not, so the decision is made via TargetOptions and defaults to off. Add a command line option to llc that enables it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158692 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18  ARM: use NEON loads and stores if possible when handling struct byval.  [Manman Ren]
This change is to be enabled in clang. rdar://9877866 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158684 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18  This change handles another case for generating the bic instruction when a compile-time constant is known.  [Joel Jones]
This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. The 8 bit case doesn't need to be handled, as the 8 bit constants are encoded directly, thereby not needing a separate load instruction to form the constant into a register. <rdar://problem/11481151> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158659 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18  Add a regression test for the bug exposed by r158087, which has been temporarily reverted.  [Chandler Carruth]
This test is annoyingly overspecified, but I don't know of another way to thoroughly test the saving and restoring of the registers. While this will have to be adjusted even with the issue fixed in order to re-apply r158087, those adjustments should very clearly indicate that it is still correct (%esp getting restored prior to pops), whereas without it, this case can easily slip under the radar. Still, any suggestions for improvements are very welcome. All credit to Matt Beaumont-Gay for reducing this out of an insane Address Sanitizer crash to a reasonably small seg-faulting C program when built with -mstackrealign. I just reduced it to IR, which was much simpler. =] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158656 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18  Temporarily revert r158087.  [Chandler Carruth]
This patch causes problems when both dynamic stack realignment and dynamic allocas combine in the same function. With this patch, we no longer build the epilog correctly, and silently restore registers from the wrong position in the stack. Thanks to Matt for tracking this down, and getting at least an initial test case to Chad. I'm going to try to check a variation of that test case in so we can easily track the fixes required. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158654 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-16  Clean up trip-count finding for PPC CTR loops (and some bug fixes).  [Hal Finkel]
This cleans up the method used to find trip counts in order to form CTR loops on PPC. This refactoring allows the pass to find loops which have a constant trip count but also happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different classes of loops that are currently ignored. In addition, we now search through all potential induction operations instead of just the first. Also, we check the predicate code on the conditional branch and abort the transformation if the code is not EQ or NE, and we then make sure that the branch to be transformed matches the condition register defined by the comparison (multiple possible comparisons will be considered). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158607 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-15  ARM: optimization for sub+abs.  [Manman Ren]
This patch will optimize abs(x-y)
FROM
  sub, movs, rsbmi
TO
  subs, rsbmi
For abs, we will use cmp instead of movs. This is necessary because we already have an existing peephole pass which optimizes away cmp following sub. rdar: 11633193 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158551 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-15  Preserve <undef> flags in ARMExpandPseudo.  [Jakob Stoklund Olesen]
This probably mostly shows up in bugpoint-generated code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158527 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14  1. Introduce MipsPat in place of Pat in order to exclude those patterns from being used by Mips16 or Micro Mips.  [Akira Hatanaka]
2. Clean up a few overlong lines encountered along the way. Patch by Reed Kotler. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158470 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14  Make machine verifier check the first instruction of the last bundle instead of the last instruction of a basic block.  [Akira Hatanaka]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158468 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14  Revert: test/CodeGen/ARM/iabs.ll in r158441  [Manman Ren]
Sorry that I accidentally checked in this file with my previous commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158442 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14  InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y).  [Manman Ren]
uno && ueq was converted to ueq; it should be converted to uno. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158441 91177308-0d34-0410-b5e6-96231b3b80d8
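The corrected fold can be checked against the meaning of the two predicates: uno is true when either operand is NaN, and ueq is true when the operands are unordered or equal. The sketch below is plain Python modelling those semantics, not InstCombine code; the helper names are hypothetical.

```python
import math


def fcmp_uno(x: float, y: float) -> bool:
    """Unordered: true if either operand is NaN."""
    return math.isnan(x) or math.isnan(y)


def fcmp_ueq(x: float, y: float) -> bool:
    """Unordered or equal."""
    return fcmp_uno(x, y) or x == y


samples = [0.0, 1.0, -2.5, math.inf, math.nan]
pairs = [(x, y) for x in samples for y in samples]

# (uno && ueq) agrees with uno on every pair ...
conjunction_is_uno = all(
    (fcmp_uno(x, y) and fcmp_ueq(x, y)) == fcmp_uno(x, y) for x, y in pairs
)
# ... but not with ueq: for equal, non-NaN operands ueq is true and uno is false.
conjunction_is_ueq = all(
    (fcmp_uno(x, y) and fcmp_ueq(x, y)) == fcmp_ueq(x, y) for x, y in pairs
)
print(conjunction_is_uno, conjunction_is_ueq)
```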
2012-06-14  Test case for MIPS long branch pass.  [Akira Hatanaka]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158438 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14  Fix test cases.  [Akira Hatanaka]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158435 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13  Implement a DAGCombine in MipsISelLowering.cpp which transforms the following pattern:  [Akira Hatanaka]
  (add v0, (add v1, abs_lo(tjt))) => (add (add v0, v1), abs_lo(tjt))
"tjt" is a TargetJumpTable node. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158419 91177308-0d34-0410-b5e6-96231b3b80d8
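The shape of this rewrite can be illustrated on a toy expression tree. The sketch below is Python rather than SelectionDAG code, and it assumes nodes are represented as tuples such as ("add", lhs, rhs) and ("abs_lo", "tjt"); that representation and the `reassociate` helper are hypothetical.

```python
def reassociate(node):
    """Rewrite (add v0, (add v1, abs_lo(tjt))) as (add (add v0, v1), abs_lo(tjt)).

    Any node that does not match the pattern is returned unchanged.
    """
    if isinstance(node, tuple) and len(node) == 3 and node[0] == "add":
        _, v0, inner = node
        if isinstance(inner, tuple) and len(inner) == 3 and inner[0] == "add":
            _, v1, lo = inner
            if isinstance(lo, tuple) and len(lo) == 2 and lo[0] == "abs_lo":
                # Move the jump-table low part to the outermost add.
                return ("add", ("add", v0, v1), lo)
    return node


before = ("add", "v0", ("add", "v1", ("abs_lo", "tjt")))
after = reassociate(before)
print(after)
```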
2012-06-13  Set a higher value for maxStoresPerMemcpy in MipsISelLowering.cpp.  [Akira Hatanaka]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158414 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13  Implement fastcc calling convention for MIPS.  [Akira Hatanaka]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158410 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13  Fix pattern for MKMSK instruction.  [Richard Osborne]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158409 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13  Fix intrinsics for XOP frczss/sd instructions.  [Craig Topper]
These instructions only take one source register and zero the upper bits of the destination rather than preserving them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158396 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13  Disable use of the .set nomicromips directive until this directive is pushed to the open-source FSF gas.  [Akira Hatanaka]
Patch by Reed Kotler. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158381 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13  sched: fix latency of memory dependence chain edges for consistency.  [Andrew Trick]
For store->load dependencies that may alias, we should always use TrueMemOrderLatency, which may eventually become a subtarget hook. In effect, we should guarantee at least TrueMemOrderLatency on at least one DAG path from a store to a may-alias load. This should fix the standard mode as well as -enable-aa-sched-mi. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158380 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-12  [arm-fast-isel] Add support for -arm-long-calls.  [Chad Rosier]
Patch by Jush Lu <jush.msn@gmail.com>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158368 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11  Fix test that depends on register allocation.  [Jakob Stoklund Olesen]
The test is really checking the prolog/epilog load/store multiple formation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158328 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11  Fix test case to work on ARM.  [Jakob Stoklund Olesen]
Patch by James Benton! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11  Re-enable the CMN instruction.  [Bill Wendling]
We turned off the CMN instruction because it had semantics which we weren't getting correct. If we are comparing with an immediate, then it's okay to use the CMN instruction. <rdar://problem/7569620> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158302 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-10  Enable ILP scheduling for all nodes by default on PPC.  [Hal Finkel]
Over the entire test-suite, this has an insignificantly negative average performance impact, but reduces some of the worst slowdowns from the anti-dep. change (r158294).
Largest speedups:
  SingleSource/Benchmarks/Stanford/Quicksort - 28%
  SingleSource/Benchmarks/Stanford/Towers - 24%
  SingleSource/Benchmarks/Shootout-C++/matrix - 23%
  MultiSource/Benchmarks/SciMark2-C/scimark2 - 19%
  MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15%
(matrix and automotive-bitcount were both in the top-5 slowdown list from the anti-dep. change)
Largest slowdowns:
  MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
  MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26%
  MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21%
  SingleSource/Benchmarks/CoyoteBench/lpbench - 20%
  MultiSource/Applications/d/make_dparser - 16%
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158296 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09  Improve ext/trunc patterns on PPC64.  [Hal Finkel]
The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that would leave self-moves in the final assembly. Replacing those patterns with ones based on the SUBREG builtins yields better-looking code. Thanks to Jakob and Owen for their suggestions in this matter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158283 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09  Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument.  [Craig Topper]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158278 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09  Enable tail merging on PPC.  [Hal Finkel]
Tail merging had been disabled on PPC because it would disturb bundling decisions made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions are made during post-RA scheduling, and tail merging is generally beneficial (the average test-suite speedup is insignificantly positive).
Largest test-suite speedups:
  MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30%
  MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23%
  SingleSource/Benchmarks/Shootout-C++/ary - 21%
  SingleSource/Benchmarks/Stanford/Queens - 17%
Largest slowdowns:
  MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24%
  MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22%
  MultiSource/Applications/JM/ldecod/ldecod - 14%
  MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9%
This is improved by using full (instead of just critical) anti-dependency breaking, but doing so still causes miscompiles and so cannot yet be enabled by default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158259 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08  Don't run RAFast in the optimizing regalloc pipeline.  [Jakob Stoklund Olesen]
The fast register allocator is not supposed to work in the optimizing pipeline. It doesn't make sense to compute live intervals, run full copy coalescing, and then run RAFast. Fast register allocation in the optimizing pipeline is better done by RABasic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158242 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08  Enable PPC CTR loop formation by default.  [Hal Finkel]
Thanks to Jakob's help, this now causes no new test suite failures! Over the entire test suite, this gives an average 1% speedup.
The largest speedups are:
  SingleSource/Benchmarks/Misc/pi - 108%
  SingleSource/Benchmarks/CoyoteBench/lpbench - 54%
  MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50%
  SingleSource/Benchmarks/Shootout/ary3 - 32%
  SingleSource/Benchmarks/Shootout-C++/matrix - 30%
The largest slowdowns are:
  MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30%
  MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25%
  MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22%
  MultiSource/Applications/d/make_dparser - -14%
  SingleSource/Benchmarks/Shootout-C++/ary - -13%
In light of these slowdowns, additional profiling work is obviously needed! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158223 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08  Test case for r158160  [Manman Ren]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158218 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08  Fix a crash in APInt::lshr when shiftAmt > BitWidth.  [Chad Rosier]
Patch by James Benton <jbenton@vmware.com>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158213 91177308-0d34-0410-b5e6-96231b3b80d8
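For a fixed-width integer, a logical right shift by at least the bit width shifts every bit out. The sketch below models that case in Python, assuming the intended result is zero; it is an illustration of the edge case only and does not reproduce APInt's implementation. The `lshr` function and its parameters are hypothetical.

```python
def lshr(value: int, bit_width: int, shift_amt: int) -> int:
    """Logical shift right of a bit_width-wide unsigned value.

    Assumption: shifting by bit_width or more yields zero instead of
    indexing past the end of the value's storage.
    """
    mask = (1 << bit_width) - 1
    if shift_amt >= bit_width:
        return 0
    return (value & mask) >> shift_amt


print(lshr(0xF0, 8, 4), lshr(0xFF, 8, 8), lshr(0xFF, 8, 9))
```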
2012-06-08  test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911.  [NAKAMURA Takumi]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158209 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08  Disable the PPC CTR-Loops pass by default.  [Hal Finkel]
The pass itself works well, but something in the Machine* infrastructure does not understand terminators which define registers. Without the ability to use the block-placement pass, etc. this causes performance regressions (and so is turned off by default). Turning off the analysis turns off the problems with the Machine* infrastructure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158206 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08  Fix a bug in the new PPC CTR-Loops pass.  [Hal Finkel]
The code which tests for an induction operation cannot assume that any ADDI instruction will have a register operand because the operand could also be a frame index; for example:
  %vreg16<def> = ADDI8 <fi#0>, 0; G8RC:%vreg16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158205 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08  Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form CTR-based loop branching code.  [Hal Finkel]
This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are no longer otherwise used. Also, invalid preheader DebugLoc is not used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158204 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07  X86: optimize generated code for integer ABS  [Manman Ren]
This patch will generate the following for integer ABS:
  movl %edi, %eax
  negl %eax
  cmovll %edi, %eax
INSTEAD OF
  movl %edi, %ecx
  sarl $31, %ecx
  leal (%rdi,%rcx), %eax
  xorl %ecx, %eax
There exists a target-independent DAG combine for integer ABS, which converts integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. This is implemented in PerformXorCombine. rdar://10695237 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158175 91177308-0d34-0410-b5e6-96231b3b80d8
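The two instruction sequences compute the same value, including the wraparound case for the most negative 32-bit integer. The following Python sketch models both on unsigned 32-bit bit patterns; the function names are hypothetical and the flag behaviour of negl/cmovll is reduced to the signed comparison it implements.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement value."""
    return x - (1 << 32) if x & SIGN32 else x


def abs_sar_add_xor(x: int) -> int:
    """Target-independent form: sar $31, add, xor."""
    sign = MASK32 if x & SIGN32 else 0      # sarl $31: all ones if negative
    return ((x + sign) & MASK32) ^ sign     # leal (add), then xorl


def abs_neg_cmov(x: int) -> int:
    """X86 form: neg, then conditionally restore the original value."""
    neg = (-x) & MASK32                     # negl computes 0 - x
    # cmovll moves the original back when 0 - x is signed-less-than 0,
    # which holds exactly when x is positive.
    return x if to_signed(x) > 0 else neg


samples = [0, 1, 5, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFB, 0xFFFFFFFF]
assert all(abs_sar_add_xor(x) == abs_neg_cmov(x) for x in samples)
```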
2012-06-07  Use a base register instead of an index register with the local dynamic model.  [Rafael Espindola]
Fixes pr13048. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158158 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07  X86: replace SUB with CMP if possible  [Manman Ren]
This patch will optimize the following
  movq %rdi, %rax
  subq %rsi, %rax
  cmovsq %rsi, %rdi
  movq %rdi, %rax
to
  cmpq %rsi, %rdi
  cmovsq %rsi, %rdi
  movq %rdi, %rax
Perform this optimization if the actual result of SUB is not used. rdar: 11540023 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158126 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-06  Revert r157755.  [Manman Ren]
The commit is intended to fix rdar://11540023. It is implemented as part of peephole optimization. We can actually implement this in the SelectionDAG lowering phase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158122 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-06  Add support for dynamic stack realignment in the presence of dynamic allocas on X86.  [Chad Rosier]
rdar://11496434 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158087 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-05  Revert commit r157966  [Joel Jones]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157972 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04  This change handles another case for generating the bic instruction when a compile-time constant is known.  [Joel Jones]
This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. <rdar://problem/11481151> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157966 91177308-0d34-0410-b5e6-96231b3b80d8