aboutsummaryrefslogtreecommitdiff
path: root/test/CodeGen
AgeCommit message (Collapse)Author
2012-05-09Add another peephole pattern for conditional moves.Akira Hatanaka
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156460 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09Make register FP allocatable if the compiled function does not have dynamicAkira Hatanaka
allocas. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156458 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09Expand 64-bit shifts if target ABI is O32.Akira Hatanaka
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156457 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-08Remove 256-bit AVX non-temporal store intrinsics. Similar was previously ↵Craig Topper
done for 128-bit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156375 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled.Owen Anderson
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156324 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07Fix a regression from r147481. This combine should only happen if there is aChad Rosier
single use. rdar://11360370 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07X86: optimization for -(x != 0)Manman Ren
This patch will optimize -(x != 0) on X86 FROM cmpl $0x01,%edi sbbl %eax,%eax notl %eax TO negl %edi sbbl %eax %eax In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td: def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>; rdar: 10961709 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07Add support for the 'l' constraint.Eric Christopher
Patch by Jack Carter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156294 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07Add support for the 'c' constraint.Eric Christopher
Patch by Jack Carter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156293 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07Add support for the 'P' constraint.Eric Christopher
Patch by Jack Carter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156292 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07Add support for the 'O' constraint.Eric Christopher
Patch by Jack Carter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156285 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07Add support for the 'N' inline asm constraint.Eric Christopher
Patch by Jack Carter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156284 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07Add support for the 'L' inline asm constraint.Eric Christopher
Patch by Jack Carter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156283 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07Add support for the inline asm constraint 'K'.Eric Christopher
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156282 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07Add SSE4A MOVNTSS/MOVNTSD instructions.Craig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156281 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07Support the 'J' constraint.Eric Christopher
Patch by Jack Carter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156280 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07Add support for the 'I' inline asm constraint. Also add testsEric Christopher
from the previous 2 patches. Patch by Jack Carter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156279 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-06Switch the select to branch transformation on by default.Benjamin Kramer
The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156258 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-05CodeGenPrepare: Add a transform to turn selects into branches in some cases.Benjamin Kramer
This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%: ucomisd (%rdi), %xmm0 cmovbel %edx, %esi cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice. This patch uses a dumb conservative heuristic, it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-Order architectures are unlikely to benefit as well, those are included in the "predictableSelectIsExpensive" flag. It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite direction of this transform. Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the used microarchitecture. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator. Thanks to Chandler for the initial test case to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156234 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-04This patch adds a new NVPTX back-end to LLVM which supports code generation ↵Justin Holewinski
for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it. The new target machines are: nvptx (old ptx32) => 32-bit PTX nvptx64 (old ptx64) => 64-bit PTX The sources are based on the internal NVIDIA NVPTX back-end, and contain more functionality than the current PTX back-end currently provides. NV_CONTRIB git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156196 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-04Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits ↵Sebastian Pop
16-bits encoding of CMN instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156195 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-04Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles.Craig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156156 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03Support for target dependent Hexagon VLIW packetizer.Sirish Pande
This patch creates and optimizes packets as per Hexagon ISA rules. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156109 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the ↵Craig Topper
lower half correctly. Missed in r155982. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156059 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03Fix two-address pass's aggressive instruction commuting heuristics. It's meantEvan Cheng
to catch cases like: %reg1024<def> = MOV r1 %reg1025<def> = MOV r0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 By commuting ADD, it let coalescer eliminate all of the copies. However, there was a bug in the heuristics where it ended up commuting the ADD in: %reg1024<def> = MOV r0 %reg1025<def> = MOV 0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 That did no benefit but rather ensure the last MOV would not be coalesced. rdar://11355268 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156048 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, ↵Owen Anderson
just like it now knows for FMULs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156029 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02Teach DAG combine that multiplication by 1.0 can always be constant folded.Owen Anderson
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156023 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02Revert r155853Manman Ren
The commit is intended to fix rdar://10961709. But it is the root cause of PR12720. Revert it for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155992 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support ↵Craig Topper
for AsmPrinter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155982 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01Strip the pointer casts off of allocas so that the selection DAG can find them.Bill Wendling
PR10799 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155954 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01X86: optimization for max-like structManman Ren
This patch will optimize the following cases on X86 (a > b) ? (a-b) : 0 (a >= b) ? (a-b) : 0 (b < a) ? (a-b) : 0 (b <= a) ? (a-b) : 0 FROM movl %edi, %ecx subl %esi, %ecx cmpl %edi, %esi movl $0, %eax cmovll %ecx, %eax TO xorl %eax, %eax subl %esi, %edi cmovll %eax, %edi movl %edi, %eax rdar: 10734411 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155919 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01Regression test for PR2960.Jay Foad
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155912 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30X86: optimization for -(x != 0)Manman Ren
This patch will optimize -(x != 0) on X86 FROM cmpl $0x01,%edi sbbl %eax,%eax notl %eax TO negl %edi sbbl %eax %eax git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155853 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30test/CodeGen/X86/select.ll: remove spacesManman Ren
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155840 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30Fix fastcc structure return with fast-isel on x86-32Derek Schuff
On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However if the calling convention is fastcc, then a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention X86TargetLowering::LowerCall but is now checked properly in X86FastISel::DoSelectCall. (this time, actually commit what was reviewed!) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155825 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30Don't introduce illegal types when creating vmull operations. <rdar://11324364>Bob Wilson
ARM BUILD_VECTORs created after type legalization cannot use i8 or i16 operands, since those types are not legal. Instead use i32 operands, which will be implicitly truncated by the BUILD_VECTOR to match the element type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-28Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice.Andrew Trick
This time, also fix the caller of AddGlue to properly handle incomplete chains. AddGlue had failure modes, but shamefully hid them from its caller. It's luck ran out. Fixes rdar://11314175: BuildSchedUnits assert. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155749 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27Revert r155745Derek Schuff
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155746 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27Fix fastcc structure return with fast-isel on x86-32Derek Schuff
On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However if the calling convention is fastcc, then a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention X86TargetLowering::LowerCall but is now checked properly in X86FastISel::DoSelectCall. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155745 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27Temporarily revert r155668: Fix the SD scheduler to avoid gluing.Andrew Trick
This definitely caused regression with ARM -mno-thumb. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155743 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27Add x86-specific DAG combine to simplify:Chad Rosier
x == -y --> x+y == 0 x != -y --> x+y != 0 On x86, the generated code goes from negl %esi cmpl %esi, %edi je .LBB0_2 to addl %esi, %edi je .L4 This case is correctly handled for ARM with "cmn". Patch by Manman Ren. rdar://11245199 PR12545 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155739 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27Make test less fragile.Evan Cheng
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155732 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27Fix the order of the operands in the llvm.fma intrinsic patterns for ARM,Lang Hames
<rdar://problem/11325085>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155724 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27X86: Don't emit conditional floating point moves on when targeting ↵Benjamin Kramer
pre-pentiumpro architectures. * Model FPSW (the FPU status word) as a register. * Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions. * During Legalize/Lowering, build a node sequence to transfer the comparison result from FPSW into EFLAGS. If you're wondering about the right-shift: That's an implicit sub-register extraction (%ax -> %ah) which is handled later on by the instruction selector. Fixes PR6679. Patch by Christoph Erhardt! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155704 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27Add mcpu to tests to prevent them from using AVX instructions on Sandy ↵Craig Topper
Bridge after r155618. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155696 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27Implement a bastardized ABI.Evan Cheng
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155686 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27- thumbv6 shouldn't imply +thumb2. Cortex-M0 doesn't suppport 32-bit Thumb2Evan Cheng
instructions. - However, it does support dmb, dsb, isb, mrs, and msr. rdar://11331541 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155685 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-26Fix the SD scheduler to avoid gluing the same node twice.Andrew Trick
DAGCombine strangeness may result in multiple loads from the same offset. They both may try to glue themselves to another load. We could insist that the redundant loads glue themselves to each other, but the beter fix is to bail out from bad gluing at the time we detect it. Fixes rdar://11314175: BuildSchedUnits assert. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155668 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-26Use VLD1 in NEON extenting-load patterns instead of VLDR.Tim Northover
On some cores it's a bad idea for performance to mix VFP and NEON instructions and since these patterns are NEON anyway, the NEON load should be used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155630 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-26If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assumeEvan Cheng
the feature set of v7a. This comes about if the user specifies something like -arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as uxtab in this case. rdar://11318438 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155601 91177308-0d34-0410-b5e6-96231b3b80d8