aboutsummaryrefslogtreecommitdiff
path: root/test/CodeGen
AgeCommit message (Collapse)Author
2012-09-10Don't attempt to use flags from predicated instructions.Jakob Stoklund Olesen
The ARM backend can eliminate cmp instructions by reusing flags from a nearby sub instruction with similar arguments. Don't do that if the sub is predicated - the flags are not written unconditionally. <rdar://problem/12263428> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163535 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10Stack Coloring: Handle the case where END markers come before BEGIN markers ↵Nadav Rotem
properly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163530 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10Enhance PR11334 fix to support extload from v2f32/v4f32Michael Liao
- Fix an remaining issue of PR11674 as well git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163528 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10Add boolean simplification support from CMOVMichael Liao
- If a boolean value is generated from CMOV and tested as boolean value, simplify the use of test result by referencing the original condition. RDRAND intrinisc is one of such cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163516 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10Fix an assertion failure when optimising a shufflevector incorrectly into ↵James Molloy
concat_vectors, and a followup bug with SelectionDAG::getNode() creating nodes with invalid types. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163511 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10Stack Coloring: Add support for multiple regions of the same slot, within a ↵Nadav Rotem
single basic block. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163507 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10The VPSHUFB 256-bit instruction may be generated when one of input vector is ↵Elena Demikhovsky
undefined or zeroinitializer. I've added the "zeroinitializer" case in this patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163506 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10Teach the DAGBuilder about lifetime markers which are generated from PHINodes.Nadav Rotem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163494 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-09Teach DAG combiner to constant fold fneg of a BUILD_VECTOR of constants.Craig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163483 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-08Add instruction selection for ffloor of vectors when SSE4.1 or AVX is enabled.Craig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163473 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-08Add support for lowering FABS of vector types.Craig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163461 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-08Set operation action for FFLOOR to Expand for all vector types for X86. Set ↵Craig Topper
FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163458 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06Allow overlaps between virtreg and physreg live ranges.Jakob Stoklund Olesen
The RegisterCoalescer understands overlapping live ranges where one register is defined as a copy of the other. With this change, register allocators using LiveRegMatrix can do the same, at least for copies between physical and virtual registers. When a physreg is defined by a copy from a virtreg, allow those live ranges to overlap: %CL<def> = COPY %vreg11:sub_8bit; GR32_ABCD:%vreg11 %vreg13<def,tied1> = SAR32rCL %vreg13<tied0>, %CL<imp-use,kill> We can assign %vreg11 to %ECX, overlapping the live range of %CL. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163336 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06Disable stack coloring by default in order to resolve the i386 failures.Nadav Rotem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06AVX2 optimization.Elena Demikhovsky
Added generation of VPSHUB instruction for <32 x i8> vector shuffle when possible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06Fix the test by specifying an exact cpu model.Nadav Rotem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163307 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06Improve codegen for BUILD_VECTORs on ARM.James Molloy
If we have a BUILD_VECTOR that is mostly a constant splat, it is often better to splat that constant then insertelement the non-constant lanes instead of insertelementing every lane from an undef base. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163304 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06Add a new optimization pass: Stack Coloring, that merges disjoint static ↵Nadav Rotem
allocations (allocas). Allocas are known to be disjoint if they are marked by disjoint lifetime markers (@llvm.lifetime.XXX intrinsics). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163299 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06Optimize codegen for VSETLNi{8,16,32} operating on Q registers. Degenerate ↵James Molloy
to a VSETLN on D registers, instead of an (INSERT_SUBREG (VSETLN (EXTRACT_SUBREG ))) sequence to help the register coalescer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163298 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06Add patterns for converting stores of subvector_extracts of lower 128-bits ↵Craig Topper
of a 256-bit vector to VMOVAPSmr/VMOVUPSmr. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163292 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05Use predication instead of pseudo-opcodes when folding into MOVCC.Jakob Stoklund Olesen
Now that it is possible to dynamically tie MachineInstr operands, predicated instructions are possible in SSA form: %vreg3<def> = SUBri %vreg1, -2147483647, pred:14, pred:%noreg, %opt:%noreg %vreg4<def,tied1> = MOVCCr %vreg3<tied0>, %vreg1, %pred:12, pred:%CPSR Becomes a predicated SUBri with a tied imp-use: SUBri %vreg1, -2147483647, pred:13, pred:%CPSR, opt:%noreg, %vreg1<imp-use,tied0> This means that any instruction that is safe to move can be folded into a MOVCC, and the *CC pseudo-instructions are no longer needed. The test case changes reflect that Thumb2SizeReduce recognizes the predicated instructions. It didn't understand the pseudos. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163274 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05Strip old MachineInstrs *after* we know we can put them back.Tim Northover
Previous patch accidentally decided it couldn't convert a VFP to a NEON instruction after it had already destroyed the old one. Not a good move. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163230 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05LLVM Bug Fix 13709: Remove needless lsr(Rp, #32) instruction access thePranav Bhandarkar
subreg_hireg of register pair Rp. * lib/Target/Hexagon/HexagonPeephole.cpp(PeepholeDoubleRegsMap): New DenseMap similar to PeepholeMap that additionally records subreg info too. (runOnMachineFunction): Record information in PeepholeDoubleRegsMap and copy propagate the high sub-reg of Rp0 in Rp1 = lsr(Rp0, #32) to the instruction Rx = COPY Rp1:logreg_subreg. * test/CodeGen/Hexagon/remove_lsr.ll: New test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163214 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05Fixed the DAG combiner to better handle the folding of AND nodes for vector ↵Silviu Baranga
types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163203 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05Fix UseInitArray option for MIPS target.Logan Chien
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163193 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04Move tie checks into MachineVerifier::visitMachineOperand.Jakob Stoklund Olesen
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163152 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04Generic Bypass Slow DivPreston Gurd
- CodeGenPrepare pass for identifying div/rem ops - Backend specifies the type mapping using addBypassSlowDivType - Enabled only for Intel Atom with O2 32-bit -> 8-bit - Replace IDIV with instructions which test its value and use DIVB if the value is positive and less than 256. - In the case when the quotient and remainder of a divide are used a DIV and a REM instruction will be present in the IR. In the non-Atom case they are both lowered to IDIVs and CSE removes the redundant IDIV instruction, using the quotient and remainder from the first IDIV. However, due to this optimization CSE is not able to eliminate redundant IDIV instructions because they are located in different basic blocks. This is overcome by calculating both the quotient (DIV) and remainder (REM) in each basic block that is inserted by the optimization and reusing the result values when a subsequent DIV or REM instruction uses the same operands. - Test cases check for the presents of the optimization when calculating either the quotient, remainder, or both. Patch by Tyler Nowicki! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163150 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04Porting Hexagon MI Scheduler to the new API.Sergei Larin
Change current Hexagon MI scheduler to use new converging scheduler. Integrates DFA resource model into it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163137 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04Patch to implement UMLAL/SMLAL instructions for the ARM architectureArnold Schwaighofer
This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163136 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04This patch optimizes shuffle instruction - generates 2 instructions instead ↵Elena Demikhovsky
of 4. Since this specific shuffle is widely used in many workloads we have ~10% performance on them. shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14> vmovaps (%rdx), %ymm0 vshufps $8, %ymm0, %ymm0, %ymm0 vmovaps (%rcx), %ymm1 vshufps $8, %ymm0, %ymm1, %ymm1 vunpcklps %ymm0, %ymm1, %ymm0 vmovaps (%rcx), %ymm0 vmovsldup (%rdx), %ymm1 vblendps $85, %ymm0, %ymm1, %ymm0 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163134 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02Not all targets have efficient ISel code generation for select instructions.Nadav Rotem
For example, the ARM target does not have efficient ISel handling for vector selects with scalar conditions. This patch adds a TLI hook which allows the different targets to report which selects are supported well and which selects should be converted to CF duting codegen prepare. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163093 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02Generate better select code by allowing the target to use scalar select, and ↵Nadav Rotem
not sign-extend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163086 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01Revert "Take account of boolean vector contents when promoting a build ↵Pete Cooper
vector from i1 to some other type. rdar://problem/12210060" This reverts commit 5dd9e214fb92847e947f9edab170f9b4e52b908f. Thanks to Duncan for explaining how this should have been done. Conflicts: test/CodeGen/X86/vec_select.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01Fix Thumb2 fixup kind in the integrated-as.Logan Chien
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163063 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01Teach DAG combine a number of tricks to simplify FMA expressions in ↵Owen Anderson
fast-math mode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163051 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01llvm/test/CodeGen/X86/fp-fast.ll: Suppress FMA4 on AMD Bulldozer host, ↵NAKAMURA Takumi
corresponding to r162999. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163041 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01Fix Atom bots for r163036.Manman Ren
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163040 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31SelectionDAG: when constructing VZEXT_LOAD from other loads, make sure itsManman Ren
output chain is correctly setup. As an example, if the original load must happen before later stores, we need to make sure the constructed VZEXT_LOAD is constrained to be before the stores. rdar://11457792 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163036 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31Mark FMA4 instructions as commutable and add them to the folding tables.Craig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163035 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31Fix PR12359Michael Liao
- In addition to undefined, if V2 is zero vector, skip 2nd PSHUFB and POR as well as PSHUFB will zero elements with negative indices. Patch by Sriram Murali <sriram.murali@intel.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163018 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31Mark FMA3 instructions as commutable so that the operands to the multiply ↵Craig Topper
part can be commuted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163001 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31Add support for converting llvm.fma to fma4 instructions.Craig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162999 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31Don't enforce ordered inline asm operands.Jakob Stoklund Olesen
I was too optimistic, inline asm can have tied operands that don't follow the def order. Fixes PR13742. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162998 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31llvm/test/CodeGen/X86/vec_select.ll: Fix failure on xmm-less hosts, to add ↵NAKAMURA Takumi
-mattr=+sse2. FIXME: Should this be tested with both +avx and -avx,+sse2? git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162983 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31Fix a couple of typos in EmitAtomic.Jakob Stoklund Olesen
Thumb2 instructions are mostly constrained to rGPR, not tGPR which is for Thumb1. rdar://problem/12203728 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162968 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30Take account of boolean vector contents when promoting a build vector from ↵Pete Cooper
i1 to some other type. rdar://problem/12210060 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162960 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30Try to make this test more generic to unbreak buildbots.Owen Anderson
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162958 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30Teach the DAG combiner to turn chains of FADDs (x+x+x+x+...) into FMULs by ↵Owen Anderson
constants. This is only enabled in unsafe FP math mode, since it does not preserve rounding effects for all such constants. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30Currently targets that do not support selects with scalar conditions and ↵Nadav Rotem
vector operands - scalarize the code. ARM is such a target because it does not support CMOV of vectors. To implement this efficientlyi, we broadcast the condition bit and use a sequence of NAND-OR to select between the two operands. This is the same sequence we use for targets that don't have vector BLENDs (like SSE2). rdar://12201387 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162926 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30Introduce 'UseSSEx' to force SSE legacy encodingMichael Liao
- Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is enabled. As the penalty of inter-mixing SSE and AVX instructions, we need prevent SSE legacy insn from being generated except explicitly specified through some intrinsics. For patterns supported by both SSE and AVX, so far, we force AVX insn will be tried first relying on AddedComplexity or position in td file. It's error-prone and introduces bugs accidentally. 'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited by AVX, we need this predicate to force VEX encoding or SSE legacy encoding only. For insns not inherited by AVX, we still use the previous predicates, i.e. 'HasSSEx'. So far, these insns fall into the following categories: * SSE insns with MMX operands * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH, CRC, and etc.) * SSE4A insns. * MMX insns. * x87 insns added by SSE. 2 test cases are modified: - test/CodeGen/X86/fast-isel-x86-64.ll AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be selected by fast-isel due to complicated pattern and fast-isel fallback to materialize it from constant pool. - test/CodeGen/X86/widen_load-1.ll AVX code generation is different from SSE one after fixing SSE/AVX inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of 'vmovaps'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162919 91177308-0d34-0410-b5e6-96231b3b80d8