path: root/test/CodeGen
Age | Commit message | Author
2013-04-10 | R600/SI: Add pattern for AMDGPUurecip | Michel Dänzer
21 more little piglits with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179186 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-10 | This is for an experimental option -mips-os16. The idea is to compile all | Reed Kotler
Mips32 code as Mips16 unless it can't be compiled as Mips16. For now this would happen as long as floating point instructions are not needed. Probably it would also make sense to compile as mips32 if atomic operations are needed too. There may be other cases too. A module pass prescans the IR and adds the mips16 or nomips16 attribute to functions depending on the function's needs. Mips16 mode can result in a 40% code compression by utilizing 16 bit encoding of many instructions. The hope is for this to replace the traditional gcc way of dealing with Mips16 code using floating point, which involves essentially using soft float but with a library implemented using mips32 floating point. This gcc method also requires creating stubs so that Mips32 code can interact with these Mips16 functions that have floating point needs. My conjecture is that in reality this traditional gcc method would never win over this new method. I will be implementing the traditional gcc method also. Some of it is already done but I needed to do the stubs to finish the work and those required this mips16/32 mixed mode capability. I have more ideas to make this new method much better and I think the old method will just live in llvm for anyone that needs the backward compatibility, but I don't see what reason there would be for that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179185 91177308-0d34-0410-b5e6-96231b3b80d8
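To make the per-function decision concrete, a hedged C sketch using the existing GCC/Clang source-level attribute spellings (the prescan pass adds the equivalent IR attributes automatically; nothing below is added by this patch):

  /* integer-only code: eligible to stay Mips16 */
  int sum3(int a, int b, int c) { return a + b + c; }

  /* uses floating point: gets the nomips16 treatment and is compiled as Mips32 */
  __attribute__((nomips16)) double scale(double x) { return 2.0 * x; }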
2013-04-10 | R600: Add VTX_READ_* and RAT_WRITE_CACHELESS_* when computing cf addr | Vincent Lejeune
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179174 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-10 | R600/SI: dynamically figure out the reg class of MIMG | Christian König
Depending on the number of bits set in the writemask. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179166 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-10 | R600/SI: adjust writemask to only the used components | Christian König
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179165 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-10 | R600/SI: remove image sample writemask | Christian König
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179164 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-10 | __sincosf_stret returns sinf / cosf in bits 0:31 and 32:63 of xmm0, not in | Evan Cheng
xmm0 / xmm1. rdar://13599493 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179141 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09 | Mips specific inline asm operand modifier 'D' | Jack Carter
Modifier 'D' is to use the second word of a double integer. We had previously implemented the pure register variant of the modifier and this patch implements the memory reference.

  #include "stdio.h"
  int b[8] = {0,1,2,3,4,5,6,7};
  void main()
  {
      int i;
      // The first word. Notice, no 'D'
      {asm ( "lw %0,%1;" : "=r" (i) : "m" (*(b+4)) );}
      printf("%d\n",i);
      // The second word
      {asm ( "lw %0,%D1;" : "=r" (i) : "m" (*(b+4)) );}
      printf("%d\n",i);
  }

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179135 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09 | Allow PPC B and BLR to be if-converted into some predicated forms | Hal Finkel
This enables us to form predicated branches (which are the same conditional branches we had before) and also a larger set of predicated returns (including instructions like bdnzlr which is a conditional return and loop-counter decrement all in one). At the moment, if conversion does not capture all possible opportunities. A simple example is provided in early-ret2.ll, where if conversion forms one predicated return, and then the PPCEarlyReturn pass picks up the other one. So, at least for now, we'll keep both mechanisms. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179134 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09 | This patch enables llvm to switch between compiling for mips32/mips64 | Reed Kotler
and mips16 on a per function basis. Because this patch is somewhat involved I have provided an overview of the key pieces of it. The patch is written so as to not change the behavior of the non mixed mode. We have tested this a lot but it is something new to switch subtargets so we don't want any chance of regression in the mainline compiler until we have more confidence in this. Mips32/64 are very different from Mips16, as is the case of ARM vs Thumb1. For that reason there are derived versions of the register info, frame info, instruction info and instruction selection classes. Now we register three separate passes for instruction selection. One which is used to switch subtargets (MipsModuleISelDAGToDAG.cpp) and then one for each of the current subtargets (Mips16ISelDAGToDAG.cpp and MipsSEISelDAGToDAG.cpp). When the ModuleISel pass runs, it determines if there is a need to switch subtargets and if so, the owning pointers in MipsTargetMachine are appropriately changed. When 16Isel or SEIsel is run, they will return immediately without doing any work if the current subtarget mode does not apply to them. In addition, MipsAsmPrinter needs to be reset on a function basis. The pass BasicTargetTransformInfo is substituted with a null pass since the pass is immutable and really needs to be a function pass for it to be used with changing subtargets. This will be fixed in a follow-on patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179118 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09 | DAGCombiner: Fold a shuffle on CONCAT_VECTORS into a new CONCAT_VECTORS if possible | Benjamin Kramer
This pattern occurs in SROA output due to the way vector arguments are lowered on ARM. The testcase from PR15525 now compiles into this, which is better than the code we got with the old scalarrepl:

  _Store:
          ldr.w   r9, [sp]
          vmov    d17, r3, r9
          vmov    d16, r1, r2
          vst1.8  {d16, d17}, [r0]
          bx      lr

Differential Revision: http://llvm-reviews.chandlerc.com/D647 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179106 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09 | Use virtual base registers on PPC | Hal Finkel
On PowerPC, non-vector loads and stores have r+i forms; however, in functions with large stack frames these were not being used to access slots far from the stack pointer because such slots were out of range for the signed 16-bit immediate offset field. This increases register pressure because we need a separate register for each offset (when the r+r form is used). By enabling virtual base registers, we can deal with large stack frames without unduly increasing register pressure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179105 91177308-0d34-0410-b5e6-96231b3b80d8
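A hedged C sketch of the kind of frame this targets (names and sizes are made up): with a frame this large, some slots sit more than 32767 bytes from the stack pointer, so an r+i access cannot reach them without a base register.

  void big_frame(void) {
    volatile char buf[70000];   /* frame larger than the signed 16-bit offset range */
    buf[0] = 1;                 /* one end of the buffer ...                        */
    buf[69999] = 2;             /* ... and the other; at least one is out of range  */
  }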
2013-04-09 | Convert test PowerPC/2007-09-07-LoadStoreIdxForms to FileCheck | Hal Finkel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179104 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09 | Compute correct frame sizes for SPARC v9 64-bit frames. | Jakob Stoklund Olesen
The save area is twice as big and there is no struct return slot. The stack pointer is always 16-byte aligned (after adding the bias). Also eliminate the stack adjustment instructions around calls when the function has a reserved stack frame. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179083 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-08 | Generate PPC early conditional returns | Hal Finkel
PowerPC has a conditional branch to the link register (return) instruction: BCLR. This should be used any time when we'd otherwise have a conditional branch to a return. This adds a small pass, PPCEarlyReturn, which runs just prior to the branch selection pass (and, importantly, after block placement) to generate these conditional returns when possible. It will also eliminate unconditional branches to returns (these happen rarely; most of the time these have already been tail duplicated by the time PPCEarlyReturn is invoked). This is a nice optimization for small functions that do not maintain a stack frame. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179026 91177308-0d34-0410-b5e6-96231b3b80d8
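A hedged C sketch of the shape that benefits (identifiers are made up): the branch around the early return can become a single conditional blr.

  extern void work(int);

  void maybe_work(int x) {
    if (x == 0)
      return;        /* candidate for a predicated return (BCLR) */
    work(x);
  }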
2013-04-08 | AArch64: remove barriers from AArch64 atomic operations. | Tim Northover
I've managed to convince myself that AArch64's acquire/release instructions are sufficient to guarantee C++11's required semantics, even in the sequentially-consistent case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179005 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07 | Cleanup and improve PPC fsel generation | Hal Finkel
First, we should not cheat: fsel-based lowering of select_cc is a finite-math-only optimization (the ISA manual, section F.3 of v2.06, makes this clear, as does a note in our own README). This also adds fsel-based lowering of EQ and NE condition codes. As it turned out, fsel generation was covered by a grand total of zero regression test cases. I've added some test cases to cover the existing behavior (which is now finite-math only), as well as the new EQ cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179000 91177308-0d34-0410-b5e6-96231b3b80d8
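For a concrete (and hedged) picture of the select_cc shape involved, which is only valid to lower to fsel under finite-math-only / fast-math as noted above:

  double select_ge(double a, double b, double c) {
    return (a >= 0.0) ? b : c;   /* fsel picks b or c based on the sign of a */
  }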
2013-04-07 | Implement LowerCall_64 for the SPARC v9 64-bit ABI. | Jakob Stoklund Olesen
There is still no support for byval arguments (which I don't think are needed) and varargs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178993 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-06 | Implement LowerReturn_64 for SPARC v9. | Jakob Stoklund Olesen
Integer return values are sign or zero extended by the callee, and structs up to 32 bytes in size can be returned in registers. The CC_Sparc64 CallingConv definition is shared between LowerFormalArguments_64 and LowerReturn_64. Function arguments and return values are passed in the same registers. The inreg flag is also used for return values. This is required to handle C functions returning structs containing floats and ints:

  struct ifp {
    int i;
    float f;
  };

  struct ifp f(void);

LLVM IR:

  define inreg { i32, float } @f() {
    ...
    ret { i32, float } %retval
  }

The ABI requires that %retval.i is returned in the high bits of %i0 while %retval.f goes in %f1. Without the inreg return value attribute, %retval.i would go in %i0 and %retval.f would go in %f3, which is a more efficient way of returning multiple values, but it is not ABI compliant for returning C structs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178966 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-06 | SPARC v9 stack pointer bias. | Jakob Stoklund Olesen
64-bit SPARC v9 processes use biased stack and frame pointers, so the current function's stack frame is located at %sp+BIAS .. %fp+BIAS where BIAS = 2047. This makes more local variables directly accessible via [%fp+simm13] addressing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178965 91177308-0d34-0410-b5e6-96231b3b80d8
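As a worked example of the addressing this enables (numbers are illustrative): a local that sits 8 bytes below the unbiased frame pointer is addressed as [%fp + 2047 - 8] = [%fp + 2039], which fits easily in the signed 13-bit immediate range of -4096..4095.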
2013-04-06 | Implement PPCInstrInfo::FoldImmediate | Hal Finkel
There are certain PPC instructions into which we can fold a zero immediate operand. We can detect such cases by looking at the register class required by the using operand (so long as it is not otherwise constrained). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178961 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-06 | Complete formal arguments for the SPARC v9 64-bit ABI. | Jakob Stoklund Olesen
All arguments are formally assigned to stack positions and then promoted to floating point and integer registers. Since there are more floating point registers than integer registers, this can cause situations where floating point arguments are assigned to registers after integer arguments that were assigned to the stack. Use the inreg flag to indicate 32-bit fragments of structs containing both float and int members. The three-way shadowing between stack, integer, and floating point registers requires custom argument lowering. The good news is that return values are passed in the exact same way, and we can share the code. Still missing: - Update LowerReturn to handle structs returned in registers. - LowerCall. - Variadic functions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178958 91177308-0d34-0410-b5e6-96231b3b80d8
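A hedged C sketch of the argument-side analogue of the return-value example in the LowerReturn_64 entry above (the register assignments are assumptions):

  struct ifp { int i; float f; };

  /* p occupies one 8-byte argument slot; the int fragment is shadowed into an
     integer register while the float fragment goes to a floating point register. */
  float use(struct ifp p) { return p.f + (float)p.i; }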
2013-04-05 | R600/SI: Add support for buffer stores v2 | Tom Stellard
v2: - Use the ADDR64 bit Reviewed-by: Christian König <christian.koenig@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178931 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 | R600/SI: Add processor types for each SI variant | Tom Stellard
Reviewed-by: Christian König <christian.koenig@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178928 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 | R600/SI: Avoid generating S_MOVs with 64-bit immediates v2 | Tom Stellard
SITargetLowering::analyzeImmediate() was converting the 64-bit values to 32-bit and then checking if they were an inline immediate. Some of these conversions caused this check to succeed and produced S_MOV instructions with 64-bit immediates, which are illegal. v2: - Clean up logic Reviewed-by: Christian König <christian.koenig@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178927 91177308-0d34-0410-b5e6-96231b3b80d8
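A hedged C sketch of the bug class being fixed (not the actual LLVM code; the -16..64 integer range is an assumption about SI inline constants). The check must look at the full 64-bit value, because truncating first makes values such as 0xFFFFFFFF00000000 (low half zero) look like valid inline immediates.

  #include <stdint.h>

  /* correct: decide on the full 64-bit value */
  static int is_inline_imm64(int64_t imm) {
    return imm >= -16 && imm <= 64;
  }

  /* buggy shape: truncate, then test the 32-bit value */
  static int is_inline_imm64_buggy(int64_t imm) {
    int32_t lo = (int32_t)imm;
    return lo >= -16 && lo <= 64;   /* wrongly accepts e.g. 0xFFFFFFFF00000000 */
  }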
2013-04-05 | Enable early if conversion on PPC | Hal Finkel
On cores for which we know the misprediction penalty, and we have the isel instruction, we can profitably perform early if conversion. This enables us to replace some small branch sequences with selects and avoid the potential stalls from mispredicting the branches. Enabling this feature required implementing canInsertSelect and insertSelect in PPCInstrInfo; isel code in PPCISelLowering was refactored to use these functions as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178926 91177308-0d34-0410-b5e6-96231b3b80d8
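To make the transformation concrete, a small branchy function of the kind that can now become a compare plus isel instead of a compare plus (potentially mispredicted) branch; a hedged illustration only:

  int imax(int a, int b) {
    return (a > b) ? a : b;
  }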
2013-04-05 | Make the test/CodeGen/X86/win32_sret.ll reliable on any CPU by explicitly specifying the -mcpu | Timur Iskhodzhanov
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178885 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 | Reverting 178851 as it broke buildbots | Renato Golin
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178883 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 | Fix for PR14824: "Optimization arm_ldst_opt inserts newly generated instruction vldmia at incorrect position". | Stepan Dyatkovskiy
The patch introduces memory operand tracking in ARMLoadStoreOpt::LoadStoreMultipleOpti. For each register it keeps the order of load operations as it was before the optimization pass. It is essentially a deeper version of the fix proposed by Hao: http://llvm.org/bugs/show_bug.cgi?id=14824#c4 but it also tracks conflicts between different register classes (e.g. D2 and S5). For more details see: Bug description: http://llvm.org/bugs/show_bug.cgi?id=14824 LLVM Commits discussion: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130311/167936.html http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130318/168688.html http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130325/169376.html http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130401/170238.html git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178851 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 | RegisterPressure heuristics currently require signed comparisons. | Andrew Trick
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178823 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04 | PPC: Improve code generation for mixed-precision reciprocal sqrt | Hal Finkel
The DAGCombine logic that recognized a/sqrt(b) and transformed it into a multiplication by the reciprocal sqrt did not handle cases where the sqrt and the division were separated by an fpext or fptrunc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178801 91177308-0d34-0410-b5e6-96231b3b80d8
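A hedged C example of the mixed-precision shape now handled (under fast-math): the sqrt is computed in float and extended to double before the divide, so an fpext sits between the two nodes.

  #include <math.h>

  double mixed(double a, float b) {
    return a / sqrtf(b);   /* fdiv(a, fpext(sqrt(b))) */
  }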
2013-04-04 | Avoid high-latency false CPSR dependencies even for tMOVSi. | Jakob Stoklund Olesen
The Thumb2SizeReduction pass avoids false CPSR dependencies, except it still aggressively creates tMOVi8 instructions because they are so common. Avoid creating false CPSR dependencies even for tMOVi8 instructions when the CPSR flags are known to have high latency. This allows integer computation to overlap floating point computations. Also process blocks in a reverse post-order and propagate high-latency flags to successors. <rdar://problem/13468102> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178773 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04 | New-password-test commit. | Stepan Dyatkovskiy
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178765 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04 | R600: Take export into account when computing cf address | Vincent Lejeune
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178761 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04 | Add SPARC v9 support for select on 64-bit compares. | Jakob Stoklund Olesen
This requires v9 cmov instructions using the %xcc flags instead of the %icc flags. Still missing: - Select floats on %xcc flags. - Select i64 on %fcc flags. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178737 91177308-0d34-0410-b5e6-96231b3b80d8
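An illustrative C example of a select keyed on a 64-bit compare (long is 64-bit in the v9 ABI): the compare sets %xcc, so the %xcc form of the conditional move is the one required.

  long pick(long a, long b, long x, long y) {
    return (a < b) ? x : y;
  }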
2013-04-03 | R600: Fix last ALU of a clause being emitted in a separate clause | Vincent Lejeune
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178675 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-03 | Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC. | Bill Schmidt
For this we need to use a libcall. Previously LLVM didn't implement libcall support for frem, so I've added it in the usual straightforward manner. A test case from the bug report is included. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178639 91177308-0d34-0410-b5e6-96231b3b80d8
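Concretely (assuming the usual frem-to-fmod libcall mapping), an IR instruction such as frem ppc_fp128 %x, %y should now expand to a call to fmodl instead of failing in instruction selection.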
2013-04-03 | Temporarily relax the WIN32 checks in the SRet test to fix the Atom D2700 bot | Timur Iskhodzhanov
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178635 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-03 | Fix SRet for thiscall in i686-pc-win32 | Timur Iskhodzhanov
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178634 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-03 | Add 64-bit compare + branch for SPARC v9. | Jakob Stoklund Olesen
The same compare instruction is used for 32-bit and 64-bit compares. It sets two different sets of flags: icc and xcc. This patch adds a conditional branch instruction using the xcc flags for 64-bit compares. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178621 91177308-0d34-0410-b5e6-96231b3b80d8
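An illustrative C example of a branch on a 64-bit compare, which needs the new %xcc conditional branch rather than the %icc one (identifiers are made up):

  extern void hit(void);

  void branch64(long a, long b) {
    if (a < b)
      hit();
  }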
2013-04-03 | Use PPC reciprocal estimates with Newton iteration in fast-math mode | Hal Finkel
When unsafe FP math operations are enabled, we can use the fre[s] and frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together with some Newton iteration, in order to quickly generate floating-point division and sqrt results. All of these instructions are separately optional, and so each has its own feature flag (except for the Altivec instructions, which are covered under the existing Altivec flag). Doing this is not only faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these computations to be pipelined with other computations in order to hide their overall latency. I've also added a couple of fnmsub patterns which turned out to be missing (but are necessary for good code generation of the Newton iterations). Altivec needs a similar fix, but that will probably be more complicated because fneg is expanded for Altivec's v4f32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178617 91177308-0d34-0410-b5e6-96231b3b80d8
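Hedged C sketches of the Newton-Raphson refinement steps applied on top of the hardware estimates (not the exact sequences LLVM emits):

  /* one refinement step for 1/a, starting from the fre[s] estimate:
     x' = x * (2 - a*x) */
  float recip_refine(float a, float x) {
    return x * (2.0f - a * x);
  }

  /* one refinement step for 1/sqrt(a), starting from the frsqrte[s] estimate:
     x' = x * (1.5 - 0.5*a*x*x) */
  float rsqrt_refine(float a, float x) {
    return x * (1.5f - 0.5f * a * x * x);
  }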
2013-04-02 | [mips] Small update to the implementation of eh.return for Mips. | Akira Hatanaka
This patch initializes t9 to the handler address, but only if the relocation model is pic. This handles the case where the handler to which eh.return jumps points to the start of the function. Patch by Sasa Stankovic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178588 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 | llvm/test/CodeGen/X86: Remove XFAIL: cygming from atomic{32|64}.ll and handle-move.ll, corresponding to r178549 | NAKAMURA Takumi
This reverts r176808, r176798, and r177914. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178583 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 | Fix PR15630: Replace faulty stdcx. with stwcx. | Bill Schmidt
When doing a partword atomic operation, a lwarx was being paired with a stdcx. instead of a stwcx. when compiling for a 64-bit target. The target has nothing to do with it in this case; we always need a stwcx. Thanks to Kai Nacke for reporting the problem. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178559 91177308-0d34-0410-b5e6-96231b3b80d8
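An illustrative C example of a partword atomic that exercises this path on a 64-bit PowerPC target (identifiers are made up): the expansion must use the word-sized lwarx/stwcx. pair regardless of the pointer width.

  static short counter;

  void bump(void) {
    __sync_fetch_and_add(&counter, 1);   /* lwarx/stwcx. loop with sub-word masking */
  }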
2013-04-02 | Don't attempt MTM heuristics without a scheduling model present. | Jakob Stoklund Olesen
This should fix the PPC buildbots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178558 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 | [fast-isel] Use the correct API to disable FastLowerArguments for Win64. | Chad Rosier
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178549 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 | DAGCombiner: Merge store/loads when we have extload/truncstores | Arnold Schwaighofer
This helps on architectures where i8 and i16 are not legal but we have byte and short loads/stores, allowing us to merge copies like the one below on ARM:

  copy(char *a, char *b, int n) {
    do {
      int t0 = a[0];
      int t1 = a[1];
      b[0] = t0;
      b[1] = t1;

radar://13536387 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178546 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 | Simplify test cases for Atom preferring call register indirect over | Preston Gurd
call memory indirect (32 and 64 bit). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178541 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 | Add 64-bit load and store instructions. | Jakob Stoklund Olesen
There are only a few new instructions; the rest is handled with patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178528 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 | Basic 64-bit ALU operations. | Jakob Stoklund Olesen
SPARC v9 extends all ALU instructions to 64 bits, so we simply need to add patterns to use them for both i32 and i64 values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178527 91177308-0d34-0410-b5e6-96231b3b80d8