aboutsummaryrefslogtreecommitdiff
path: root/test/CodeGen/ARM
AgeCommit message (Collapse)Author
2013-05-13Correctly preserve the input chain for potential tailcall nodes whoseLang Hames
return values are bitcasts. The chain had previously been being clobbered with the entry node to the dag, which sometimes caused other code in the function to be erroneously deleted when tailcall optimization kicked in. <rdar://problem/13827621> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181696 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-13Fix PR15950 A bug in DAG Combiner about undef maskHao Liu
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181682 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-05Test case for r181160 and r181161. rdar://13782395Evan Cheng
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181162 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-05For ARM backend, fixed "byval" attribute support.Stepan Dyatkovskiy
Now even the small structures could be passed within byval (small enough to be stored in GPRs). In regression tests next function prototypes are checked: PR15293: %artz = type { i32 } define void @foo(%artz* byval %s) define void @foo2(%artz* byval %s, i32 %p, %artz* byval %s2) foo: "s" stored in R0 foo2: "s" stored in R0, "s2" stored in R2. Next AAPCS rules are checked: 5.5 Parameters Passing, C.4 and C.5, "ParamSize" is parameter size in 32bit words: -- NSAA != 0, NCRN < R4 and NCRN+ParamSize > R4. Parameter should be sent to the stack; NCRN := R4. -- NSAA != 0, and NCRN < R4, NCRN+ParamSize < R4. Parameter stored in GPRs; NCRN += ParamSize. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181148 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-01Optimize away nop CONCAT_VECTOR nodes.Nadav Rotem
Optimize CONCAT_VECTOR nodes that merge EXTRACT_SUBVECTOR values that extract from the same vector. rdar://13402653 PR15866 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180871 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30Only pass 'returned' to target-specific lowering code when the value of ↵Stephen Lin
entire register is guaranteed to be preserved. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180825 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30Temporarily revert "Change the informal convention of DBG_VALUE so that we ↵Adrian Prantl
can express a" because it breaks some buildbots. This reverts commit 180816. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180819 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30Change the informal convention of DBG_VALUE so that we can express aAdrian Prantl
register-indirect address with an offset of 0. It used to be that a DBG_VALUE is a register-indirect value if the offset (operand 1) is nonzero. The new convention is that a DBG_VALUE is register-indirect if the first operand is a register and the second operand is an immediate. For plain registers use the combination reg, reg. rdar://problem/13658587 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180816 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-30TBAA: remove !tbaa from testing cases if not used.Manman Ren
This will make it easier to turn on struct-path aware TBAA since the metadata format will change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180796 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-29TBAA: remove !tbaa from testing cases if not used.Manman Ren
This will make it easier to turn on struct-path aware TBAA since the metadata format will change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180745 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-26ARM/NEON: Pattern match vector integer abs to vabs.Benjamin Kramer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180604 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-25Fix constant folding for one lane vector types. Constant folding one lane ↵Silviu Baranga
vector types not returns a vector instead of a scalar. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180254 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-24MI Sched: eliminate local vreg copies.Andrew Trick
For now, we just reschedule instructions that use the copied vregs and let regalloc elliminate it. I would really like to eliminate the copies on-the-fly during scheduling, but we need a complete implementation of repairIntervalsInRange() first. The general strategy is for the register coalescer to eliminate as many global copies as possible and shrink live ranges to be extended-basic-block local. The coalescer should not have to worry about resolving local copies (e.g. it shouldn't attemp to reorder instructions). The scheduler is a much better place to deal with local interference. The coalescer side of this equation needs work. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180193 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-23Add more tests for r179925 to verify correct handling of signext/zeroext; ↵Stephen Lin
strengthen condition check to require actual MVT::i32 virtual register types, just in case (no actual functionality change) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180138 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22Extra paranoid test for r179925 (verify that tail calls are not generated to ↵Stephen Lin
'this'-returning constructors of objects with different 'this' pointers than the caller) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180032 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22Fix for 5.5 Parameter Passing --> Stage C:Stepan Dyatkovskiy
-- C.4 and C.5 statements, when NSAA is not equal to SP. -- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a variadic procedure. Before this patch "NSAA != 0" means "don't use GPRs anymore ". But there are some exceptions in AAPCS. 1. For non VA function: allocate all VFP regs for CPRC. When all VFPs are allocated CPRCs would be sent to stack, while non CPRCs may be still allocated in GRPs. 2. Check that for VA functions all params uses GPRs and then stack. No exceptions, no CPRCs here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180011 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22Cleanup: test source files do not need to be executableArnaud A. de Grandmaison
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180003 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-22Revert "Revert "PR14606: debug info imported_module support""David Blaikie
This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even though the debug info was clearly invalid on all of them, but this ought to fix it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179996 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21Legalize vector truncates by parts rather than just splitting.Jim Grosbach
Rather than just splitting the input type and hoping for the best, apply a bit more cleverness. Just splitting the types until the source is legal often leads to an illegal result time, which is then widened and a scalarization step is introduced which leads to truly horrible code generation. With the loop vectorizer, these sorts of operations are much more common, and so it's worth extra effort to do them well. Add a legalization hook for the operands of a TRUNCATE node, which will be encountered after the result type has been legalized, but if the operand type is still illegal. If simple splitting of both types ends up with the result type of each half still being legal, just do that (v16i16 -> v16i8 on ARM, for example). If, however, that would result in an illegal result type (v8i32 -> v8i8 on ARM, for example), we can get more clever with power-two vectors. Specifically, split the input type, but also widen the result element size, then concatenate the halves and truncate again. For example on ARM, To perform a "%res = v8i8 trunc v8i32 %in" we transform to: %inlo = v4i32 extract_subvector %in, 0 %inhi = v4i32 extract_subvector %in, 4 %lo16 = v4i16 trunc v4i32 %inlo %hi16 = v4i16 trunc v4i32 %inhi %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16 %res = v8i8 trunc v8i16 %in16 This allows instruction selection to generate three VMOVN instructions instead of a sequences of moves, stores and loads. Update the ARMTargetTransformInfo to take this improved legalization into account. Consider the simplified IR: define <16 x i8> @test1(<16 x i32>* %ap) { %a = load <16 x i32>* %ap %tmp = trunc <16 x i32> %a to <16 x i8> ret <16 x i8> %tmp } define <8 x i8> @test2(<8 x i32>* %ap) { %a = load <8 x i32>* %ap %tmp = trunc <8 x i32> %a to <8 x i8> ret <8 x i8> %tmp } Previously, we would generate the truly hideous: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: push {r7} mov r7, sp sub sp, sp, #20 bic sp, sp, #7 add r1, r0, #48 add r2, r0, #32 vld1.64 {d24, d25}, [r0:128] vld1.64 {d16, d17}, [r1:128] vld1.64 {d18, d19}, [r2:128] add r1, r0, #16 vmovn.i32 d22, q8 vld1.64 {d16, d17}, [r1:128] vmovn.i32 d20, q9 vmovn.i32 d18, q12 vmov.u16 r0, d22[3] strb r0, [sp, #15] vmov.u16 r0, d22[2] strb r0, [sp, #14] vmov.u16 r0, d22[1] strb r0, [sp, #13] vmov.u16 r0, d22[0] vmovn.i32 d16, q8 strb r0, [sp, #12] vmov.u16 r0, d20[3] strb r0, [sp, #11] vmov.u16 r0, d20[2] strb r0, [sp, #10] vmov.u16 r0, d20[1] strb r0, [sp, #9] vmov.u16 r0, d20[0] strb r0, [sp, #8] vmov.u16 r0, d18[3] strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] vldmia sp, {d16, d17} vmov r0, r1, d16 vmov r2, r3, d17 mov sp, r7 pop {r7} bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: push {r7} mov r7, sp sub sp, sp, #12 bic sp, sp, #7 vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d20, d21}, [r0:128] vmovn.i32 d18, q8 vmov.u16 r0, d18[3] vmovn.i32 d16, q10 strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] ldm sp, {r0, r1} mov sp, r7 pop {r7} bx lr Now, however, we generate the much more straightforward: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: add r1, r0, #48 add r2, r0, #32 vld1.64 {d20, d21}, [r0:128] vld1.64 {d16, d17}, [r1:128] add r1, r0, #16 vld1.64 {d18, d19}, [r2:128] vld1.64 {d22, d23}, [r1:128] vmovn.i32 d17, q8 vmovn.i32 d16, q9 vmovn.i32 d18, q10 vmovn.i32 d19, q11 vmovn.i16 d17, q8 vmovn.i16 d16, q9 vmov r0, r1, d16 vmov r2, r3, d17 bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d18, d19}, [r0:128] vmovn.i32 d16, q8 vmovn.i32 d17, q9 vmovn.i16 d16, q8 vmov r0, r1, d16 bx lr git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179989 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21ARM: Split out cost model vcvt testcases.Jim Grosbach
They had a separate RUN line already, so may as well be in a separate file. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179988 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21ARM: fix part of test which actually needed an asserts buildTim Northover
This should fix a buildbot failure that occurred after r179977. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179978 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21ARM: Use ldrd/strd to spill 64-bit pairs when available.Tim Northover
This allows common sp-offsets to be part of the instruction and is probably faster on modern CPUs too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179977 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20ARM: don't add FrameIndex offset for LDMIA (has no immediate)Tim Northover
Previously, when spilling 64-bit paired registers, an LDMIA with both a FrameIndex and an offset was produced. This kind of instruction shouldn't exist, and the extra operand was being confused with the predicate, causing aborts later on. This removes the invalid 0-offset from the instruction being produced. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179956 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-20Add CodeGen support for functions that always return arguments via a new ↵Stephen Lin
parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179925 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19Revert "PR14606: debug info imported_module support"Eric Christopher
This reverts commit r179836 as it seems to have caused test failures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179840 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19PR14606: debug info imported_module supportDavid Blaikie
Adding another CU-wide list, in this case of imported_modules (since they should be relatively rare, it seemed better to add a list where each element had a "context" value, rather than add a (usually empty) list to every scope). This takes care of DW_TAG_imported_module, but to fully address PR14606 we'll need to expand this to cover DW_TAG_imported_declaration too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179836 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18Fix for PR14824, An ARM Load/Store Optimization bugHao Liu
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179751 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-16Implement ARM unwind opcode assembler.Logan Chien
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179591 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-12Replace coff-/elf-dump with llvm-readobjNico Rieck
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179361 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-11Add missing colons to check lines.Benjamin Kramer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179277 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-11FileCheckize a bunch of tests.Benjamin Kramer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179276 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09DAGCombiner: Fold a shuffle on CONCAT_VECTORS into a new CONCAT_VECTORS if ↵Benjamin Kramer
possible. This pattern occurs in SROA output due to the way vector arguments are lowered on ARM. The testcase from PR15525 now compiles into this, which is better than the code we got with the old scalarrepl: _Store: ldr.w r9, [sp] vmov d17, r3, r9 vmov d16, r1, r2 vst1.8 {d16, d17}, [r0] bx lr Differential Revision: http://llvm-reviews.chandlerc.com/D647 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179106 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05Reverting 178851 as it broke buildbotsRenato Golin
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178883 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05Fix for PR14824: "Optimization arm_ldst_opt inserts newly generated ↵Stepan Dyatkovskiy
instruction vldmia at incorrect position". Patch introduces memory operands tracking in ARMLoadStoreOpt::LoadStoreMultipleOpti. For each register it keeps the order of load operations as it was before optimization pass. It is kind of deep improvement of fix proposed by Hao: http://llvm.org/bugs/show_bug.cgi?id=14824#c4 But it also tracks conflicts between different register classes (e.g. D2 and S5). For more details see: Bug description: http://llvm.org/bugs/show_bug.cgi?id=14824 LLVM Commits discussion: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130311/167936.html http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130318/168688.html http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130325/169376.html http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130401/170238.html git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178851 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04Avoid high-latency false CPSR dependencies even for tMOVSi.Jakob Stoklund Olesen
The Thumb2SizeReduction pass avoids false CPSR dependencies, except it still aggressively creates tMOVi8 instructions because they are so common. Avoid creating false CPSR dependencies even for tMOVi8 instructions when the the CPSR flags are known to have high latency. This allows integer computation to overlap floating point computations. Also process blocks in a reverse post-order and propagate high-latency flags to successors. <rdar://problem/13468102> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178773 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04New-password-test commit.Stepan Dyatkovskiy
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178765 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02DAGCombiner: Merge store/loads when we have extload/truncstoresArnold Schwaighofer
This is helps on architectures where i8,i16 are not legal but we have byte, and short loads/stores. Allowing us to merge copies like the one below on ARM. copy(char *a, char *b, int n) { do { int t0 = a[0]; int t1 = a[1]; b[0] = t0; b[1] = t1; radar://13536387 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178546 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-29Remove the old CodePlacementOpt pass.Benjamin Kramer
It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178349 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-28Revert "Adding DIImportedModules to DIScopes."David Blaikie
This reverts commit 342d92c7a0adeabc9ab00f3f0d88d739fe7da4c7. Turns out we're going with a different schema design to represent DW_TAG_imported_modules so we won't need this extra field. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178215 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-27Enabling the generation of dependency breakers for partial updates on ↵Silviu Baranga
Cortex-A15. Also fixing a small bug in getting the update clearence for VLD1LNd32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178134 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-27Adding DIImportedModules to DIScopes.David Blaikie
This is just the basic groundwork for supporting DW_TAG_imported_module but I wanted to commit this before pushing support further into Clang or LLVM so that this rather churny change is isolated from the rest of the work. The major churn here is obviously adding another field (within the common DIScope prefix) to all DIScopes (files, classes, namespaces, lexical scopes, etc). This should be the last big churny change needed for DW_TAG_imported_module/using directive support/PR14606. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178099 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-22Reorder the DIFile field in DILexicalBlock to become a prefix common with ↵David Blaikie
other DIScopes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177703 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-21Move the DIFile in DISubprogram to the beginning to be a common prefix along ↵David Blaikie
with other DIScopes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177674 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-21Fix Darwin NEON FP and increase coverageRenato Golin
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177664 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-21Remove unused field in DISubprogramDavid Blaikie
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177661 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-21Avoid NEON SP-FP unless unsafe-math or DarwinRenato Golin
NEON is not IEEE 754 compliant, so we should avoid lowering single-precision floating point operations with NEON unless unsafe-math is turned on. The equivalent VFP instructions are IEEE 754 compliant, but in some cores they're much slower, so some archs/OSs might still request it to be on by default, such as Swift and Darwin. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177651 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20Debug info: refactor the first field of DICompileUnit to be a raw ↵David Blaikie
file/directory pair This removes the DICompileUnit special case from DIScope. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177610 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20When computing the demanded bits of Load SDNodes, make sure that we are ↵Nadav Rotem
looking at the loaded-value operand and not the ptr result (in case of pre-inc loads). rdar://13348420 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177596 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20Debug Info: Swap the 2nd and 3rd parameters to DICompileUnit to match the ↵David Blaikie
common DIScope prefix git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177595 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20Remove unused field in DICompileUnitDavid Blaikie
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177590 91177308-0d34-0410-b5e6-96231b3b80d8