aboutsummaryrefslogtreecommitdiff
path: root/test/CodeGen
AgeCommit message (Collapse)Author
2012-12-18Disable ARM partial flag dependency optimization at -OzQuentin Colombet
To not over constrain the scheduler for ARM in thumb mode, some optimizations for code size reduction, specific to ARM thumb, are blocked when they add a dependency (like write after read dependency). Disables this check when code size is the priority, i.e., code is compiled with -Oz. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170462 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-18MISched: add dependence to ExitSU to model live-out latency.Andrew Trick
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170454 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-18Check multiple register classes for inline asm tied registersHal Finkel
A register can be associated with several distinct register classes. For example, on PPC, the floating point registers are each associated with both F4RC (which holds f32) and F8RC (which holds f64). As a result, this code would fail when provided with a floating point register and an f64 operand because it would happen to find the register in the F4RC class first and return that. From the F4RC class, SDAG would extract f32 as the register type and then assert because of the invalid implied conversion between the f64 value and the f32 register. Instead, search all register classes. If a register class containing the the requested register has the requested type, then return that register class. Otherwise, as before, return the first register class found that contains the requested register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170436 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-17Add rest of BMI/BMI2 instructions to the folding tables as well as popcnt ↵Craig Topper
and lzcnt. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170304 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-16This patch is needed to make c++ exceptions work for mips16.Reed Kotler
Mips16 is really a processor decoding mode (ala thumb 1) and in the same program, mips16 and mips32 functions can exist and can call each other. If a jal type instruction encounters an address with the lower bit set, then the processor switches to mips16 mode (if it is not already in it). If the lower bit is not set, then it switches to mips32 mode. The linker knows which functions are mips16 and which are mips32. When relocation is performed on code labels, this lower order bit is set if the code label is a mips16 code label. In general this works just fine, however when creating exception handling tables and dwarf, there are cases where you don't want this lower order bit added in. This has been traditionally distinguished in gas assembly source by using a different syntax for the label. lab1: ; this will cause the lower order bit to be added lab2=. ; this will not cause the lower order bit to be added In some cases, it does not matter because in dwarf and debug tables the difference of two labels is used and in that case the lower order bits subtract each other out. To fix this, I have added to mcstreamer the notion of a debuglabel. The default is for label and debug label to be the same. So calling EmitLabel and EmitDebugLabel produce the same result. For various reasons, there is only one set of labels that needs to be modified for the mips exceptions to work. These are the "$eh_func_beginXXX" labels. Mips overrides the debug label suffix from ":" to "=." . This initial patch fixes exceptions. More changes most likely will be needed to DwarfCFException to make all of this work for actual debugging. These changes will be to emit debug labels in some places where a simple label is emitted now. Some historical discussion on this from gcc can be found at: http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00623.html http://gcc.gnu.org/ml/gcc-patches/2008-11/msg01273.html git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170279 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-15X86: Add a couple of target-specific dag combines that turn VSELECTS into ↵Benjamin Kramer
psubus if possible. We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases if y is a constant. DAGCombiner canonicalizes those so we first have to undo the canonicalization for those cases. The pattern occurs in gzip when the loop vectorizer is enabled. Part of PR14613. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-15This code implements most of mips16 hardfloat as it is done by gcc.Reed Kotler
In this case, essentially it is soft float with different library routines. The next step will be to make this fully interoperational with mips32 floating point and that requires creating stubs for functions with signatures that contain floating point types. I have a more sophisticated design for mips16 hardfloat which I hope to implement at a later time that directly does floating point without the need for function calls. The mips16 encoding has no floating point instructions so one needs to switch to mips32 mode to execute floating point instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170259 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-14FastIsel: Get PIC-style GV loads for NaCl64 even w/out 64-bit pointers.Jan Voung
Without this, the load would not be RIP relative, and will end up being relative to 0, which is R15. BUG= http://code.google.com/p/nativeclient/issues/detail?id=3219 TEST= ./scons bitcode=1 pnacl_generate_pexe=0 \ run_stack_frame_noopt_noframe_test \ run_unwind_trace_noopt_noframe_test \ run_stack_frame_noopt_frame_test \ run_unwind_trace_noopt_frame_test \ platform=x86-64 nacl_pic=1 Review URL: https://codereview.chromium.org/11575042
2012-12-14TypeLegalizer: Do not generate target specific nodes with illegal types, ↵Nadav Rotem
because we cant type-legalize them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170245 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-14This patch removes some nondeterminism from direct object file outputBill Schmidt
for TLS dynamic models on 64-bit PowerPC ELF. The default sort routine for relocations only sorts on the r_offset field; but with TLS, there can be two relocations with the same r_offset. For PowerPC, this patch sorts secondarily on descending r_type, which matches the behavior expected by the linker. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170237 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-14This patch improves the 64-bit PowerPC InitialExec TLS support by providingBill Schmidt
for a wider range of GOT entries that can hold thread-relative offsets. This matches the behavior of GCC, which was not documented in the PPC64 TLS ABI. The ABI will be updated with the new code sequence. Former sequence: ld 9,x@got@tprel(2) add 9,9,x@tls New sequence: addis 9,2,x@got@tprel@ha ld 9,x@got@tprel@l(9) add 9,9,x@tls Note that a linker optimization exists to transform the new sequence into the shorter sequence when appropriate, by replacing the addis with a nop and modifying the base register and relocation type of the ld. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170209 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-13[mips] Do not copy GOT address to register $gp if the function being called hasAkira Hatanaka
internal linkage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170092 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-13Fix a bug in DAGCombiner::MatchBSwapHWord. Make sure the node has operands ↵Evan Cheng
before referencing them. rdar://12868039 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170078 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12Fix a logic bug in inline expansion of memcpy / memset with an overlappingEvan Cheng
load / store pair. It's not legal to use a wider load than the size of the remaining bytes if it's the first pair of load / store. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170018 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12The ordering of two relocations on the same instruction is apparently notBill Schmidt
predictable when compiled on at least one non-PowerPC host. Source of nondeterminism not apparent. Restrict the test to build on PowerPC hosts for now while looking into the issue further. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170016 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12This patch implements local-dynamic TLS model support for the 64-bitBill Schmidt
PowerPC target. This is the last of the four models, so we now have full TLS support. This is mostly a straightforward extension of the general dynamic model. I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the register copy following ADDI_TLSLD_L; otherwise everything above the ADDIS_DTPREL_HA appeared dead and was removed. As before, there are new test cases to test the assembly generation, and the relocations output during integrated assembly. The expected code gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll. There are a couple of things I think can be done more efficiently in the overall TLS code, so there will likely be a clean-up patch forthcoming; but for now I want to be sure the functionality is in place. Bill git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170003 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12llvm/test/CodeGen/X86/atom-bypass-slow-division.ll: Fix possible typo(s) in ↵NAKAMURA Takumi
CHECK-NOT lines. Found by Alexander Zinenko, thanks! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169978 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12llvm/test/CodeGen/X86/atom-bypass-slow-division.ll: Rename symbols, ↵NAKAMURA Takumi
s/test_/Test/g, not to mismatch "CHECK(-NOT): test". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169977 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12llvm/test/CodeGen/X86/store_op_load_fold.ll: Fix typo, s/CHECK_NEXT/CHECK-NEXT/NAKAMURA Takumi
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169957 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12llvm/test/CodeGen/X86/store_op_load_fold.ll: Add explicit triple.NAKAMURA Takumi
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12DAGCombine: clamp hi bit in APInt::getBitsSet to avoid assertionManman Ren
rdar://12838504 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169951 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12Avoid using lossy load / stores for memcpy / memset expansion. e.g.Evan Cheng
f64 load / store on non-SSE2 x86 targets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169944 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11Add R600 backendTom Stellard
A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169915 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11This patch implements the general dynamic TLS model for 64-bit PowerPC.Bill Schmidt
Given a thread-local symbol x with global-dynamic access, the generated code to obtain x's address is: Instruction Relocation Symbol addis ra,r2,x@got@tlsgd@ha R_PPC64_GOT_TLSGD16_HA x addi r3,ra,x@got@tlsgd@l R_PPC64_GOT_TLSGD16_L x bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x R_PPC64_REL24 __tls_get_addr nop <use address in r3> The implementation borrows from the medium code model work for introducing special forms of ADDIS and ADDI into the DAG representation. This is made slightly more complicated by having to introduce a call to the external function __tls_get_addr. Using the full call machinery is overkill and, more importantly, makes it difficult to add a special relocation. So I've introduced another opcode GET_TLS_ADDR to represent the function call, and surrounded it with register copies to set up the parameter and return value. Most of the code is pretty straightforward. I ran into one peculiarity when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like BL8_NOP_ELF except that it takes another parameter to represent the symbol ("x" above) that requires a relocation on the call. Something in the TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated identically during the emit phase, so this second operand was never visited to generate relocations. This is the reason for the slightly messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding(). Two new tests are included to demonstrate correct external assembly and correct generation of relocations using the integrated assembler. Comments welcome! Thanks, Bill git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169910 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11Add a triple to this test.Chad Rosier
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169803 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11Fix a miscompile in the DAG combiner. Previously, we would incorrectlyChandler Carruth
try to reduce the width of this load, and would end up transforming: (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32) to (truncate (zextload i32 <ptr+4> as i64) to i32) We lost the sext attached to the load while building the narrower i32 load, and replaced it with a zext because lshr always zext's the results. Instead, bail out of this combine when there is a conflict between a sextload and a zext narrowing. The rest of the DAG combiner still optimize the code down to the proper single instruction: movswl 6(...),%eax Which is exactly what we wanted. Previously we read past the end *and* missed the sign extension: movl 6(...), %eax git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169802 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11move X86-specific testPaul Redmond
This test case uses -mcpu=corei7 so it belongs in CodeGen/X86 Reviewed by: Nadav git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169801 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11Fall back to the selection dag isel to select tail calls.Chad Rosier
This shouldn't affect codegen for -O0 compiles as tail call markers are not emitted in unoptimized compiles. Testing with the external/internal nightly test suite reveals no change in compile time performance. Testing with -O1, -O2 and -O3 with fast-isel enabled did not cause any compile-time or execution-time failures. All tests were performed on my x86 machine. I'll monitor our arm testers to ensure no regressions occur there. In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue and objc_retainAutoreleaseReturnValue as tail calls unconditionally. While it's theoretically true that this is just an optimization, it's an optimization that we very much want to happen even at -O0, or else ARC applications become substantially harder to debug. Part of rdar://12553082 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169796 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10Some enhancements for memcpy / memset inline expansion.Evan Cheng
1. Teach it to use overlapping unaligned load / store to copy / set the trailing bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies. 2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g. x86 and ARM. 3. When memcpy from a constant string, do *not* replace the load with a constant if it's not possible to materialize an integer immediate with a single instruction (required a new target hook: TLI.isIntImmLegal()). 4. Use unaligned load / stores more aggressively if target hooks indicates they are "fast". 5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8. Also increase the threshold to something reasonable (8 for memset, 4 pairs for memcpy). This significantly improves Dhrystone, up to 50% on ARM iOS devices. rdar://12760078 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10Use GetUnderlyingObjects in mischedHal Finkel
misched used GetUnderlyingObject in order to break false load/store dependencies, and the -enable-aa-sched-mi feature similarly relied on GetUnderlyingObject in order to ensure it is safe to use the aliasing analysis. Unfortunately, GetUnderlyingObject does not recurse through phi nodes, and so (especially due to LSR) all of these mechanisms failed for induction-variable-dependent loads and stores inside loops. This change replaces uses of GetUnderlyingObject with GetUnderlyingObjects (which will recurse through phi and select instructions) in misched. Andy reviewed, tested and simplified this patch; Thanks! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169744 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10Teach DAG combine to handle vector add/sub with vectors of all 0s.Craig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169727 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-08Teach DAG combine to handle vector logical operations with vectors of all 1s ↵Craig Topper
or all 0s. These cases can show up when vectors are split for legalizing. Fix some tests that were dependent on these cases not being combined. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169684 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-07When we use the BLEND instruction that uses the MSB as a mask, we can removeNadav Rotem
the VSRI instruction before it since it does not affect the MSB. Thanks Craig Topper for suggesting this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169638 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-07In hexagon convertToHardwareLoop, don't deref end() iteratorMatthew Curtis
In particular, check if MachineBasicBlock::iterator is end() before using it to call getDebugLoc(); See also this thread on llvm-commits: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/155914.html git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169634 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-07X86: Prefer using VPSHUFD over VPERMIL because it has better throughput.Nadav Rotem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169624 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-07Added Mapping Symbols for ARM ELFTim Northover
Before this patch, when you objdump an LLVM-compiled file, objdump tried to decode data-in-code sections as if they were code. This patch adds the missing Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D). Patch based on work by Greg Fitzgerald. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169609 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06Fix typos in CHECK lines.Dmitri Gribenko
Patch by Alexander Zinenko. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169547 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06Fix a bug in the code that merges consecutive stores. Previously we did notNadav Rotem
check if loads that happen in between stores alias with the first store in the chain, only with the second store onwards. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169516 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06Remove intrinsic specific instructions for (V)MOVQUmr with patterns pointing ↵Craig Topper
to the normal instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169482 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06Properly fix the tes.Evan Cheng
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169464 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06llvm/test/CodeGen/ARM/extload-knownzero.ll: Try to unbreak, to add -O0. I ↵NAKAMURA Takumi
guess Chad expects fastisel here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169463 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06[arm fast-isel] Make the fast-isel implementation of memcpy respect alignment.Chad Rosier
rdar://12821569 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169460 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06Let targets provide hooks that compute known zero and ones for any_extendEvan Cheng
and extload's. If they are implemented as zero-extend, or implicitly zero-extend, then this can enable more demanded bits optimizations. e.g. define void @foo(i16* %ptr, i32 %a) nounwind { entry: %tmp1 = icmp ult i32 %a, 100 br i1 %tmp1, label %bb1, label %bb2 bb1: %tmp2 = load i16* %ptr, align 2 br label %bb2 bb2: %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ] %cmp = icmp ult i16 %tmp3, 24 br i1 %cmp, label %bb3, label %exit bb3: call void @bar() nounwind br label %exit exit: ret void } This compiles to the followings before: push {lr} mov r2, #0 cmp r1, #99 bhi LBB0_2 @ BB#1: @ %bb1 ldrh r2, [r0] LBB0_2: @ %bb2 uxth r0, r2 cmp r0, #23 bhi LBB0_4 @ BB#3: @ %bb3 bl _bar LBB0_4: @ %exit pop {lr} bx lr The uxth is not needed since ldrh implicitly zero-extend the high bits. With this change it's eliminated. rdar://12771555 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169459 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05RegisterPressureTracker: fix findUseBetween to handle DebugValueAndrew Trick
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169427 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05RegisterPresssureTracker: Track live physical register by unit.Andrew Trick
This is much simpler to reason about, more efficient, and fixes some corner cases involving implicit super-register defs. Fixed rdar://12797931. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169425 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05[NVPTX] Fix crash with unnamed struct argumentsJustin Holewinski
Patch by Eric Holk git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169418 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05Use multiclass to define store instructions with base+immediate offsetJyotsna Verma
addressing mode and immediate stored value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169408 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05Simplified BLEND pattern matching for shuffles.Elena Demikhovsky
Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169366 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05Add x86 isel lowering logic to form bit test with inverted condition. e.g.Evan Cheng
x ^ -1. Patch by David Majnemer. rdar://12755626 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169339 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04ARM custom lower ctpop for vector types. Patch by Pete Couperus.Evan Cheng
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169325 91177308-0d34-0410-b5e6-96231b3b80d8