llvm - http://llvm.org

Age	Commit message (Collapse)	Author
2011-02-22	available_externally (hidden or not) GVs are always accessed via stubs. ↵	Evan Cheng
	rdar://9027648. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126191 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-14	PR9139: Specify ARM/Darwin triple for vector-DAGCombine.ll test.	Bob Wilson
	The i64_buildvector test in this file relies on the alignment of i64 and f64 types being the same, which is true for Darwin but not AAPCS. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125525 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-11	Implement sdiv & udiv for <4 x i16> and <8 x i8> NEON vector types.	Nate Begeman
	This avoids moving each element to the integer register file and calling __divsi3 etc. on it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125402 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-11	Fix buggy fcopysign lowering.	Evan Cheng
	This define float @foo(float %x, float %y) nounwind readnone { entry: %0 = tail call float @copysignf(float %x, float %y) nounwind readnone ret float %0 } Was compiled to: vmov s0, r1 bic r0, r0, #-2147483648 vmov s1, r0 vcmpe.f32 s0, #0 vmrs apsr_nzcv, fpscr it lt vneglt.f32 s1, s1 vmov r0, s1 bx lr This fails to copy the sign of -0.0f because it's lost during the float to int conversion. Also, it's sub-optimal when the inputs are in GPR registers. Now it uses integer and + or operations when it's profitable. And it's correct! lsrs r1, r1, #31 bfi r0, r1, #31, #1 bx lr rdar://8984306 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125357 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-08	PostRA antidependence breaker unit test for PR8986.	Andrew Trick
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125091 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-08	PostRA antidependence breaker unit test for rdar://8959122.	Andrew Trick
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125090 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-07	Fix an obvious typo which caused an isel assertion. rdar://8964854.	Evan Cheng
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125023 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-07	Add codegen support for using post-increment NEON load/store instructions.	Bob Wilson
	The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using post-increment versions, but all the rest of the NEON load/store instructions should be handled now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125014 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-07	Rework some .ARM.attribute work for improved gcc compatibility.	Jason W Kim
	Unified EmitTextAttribute for both Asm and Obj emission (.cpu only) Added necessary cortex-A8 related attrs for codegen compat tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124995 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-02	Given a pair of floating point load and store, if there are no other uses of	Evan Cheng
	the load, then it may be legal to transform the load and store to integer load and store of the same width. This is done if the target specified the transformation as profitable. e.g. On arm, this can transform: vldr.32 s0, [] vstr.32 s0, [] to ldr r12, [] str r12, [] rdar://8944252 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124708 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-27	Add a testcase for my last checkin.	Eric Christopher
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124358 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-25	Don't merge restore with tail call instruction.	Evan Cheng
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124167 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-21	Last round of fixes for movw + movt global address codegen.	Evan Cheng
	1. Fixed ARM pc adjustment. 2. Fixed dynamic-no-pic codegen 3. CSE of pc-relative load of global addresses. It's now enabled by default for Darwin. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123991 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-21	Enable support for precise scheduling of the instruction selection	Andrew Trick
	DAG. Disable using "-disable-sched-cycles". For ARM, this enables a framework for modeling the cpu pipeline and counting stalls. It also activates several heuristics to drive scheduling based on the model. Scheduling is inherently imprecise at this stage, and until spilling is improved it may defeat attempts to schedule. However, this framework provides greater control over tuning codegen. Although the flag is not target-specific, it should have very little affect on the default scheduler used by x86. The only two changes that affect x86 are: - scheduling a high-latency operation bumps the current cycle so independent operations can have their latency covered. i.e. two independent 4 cycle operations can produce results in 4 cycles, not 8 cycles. - Two operations with equal register pressure impact and no latency-based stalls on their uses will be prioritized by depth before height (height is irrelevant if no stalls occur in the schedule below this point). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123971 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-21	Convert -enable-sched-cycles and -enable-sched-hazard to -disable	Andrew Trick
	flags. They are still not enable in this revision. Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with the scheduler's model of operand latency in the selection DAG. Generalized unit tests to work with sched-cycles. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123969 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-20	Don't be overly aggressive with CSE of "ldr constantpool". If it's a pc-relative	Evan Cheng
	value, the "add pc" must be CSE'ed at the same time. We could follow the same approach as T2 by adding pseudo instructions that combine the ldr + "add pc". But the better approach is to use movw + movt (which I will enable soon), so I'll leave this as a TODO. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123949 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-20	Add test.	Evan Cheng
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123906 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-20	Sorry, several patches in one.	Evan Cheng
	TargetInstrInfo: Change produceSameValue() to take MachineRegisterInfo as an optional argument. When in SSA form, targets can use it to make more aggressive equality analysis. Machine LICM: 1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead. 2. Fix a bug which prevent CSE of instructions which are not re-materializable. 3. Use improved form of produceSameValue. ARM: 1. Teach ARM produceSameValue to look pass some PIC labels. 2. Look for operands from different loads of different constant pool entries which have same values. 3. Re-implement PIC GA materialization using movw + movt. Combine the pair with a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible to re-materialize the instruction, allow machine LICM to hoist the set of instructions out of the loop and make it possible to CSE them. It's a bit hacky, but it significantly improve code quality. 4. Some minor bug fixes as well. With the fixes, using movw + movt to materialize GAs significantly outperform the load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap and 176.gcc ~10%. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123905 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-20	If we can, lower the multiply part of a umulo/smulo call to a libcall	Eric Christopher
	with an invalid type then split the result and perform the overflow check normally. Fixes the 32-bit parts of rdar://8622122 and rdar://8774702. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123864 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-20	Fix debug info for merged global.	Devang Patel
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123862 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-17	Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g.	Evan Cheng
	movw r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4)) movt r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4)) LPC0_0: add r0, pc, r0 It's not yet enabled by default as some tests are failing. I suspect bugs in down stream tools. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123619 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-11	Even if we don't have 7 bytes of stack space we may need to save and	Eric Christopher
	restore the stack pointer from the frame pointer on thumbv6. Fixes rdar://8819685 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123196 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-08	Recognize inline asm 'rev /bin/bash, ' as a bswap intrinsic call.	Evan Cheng
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123048 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-07	Lower some BUILD_VECTORS using VEXT+shuffle.	Bob Wilson
	Patch by Tim Northover. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123035 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-07	Add testcases for PR8411 (vget_low and vget_high implemented as shuffles).	Bob Wilson
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122997 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-07	Add ARM patterns to match EXTRACT_SUBVECTOR nodes.	Bob Wilson
	Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle vectors from being translated to EXTRACT_SUBVECTOR. Patch by Tim Northover. The test changes are needed to keep those spill-q tests from testing aligned spills and restores. If the only aligned stack objects are spill slots, we no longer realign the stack frame. Prior to this patch, an EXTRACT_SUBVECTOR was legalized by loading from the stack, which created an aligned frame index. Now, however, there is nothing except the spill slot in the stack frame, so I added an aligned alloca. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122995 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-06	PR8921: LDM/POP do not support interworking prior to v5t.	Bob Wilson
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122970 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-01	Update the test	Anton Korobeynikov
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122666 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-23	Radar 8803471: Fix expansion of ARM BCCi64 pseudo instructions.	Bob Wilson
	If the basic block containing the BCCi64 (or BCCZi64) instruction ends with an unconditional branch, that branch needs to be deleted before appending the expansion of the BCCi64 to the end of the block. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122521 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-21	Add ARM-specific DAG combining to cast i64 vector element load/stores to f64.	Bob Wilson
	Type legalization splits up i64 values into pairs of i32 values, which leads to poor quality code when inserting or extracting i64 vector elements. If the vector element is loaded or stored, it can be treated as an f64 value and loaded or stored directly from a VPR register. Use the pre-legalization DAG combiner to cast those vector elements to f64 types so that the type legalizer won't mess them up. Radar 8755338. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122319 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19	move this test into the ARM test so that it is only run when the arm backend	Chris Lattner
	is enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122163 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-18	Fix result type of Neon floating-point comparisons against zero.	Bob Wilson
	The result vector elements are always integers. Radar 8782191. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122112 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17	During local stack slot allocation, the materializeFrameBaseRegister function	Bill Wendling
	may be called. If the entry block is empty, the insertion point iterator will be the "end()" value. Calling ->getParent() on it (among others) causes problems. Modify materializeFrameBaseRegister to take the machine basic block and insert the frame base register at the beginning of that block. (It's very similar to what the code does all ready. The only difference is that it will always insert at the beginning of the entry block instead of after a previous materialization of the frame base register. I doubt that that matters here.) <rdar://problem/8782198> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122104 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17	Fix a DAGCombiner crash when folding binary vector operations with constant	Bob Wilson
	BUILD_VECTOR operands where the element type is not legal. I had previously changed this code to insert TRUNCATE operations, but that was just wrong. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122102 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17	Combine several vector-related DAGCombiner tests.	Bob Wilson
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122101 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17	Fix crash compiling a QQQQ REG_SEQUENCE for a Neon vld3_lane operation.	Bob Wilson
	Radar 8776599 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122018 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-16	1. ARM/MC/ELF: A few more ELF relocs for .o	Jason W Kim
	2. Fixed EmitLocalCommonSymbol for ELF (Yes, they exist. :) Test added. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121951 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-15	Don't handle -arm-long-calls in fast isel for now.	Eric Christopher
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121919 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-15	Add Neon VCVT instructions for f32 <-> f16 conversions.	Bob Wilson
	Clang is now providing intrinsics for these and so we need to support them in the backend. Radar 8068427. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121902 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-14	bfi A, (and B, C1), C2) -> bfi A, B, C2 iff C1 & C2 == C1. rdar://8458663	Evan Cheng
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121746 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-14	fix fixme case typo :-)	Jason W Kim
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121743 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13	First cut of ARM/MC/ELF PIC relocations.	Jason W Kim
	Test has fixme, to move to .s -> .o test when AsmParser works better. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121732 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-11	(or (and (shl A, #shamt), mask), B) => ARMbfi B, A, ~mask where lsb(mask) == ↵	Evan Cheng
	#shamt. rdar://8752056 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121606 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-10	Add float patterns for Neon vld1-lane/dup and vst1-lane operations.	Bob Wilson
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121583 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-10	Fix some invalid alignments for Neon vld-dup and vld/st-lane instructions.	Bob Wilson
	Alignments smaller than the total size of the memory being loaded or stored, unless the alignment is 8 bytes, are not allowed. Add tests for this, too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121506 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-09	ARM stm/ldm instructions require more than one register in the register list.	Jim Grosbach
	Otherwise, a plain str/ldr should be used instead. Make sure we account for that in prologue/epilogue code generation. rdar://8745460 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121391 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-08	ARM/MC/ELF TPsoft is now a proper pseudo inst.	Jason W Kim
	Added test to check bl __aeabi_read_tp gets emitted properly for ELF/ASM as well as ELF/OBJ (including fixup) Also added support for ELF::R_ARM_TLS_IE32 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121312 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-07	Fix a bad prologue / epilogue codegen bug where the compiler would emit illegal	Evan Cheng
	vpush instructions to save / restore VFP / NEON registers like this: vpush {d8,d10,d11} vpop {d8,d10,d11} vpush and vpop do not allow gaps in the register list. rdar://8728956 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121197 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-06	If dbg_declare() or dbg_value() is not lowered by isel then emit DEBUG ↵	Devang Patel
	message instead of creating DBG_VALUE for undefined value in reg0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121059 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05	Making use of VFP / NEON floating point multiply-accumulate / subtraction is	Evan Cheng
	difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed vmla is a special case. Obvious issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough but it isn't the optimial solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120960 91177308-0d34-0410-b5e6-96231b3b80d8