llvm - http://llvm.org

Age	Commit message (Collapse)	Author
2009-03-29	Fix PR3899: add support for extracting floats from vectors	Duncan Sands
	when using -soft-float. Based on a patch by Jakob Stoklund Olesen. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67996 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-28	Make check in CheckTailCallReturnConstraints for ignorable instructions between	Arnold Schwaighofer
	a CALL and a RET node more generic. Add a test for tail calls with a void return. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67943 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-28	Enable tail call optimization for functions that return a struct (bug 3664) ↵	Arnold Schwaighofer
	and for functions that return types that need extending (e.g i1). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67934 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-28	Optimize some 64-bit multiplication by constants into two lea's or one lea + ↵	Evan Cheng
	shl since imulq is slow (latency 5). e.g. x * 40 => shlq $3, %rdi leaq (%rdi,%rdi,4), %rax This has the added benefit of allowing more multiply to be folded into addressing mode. e.g. a * 24 + b => leaq (%rdi,%rdi,2), %rax leaq (%rsi,%rax,8), %rax git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67917 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-27	Fix what surely must be a copy+pasto.	Dan Gohman
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67881 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-27	Initialize LiveOutInfo's APInt members to zero, as APInt's	Dan Gohman
	default constructor produces an uninitialized APInt. This fixes PR3896. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67879 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-26	Pull transform from target-dependent code into target-independent code.	Bill Wendling
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67742 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-25	Revert 67132. This is breaking some objective-c apps.	Evan Cheng
	Also fixes SDISel so it does not force promote return value if the function is not marked signext / zeroext. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67701 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-25	When optimizing with debug info, don't keep the	Dale Johannesen
	stoppoint nodes around until Legalize; doing this imposed an ordering on a sequence of loads that came from different lines, interfering with scheduling. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67692 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-24	more tidying: name the components of PhysReg in the case when	Chris Lattner
	the target constraint specifies a specific physreg. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67618 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-24	Tidy a bit more.	Chris Lattner
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67617 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-24	simplify this code a bit now that "allocation to a vreg class" can never	Chris Lattner
	fail. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67616 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-24	Minor compile-time optimization; don't bother checking	Dan Gohman
	canClobberPhysRegDefs if the successor node doesn't clobber any physical registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67587 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-24	Add a pre-pass to the burr-list scheduler which makes adjustments to	Dan Gohman
	help out the register pressure reduction heuristics in the case of nodes with multiple uses. Currently this uses very conservative heuristics, so it doesn't have a broad impact, but in cases where it does help it can make a big difference. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67586 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-23	When unfolding a load during scheduling, the new operator node has	Dan Gohman
	a data dependency on the load node, so it really needs a data-dependence edge to the load node, even if the load previously existed. And add a few comments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67554 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-23	Don't set SUnit::hasPhysRegDefs to true unless the defs are	Dan Gohman
	actually have uses, which reflects the way it's used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67540 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-23	Fix canClobberPhysRegDefs to check all SDNodes grouped together	Dan Gohman
	in an SUnit, instead of just the first one. This fix is needed by some upcoming scheduler changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67531 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-23	Add a new bit to SUnit to record whether a node has implicit physreg	Dan Gohman
	defs, regardless of whether they are actually used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67528 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-23	Now that errs() is properly non-buffered, there's no need to	Dan Gohman
	explicitly flush it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67526 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-23	Model inline asm constraint which ties an input to an output register as ↵	Evan Cheng
	machine operand TIED_TO constraint. This eliminated the need to pre-allocate registers for these. This also allows register allocator can eliminate the unneeded copies. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67512 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-20	Simplify this code; use a while instead of an if and a do-while.	Dan Gohman
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67400 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-20	For inline asm output operand that matches an input. Encode the input ↵	Evan Cheng
	operand index in the high bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67387 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-20	Fixed the comment. No functionality change.	Sanjiv Gupta
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67370 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-18	Added missing support for widening when splitting an unary op (PR3683)	Mon P Wang
	and expanding a bit convert (PR3711). In both cases, we extract the valid part of the widen vector and then do the conversion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67175 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17	Don't force promotion of return arguments on the callee.	Rafael Espindola
	Some architectures (like x86) don't require it. This fixes bug 3779. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67132 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17	Fix codegen to compute the size of an allocation by multiplying the	Chris Lattner
	size by the array amount as an i32 value instead of promoting from i32 to i64 then doing the multiply. Not doing this broke wrap-around assumptions that the optimizers (validly) made. The ultimate real fix for this is to introduce i64 version of alloca and remove mallocinst. This fixes PR3829 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67093 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-17	Fix a problem with DAGCombine where we were building an illegal build	Mon P Wang
	vector shuffle mask. Forced the mask to be built using i32. Note: this will be irrelevant once vector_shuffle no longer takes a build vector for the shuffle mask. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67076 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-14	Avoid doing the transformation c ? 1.0 : 2.0 as load { 2.0, 1.0 } + c*4	Mon P Wang
	if FPConstant is legal because if the FPConstant doesn't need to be stored in a constant pool, the transformation is unlikely to be profitable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66994 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13	Improve FastISel's handling of truncates to i1, and implement	Dan Gohman
	ptrtoint and inttoptr in X86FastISel. These casts aren't always handled in the generic FastISel code because X86 sometimes needs custom code to do truncation and zero-extension. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66988 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13	Fix FastISel's assumption that i1 values are always zero-extended	Dan Gohman
	by inserting explicit zero extensions where necessary. Included is a testcase where SelectionDAG produces a virtual register holding an i1 value which FastISel previously mistakenly assumed to be zero-extended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66941 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13	Fix some significant problems with constant pools that resulted in ↵	Evan Cheng
	unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues. 1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants. 2. MachineConstantPool alignment field is also a log2 value. 3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values. 4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries. 5. Asm printer uses expensive data structure multimap to track constant pool entries by sections. 6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic. Solutions: 1. ConstantPoolSDNode alignment field is changed to keep non-log2 value. 2. MachineConstantPool alignment field is also changed to keep non-log2 value. 3. Functions that create ConstantPool nodes are passing in non-log2 alignments. 4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT. 5. Asm printer uses cheaper data structure to group constant pool entries. 6. Asm printer compute entry offsets after grouping is done. 7. Change JIT code to compute entry offsets on the fly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66875 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13	Oops...I committed too much.	Bill Wendling
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66867 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13	Temporarily XFAIL this test.	Bill Wendling
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66866 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-12	Fix a typo in a comment.	Dan Gohman
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66843 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-12	Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))"	Chris Lattner
	related transformations out of target-specific dag combine into the ARM backend. These were added by Evan in r37685 with no testcases and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll). Add some simple X86-specific (for now) DAG combines that turn things like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently with the recently added cp constant select optimization, but is a very general xform. For example, we now compile the second example in const-select.ll to: _test: movsd LCPI2_0, %xmm0 ucomisd 8(%esp), %xmm0 seta %al movzbl %al, %eax movl 4(%esp), %ecx movsbl (%ecx,%eax,4), %eax ret instead of: _test: movl 4(%esp), %eax leal 4(%eax), %ecx movsd LCPI2_0, %xmm0 ucomisd 8(%esp), %xmm0 cmovbe %eax, %ecx movsbl (%ecx), %eax ret This passes multisource and dejagnu. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66779 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-12	Enable Chris' value propagation change. It make available known sign, zero, ↵	Evan Cheng
	one bits information for values that are live out of basic blocks. The goal is to eliminate unnecessary sext, zext, truncate of values that are live-in to blocks. This does not handle PHI nodes yet. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66777 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-11	reapply my previous patch (r66358) with a tweak to set the	Chris Lattner
	alignment of the generated constant pool entry to the desired alignment of a type. If we don't do this, we end up trying to do movsd from 4-byte alignment memory. This fixes 450.soplex and 456.hmmer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66641 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-10	Revert 66358 for now. It's breaking povray, 450.soplex, and 456.hmmer on x86 ↵	Evan Cheng
	/ Darwin. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66574 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-09	Fix PR3763 by using proper APInt methods instead of uint64_t's.	Chris Lattner
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66434 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-09	Pass in a std::string when getting the names of debugging things. This cuts down	Bill Wendling
	on the number of times a std::string is created and copied. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66396 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-08	implement an optimization to codegen c ? 1.0 : 2.0 as load { 2.0, 1.0 } + c*4.	Chris Lattner
	For 2009-03-07-FPConstSelect.ll we now produce: _f: xorl %eax, %eax testl %edi, %edi movl $4, %ecx cmovne %rax, %rcx leaq LCPI1_0(%rip), %rax movss (%rcx,%rax), %xmm0 ret previously we produced: _f: subl $4, %esp cmpl $0, 8(%esp) movss LCPI1_0, %xmm0 je LBB1_2 ## entry LBB1_1: ## entry movss LCPI1_1, %xmm0 LBB1_2: ## entry movss %xmm0, (%esp) flds (%esp) addl $4, %esp ret on PPC the code also improves to: _f: cntlzw r2, r3 srwi r2, r2, 5 li r3, lo16(LCPI1_0) slwi r2, r2, 2 addis r3, r3, ha16(LCPI1_0) lfsx f1, r3, r2 blr from: _f: li r2, lo16(LCPI1_1) cmplwi cr0, r3, 0 addis r2, r2, ha16(LCPI1_1) beq cr0, LBB1_2 ; entry LBB1_1: ; entry li r2, lo16(LCPI1_0) addis r2, r2, ha16(LCPI1_0) LBB1_2: ; entry lfs f1, 0(r2) blr This also improves the existing pic-cpool case from: foo: subl $12, %esp call .Lllvm$1.$piclabel .Lllvm$1.$piclabel: popl %eax addl $_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax cmpl $0, 16(%esp) movsd .LCPI1_0@GOTOFF(%eax), %xmm0 je .LBB1_2 # entry .LBB1_1: # entry movsd .LCPI1_1@GOTOFF(%eax), %xmm0 .LBB1_2: # entry movsd %xmm0, (%esp) fldl (%esp) addl $12, %esp ret to: foo: call .Lllvm$1.$piclabel .Lllvm$1.$piclabel: popl %eax addl $_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax xorl %ecx, %ecx cmpl $0, 4(%esp) movl $8, %edx cmovne %ecx, %edx fldl .LCPI1_0@GOTOFF(%eax,%edx) ret This triggers a few dozen times in spec FP 2000. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66358 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-08	random cleanups.	Chris Lattner
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66357 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-07	Introduce new linkage types linkonce_odr, weak_odr, common_odr	Duncan Sands
	and extern_weak_odr. These are the same as the non-odr versions, except that they indicate that the global will only be overridden by an equivalent global. In C, a function with weak linkage can be overridden by a function which behaves completely differently. This means that IP passes have to skip weak functions, since any deductions made from the function definition might be wrong, since the definition could be replaced by something completely different at link time. This is not allowed in C++, thanks to the ODR (One-Definition-Rule): if a function is replaced by another at link-time, then the new function must be the same as the original function. If a language knows that a function or other global can only be overridden by an equivalent global, it can give it the weak_odr linkage type, and the optimizers will understand that it is alright to make deductions based on the function body. The code generators on the other hand map weak and weak_odr linkage to the same thing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66339 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-06	Fix ScheduleDAGRRList::CopyAndMoveSuccessors' handling of nodes	Dan Gohman
	with multiple chain operands. This can occur when the scheduler has added chain operands to a node that already has a chain operand, in order to handle physical register dependencies. This fixes an llvm-gcc bootstrap failure on x86-64 introduced in r66058. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66240 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04	Fix BuildVectorSDNode::isConstantSplat to handle one-element vectors.	Bob Wilson
	It is an error to call APInt::zext with a size that is equal to the value's current size, so use zextOrTrunc instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66039 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04	PR3686: make the legalizer handle bitcast from i80 to x86 long double.	Eli Friedman
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66021 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04	Fix PR3701. 1. X86 target renamed eflags register to flags. This matches ↵	Evan Cheng
	what llvm-gcc generates so codegen knows flags register is being clobbered by inline asm. 2. BURR scheduler should also check if inline asm nodes can clobber "live" physical registers. Previously it was only checking target nodes with implicit defs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65996 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-04	The DAG combiner was performing a BT combine. The BT combine had a value of -1,	Bill Wendling
	so it changed it into a 31 via the TLO.ShrinkDemandedConstant() call. Then it would go through the DAG combiner again. This time it had a value of 31, which was turned into a -1 by TLI.SimplifyDemandedBits(). This would ping pong forever. Teach the TLO.ShrinkDemandedConstant() call not to lower a value if the demanded value is an XOR of all ones. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65985 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-02	Generalize BuildVectorSDNode::isConstantSplat to use APInts and handle	Bob Wilson
	arbitrary vector sizes. Add an optional MinSplatBits parameter to specify a minimum for the splat element size. Update the PPC target to use the revised interface. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65899 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-01	Fix a problem with DAGCombine on 64b targets where folding	Nate Begeman
	extracts + build_vector into a shuffle would fail, because the type of the new build_vector would not be legal. Try harder to create a legal build_vector type. Note: this will be totally irrelevant once vector_shuffle no longer takes a build_vector for shuffle mask. New: _foo: xorps %xmm0, %xmm0 xorps %xmm1, %xmm1 subps %xmm1, %xmm1 mulps %xmm0, %xmm1 addps %xmm0, %xmm1 movaps %xmm1, 0 Old: _foo: xorps %xmm0, %xmm0 movss %xmm0, %xmm1 xorps %xmm2, %xmm2 unpcklps %xmm1, %xmm2 pshufd $80, %xmm1, %xmm1 unpcklps %xmm1, %xmm2 pslldq $16, %xmm2 pshufd $57, %xmm2, %xmm1 subps %xmm0, %xmm1 mulps %xmm0, %xmm1 addps %xmm0, %xmm1 movaps %xmm1, 0 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65791 91177308-0d34-0410-b5e6-96231b3b80d8