llvm - http://llvm.org

Age	Commit message	Author
2013-01-11	PPC: Implement efficient lowering of sign_extend_inreg.	Nadav Rotem
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172269 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-11	Update patch for the pad short functions pass for Intel Atom (only).	Preston Gurd
	Adds a check for -Oz, changes the code to not re-visit BBs, and skips over DBG_VALUE instrs. Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172258 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-11	For inline asm:	Eric Christopher
	- recognize string "{memory}" in the MI generation - mark as mayload/maystore when there's a memory clobber constraint. PR14859. Patch by Krzysztof Parzyszek git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172228 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-11	Simplify writing floating types to assembly.	Tim Northover
	This removes previous special cases for each floating-point type in favour of a shared codepath. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172189 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-10	llvm/test/CodeGen/X86/ms-inline-asm.ll: Fixup; Globals don't have a leading ↵	NAKAMURA Takumi
	underscore in symbol on linux. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172139 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-10	PR14896: Handle memcpy from constant string where the memcpy size is larger ↵	Evan Cheng
	than the string size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172124 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-10	[ms-inline asm] Add support for calling functions from inline assembly.	Chad Rosier
	Part of rdar://12991541 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172121 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-10	Stack Alignment: throw error if we can't satisfy the minimal alignment	Manman Ren
	requirement when creating stack objects in MachineFrameInfo. Add CreateStackObjectWithMinAlign to throw an error when the minimal alignment can't be achieved and to clamp the alignment when the preferred alignment can't be achieved. The same is true for CreateVariableSizedObject. Will not emit an error in CreateSpillStackObject or CreateStackObject. As long as callers of CreateStackObject do not assume the object will be aligned at the requested alignment, we should not have a miscompile, since later optimizations which look at the object's alignment will have the correct information. rdar://12713765 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172027 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09	Fix a DAG combine bug: visitBRCOND() is transforming br(xor(x, y)) to br(x != y).	Evan Cheng
	It cached XOR's operands before calling visitXOR() but failed to update the operands when visitXOR changed the XOR node. rdar://12968664 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171999 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09	add -march to the test	Nadav Rotem
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171956 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09	Efficient lowering of vector sdiv when the divisor is a splatted power of ↵	Nadav Rotem
	two constant. PR 14848. The lowered sequence is based on the existing sequence the target-independent DAG Combiner creates for the scalar case. Patch by Zvi Rackover. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171953 91177308-0d34-0410-b5e6-96231b3b80d8
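The scalar power-of-two sequence the message refers to can be sketched in portable C++. This is an illustrative model only, not the code from the patch: the names `sdivPow2Lane` and `sdivPow2Splat` are hypothetical, the vector is modeled as a four-element array, only positive divisors 2^k with 1 <= k <= 30 are handled, and `>>` on a negative `int32_t` is assumed to be an arithmetic shift (guaranteed since C++20, and the behavior of mainstream compilers before that).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Signed division of one 32-bit lane by 2^k, rounding toward zero like sdiv.
// Hypothetical helper: a plain arithmetic shift would round toward negative
// infinity, so negative inputs are first biased by (2^k - 1).
inline int32_t sdivPow2Lane(int32_t x, unsigned k) {
  assert(k >= 1 && k <= 30 && "sketch only covers positive 2^k divisors");
  int32_t sign = x >> 31;  // 0 for non-negative x, -1 for negative x
  uint32_t bias = static_cast<uint32_t>(sign) >> (32 - k);  // 0 or 2^k - 1
  int32_t adjusted = x + static_cast<int32_t>(bias);
  return adjusted >> k;  // arithmetic shift completes the division
}

// Applies the same sequence to every lane, as a splatted divisor would.
inline std::array<int32_t, 4> sdivPow2Splat(std::array<int32_t, 4> v,
                                            unsigned k) {
  for (int32_t &lane : v)
    lane = sdivPow2Lane(lane, k);
  return v;
}
```

Because every lane shares the same shift amounts when the divisor is a splat, each scalar step maps onto one vector shift or add, which is why the scalar sequence carries over to the vector case.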
2013-01-09	MIsched: add an ILP window property to machine model.	Andrew Trick
	This was an experimental option, but needs to be defined per-target. e.g. PPC A2 needs to aggressively hide latency. I converted some in-order scheduling tests to A2. Hal is working on more test cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171946 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-08	Specify complete triple for fp128 tests.	Tim Northover
	This avoids FileCheck failing over different comment characters in assembly (notably powerpc64 on Linux vs Darwin) and should fix David's build-bot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171886 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-08	Pad Short Functions for Intel Atom	Preston Gurd
	The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. This patch has been updated to address Nadav's review comments: optimize only at >= O1 and don't do optimization if -Os is set; store MachineBasicBlock* instead of BBNum; use DenseMap instead of std::map; fix placement of braces. Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171879 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-08	Allow the asm printer to print fp128 values properly.	Tim Northover
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171866 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-07	This patch addresses bug 14678 by fixing two problems in medium code model	Bill Schmidt
	code generation. Variables addressed through a GlobalAlias were not being handled, and variables with available_externally linkage were treated incorrectly. The patch contains two new tests to verify the correct code generation for these cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171778 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-07	Make the MergeGlobals pass correctly handle the address space qualifiers of ↵	Silviu Baranga
	the global variables. We partition the set of globals by their address space, and apply the same transformation as before to merge them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171730 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-06	Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, ↵	Craig Topper
	cvtss2si, cvttss2si, cvtsd2si, and cvttsd2si to match gas behavior. cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix. cvtt2si/cvt2si should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix. Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171668 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-06	Fix for PR14739. It's not safe to fold a load into a call across a store. ↵	Evan Cheng
	Thanks to Nick Lewycky for the initial patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171665 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-05	Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions ↵	Craig Topper
	hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171608 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-05	Revert revision 171524. Original message:	Nadav Rotem
	URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev Log: The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171603 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-04	The current Intel Atom microarchitecture has a feature whereby when a function	Preston Gurd
	returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171524 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-04	[mips] MipsTargetLowering::getSetCCResultType should return a vector type if	Akira Hatanaka
	vectors are being compared. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171517 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-04	Revert revision: 171467. This transformation is incorrect and makes some ↵	Nadav Rotem
	tests fail. Original message: Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. Added a test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171468 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-03	Simplified TRUNCATE operation that comes after SETCC. It is possible since ↵	Elena Demikhovsky
	SETCC result is 0 or -1. Added a test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171467 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-03	Revert "Mark DIV/IDIV instructions hasSideEffects=1 because they can trap ↵	Michael Gottesman
	when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks." This reverts commit r171461 since it breaks the following tests: Clang :: Analysis/outofbound-notwork.c Clang :: Analysis/string-fail.c Clang :: CXX/basic/basic.lookup/basic.lookup.qual/p6-0x.cpp Clang :: CXX/basic/basic.lookup/basic.lookup.unqual/p15.cpp Clang :: CXX/dcl.dcl/dcl.spec/dcl.fct.spec/p4.cpp Clang :: CXX/dcl.dcl/dcl.spec/dcl.stc/p10.cpp Clang :: CXX/temp/temp.param/p14.cpp Clang :: CXX/temp/temp.res/temp.dep.res/temp.point/p1.cpp Clang :: CodeGen/2009-02-13-zerosize-union-field-ppc.c Clang :: CodeGen/blocks-2.c Clang :: CodeGen/libcalls-d.c Clang :: CodeGen/libcalls-ld.c Clang :: CodeGenCXX/conversion-function.cpp Clang :: CodeGenCXX/debug-info-limit-type.cpp Clang :: CodeGenCXX/inheriting-constructor.cpp Clang :: FixIt/fixit-errors.c Clang :: FixIt/fixit-pmem.cpp Clang :: Modules/namespaces.cpp Clang :: PCH/changed-files.c Clang :: PCH/pr4489.c Clang :: PCH/source-manager-stack.c Clang :: Parser/cxx-ambig-decl-expr-xfail.cpp Clang :: SemaCXX/switch-implicit-fallthrough-cxx98.cpp Clang :: SemaTemplate/instantiate-function-1.mm git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171466 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-03	Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when ↵	Craig Topper
	dividing by 0. This is needed to keep early if conversion from moving them across basic blocks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171461 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-03	Fix PR14732 by handling all kinds of IMPLICIT_DEF live ranges.	Jakob Stoklund Olesen
	Most IMPLICIT_DEF instructions are removed by the ProcessImplicitDefs pass, and a few are reinserted by PHIElimination when a PHI argument is <undef>. RegisterCoalescer was assuming that all IMPLICIT_DEF live ranges look like those created by PHIElimination, and that their live range never leaves the basic block. The PR14732 test case does tricks with PHI nodes that cause a longer IMPLICIT_DEF live range to appear. This happens very rarely, but RegisterCoalescer should be able to handle it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171435 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-02	DAGCombiner: Avoid generating illegal vector INT_TO_FP nodes	Tom Stellard
	DAGCombiner::reduceBuildVecConvertToConvertBuildVec() was making two mistakes: 1. It was checking the legality of scalar INT_TO_FP nodes and then generating vector nodes. 2. It was passing the result value type to TargetLoweringInfo::getOperationAction() when it should have been passing the value type of the first operand. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171420 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-02	AVX: Fix a bug in WidenMaskArithmetic.	Nadav Rotem
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171397 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-30	Support ppcf128 in SelectionDAG::getConstantFP	Hal Finkel
	Fixes pr14751. Patch by Kai; Thanks! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171261 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-30	Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ↵	Dmitri Gribenko
	ModuleID. This is done to avoid odd test failures, like the one fixed in r171243. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171250 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-28	AVX: Move the ZEXT/ANYEXT DAGCo optimizations to the lowering of these ↵	Nadav Rotem
	optimizations. The old test cases still cover all of these lowering/optimizations. The single change that we have is that now anyext does not need to zero a register, because it does not use the exact code path as the zero_extend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171178 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-27	On AVX/AVX2 the type v8i1 is legalized to v8i16, which is an XMM sized	Nadav Rotem
	register. In most cases we actually compare or select YMM-sized registers and mixing the two types creates horrible code. This commit optimizes some of the transition sequences. PR14657. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171148 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-26	llvm/test/CodeGen/X86: FileCheck-ize two tests in r171083.	NAKAMURA Takumi
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171084 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-26	llvm/test/CodeGen/X86: Disable avx in two tests corresponding to r171082.	NAKAMURA Takumi
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171083 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-25	Loosen scheduling restrictions on the PPC dcbt intrinsic	Hal Finkel
	As with the prefetch intrinsic to which it maps, simply have dcbt marked as reading from and writing to its arguments instead of having unmodeled side effects. While this might cause unwanted code motion (because aliasing checks don't really capture cache-line sharing), it is more important that prefetches in unrolled loops don't block the scheduler from rearranging the unrolled loop body. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171073 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-25	Expand PPC64 atomic load and store	Hal Finkel
	Use of store or load with the atomic specifier on 64-bit types would cause instruction-selection failures. As with the 32-bit case, these can use the default expansion in terms of cmp-and-swap. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171072 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-25	Harden test so it's not affected by changes to compare lowering.	Benjamin Kramer
	This only failed on hosts that don't have SSE41. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171066 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-25	X86: Shave off one shuffle from the pcmpeqq sequence for SSE2 by making use ↵	Benjamin Kramer
	of AND commutativity. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-25	X86: Custom lower <2 x i64> eq and ne when SSE41 is not available.	Benjamin Kramer
	pcmpeqd, pshufd, pshufd, pand is cheaper than unpack + cmpq, sbbq, cmpq, sbbq + pack. Small speedup on loop-vectorized viterbi (-march=core2). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171063 91177308-0d34-0410-b5e6-96231b3b80d8
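The idea behind a 64-bit equality test built from 32-bit compares can be modeled in portable C++. This is a sketch under stated assumptions, not the instruction sequence emitted by the patch: the types `V4x32` and `V2x64` and the functions `split` and `cmpEq64` are hypothetical stand-ins for an XMM register and its operations, and a single half-swap is used where the message lists two pshufd instructions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4x32 = std::array<uint32_t, 4>;  // hypothetical model of 4 x i32
using V2x64 = std::array<uint64_t, 2>;  // hypothetical model of 2 x i64

// Views each 64-bit lane as a (low, high) pair of 32-bit lanes.
inline V4x32 split(const V2x64 &v) {
  return {static_cast<uint32_t>(v[0]), static_cast<uint32_t>(v[0] >> 32),
          static_cast<uint32_t>(v[1]), static_cast<uint32_t>(v[1] >> 32)};
}

// Returns all-ones in each 64-bit lane where a and b are equal, else zero.
inline V2x64 cmpEq64(const V2x64 &a, const V2x64 &b) {
  V4x32 x = split(a), y = split(b), eq{}, res{};
  for (std::size_t i = 0; i < 4; ++i)  // models pcmpeqd: per-32-bit equality
    eq[i] = (x[i] == y[i]) ? 0xFFFFFFFFu : 0u;
  V4x32 swapped = {eq[1], eq[0], eq[3], eq[2]};  // models pshufd: swap halves
  for (std::size_t i = 0; i < 4; ++i)  // models pand: both halves must match
    res[i] = eq[i] & swapped[i];
  return {(static_cast<uint64_t>(res[1]) << 32) | res[0],
          (static_cast<uint64_t>(res[3]) << 32) | res[2]};
}
```

A 64-bit lane is equal only when both of its 32-bit halves are equal, so ANDing the 32-bit compare result with its half-swapped copy yields an all-ones or all-zero 64-bit mask.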
2012-12-24	llvm/test/CodeGen/X86/fold-vex.ll: Add explicit triple.	NAKAMURA Takumi
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171029 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-24	Some x86 instructions can load/store one of the operands to memory. On SSE, ↵	Nadav Rotem
	this memory needs to be aligned. When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding tables and removes the alignment restrictions from VEX-encoded instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171024 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-22	X86: Turn mul of <4 x i32> into pmuludq when no SSE4.1 is available.	Benjamin Kramer
	pmuludq is slow, but it turns out that all the unpacking and packing of the scalarized mul is even slower. 10% speedup on loop-vectorized paq8p. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170985 91177308-0d34-0410-b5e6-96231b3b80d8
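One common way to build a four-lane 32-bit multiply from pmuludq, which multiplies only lanes 0 and 2 into 64-bit products, is to multiply the even and odd lanes separately and interleave the low halves. The portable C++ model below illustrates that approach; it is an assumption about the general technique rather than a transcription of the patch, and the names `mulEvenLanes`, `oddToEven`, and `mul4x32` are hypothetical.

```cpp
#include <array>
#include <cstdint>

using V4x32 = std::array<uint32_t, 4>;  // hypothetical model of 4 x i32
using V2x64 = std::array<uint64_t, 2>;  // hypothetical model of 2 x i64

// Models pmuludq: multiplies 32-bit lanes 0 and 2 into two 64-bit products.
inline V2x64 mulEvenLanes(const V4x32 &a, const V4x32 &b) {
  return {static_cast<uint64_t>(a[0]) * b[0],
          static_cast<uint64_t>(a[2]) * b[2]};
}

// Moves odd lanes (1, 3) into even positions (0, 2), like a 64-bit logical
// right shift by 32 applied to each half of the register.
inline V4x32 oddToEven(const V4x32 &v) { return {v[1], 0, v[3], 0}; }

// Full 4 x i32 multiply: two widening multiplies, then keep the low 32 bits
// of each product and interleave them back into lane order.
inline V4x32 mul4x32(const V4x32 &a, const V4x32 &b) {
  V2x64 even = mulEvenLanes(a, b);
  V2x64 odd = mulEvenLanes(oddToEven(a), oddToEven(b));
  return {static_cast<uint32_t>(even[0]), static_cast<uint32_t>(odd[0]),
          static_cast<uint32_t>(even[1]), static_cast<uint32_t>(odd[1])};
}
```

Only the low 32 bits of each 64-bit product are kept, which matches the wrapping semantics of an i32 multiply for both signed and unsigned inputs.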
2012-12-22	X86: Emit vector sext as shuffle + sra if vpmovsx is not available.	Benjamin Kramer
	Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better than scalarized loads. Fixes PR14590. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170984 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21	In some cases, due to scheduling constraints we copy the EFLAGS.	Nadav Rotem
	The only way to read the eflags is using push and pop. If we don't adjust the stack then we run over the first frame index. This is not something that we want to do, so we have to make sure that our machine function does not copy the flags. If it does then we have to emit the prolog that adjusts the stack. rdar://12896831 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170961 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21	try to unbreak ppc buildbots.	Benjamin Kramer
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170913 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21	X86: Match pmin/pmax as a target specific dag combine. This occurs during ↵	Benjamin Kramer
	vectorization. Part of PR14667. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170908 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21	R600: Expand vec4 INT <-> FP conversions	Tom Stellard
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170901 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21	Add test case for r170674	Reed Kotler
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170823 91177308-0d34-0410-b5e6-96231b3b80d8