aboutsummaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
2012-04-09Remove unnecessary type check when combining and/or/xor of swizzles. Move ↵Craig Topper
some checks to allow better early out. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154309 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09Remove unnecessary 'else' on an 'if' that always returnsCraig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154308 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09Optimize code slightly. No functionality change.Craig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154307 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09Replace some explicit checks with asserts for conditions that should never ↵Craig Topper
happen. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154305 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-09Cleanup and relax a restriction on the matching of global offsets intoChandler Carruth
x86 addressing modes. This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed. To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases: 1) 32-bit is trivially correct, and unmodified here. 2) 64-bit non-small mode is unchanged and never matches. 3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch. 4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged. 5) 64-bit small PIC code which is *not* using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering). I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154304 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08Optimize code a bit. No functional change intended.Craig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154299 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08Silence sign-compare warning.Benjamin Kramer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154297 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08Only have codegen turn fdiv by a constant into fmul by the reciprocalDuncan Sands
when -ffast-math, i.e. don't just always do it if the reciprocal can be formed exactly. There is already an IR level transform that does that, and it does it more carefully. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154296 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08Simplify code that tries to do vector extracts for shuffles when the mask ↵Craig Topper
width and the input vector widths don't match. No need to check the min and max are in range before calculating the start index. The range check after having the start index is sufficient. Also no need to check for an extract from the beginning differently. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154295 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08Teach LLVM about a PIE option which, when enabled on top of PIC, makesChandler Carruth
optimizations which are valid for position independent code being linked into a single executable, but not for such code being linked into a shared library. I discussed the design of this with Eric Christopher, and the decision was to support an optional bit rather than a completely separate relocation model. Fundamentally, this is still PIC relocation, its just that certain optimizations are only valid under a PIC relocation model when the resulting code won't be in a shared library. The simplest path to here is to expose a single bit option in the TargetOptions. If folks have different/better designs, I'm all ears. =] I've included the first optimization based upon this: changing TLS models to the *Exec models when PIE is enabled. This is the LLVM component of PR12380 and is all of the hard work. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154294 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08Move the TLSModel information into the TargetMachine rather than hidingChandler Carruth
in TargetLowering. There was already a FIXME about this location being odd. The interface is simplified as a consequence. This will also make it easier to change TLS models when compiling with PIE. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154292 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08EngineBuilder::create is expected to take ownership of the TargetMachine ↵Benjamin Kramer
passed to it. Delete it on error or when we create an interpreter that doesn't need it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154288 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08Remove an over zealous assert. The assert was trying to catch placesChandler Carruth
where a chain outside of the loop block-set ended up in the worklist for scheduling as part of the contiguous loop. However, asserting the first block in the chain is in the loop-set isn't a valid check -- we may be forced to drag a chain into the worklist due to one block in the chain being part of the loop even though the first block is *not* in the loop. This occurs when we have been forced to form a chain early due to un-analyzable branches. No test case here as I have no idea how to even begin reducing one, and it will be hopelessly fragile. We have to somehow end up with a loop header of an inner loop which is a successor of a basic block with an unanalyzable pair of branch instructions. Ow. Self-host triggers it so it is unlikely it will regress. This at least gets block placement back to passing selfhost and the test suite. There are still a lot of slowdown that I don't like coming out of block placement, although there are now also a lot of speedups. =[ I'm seeing swings in both directions up to 10%. I'm going to try to find time to dig into this and see if we can turn this on for 3.1 as it does a really good job of cleaning up after some loops that degraded with the inliner changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154287 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08Add a debug-only 'dump' method to the BlockChain structure to easeChandler Carruth
debugging. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154286 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08Teach InstCombine to nuke a common alloca pattern -- an alloca which hasChandler Carruth
GEPs, bit casts, and stores reaching it but no other instructions. These often show up during the iterative processing of the inliner, SROA, and DCE. Once we hit this point, we can completely remove the alloca. These were actually showing up in the final, fully optimized code in a bunch of inliner tests I've been working on, and notably they show up after LLVM finishes optimizing away all function calls involved in hash_combine(a, b). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154285 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08AVX2: Build splat vectors by broadcasting a scalar from the constant pool.Nadav Rotem
Previously we used three instructions to broadcast an immediate value into a vector register. On Sandybridge we continue to load the broadcasted value from the constant pool. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154284 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08Remove the 'Parent' pointer from the MDNodeOperand class.Bill Wendling
An MDNode has a list of MDNodeOperands allocated directly after it as part of its allocation. Therefore, the Parent of the MDNodeOperands can be found by walking back through the operands to the beginning of that list. Mark the first operand's value pointer as being the 'first' operand so that we know where the beginning of said list is. This saves a *lot* of space during LTO with -O0 -g flags. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154280 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-08Allow subclasses of the ValueHandleBase to store information as part of theBill Wendling
value pointer by making the value pointer into a pointer-int pair with 2 bits available for flags. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154279 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and ↵Craig Topper
remove patterns for selecting the intrinsic. Similar was already done for avx1. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154272 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07Move vinsertf128 patterns near the instruction definitions. Add ↵Craig Topper
AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154268 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07Remove 'else' after 'if' that ends in return.Craig Topper
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154267 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-071. Remove the part of r153848 which optimizes shuffle-of-shuffle into a newNadav Rotem
shuffle node because it could introduce new shuffle nodes that were not supported efficiently by the target. 2. Add a more restrictive shuffle-of-shuffle optimization for cases where the second shuffle reverses the transformation of the first shuffle. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154266 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07Convert floating point division by a constant into multiplication by theDuncan Sands
reciprocal if converting to the reciprocal is exact. Do it even if inexact if -ffast-math. This substantially speeds up ac.f90 from the polyhedron benchmarks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154265 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07Fix ValueTracking to conclude that debug intrinsics are safe toChandler Carruth
speculate. Without this, loop rotate (among many other places) would suddenly stop working in the presence of debug info. I found this looking at loop rotate, and have augmented its tests with a reduction out of a very hot loop in yacr2 where failing to do this rotation costs sometimes more than 10% in runtime performance, perturbing numerous downstream optimizations. This should have no impact on performance without debug info, but the change in performance when debug info is enabled can be extreme. As a consequence (and this how I got to this yak) any profiling of performance problems should be treated with deep suspicion -- they may have been wildly innacurate of debug info was enabled for profiling. =/ Just a heads up. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154263 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07SCEV: When expanding a GEP the final addition to the base pointer has NUW ↵Benjamin Kramer
but not NSW. Found by inspection. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154262 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07Fix Thumb __builtin_longjmp with integrated assembler. <rdar://problem/11203543>Bob Wilson
The tLDRr instruction with the last register operand set to the zero register prints in assembly as if no register was specified, and the assembler encodes it as a tLDRi instruction with a zero immediate. With the integrated assembler, that zero register gets emitted as "r0", so we get "ldr rx, [ry, r0]" which is broken. Emit the instruction as tLDRi with a zero immediate. I don't know if there's a good way to write a testcase for this. Suggestions welcome. Opportunities for follow-up work: 1) The asm printer should complain if a non-optional register operand is set to the zero register, instead of silently dropping it. 2) The integrated assembler should complain in the same situation, instead of silently emitting the operand as "r0". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154261 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07Refactor: Use positive field names in VectorizeConfig.Hongbin Zheng
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154249 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07Target/X86/MCTargetDesc/X86MCAsmInfo.cpp: Enable DwarfCFI (aka DW2) on Cygming.NAKAMURA Takumi
Cygwin-1.7 supports dw2. Some recent mingw distros support one, too. I have confirmed test-suite/SingleSource/Benchmarks/Shootout-C++/except.cpp can pass on Cygwin. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154247 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07Output UTF-8-encoded characters as identifier characters into assemblySean Hunt
by default. This is a behaviour configurable in the MCAsmInfo. I've decided to turn it on by default in (possibly optimistic) hopes that most assemblers are reasonably sane. If this proves a problem, switching to default seems reasonable. I'm not sure if this is the opportune place to test, but it seemed good to make sure it was tested somewhere. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154235 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06Tidy up. 80 columns.Jim Grosbach
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154226 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06ARMPat is equivalent to Requires<[IsARM]>.Jakob Stoklund Olesen
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154210 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06Eliminate iOS-specific tail call instructions.Jakob Stoklund Olesen
After register masks were introdruced to represent the call clobbers, it is no longer necessary to have duplicate instruction for iOS. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154209 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06There is no portable std::abs overload for int64_t, use the llvm::abs64Chandler Carruth
which exists for this purpose. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154199 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06Fixed two leaks in the MC disassembler. The MCSean Callanan
disassembler requires a MCSubtargetInfo and a MCInstrInfo to exist in order to initialize the instruction printer and disassembler; however, although the printer and disassembler keep references to these objects they do not own them. Previously, the MCSubtargetInfo and MCInstrInfo objects were just leaked. I have extended LLVMDisasmContext to own these objects and delete them when it is destroyed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154192 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06Allow negative immediates in ARM and Thumb2 compares.Jakob Stoklund Olesen
ARM and Thumb2 mode can use cmn instructions to compare against negative immediates. Thumb1 mode can't. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154183 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06Reintroduce InlineCostAnalyzer::getInlineCost() variant with explicit calleeDavid Chisnall
parameter until we have a more sensible API for doing the same thing. Reviewed by Chandler. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154180 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06Sink the collection of return instructions until after *all*Chandler Carruth
simplification has been performed. This is a bit less efficient (requires another ilist walk of the basic blocks) but shouldn't matter in practice. More importantly, it's just too much work to keep track of all the various ways the return instructions can be mutated while simplifying them. This fixes yet another crasher, reported by Daniel Dunbar. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154179 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06Make GVN's propagateEquality non-recursive. No intended functionality change.Duncan Sands
The modifications are a lot more trivial than they appear to be in the diff! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154174 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06Fix narrowing conversion.Benjamin Kramer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154171 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06Allow 256-bit shuffles to be split if a 128-bit lane contains elements from ↵Craig Topper
a single source. This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154166 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06Sink the return instruction collection until after we're done deletingChandler Carruth
dead code, including dead return instructions in some cases. Otherwise, we end up having a bogus poniter to a return instruction that blows up much further down the road. It turns out that this pattern is both simpler to code, easier to update in the face of enhancements to the inliner cleanup, and likely cheaper given that it won't add dead instructions to the list. Thanks to John Regehr's numerous test cases for teasing this out. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154157 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-06Deduplicate ARM call-related instructions.Jakob Stoklund Olesen
We had special instructions for iOS because r9 is call-clobbered, but that is represented dynamically by the register mask operands now, so there is no need for the pseudo-instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154144 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05ARM: Don't form a t2LDRi8 or t2STRi8 with an offset of zero.Jim Grosbach
The load/store optimizer splits LDRD/STRD into two instructions when the register pairing doesn't work out. For negative offsets in Thumb2, it uses t2STRi8 to do that. That's fine, except for the case when the offset is in the range [-4,-1]. In that case, we'll also form a second t2STRi8 with the original offset plus 4, resulting in a t2STRi8 with a non-negative offset, which ends up as if it were an STRT, which is completely bogus. Similarly for loads. No testcase, unfortunately, as any I've been able to construct is both large and extremely fragile. rdar://11193937 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154141 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05ARM assembly aliases for add negative immediates using sub.Jim Grosbach
'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out. Thumb1 aliases for adding a negative immediate to the stack pointer, also. rdar://11192734 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154123 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05Patch to set is_stmt a little better for prologue lines in a function.Eric Christopher
This enables debuggers to see what are interesting lines for a breakpoint rather than any line that starts a function. rdar://9852092 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154120 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05Don't break the IV update in TLI::SimplifySetCC().Jakob Stoklund Olesen
LSR always tries to make the ICmp in the loop latch use the incremented induction variable. This allows the induction variable to be kept in a single register. When the induction variable limit is equal to the stride, SimplifySetCC() would break LSR's hard work by transforming: (icmp (add iv, stride), stride) --> (cmp iv, 0) This forced us to use lea for the IC update, preventing the simpler incl+cmp. <rdar://problem/7643606> <rdar://problem/11184260> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154119 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05Fix accidentally inverted logic from r152803, and make theDan Gohman
testcase slightly less trivial. This fixes rdar://11171718. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154118 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05Treat f16 the same as f80/f128 for the purposes of generating constants ↵Owen Anderson
during instruction selection. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154113 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05Added support for unpredictable ADC/SBC instructions on ARM, and also fixed ↵Silviu Baranga
some corner cases involving the PC register as an operand for these instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154101 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-05Added support for handling unpredictable arithmetic instructions on ARM.Silviu Baranga
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154100 91177308-0d34-0410-b5e6-96231b3b80d8