aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2012-06-11Re-enable the CMN instruction.Bill Wendling
We turned off the CMN instruction because it had semantics which we weren't getting correct. If we are comparing with an immediate, then it's okay to use the CMN instruction. <rdar://problem/7569620> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158302 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11InstCombine: factor code better.Benjamin Kramer
No functionality change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158301 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-10InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing ↵Benjamin Kramer
the compare. This saves a cast, and zext is more expensive on platforms with subreg support than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750. On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the same performance now when not inlining either function. stupid_memchr: 323.0us bsd_memchr: 321.0us memchr: 479.0us where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time, I haven't fully understood the issue yet, something is grossly mangling the loop after inlining. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158297 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-10Enable ILP scheduling for all nodes by default on PPC.Hal Finkel
Over the entire test-suite, this has an insignificantly negative average performance impact, but reduces some of the worst slowdowns from the anti-dep. change (r158294). Largest speedups: SingleSource/Benchmarks/Stanford/Quicksort - 28% SingleSource/Benchmarks/Stanford/Towers - 24% SingleSource/Benchmarks/Shootout-C++/matrix - 23% MultiSource/Benchmarks/SciMark2-C/scimark2 - 19% MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15% (matrix and automotive-bitcount were both in the top-5 slowdown list from the anti-dep. change) Largest slowdowns: MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28% MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26% MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21% SingleSource/Benchmarks/CoyoteBench/lpbench - 20% MultiSource/Applications/d/make_dparser - 16% git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158296 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-10Add AutoUpgrade support for the SSE4 ptest intrinsics.Nadav Rotem
Patch by Michael Kuperstein. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158295 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-10Use critical anti-dep. breaking on all PPC targets, but also add other ↵Hal Finkel
register classes. Using 'all' instead of 'critical' would be better because it would make it easier to satisfy the bundling constraints, but, as noted in the FIXME, that is currently not possible with the crs. This yields an average 1% speedup over the entire test suite (on Power 7). Largest speedups: SingleSource/Benchmarks/Shootout-C++/moments - 40% MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28% SingleSource/Benchmarks/BenchmarkGame/nsieve-bits - 26% SingleSource/Benchmarks/McGill/misr - 23% MultiSource/Applications/JM/ldecod/ldecod - 22% Largest slowdowns: SingleSource/Benchmarks/Shootout-C++/matrix - -29% SingleSource/Benchmarks/Shootout-C++/ary3 - -22% MultiSource/Benchmarks/BitBench/uuencode/uuencode - -18% SingleSource/Benchmarks/Shootout-C++/ary - -17% MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - -15% git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158294 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-10Add intrinsics for immediate form of XOP vprot instructions. Use i128mem ↵Craig Topper
instead of f128mem for integer XOP instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158291 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Improve ext/trunc patterns on PPC64.Hal Finkel
The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that would leave self-moves in the final assembly. Replacing those patterns with ones based on the SUBREG builtins yields better-looking code. Thanks to Jakob and Owen for their suggestions in this matter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158283 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Use XOP vpcom intrinsics in patterns instead of a target specific SDNode ↵Craig Topper
type. Remove the custom lowering code that selected the SDNode type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158279 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate ↵Craig Topper
as an argument. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158278 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Hashing: Remove outdated comment. Support for reserved hash values was ↵Benjamin Kramer
removed in r151865. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158276 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Disabling a spurious deprecation warning about using PathV1 from within the ↵Aaron Ballman
PathV1 implementation file. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158274 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Fixing a typo in the comments.Aaron Ballman
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Allocate the contents of DwarfDebug's StringMaps in a single big ↵Benjamin Kramer
BumpPtrAllocator. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158265 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Silence a gcc-4.6 warning: GCC fails to understand that secondReg and cmpOp2 areDuncan Sands
correlated, and thinks that cmpOp2 may be used uninitialized. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158263 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Enable tail merging on PPC.Hal Finkel
Tail merging had been disabled on PPC because it would disturb bundling decisions made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions are made during post-RA scheduling, and tail merging is generally beneficial (the average test-suite speedup is insignificantly positive). Largest test-suite speedups: MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30% MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23% SingleSource/Benchmarks/Shootout-C++/ary - 21% SingleSource/Benchmarks/Stanford/Queens - 17% Largest slowdowns: MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24% MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22% MultiSource/Applications/JM/ldecod/ldecod - 14% MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9% This is improved by using full (instead of just critical) anti-dependency breaking, but doing so still causes miscompiles and so cannot yet be enabled by default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158259 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Register pressure: added getPressureAfterInstr.Andrew Trick
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158256 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Sketch a LiveRegMatrix analysis pass.Jakob Stoklund Olesen
The LiveRegMatrix represents the live range of assigned virtual registers in a Live interval union per register unit. This is not fundamentally different from the interference tracking in RegAllocBase that both RABasic and RAGreedy use. The important differences are: - LiveRegMatrix tracks interference per register unit instead of per physical register. This makes interference checks cheaper and assignments slightly more expensive. For example, the ARM D7 reigster has 24 aliases, so we would check 24 physregs before assigning to one. With unit-based interference, we check 2 units before assigning to 2 units. - LiveRegMatrix caches regmask interference checks. That is currently duplicated functionality in RABasic and RAGreedy. - LiveRegMatrix is a pass which makes it possible to insert target-dependent passes between register allocation and rewriting. Such passes could tweak the register assignments with interference checking support from LiveRegMatrix. Eventually, RABasic and RAGreedy will be switched to LiveRegMatrix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158255 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Test commitJack Carter
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158250 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Also compute MBB live-in lists in the new rewriter pass.Jakob Stoklund Olesen
This deduplicates some code from the optimizing register allocators, and it means that it is now possible to change the register allocators' solutions simply by editing the VirtRegMap between the register allocator pass and the rewriter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158249 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09Convert comments to proper Doxygen comments.Dmitri Gribenko
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158248 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Removing strange "using" declarations form TargetInstrInfo.Andrew Trick
I can't imagine why these were added. Trial and error. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158247 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Reintroduce VirtRegRewriter.Jakob Stoklund Olesen
OK, not really. We don't want to reintroduce the old rewriter hacks. This patch extracts virtual register rewriting as a separate pass that runs after the register allocator. This is possible now that CodeGen/Passes.cpp can configure the full optimizing register allocator pipeline. The rewriter pass uses register assignments in VirtRegMap to rewrite virtual registers to physical registers, and it inserts kill flags based on live intervals. These finalization steps are the same for the optimizing register allocators: RABasic, RAGreedy, and PBQP. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158244 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Don't run RAFast in the optimizing regalloc pipeline.Jakob Stoklund Olesen
The fast register allocator is not supposed to work in the optimizing pipeline. It doesn't make sense to compute live intervals, run full copy coalescing, and then run RAFast. Fast register allocation in the optimizing pipeline is better done by RABasic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158242 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08canonicalize:Nuno Lopes
-%a + 42 into 42 - %a previously we were emitting: -(%a + 42) This fixes the infinite loop in PR12338. The generated code is still not perfect, though. Will work on that next git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158237 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Start implementing pre-ra if-converter: using speculation and selects to ↵Evan Cheng
eliminate branches. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158234 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08TargetInstrInfo hooks implemented in codegen should be declared pure virtual.Andrew Trick
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158233 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Reapply commit 158073 with a fix (the testcase was already committed). TheDuncan Sands
problem was that by moving instructions around inside the function, the pass could accidentally move the iterator being used to advance over the function too. Fix this by only processing the instruction equal to the iterator, and leaving processing of instructions that might not be equal to the iterator to later (later = after traversing the basic block; it could also wait until after traversing the entire function, but this might make the sets quite big). Original commit message: Grab-bag of reassociate tweaks. Unify handling of dead instructions and instructions to reoptimize. Exploit this to more systematically eliminate dead instructions (this isn't very useful in practice but is convenient for analysing some testcase I am working on). No need for WeakVH any more: use an AssertingVH instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158226 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Remove the TODO statement in the PPC README re: CTR loopsHal Finkel
As Chris points out, this can now be removed! TODO: check if the associated section on viterbi's inner loop can also be removed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158224 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Enable PPC CTR loop formation by default.Hal Finkel
Thanks to Jakob's help, this now causes no new test suite failures! Over the entire test suite, this gives an average 1% speedup. The largest speedups are: SingleSource/Benchmarks/Misc/pi - 108% SingleSource/Benchmarks/CoyoteBench/lpbench - 54% MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50% SingleSource/Benchmarks/Shootout/ary3 - 32% SingleSource/Benchmarks/Shootout-C++/matrix - 30% The largest slowdowns are: MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30% MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25% MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22% MultiSource/Applications/d/make_dparser - -14% SingleSource/Benchmarks/Shootout-C++/ary - -13% In light of these slowdowns, additional profiling work is obviously needed! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158223 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Mark the PPC CTRRC and CTRRC8 register classes as non-allocatable.Hal Finkel
Marking these classes as non-alocatable allows CTR loop generation to work correctly with the block placement passes, etc. These register classes are currently used only by some unused TCRETURN patterns. In future cleanup, these will be removed. Thanks again to Jakob for suggesting this fix to the CTR loop problem! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158221 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Enable optimization for integer ABS on X86 if Subtarget has CMOV.Manman Ren
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158220 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Test case for r158160Manman Ren
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158218 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Sched itinerary fix: Avoid static initializers.Andrew Trick
This fixes an accidental dependence on static initialization order that I introduced yesterday. Thank you Lang!!! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158215 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Fix a crash in APInt::lshr when shiftAmt > BitWidth.Chad Rosier
Patch by James Benton <jbenton@vmware.com>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158213 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Fix Target->Codegen dependence.Andrew Trick
Bulk move of TargetInstrInfo implementation into TargetInstrInfoImpl. This is dirty because the code isn't part of TargetInstrInfoImpl class, nor should it be, because the methods are not target hooks. However, it's the current mechanism for keeping libTarget useful outside the backend. You'll get a not-so-nice link error if you invoke a TargetInstrInfo method that depends on CodeGen. The TargetInstrInfoImpl class should probably be removed since it doesn't really solve this problem. To really fix this, we probably need separate interfaces for the CodeGen/nonCodeGen sides of TargetInstrInfo. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158212 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08BoundsChecking: add support for ConstantPointerNull. fixes a bunch of ↵Nuno Lopes
instrumentation failures in loops with reallocs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158210 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911.NAKAMURA Takumi
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158209 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Disable the PPC CTR-Loops pass by default.Hal Finkel
The pass itself works well, but the something in the Machine* infrastructure does not understand terminators which define registers. Without the ability to use the block-placement pass, etc. this causes performance regressions (and so is turned off by default). Turning off the analysis turns off the problems with the Machine* infrastructure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158206 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Fix a bug in the new PPC CTR-Loops pass.Hal Finkel
The code which tests for an induction operation cannot assume that any ADDI instruction will have a register operand because the operand could also be a frame index; for example: %vreg16<def> = ADDI8 <fi#0>, 0; G8RC:%vreg16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158205 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form ↵Hal Finkel
CTR-based loop branching code. This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are no longer otherwise used. Also, invalid preheader DebugLoc is not used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158204 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Revert commit 158073 while waiting for a fix. The issue is that reassociateDuncan Sands
can move instructions within the instruction list. If the instruction just happens to be the one the basic block iterator is pointing to, and it is moved to a different basic block, then we get into an infinite loop due to the iterator running off the end of the basic block (for some reason this doesn't fire any assertions). Original commit message: Grab-bag of reassociate tweaks. Unify handling of dead instructions and instructions to reoptimize. Exploit this to more systematically eliminate dead instructions (this isn't very useful in practice but is convenient for analysing some testcase I am working on). No need for WeakVH any more: use an AssertingVH instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158199 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08cmake: Pass the -m32 flag to modules if LLVM_BUILD_32_BITS is enabledTobias Grosser
This was previously only done for executables and shared libraries, but not for modules. As modules are essentially shared libraries (that need to be dlopened explicitly), threating them the same as shared libraries seems reasonable. This fixes the LLVM_BUILD_32_BITS build of Polly. Contributed by: Ondra Hosek <ondra.hosek@gmail.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158195 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08Teach the AsmMatcherEmitter to allow InstAlias' where the suboperands of a ↵Owen Anderson
complex operand are called out explicitly in the asm string. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158183 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07[CMake] Promote extension warnings to errors.Michael J. Spencer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158176 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07X86: optimize generated code for integer ABSManman Ren
This patch will generate the following for integer ABS: movl %edi, %eax negl %eax cmovll %edi, %eax INSTEAD OF movl %edi, %ecx sarl $31, %ecx leal (%rdi,%rcx), %eax xorl %ecx, %eax There exists a target-independent DAG combine for integer ABS, which converts integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. This is implemented in PerformXorCombine. rdar://10695237 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158175 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07[CMake] Order MSVC warnings numerically.Michael J. Spencer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158171 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07[CMake] Adjust MSVC warnings.Michael J. Spencer
Remove /Wall from LLVM_ENABLE_WARNINGS (it's useless) and promote 4239 to a level 1 warning. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158170 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07Do not optimize the used bits of the x86 vselect condition operand, when the ↵Nadav Rotem
condition operand is a vector of 1-bit predicates. This may happen on MIC devices. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158168 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07Fix a bug in FoldSelectOpOp. Bitcast ops may change the number of vector ↵Nadav Rotem
elements, which may disagree with the select condition type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158166 91177308-0d34-0410-b5e6-96231b3b80d8