Age | Commit message (Collapse) | Author |
|
|
|
Avoid infinite looping in AllocaManager::computeInterBlockLiveness,
without using new temporary variables. This also makes it clear that
the algorithm is conservative in the face of conflicting liveness
intrinsics.
|
|
This allows them to be run without specifying any command-line options,
which is convenient.
|
|
|
|
|
|
Also add a test for handling of global aliases.
|
|
|
|
We never reference these intrinsics by name, so we don't need redirects
for them.
|
|
|
|
|
|
Also, fix the code for supressing declarations for no-op intrinsics.
|
|
Make inline asm a report_fatal_error instead of an assertion failure so
that it's a little friendlier, and add a test to make sure llc -march=js
rejects inline asm.
Also disable the PNaCl inline asm("":::"memory") lowering pass. If people
are using this, it's best that we diagnose it as it likely isn't portable.
There are usually better alternatives.
|
|
This fixes a regression introduced in d95cd364f0c049d6c1b8d78746d44c00ed2f69f2;
when regular expression translation looks through bitcasts but phi translation
doesn't, phi translation may fail to properly detect dependencies.
|
|
The AllocaManager performs stack layout for alloca objects. It is capable
of analyzing alloca liveness and sharing space between multiple allocas.
|
|
|
|
some places.
|
|
|
|
|
|
|
|
See:
http://llvm.org/viewvc/llvm-project?view=revision&revision=187787
The newer version of newlib tickles this x86-32 bug
when building the exception handling tests, which don't
strip the "tail" attribute.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3702
Waiting on trybots, but it seems to have fixed the minimal
reproducer I have:
http://chromegw.corp.google.com/i/tryserver.nacl/builders/nacl-toolchain-linux-pnacl-x86_64/builds/922
http://chromegw.corp.google.com/i/tryserver.nacl/builders/nacl-toolchain-linux-pnacl-x86_32/builds/870
http://chromegw.corp.google.com/i/tryserver.nacl/builders/nacl-toolchain-mac-pnacl-x86_32/builds/875
R=jfb@chromium.org
Review URL: https://codereview.chromium.org/26538008
|
|
Cherry-pick r191978 from upstream.
Original commit message:
Author: Akira Hatanaka <ahatanaka@mips.com>
Date: Fri Oct 4 20:51:40 2013 +0000
[mips] Fix a bug in MipsLongBranch::replaceBranch, which was erasing
instructions in delay slots along with the original branch instructions
This has to be cherrypicked, as it is a bug in backend. It was exposed in
a long function inside of llc, which caused llc.nexe to work incorrectly.
TBR= mseaborn@chromium.org, dschuff@chromium.org
BUG= bug in MIPS backend
Review URL: https://codereview.chromium.org/26933005
|
|
Cherry-pick r187244 from upstream.
Original commit message:
Author: Akira Hatanaka <ahatanaka@mips.com>
Date: Fri Jul 26 20:58:55 2013 +0000
[mips] Implement llvm.trap intrinsic.
Patch by Sasa Stankovic.
This has to be cherrypicked, as two tests fail due to missing llvm.trap
intrinsic. The tests are:
- run_sysbrk_test
- run_abi_types_test
TBR= mseaborn@chromium.org, dschuff@chromium.org
BUG= sysbrk and abi_types tests fail for MIPS
Review URL: https://codereview.chromium.org/26953003
|
|
Cherry-pick r185186 from upstream.
Original commit message:
Author: Lang Hames <lhames@gmail.com>
Date: Fri Jun 28 18:36:42 2013 +0000
Add missing case to switch statement - DAGTypeLegalizer::ExpandIntegerResult
should expand ATOMIC_CMP_SWAP nodes the same way that it does for ATOMIC_SWAP.
Since ATOMIC_LOADs on some targets (e.g. older ARM variants) get legalized to
ATOMIC_CMP_SWAPs, the missing case had been causing i64 atomic loads to crash
during isel.
This has to be cherry-picked, as we have experienced the same bug described
in the original message. Missing case caused MIPS 64 atomics to crash.
TBR= mseaborn@chromium.org, dschuff@chromium.org
BUG= crash for MIPS atomics
Review URL: https://codereview.chromium.org/26958002
|
|
Cherry-pick r182306 from upstream.
Original commit message:
Author: Akira Hatanaka <ahatanaka@mips.com>
Date: Mon May 20 18:07:43 2013 +0000
[mips] Trap on integer division by zero.
By default, a teq instruction is inserted after integer divide. No divide-by-zero
checks are performed if option "-mnocheck-zero-division" is used.
TBR= mseaborn@chromium.org, dschuff@chromium.org
BUG= missing trap for MIPS
Review URL: https://codereview.chromium.org/26846007
|
|
GCC bug http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58416
causes the bf64-1.c test to fail in the GCC torture test suite.
This provides an upstreamable workaround.
Inspection of the LLVM code base showed no other instances of
the pattern that triggers the gcc bug.
This can also be upstreamed as soon as I can get a working
x86-32 upstream build working to verify/test against. In the
meantime, we can make one pnacl-fyi bot go green again.
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3685
R=jfb@chromium.org, jfb@google.com
Review URL: https://codereview.chromium.org/23437037
|
|
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3588
R=jvoung@chromium.org
Review URL: https://codereview.chromium.org/19606003
|
|
Prevent sandbox addresses from being written to the stack. This
covers the following cases:
1. Function calls manually push a masked return address and jump to
the target, rather than using the call instruction.
2. When the function prolog chooses to use a frame pointer (rbp), it
saves a masked version of the old rbp.
3. Indirect branches (jumps, calls, and returns) uniformly use r11 to
construct the 64-bit target address.
4. Register r11 is marked as reserved (similar to r15) so that the
register allocator won't inadvertently spill a code address to the
stack.
These transformations can be disabled for performance testing with the
flag "-sfi-hide-sandbox-base=false".
BUG= https://code.google.com/p/nativeclient/issues/detail?id=1235
R=eliben@chromium.org, mseaborn@chromium.org
Review URL: https://codereview.chromium.org/19505003
|
|
BUG= test the fix that was already cherrypicked
TEST= self
R=eliben@chromium.org
Review URL: https://codereview.chromium.org/19704008
|
|
Specifically:
r186489 - Fix ARMFastISel::ARMEmitIntExt shift emission
r183794 - ARM FastISel fix sext/zext fold
r183601 - Fix unused variable warning from my previous patch
r183551 - ARM FastISel integer sext/zext improvements
These should fix some failures that I had run into back then, as well as make ARM FastISel faster because it doesn't go to SelectionDAG.
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3501
R=jvoung@chromium.org
TEST= make check-all
Review URL: https://codereview.chromium.org/19992002
|
|
|
|
This applies only to %r15 sandboxed memory references. The problem
is that if the index register is negative, the sandboxing operation
will cause the index to become a large positive 32-bit value, which
combined with the displacement, will overflow and try to reference
memory outside the sandbox. This situation may legitimately occur
if the compiler happens to construct a (constant) interior pointer
to the middle of the global struct/array, and then dereferences it
with a variable offset.
After this fix, pnacl/scripts/testsuite_known_failures_pnacl.txt can
be updated to remove the "aha x86-64" known failure.
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3517
R=eliben@chromium.org
Review URL: https://codereview.chromium.org/17987002
|
|
Conflicts:
docs/LangRef.rst
include/llvm/CodeGen/CallingConvLower.h
include/llvm/IRReader/IRReader.h
include/llvm/Target/TargetMachine.h
lib/CodeGen/CallingConvLower.cpp
lib/IRReader/IRReader.cpp
lib/IRReader/LLVMBuild.txt
lib/IRReader/Makefile
lib/LLVMBuild.txt
lib/Makefile
lib/Support/MemoryBuffer.cpp
lib/Support/Unix/PathV2.inc
lib/Target/ARM/ARMBaseInstrInfo.cpp
lib/Target/ARM/ARMISelLowering.cpp
lib/Target/ARM/ARMInstrInfo.td
lib/Target/ARM/ARMSubtarget.cpp
lib/Target/ARM/ARMTargetMachine.cpp
lib/Target/Mips/CMakeLists.txt
lib/Target/Mips/MipsDelaySlotFiller.cpp
lib/Target/Mips/MipsISelLowering.cpp
lib/Target/Mips/MipsInstrInfo.td
lib/Target/Mips/MipsSubtarget.cpp
lib/Target/Mips/MipsSubtarget.h
lib/Target/X86/X86FastISel.cpp
lib/Target/X86/X86ISelDAGToDAG.cpp
lib/Target/X86/X86ISelLowering.cpp
lib/Target/X86/X86InstrControl.td
lib/Target/X86/X86InstrFormats.td
lib/Transforms/IPO/ExtractGV.cpp
lib/Transforms/InstCombine/InstCombineCompares.cpp
lib/Transforms/Utils/SimplifyLibCalls.cpp
test/CodeGen/X86/fast-isel-divrem.ll
test/MC/ARM/data-in-code.ll
tools/Makefile
tools/llvm-extract/llvm-extract.cpp
tools/llvm-link/CMakeLists.txt
tools/opt/CMakeLists.txt
tools/opt/LLVMBuild.txt
tools/opt/Makefile
tools/opt/opt.cpp
|
|
Patch by: Vincent Lejeune
https://bugs.freedesktop.org/show_bug.cgi?id=64877
NOTE: This is a candidate for the 3.3 branch.
Merged from r182600
Author: Tom Stellard <thomas.stellard@amd.com>
Date: Thu May 23 18:26:42 2013 +0000
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@185868 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Revert this until we fix i1 sext. Currently, it uses LSL and ASR,
which are pseudo-instructions and get dropped on the floor when
generating .o files. We'll fix that, but for now revert to green
the bots.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3501
R=jfb@chromium.org
Review URL: https://codereview.chromium.org/17715002
|
|
This change will fix the regression in -O0 translation time caused by
putting "align 1" on integer loads and stores, which was causing
FastISel to fall back to SelectionDAG more often.
> In X86FastISel::X86SelectStore(), improperly aligned stores are
> rejected and handled by the DAG-based ISel. However,
> X86FastISel::X86SelectLoad() makes no such requirement. There
> doesn't appear to be an x86 architectural correctness issue with
> allowing potentially unaligned store instructions. This patch
> removes this restriction.
>
> Patch by Jim Stichnot.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3445
TEST=PNaCl toolchain trybots
Review URL: https://codereview.chromium.org/17575003
|
|
if the function calls _builtin_unwind_init()
Also fix the list of callee-saved registers returned by
X86RegisterInfo::getCalleeSavedRegisters
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3486
R=mseaborn@chromium.org
Review URL: https://codereview.chromium.org/16987002
|
|
Rename countTrailingZeros to the older CountTrailingZeros_32, mark as localmod.
These patches fix correctness issues with ARM FastISel, and should make it faster while generating better code.
BUG= none
TEST= self
R=jvoung@chromium.org
Review URL: https://codereview.chromium.org/16712002
|
|
This also pulls in a TargetMachine.h change from r176986 and changes
NaCl's intrinsics-bitmanip.ll test to account for register spills at O0.
FastISel was only enabled for iOS ARM and Thumb2, this patch enables it
for ARM (not Thumb2) on Linux and NaCl.
Thumb2 support needs a bit more work, mainly around register class
restrictions.
The patch punts to SelectionDAG when doing TLS relocation on non-Darwin
targets. I will fix this and other FastISel-to-SelectionDAG failures in
a separate patch.
The patch also forces FastISel to retain frame pointers: iOS always
keeps them for backtracking (so emitted code won't change because of
this), but Linux was getting much worse code that was incorrect when
using big frames (such as test-suite's lencod). I'll also fix this in a
later patch, it will probably require a peephole so that FastISel
doesn't rematerialize frame pointers back-to-back.
The test changes are straightforward, similar to:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html
They also add a vararg test that got dropped in that change.
I ran all of test-suite on A15 hardware with --optimize-option=-O0 and
all the tests pass.
R=dschuff@chromium.org, jvoung@chromium.org
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3120
Review URL: https://codereview.chromium.org/15671004
|
|
enabling ARM FastISel
ARM FastISel is currently only enabled for iOS non-Thumb1, and I'm working on
enabling it for other targets. As a first step I've fixed some of the tests.
Changes to ARM FastISel tests:
- Different triples don't generate the same relocations (especially
movw/movt versus constant pool loads). Use a regex to allow either.
- Mangling is different. Use a regex to allow either.
- The reserved registers are sometimes different, so registers get
allocated in a different order. Capture the names only where this
occurs.
- Add -verify-machineinstrs to some tests where it works. It doesn't
work everywhere it should yet.
- Add -fast-isel-abort to many tests that didn't have it before.
- Split out the VarArg test from fast-isel-call.ll into its own
test. This simplifies test setup because of --check-prefix.
R=dschuff@chromium.org
Review URL: https://codereview.chromium.org/15737029
|
|
This patch matches GCC behavior: the code used to only allow unaligned
load/store on ARM for v6+ Darwin, it will now allow unaligned load/store
for v6+ Darwin as well as for v7+ on Linux and NaCl.
The distinction is made because v6 doesn't guarantee support (but LLVM
assumes that Apple controls hardware+kernel and therefore have
conformant v6 CPUs), whereas v7 does provide this guarantee (and
Linux/NaCl behave sanely).
The patch keeps the -arm-strict-align command line option, and adds
-arm-no-strict-align. They behave similarly to GCC's -mstrict-align and
-mnostrict-align.
I originally encountered this discrepancy in FastIsel tests which expect
unaligned load/store generation. Overall this should slightly improve
performance in most cases because of reduced I$ pressure.
R=dschuff@chromium.org
Review URL: https://codereview.chromium.org/15677005
|
|
------------------------------------------------------------------------
r182394 | jholewinski | 2013-05-21 09:51:30 -0700 (Tue, 21 May 2013) | 1 line
[NVPTX] Add @llvm.nvvm.sqrt.f() intrinsic
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@182829 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r182298 | jholewinski | 2013-05-20 09:42:18 -0700 (Mon, 20 May 2013) | 1 line
[NVPTX] Fix mis-use of CurrentFnSym in NVPTXAsmPrinter. This was causing a symbol name error in the output PTX.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@182828 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r182254 | jholewinski | 2013-05-20 05:13:32 -0700 (Mon, 20 May 2013) | 12 lines
[NVPTX] Add GenericToNVVM IR converter to better handle idiomatic LLVM IR inputs
This converter currently only handles global variables in address space 0. For
these variables, they are promoted to address space 1 (global memory), and all
uses are updated to point to the result of a cvta.global instruction on the new
variable.
The motivation for this is address space 0 global variables are illegal since we
cannot declare variables in the generic address space. Instead, we place the
variables in address space 1 and explicitly convert the pointer to address
space 0. This is primarily intended to help new users who expect to be able to
place global variables in the default address space.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@182826 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r182253 | jholewinski | 2013-05-20 05:13:28 -0700 (Mon, 20 May 2013) | 1 line
[NVPTX] Fix i1 kernel parameters and global variables. ABI rules say we need to use .u8 for i1 parameters for kernels.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@182825 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r182486 | d0k | 2013-05-22 10:01:12 -0700 (Wed, 22 May 2013) | 3 lines
X86: When expanding PCMPGTQ to PCMPGTD we always want to compare the lower halves as unsigned.
Take #2 on fixing PR15977.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@182489 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r182387 | jholewinski | 2013-05-21 07:37:16 -0700 (Tue, 21 May 2013) | 7 lines
Drop @llvm.annotation and @llvm.ptr.annotation intrinsics during codegen.
The intrinsic calls are dropped, but the annotated value is propagated.
Fixes PR 15253
Original patch by Zeng Bin!
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@182417 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r182364 | d0k | 2013-05-21 02:58:54 -0700 (Tue, 21 May 2013) | 4 lines
X86: When emulating unsigned PCMPGTQ with PCMPGTD, fix the sign bit for the smaller type.
Otherwise we'll get a mix of signed and unsigned compares.
Fixes PR15977.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@182413 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r182113 | tstellar | 2013-05-17 08:23:21 -0700 (Fri, 17 May 2013) | 9 lines
R600: Fix encoding for R600 family GPUs
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
https://bugs.freedesktop.org/show_bug.cgi?id=64193
https://bugs.freedesktop.org/show_bug.cgi?id=64257
https://bugs.freedesktop.org/show_bug.cgi?id=64320
NOTE: This is a candidate for the 3.3 branch.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@182174 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
There was an old fix for r+r based memory references on
x86-64 that checked for isTargetNaCl() instead of
isTargetNaCl64(). This disabled some r+r for 32-bit.
However, fast isel only sets up r+r with geps, and we don't
have geps in the stable ABI. We could potentially add
some similar pattern matching in the future...
The problem we *do* see with the current bitcode, is that
this change also made it preferred to use an index register
instead of a base register. This made the memory references
on x86-32 look like:
cmpl ..., (,%eax,1)
instead of
cmpl ..., (%eax)
So we had longer instructions.
Total zipped nexe sizes: 5.73MB (old) vs 5.59 MB (new) (2.5%)
Total not zipped: 17.28MB vs 16.28 MB (6%)
runtime diffs (min of 5 runs)
* eon 4.94 (old) vs 4.72 (new) (~4%)
* mesa 21.64 vs 21.08
* mcf 5.76 vs 5.60
* vortex 4.21 vs 4.05
* perlbmk 27.62 vs 26.55
(the rest were under 2% better)
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3359
R=stichnot@chromium.org
Review URL: https://codereview.chromium.org/15047013
|
|
stack slots (PR15707)
IR optimisation passes can result in a basic block that contains:
llvm.lifetime.start(%buf)
...
llvm.lifetime.end(%buf)
...
llvm.lifetime.start(%buf)
Before this change, calculateLiveIntervals() was ignoring the second
lifetime.start() and was regarding %buf as being dead from the
lifetime.end() through to the end of the basic block. This can cause
StackColoring to incorrectly merge %buf with another stack slot.
Fix by removing the incorrect Starts[pos].isValid() and
Finishes[pos].isValid() checks.
Just doing:
Starts[pos] = Indexes->getMBBStartIdx(MBB);
Finishes[pos] = Indexes->getMBBEndIdx(MBB);
unconditionally would be enough to fix the bug, but it causes some
test failures due to stack slots not being merged when they were
before. So, in order to keep the existing tests passing, treat LiveIn
and LiveOut separately rather than approximating the live ranges by
merging LiveIn and LiveOut.
This fixes PR15707.
Patch by Mark Seaborn.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3374
Review URL: https://codereview.chromium.org/15302009
|