Age | Commit message (Collapse) | Author |
|
Although PNaCl doesn't fully support struct types as varargs
arguments, there is a test in Spec2k (255.vortex) that passes a struct
as a varargs argument but never reads the argument using va_arg (which
is legal, but strange).
ExpandVarArgs was handling the struct argument by copying it with a
struct load+store. This is undesirable because currently SROA
converts that into code that uses extractvalue, which was rejected by
the PNaCl ABI checker. Struct load/store will soon be rejected by the
ABI checker too.
We could fix this by running ExpandStructRegs after ExpandVarArgs, but
it's cleaner for ExpandVarArgs to use memcpy() instead of struct
load+store. memcpy() is potentially more efficient because it avoids
having a temporary copy of the struct, and using memcpy() avoids
dependencies between IR passes.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3338
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3343
TEST=*.ll tests + built 255.vortex from Spec2k
Review URL: https://codereview.chromium.org/16232021
|
|
behavior.
Fixes rdar:14036816, PR16130.
There is an opportunity to compute precise trip counts for 'or'
expressions and multi-exit loops.
rdar:14038809: Optimize trip count computation for multi-exit loops.
To do this we need to record the fact that ExitLimit assumes NSW. When
it does not we can safely assume that the loop trip count is the
minimum ExitLimt across all subexpressions and loop exits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183060 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@183066 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
The constructor for getelementptr ConstantExprs does some constant
folding which can add 1 or 2 more indexes to the getelementptr. This
complicates checking for FlattenGlobals' normal form in the PNaCl ABI
checker.
Worse, the GCC torture tests turned up a pathological case where this
constant folding adds 4 indexes to the getelementptr and leaves the
original struct type behind:
@q = global i8* getelementptr inbounds (
%union.u* bitcast ([260 x i8]* @v to %union.u*),
i32 0, i32 0, i32 0, i32 0, i32 4)
That comes from the following code in
gcc/testsuite/gcc.c-torture/execute/pr43784.c:
struct s {
unsigned char a[256];
};
union u {
struct { struct s b; int c; } d;
struct { int c; struct s b; } e;
};
static union u v;
static struct s *q = &v.e.b;
We can fix this by using ptrtoint+add instead of
getelementptr+bitcast, because ConstantExpr won't automatically
convert ptrtoint to something else.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3113
TEST=*.ll tests + PNaCl toolchain trybots
Review URL: https://codereview.chromium.org/15647009
|
|
These markers work in a similar way to llvm.lifetime.start/end, so we
should remove them for similar reasons: it's not very well defined how
one marker cancels out the effects of the other.
Arguably, invariant.start/end are less useful than lifetime.start/end.
They are ignored by the backend. They are generated in fewer places:
invariant.start is generated by Clang (at -O1 or higher) when a const
global is initialised with a non-POD initialiser. invariant.end is
apparently not generated at all.
Do the stripping in ReplacePtrsWithInts for consistency with the
existing lifetime.start/end stripping.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3443
TEST=PNaCl toolchain trybots
Review URL: https://codereview.chromium.org/15995004
|
|
Clang's implementation of C++ method pointers generates IR that uses
LLVM registers with struct type -- specifically, loads and stores of
struct values, and extractvalue instructions. See
lib/CodeGen/ItaniumCXXABI.cpp in Clang. Add a pass, ExpandStructRegs,
which expands out those uses.
Factor out a function from ExpandArithWithOverflow so that the two
passes can share some code.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3343
TEST=*.ll tests + trybots + GCC torture tests
Review URL: https://codereview.chromium.org/15692014
|
|
Odd-sized switch statements can appear in the sandboxed translator build.
R=mseaborn@chromium.org
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3360
Review URL: https://codereview.chromium.org/15894006
|
|
This widens i1, i8 and i16 function arguments and return types.
Factor out RecreateFunction() helper function from existing PNaCl
passes since this is a reoccurring code fragment.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3342
TEST=*.ll tests + PNaCl toolchain trybots + GCC torture tests + LLVM test suite
Review URL: https://codereview.chromium.org/15971007
|
|
ConvertConstant now returns an undef constant of the appropriate type.
This fixes the translator build failure caused by enabling the pass.
R=mseaborn@chromium.org
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3360
Review URL: https://codereview.chromium.org/16086005
|
|
Change the PNaCl ABI checker to disallow these intrinsics.
Note that I had originally intended to commit this before my earlier
change (https://codereview.chromium.org/15688011) that enables the
ExpandArithWithOverflow pass. Enabling ExpandArithWithOverflow
without changing InstCombine causes ExpandArithWithOverflow to fail on
some of the *.with.overflow intrinsic calls that InstCombine
introduces. This change therefore fixes some breakage.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3434
TEST=PNaCl toolchain trybots + GCC torture tests + LLVM test suite
Review URL: https://codereview.chromium.org/16042011
|
|
Also fix the diagnostic asserts in getPromotedType.
R=mseaborn@chromium.org
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3360
Review URL: https://codereview.chromium.org/16004003
|
|
It turned out that umul.with.overflow wasn't the only *.with.overflow
intrinsic usage introduced by Clang.
I knew that Clang's CGExprCXX.cpp generates umul.with.overflow for an
overflow check for C++'s "new Foo[]". The same code for handling "new
Foo[]" also generates uadd.with.overflow in some cases. This happens
if class Foo has a destructor or a delete[] operator that takes a size
argument. In those cases, the C++ ABI adds a "cookie" to the
allocation which contains the array's size.
Rename the pass to "ExpandArithWithOverflow" and rename files
accordingly.
Also enable the pass.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3434
TEST=*.ll tests + trybots + GCC torture tests
Review URL: https://codereview.chromium.org/15688011
|
|
------------------------------------------------------------------------
r182656 | d0k | 2013-05-24 11:05:35 -0700 (Fri, 24 May 2013) | 3 lines
LoopVectorize: LoopSimplify can't canonicalize loops with an indirectbr in it, don't assert on those cases.
Fixes PR16139.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@182785 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
to external function calls during the translation stage (llc).
One of the passes is a ModulePass that adds the appropriate function
declarations to the module. The other is a FunctionPass that performs the
actual call replacement. This split exists because of bitcode streaming.
Initially the passes handle the llvm.nacl.{set|long}jmp intrinsics. In the
future they may handle additional intrinsics that are part of the PNaCl
stable bitcode ABI.
This CL also removes the previous approach to handling this conversion
(in SelectionDAGBuilder.cpp). That ended up not working - more details in
issue 3429.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3429
R=mseaborn@chromium.org
Review URL: https://codereview.chromium.org/16047002
|
|
These meta-passes will be used to replace the pass lists that are
currently in the pnacl-ld.py driver script in the NaCl repo.
I've moved the comments across from pnacl-ld.py and added a couple
more comments for ExpandByVal and StripMetadata.
Fix the declaration of createResolveAliasesPass().
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3435
TEST=new *.ll tests + tested with change to pnacl-ld.py
Review URL: https://codereview.chromium.org/15669002
|
|
This adds a pass, ExpandMulWithOverflow, to expand out the
llvm.umul.with.overflow calls that Clang generates to implement an
overflow check for C++'s new[] operator.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3434
TEST=expand-mul-with-overflow.ll
Review URL: https://codereview.chromium.org/14649027
|
|
Running the LLVM test suite with the ReplacePtrsWithInts pass enabled
produced a single failure (in MultiSource/Applications/SPASS),
revealing a corner case in which a mixture of forward and backward
references plus a bitcast causes the pass to fail (see
@forwards_reference() in the test).
The problem was that we were doing replaceAllUsesWith() on a
placeholder value too early. RewriteMap was mapping a bitcast to a
placeholder P, but RewriteMap's reference to P didn't get updated by
P->replaceAllUsesWith() and P became a dangling pointer.
The fix is:
* Change convert() to strip off casts first, so that RewriteMap isn't
used for mapping casts to converted values.
* Defer the replaceAllUsesWith() calls until after creating all the
replacement instructions. This makes the pass more robust against
instruction ordering in the input.
This requires debug instrinsics to be updated in a separate pass,
because replaceAllUsesWith() doesn't work for references by
metadata nodes.
This also fixes some pathological corner cases of cyclic references in
unreachable blocks.
Fix indentation in one place.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3343
TEST=replace-ptrs-with-ints.ll + LLVM test suite
Review URL: https://codereview.chromium.org/15761003
|
|
ReplacePtrsWithInts converts IR to a normal form in which functions
don't reference any aggregate pointer types and pointer types only
appear inside a few instructions.
Change BlockAddress::replaceUsesOfWithOnConstant() to handle changing
a function's type by replacing a function with a bitcast ConstantExpr
of a new function.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3343
TEST=replace-ptrs-with-ints.ll + PNaCl toolchain trybots, torture tests, etc.
Review URL: https://codereview.chromium.org/14262011
|
|
------------------------------------------------------------------------
r182485 | arnolds | 2013-05-22 09:54:56 -0700 (Wed, 22 May 2013) | 7 lines
LoopVectorize: Make Value pointers that could be RAUW'ed a VH
The Value pointers we store in the induction variable list can be RAUW'ed by a
call to SCEVExpander::expandCodeFor, use a TrackingVH instead. Do the same thing
in some other places where we store pointers that could potentially be RAUW'ed.
Fixes PR16073.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@182492 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
While writing a test, I noticed that ExpandCtors crashes if given the
empty list "[]", because this gets converted into an UndefValue
ConstantExpr by the LLVM assembly reader.
Fix this by checking the array's size via its type. This replaces the
isNullValue() check.
Make error handling cleaner by splitting out a separate function.
BUG=none
TEST=test/Transforms/NaCl/expand-ctors-emptylist.ll
Review URL: https://codereview.chromium.org/15659005
|
|
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3429
R=mseaborn@chromium.org
Review URL: https://codereview.chromium.org/15470008
|
|
If a global variable has no "align" attribute, it must be aligned
based on its type.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3437
TEST=flatten-globals.ll
Review URL: https://codereview.chromium.org/15359006
|
|
BUG=None
Review URL: https://codereview.chromium.org/15100013
|
|
------------------------------------------------------------------------
r181524 | rafael | 2013-05-09 10:22:59 -0700 (Thu, 09 May 2013) | 4 lines
Don't replace an alias in llvm.used with its target.
When we replace an internal alias with its target, be careful not to
replace the entry in llvm.used (and llvm.compiler_used).
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@181909 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
LLVM regression suite)
BUG=None
R=mseaborn@chromium.org
Review URL: https://codereview.chromium.org/15042009
|
|
------------------------------------------------------------------------
r181586 | d0k | 2013-05-10 02:16:52 -0700 (Fri, 10 May 2013) | 3 lines
InstCombine: Verify the type before transforming uitofp into select.
PR15952.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@181813 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
with stable bitcode intrinsics.
Starting with setjmp and longjmp.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3429
R=jvoung@chromium.org, mseaborn@chromium.org
Review URL: https://codereview.chromium.org/14617017
|
|
This pass (mostly) legalizes integer types by promoting them.
It has some limitations (e.g. it can't change function types)
but it is more than sufficient for what clang and SROA generate.
A more significant limitation of promotion is that packed
bitfields of size > 64 bits are still not handled. There are
none in our tests (other than callingconv_case_by_case which
doesn't require a stable pexe) but we will want to either
handle them by correctly expanding them, or find a better way
to error out.
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3360
R=eliben@chromium.org, jvoung@chromium.org
Review URL: https://codereview.chromium.org/14569012
|
|
This could help prevent these expansion passes from inhibiting
optimisations than run after the expansion. e.g. It gives the
optimiser more freedom to move around reads from the varargs buffer
because they will not alias writes to other locations.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=3400
TEST=PNaCl toolchain trybots + GCC torture tests + LLVM test suite + Spec2k
Review URL: https://codereview.chromium.org/14060026
|
|
------------------------------------------------------------------------
r181286 | arnolds | 2013-05-06 21:37:05 -0700 (Mon, 06 May 2013) | 7 lines
LoopVectorize: getConsecutiveVector must respect signed arithmetic
We were passing an i32 to ConstantInt::get where an i64 was needed and we must
also pass the sign if we pass negatives numbers. The start index passed to
getConsecutiveVector must also be signed.
Should fix PR15882.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_33@181455 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181249 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181234 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Test case by Michele Scandale!
Fixes PR10293: Load not hoisted out of loop with multiple exits.
There are few regressions with this patch, now tracked by
rdar:13817079, and a roughly equal number of improvements. The
regressions are almost certainly back luck because LoopRotate has very
little idea of whether rotation is profitable. Doing better requires a
more comprehensive solution.
This checkin is a quick fix that lacks generality (PR10293 has
a counter-example). But it trivially fixes the case in PR10293 without
interfering with other cases, and it does satify the criteria that
LoopRotate is a loop canonicalization pass that should avoid
heuristics and special cases.
I can think of two approaches that would probably be better in
the long run. Ultimately they may both make sense.
(1) LoopRotate should check that the current header would make a good
loop guard, and that the loop does not already has a sufficient
guard. The artifical SimplifiedLoopLatch check would be unnecessary,
and the design would be more general and canonical. Two difficulties:
- We need a strong guarantee that we won't endlessly rotate, so the
analysis would need to be precise in order to avoid the
SimplifiedLoopLatch precondition.
- Analysis like this are usually based on SCEV, which we don't want to
rely on.
(2) Rotate on-demand in late loop passes. This could even be done by
shoving the loop back on the queue after the optimization that needs
it. This could work well when we find LICM opportunities in
multi-branch loops. This requires some work, and it doesn't really
solve the problem of SCEV wanting a loop guard before the analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181230 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181219 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A
These come up in code that has been hand-optimized from a select to a linear blend,
on platforms where that may have mattered. We want to undo such changes
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181216 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Thanks Nick Lewycky for pointing this out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181177 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Use unknown results for places where it would be needed
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181176 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
We used to disable constant merging not only if a constant is llvm.used, but
also if an alias of a constant is llvm.used. This change fixes that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181175 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Add support for min/max reductions when "no-nans-float-math" is enabled. This
allows us to assume we have ordered floating point math and treat ordered and
unordered predicates equally.
radar://13723044
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181144 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
We can just use the initial element that feeds the reduction.
max(max(x, y), z) == max(max(x,y), max(x,z))
radar://13723044
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181141 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
values.
By supporting the vectorization of PHINodes with more than two incoming values we can increase the complexity of nested if statements.
We can now vectorize this loop:
int foo(int *A, int *B, int n) {
for (int i=0; i < n; i++) {
int x = 9;
if (A[i] > B[i]) {
if (A[i] > 19) {
x = 3;
} else if (B[i] < 4 ) {
x = 4;
} else {
x = 5;
}
}
A[i] = x;
}
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181037 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180935 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This tests a case where C1 and C2 were the same but X and Y were different
widths.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180907 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
is the cannonical form.
Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180875 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This reverts commit r180802
There's ongoing discussion about whether this is the right place to make
this transformation. Reverting for now while we figure it out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180834 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Always fold a shuffle-of-shuffle into a single shuffle when there's only one
input vector in the first place. Continue to be more conservative when there's
multiple inputs.
rdar://13402653
PR15866
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180802 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180796 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This fixes the optimization introduced in r179748 and reverted in r179750.
While the optimization was sound, it did not properly respect differences in
bit-width.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180777 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Saves a tiny bit of space for var-args heavy bitcode
programs, since the anonymous types get unique'ed. E.g.,
saves 20KB out of 1.6MB in spec gcc when comparing the
gzipped files, or about 100KB when not zipped. This is only
a savings with the current wire format. If we change the
alloca, etc. to only have sizes and not struct types
then we would also not have this duplication.
Just happened to notice while looking through code for
what struct types remain used in the bitcode.
random cleanup for:
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3338
R=mseaborn@chromium.org
Review URL: https://codereview.chromium.org/14197004
|
|
This resurrects r179957, but adds code that makes sure we don't touch
atomic/volatile stores:
This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:
a[i] =
may-alias with a[i] load
if (cond)
a[i] = Y
into an unconditional store.
a[i] = X
may-alias with a[i] load
tmp = cond ? Y : X;
a[i] = tmp
We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case where the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.
hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180731 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
ObjCARCContract instead of ObjCARCOpts.
Turning retains into retainRV calls disrupts the data flow analysis in
ObjCARCOpts. Thus we move it as late as we can by moving it into
ObjCARCContract.
We leave in the conversion from retainRV -> retain in ObjCARCOpt since
it enables the dataflow analysis.
rdar://10813093
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180698 91177308-0d34-0410-b5e6-96231b3b80d8
|