| Age | Commit message (Collapse) | Author |
|
$ svn merge -c 113576 https://llvm.org/svn/llvm-project/llvm/trunk
--- Merging r113576 into '.':
U test/CodeGen/ARM/2007-01-19-InfiniteLoop.ll
U lib/Target/ARM/ARMLoadStoreOptimizer.cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_28@113583 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Bruno, please review.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113014 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
"For ARM stack frames that utilize variable sized objects and have either
large local stack areas or require dynamic stack realignment, allocate a
base register via which to access the local frame. This allows efficient
access to frame indices not accessible via the FP (either due to being out
of range or due to dynamic realignment) or the SP (due to variable sized
object allocation). In particular, this greatly improves efficiency of access
to spill slots in Thumb functions which contain VLAs."
r112986 fixed a latent bug exposed by the above.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112989 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
either", it is breaking oggenc with Clang for ARMv6.
This reverts commit 8d6e29cfda270be483abf638850311670829ee65.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112962 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112947 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
The AVX versions of PALIGN and PABS* should only exist for
128-bit. Remove the unnecessary stuff.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112944 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
vabd intrinsic and add and/or zext operations. In the case of vaba, this
also avoids the need for a DAG combine pattern to combine vabd with add.
Update tests. Auto-upgrade the old intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112941 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Patch by Cameron Esfahani!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112902 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
large local stack areas or require dynamic stack realignment, allocate a
base register via which to access the local frame. This allows efficient
access to frame indices not accessible via the FP (either due to being out
of range or due to dynamic realignment) or the SP (due to variable sized
object allocation). In particular, this greatly improves efficiency of access
to spill slots in Thumb functions which contain VLAs.
rdar://7352504
rdar://8374540
rdar://8355680
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112883 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
there are clearly no stores between the load and the store. This fixes
this miscompile reported as PR7833.
This breaks the test/CodeGen/X86/narrow_op-2.ll optimization, which is
safe, but awkward to prove safe. Move it to X86's README.txt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112861 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112853 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
ARM register class allocation order functions to take advantage of that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112841 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
after regalloc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112825 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112802 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Win32 codegen emits implicit invoking __main into, to fail.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112801 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
add, and subtract operations with zero-extended or sign-extended vectors.
Update tests. Add auto-upgrade support for the old intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112773 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
check more strict, breaking some cases not checked in the
testsuite, but also exposes some foldings not done before,
as this example:
movaps (%rdi), %xmm0
movaps (%rax), %xmm1
movaps %xmm0, %xmm2
movss %xmm1, %xmm2
shufps $36, %xmm2, %xmm0
now is generated as:
movaps (%rdi), %xmm0
movaps %xmm0, %xmm1
movlps (%rax), %xmm1
shufps $36, %xmm1, %xmm0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112753 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This caused a miscompilation in WebKit where %RAX had conflicting defs when
RemoveCopyByCommutingDef was commuting a %EAX use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112751 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
the testcases should be merged.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112711 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
by 112440 are resolved.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112692 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
int x(int t) {
if (t & 256)
return -26;
return 0;
}
We generate this:
tst.w r0, #256
mvn r0, #25
it eq
moveq r0, #0
while gcc generates this:
ands r0, r0, #256
it ne
mvnne r0, #25
bx lr
Scandalous really!
During ISel time, we can look for this particular pattern. One where we have a
"MOVCC" that uses the flag off of a CMPZ that itself is comparing an AND
instruction to 0. Something like this (greatly simplified):
%r0 = ISD::AND ...
ARMISD::CMPZ %r0, 0 @ sets [CPSR]
%r0 = ARMISD::MOVCC 0, -26 @ reads [CPSR]
All we have to do is convert the "ISD::AND" into an "ARM::ANDS" that sets [CPSR]
when it's zero. The zero value will all ready be in the %r0 register and we only
need to change it if the AND wasn't zero. Easy!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112664 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112610 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112555 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Auto-upgrade the old intrinsic and update tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112507 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
1) nuke ConstDataCoalSection, which is dead.
2) revise my previous patch for rdar://8018335,
which was completely wrong. Specifically, it doesn't
make sense to mark __TEXT,__const_coal as PURE_INSTRUCTIONS,
because it is for readonly data. templates (it turns out)
go to const_coal_nt. The real fix for rdar://8018335 was
to give ConstTextCoalSection a section kind of ReadOnly
instead of Text.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112496 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112469 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This has the side effect of reversing the order of most of
IVUser's results.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112442 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112426 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
The IDX was treated as byte index, not element index.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112422 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
IR add/sub operations with one or both operands sign- or zero-extended.
Auto-upgrade the old intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112416 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112398 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112396 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
insertp[sd] $0, which is a noop. Before:
_f32: ## @f32
pshufd $1, %xmm1, %xmm2
pshufd $1, %xmm0, %xmm3
addss %xmm2, %xmm3
addss %xmm1, %xmm0
## kill: XMM0<def> XMM0<kill> XMM0<def>
insertps $0, %xmm0, %xmm0
insertps $16, %xmm3, %xmm0
ret
after:
_f32: ## @f32
movdqa %xmm0, %xmm2
addss %xmm1, %xmm2
pshufd $1, %xmm1, %xmm1
pshufd $1, %xmm0, %xmm3
addss %xmm1, %xmm3
movdqa %xmm2, %xmm0
insertps $16, %xmm3, %xmm0
ret
The extra movs are due to a random (poor) scheduling decision.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112379 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
when the top elements of a vector are undefined. This happens all
the time for X86-64 ABI stuff because only the low 2 elements of
a 4 element vector are defined. For example, on:
_Complex float f32(_Complex float A, _Complex float B) {
return A+B;
}
We used to produce (with SSE2, SSE4.1+ uses insertps):
_f32: ## @f32
movdqa %xmm0, %xmm2
addss %xmm1, %xmm2
pshufd $16, %xmm2, %xmm2
pshufd $1, %xmm1, %xmm1
pshufd $1, %xmm0, %xmm0
addss %xmm1, %xmm0
pshufd $16, %xmm0, %xmm1
movdqa %xmm2, %xmm0
unpcklps %xmm1, %xmm0
ret
We now produce:
_f32: ## @f32
movdqa %xmm0, %xmm2
addss %xmm1, %xmm2
pshufd $1, %xmm1, %xmm1
pshufd $1, %xmm0, %xmm3
addss %xmm1, %xmm3
movaps %xmm2, %xmm0
unpcklps %xmm3, %xmm0
ret
This implements rdar://8368414
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112378 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
doesn't currently support dealing with this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112341 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
all the other LDM/STM instructions. This fixes asm printer crashes when
compiling with -O0. I've changed one of the NEON tests (vst3.ll) to run
with -O0 to check this in the future.
Prior to this change VLDM/VSTM used addressing mode #5, but not really.
The offset field was used to hold a count of the number of registers being
loaded or stored, and the AM5 opcode field was expanded to specify the IA
or DB mode, instead of the standard ADD/SUB specifier. Much of the backend
was not aware of these special cases. The crashes occured when rewriting
a frameindex caused the AM5 offset field to be changed so that it did not
have a valid submode. I don't know exactly what changed to expose this now.
Maybe we've never done much with -O0 and NEON. Regardless, there's no longer
any reason to keep a count of the VLDM/VSTM registers, so we can use
addressing mode #4 and clean things up in a lot of places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112322 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112280 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Update all the tests using those intrinsics and add support for
auto-upgrading bitcode files with the old versions of the intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112271 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
to find miscompiles with the integrated assembler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112250 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
lack sse2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112175 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
on random hosts, lets see!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112172 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112171 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112169 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
apparently try to support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112168 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112127 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
a VLD result was not used (Radar 8355607). It should also fix pr7988, but
I haven't verified that yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112118 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
builder. I will investigate tonight.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112113 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
expanding: e.g. <2 x float> -> <4 x float> instead of -> 2 floats. This
affects two places in the code: handling cross block values and handling
function return and arguments. Since vectors are already widened by
legalizetypes, this gives us much better code and unblocks x86-64 abi
and SPU abi work.
For example, this (which is a silly example of a cross-block value):
define <4 x float> @test2(<4 x float> %A) nounwind {
%B = shufflevector <4 x float> %A, <4 x float> undef, <2 x i32> <i32 0, i32 1>
%C = fadd <2 x float> %B, %B
br label %BB
BB:
%D = fadd <2 x float> %C, %C
%E = shufflevector <2 x float> %D, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
ret <4 x float> %E
}
Now compiles into:
_test2: ## @test2
## BB#0:
addps %xmm0, %xmm0
addps %xmm0, %xmm0
ret
previously it compiled into:
_test2: ## @test2
## BB#0:
addps %xmm0, %xmm0
pshufd $1, %xmm0, %xmm1
## kill: XMM0<def> XMM0<kill> XMM0<def>
insertps $0, %xmm0, %xmm0
insertps $16, %xmm1, %xmm0
addps %xmm0, %xmm0
ret
This implements rdar://8230384
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112101 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
comparison that would overflow.
- The other under/overflow cases can't actually happen because the immediates
which would trigger them are legal (so we don't enter this code), but
adjusted the style to make it clear the transform is always valid.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112053 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112036 91177308-0d34-0410-b5e6-96231b3b80d8
|