Age | Commit message (Collapse) | Author |
|
NEON is not IEEE 754 compliant, so we should avoid lowering single-precision
floating point operations with NEON unless unsafe-math is turned on. The
equivalent VFP instructions are IEEE 754 compliant, but in some cores they're
much slower, so some archs/OSs might still request it to be on by default,
such as Swift and Darwin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177651 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Patch by Stepan Dyatkovskiy <stpworld@narod.ru>
rdar://13457826
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177463 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
The ARM backend currently has poor codegen for long sext/zext
operations, such as v8i8 -> v8i32. This patch addresses this
by performing a custom expansion in ARMISelLowering. It also
adds/changes the cost of such lowering in ARMTTI.
This partially addresses PR14867.
Patch by Pete Couperus
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177380 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
The default logic marks them as too expensive.
For example, before this patch we estimated:
cost of 16 for instruction: %r = uitofp <4 x i16> %v0 to <4 x float>
While this translates to:
vmovl.u16 q8, d16
vcvt.f32.u32 q8, q8
All other costs are left to the values assigned by the fallback logic. Theses
costs are mostly reasonable in the sense that they get progressively more
expensive as the instruction sequences emitted get longer.
radar://13445992
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177334 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Fix cost of some "cheap" cast instructions. Before this patch we used to
estimate for example:
cost of 16 for instruction: %r = fptoui <4 x float> %v0 to <4 x i16>
While we would emit:
vcvt.s32.f32 q8, q8
vmovn.i32 d16, q8
vuzp.8 d16, d17
All other costs are left to the values assigned by the fallback logic. Theses
costs are mostly reasonable in the sense that they get progressively more
expensive as the instruction sequences emitted get longer.
radar://13434072
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177333 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
I was too pessimistic in r177105. Vector selects that fit into a legal register
type lower just fine. I was mislead by the code fragment that I was using. The
stores/loads that I saw in those cases came from lowering the conditional off
an address.
Changing the code fragment to:
%T0_3 = type <8 x i18>
%T1_3 = type <8 x i1>
define void @func_blend3(%T0_3* %loadaddr, %T0_3* %loadaddr2,
%T1_3* %blend, %T0_3* %storeaddr) {
%v0 = load %T0_3* %loadaddr
%v1 = load %T0_3* %loadaddr2
==> FROM:
;%c = load %T1_3* %blend
==> TO:
%c = icmp slt %T0_3 %v0, %v1
==> USE:
%r = select %T1_3 %c, %T0_3 %v0, %T0_3 %v1
store %T0_3 %r, %T0_3* %storeaddr
ret void
}
revealed this mistake.
radar://13403975
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177170 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
registers. The pass handles all the required transformations pre-regalloc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177169 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Fixes PR15520.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177167 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
A vector fptrunc and fpext simply gets split into scalar instructions.
radar://13192358
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177159 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177135 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This is a generic function (derived from PEI); moving it into
MachineFrameInfo eliminates a current redundancy between the ARM and AArch64
backends, and will allow it to be used by the PowerPC target code.
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177111 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
By terrible I mean we store/load from the stack.
This matters on PAQp8 in _Z5trainPsS_ii (which is inlined into Mixer::update)
where we decide to vectorize a loop with a VF of 8 resulting in a 25%
degradation on a cortex-a8.
LV: Found an estimated cost of 2 for VF 8 For instruction: icmp slt i32
LV: Found an estimated cost of 2 for VF 8 For instruction: select i1, i32, i32
The bug that tracks the CodeGen part is PR14868.
radar://13403975
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177105 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Increase the cost of v8/v16-i8 to v8/v16-i32 casts and truncates as the backend
currently lowers those using stack accesses.
This was responsible for a significant degradation on
MultiSource/Benchmarks/Trimaran/enc-pc1/enc-pc1
where we vectorize one loop to a vector factor of 16. After this patch we select
a vector factor of 4 which will generate reasonable code.
unsigned char cle[32];
void test(short c) {
unsigned short compte;
for (compte = 0; compte <= 31; compte++) {
cle[compte] = cle[compte] ^ c;
}
}
radar://13220512
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176898 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
intrinsic - it can cause impossible-to-schedule subgraphs to be
introduced.
PR15053.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176777 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176648 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
The VDUP instruction source register doesn't allow a non-constant lane
index, so make sure we don't construct a ARM::VDUPLANE node asking it to
do so.
rdar://13328063
http://llvm.org/bugs/show_bug.cgi?id=13963
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176413 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176412 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176411 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Mark them as expand, they are not legal as our backend does not match them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176410 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
dispatch code. As far as I can tell the thumb2 code is behaving as expected.
I was able to compile and run the associated test case for both arm and thumb1.
rdar://13066352
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176363 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176288 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176285 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
rdar://13306723
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176212 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176208 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This fixes an issue where trying to assemlbe valid ADR instructions would cause
LLVM to hit a failed assertion.
Patch by Keith Walker.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176189 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
arguments type is a simple type.
rdar://13290455
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176066 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Handle an implied 'sp' operand.
rdar://11466783
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175940 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
instructions.
The Printer will now print instructions with the correct alignment specifier syntax, like
vld1.8 {d16}, [r0:64]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175884 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
to TargetFrameLowering, where it belongs. Incidentally, this allows us
to delete some duplicated (and slightly different!) code in TRI.
There are potentially other layering problems that can be cleaned up
as a result, or in a similar manner.
The refactoring was OK'd by Anton Korobeynikov on llvmdev.
Note: this touches the target interfaces, so out-of-tree targets may
be affected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175788 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175775 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
s/AddDirectiveHandler/addDirectiveHandler/
s/ParseMSInlineAsm/parseMSInlineAsm/
s/ParseIdentifier/parseIdentifier/
s/ParseStringToEndOfStatement/parseStringToEndOfStatement/
s/ParseEscapedString/parseEscapedString/
s/EatToEndOfStatement/eatToEndOfStatement/
s/ParseExpression/parseExpression/
s/ParseParenExpression/parseParenExpression/
s/ParseAbsoluteExpression/parseAbsoluteExpression/
s/CheckForValidSection/checkForValidSection/
http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175675 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
ivars should be camel-case and start with an upper-case letter. A few in
TargetLowering were starting with a lower-case letter.
No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175667 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
It is possible that frame pointer is not found in the
callee saved info, thus FramePtrSpillFI may be incorrect
if we don't check the result of hasFP(MF).
Besides, if we enable the stack coloring algorithm, there
will be an assertion to ensure the slot is live. But in
the test case, %var1 is not live in the prologue of the
function, and we will get the assertion failure.
Note: There is similar code in ARMFrameLowering.cpp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175616 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
In my previous commit:
"Merge a f32 bitcast of a v2i32 extractelt
A vectorized sitfp on doubles will get scalarized to a sequence of an
extract_element of <2 x i32>, a bitcast to f32 and a sitofp.
Due to the the extract_element, and the bitcast we will uneccessarily generate
moves between scalar and vector registers."
I added a pattern containing a copy_to_regclass. The copy_to_regclass is
actually not needed.
radar://13191881
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175555 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
When creating an allocation hint for a register pair, make sure the hint
for the physical register reference is still in the allocation order.
rdar://13240556
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175541 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175527 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
A vectorized sitfp on doubles will get scalarized to a sequence of an
extract_element of <2 x i32>, a bitcast to f32 and a sitofp.
Due to the the extract_element, and the bitcast we will uneccessarily generate
moves between scalar and vector registers.
The patch fixes this by using a COPY_TO_REGCLASS and a EXTRACT_SUBREG to extract
the element from the vector instead.
radar://13191881
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175520 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
If the memcpy has an odd length with an alignment of 2, this would incorrectly
assert on the last 1 byte copy.
rdar://13202135
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175459 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175371 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
new features.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175336 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175322 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
When we're recalculating the feature set of the subtarget, we need to have the
ivars in their initial state.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175320 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175315 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
assembler should also accept a two arg form, as the docuemntation specifies that
the first (destination) register is optional.
This patch uses TwoOperandAliasConstraint to add the two argument form.
It also fixes an 80-column formatting problem in:
test/MC/ARM/neon-bitwise-encoding
<rdar://problem/12909419> Clang rejects ARM NEON assembly instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175221 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
inline asm with 64-bit data on ARM
Update test case to use -mtriple=arm-linux-gnueabi
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175186 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
The parser will now accept instructions with alignment specifiers written like
vld1.8 {d16}, [r0:64]
, while also still accepting the incorrect syntax
vld1.8 {d16}, [r0, :64]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175164 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175107 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
on ARM
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175088 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175020 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
A reverse shuffle is lowered to a vrev and possibly a vext instruction (quad
word).
radar://13171406
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174933 91177308-0d34-0410-b5e6-96231b3b80d8
|