Age | Commit message (Collapse) | Author |
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169854 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
of EVT.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169845 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.
This is the first, in a series of patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169837 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169811 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
if it's not possible to materialize an integer immediate with a single
instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
Also increase the threshold to something reasonable (8 for memset, 4 pairs
for memcpy).
This significantly improves Dhrystone, up to 50% on ARM iOS devices.
rdar://12760078
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
understand target implementation of any_extend / extload, just generate
zero_extend in place of any_extend for liveouts when the target knows the
zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz).
rdar://12771555
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169536 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
and extload's. If they are implemented as zero-extend, or implicitly
zero-extend, then this can enable more demanded bits optimizations. e.g.
define void @foo(i16* %ptr, i32 %a) nounwind {
entry:
%tmp1 = icmp ult i32 %a, 100
br i1 %tmp1, label %bb1, label %bb2
bb1:
%tmp2 = load i16* %ptr, align 2
br label %bb2
bb2:
%tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ]
%cmp = icmp ult i16 %tmp3, 24
br i1 %cmp, label %bb3, label %exit
bb3:
call void @bar() nounwind
br label %exit
exit:
ret void
}
This compiles to the followings before:
push {lr}
mov r2, #0
cmp r1, #99
bhi LBB0_2
@ BB#1: @ %bb1
ldrh r2, [r0]
LBB0_2: @ %bb2
uxth r0, r2
cmp r0, #23
bhi LBB0_4
@ BB#3: @ %bb3
bl _bar
LBB0_4: @ %exit
pop {lr}
bx lr
The uxth is not needed since ldrh implicitly zero-extend the high bits. With
this change it's eliminated.
rdar://12771555
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169459 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
(TIL that Clang's -Wparentheses ignores 'x || y && "foo"' on purpose. Neat.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169337 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169325 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169131 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Codegen was failing with an assertion because of unexpected vector
operands when legalizing the selection DAG for a MUL instruction.
The asserting code was legalizing multiplies for vectors of size 128
bits. It uses a custom lowering to try and detect cases where it can
use a VMULL instruction instead of a VMOVL + VMUL. The code was
looking for input operands to the MUL that had been sign or zero
extended. If it found the extended operands it would drop the
sign/zero extension and use the original vector size as input to a
VMULL instruction.
The code assumed that the original input vector was 64 bits so that
after dropping the extension it would fit directly into a D register
and could be used as an operand of a VMULL instruction. The input
code that trigger the failure used a vector of <4 x i8> that was
sign extended to <4 x i32>. It was not safe to drop the sign
extension in this case because the original vector is only 32 bits
wide. The fix is to insert a sign extension for the vector to reach
the required 64 bit size. In this particular example, the vector would
need to be sign extented to a <4 x i16>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169024 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168886 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Fixes 14337.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168809 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Conflicts:
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
lib/CodeGen/AsmPrinter/DwarfDebug.h
lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp
lib/Target/Mips/MipsISelDAGToDAG.cpp
lib/Target/Mips/MipsInstrFPU.td
lib/Target/Mips/MipsSubtarget.cpp
lib/Target/Mips/MipsSubtarget.h
lib/Target/X86/X86MCInstLower.cpp
tools/Makefile
tools/llc/llc.cpp
|
|
A number of calling convention related changes were made unconditionally.
This makes these conditionalized, although there are still some small
differences I would like to address separately in the logs for struct_byval.ll
BUG= http://code.google.com/p/nativeclient/issues/detail?id=1711
Review URL: https://codereview.chromium.org/11416053
|
|
Couperus.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168240 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This patch replaces the hard coded GPR pair [R0, R1] of
Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with
even/odd GPRPair reg class.
Similar to the lowering of atomic_64 operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168207 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This fixes PR14359
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168200 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
case to vector legalization so this actually works.
Patch by Pete Couperus. Fixes PR12540.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168107 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
(svn r167699, also the 3.2 branch point)
Conflicts:
lib/Target/X86/X86Subtarget.cpp
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168030 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168029 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168025 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Conflicts:
lib/Target/Mips/MipsISelLowering.cpp
lib/Target/X86/X86ISelLowering.cpp
lib/Target/X86/X86InstrFormats.td
|
|
mov lr, pc
b.w _foo
The "mov" instruction doesn't set bit zero to one, it's putting incorrect
value in lr. It messes up backtraces.
rdar://12663632
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167657 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167622 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
rdar://12340498
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167620 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This is in preparation for adding an LLVM pass that will expand out
TLS (thread_local) variable accesses into calls to nacl.read.tp.
On ARM, there is already an arm.thread.pointer intrinsic. We reuse
the code for that.
On x86, we have to add an implementation. The added code is based on
x86's LowerToTLSExecModel() for the %gs:0 case, and on NaCl-MIPS'
LowerGlobalTLSAddress() for the __nacl_read_tp() case. (In contrast,
X86NaClRewritePass.cpp inserts a __nacl_read_tp() call at the lower MI
level; we don't use that approach here.)
We convert LowerINTRINSIC_WO_CHAIN() into a method in order to access
the Subtarget member. This is consistent with other x86 Lower methods
and with the ARM version.
BUG=https://code.google.com/p/nativeclient/issues/detail?id=2837
TEST="llvm-lit test/NaCl"
Review URL: https://codereview.chromium.org/11383002
|
|
registers. Previously, the register we being marked as implicitly defined, but
not killed. In some cases this would cause the register scavenger to spill a
dead register.
Also, use an empty register mask to simplify the logic and to reduce the memory
footprint.
rdar://12592448
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167499 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Conflicts:
lib/Target/ARM/ARMFrameLowering.cpp
lib/Target/Mips/MipsRegisterInfo.cpp
lib/Target/X86/X86ISelLowering.cpp
lib/Transforms/IPO/ExtractGV.cpp
tools/Makefile
tools/gold/gold-plugin.cpp
The only interesting conflict was X86ISelLowering.ccp, which
meant I had to essentially revert r167104. The problem is that we are
using ESP as the stack pointer in X86ISelLowering and RSP as the
stack pointer in X86FrameLowering, and that revision made them
both consistently use X86RegisterInfo to determine which to use.
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167318 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Warnings: unused variables, unused functions, -Wreorder,
and remember to return a value in a non-void function.
Also remove setjmp/longjmp intrinsics for x86, which aren't
being used now (no equivalents in ARM and no equivalent for
x86-64 with the zero-based sandbox, etc.). This exposes a
few more unused functions.
BUG= none
TEST= test-all
Review URL: https://codereview.chromium.org/11345016
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167020 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
sequence
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166854 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Conflicts:
lib/Target/ARM/ARMISelLowering.cpp
lib/Target/Mips/MipsISelLowering.h
lib/Target/X86/X86ISelLowering.h
lib/Target/X86/X86TargetMachine.h
tools/llc/llc.cpp
The only interesting conflict was ARMISelLowering, caused by
http://llvm.org/viewvc/llvm-project?view=rev&revision=166273
which actually removes a LOCALMOD for ARM byval lowering.
|
|
Removed extra stack frame object for fixed byval arguments,
VarArgsStyleRegisters invocation was reworked due to some improper usage in
past. PR14099 also demonstrates it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166273 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
|
|
Stack is formed improperly for long structures passed as byval arguments for
EABI mode.
If we took AAPCS reference, we can found the next statements:
A: "If the argument requires double-word alignment (8-byte), the NCRN (Next
Core Register Number) is rounded up to the next even register number." (5.5
Parameter Passing, Stage C, C.3).
B: "The alignment of an aggregate shall be the alignment of its most-aligned
component." (4.3 Composite Types, 4.3.1 Aggregates).
So if we have structure with doubles (9 double fields) and 3 Core unused
registers (r1, r2, r3): caller should use r2 and r3 registers only.
Currently r1,r2,r3 set is used, but it is invalid.
Callee VA routine should also use r2 and r3 regs only. All is ok here. This
behaviour is guessed by rounding up SP address with ADD+BFC operations.
Fix:
Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and
8 byte alignment, we waste odd registers then.
P.S.:
I also improved LDRB_POST_IMM regression test. Since ldrb instruction will
not generated by current regression test after this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166018 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Conflicts:
lib/Target/X86/X86FrameLowering.cpp
lib/Target/X86/X86ISelLowering.cpp
|
|
VDUPLANE node with the vector input size different from the output size. This was bacause the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165929 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
local frame causes problem.
For example:
void f(StructToPass s) {
g(&s, sizeof(s));
}
will cause problem with tail-call since part of s is passed via registers and
saved in f's local frame. When g tries to access s, part of s may be corrupted
since f's local frame is popped out before the tail-call.
The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for
the caller. This is a conservative approach, if we can prove the address of
s or part of s is not taken and passed to g, it should be okay to perform
tail-call.
rdar://12442472
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165853 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Conflicts:
include/llvm/MC/MCAssembler.h
lib/Target/ARM/ARMISelLowering.cpp
lib/Target/X86/X86TargetMachine.h
tools/llc/llc.cpp
|
|
The backend already pattern matches to form VBSL when it can. We may want to
teach it to use the vbsl intrinsics at some point to prevent machine licm from
mucking with this, but using the Expand is completely correct.
http://llvm.org/bugs/show_bug.cgi?id=13831
http://llvm.org/bugs/show_bug.cgi?id=13961
Patch by Peter Couperus <peter.couperus@st.com>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165845 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Conflicts:
lib/ExecutionEngine/JIT/JITEmitter.cpp
lib/MC/MCELFStreamer.cpp
lib/Target/ARM/ARMAsmPrinter.h
lib/Target/X86/X86RegisterInfo.td
lib/Target/X86/X86TargetMachine.cpp
tools/llc/llc.cpp
|
|
SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer
that described in .td.
7 ops is needed, but SDNode with only 6 is created.
In more details:
In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset
operand is defined as am2offset_imm. am2offset_imm is complex parameter type,
and actually it consists from dummy register and imm itself. As I understood
trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy
register was not added to SDNode, and it cause crash in Peephole Optimizer pass.
The problem fixed by setting up additional dummy reg when emitting
LDRB_POST_IMM instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165617 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.
Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.
Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165616 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165488 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165402 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Conflicts:
lib/Target/ARM/ARMISelDAGToDAG.cpp
lib/Target/Mips/MipsISelLowering.cpp
lib/Target/Mips/MipsSubtarget.cpp
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164899 91177308-0d34-0410-b5e6-96231b3b80d8
|